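How can I check whether two objects (e.g. data frames) are equal in R?
By "value equal" I mean that the value in each row and column of one data frame equals the value in the corresponding position of the other data frame.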
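It is not clear what it means to test whether two data frames are "value equal", but to test whether the values are the same, here is an example of two non-identical data frames with equal values: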
a <- data.frame(x = 1:10)
b <- data.frame(y = 1:10)
Test whether all values are equal:
all(a == b) # TRUE
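One caveat with all(a == b) is worth a small sketch. `==` yields NA wherever either side is NA, so all() can return NA rather than TRUE or FALSE, and `==` on data frames of different sizes is an error. The helper below is hypothetical (the name values_equal is not from base R or any package) and assumes that an NA in the same cell of both data frames should count as a match.

```r
# Hypothetical helper (not part of base R): TRUE when two data frames hold the
# same values position by position, treating NA in the same cell as a match.
values_equal <- function(d1, d2) {
  # == on data frames of different sizes is an error, so compare shapes first
  if (!identical(dim(d1), dim(d2))) return(FALSE)
  eq <- d1 == d2                     # logical matrix; NA where either side is NA
  both_na <- is.na(d1) & is.na(d2)   # cells that are missing in both
  eq[is.na(eq)] <- FALSE             # an NA comparison is not a match by itself
  all(eq | both_na)
}

a_na <- data.frame(x = c(1, NA, 3))
b_na <- data.frame(y = c(1, NA, 3))

all(a_na == b_na)         # NA, because NA == NA is NA
values_equal(a_na, b_na)  # TRUE
```

If missing values should instead make the comparison fail, dropping the both_na step is enough.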
Test whether the objects are identical (they are not; they have different column names):
identical(a,b) # FALSE: class, colnames, rownames must all match.
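For identical to return TRUE, not only the values and column names but also the row numbers/names must match. (I ran into this when using subset(); it turned out all was what I wanted.) - Darren Cook
identical(sort(a), sort(b)) - Abe
In addition, identical is still useful and supports the practical goal: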
identical(a[, "x"], b[, "y"]) # TRUE
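The row-name issue mentioned in the comment above can be reproduced with a short sketch. The data frames here are made up for illustration; the point is only that subset() keeps the original row numbers, which identical() then treats as a difference.

```r
full <- data.frame(x = 0:10)
ref  <- data.frame(x = 1:10)

# subset() keeps the original row numbers, so the result is labelled 2..11
part <- subset(full, x > 0)

all(ref == part)      # TRUE: the values match position by position
identical(ref, part)  # FALSE: the row names differ

# Resetting the row names to the automatic 1..n makes the objects identical
rownames(part) <- NULL
identical(ref, part)  # TRUE
```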
The compare package can test whether the names and the values of the objects are the same in a single step.
a <- data.frame(x = 1:10)
b <- data.frame(y = 1:10)
library(compare)
compare(a, b)
#FALSE [TRUE]  # objects are not identical (different names), but values are the same.
If we only care about equality of the values, we can set ignoreNames=TRUE:
compare(a, b, ignoreNames=TRUE)
#TRUE
# dropped names
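For readers without the compare package, ignoring names can be approximated in base R. This is only a sketch of the idea, not how compare implements ignoreNames: it copies the column names from one data frame to the other before calling identical(), and it assumes both data frames have the same number of columns.

```r
a <- data.frame(x = 1:10)
b <- data.frame(y = 1:10)

# Give b the column names of a, then compare everything else strictly
b_renamed <- setNames(b, names(a))

identical(a, b)          # FALSE: column names differ
identical(a, b_renamed)  # TRUE: same values, types, and row names
```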
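The compare package also provides two other functions, compareEqual and compareIdentical.
Another option is the comparedf function from the arsenal package.
df1 <- data.frame(id = paste0("person", 1:3),
                  a = c("a", "b", "c"),
                  b = c(1, 3, 4))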
> df1
       id a b
1 person1 a 1
2 person2 b 3
3 person3 c 4
df2 <- data.frame(id = paste0("person", 4:1),
                  a = c("c", "b", "a", "f"),
                  b = c(1, 3, 4, 4),
                  d = paste0("rn", 1:4))
> df2
       id a b   d
1 person4 c 1 rn1
2 person3 b 3 rn2
3 person2 a 4 rn3
4 person1 f 4 rn4
library(arsenal)
comparedf(df1, df2)
Compare Object
Function Call:
comparedf(x = df1, y = df2)
Shared: 3 non-by variables and 3 observations.
Not shared: 1 variables and 0 observations.
Differences found in 2/3 variables compared.
0 variables compared have non-identical attributes.
It is possible to get a more detailed summary:
summary(comparedf(df1, df2))
all.equal(df1, df2)
[1] "Attributes: < Component “row.names”: Numeric: lengths (3, 4) differ >"
[2] "Length mismatch: comparison on first 3 components"
[3] "Component “id”: Lengths (3, 4) differ (string compare on first 3)"
[4] "Component “id”: 3 string mismatches"
[5] "Component “a”: Lengths (3, 4) differ (string compare on first 3)"
[6] "Component “a”: 2 string mismatches"
[7] "Component “b”: Numeric: lengths (3, 4) differ"
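As the output above shows, all.equal() returns a character vector describing the differences rather than FALSE, so it should not be used directly in a condition. A minimal sketch of the usual idiom, wrapping the call in isTRUE(), follows; the small data frames are made up for illustration. It also shows that all.equal() compares numbers with a tolerance, unlike identical().

```r
x <- data.frame(v = 0.1 + 0.2)
y <- data.frame(v = 0.3)
z <- data.frame(v = 0.4)

identical(x, y)          # FALSE: 0.1 + 0.2 is not exactly 0.3 in floating point
isTRUE(all.equal(x, y))  # TRUE: equal within all.equal's numeric tolerance

res <- all.equal(x, z)   # a character vector describing the difference
isTRUE(res)              # FALSE, safe to use inside if ()
```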
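# Summarise each column's structure as a string built from its class and attributes
structure_df1 <- sapply(df1, function(x) paste(class(x), attributes(x), collapse = ""))
structure_df2 <- sapply(df2, function(x) paste(class(x), attributes(x), collapse = ""))
# If the column counts differ (as with df1 and df2), == recycles the shorter vector and warns
all(structure_df1 == structure_df2)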
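A structure check that avoids element-wise comparison of vectors of different lengths can be sketched as follows. The helper name same_structure is hypothetical, and "structure" is assumed here to mean only column names and column classes, which is narrower than the class-plus-attributes strings used above.

```r
# Hypothetical helper: TRUE when both data frames have the same column names,
# in the same order, and each pair of columns has the same class.
same_structure <- function(d1, d2) {
  identical(names(d1), names(d2)) &&
    identical(lapply(d1, class), lapply(d2, class))
}

df1 <- data.frame(id = paste0("person", 1:3),
                  a = c("a", "b", "c"),
                  b = c(1, 3, 4))
df2 <- data.frame(id = paste0("person", 4:1),
                  a = c("c", "b", "a", "f"),
                  b = c(1, 3, 4, 4),
                  d = paste0("rn", 1:4))

same_structure(df1, df2)          # FALSE: df2 has an extra column d
same_structure(df1, df1[2:3, ])   # TRUE: rows differ, structure does not
```

Because identical() compares whole objects, the check returns a single TRUE or FALSE without a recycling warning when the column counts differ.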