Here is my data:
group <- c(1,1,1,1,2,2,2,3,3,4,4,4,4)
X1 <- c("A","A","A","A","B","A","B","A","A","B","B","B","B")
X2 <- c("A","A","A","A","B","B","B","A","A","B","B","A","A")
X3 <- c("B","A","A","A","B","B","B","B","B","B","B","B","B")
X4 <- c("A","A","A","B","B","B","A","A","A","B","A","B","B")
X5 <- c("A","A","A","A","B","B","B","A","A","A","B","B","B")
X6 <- c("A","A","A","A","B","A","B","A","A","B","B","A","A")
mydf <- data.frame (group, X1, X2, X3, X4, X5, X6)
So the data looks like this:
group X1 X2 X3 X4 X5 X6
1 1 A A B A A A
2 1 A A A A A A
3 1 A A A A A A
4 1 A A A B A A
5 2 B B B B B B
6 2 A B B B B A
7 2 B B B A B B
8 3 A A B A A A
9 3 A A B A A A
10 4 B B B B A B
11 4 B B B A B B
12 4 B A B B B A
13 4 B A B B B A
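Now I need to compare the first row of each group with every other row in the same group. For example, rows 1 and 2 in group 1: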
group X1 X2 X3 X4 X5 X6
1 1 A A B A A A
2 1 A A A A A A
TRUE TRUE FALSE TRUE TRUE TRUE
Here only X3 does not match, so the mismatch is 1 out of 6 = 1/6 = 17%.
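As a minimal sketch of this single comparison, the two rows can be compared element-wise and the share of differing columns taken as the mismatch rate. The vectors `row1` and `row2` are typed in by hand purely for illustration, and `same` and `mismatch_rate` are names made up for this sketch.

```r
# Rows 1 and 2 of group 1, typed in by hand so the snippet is self-contained
row1 <- c(X1 = "A", X2 = "A", X3 = "B", X4 = "A", X5 = "A", X6 = "A")
row2 <- c(X1 = "A", X2 = "A", X3 = "A", X4 = "A", X5 = "A", X6 = "A")

same <- row1 == row2          # element-wise comparison, one logical per column
mismatch_rate <- mean(!same)  # share of columns that differ (here 1 of 6)
round(100 * mismatch_rate)    # the same value as a rounded percentage
```

Taking `mean()` of a logical vector gives the proportion of `TRUE` values, which is why negating `same` yields the mismatch rate directly.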
Likewise, compare row 3 with the first row of group 1.
group X1 X2 X3 X4 X5 X6
1 1 A A B A A A
3 1 A A A A A A
Mismatch = 1/6 = 17%
Then compare row 4 with the first row of group 1.
group X1 X2 X3 X4 X5 X6
1 1 A A B A A A
4 1 A A A B A A
Mismatch = 2/6 = 33%
The same applies to group 2, whose first row is row 5. Comparing it with row 6:
group X1 X2 X3 X4 X5 X6
5 2 B B B B B B
6 2 A B B B B A
Mismatch = 2/6 = 33%
Similarly, for rows 5 and 7:
group X1 X2 X3 X4 X5 X6
5 2 B B B B B B
7 2 B B B A B B
Mismatch = 1/6 = 17%
My attempt:
match (mydf[1,], mydf[2,])
match (mydf[1,], mydf[3,])
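
For reference, the per-group comparison described above could be sketched roughly as follows. This is only one possible approach, not a confirmed solution: `mismatch_vs_first` is a hypothetical helper name, `stringsAsFactors = FALSE` is added so the columns are compared as plain strings, and the code assumes `group` is the first column of the data frame.

```r
group <- c(1,1,1,1,2,2,2,3,3,4,4,4,4)
X1 <- c("A","A","A","A","B","A","B","A","A","B","B","B","B")
X2 <- c("A","A","A","A","B","B","B","A","A","B","B","A","A")
X3 <- c("B","A","A","A","B","B","B","B","B","B","B","B","B")
X4 <- c("A","A","A","B","B","B","A","A","A","B","A","B","B")
X5 <- c("A","A","A","A","B","B","B","A","A","A","B","B","B")
X6 <- c("A","A","A","A","B","A","B","A","A","B","B","A","A")
mydf <- data.frame(group, X1, X2, X3, X4, X5, X6, stringsAsFactors = FALSE)

# Hypothetical helper: mismatch rate of every row against the group's first row
mismatch_vs_first <- function(d) {
  m <- as.matrix(d[, -1, drop = FALSE])   # character matrix of X1..X6 (assumes group is column 1)
  ref <- m[1, ]                           # first row of the group is the reference
  rates <- apply(m, 1, function(r) mean(r != ref))
  out <- data.frame(group = d$group,
                    row = as.integer(rownames(d)),
                    mismatch = unname(rates))
  out[-1, ]                               # drop the reference row itself
}

# Apply the helper to each group and stack the pieces
result <- do.call(rbind, lapply(split(mydf, mydf$group), mismatch_vs_first))
rownames(result) <- NULL
result
```

The `mismatch` column holds proportions, so multiplying by 100 gives the percentages used in the examples above.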