How to find duplicates in two columns where a third column differs

Question

r dataframe duplicates subset conditional-formatting

5

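I want to find two (or more) rows that share the same x, y (position) but have different IDs.

In the table below, I am only interested in the last two rows.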

x	y	id
1	2	1
1	2	1
1	3	4
2	3	1
2	3	2

# example data
x <- read.table(text = "x   y   id
1   2   1
1   2   1
1   3   4
2   3   1
2   3   2", header = TRUE)

- Emu

3 Answers

4

Another approach is to use dplyr:

library(dplyr)

x %>% 
  group_by(x, y) %>% 
  filter(n_distinct(id) > 1)

# A tibble: 2 x 3
# Groups:   x, y [1]
      x     y    id
  <int> <int> <int>
1     2     3     1
2     2     3     2
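
If only the conflicting locations are needed rather than the rows themselves, the same grouping idea can be summarised to one row per (x, y) pair. This is a minimal sketch, assuming dplyr 1.0 or later (for the `.groups` argument); the names `df`, `n_ids`, and `conflicts` are illustrative and do not come from the original answer.

```r
library(dplyr)

# Hypothetical name `df`, so the data frame does not share a name with column `x`.
df <- read.table(text = "x y id
1 2 1
1 2 1
1 3 4
2 3 1
2 3 2", header = TRUE)

# One row per (x, y) location, keeping only locations with more than one distinct id.
conflicts <- df %>%
  group_by(x, y) %>%
  summarise(n_ids = n_distinct(id), .groups = "drop") %>%
  filter(n_ids > 1)

print(conflicts)
```

Dropping the grouping with `.groups = "drop"` returns an ungrouped tibble, which tends to be easier to join back onto other data.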

- Maël

2

Using data.table:

library(data.table)
i1 <- setDT(x)[, .I[uniqueN(id) > 1], .(x, y)]$V1
x[i1]
       x     y    id
   <int> <int> <int>
1:     2     3     1
2:     2     3     2
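
The row-index step above can also be folded into a single grouped expression. The following is a sketch only, assuming a separately built table named `dt` (a hypothetical name), so the question's `x` is not converted by reference the way `setDT(x)` does.

```r
library(data.table)

# Hypothetical table `dt` holding the same values as the question's example data.
dt <- data.table(x  = c(1, 1, 1, 2, 2),
                 y  = c(2, 2, 3, 3, 3),
                 id = c(1, 1, 4, 1, 2))

# For each (x, y) group, return its rows (.SD) only if it holds more than one distinct id.
conflicts <- dt[, if (uniqueN(id) > 1) .SD, by = .(x, y)]

print(conflicts)
```

Groups that fail the condition return NULL and are dropped from the result.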

- akrun

- zx8754 · Accepted Answer

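Group by the two columns, count the number of unique values in the third column, and subset the rows where that count is greater than 1: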

x[ ave(x[, "id"], x[, c("x", "y") ], FUN = function(i) length(unique(i))) > 1, ]
#   x y id
# 4 2 3  1
# 5 2 3  2
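
The one-liner above can be unpacked into separate steps to make the intermediate per-row count visible. This is a minimal sketch in base R; the names `df`, `n_ids`, and `conflicts` are illustrative assumptions, and the grouping columns are passed to `ave()` individually rather than as a data frame.

```r
# Hypothetical name `df`, so the data frame does not share a name with column `x`.
df <- read.table(text = "x y id
1 2 1
1 2 1
1 3 4
2 3 1
2 3 2", header = TRUE)

# For every row, count the distinct ids found within its (x, y) group.
n_ids <- ave(df$id, df$x, df$y, FUN = function(i) length(unique(i)))

# Keep only the rows whose location carries more than one id.
conflicts <- df[n_ids > 1, ]

print(conflicts)
```

Because `ave()` returns a vector as long as its input, `n_ids` lines up with the rows of `df` and can be used directly as a logical filter.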