Intersection and union of ArrayLists in Java

Question

java list union intersection

167

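Are there any methods to do this? I have been searching but could not find any.

Another question: I need these methods so I can filter files. Some are AND filters and some are OR filters (as in set theory), so I need to filter according to all the files and the union/intersection ArrayLists that hold those files.

Should I use a different data structure to hold the files? Is there anything else that would offer a better runtime?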

- yotamoo

1

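If you don't want to create a new list, Vector.retainAll(Vector) trims the original vector down to only the part that intersects with the second vector. - user2808054

@user2808054 Why Vector? That class has been discouraged since Java 1.2. - dimo414

@dimo414 An interface I am using (I have no choice) returns things as vectors. I didn't know it had been deprecated! Thanks for the info... deprecated by whom? I haven't seen any note saying it was deprecated, so this is a surprise. - user2808054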

1

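From the Javadocs: "As of the Java 2 platform v1.2 ... it is recommended to use ArrayList in place of Vector." The only time you might need Vector is for cross-thread interactions, but there are safer data structures for those use cases too. See also this question. Any library still using Vector in 2016 is very suspect in my opinion. - dimo414

@dimo414 It's an IBM library, haha! (Lotus Domino data API). Thanks for the info, very helpful. - user2808054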

24 Answers

136

Collection (and therefore ArrayList as well) has:

col.retainAll(otherCol) // for intersection
col.addAll(otherCol) // for union

Use a List implementation if you accept duplicates, and a Set implementation if you don't:

Collection<String> col1 = new ArrayList<String>(); // {a, b, c}
// Collection<String> col1 = new TreeSet<String>();
col1.add("a");
col1.add("b");
col1.add("c");

Collection<String> col2 = new ArrayList<String>(); // {b, c, d, e}
// Collection<String> col2 = new TreeSet<String>();
col2.add("b");
col2.add("c");
col2.add("d");
col2.add("e");

col1.addAll(col2);
System.out.println(col1); 
//output for ArrayList: [a, b, c, b, c, d, e]
//output for TreeSet: [a, b, c, d, e]

- lukastymo

3

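There was a suggested edit saying that this union "is incorrect since it will contain duplicate elements". The edit recommended using a HashSet instead. - Kos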

5

Actually it has been edited, see: "Use a List implementation if you accept duplicates, and a Set implementation if you don't:" - lukastymo

7

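No, retainAll is not intersection for lists. In the example above, all elements of col that are not in otherCol are removed. Say otherCol is {a,b,b,c} and col is {b,b,b,c,d}. Then col ends up as {b,b,b,c}, which is not strictly the intersection of the two lists; I would expect {b,b,c}. A different operation is being performed. - demongolem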

2

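I also don't see how addAll() is a union for lists; it just concatenates the second list onto the end of the first. A union operation would avoid adding an element that the first list already contains. - dimo414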
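The multiset behaviour expected in the comments above (each element kept as many times as it occurs in both lists) is not what retainAll does. A minimal sketch of it follows; the class and method names are hypothetical and not taken from any answer in this thread, and the element type is assumed to have consistent equals/hashCode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BagIntersection {

    // Hypothetical helper: keeps each element min(count in a, count in b) times,
    // in the order it appears in a. Neither input list is modified.
    static <T> List<T> intersect(List<T> a, List<T> b) {
        // Count how many copies of each element b can still "match".
        Map<T, Integer> remaining = new HashMap<>();
        for (T item : b) {
            remaining.merge(item, 1, Integer::sum);
        }

        List<T> result = new ArrayList<>();
        for (T item : a) {
            Integer left = remaining.get(item);
            if (left != null && left > 0) {
                result.add(item);
                remaining.put(item, left - 1); // consume one matching copy from b
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> col = Arrays.asList("b", "b", "b", "c", "d");
        List<String> otherCol = Arrays.asList("a", "b", "b", "c");
        System.out.println(intersect(col, otherCol)); // [b, b, c]
    }
}
```

With the same inputs, col.retainAll(otherCol) on a mutable copy would leave [b, b, b, c], which is the difference the comment points out.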

91

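This post is fairly old, but it was still the first one to pop up on Google when searching for this topic.

I want to give an update using Java 8 streams that does (basically) the same thing in a single line: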

List<T> intersect = list1.stream()
    .filter(list2::contains)
    .collect(Collectors.toList());

List<T> union = Stream.concat(list1.stream(), list2.stream())
    .distinct()
    .collect(Collectors.toList());

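If anyone has a better/faster solution, let me know, but this one is a nice one-liner that can easily be put in a method without adding an unnecessary helper class/method, and it stays readable.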

- steilerDev

26

It might be a nice one-liner, but it takes O(n^2) time. Convert one of the lists to a Set and then use the set's contains method. Not everything has to be done with streams. - dimo414
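The suggestion in the comment above can be sketched roughly as follows. The class and method names are hypothetical, and the expected O(n + m) running time assumes the elements have a reasonable hashCode.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class SetBackedIntersection {

    // Hypothetical helper: copies list2 into a HashSet once so that each
    // contains() check is expected O(1) instead of a linear scan of list2.
    static <T> List<T> intersect(List<T> list1, List<T> list2) {
        Set<T> lookup = new HashSet<>(list2);
        return list1.stream()
                .filter(lookup::contains)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList("A", "B", "C");
        List<String> list2 = Arrays.asList("B", "C", "D", "E", "F");
        System.out.println(intersect(list1, list2)); // [B, C]
    }
}
```

The result keeps the order of list1, and duplicates in list1 are kept as well; adding .distinct() before collect would remove them.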

36

list1.retainAll(list2) - is intersection

For a union, call removeAll to remove the shared elements and then addAll to add the elements of the other list.

More details are in the documentation of Collection (ArrayList is a Collection): http://download.oracle.com/javase/1.5.0/docs/api/java/util/Collection.html

- The GiG

2

retainAll() and removeAll() are both O(n^2) operations on lists. We can do better. - dimo414

2

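I upvoted, but now I have a question. retainAll of {1, 2, 2, 3, 4, 5} over {1, 2, 3} results in {1, 2, 2, 3}. Shouldn't the intersection be {1, 2, 3}? - ghchoi

@ghchoi The issue here is the semantics behind lists and sets. With the list [1, 2, 2, 3, 4, 5] we accept duplicates, but the set {1, 2, 3} does not allow them. The two notations are also usually different, although this is not a fixed rule: [...duplicates allowed...] for lists and {...no duplicates...} for sets. - Jaja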

22

Unions and intersections are defined only for sets, not for lists, as you mentioned.

Check the Guava library for filters. Guava also provides real intersections and unions:

 static <E> Sets.SetView<E> union(Set<? extends E> set1, Set<? extends E> set2)
 static <E> Sets.SetView<E> intersection(Set<E> set1, Set<?> set2)

- Stan Kurilin

15

You can use the CollectionUtils utility class from Apache Commons.

- bluefoot

13

In case anyone finds this answer a little too short: 'CollectionUtils.containsAny' and 'CollectionUtils.containsAll' are the methods. - Sebastian

3

It's odd that CollectionUtils from Apache Commons does not support generics. - Vasyl Sarzhynskyi

1

For union use CollectionUtils.union(collection1, collection2); and for intersection use CollectionUtils.intersection(collection1, collection2);. - Prasannjeet Singh

8

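The marked solution is not efficient. It has O(n^2) time complexity. What we can do instead is sort both lists and then run the intersection algorithm below.

// Both lists must already be sorted in ascending order.
private static ArrayList<Integer> intersect(ArrayList<Integer> f, ArrayList<Integer> s) {
    ArrayList<Integer> res = new ArrayList<Integer>();

    int i = 0, j = 0;
    while (i != f.size() && j != s.size()) {
        if (f.get(i) < s.get(j)) {
            i++; // f is behind, advance it
        } else if (f.get(i) > s.get(j)) {
            j++; // s is behind, advance it
        } else {
            res.add(f.get(i)); // equal values: part of the intersection
            i++;
            j++;
        }
    }

    return res;
}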

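This one has a complexity of O(n log n + n), which is O(n log n). The union is done in a similar way; just make the appropriate changes to the if-elseif-else statements. You can also use iterators if you want (I know they are more efficient in C++, but I don't know whether that is true in Java as well).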
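The union variant mentioned above is not spelled out in the answer. One possible reading, as a rough sketch with hypothetical names, sorts copies of both lists and then merges them, emitting equal values only once per matched pair:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SortedUnion {

    // Hypothetical helper: sorts copies of the inputs (the originals are left
    // untouched) and merges them. A value that occurs k times in one list and
    // m times in the other ends up max(k, m) times in the result.
    static List<Integer> union(List<Integer> first, List<Integer> second) {
        List<Integer> f = new ArrayList<>(first);
        List<Integer> s = new ArrayList<>(second);
        Collections.sort(f);
        Collections.sort(s);

        List<Integer> res = new ArrayList<>(f.size() + s.size());
        int i = 0, j = 0;
        while (i < f.size() && j < s.size()) {
            int cmp = f.get(i).compareTo(s.get(j));
            if (cmp < 0) {
                res.add(f.get(i++));      // only in f so far
            } else if (cmp > 0) {
                res.add(s.get(j++));      // only in s so far
            } else {
                res.add(f.get(i));        // in both: add once, advance both
                i++;
                j++;
            }
        }
        // Append whatever is left in either list.
        while (i < f.size()) res.add(f.get(i++));
        while (j < s.size()) res.add(s.get(j++));
        return res;
    }

    public static void main(String[] args) {
        System.out.println(union(Arrays.asList(5, 1, 3), Arrays.asList(4, 3, 2))); // [1, 2, 3, 4, 5]
    }
}
```

The result comes back sorted rather than in the original order, which is one trade-off of the sort-and-merge approach.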

- AJed

1

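Not generic enough: T may not be Comparable, and in some cases comparing is expensive... - Boris Churzin

Not generic, I totally agree. Comparison is expensive; how would you solve that? - AJed

Sadly, you did not answer my question. Let me rephrase it: given a cost function c(n), how is O(n^2) any better? - AJed

Why c(n)? Take ArrayList<List<String>> as an example: its running time is O(n*m)*c(n). - Boris Churzin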

1

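Converting one input to a set and calling contains() in a loop (as Devenv suggests) takes O(n + m) time. Sorting is needlessly complicated and takes O(n log n + m log n + n) time. Granted, that reduces to O(n log n), but it is still worse than linear time and more complex. - dimo414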

5

One-liners since Java 8

Union

If there are no duplicates:

  return concat(a.stream(), b.stream()).collect(toList());

Union and distinct:

  return concat(a.stream(), b.stream()).distinct().collect(toList());

Union and distinct if the return type is Collection/Set:

  return concat(a.stream(), b.stream()).collect(toSet());

Intersection

If there are no duplicates:

  return a.stream().filter(b::contains).collect(toList());

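Performance: if collection b is large and its contains() is not O(1), pre-optimize the filter by adding one line before the return that copies b into a Set (import java.util.Set; note that Set.copyOf requires Java 10 or later):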

... b = Set.copyOf(b);

Intersection and distinct:

  return a.stream().distinct().filter(b::contains).collect(toList());

Imports:

import static java.util.stream.Stream.concat;
import static java.util.stream.Collectors.toList;
import static java.util.stream.Collectors.toSet;

- epox

4

You can use CollectionUtils from commons-collections4. See the CollectionUtils documentation.

Collection<Integer> collection1 = Arrays.asList(1, 2, 4, 5, 7, 8);
Collection<Integer> collection2 = Arrays.asList(2, 3, 4, 6, 8);

Collection<Integer> intersection = CollectionUtils.intersection(collection1, collection2);
System.out.println(intersection); // [2, 4, 8]

Collection<Integer> union = CollectionUtils.union(collection1, collection2);
System.out.println(union); // [1, 2, 3, 4, 5, 6, 7, 8]

Collection<Integer> subtract = CollectionUtils.subtract(collection1, collection2);
System.out.println(subtract); // [1, 5, 7]

- xxg

4

Here is a way to do an intersection with streams (remember that streams require Java 8):

List<foo> fooList1 = new ArrayList<>(Arrays.asList(new foo(), new foo()));
List<foo> fooList2 = new ArrayList<>(Arrays.asList(new foo(), new foo()));
fooList1.stream().filter(f -> fooList2.contains(f)).collect(Collectors.toList());

An example for lists of different types. If there is a relation between foo and bar, and you can get a bar object from a foo, you can modify the stream:

List<foo> fooList = new ArrayList<>(Arrays.asList(new foo(), new foo()));
List<bar> barList = new ArrayList<>(Arrays.asList(new bar(), new bar()));

fooList.stream().filter(f -> barList.contains(f.getBar())).collect(Collectors.toList());

- Deutro

- adarshr · Accepted Answer

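Here is a plain implementation that does not use any third-party library. Its main advantage over retainAll, removeAll and addAll is that these methods do not modify the original lists passed in.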

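import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Test {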

    public static void main(String... args) throws Exception {

        List<String> list1 = new ArrayList<String>(Arrays.asList("A", "B", "C"));
        List<String> list2 = new ArrayList<String>(Arrays.asList("B", "C", "D", "E", "F"));

        System.out.println(new Test().intersection(list1, list2));
        System.out.println(new Test().union(list1, list2));
    }

    public <T> List<T> union(List<T> list1, List<T> list2) {
        Set<T> set = new HashSet<T>();

        set.addAll(list1);
        set.addAll(list2);

        return new ArrayList<T>(set);
    }

    public <T> List<T> intersection(List<T> list1, List<T> list2) {
        List<T> list = new ArrayList<T>();

        for (T t : list1) {
            if(list2.contains(t)) {
                list.add(t);
            }
        }

        return list;
    }
}
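One thing to keep in mind with the union above is that HashSet makes no guarantee about iteration order, so the returned list may come back in a different order than the inputs. If order matters, a possible variation (a sketch with a hypothetical class name, not part of the accepted answer) is to use LinkedHashSet, which iterates in insertion order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class OrderedUnion {

    // Hypothetical variation of union(): elements appear in first-seen order,
    // list1 first and then the elements of list2 not already present.
    // Neither input list is modified.
    static <T> List<T> union(List<T> list1, List<T> list2) {
        Set<T> set = new LinkedHashSet<>(list1);
        set.addAll(list2);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> list1 = Arrays.asList("C", "A", "B");
        List<String> list2 = Arrays.asList("B", "D", "A");
        System.out.println(union(list1, list2)); // [C, A, B, D]
    }
}
```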