Removing duplicates from an ArrayList of ArrayLists

Question

7

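I've run into a problem whose solution I'm sure is very simple, but I can't find it. I have an ArrayList that contains smaller ArrayLists, and those lists hold String elements. I want to merge the small lists into one and remove the duplicates. Let me show exactly what I mean.

I have this: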

[[USA, Maine], [USA, Maine, Kennebunk], [USA, Maine, North Berwick], 
[USA, New Hampshire], [USA, Keene, New Hampshire], [USA, Keene, New 
Hampshire, Main Street], [USA, New Hampshire, Swanzey]].

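This is my main list, with the smaller lists inside it. I want a final ArrayList that merges all the small lists and removes the duplicates.

What I want is:

[USA, Maine, Kennebunk, North Berwick, New Hampshire, Keene, Main Street, Swanzey]

Any help is much appreciated. Thank you.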

- hristoforidisc

6 Answers

3

This is easy to do with a Set (a Set does not allow duplicate values):

public List<String> merge(List<List<String>> list) {
    Set<String> uniques = new HashSet<>();
    for(List<String> sublist : list) {
        uniques.addAll(sublist);
    }
    return new ArrayList<>(uniques);
}

P.S. If you want the merged list to be sorted, change HashSet to TreeSet, like this: Set<String> uniques = new TreeSet<>();

- fxrbfg

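Note that TreeSet uses the natural ordering of its elements (or a Comparator passed at construction time), not insertion order. This means the result is sorted alphabetically rather than in the order of the original lists. - siegi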
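
To make the ordering difference concrete, here is a minimal sketch that runs the same merge through three Set implementations. The class name SetOrderDemo and the helper mergeInto are made up for illustration and do not come from any answer in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // Hypothetical helper: pours every sublist into the given set,
    // then copies the set into a new list in the set's iteration order.
    static List<String> mergeInto(Set<String> target, List<List<String>> lists) {
        for (List<String> sublist : lists) {
            target.addAll(sublist);
        }
        return new ArrayList<>(target);
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
            Arrays.asList("USA", "Maine"),
            Arrays.asList("USA", "Maine", "Kennebunk"),
            Arrays.asList("USA", "New Hampshire"));

        // TreeSet sorts by natural order: [Kennebunk, Maine, New Hampshire, USA]
        System.out.println(mergeInto(new TreeSet<>(), lists));
        // LinkedHashSet keeps first-seen order: [USA, Maine, Kennebunk, New Hampshire]
        System.out.println(mergeInto(new LinkedHashSet<>(), lists));
        // HashSet makes no ordering promise, so the printed order may vary
        System.out.println(mergeInto(new HashSet<>(), lists));
    }
}
```

All three results contain the same four strings; only the iteration order differs.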

1

The traditional solution:

Set<String> result = new LinkedHashSet<>();
for (List<String> innerList : filmingLocations) result.addAll(innerList);

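Because result is a LinkedHashSet, it preserves insertion order, so the elements appear in the same order as in the inner lists.

You can also use the equivalent Java 8 solution: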

Set<String> result = new LinkedHashSet<>();
filmingLocations.forEach(result::addAll);

Or even a Java 8 stream-based solution:

Set<String> result = filmingLocations.stream()
    .flatMap(List::stream)
    .collect(Collectors.toCollection(LinkedHashSet::new));
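
The question asks for an ArrayList as the final result, so the LinkedHashSet can be copied into one as a last step. The following is only a sketch of that idea; the class name StreamToArrayList and the method name merge are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class StreamToArrayList {
    // Flattens the inner lists, drops duplicates while keeping first-seen order,
    // then copies the surviving elements into an ArrayList.
    static ArrayList<String> merge(List<List<String>> filmingLocations) {
        Set<String> unique = filmingLocations.stream()
            .flatMap(List::stream)
            .collect(Collectors.toCollection(LinkedHashSet::new));
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<List<String>> filmingLocations = Arrays.asList(
            Arrays.asList("USA", "Maine"),
            Arrays.asList("USA", "Maine", "Kennebunk"));
        System.out.println(merge(filmingLocations)); // [USA, Maine, Kennebunk]
    }
}
```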

- fps

1

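Thanks, I knew the JDK had a Set that preserves insertion order, but I can never remember its name (and it isn't mentioned in the Javadoc of Collection or Set, even though HashSet, TreeSet and even SortedSet are) :-P - siegi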

0

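If you are targeting a version below Java 8, you can create your final ArrayList instance (let's call it "resultList"), then iterate over each inner ArrayList and add only those Strings for which contains() returns false. This approach only makes sense if you must use an ArrayList as the final collection. Otherwise, consider a HashSet, which keeps only unique values and discards duplicates automatically. If you do need an ArrayList as the result collection, the following code may help: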

ArrayList<ArrayList<String>> sourceList = new ArrayList<>();
// Adding sample ArrayLists ("a" and "b") of Strings to sourceList:
ArrayList<String> a = new ArrayList<>();
a.add("USA");
a.add("Maine");
sourceList.add(a);
ArrayList<String> b = new ArrayList<>();
b.add("USA");
b.add("Maine");
b.add("Kennebunk");
sourceList.add(b);
ArrayList<String> resultList = new ArrayList<>();
for (ArrayList<String> innerList : sourceList) {
    for (String str : innerList) {
        // If resultList doesn't contain the currently checked string...
        if (!resultList.contains(str)) {
            // ...add this string to resultList.
            resultList.add(str);
        }
    }
}
System.out.println(resultList);

The output you get: [USA, Maine, Kennebunk]

- Przemysław Moskal

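Yes, I have to use an ArrayList as my final collection. Do you have any ideas for the code? - hristoforidisc

You can use two nested for-each loops. In the outer loop you iterate over the outer ArrayList, and in the inner loop you iterate over each inner list. In the body of the inner loop you check whether the final ArrayList already contains the string, and if it doesn't, you add it. - Przemysław Moskal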

0

Solution:
Loop over every String in your ArrayList of ArrayLists and add it to another ArrayList, using that list's .contains() method to skip any String that is already there.

Code:

  public ArrayList<String> merge(ArrayList<ArrayList<String>> startArrayList) {
    ArrayList<String> finalArrayList = new ArrayList<String>();
    //Iterate over each element
    for (ArrayList<String> innerList:startArrayList) {
      for (String value:innerList) {
        //add the String if it is missing
        if (!finalArrayList.contains(value))
          finalArrayList.add(value);
      }
    }
    return finalArrayList;
  }
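
One caveat with this pattern: ArrayList.contains() scans the list linearly, so for large inputs the repeated checks add up. A possible variation, sketched here under the assumption that extra memory for a helper set is acceptable, tracks already-seen values in a HashSet while still returning an ArrayList in first-seen order. The class name SeenSetMerge is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SeenSetMerge {
    static ArrayList<String> merge(List<List<String>> lists) {
        ArrayList<String> result = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (List<String> inner : lists) {
            for (String value : inner) {
                // Set.add returns false when the value was already present
                if (seen.add(value)) {
                    result.add(value);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
            Arrays.asList("USA", "Maine"),
            Arrays.asList("USA", "Maine", "Kennebunk"),
            Arrays.asList("USA", "Maine", "North Berwick"),
            Arrays.asList("USA", "New Hampshire"),
            Arrays.asList("USA", "Keene", "New Hampshire"),
            Arrays.asList("USA", "Keene", "New Hampshire", "Main Street"),
            Arrays.asList("USA", "New Hampshire", "Swanzey"));
        // [USA, Maine, Kennebunk, North Berwick, New Hampshire, Keene, Main Street, Swanzey]
        System.out.println(merge(lists));
    }
}
```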

- ProfBits

0

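I saw this post and had to answer: Berwick and Kennebunk are towns I've lived in. Are you a local?

Anyway, the simplest way is to use a Set, as mentioned above. With a HashSet each duplicate check takes O(1) time on average; a TreeSet would take O(log n).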

public List<String> mergeTowns (List<List<String>> list) {
    Set<String> uniques = new HashSet<>();
    for(List<String> sublist : list) {
        uniques.addAll(sublist);
    }
    return new ArrayList<>(uniques);
}

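If you need a more dynamic data structure, you could use a map with the country as the key and the towns as the value. That way, if you decide to build a large database of towns across different countries, you can look up a country in the map and display its towns. You might use the state name as the key instead of the country.

The resulting data structure would look like this: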

[USA = [berwick, kennebunk, north berwick, wells], CANADA = [berwick, kennebunk, north berwick, wells], MEXICO = [berwick, kennebunk, north berwick, wells]]

Building the data structure this way prevents duplicate town entries within the same country/state.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class Merge {

    // Renders each map entry as "country = [towns]"
    private static List<String> mergeMap(Map<String, Set<String>> map) {
        List<String> data = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : map.entrySet()) {
            String country = entry.getKey();
            Set<String> towns = entry.getValue();
            data.add(country + " = " + towns);
        }
        return data;
    }

    public static void main(String[] args) {
        // Mock data
        String[] countries = {"USA", "CANADA", "MEXICO"};

        // Try building your data structure this way instead of as an ArrayList of ArrayLists.
        Map<String, Set<String>> map = new HashMap<>();
        Set<String> towns = new TreeSet<>();

        // Add a couple of towns to your set of towns; repeated values are ignored
        towns.add("berwick");
        towns.add("north berwick");
        towns.add("kennebunk");
        towns.add("kennebunk");
        towns.add("kennebunk");
        towns.add("kennebunk");
        towns.add("wells");
        towns.add("wells");

        // With a map you could push a different set of towns to different countries
        // (this mock data reuses the same set for every country)
        for (String country : countries) {
            map.put(country, towns);
        }

        // Pass in your map<Country, Towns>
        List<String> mergedValues = mergeMap(map);
        System.out.println(mergedValues);
    }
}
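
The mock data above stores one shared set under every key. If each country should get its own set of towns, Map.computeIfAbsent (available since Java 8) is one way to create the set on first use. This is a sketch under that assumption; the class TownsByCountry and its methods are invented for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TownsByCountry {
    // LinkedHashMap keeps countries in the order they were first added
    private final Map<String, Set<String>> towns = new LinkedHashMap<>();

    // Creates the country's set on first use; the set ignores repeated towns
    void add(String country, String town) {
        towns.computeIfAbsent(country, k -> new TreeSet<>()).add(town);
    }

    // Returns the towns recorded for a country, or an empty set if there are none
    Set<String> townsIn(String country) {
        return towns.getOrDefault(country, Collections.emptySet());
    }

    @Override
    public String toString() {
        return towns.toString();
    }

    public static void main(String[] args) {
        TownsByCountry db = new TownsByCountry();
        db.add("USA", "kennebunk");
        db.add("USA", "berwick");
        db.add("USA", "kennebunk"); // duplicate, ignored by the set
        db.add("CANADA", "wells");
        System.out.println(db); // {USA=[berwick, kennebunk], CANADA=[wells]}
    }
}
```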

- John Hanewich


- siegi · Accepted Answer

Here is a concise solution using the Stream class:

listOfLists.stream().flatMap(List::stream).collect(Collectors.toSet())

Note that the result is of type Set, which takes care of removing the duplicates.

If you need a List, you can use the following code:

listOfLists.stream()
           .flatMap(List::stream)
           .distinct()
           .collect(Collectors.toList())

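Note that this even guarantees a stable element order, i.e. [["foo","bar"],["bar","abc","foo"]] always yields ["foo","bar","abc"] in that order. Most Set-based solutions cannot guarantee this, because most Set implementations are unordered.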
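
As a quick check of that ordering claim, the example input can be run through the distinct() pipeline in a small self-contained program. The class name DistinctOrderDemo and the method name merge are placeholders chosen for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctOrderDemo {
    // For an ordered stream, distinct() keeps the first occurrence of each element
    static List<String> merge(List<List<String>> listOfLists) {
        return listOfLists.stream()
            .flatMap(List::stream)
            .distinct()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> listOfLists = Arrays.asList(
            Arrays.asList("foo", "bar"),
            Arrays.asList("bar", "abc", "foo"));
        System.out.println(merge(listOfLists)); // [foo, bar, abc]
    }
}
```

If the caller specifically needs an ArrayList, the returned list can be passed to the ArrayList copy constructor.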