Find keys whose list shares at least n elements with another key's list

Question

10

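I have a LinkedHashMap<String, List<T>>. I am building the Map myself, so there may be a better way to organize all the data.

I am trying to get the keys whose lists have elements in common, with at least 2 common elements per pair of lists.

For example: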

Map
----------------------
| Key | Values       |
----------------------
| M1  | [A1, A3]     |
| M2  | [A1, A2, A3] |
| M3  | [A1, A2]     |
| M4  | [A2, A3]     |
----------------------

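In the end, I would like to get this list: [ [M2, M3], [M2, M4], [M1, M2] ]
- M2 and M3 both contain A1 and A2
- M2 and M4 both contain A2 and A3
- M1 and M2 both contain A1 and A3

I am stuck trying to work out how to compare the values of my first entry with the values of every other entry, and then repeat that until the end of the map is reached (like a nested for loop over a list).

My current solution is the following (but I definitely feel there could be a better way):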

List<String> keyList = new ArrayList<>(myMap.keySet());
for(int i = 0 ; i < keyList.size()-1 ; i++) {
    String keyA = keyList.get(i);
    List<T> valuesA = myMap.get(keyA);

    for(int j = i + 1 ; j < keyList.size() ; j++) {
        String keyB = keyList.get(j);
        List<T> valuesB = myMap.get(keyB);

        // compare both lists here
    }
}

Is using a Map the right approach?

Performance is not an issue for now, but something cleaner would always be better.

- kanadianDri3

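Are the values specific or random? - Shubham Bhewanewala

@ShubhamBhewanewala: Does that matter? - Nikolas Charalambidis

It does, if the number of values is limited. - Shubham Bhewanewala

@ShubhamBhewanewala Sorry, my computer crashed and then I left on vacation... The values are specific. I have a document containing the x words found in 1 article (words are Mx, articles are Ay). So for some selected article Ay, I have the Mx. Or, the other way round, I have the words Mx and am searching for them in the articles Ay (I hope that is clear). - kanadianDri3

5 Answers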

0

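You can take the following approach:

1) Iterate over the value of each key in the map (each value is a list).

2) For each of them, start a second iteration from the next index and run it to the end.

3) For each element of the list from #1, use the contains method to check whether it is in the list from #2:

a) Stop the iteration in #3 as soon as two common elements have been found.

b) If no common element has been found yet, iterate only up to the second-to-last element, because a single remaining element cannot produce two matches.

c) If one common element has been found, iterate up to the last element.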
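The stopping rules in step 3 can be sketched as a small helper. This is only an illustration, not the answerer's code: the class name EarlyExitCommon and the method sharesAtLeast are made up here, the threshold is passed as a parameter instead of being fixed at 2, and the lists are assumed to contain no duplicates.

```java
import java.util.Arrays;
import java.util.List;

class EarlyExitCommon {
    // Returns true as soon as `required` elements of `a` are also found in `b`.
    // Assumption: neither list contains duplicate elements.
    static <T> boolean sharesAtLeast(List<T> a, List<T> b, int required) {
        int found = 0;
        for (int i = 0; i < a.size(); i++) {
            // Rules b) and c): stop when the elements left in `a`
            // can no longer bring the count up to `required`.
            if (found + (a.size() - i) < required) {
                return false;
            }
            // Rule a): stop as soon as enough common elements were found.
            if (b.contains(a.get(i)) && ++found >= required) {
                return true;
            }
        }
        return found >= required;
    }

    public static void main(String[] args) {
        List<String> m1 = Arrays.asList("A1", "A3");
        List<String> m2 = Arrays.asList("A1", "A2", "A3");
        List<String> m3 = Arrays.asList("A1", "A2");
        System.out.println(sharesAtLeast(m1, m2, 2)); // true
        System.out.println(sharesAtLeast(m1, m3, 2)); // false
    }
}
```

Because contains on a List is a linear scan, this mainly saves work on long lists; for short lists like the ones in the question the gain is negligible.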

Hope this helps.

- Aman Chhabra

How do you ensure "start another iteration from the next index", given that no Map implementation allows iterating by index? Do you mean: https://dev59.com/JWYr5IYBdhLWcg3wkK6w? - Nikolas Charalambidis

0

Use Map<String, Set<T>> instead of List. Then your own algorithm works just fine.

Set<T> common = new HashSet<>(valuesA);
common.retainAll(valuesB);
if (common.size() > 1) ...

A List would also work for the map values, and even for common, but it is less efficient and less suitable.
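Put together with the double loop from the question, this could look roughly like the sketch below. The class name SetIntersectionPairs, the method pairsSharingAtLeast and the parameter n are illustrative only; the inner loop is assumed to start at i + 1 so that each pair of keys is compared once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SetIntersectionPairs {
    // Returns every pair of keys whose value sets share at least n elements.
    static <T> List<List<String>> pairsSharingAtLeast(Map<String, Set<T>> map, int n) {
        List<String> keys = new ArrayList<>(map.keySet()); // index access to the keys
        List<List<String>> pairs = new ArrayList<>();
        for (int i = 0; i < keys.size() - 1; i++) {
            for (int j = i + 1; j < keys.size(); j++) {
                // Copy first: retainAll modifies the set it is called on.
                Set<T> common = new HashSet<>(map.get(keys.get(i)));
                common.retainAll(map.get(keys.get(j)));
                if (common.size() >= n) {
                    pairs.add(Arrays.asList(keys.get(i), keys.get(j)));
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> map = new LinkedHashMap<>();
        map.put("M1", new HashSet<>(Arrays.asList("A1", "A3")));
        map.put("M2", new HashSet<>(Arrays.asList("A1", "A2", "A3")));
        map.put("M3", new HashSet<>(Arrays.asList("A1", "A2")));
        map.put("M4", new HashSet<>(Arrays.asList("A2", "A3")));
        System.out.println(pairsSharingAtLeast(map, 2)); // [[M1, M2], [M2, M3], [M2, M4]]
    }
}
```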

- Joop Eggen

0

OK, here is what you need. I used String instead of a generic type:

static Set<String> getCommonKeySet() {
    Map<String, List<String>> map = new HashMap<>();
    Set<String> listOfKeysWithCommonValueSet = new HashSet<>();

    map.forEach((key, value) -> {
        map.entrySet().forEach(entry1 -> {
            List<String> list = new ArrayList<>(value);
            List<String> list1 = new ArrayList<>(entry1.getValue());
            list.retainAll(list1);
            if (list.size() >= 2)
                listOfKeysWithCommonValueSet.add(key);
        });
    });
    return listOfKeysWithCommonValueSet;
}

Edit:

To return a Set<Set<String>>, use the following code:

static Set<Set<String>> getCommonKeySet() {

    Map<String, List<String>> map = new LinkedHashMap<>();
    map.put("M1", Arrays.asList("A1", "A3"));
    map.put("M2", Arrays.asList("A1", "A2", "A3"));
    map.put("M3", Arrays.asList("A1", "A2"));
    map.put("M4", Arrays.asList("A2", "A3"));
    //Map<String, List<String>> map = new HashMap<>();
    Set<Set<String>> listOfKeysWithCommonValueSet = new HashSet<>();

    map.forEach((key, value) -> {
        map.entrySet().forEach(entry1 -> {
            if(!entry1.getKey().equals(key))
            {
                List<String> list = new ArrayList<>(value);
                List<String> list1 = new ArrayList<>(entry1.getValue());
                list.retainAll(list1);
                if (list.size() >= 2)
                {
                    Set<String> set = new HashSet<>();
                    set.add(key);
                    set.add(entry1.getKey());
                    listOfKeysWithCommonValueSet.add(set);
                }
            }
        });
    });

    System.out.println(listOfKeysWithCommonValueSet);
    return listOfKeysWithCommonValueSet;
}

Output:

[[M1, M2], [M2, M3], [M2, M4]]

- Shubhendu Pramanik

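Even with the OP's Map as input, this solution outputs [M1, M2, M3, M4], which differs significantly from the OP's expected [[M2, M3], [M2, M4], [M1, M2]]. - Nikolas Charalambidis

@Nikolas I only gave a hint, not a complete solution. Anyway, we need to use a Set<Set>. I will update it. - Shubhendu Pramanik

Still incorrect; it produces [[M1], [M2], [M3], [M4], [M1, M2], [M2, M3], [M2, M4]]. The inner check should be if (set.size() > 1) { listOfKeysWithCommonValueSet.add(set); }. - Nikolas Charalambidis

@Nikolas Right, I need to skip identical keys :) This is not tested code. Thanks... I will update it :) - Shubhendu Pramanik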

0

The basic idea is to perform the following steps:

1) The original input map.

+------+--------------+
| Keys | Values       |
+------+--------------+
| M1   | [A1, A3]     |
+------+--------------+
| M2   | [A1, A2, A3] |
+------+--------------+
| M3   | [A1, A2]     |
+------+--------------+
| M4   | [A2, A3]     |
+------+--------------+

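2) Expand the map as follows. The idea is to split each value into subsets so that the keys can be grouped by value. For example, [A1, A3] is present in both M1 and M2, which means [A1, A3] is common to both lists.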

+------+--------------+-----------------------------+
| Keys | Values       |                             |
+------+--------------+-----------------------------+
| M1   | [A1, A3]     |                             |
+------+--------------+-----------------------------+
| M2   | [A1, A2, A3] | Expanding this entry        |
+------+--------------+ to create a mapping         +
| M2   | [A1, A2]     | of this key i.e. M2 with    |
+------+--------------+ all the possible            +
| M2   | [A1, A3]     | combinations of the         |
+------+--------------+ original value [A1, A2, A3] +
| M2   | [A2, A3]     |                             |
+------+--------------+-----------------------------+
| M3   | [A1, A2]     |                             |
+------+--------------+-----------------------------+
| M4   | [A2, A3]     |                             |
+------+--------------+-----------------------------+

3) Invert the mapping above so that a groupBy can be performed on the new keys.

+--------------+--------+
| Keys         | Values |
+--------------+--------+
| [A1, A3]     | M1     |
+--------------+--------+
| [A1, A2, A3] | M2     |
+--------------+--------+
| [A1, A2]     | M2     |
+--------------+--------+
| [A1, A3]     | M2     |
+--------------+--------+
| [A2, A3]     | M2     |
+--------------+--------+
| [A1, A2]     | M3     |
+--------------+--------+
| [A2, A3]     | M4     |
+--------------+--------+

4) The inverted map, with the common elements as keys and the corresponding original keys as values.

+--------------+---------+
| Keys         | Values  |
+--------------+---------+
| [A1, A3]     | [M1,M2] |
+--------------+---------+
| [A1, A2, A3] | [M2]    |
+--------------+---------+
| [A1, A2]     | [M2,M3] |
+--------------+---------+
| [A2, A3]     | [M2,M4] |
+--------------+---------+

Code:

The following code creates the inverted map described in step 4.

Map<List<Integer>, List<String>> invertedMap = inputMap.entrySet().stream()
    .flatMap(test::getAllCombinationsOfAList)
    .collect(Collectors.groupingBy(Entry::getKey,
        Collectors.mapping(Entry::getValue, Collectors.toList())));

where test::getAllCombinationsOfAList is as follows (it uses at most 2^list_len iterations):

static Stream<Entry<ArrayList<Integer>, String>> getAllCombinationsOfAList(Entry<String,List<Integer>> set)
{
    int n = set.getValue().size();
    Builder<Entry<ArrayList<Integer>,String>> entryStream = Stream.builder();
    for (int i = 0; i < (1<<n); i++)
    {
        ArrayList<Integer> integers = new ArrayList<>();
        for (int j = 0; j < n; j++) {
            if ((i & (1 << j)) > 0)
                integers.add(set.getValue().get(j));
        }

        if(integers.size() >=2 ) {
            entryStream.accept(new AbstractMap.SimpleEntry<>(integers,set.getKey()));
        }

    }
    return entryStream.build();
}

The output of invertedMap.entrySet().forEach(System.out::println); is then:

[1, 2, 3]=[M2]
[2, 3]=[M2, M4]
[1, 2]=[M2, M3]
[1, 3]=[M1, M2]

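Here, key_list_length is the minimum number of common elements required for a match.

value_list_length >= 2 means that a match actually exists.

So key_list_length >= 2 together with value_list_length >= 2 gives the output you expect. For example: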

Map<List<Integer>, List<String>> filteredMap = invertedMap.entrySet()
    .stream()
    .filter(e -> e.getKey().size() >=2 && e.getValue().size() >= 2)
    .collect(Collectors.toMap(Entry::getKey,Entry::getValue));

filteredMap.entrySet().forEach(System.out::println);

Output:

[1, 2]=[M2, M3]
[2, 3]=[M2, M4]
[1, 3]=[M1, M2]
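One possible variation on this approach: two lists share at least n elements exactly when they share some subset of size n, so it is enough to generate only the size-n subsets instead of every subset. The sketch below is an assumption-laden illustration rather than part of the original answer: the names FixedSizeCombinations, combinations and pairsSharingAtLeast are made up, values are String as in the question, each list is assumed to be free of duplicates, and each list is sorted so that equal subsets compare equal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FixedSizeCombinations {
    // Collects every size-n combination of `sorted` (elements kept in order) into `out`.
    static void combinations(List<String> sorted, int n, int start,
                             List<String> current, List<List<String>> out) {
        if (current.size() == n) {
            out.add(new ArrayList<>(current));
            return;
        }
        for (int i = start; i < sorted.size(); i++) {
            current.add(sorted.get(i));
            combinations(sorted, n, i + 1, current, out);
            current.remove(current.size() - 1); // backtrack
        }
    }

    // Groups keys by shared size-n subset, then emits each pair of keys once.
    static Set<List<String>> pairsSharingAtLeast(Map<String, List<String>> map, int n) {
        Map<List<String>, List<String>> inverted = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : map.entrySet()) {
            List<String> sorted = new ArrayList<>(e.getValue());
            Collections.sort(sorted); // same order everywhere, so equal subsets are equal lists
            List<List<String>> subsets = new ArrayList<>();
            combinations(sorted, n, 0, new ArrayList<>(), subsets);
            for (List<String> subset : subsets) {
                inverted.computeIfAbsent(subset, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        // A set, because two keys sharing more than n elements meet under several subsets.
        Set<List<String>> pairs = new LinkedHashSet<>();
        for (List<String> keys : inverted.values()) {
            for (int i = 0; i < keys.size() - 1; i++) {
                for (int j = i + 1; j < keys.size(); j++) {
                    pairs.add(Arrays.asList(keys.get(i), keys.get(j)));
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new LinkedHashMap<>();
        map.put("M1", Arrays.asList("A1", "A3"));
        map.put("M2", Arrays.asList("A1", "A2", "A3"));
        map.put("M3", Arrays.asList("A1", "A2"));
        map.put("M4", Arrays.asList("A2", "A3"));
        System.out.println(pairsSharingAtLeast(map, 2)); // [[M1, M2], [M2, M3], [M2, M4]]
    }
}
```

This replaces up to 2^list_len subsets per key with "list_len choose n" subsets, which only pays off when n is small compared with the list length.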

- Pankaj Singhal

- Nikolas Charalambidis · Accepted Answer

I notice that you need a List<List<Output>>, whose structure corresponds to [ [M2, M3], [M2, M4], [M1, M2] ].

Consider the same input:

Map<String, List<String>> map = new LinkedHashMap<>();      
map.put("M1", Arrays.asList("A1", "A3"));
map.put("M2", Arrays.asList("A1", "A2", "A3"));
map.put("M3", Arrays.asList("A1", "A2"));
map.put("M4", Arrays.asList("A2", "A3"));

Here is a working solution:

List<List<String>> output = new ArrayList<>();   // The output List
Set<String> keys = new HashSet<>();              // Key storage used to avoid comparison                            
                                                 // of the keys twice (M1-M2, M2-M1)

for (Entry<String, List<String>> entryOuter: map.entrySet()) {               // First iteration
    if (keys.add(entryOuter.getKey())) {                                     // Adds a new key
        for (Entry<String, List<String>> entryInner: map.entrySet()) {       // Second iteration 
            if (!keys.contains(entryInner.getKey())) {                       // To compare?
                List<String> common = new ArrayList<>(entryOuter.getValue());
                common.retainAll(new ArrayList<>(entryInner.getValue()));    // The common items
                if (common.size() > 1) {                                     // At least 2 common?
                    output.add(Arrays.asList(
                        entryOuter.getKey(), entryInner.getKey()));          // Add these keys
                }
            }
        }
    }       
}

Calling System.out.println(output); prints the desired result:

[[M1, M2], [M2, M3], [M2, M4]]

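The idea, briefly described:

The goal is to visit each pair of distinct keys only once, which gives 6 comparisons for the 4 keys here.
Use Set<String> keys to store the "already checked" keys.
When a unique combination occurs, find the common values.
If the number of common values is 2 or more, add the keys as a pair to the output List.
Done.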

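You tagged java-8, so I suspect you may want to use java-stream, but it offers no real benefit here. A LinkedHashMap will not make index-based iteration any easier either, unless you implement a workaround: How to get a value from LinkedHashMap based on index, not on key?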
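For comparison, here is roughly what a stream-based version could look like. It is only a sketch: the class name StreamPairs, the helper commonCount and the threshold parameter n are made up for illustration, and the key set is copied into an ArrayList to get index access.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StreamPairs {
    // Counts the distinct elements of `a` that also occur in `b`.
    static long commonCount(List<String> a, List<String> b) {
        Set<String> lookup = new HashSet<>(b);
        return a.stream().distinct().filter(lookup::contains).count();
    }

    // Returns every pair of keys whose lists share at least n elements.
    static List<List<String>> pairs(Map<String, List<String>> map, int n) {
        List<String> keys = new ArrayList<>(map.keySet()); // index access to the keys
        return IntStream.range(0, keys.size()).boxed()
            .flatMap(i -> IntStream.range(i + 1, keys.size()) // only keys after position i
                .filter(j -> commonCount(map.get(keys.get(i)), map.get(keys.get(j))) >= n)
                .mapToObj(j -> Arrays.asList(keys.get(i), keys.get(j))))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new LinkedHashMap<>();
        map.put("M1", Arrays.asList("A1", "A3"));
        map.put("M2", Arrays.asList("A1", "A2", "A3"));
        map.put("M3", Arrays.asList("A1", "A2"));
        map.put("M4", Arrays.asList("A2", "A3"));
        System.out.println(pairs(map, 2)); // [[M1, M2], [M2, M3], [M2, M4]]
    }
}
```

It is still the same nested iteration over key positions, only written with IntStream, so it is arguably no more readable than the plain loops.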