在Python中如何在 < n^2的时间内查找任意大小的列表的重复项？

Question

在Python中如何在 < n^2的时间内查找任意大小的列表的重复项？

3

如何在Python中找到$n^2$以下的列表中的重复项？我不能像对于所有标准类型那样使用字典以线性时间完成。我只能想到以下解决方案：

arr = [[1,2], [1,2,4], [1,2], [5,6], [8], [8]]
unique_arr = []
dups = []

for item in arr:
    for item2 in unique_arr:
         if (item == item2).all():
            dups.append(item)
            continue
    unique_arr.append(item)

dups 期望的结果为 [[1,2], [8]]

谢谢

- YohanRoth

我只能想到以下的解决方案：- 为什么？ - RomanPerekhrest

2

你可以将列表转换为可哈希的元组，因为它们只包含整数。元组是可哈希的。 - user3483203

排一下这个列表怎么样？ - Norhther

@Tyger 是吗？在那个问题中，大多数答案都是针对具有此格式 [value，counter] 的情况。我有一个任意大小的列表... - YohanRoth

4个回答

0

我不明白为什么你需要遍历内部列表... 你可以直接遍历外部列表。

arr = [[1,2], [1,2,4], [1,2], [5,6], [8], [8]]
unique_arr = []
dups = []

for item in arr:
    if item not in unique_arr:
        unique_arr.append(item)
    else:
        unique_arr.append(item)

- jdub

我认为“不在”功能也是通过循环实现的。因此它就是相同的东西。 - YohanRoth

1

这仍然具有相同的复杂度，只是速度更快，因为列表比较可能是用C而不是Python编写的，并使用正确的算法。 - Cireo

@Cireo 哦，我明白了。但实际的时间复杂度是相同的，而我正在寻找 smth < n^2。 - YohanRoth

0

这里还有一个解决方案：

arr = [[1,2], [1,2,4], [1,2], [5,6], [8], [8]]

dic = {}
dups = []

for ele in arr:
    try:
        if dic[str(ele)] is 1:
            dups.append(ele)
    except:
        dic[str(ele)] = 1


print(dups)

输出：

[[1, 2], [8]]

- Er. Harsh Rathore

从列表前面弹出的操作是O(n)，所以这将是O(n^2)吗？ - Manvir

它仍然是O(n^2)，因为if popped in arr:是O(n)。 - Manvir

@Tyger 现在复杂度怎么样？ - Er. Harsh Rathore

@Tyger 访问或插入字典数据的复杂度为O(1)。 - Er. Harsh Rathore

我很欣赏您的毅力，你可能想看一下这个链接：https://dev59.com/XnI-5IYBdhLWcg3weoPR - Manvir

好的，我同意你的说法。被接受代码的时间是24.07005401099923，而我的代码时间是26.193061662999753。 - Er. Harsh Rathore

0

虽然你不能使用列表作为字典键，但你可以使用元组。

arr = [[1,2], [1,2,4], [1,2], [5,6], [8], [8]]
dups = []
found = set()
for item in arr:
    tup = tuple(item)
    if tup in found:
        dups.append(list(tup))
    found.add(tup)
print(dups)

- Malcolm

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Andrej Kesely · Accepted Answer

使用 collections.Counter 的一种可能解决方案：

arr = [[1,2], [1,2,4], [1,2], [5,6], [8], [8]]

from collections import Counter

print([[*i] for i,c in Counter(map(tuple, arr)).items() if c > 1])

输出：

[[1, 2], [8]]

使用 itertools.groupby 和 sorted 的版本：

from itertools import groupby
print([v for v, g in groupby(sorted(arr, key=len)) if any(i > 0 for i, _ in enumerate(g))])

输出：

[[8], [1, 2]]