Removing elements that have consecutive duplicates

Question

66

What I came up with is this:

list = [1,1,1,1,1,1,2,3,4,4,5,1,2]
i = 0

while i < len(list)-1:
    if list[i] == list[i+1]:
        del list[i]
    else:
        i = i+1

Output:

[1, 2, 3, 4, 5, 1, 2]

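I guess that's OK.

So I got curious and wanted to see whether I could remove every element that has consecutive duplicates and get this output: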

[2, 3, 5, 1, 2]

For that I did this:

list = [1,1,1,1,1,1,2,3,4,4,5,1,2]
i = 0
dupe = False

while i < len(list)-1:
    if list[i] == list[i+1]:
        del list[i]
        dupe = True
    elif dupe:
        del list[i]
        dupe = False
    else:
        i += 1

But it seems a bit clumsy and not very Pythonic. Do you have a smarter, more elegant, or more efficient way to implement this?

- Trufa

1

For very long lists, consider using NumPy; see the question on removing duplicates in a NumPy array. - Georgy

9 Answers

31

A one-liner in pure Python:

[v for i, v in enumerate(your_list) if i == 0 or v != your_list[i-1]]

- Ulf Aslak

15

If you are on Python 3.8+, you can use the assignment expression operator :=

list1 = [1, 2, 3, 3, 4, 3, 5, 5]

prev = object()
list1 = [prev:=v for v in list1 if prev!=v]

print(list1)

Prints:

[1, 2, 3, 4, 3, 5]

- Andrej Kesely

4

A "lazy" approach is to use itertools.groupby.

import itertools

list1 = [1, 2, 3, 3, 4, 3, 5, 5]
list1 = [g for g, _ in itertools.groupby(list1)]
print(list1)

Output

[1, 2, 3, 4, 3, 5]

- DeepSpace

3

You can use zip_longest() together with a list comprehension to do this.

from itertools import zip_longest

list1 = [1, 2, 3, 3, 4, 3, 5, 5]

# Pair each element with its successor; zip_longest pads the final pair with None,
# so the last element is compared against None and kept.
res = [i for i, j in zip_longest(list1, list1[1:]) if i != j]
print("List after removing consecutive duplicates : " + str(res))

- Geno C

2

Here is a solution that does not depend on outside packages:

list = [1,1,1,1,1,1,2,3,4,4,5,1,2] 
L = list + [999]  # append a unique dummy element to properly handle -1 index
[l for i, l in enumerate(L) if l != L[i - 1]][:-1] # drop the dummy element

Then I noticed that Ulf Aslak's solution is more concise :)

- Oleg Melnikov

1

To eliminate consecutive duplicates of list elements, you can alternatively use itertools.zip_longest() with a list comprehension:

>>> from itertools import zip_longest

>>> my_list = [1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> [i for i, j in zip_longest(my_list, my_list[1:]) if i!=j]
[1, 2, 3, 4, 5, 1, 2]

- Moinuddin Quadri
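
One caveat with the zip_longest() answers above: the default fill value is None, so a list that ends in None has its last element compared against the padding and dropped. A minimal sketch of a workaround, assuming a private sentinel object as the fill value (the names _MISSING and dedupe_with_sentinel are hypothetical, chosen for illustration):

```python
from itertools import zip_longest

_MISSING = object()  # hypothetical sentinel; never equal to any list element


def dedupe_with_sentinel(items):
    """Collapse each run of equal neighbours to a single element."""
    # Pair every element with its successor; the last element is paired
    # with the sentinel, so it is always kept.
    pairs = zip_longest(items, items[1:], fillvalue=_MISSING)
    return [current for current, following in pairs if current != following]


print(dedupe_with_sentinel([1, 1, 2, None, None]))  # [1, 2, None]
```

With the default fill value the same input would come back as [1, 2].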

1

There are plenty of better, more Pythonic answers above, but this task can also be done with list.pop():

my_list = [1, 2, 3, 3, 4, 3, 5, 5]
for x in my_list[:-1]:
    next_index = my_list.index(x) + 1
    if my_list[next_index] == x:
        my_list.pop(next_index)

Output

[1, 2, 3, 4, 3, 5]

- plum 0
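
Note that my_list.index(x) returns the position of the first occurrence of x, so the list.pop() loop above can miss a repeated run of a value that already appeared earlier in the list (for example the 1, 1 in [1, 2, 1, 1, 3, 3]). A minimal sketch of an in-place alternative that works on indices directly and walks backwards; this swaps in del for list.pop(), and the name dedupe_in_place is hypothetical:

```python
def dedupe_in_place(items):
    """Delete repeated neighbours from items, mutating the list."""
    # Walk from the end so deletions never shift indices still to be visited.
    for index in range(len(items) - 1, 0, -1):
        if items[index] == items[index - 1]:
            del items[index]
    return items


my_list = [1, 2, 1, 1, 3, 3]
dedupe_in_place(my_list)
print(my_list)  # [1, 2, 1, 3]
```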

0

Another possible one-liner, using functools.reduce (not counting the import). The downside is that strings and lists need slightly different implementations:

>>> from functools import reduce

>>> reduce(lambda a, b: a if a[-1:] == [b] else a + [b], [1,1,2,3,4,4,5,1,2], [])
[1, 2, 3, 4, 5, 1, 2]

>>> reduce(lambda a, b: a if a[-1:] == b else a+b, 'aa  bbb cc')
'a b c'

- Yuri Feldman
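
The reduce version above needs separate code for strings and lists because it builds the result by concatenation. A generator avoids that by yielding items and leaving the choice of container to the caller. A minimal sketch, assuming a hypothetical helper named iter_dedupe:

```python
def iter_dedupe(iterable):
    """Yield items from iterable, skipping any that equal the previous item."""
    previous = object()  # sentinel that compares unequal to every real item
    for item in iterable:
        if item != previous:
            yield item
        previous = item


print(list(iter_dedupe([1, 1, 2, 3, 4, 4, 5, 1, 2])))  # [1, 2, 3, 4, 5, 1, 2]
print("".join(iter_dedupe("aa  bbb cc")))               # a b c
```

Because it only looks at one item at a time, the same function also works on iterators that cannot be sliced or indexed.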

- John La Rooy · Accepted Answer

>>> L = [1,1,1,1,1,1,2,3,4,4,5,1,2]
>>> from itertools import groupby
>>> [key for key, _group in groupby(L)]
[1, 2, 3, 4, 5, 1, 2]

For the second part:

>>> [k for k, g in groupby(L) if len(list(g)) < 2]
[2, 3, 5, 1, 2]

If you don't want to create a temporary list just to get the length, you can use sum over a generator expression:

>>> [k for k, g in groupby(L) if sum(1 for i in g) < 2]
[2, 3, 5, 1, 2]