Remove duplicates from a list of tuples based on one value in the tuple

Question


3

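I have a list of tuples of the form (float, string). How can I remove the entries that share the same float value, keeping only one of them?

The list is sorted by the float in descending order, and I want to preserve that order.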

[(0.10507038451969995,
  'Deadly stampede in Shanghai - Emergency personnel help victims.'),
 (0.078586381821416265,
  'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
 (0.072031446647399661, '- Emergency personnel help victims.'),
 (0.072031446647399661, 'Emergency personnel help victims.')]

Note the last two entries.

- Abhishek Bhatia

Oh, why the downvote? If this has already been asked elsewhere, please let me know. - Abhishek Bhatia

3 Answers

4

You can keep a set of the float values seen so far, and append a tuple only if its float is not yet in seen:

>>> lst
[(0.10507038451969995,
 'Deadly stampede in Shanghai - Emergency personnel help victims.'),
 (0.078586381821416265,
 'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
 (0.072031446647399661, '- Emergency personnel help victims.'),
 (0.072031446647399661, 'Emergency personnel help victims.')]

>>> seen = set()
>>> result = []
>>> for a, b in lst:
...    if a not in seen:
...        seen.add(a)
...        result.append((a, b))
...
>>> print(result)

[(0.10507038451969995, 'Deadly stampede in Shanghai - Emergency personnel help victims.'), 
 (0.07858638182141627, 'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),  
 (0.07203144664739966, '- Emergency personnel help victims.')]
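
The loop above can be wrapped in a small reusable helper. This is only a minimal sketch: the function name dedupe_by_key, its key parameter, and the shortened sample data are illustrative assumptions, not part of the original answer.

```python
def dedupe_by_key(pairs, key=lambda pair: pair[0]):
    """Return the items whose key has not appeared earlier, in original order."""
    seen = set()      # keys encountered so far
    result = []
    for item in pairs:
        k = key(item)
        if k not in seen:
            seen.add(k)
            result.append(item)   # the first item for each key wins
    return result


# Hypothetical sample data, shortened from the list in the question.
sample = [(0.3, 'a'), (0.2, 'b'), (0.1, 'c'), (0.1, 'd')]
print(dedupe_by_key(sample))  # [(0.3, 'a'), (0.2, 'b'), (0.1, 'c')]
```

Because membership is tracked in a set, this does not depend on the list being sorted, and each lookup takes constant time on average.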

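Here is another way, using a list comprehension. It is more compact, and it works because seen.add(a) returns None, so the condition is true exactly the first time a value appears: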

>>> seen = set()
>>> [(a, b) for a, b in lst if not (a in seen or seen.add(a))]

- Ozgur Vatansever

2

>>> L = [(0.10507038451969995, 'Deadly stampede in Shanghai - Emergency personnel help victims.'),
...  (0.078586381821416265, 'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
...  (0.072031446647399661, '- Emergency personnel help victims.'),
...  (0.072031446647399661, 'Emergency personnel help victims.')]

>>> from collections import OrderedDict
>>> list(OrderedDict(L).items())
[(0.10507038451969995, 'Deadly stampede in Shanghai - Emergency personnel help victims.'),
 (0.07858638182141627, 'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
 (0.07203144664739966, 'Emergency personnel help victims.')]
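
Note that in the output above the surviving entry for the duplicated float is the last one ('Emergency personnel help victims.'), because a later duplicate key overwrites the earlier value while the key keeps its original position. If the first tuple should win instead, one possible variation is setdefault. The sketch below uses hypothetical, shortened sample data to show both behaviours side by side.

```python
from collections import OrderedDict

# Hypothetical sample data, shortened from the list in the question.
sample = [(0.3, 'a'), (0.2, 'b'), (0.1, 'c'), (0.1, 'd')]

# Plain construction: a later duplicate key replaces the earlier value,
# but the key stays at the position where it was first inserted.
keep_last = list(OrderedDict(sample).items())

# setdefault stores a value only when the key is not present yet,
# so the first tuple for each float is the one that is kept.
first_seen = OrderedDict()
for number, text in sample:
    first_seen.setdefault(number, text)
keep_first = list(first_seen.items())

print(keep_last)   # [(0.3, 'a'), (0.2, 'b'), (0.1, 'd')]
print(keep_first)  # [(0.3, 'a'), (0.2, 'b'), (0.1, 'c')]
```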

- John La Rooy


- Russia Must Remove Putin · Accepted Answer

Since your values are already sorted, you can use itertools.groupby. Here is the data:

>>> lot
[(0.10507038451969995, 'Deadly stampede in Shanghai - Emergency personnel help victims.'), 
(0.07858638182141627, 'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'), 
(0.07203144664739966, '- Emergency personnel help victims.'), 
(0.07203144664739966, 'Emergency personnel help victims.')]

Demo:

>>> import itertools
>>> [next(t) for _, t in itertools.groupby(lot, lambda x: x[0])]
[(0.10507038451969995,
  'Deadly stampede in Shanghai - Emergency personnel help victims.'),
 (0.07858638182141627,
  'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
 (0.07203144664739966, '- Emergency personnel help victims.')]

This gives you the first tuple of each group.
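
One caveat worth illustrating: itertools.groupby only groups consecutive items with equal keys, which is why the sorted order matters here. The sketch below uses a hypothetical helper name, first_of_each_run, and made-up sample data to show what happens when equal floats are not adjacent.

```python
import itertools


def first_of_each_run(pairs):
    """Keep the first tuple of every run of consecutive equal floats."""
    return [next(group)
            for _, group in itertools.groupby(pairs, key=lambda pair: pair[0])]


# Hypothetical sample data.
sorted_pairs = [(0.3, 'a'), (0.2, 'b'), (0.1, 'c'), (0.1, 'd')]
unsorted_pairs = [(0.1, 'c'), (0.3, 'a'), (0.1, 'd')]

# Equal floats are adjacent, so the duplicate is dropped.
print(first_of_each_run(sorted_pairs))    # [(0.3, 'a'), (0.2, 'b'), (0.1, 'c')]

# Equal floats are separated, so groupby treats them as two runs.
print(first_of_each_run(unsorted_pairs))  # [(0.1, 'c'), (0.3, 'a'), (0.1, 'd')]
```

For input that is not sorted by the float, a set-based approach is the safer choice.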