Remove items (sublists) from a nested list based on element comparison

Question

python for-loop nested-lists

3

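This is my first post, so I hope I'm not asking a duplicate question (I did check first).

Here is the problem:

I have a list of 4-element sublists, for example [[10,1,3,6],[22,3,5,7],[2,1,4,7],[44,3,1,0]]

What I want to do is:

1) Remove every item whose fourth element is zero, e.g. [44,3,1,0] (the easy part)

2) Among items that share the same second element, keep only the one with the largest first element, e.g. [[10,1,3,6],[2,1,4,7]] -> [10,1,3,6]

I've been trying nested loops with a second list to hold the items I want to keep, but I can't seem to get it right.

Is there an elegant solution I could use?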

- Orestis

2

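In your first case, the list has no sublist whose fourth element equals 0. - Rohit Jain

I don't know what your exact solution would be, but I have a feeling itertools will help. - Burhan Khalid

1

No. In your original list, there is no item whose fourth element equals 0. - sizzzzlerz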

4 Answers

2

You can use itertools.groupby:

from itertools import groupby
from operator import itemgetter as ig

data = [[10,1,3,6],[22,3,5,7],[2,1,4,7],[44,3,1,0]]

# filter and sort by main key
valid_sorted = sorted((el for el in data if el[3] != 0), key=ig(1))
# ensure identical keys have highest first element first
valid_sorted.sort(key=ig(0), reverse=True)
# group by second element
grouped = groupby(valid_sorted, ig(1))
# take first element for each key
selected = [next(item) for group, item in grouped]
print(selected)
# [[22, 3, 5, 7], [10, 1, 3, 6]]

Or use a dict:

d = {}
for el in valid_sorted: # doesn't need to be sorted - just excluding 4th == 0
    d[el[1]] = max(d.get(el[1], []), el)
print(list(d.values()))
# [[22, 3, 5, 7], [10, 1, 3, 6]]  (dict insertion order, Python 3.7+)

- Jon Clements

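Now that I'm going over my code again... do we need the sorted call? Isn't its result undone by the sort that follows? - Orestis

There is no "following sort" - can you explain? - Jon Clements

valid_sorted is built by sorting, on the given key, the elements of "data" whose fourth number isn't 0. The next sort is done in place, on a different key and in reverse order. I may be missing something, but aren't we sorting an already sorted list (from the previous step) on a different key? As far as I can tell, if I pass key=ig(0), reverse=True to the sorted call and skip the sort statement entirely, the result appears to be the same. I could be wrong; sort may be doing a second-level sort. - Orestis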
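
On the point raised in this comment thread: Python's sort is stable, so when two sorts are chained, the last key sorted on decides the final order and the earlier sort only settles ties. The sketch below illustrates this with hypothetical data (not from the question), chosen so that the two rows sharing second element 1 are separated by another row once the list is ordered by first element. The helper names are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

def first_per_key(rows):
    """Take the first row of each run of equal second elements."""
    return [next(group) for _, group in groupby(rows, key=itemgetter(1))]

def order_from_answer(data):
    # Sort by second element, then by first element descending (as in the answer).
    rows = sorted(data, key=itemgetter(1))
    rows.sort(key=itemgetter(0), reverse=True)
    return first_per_key(rows)

def order_swapped(data):
    # Sort by first element descending, then stable-sort by second element.
    rows = sorted(data, key=itemgetter(0), reverse=True)
    rows.sort(key=itemgetter(1))
    return first_per_key(rows)

# Hypothetical data: 5 lies between 10 and 2, splitting the rows with key 1.
data = [[10, 1, 3, 6], [5, 3, 5, 7], [2, 1, 4, 7]]

print(order_from_answer(data))
# [[10, 1, 3, 6], [5, 3, 5, 7], [2, 1, 4, 7]]  -> key 1 shows up twice
print(order_swapped(data))
# [[10, 1, 3, 6], [5, 3, 5, 7]]
```

With the sample list from the question both orders happen to give the same rows, which may be why the difference is easy to miss. groupby only merges adjacent equal keys, so the grouping key has to be the last one sorted on.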

1

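If you don't care about the order of the final list, you can sort by the second item and use a generator to find the maximum by first item: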

l = [[10,1,3,6],[22,3,5,7],[2,1,4,7],[44,3,1,0]]

remove_zeros_in_last = filter(lambda x: x[3] != 0, l)

ordered_by_2nd = sorted(remove_zeros_in_last, key=lambda x: x[1])

def group_equal_2nd_by_largest_first(ll):
    maxel = None
    for el in ll:
        if maxel is None:
            maxel = el  # Start accumulating maximum
        elif el[1] != maxel[1]:
            yield maxel
            maxel = el
        elif el[0] > maxel[0]:
            maxel = el  # New maximum
    if maxel is not None:
        yield maxel     # Don't forget the last item!

print(list(group_equal_2nd_by_largest_first(ordered_by_2nd)))

# gives [[10, 1, 3, 6], [22, 3, 5, 7]]

- JohnJ
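
The hand-written generator above tracks a running maximum within each run of equal second elements. As a rough alternative sketch, the same idea can be expressed with itertools.groupby plus max. The function name here is an assumption made for illustration, not something from the thread.

```python
from itertools import groupby
from operator import itemgetter

def largest_first_per_second(rows):
    """For each distinct second element, keep the row with the largest first element."""
    kept = [row for row in rows if row[3] != 0]  # drop rows whose 4th element is 0
    kept.sort(key=itemgetter(1))                 # groupby needs equal keys adjacent
    return [max(group, key=itemgetter(0))        # largest first element per group
            for _, group in groupby(kept, key=itemgetter(1))]

rows = [[10, 1, 3, 6], [22, 3, 5, 7], [2, 1, 4, 7], [44, 3, 1, 0]]
print(largest_first_per_second(rows))
# [[10, 1, 3, 6], [22, 3, 5, 7]]
```

As with the generator, the result comes out ordered by second element rather than in the original list order.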

1

Here is the second part:

from itertools import product

lis = [[10, 1, 3, 6], [22, 3, 5, 7], [2, 1, 4, 7]]
lis = set(map(tuple, lis))   #create a set of items of lis
removed = set()             #it will store the items to be removed

for x, y in product(lis, repeat=2):
    if x != y:
        if x[1] == y[1]:
            removed.add(y if x[0] > y[0] else x)

print("removed-->", removed)

print(lis - removed)       #final answer

Output (the order of elements inside a printed set may vary):

removed--> {(2, 1, 4, 7)}
{(22, 3, 5, 7), (10, 1, 3, 6)}

- Ashwini Chaudhary
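
One edge case worth checking in the pairwise approach above: when two rows share both the second and the first element, the comparison `x[0] > y[0]` is false in both directions, so both rows end up in `removed`. The sketch below wraps the loop in a hypothetical function (`dedupe_pairwise` and its `tie_safe` flag are illustrative names) and shows one possible tie-breaker, comparing whole tuples with min.

```python
from itertools import product

def dedupe_pairwise(rows, tie_safe=False):
    """Pairwise removal; tie_safe switches to a whole-tuple comparison."""
    items = set(map(tuple, rows))
    removed = set()
    for x, y in product(items, repeat=2):
        if x != y and x[1] == y[1]:
            if tie_safe:
                # Tuples compare element by element, so exactly one of x, y is smaller.
                removed.add(min(x, y))
            else:
                # Comparison used in the answer above.
                removed.add(y if x[0] > y[0] else x)
    return items - removed

# Hypothetical data with a tie: two rows share second element 1 and first element 5.
tied = [[5, 1, 0, 1], [5, 1, 9, 9], [7, 2, 0, 1]]

print(dedupe_pairwise(tied) == {(7, 2, 0, 1)})
# True: both tied rows were dropped
print(dedupe_pairwise(tied, tie_safe=True) == {(5, 1, 9, 9), (7, 2, 0, 1)})
# True: one tied row survives
```

Whether the later elements are a sensible tie-breaker depends on the data, and the question does not say, so treat this as one option among several.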

- bentrevor · Accepted Answer

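If listA is your original list and listB is your new list, then it seems part two can be solved by iterating over listA and checking whether the current element (a nested list) has a duplicate second element; if it does, compare the first elements to decide which nested list stays in listB. So the pseudocode is as follows:

for each nestedList in listA:
    if nestedList's second element is a duplicate:
        compare the first elements and keep the nestedList whose first element is larger
    else:
        add nestedList to listB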

listA = [[10, 1, 3, 6], [22, 3, 5, 7], [2, 1, 4, 7]]  # sample data from the question
listB = []

for nested in listA:
    for j, kept in enumerate(listB):
        if nested[1] == kept[1]:     # check if second element is a duplicate
            if nested[0] > kept[0]:  # check which has the bigger first element
                listB[j] = nested
            break
    else:   # no duplicate found: second element is unique, so append the nested list
        listB.append(nested)

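This only covers part two. As Burhan's comment suggests, I'm sure there are more elegant ways to do this, but I think this gets the job done. Also, the question doesn't say what should happen when the first elements are equal, so that needs to be considered as well.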
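
To make the tie question concrete, here is a minimal sketch that handles both parts in a single pass and picks one possible policy: on equal first elements, the row seen first is kept. The function name and the tie policy are assumptions, since the question leaves that case open.

```python
def filter_and_dedupe(rows):
    """Drop rows ending in 0, then keep the largest-first row per second element."""
    best = {}
    for row in rows:
        if row[3] == 0:      # part 1: skip rows whose fourth element is zero
            continue
        key = row[1]
        # part 2: strict '>' means the earliest row wins when first elements tie
        if key not in best or row[0] > best[key][0]:
            best[key] = row
    return list(best.values())

rows = [[10, 1, 3, 6], [22, 3, 5, 7], [2, 1, 4, 7], [44, 3, 1, 0]]
print(filter_and_dedupe(rows))
# [[10, 1, 3, 6], [22, 3, 5, 7]]
```

Changing `>` to `>=` would make the latest tied row win instead. The output order follows the first appearance of each second element, which relies on dicts preserving insertion order (Python 3.7+).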