Comparing each element of multiple lists only once in Python

5

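I have a problem with some of my code. I want it to compare 2 lists taken from a list that contains several lists, but each pair of lists should be compared only once.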

resultList = [
    ['Student1', ['Sport', 'History']],
    ['Student2', ['Math', 'Spanish']],
    ['Student3', ['French', 'History']],
    ['Student4', ['English', 'Sport']],
]

for list1 in resultList:
    for list2 in resultList:
        i = 0
        for subject in list1[1]:
            if subject in list2[1]:
                if list2[1].index(subject) >= list1[1].index(subject):
                    i+=1
                else:
                    i+=2
        print(list1[0] + ' - ' + list2[0] + ' : ' + str(i))

This prints:

Student1 - Student1 : 2
Student1 - Student2 : 0
Student1 - Student3 : 1
Student1 - Student4 : 1
Student2 - Student1 : 0
Student2 - Student2 : 2
Student2 - Student3 : 0
Student2 - Student4 : 0
Student3 - Student1 : 1
Student3 - Student2 : 0
Student3 - Student3 : 2
Student3 - Student4 : 0
Student4 - Student1 : 2
Student4 - Student2 : 0
Student4 - Student3 : 0
Student4 - Student4 : 2

And I would like this result:

Student1 - Student1 : 2
Student1 - Student2 : 0
Student1 - Student3 : 1
Student1 - Student4 : 1
Student2 - Student2 : 2
Student2 - Student3 : 0
Student2 - Student4 : 0
Student3 - Student3 : 2
Student3 - Student4 : 0
Student4 - Student4 : 2

Thanks for your help!
6 Answers

4

I would use itertools.combinations_with_replacement or itertools.combinations for this:

In [1]: resultList = [
   ...:     ['Student1', ['Sport', 'History']],
   ...:     ['Student2', ['Math', 'Spanish']],
   ...:     ['Student3', ['French', 'History']],
   ...:     ['Student4', ['English', 'Sport']],
   ...: ]
   ...:

In [2]: import itertools
In [3]: new_result = itertools.combinations_with_replacement(resultList, 2)
In [4]: for lists_tuple in new_result:
    ...:     list1, list2 = lists_tuple
    ...:     i = 0
    ...:     for subject in list1[1]:
    ...:         if subject in list2[1]:
    ...:             if list2[1].index(subject) >= list1[1].index(subject):
    ...:                 i+=1
    ...:             else:
    ...:                 i+=2
    ...:     print(list1[0] + ' - ' + list2[0] + ' : ' + str(i))
    ...:
    ...:
Student1 - Student1 : 2
Student1 - Student2 : 0
Student1 - Student3 : 1
Student1 - Student4 : 1
Student2 - Student2 : 2
Student2 - Student3 : 0
Student2 - Student4 : 0
Student3 - Student3 : 2
Student3 - Student4 : 0
Student4 - Student4 : 2

combinations

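If you decide you don't want to compare each list with itself (Student1 - Student1), change combinations_with_replacement to combinations, so that you only get comparisons between different elements of the list: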

Student1 - Student2 : 0
Student1 - Student3 : 1
Student1 - Student4 : 1
Student2 - Student3 : 0
Student2 - Student4 : 0
Student3 - Student4 : 0
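To make the difference between the two functions concrete, here is a minimal sketch that pairs up the student names only. The names list and the two variable names are illustrative and not part of the answer above.

```python
import itertools

# Illustrative list of names only (the subjects are left out on purpose).
names = ['Student1', 'Student2', 'Student3', 'Student4']

# Every unordered pair, including each name paired with itself.
with_self = list(itertools.combinations_with_replacement(names, 2))

# Every unordered pair of two different names.
without_self = list(itertools.combinations(names, 2))

print(len(with_self))     # 10
print(len(without_self))  # 6
print(with_self[0])       # ('Student1', 'Student1')
print(without_self[0])    # ('Student1', 'Student2')
```

Neither function ever yields a pair in reversed order, such as ('Student2', 'Student1'), which is why each comparison happens only once.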

1
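Thank you. As a beginner this looks a bit difficult to me, so I went with @Proyag's solution. - user11510021
No problem. itertools is a very useful module in Python, and you can try out some of its functions whenever you like. - AdamGold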

2
You can use set.intersection to compare the lists:
resultList = [
    ['Student1', ['Sport', 'History']],
    ['Student2', ['Math', 'Spanish']],
    ['Student3', ['French', 'History']],
    ['Student4', ['English', 'Sport']],
]

s = set()
for list1 in resultList:
    for list2 in resultList:
        i = tuple(sorted([list1[0], list2[0]]))
        if i in s:
            continue
        s.add(i)
        print(list1[0], list2[0], len(set(list1[1]).intersection(list2[1])))

Output:

Student1 Student1 2
Student1 Student2 0
Student1 Student3 1
Student1 Student4 1
Student2 Student2 2
Student2 Student3 0
Student2 Student4 0
Student3 Student3 2
Student3 Student4 0
Student4 Student4 2
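One caveat worth checking: set.intersection counts the shared subjects, while the loop in the question also looks at their positions. The sketch below compares the two on one pair of subject lists. The function names position_score and common_count are hypothetical helpers introduced here for illustration.

```python
def position_score(subjects_a, subjects_b):
    """Scoring loop from the question: +1 or +2 per shared subject."""
    total = 0
    for subject in subjects_a:
        if subject in subjects_b:
            if subjects_b.index(subject) >= subjects_a.index(subject):
                total += 1
            else:
                total += 2
    return total


def common_count(subjects_a, subjects_b):
    """Number of shared subjects, ignoring their positions."""
    return len(set(subjects_a).intersection(subjects_b))


a = ['English', 'Sport']   # Student4
b = ['Sport', 'History']   # Student1

print(position_score(a, b), common_count(a, b))  # 2 1
print(position_score(b, a), common_count(b, a))  # 1 1
```

For the ten pairs in the desired output the two measures happen to agree on this data, but they can differ when the pair is taken in the other order or when the subject lists are arranged differently.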

2
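The idea is similar to @yatu's answer, but instead of counting manually it uses enumerate, and for list2 it iterates only over the part of resultList that starts at the current index of list1. If you want to avoid the 1-1, 2-2 pairs, just use resultList[idx+1:] instead of resultList[idx:].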
resultList = [
    ['Student1', ['Sport', 'History']],
    ['Student2', ['Math', 'Spanish']],
    ['Student3', ['French', 'History']],
    ['Student4', ['English', 'Sport']],
]

for idx, list1 in enumerate(resultList):
    for list2 in resultList[idx:]:
        i = 0
        for subject in list1[1]:
            if subject in list2[1]:
                if list2[1].index(subject) >= list1[1].index(subject):
                    i+=1
                else:
                    i+=2
        print(list1[0] + ' - ' + list2[0] + ' : ' + str(i))
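As a quick check of the slicing variant without self-pairs, the sketch below only collects the names that would be compared and leaves the scoring out. The pairs list is an illustrative addition.

```python
resultList = [
    ['Student1', ['Sport', 'History']],
    ['Student2', ['Math', 'Spanish']],
    ['Student3', ['French', 'History']],
    ['Student4', ['English', 'Sport']],
]

pairs = []
for idx, list1 in enumerate(resultList):
    # Start one position past idx so list1 is never paired with itself.
    for list2 in resultList[idx + 1:]:
        pairs.append((list1[0], list2[0]))

for first, second in pairs:
    print(first + ' - ' + second)

print(len(pairs))  # 6
```

Slicing copies part of the list on every outer iteration, which is unlikely to matter for a handful of students.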

1

Compared with the original version, this one has two modified lines and two new lines:

resultList = [
    ['Student1', ['Sport', 'History']],
    ['Student2', ['Math', 'Spanish']],
    ['Student3', ['French', 'History']],
    ['Student4', ['English', 'Sport']],
]

for index1 in range(0,len(resultList)):
    for index2 in range(index1, len(resultList)):
        i = 0
        list1 = resultList[index1]
        list2 = resultList[index2]
        for subject in list1[1]:
            if subject in list2[1]:
                if list2[1].index(subject) >= list1[1].index(subject):
                    i+=1
                else:
                    i+=2
        print(list1[0] + ' - ' + list2[0] + ' : ' + str(i))

Thanks for your help, I found @Proyag's solution very simple. - user11510021

1

You should express yourself more clearly.

Just use 2 for loops and 3 lines of code:

import numpy as np

for i in resultList:
    for j in resultList[resultList.index(i):]:
        print(str(i[0]) + '-' + str(j[0]) + ' : ' + str(np.isin(i[1], j[1]).sum()))

Result:

Student1-Student1 : 2
Student1-Student2 : 0
Student1-Student3 : 1
Student1-Student4 : 1
Student2-Student2 : 2
Student2-Student3 : 0
Student2-Student4 : 0
Student3-Student3 : 2
Student3-Student4 : 0
Student4-Student4 : 2


1
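However, numpy may not be available on the target system. It can be replaced with a plain in construct inside a list comprehension, e.g. [x in j[1] for x in i[1]] - trolley813
Yes, you are absolutely right. If numpy is not available, using in is a good choice. - Hippolyte BRINGER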
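A numpy-free version along the lines of that comment could look like the sketch below. Summing the booleans with sum() and collecting the output in a lines list are choices made here for illustration, not something either commenter wrote.

```python
resultList = [
    ['Student1', ['Sport', 'History']],
    ['Student2', ['Math', 'Spanish']],
    ['Student3', ['French', 'History']],
    ['Student4', ['English', 'Sport']],
]

lines = []
for i in resultList:
    for j in resultList[resultList.index(i):]:
        # True counts as 1, so the sum is the number of shared subjects.
        common = sum(x in j[1] for x in i[1])
        lines.append(str(i[0]) + '-' + str(j[0]) + ' : ' + str(common))

print('\n'.join(lines))
```

Note that resultList.index(i) finds the first equal entry, so this relies on every student entry being distinct.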

-1

I think you should express yourself more clearly :)

Maybe try an object-oriented approach:

import itertools

class Student:
    def __init__(self, name, subjects):
        self.name = name
        self.subjects = subjects

    def compare_to(self, another_student):
        # Same scoring as the question: +1 when a shared subject sits at the
        # same or a later position in the other student's list, +2 otherwise.
        result = 0
        for subject in self.subjects:
            if subject in another_student.subjects:
                if another_student.subjects.index(subject) >= \
                        self.subjects.index(subject):
                    result += 1
                else:
                    result += 2
        return result

resultList = [
    Student('Student1', ['Sport', 'History']),
    Student('Student2', ['Math', 'Spanish']),
    Student('Student3', ['French', 'History']),
    Student('Student4', ['English', 'Sport'])
    ]

# combinations needs the group size (2) as its second argument.
for first, second in itertools.combinations(resultList, 2):
    print("{} - {} : {}".format(first.name, second.name, first.compare_to(second)))

I think code should be clear and readable :). Of course, the choice is yours. Clear code makes it easier to spot problems.


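Thank you very much for your answer! This is a very nice approach. - user11510021
My mistake. I used combinations instead of the other itertools function, the one with replacement, limited to 2 elements. Anyway, if you change ... itertools.combinations_with_replacement(resultList, 2), this will work fine :) - Guaz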
