How to find the differences between two lists of dictionaries and check the key-value pairs

Question


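I have searched for a solution to my problem without success. Part of the solution to my problem is here, but it does not solve it completely.

I have two lists of dictionaries. Each was written to a csv file, and I have read the contents back into the following variables: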

list1 = [{'a':1, 'b':2, 'c':3}, {'a':4, 'b':5, 'c':6}, {'a':7, 'b':8, 'c':9}]
list2 = [{'b':2, 'a':1, 'c':3}, {'c':6, 'b':5, 'a':4}, {'b':8, 'a':7, 'c':9}]

Using the solution given in the link above, i.e.:

>>> import itertools

>>> a = [{'a': '1'}, {'c': '2'}]
>>> b = [{'a': '1'}, {'b': '2'}]
>>> intersec = [item for item in a if item in b]
>>> sym_diff = [item for item in itertools.chain(a,b) if item not in intersec]

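I get no matches at all, because the dictionaries are in a different order, although the two lists are actually identical. How can I check for this? Do I need to sort the dictionaries before writing them to the csv file? Could that be a workaround?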

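That is my main problem for now, but I have a second one. It would be nice to be able to run the match check while ignoring one or more keys that I define. Is that possible too?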
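One possible way to approach the key-ignoring comparison is sketched below. The helper names `strip_keys` and `diff_ignoring` and the sample data are hypothetical and not taken from the thread. The idea is to drop the ignored keys from a copy of every dictionary and then compare the copies.

```python
def strip_keys(d, ignored):
    """Return a copy of d without the keys listed in ignored."""
    return {k: v for k, v in d.items() if k not in ignored}


def diff_ignoring(list_a, list_b, ignored=()):
    """Symmetric difference of two lists of dicts, skipping the ignored keys."""
    a = [strip_keys(d, ignored) for d in list_a]
    b = [strip_keys(d, ignored) for d in list_b]
    only_a = [d for d in a if d not in b]  # present in a, missing from b
    only_b = [d for d in b if d not in a]  # present in b, missing from a
    return only_a + only_b


# Hypothetical sample data: the second dictionaries differ only in 'c'.
list1 = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}]
list2 = [{'b': 2, 'a': 1, 'c': 3}, {'c': 0, 'b': 5, 'a': 4}]

print(diff_ignoring(list1, list2))                  # the two 'a': 4 dicts differ
print(diff_ignoring(list1, list2, ignored=('c',)))  # no difference once 'c' is skipped
```

Note that the returned dictionaries are the stripped copies, so the ignored keys do not appear in the result.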

EDIT: I have the dictionaries in csv files and read them with the following code:

import csv

def read_csv_file(self, filename):
    '''Read CSV file and return its content as a Python list.'''
    with open(filename, 'r') as f:  # the file is closed when the block exits
        return [row for row in csv.reader(f)]

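This is quite important, because I think the problem is that after being read from the csv the values are no longer dictionaries, so the order has to be the same.

EDIT2: sample of the csv file (3 lines; it creates an empty line, but that is not a problem...)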

"{u'Deletion': '0', u'Source': 'Not Applicable', u'Status': ''}"

"{u'Deletion': '0', u'Source': 'Not Applicable', u'Status': ''}"

- zephirus
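The effect described in the edits above can be reproduced in a few lines. In this sketch `row1` is the sample line from the csv file, and `row2` is a hypothetical reordering of the same dictionary, as a second machine might have written it. Compared as strings the two rows differ. Parsed back into dictionaries with `ast.literal_eval` they compare equal.

```python
import ast

# row1 is the sample line from the csv file; row2 is an assumed reordering of it.
row1 = "{u'Deletion': '0', u'Source': 'Not Applicable', u'Status': ''}"
row2 = "{u'Status': '', u'Deletion': '0', u'Source': 'Not Applicable'}"

# Strings are compared character by character, so key order matters here.
print(row1 == row2)

# Dictionaries are compared by their key-value pairs, so key order is irrelevant.
print(ast.literal_eval(row1) == ast.literal_eval(row2))
```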

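Could you provide a sample of the csv file so we can see its format? - Iron Fist

If you are reading every line from this CSV file, what is list2 for? - Iron Fist

So if I understand correctly, you are reading from two CSV files and storing the dictionaries in lists, and then you want the difference and the intersection of those two lists, right? - Iron Fist

Sorry! My problem is that I created the CSV files separately on two different machines. After creating the second file, I read both files and compared them. You know the rest of the story... - zephirus

Let us continue this discussion in chat. - zephirus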


3 Answers


- Leb · Answer 1

You need to double-check your code. I am not seeing the problem you describe.

list1 = [{a:1, b:2, c:3}, {a:4, b:5, c:6}, {a:7, b:8, c:9}]
list2 = [{b:2, a:1, c:3}, {c:6, b:5, a:4}, {b:8, a:7, c:9}]

import itertools

list1 = [{'a':1, 'b':2, 'c':3}, {'a':4, 'b':5, 'c':6}, {'a':7, 'b':8, 'c':9}]
list2 = [{'b':2, 'a':1, 'c':3}, {'c':6, 'b':5, 'a':4}, {'b':8, 'a':7, 'c':9}]
intersec = [item for item in list1 if item in list2]
sym_diff = [item for item in itertools.chain(list1,list2) if item not in intersec]

print(intersec)
print(sym_diff)

>>>[{'a': 1, 'c': 3, 'b': 2}, {'a': 4, 'c': 6, 'b': 5}, {'a': 7, 'c': 9, 'b': 8}]
>>>[]

If I change list1 and list2 (the middle dictionary):

list1 = [{'a':1, 'b':2, 'c':3}, {'a':7, 'b':5, 'c':2}, {'a':7, 'b':8, 'c':9}]
list2 = [{'b':2, 'a':1, 'c':3}, {'c':6, 'b':5, 'a':4}, {'b':8, 'a':7, 'c':9}]

Running the same code:

[{'a': 1, 'c': 3, 'b': 2}, {'a': 7, 'c': 9, 'b': 8}]
[{'a': 7, 'c': 2, 'b': 5}, {'a': 4, 'c': 6, 'b': 5}]

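The code from the link you provided seems to work fine. The order does not matter here: two dictionaries compare equal in Python regardless of the order of their keys, and the `in` test finds an item wherever it sits in the list.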
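A quick check of that behaviour, as a minimal sketch with made-up values (`d1`, `d2` and `other` are illustrative names): dictionary equality ignores key order, and membership with `in` ignores position, although two whole lists compared with `==` do have to be in the same order.

```python
d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'b': 2, 'a': 1, 'c': 3}   # same pairs as d1, written in a different order
other = {'x': 0}

same_dicts = d1 == d2                         # key order is irrelevant
found = d1 in [other, d2]                     # position in the list is irrelevant
same_lists = [d1, other] == [other, d2]       # list equality does depend on order

print(same_dicts, found, same_lists)
```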

- Iron Fist · Answer 2

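Part of this solution was found by the OP during our last chat conversation: using the ast module to convert a string into a dictionary.

Now, use this module to convert every row read by csv.reader() into a dictionary, since csv.reader() returns each row as a list of strings (in the case of the OP's CSV file, a list containing a single string), and append that dictionary to a list. After that, list comprehensions together with itertools.chain give us the difference between the two lists.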

import csv
import ast
import itertools

def csvToList(myCSVFile):
    '''Convert the strings returned by csv.reader() into a list of dictionaries.'''
    l = []
    with open(myCSVFile, 'r') as f:
        for row in csv.reader(f):
            if row:  # as you mentioned in your 2nd edit, there can be empty rows
                # each row holds a single string: the repr of one dictionary
                l.append(ast.literal_eval(row[0]))
    return l

list1 = csvToList('myCSV1.csv')
list2 = csvToList('myCSV2.csv')

l1_sub_l2  = [d for d in list1 if d not in list2]
l2_sub_l1  = [d for d in list2 if d not in list1]
list_difference = list(itertools.chain(l1_sub_l2, l2_sub_l1))

- CodingPenguins · Answer 3

-1

Use a dict comprehension instead of a list comprehension in the return value.

- CodingPenguins

I think I have got it working in Python. Unfortunately I also need it to run on jython, which gives me this error: java.lang.ClassFormatError: Invalid method Code length 240222 in class file org/python/pycode/_pyx2java.lang.ClassFormatError: java.lang.ClassFormatError: Invalid method Code length 240222 in class file org/python/pycode/_pyx2 - zephirus

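I have managed to solve the Jython problem, but I still have trouble with the comparison. I tried sorting the dictionaries alphabetically before adding them to the CSV, but without success. - zephirus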
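If the rows really have to be comparable as plain strings, one option might be to write each dictionary in a canonical form with its keys sorted. The sketch below is not from the thread. It swaps the repr-style rows for JSON via `json.dumps(..., sort_keys=True)`, and the helper name `canonical` and the sample dictionaries are hypothetical.

```python
import json


def canonical(d):
    """Serialize d with its keys in sorted order, so equal dicts give equal strings."""
    return json.dumps(d, sort_keys=True)


# Assumed sample data: the same dictionary built in two different key orders.
d1 = {'Deletion': '0', 'Source': 'Not Applicable', 'Status': ''}
d2 = {'Status': '', 'Deletion': '0', 'Source': 'Not Applicable'}

print(canonical(d1))
print(canonical(d1) == canonical(d2))

# The canonical string can be turned back into a dictionary when reading.
print(json.loads(canonical(d1)) == d1)
```

Rows written this way would be read back with `json.loads` rather than `ast.literal_eval`. This assumes that all keys and values are JSON-compatible, as the strings in the sample file are.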