Comparing specific fields of two lists of dictionaries

Question

4

I have two lists containing dictionaries. I want to compare certain fields in those dictionaries.

current_list = [{"name": "Bill","address": "Home", "age": 23, "accesstime":11:14:01}, 
            {"name": "Fred","address": "Home", "age": 26, "accesstime":11:57:43},
            {"name": "Nora","address": "Home", "age": 33, "accesstime":11:24:14}]

backup_list = [{"name": "Bill","address": "Home", "age": 23, "accesstime":13:34:24}, 
           {"name": "Fred","address": "Home", "age": 26, "accesstime":13:34:26},
           {"name": "Nora","address": "Home", "age": 33, "accesstime":13:35:14}]

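The lists and dictionaries should be in the same order. I only want to compare certain key/value pairs, such as name, address, and age, and ignore the access time, but my current comparison checks every key/value pair. So I only want to compare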

current_list:dictionary[0][name] -> backup_list:dictionary[0][name] and then 
current_list:dictionary[0][address] -> backup_list:dictionary[0][address]

and so on for the other relevant fields.

for x in current_list:
    for y in backup_list:
        for k, v in x.items():
            for kk, vv in y.items():
                if k == kk:
                    print("Match: {0}".format(kk))
                    break
                elif k != kk:
                    print("No match: {0}".format(kk))

Current output

Match name with name
No Match address with name
Match address with address
No Match age with name
No Match age with address
Match age with age
No Match dateRegistered with name
No Match dateRegistered with address
No Match dateRegistered with age
Match dateRegistered with dateRegistered

Preferred output

Match name with name
Match address with address
Match age with age

Due to a change in requirements, my list is now a list of ElementTree XML elements.

So instead of the list above, it is now:

backup_list =  ["<Element 'New' at 0x0000000002698C28>, <Element 'Update' at 0x0000000002698CC8>, <Element 'New' at 0x0000000002698CC8>"]

where each element is an XML element whose attributes contain:

{"name": "Nora", "address": "Home", "age": 33, "dateRegistered": 20140812}

So, based on the answer below, the following seems to do most of what I need:

value_to_compare = ["name", "address", "age"]
for i, elem in enumerate(current_list):
    backup_dict = backup_list[i]
    if elem.tag == "New":
        for key in value_to_compare:
            try:
                print("Match {0} {1} == {2}:".format(key, backup_dict.attrib[key], elem.attrib[key]))
            except KeyError:
                print("key {} not found".format(key))
            except:
                raise
    else:
        continue
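
For readers who want to try the element-based comparison end to end, here is a minimal, self-contained sketch. The XML snippets, the `compare_new_elements` helper, and the changed address for Nora are assumptions made up for illustration; only the tag names and attribute names follow the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, modelled on the question (not the real XML).
current_xml = """<root>
  <New name="Bill" address="Home" age="23" dateRegistered="20140812"/>
  <Update name="Fred" address="Home" age="26" dateRegistered="20140812"/>
  <New name="Nora" address="Home" age="33" dateRegistered="20140812"/>
</root>"""
backup_xml = """<root>
  <New name="Bill" address="Home" age="23" dateRegistered="20140901"/>
  <Update name="Fred" address="Home" age="26" dateRegistered="20140901"/>
  <New name="Nora" address="Work" age="33" dateRegistered="20140901"/>
</root>"""

# Iterating over the root element yields its child elements in document order.
current_list = list(ET.fromstring(current_xml))
backup_list = list(ET.fromstring(backup_xml))


def compare_new_elements(current, backup, keys):
    """Return (index, key, matched) for every pair whose current tag is 'New'."""
    results = []
    for i, (cur, bak) in enumerate(zip(current, backup)):
        if cur.tag != "New":
            continue
        for key in keys:
            # attrib.get returns None for a missing attribute, so an attribute
            # that is absent on both sides counts as a match here.
            matched = cur.attrib.get(key) == bak.attrib.get(key)
            results.append((i, key, matched))
    return results


results = compare_new_elements(current_list, backup_list, ["name", "address", "age"])
for i, key, matched in results:
    print("{} element {} {}".format("Match" if matched else "No match", i, key))
```

Returning the results as tuples rather than printing inside the loop keeps the comparison testable; the `Update` element is skipped, as in the code above.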

- John

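I just found out that I can't use a list of dictionaries, because there are other conditions to take into account. So it actually has to be a list of XML elements, e.g. [<Element 'New' at 0x000000000267BE08>, <Element 'Update' at 0x000000000267BEA8>, <Element 'New' at 0x000000000267AE08>]. RomainL's solution is close to what I need. - John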

7 Answers

2

Someone has already written a module called deepdiff that does exactly this, and a lot more besides. See this answer for a detailed explanation.

First, install it

pip install deepdiff

Then, enjoy

#of course import it
from deepdiff import DeepDiff

current_list, backup_list = [...], [...] #values stated in question.

for c, b in zip(current_list, backup_list):
    dif = DeepDiff(c, b)
    for key in ["name", "address", "age"]:
        try:
            assert dif['values_changed'][f"root['{key}']"]
            # replace the line below with `pass` to hide non-matching values, as in your desired output
            print(f"No Match {key} with {key}")
        except KeyError:
            print(f"Match {key} with {key}")

Result, as expected:

Match name with name
Match address with address
Match age with age
Match name with name
Match address with address
Match age with age
Match name with name
Match address with address
Match age with age

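Final note

This module can do much more, such as detecting type changes and keys that were changed, removed, or added, as well as extensive text comparison and searching. It is definitely worth a look.

Good luck with your project!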

- Jab

1

Simply compare them like this:

for current in current_list:
    for backup in backup_list:
        for a in backup:
            for b in current:
                if a == b:
                    if a == "name" or a== "age" or a== "address" :
                        if backup[a] == current[b]:
                            print (backup[a])
                            print (current[b])

- vipul gangwar

0

You can use this code to compare all corresponding fields:

for dct1, dct2 in zip(current_list, backup_list):
    for k, v in dct1.items():
        if k == "accesstime":
            continue
        if v == dct2[k]:
            print("Match: {0} with {0}".format(k))
        else:
            print("No match: {0} with {0}".format(k))

Note that the values of your "accesstime" keys are not valid Python objects!
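
As a hedged illustration of that last point: a bare `11:14:01` is a syntax error inside a dict literal, so the timestamp has to be stored some other way. Two common options are sketched below. The variable names are made up for this example, and `time.fromisoformat` needs Python 3.7 or later.

```python
from datetime import time

# Option 1: keep the timestamp as a plain string.
record_as_str = {"name": "Bill", "address": "Home", "age": 23,
                 "accesstime": "11:14:01"}

# Option 2: parse it into a datetime.time object, which can be compared
# and sorted chronologically.
record_as_time = dict(record_as_str,
                      accesstime=time.fromisoformat(record_as_str["accesstime"]))

print(record_as_time["accesstime"] < time(13, 34, 24))
```

Either form works with the comparison loops in this thread, since they skip "accesstime" anyway.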

- Ma0

0

I don't follow the logic of your data structure, but I think this will work:

value_to_compare = ["name", "address", "age"]

for i, elem in enumerate(current_list):
    backup_dict = backup_list[i]
    for key in value_to_compare:
        try:
            print("Match {}: {} with {}".format(key, elem[key], backup_dict[key]))
        except KeyError:
            print("key {} not found".format(key))
            # may be a raise here.
        except:
            raise

- RomainL.

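I just found out that I can't use a list of dictionaries, because there are other conditions to take into account. So it actually has to be a list of XML elements, e.g. [<Element 'New' at 0x000000000267BE08>, <Element 'Update' at 0x000000000267BEA8>, <Element 'New' at 0x000000000267AE08>]. - John

Should this be print("Match {}: {} with {}".format(key, elem[i], backup_dict[i]))? Because key is one of the values in value_to_compare, whereas i would be an integer. - John

If I understand correctly, you want to compare XML elements, but only on certain fields? I'm not sure I understand your new data. Could you update your question, or ask a new one? - RomainL.

I've updated my question based on your solution. Thanks :) - John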

0

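If you are willing to use a third-party library, this kind of task can be done more efficiently with Pandas, and the result is presented in a more structured way: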

import pandas as pd

res = pd.merge(pd.DataFrame(current_list),
               pd.DataFrame(backup_list),
               on=['name', 'address', 'age'],
               how='outer',
               indicator=True)

print(res)

  accesstime_x address  age  name accesstime_y _merge
0     11:14:01    Home   23  Bill     13:34:24   both
1     11:57:43    Home   26  Fred     13:34:26   both
2     11:24:14    Home   33  Nora     13:35:14   both

A result of _merge = 'both' in every row means that each ['name', 'address', 'age'] combination appears in both lists, and you can also see the accesstime from each input.

- jpp

0

You can use zip to iterate over several lists at the same time.

elements_to_compare = ["name", "age", "address"]
for dic1, dic2 in zip(current_list, backup_list):
    for element in elements_to_compare :
        if dic1[element] == dic2[element]:
            print("Match {0} with {0}".format(element))
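
One caveat worth sketching: `zip` stops at the shorter list, so a record that is missing from one side would be skipped silently. The variant below uses `itertools.zip_longest` to report that case and also reports mismatches. The `compare_fields` helper and the sample data are hypothetical, and the sketch assumes that neither list contains `None` entries.

```python
from itertools import zip_longest


def compare_fields(current, backup, fields):
    """Yield (index, field, status) for each compared field."""
    for i, (cur, bak) in enumerate(zip_longest(current, backup)):
        if cur is None or bak is None:
            # One list ran out of records before the other.
            yield (i, None, "missing record")
            continue
        for field in fields:
            status = "match" if cur.get(field) == bak.get(field) else "no match"
            yield (i, field, status)


# Hypothetical sample data: the backup is one record short.
current = [{"name": "Bill", "age": 23}, {"name": "Fred", "age": 26}]
backup = [{"name": "Bill", "age": 24}]

report = list(compare_fields(current, backup, ["name", "age"]))
for row in report:
    print(row)
```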

- shivam bhatnagar

- Woltan · Accepted Answer

I'm not sure I've fully understood your question, but I think the following code should do the trick:

compare_arguments = ["name", "age", "address"]
for cl, bl in zip(current_list, backup_list):
    for ca in compare_arguments:
        if cl[ca] == bl[ca]:
            print("Match {0} with {0}".format(cl[ca]))
    print("-" * 10)

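The code above iterates over the two lists in parallel with zip. A separate list specifies the fields to compare. In the main loop, you iterate over the comparable fields and print them accordingly.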
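
If only a yes/no answer per record is needed, one alternative sketch is to project each dictionary onto the fields of interest and compare the projections. The `project` helper is a made-up name for this example, and the access times are written as strings here, which is an assumption, since the literals in the question are not valid Python.

```python
def project(record, fields):
    """Return a copy of record restricted to the given fields."""
    return {field: record.get(field) for field in fields}


fields = ["name", "address", "age"]

current_list = [
    {"name": "Bill", "address": "Home", "age": 23, "accesstime": "11:14:01"},
    {"name": "Fred", "address": "Home", "age": 26, "accesstime": "11:57:43"},
    {"name": "Nora", "address": "Home", "age": 33, "accesstime": "11:24:14"},
]
backup_list = [
    {"name": "Bill", "address": "Home", "age": 23, "accesstime": "13:34:24"},
    {"name": "Fred", "address": "Home", "age": 26, "accesstime": "13:34:26"},
    {"name": "Nora", "address": "Home", "age": 33, "accesstime": "13:35:14"},
]

# One boolean per record pair: True when all selected fields agree.
same = [project(c, fields) == project(b, fields)
        for c, b in zip(current_list, backup_list)]
print(same)
```

Because "accesstime" is never copied into the projection, it cannot affect the result.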