Deep merge of dictionaries of dictionaries in Python

Question

215

I need to merge multiple dictionaries. Here is what I have, for instance:

dict1 = {1:{"a":{"A"}}, 2:{"b":{"B"}}}

dict2 = {2:{"c":{"C"}}, 3:{"d":{"D"}}}

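Here A, B, C and D are leaves of the tree, like {"info1":"value", "info2":"value2"}.

The dictionaries can be nested to an unknown depth; it could be {2:{"c":{"z":{"y":{C}}}}}.

In my case it represents a directory/file structure, with nodes being docs and leaves being files.

I want to merge them to obtain: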

 dict3 = {1:{"a":{"A"}}, 2:{"b":{"B"},"c":{"C"}}, 3:{"d":{"D"}}}

I'm not sure how to do that easily with Python.

- fdhex

Take a look at my NestedDict class: http://stackoverflow.com/a/16296144/2334951 It manages nested dictionary structures, including merging and other operations. - SzieberthAdam

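A warning to everyone looking for a solution: this question is only about nested dicts. Most answers do not correctly handle the more complicated case where the structure contains lists of dicts. If you need that, try @Osiloke's answer: https://dev59.com/IWw05IYBdhLWcg3wfx80#25270947 - SHernandez

See also: python dpath merge - dreftymac

One pitfall of @andrew cooke's solution is that changes are applied to the first dict even when a conflict error is raised. To avoid it, you can use @andrew cooke's source code to create a recursive helper function with an extra parameter that holds a clone of the first dict. That parameter is modified and returned instead of the first dict. See: https://dev59.com/IWw05IYBdhLWcg3wfx80#71700270 - diogo

Addict can be used to merge dictionaries: d = Dict({1:{"a":{'A'}}, 2:{"b":{'B'}}}); d.update({2:{"c":{'C'}}, 3:{"d":{'D'}}}); d => {1: {'a': {'A'}}, 2: {'b': {'B'}, 'c': {'C'}}, 3: {'d': {'D'}}} - bartolo-otrit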

36 Answers

- James Sapam · Answer 1

The easiest way I can think of is:

#!/usr/bin/env python3

from copy import deepcopy

def dict_merge(a, b):
    """Return a new dict with b deep-merged into a; the inputs are not modified."""
    # A non-dict b simply replaces whatever a was
    if not isinstance(b, dict):
        return b
    result = deepcopy(a)
    for k, v in b.items():
        if k in result and isinstance(result[k], dict):
            # Existing value is a dict: merge v into it recursively
            result[k] = dict_merge(result[k], v)
        else:
            result[k] = deepcopy(v)
    return result

a = {1:{"a":'A'}, 2:{"b":'B'}}
b = {2:{"c":'C'}, 3:{"d":'D'}}

print(dict_merge(a, b))

Output:

{1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}}

- themeasure43 · Answer 2

If you are happy to let values in the second dictionary take priority over those in the first, this can be done easily as follows.

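def merge_dict(bottom: dict, top: dict) -> dict:
    """Deep-merge two dicts into a new one; values from `top` take priority."""
    ret = {}
    for _tmp in (bottom, top):
        for k, v in _tmp.items():
            # Recurse only when both the stored value and the new value are
            # dicts; otherwise the later value replaces the earlier one.
            if isinstance(v, dict) and isinstance(ret.get(k), dict):
                ret[k] = merge_dict(ret[k], v)
            else:
                ret[k] = v
    return ret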

Example:

d_bottom = {
    'A': 'bottom-A',
    'B': 'bottom-B',
    'D': {
        'DA': 'bottom-DA',
        'DB': 'bottom-DB',
        'DC': {
            "DCA": "bottom-DCA",
            "DCB": "bottom-DCB",
            "DCC": "bottom-DCC",
            "DCD": {
                "DCDA": 'bottom-DCDA',
                "DCDB": 'bottom-DCDB'
            }
        }
    }
}

d_top = {
    'A': 'top-A',
    'B': 'top-B',
    'D': {
        'DA': 'top-DA',
        'DB': 'top-DB',
        'DC': {
            'DCA': 'top-DCA',
            "DCD": {
                "DCDA": "top-DCDA"
            }
        },
        'DD': 'top-DD'
    },
    'C': {
        'CA': 'top-CA',
        'CB': {
            'CBA': 'top-CBA',
            'CBB': 'top-CBB',
            'CBC': {
                'CBCA': 'top-CBCA',
                'CBCB': {
                    'CBCBA': 'top-CBCBA'
                }
            }
        }
    }
}

import json

print(json.dumps(merge_dict(d_bottom, d_top), indent=4))

Output:

{
    "A": "top-A",
    "B": "top-B",
    "D": {
        "DA": "top-DA",
        "DB": "top-DB",
        "DC": {
            "DCA": "top-DCA",
            "DCB": "bottom-DCB",
            "DCC": "bottom-DCC",
            "DCD": {
                "DCDA": "top-DCDA",
                "DCDB": "bottom-DCDB"
            }
        },
        "DD": "top-DD"
    },
    "C": {
        "CA": "top-CA",
        "CB": {
            "CBA": "top-CBA",
            "CBB": "top-CBB",
            "CBC": {
                "CBCA": "top-CBCA",
                "CBCB": {
                    "CBCBA": "top-CBCBA"
                }
            }
        }
    }
}

If you have more than two dictionaries to merge, you can use functools.reduce() to do it.
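As a rough sketch of that suggestion, the snippet below folds a list of dictionaries from left to right with functools.reduce(). It restates a compact two-way merge so that it runs on its own; the names merge_two and merge_all and the sample layers data are illustrative assumptions, not part of the answer above.

```python
from functools import reduce

def merge_two(bottom, top):
    """Two-way deep merge into a new dict; values from `top` win."""
    ret = dict(bottom)  # shallow copy, so `bottom` itself is never modified
    for k, v in top.items():
        if isinstance(v, dict) and isinstance(ret.get(k), dict):
            ret[k] = merge_two(ret[k], v)
        else:
            ret[k] = v
    return ret

def merge_all(dicts):
    """Fold any number of dictionaries; later ones take priority."""
    return reduce(merge_two, dicts, {})

# Hypothetical sample data
layers = [
    {"a": {"x": 1}},
    {"a": {"y": 2}, "b": 1},
    {"a": {"x": 10}},
]
print(merge_all(layers))  # {'a': {'x': 10, 'y': 2}, 'b': 1}
```

Passing {} as the initial value means an empty list of dictionaries yields an empty result instead of raising a TypeError.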

- Slava · Answer 3

I have another, slightly different solution here:

def deepMerge(d1, d2, inconflict=lambda v1, v2: v2):
    '''Merge d2 into d1 in place, using the inconflict function to resolve leaf conflicts.'''
    for k in d2:
        if k in d1:
            if isinstance(d1[k], dict) and isinstance(d2[k], dict):
                # Both values are dicts: merge them recursively
                deepMerge(d1[k], d2[k], inconflict)
            elif d1[k] != d2[k]:
                # Conflicting leaves: let the callback decide
                d1[k] = inconflict(d1[k], d2[k])
        else:
            d1[k] = d2[k]
    return d1

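By default it resolves conflicts in favor of the values from the second dict, but you can easily override this; with some witchcraft you can even make it throw exceptions.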
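A small sketch of what overriding the resolver could look like. The merge function is restated here (as deep_merge) only so the snippet is self-contained; raise_on_conflict and the sample inputs are hypothetical names and data chosen for illustration.

```python
def deep_merge(d1, d2, inconflict=lambda v1, v2: v2):
    """Merge d2 into d1 in place; `inconflict` resolves conflicting leaves."""
    for k in d2:
        if k in d1:
            if isinstance(d1[k], dict) and isinstance(d2[k], dict):
                deep_merge(d1[k], d2[k], inconflict)
            elif d1[k] != d2[k]:
                d1[k] = inconflict(d1[k], d2[k])
        else:
            d1[k] = d2[k]
    return d1

def raise_on_conflict(v1, v2):
    # Hypothetical resolver: refuse to pick a winner
    raise ValueError(f"conflicting values: {v1!r} vs {v2!r}")

# Default: the second dict wins
print(deep_merge({"a": {"x": 1}}, {"a": {"x": 2}}))
# Keep the first value instead
print(deep_merge({"a": {"x": 1}}, {"a": {"x": 2}}, lambda v1, v2: v1))
# Combine the two values
print(deep_merge({"n": 1}, {"n": 2}, lambda v1, v2: v1 + v2))

try:
    deep_merge({"a": {"x": 1}}, {"a": {"x": 2}}, raise_on_conflict)
except ValueError as exc:
    print("merge failed:", exc)
```

Because the merge works in place, keys merged before a raising resolver fires stay in d1; pass a copy (for example copy.deepcopy(d1)) if that matters.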

- user176105 · Answer 4

I haven't tested this extensively, so your feedback is welcome.

def mergeDict(dict1, dict2):
    # Start from the union of both dicts (dict2 wins on shared keys)
    dict3 = {**dict1, **dict2}
    for key, value in dict3.items():
        if key in dict1 and key in dict2:
            # Key present in both inputs: keep both values in a list
            dict3[key] = [value, dict1[key]]
    return dict3

# Sample inputs for illustration only; the original built dict1 and dict2
# from unspecified key and value lists via dict(zip(keys, values)).
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}

dict3 = mergeDict(dict1, dict2)

# List the keys in alphabetical order.
print(sorted(dict3))

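This merges two dictionaries and, for keys present in both, collects the two values into a list. It does not recurse into nested dictionaries.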

- Dorcioman · Answer 5

from collections import defaultdict
from itertools import chain

class DictHelper:

    @staticmethod
    def merge_dictionaries(*dictionaries, override=True):
        merged_dict = defaultdict(set)
        # Build a set using all dict keys
        all_unique_keys = set(chain(*[list(dictionary.keys()) for dictionary in dictionaries]))
        for key in all_unique_keys:
            # Establish the value type for each key; dictionaries where the key
            # is missing yield None, which is filtered out of the result
            keys_value_type = list(set(filter(lambda obj_type: obj_type != type(None), [type(dictionary.get(key, None)) for dictionary in dictionaries])))
            if len(keys_value_type) != 1:
                raise Exception("Different objects type for same key: {keys_value_type}".format(keys_value_type=keys_value_type))

            if keys_value_type[0] == list:
                # Concatenate the lists for this key and store them as a set
                values = list(chain(*[dictionary.get(key, []) for dictionary in dictionaries]))
                merged_dict[key].update(values)

            elif keys_value_type[0] == dict:
                # Extract all dictionaries by key and enter in recursion
                dicts_to_merge = list(filter(lambda obj: obj is not None, [dictionary.get(key, None) for dictionary in dictionaries]))
                merged_dict[key] = DictHelper.merge_dictionaries(*dicts_to_merge, override=override)

            else:
                # if override => get value from last dictionary else make a list of all values
                values = list(filter(lambda obj: obj is not None, [dictionary.get(key, None) for dictionary in dictionaries]))
                merged_dict[key] = values[-1] if override else values

        return dict(merged_dict)



if __name__ == '__main__':
  d1 = {'aaaaaaaaa': ['to short', 'to long'], 'bbbbb': ['to short', 'to long'], "cccccc": ["the is a test"]}
  d2 = {'aaaaaaaaa': ['field is not a bool'], 'bbbbb': ['field is not a bool']}
  d3 = {'aaaaaaaaa': ['filed is not a string', "to short"], 'bbbbb': ['field is not an integer']}
  print(DictHelper.merge_dictionaries(d1, d2, d3))

  d4 = {"a": {"x": 1, "y": 2, "z": 3, "d": {"x1": 10}}}
  d5 = {"a": {"x": 10, "y": 20, "d": {"x2": 20}}}
  print(DictHelper.merge_dictionaries(d4, d5))

Output:

{'bbbbb': {'to long', 'field is not an integer', 'to short', 'field is not a bool'}, 
'aaaaaaaaa': {'to long', 'to short', 'filed is not a string', 'field is not a bool'}, 
'cccccc': {'the is a test'}}

{'a': {'y': 20, 'd': {'x1': 10, 'x2': 20}, 'z': 3, 'x': 10}}

- Tadeck · Answer 6

This should help with merging all items from dict2 into dict1:

for item in dict2:
    if item in dict1:
        for leaf in dict2[item]:
            dict1[item][leaf] = dict2[item][leaf]
    else:
        dict1[item] = dict2[item]

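Please test it and tell us whether this is what you wanted.

EDIT:

The solution above merges only one level, but it correctly solves the example given by the OP. To merge multiple levels, recursion should be used.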
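A minimal sketch of what such a recursive version might look like, assuming that values from dict2 should win whenever both sides are not dictionaries. The function name merge_into is an illustrative choice, and the inputs reuse the deeper example from the question.

```python
def merge_into(dict1, dict2):
    """Recursively merge dict2 into dict1 (dict1 is modified in place)."""
    for item, value in dict2.items():
        if isinstance(value, dict) and isinstance(dict1.get(item), dict):
            # Both sides are dicts: descend one level
            merge_into(dict1[item], value)
        else:
            # New key, or at least one side is a leaf: take dict2's value
            dict1[item] = value
    return dict1

dict1 = {1: {"a": {"A"}}, 2: {"b": {"B"}}}
dict2 = {2: {"c": {"z": {"y": {"C"}}}}, 3: {"d": {"D"}}}

merge_into(dict1, dict2)
print(dict1)
# {1: {'a': {'A'}}, 2: {'b': {'B'}, 'c': {'z': {'y': {'C'}}}}, 3: {'d': {'D'}}}
```

Subtrees that exist only in dict2 are shared by reference rather than copied; wrap the assignment in copy.deepcopy(value) if the two structures must stay independent.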