Deep merge dictionaries of dictionaries in Python

Question


215

I need to merge multiple dictionaries. Here is what I have, for example:

dict1 = {1:{"a":{"A"}}, 2:{"b":{"B"}}}

dict2 = {2:{"c":{"C"}}, 3:{"d":{"D"}}}

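Here A, B, C and D are leaves of the tree, like {"info1":"value", "info2":"value2"}.

The dictionaries have an unknown number of levels (depth); it could be {2:{"c":{"z":{"y":{C}}}}}.

In my case it represents a directory/file structure, with nodes being docs and leaves being files.

I want to merge them to obtain: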

 dict3 = {1:{"a":{"A"}}, 2:{"b":{"B"},"c":{"C"}}, 3:{"d":{"D"}}}

I am not sure how I could do that easily with Python.

- fdhex

Check out my NestedDict class: http://stackoverflow.com/a/16296144/2334951 It manages nested dictionary structures, including operations such as merging. - SzieberthAdam

3

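A warning to everyone looking for a solution: this question is only about nested dicts. Most of the answers do not properly handle the more complex case where the structure contains lists of dicts. If you need that, try @Osiloke's answer: https://dev59.com/IWw05IYBdhLWcg3wfx80#25270947 - SHernandez

See also: python dpath merge - dreftymac

One pitfall of @andrew cooke's solution is that the changes affect the first dictionary even when a conflict error is raised. To avoid it, you can use @andrew cooke's source code to write a recursive helper function with an extra parameter that holds a clone of the first dictionary. That parameter, rather than the first dictionary, is modified and returned. See: https://dev59.com/IWw05IYBdhLWcg3wfx80#71700270 - diogo

You can use Addict to merge the dictionaries: d = Dict({1:{"a":{'A'}}, 2:{"b":{'B'}}}); d.update({2:{"c":{'C'}}, 3:{"d":{'D'}}}); d => {1: {'a': {'A'}}, 2: {'b': {'B'}, 'c': {'C'}}, 3: {'d': {'D'}}} - bartolo-otrit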


36 Answers


- kemri · Answer 1

How about another answer? This one also avoids mutation/side effects:

def merge(dict1, dict2):
    output = {}

    # union of both dicts; for keys present in both, the value from `dict1` is kept
    intersection = {**dict2, **dict1}

    for k_intersect, v_intersect in intersection.items():
        if k_intersect not in dict1:
            v_dict2 = dict2[k_intersect]
            output[k_intersect] = v_dict2

        elif k_intersect not in dict2:
            output[k_intersect] = v_intersect

        elif isinstance(v_intersect, dict) and isinstance(dict2[k_intersect], dict):
            # both sides are dicts: merge them recursively
            output[k_intersect] = merge(v_intersect, dict2[k_intersect])

        else:
            output[k_intersect] = v_intersect

    return output

dict1 = {1:{"a":{"A"}}, 2:{"b":{"B"}}}
dict2 = {2:{"c":{"C"}}, 3:{"d":{"D"}}}
dict3 = {1:{"a":{"A"}}, 2:{"b":{"B"},"c":{"C"}}, 3:{"d":{"D"}}}

assert dict3 == merge(dict1, dict2)

- singingwolfboy · Answer 2

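Of course, the code will depend on your rules for resolving merge conflicts. Here's a version that takes an arbitrary number of arguments and merges them recursively to an arbitrary depth, without mutating any object. It uses the following rules to resolve merge conflicts:

dictionaries take precedence over non-dict values ({"foo": {...}} takes precedence over {"foo": "bar"})
later arguments take precedence over earlier arguments (if you merge {"a": 1}, {"a": 2}, and {"a": 3} in order, the result will be {"a": 3})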

try:
    from collections.abc import Mapping  # Python 3.3+
except ImportError:
    from collections import Mapping  # Python 2

def merge_dicts(*dicts):                                                            
    """                                                                             
    Return a new dictionary that is the result of merging the arguments together.   
    In case of conflicts, later arguments take precedence over earlier arguments.   
    """                                                                             
    updated = {}                                                                    
    # grab all keys                                                                 
    keys = set()                                                                    
    for d in dicts:                                                                 
        keys = keys.union(set(d))                                                   

    for key in keys:                                                                
        values = [d[key] for d in dicts if key in d]                                
        # which ones are mapping types? (aka dict)                                  
        maps = [value for value in values if isinstance(value, Mapping)]            
        if maps:                                                                    
            # if we have any mapping types, call recursively to merge them          
            updated[key] = merge_dicts(*maps)                                       
        else:                                                                       
            # otherwise, just grab the last value we have, since later arguments    
            # take precedence over earlier arguments                                
            updated[key] = values[-1]                                               
    return updated
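
To make the two conflict rules concrete, here is a small self-contained sketch. `merge_by_rules` is a hypothetical, condensed restatement of the rules written for illustration, not the function from this answer, and it assumes Python 3.

```python
from collections.abc import Mapping


def merge_by_rules(*dicts):
    """Condensed restatement of the two conflict rules (hypothetical helper)."""
    merged = {}
    for key in {k for d in dicts for k in d}:
        values = [d[key] for d in dicts if key in d]
        maps = [v for v in values if isinstance(v, Mapping)]
        # Rule 1: mappings beat non-mappings.
        # Rule 2: otherwise the value from the last argument wins.
        merged[key] = merge_by_rules(*maps) if maps else values[-1]
    return merged


# Rule 1: the dict under "foo" survives although a later argument holds a string.
print(merge_by_rules({"foo": {"x": 1}}, {"foo": "bar"}))  # {'foo': {'x': 1}}
# Rule 2: later arguments win among non-dict values.
print(merge_by_rules({"a": 1}, {"a": 2}, {"a": 3}))       # {'a': 3}
```

Note that under rule 1 a non-dict value is silently dropped whenever any argument holds a dict for the same key, so this may not be what you want if such a mismatch should count as an error.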

- user16779014 · Answer 3

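Here is a solution I made that recursively merges dictionaries to any depth. The first dictionary passed to the function is the master dictionary: its values take precedence over the values of the same keys in the second dictionary.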

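def merge(dict1: dict, dict2: dict) -> dict:
    # work on a shallow copy so the caller's `dict1` is not modified
    merged = dict(dict1)

    for key, value in dict2.items():
        if isinstance(value, dict) and isinstance(merged.get(key, {}), dict):
            # both sides are dicts (or the key is new): merge recursively
            merged[key] = merge(merged.get(key, {}), value)
        elif key not in merged:
            # `dict1` is the master: only fill in keys it lacks
            merged[key] = value

    return merged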

- conmak · Answer 4

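Here is a deep-update function variant based on pure Python 3 sets. It updates nested dictionaries one level at a time, calling itself to update each next level of dictionary values: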

def deep_update(dict_original, dict_update):
    if isinstance(dict_original, dict) and isinstance(dict_update, dict):
        output=dict(dict_original)
        keys_original=set(dict_original.keys())
        keys_update=set(dict_update.keys())
        similar_keys=keys_original.intersection(keys_update)
        similar_dict={key:deep_update(dict_original[key], dict_update[key]) for key in similar_keys}
        new_keys=keys_update.difference(keys_original)
        new_dict={key:dict_update[key] for key in new_keys}
        output.update(similar_dict)
        output.update(new_dict)
        return output
    else:
        return dict_update

A simple example:

x={'a':{'b':{'c':1, 'd':1}}}
y={'a':{'b':{'d':2, 'e':2}}, 'f':2}

print(deep_update(x, y))
>>> {'a': {'b': {'c': 1, 'd': 2, 'e': 2}}, 'f': 2}

- wong steve · Answer 5

class Utils(object):

    """

    >>> a = { 'first' : { 'all_rows' : { 'pass' : 'dog', 'number' : '1' } } }
    >>> b = { 'first' : { 'all_rows' : { 'fail' : 'cat', 'number' : '5' } } }
    >>> Utils.merge_dict(b, a) == { 'first' : { 'all_rows' : { 'pass' : 'dog', 'fail' : 'cat', 'number' : '5' } } }
    True

    >>> main = {'a': {'b': {'test': 'bug'}, 'c': 'C'}}
    >>> suply = {'a': {'b': 2, 'd': 'D', 'c': {'test': 'bug2'}}}
    >>> Utils.merge_dict(main, suply) == {'a': {'b': {'test': 'bug'}, 'c': 'C', 'd': 'D'}}
    True

    """

    @staticmethod
    def merge_dict(main, suply):
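        """
        Merge `suply` into `main` and return `main` (modified in place).
        `main` is the primary dict and `suply` only supplements it;
        on conflict the value from `main` is kept.
        :return: the merged `main` dict
        """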
        for key, value in suply.items():
            if key in main:
                if isinstance(main[key], dict):
                    if isinstance(value, dict):
                        Utils.merge_dict(main[key], value)
                    else:
                        pass
                else:
                    pass
            else:
                main[key] = value
        return main

if __name__ == '__main__':
    import doctest
    doctest.testmod()

- SlackSpace · Answer 6

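Hey, I had the same problem too, but I came up with a solution and am posting it here in case it is useful to others. It merges nested dictionaries and also adds up the values. I needed to calculate some probabilities, so this worked great for me: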

#used to copy a nested dict to a nested dict
def deepupdate(target, src):
    for k, v in src.items():
        if k in target:
            for k2, v2 in src[k].items():
                if k2 in target[k]:
                    target[k][k2]+=v2
                else:
                    target[k][k2] = v2
        else:
            target[k] = copy.deepcopy(v)

Using the above method, we can merge:

target = {'6,6': {'6,63': 1}, '63,4': {'4,4': 1}, '4,4': {'4,3': 1}, '6,63': {'63,4': 1}}

src = {'5,4': {'4,4': 1}, '5,5': {'5,4': 1}, '4,4': {'4,3': 1}}

This becomes: {'5,5': {'5,4': 1}, '5,4': {'4,4': 1}, '6,6': {'6,63': 1}, '63,4': {'4,4': 1}, '4,4': {'4,3': 2}, '6,63': {'63,4': 1}}

Also notice the changes here:

target = {'6,6': {'6,63': 1}, '6,63': {'63,4': 1}, '4,4': {'4,3': 1}, '63,4': {'4,4': 1}}

src = {'5,4': {'4,4': 1}, '4,3': {'3,4': 1}, '4,4': {'4,9': 1}, '3,4': {'4,4': 1}, '5,5': {'5,4': 1}}

merge = {'5,4': {'4,4': 1}, '4,3': {'3,4': 1}, '6,63': {'63,4': 1}, '5,5': {'5,4': 1}, '6,6': {'6,63': 1}, '3,4': {'4,4': 1}, '63,4': {'4,4': 1}, '4,4': {'4,3': 1, '4,9': 1}}

Don't forget to also add the import for copy:

import copy
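
The function above handles exactly two levels of nesting. As a minimal sketch of the same idea for arbitrary depth, assuming every leaf is a number that should be summed, something like the following could work. `deep_add` is a hypothetical name, not part of the answer.

```python
import copy


def deep_add(target, src):
    """Merge `src` into `target` in place, summing values at shared leaf keys."""
    for key, value in src.items():
        if key not in target:
            # new key: copy it so `target` does not share objects with `src`
            target[key] = copy.deepcopy(value)
        elif isinstance(target[key], dict) and isinstance(value, dict):
            deep_add(target[key], value)
        else:
            # assumes both leaves support `+` (e.g. counts or probabilities)
            target[key] += value
    return target


counts = {"4,4": {"4,3": 1}, "6,6": {"6,63": 1}}
deep_add(counts, {"4,4": {"4,3": 1, "4,9": 1}, "5,5": {"5,4": 1}})
print(counts)
# {'4,4': {'4,3': 2, '4,9': 1}, '6,6': {'6,63': 1}, '5,5': {'5,4': 1}}
```

If a key holds a dict on one side and a number on the other, the `+=` raises a TypeError, which is probably the safest behavior for this kind of data.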

- diogo · Answer 7

Returns the merged dictionary without affecting the input dictionaries.

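from copy import deepcopy
from typing import Dict, List, Optional


def _merge_dicts(dictA: Dict, dictB: Dict) -> Dict:
    # pass a deep clone of `dictA` as the result; a shallow copy would share
    # the nested dicts with `dictA`, so merging into them would modify the input
    return _merge_dicts_aux(dictA, dictB, deepcopy(dictA))


def _merge_dicts_aux(dictA: Dict, dictB: Dict, result: Dict, path: Optional[List[str]] = None) -> Dict: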

    # conflict path, None if none
    if path is None:
        path = []

    for key in dictB:

        # if the key doesn't exist in A, add the B element to A
        if key not in dictA:
            result[key] = dictB[key]

        else:
            # if the key value is a dict, both in A and in B, merge the dicts
            if isinstance(dictA[key], dict) and isinstance(dictB[key], dict):
                _merge_dicts_aux(dictA[key], dictB[key], result[key], path + [str(key)])

            # if the key value is the same in A and in B, ignore
            elif dictA[key] == dictB[key]:
                pass

            # if the key value differs in A and in B, raise error
            else:
                err: str = f"Conflict at {'.'.join(path + [str(key)])}"
                raise Exception(err)

    return result

Inspired by @andrew cooke's solution.
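
To illustrate the property claimed here, namely that the inputs stay untouched even when a conflict is raised, here is a compact self-contained sketch. `merge_strict` is a hypothetical name; it relies on `copy.deepcopy` and raises `ValueError` instead of a bare `Exception`, both of which are assumptions of this sketch rather than the answer's exact code.

```python
from copy import deepcopy


def merge_strict(a, b, path=()):
    """Return a new dict; raise ValueError if a leaf differs between `a` and `b`."""
    result = deepcopy(a)
    for key, b_value in b.items():
        if key not in result:
            result[key] = deepcopy(b_value)
        elif isinstance(result[key], dict) and isinstance(b_value, dict):
            result[key] = merge_strict(result[key], b_value, path + (str(key),))
        elif result[key] != b_value:
            raise ValueError("Conflict at " + ".".join(path + (str(key),)))
    return result


a = {"x": {"y": 1}}
try:
    merge_strict(a, {"x": {"z": 2, "y": 9}})
except ValueError as exc:
    print(exc)  # Conflict at x.y
print(a)        # {'x': {'y': 1}}
```

Because all work happens on the copy, a conflict found halfway through the merge leaves `a` exactly as it was. The price is a repeated deep copy at every level, which may matter for very large structures.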

- Soudipta Dutta · Answer 8

def m(a, b):
    # one-level merge: start from a copy of `a` so that its keys are kept,
    # then combine the inner dicts for every key of `b`
    aa = dict(a)
    aa.update({k: dict(a.get(k, {}), **v) for k, v in b.items()})
    print(aa)
    return aa

d1 = {1:{"a":"A"}, 2:{"b":"B"}}

d2 = {2:{"c":"C"}, 3:{"d":"D"}}

dict1 = {1:{"a":{1}}, 2:{"b":{2}}}

dict2 = {2:{"c":{222}}, 3:{"d":{3}}}

m(d1,d2)

m(dict1,dict2)

"""
Output :

{1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}}


{1: {'a': {1}}, 2: {'b': {2}, 'c': {222}}, 3: {'d': {3}}}

"""

- Asclepius · Answer 9

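The merge function below is a more professional version of Ali's answer, as it avoids wastefully fetching the values more than once. It works in place.

The merge_new function below is not in-place. It returns a new dictionary. It does not rely on copy.deepcopy.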

def merge(base: dict, update: dict) -> None:
    """Recursively merge `update` into `base` in-place."""
    for k, update_v in update.items():
        base_v = base.get(k)
        if isinstance(base_v, dict) and isinstance(update_v, dict):
            merge(base_v, update_v)
        else:
            base[k] = update_v

def merge_new(base: dict, update: dict) -> dict:
    """Return the updated result after recursively merging `update` into `base`."""
    result = base.copy()
    for k, update_v in update.items():
        base_v = result.get(k)
        if isinstance(base_v, dict) and isinstance(update_v, dict):
            result[k] = merge_new(base_v, update_v)
        else:
            result[k] = update_v
    return result

Test case:

test_data_base = {
    'a': 1,
    'b': {'c': 1, 'd': 2},
    'c': {'d': {'e': 0, 'f': 1, 'p': {'q': 4}}},
    'x': 0,
    'y': {'x': 3},
}

test_data_update = {
    'a': 9,
    'b': {'d': 3, 'e': 3},
    'c': {'d': {'e': 1, 'g': 8, 'p': {'r': 5, 's': 6}}, 'h': 7},
    'd': 6,
    'e': {'f': 10, 'g': 10},
}

test_expected_updated_data = {
    'a': 9,
    'b': {'c': 1, 'd': 3, 'e': 3},
    'c': {'d': {'e': 1, 'f': 1, 'p': {'q': 4, 'r': 5, 's': 6}, 'g': 8}, 'h': 7},
    'x': 0,
    'y': {'x': 3},
    'd': 6,
    'e': {'f': 10, 'g': 10},
}

# Test merge_new (not in-place)
import copy
test_data_base_copy = copy.deepcopy(test_data_base)
test_actual_updated_data = merge_new(test_data_base, test_data_update)
assert(test_actual_updated_data == test_expected_updated_data)
assert(test_data_base == test_data_base_copy)

# Test merge in-place
merge(test_data_base, test_data_update)
assert(test_data_base == test_expected_updated_data)

- mentatkgs · Answer 10

I have tested your solutions and decided to use this one in my project:

def mergedicts(dict1, dict2, conflict, no_conflict):
    for k in set(dict1.keys()).union(dict2.keys()):
        if k in dict1 and k in dict2:
            yield (k, conflict(dict1[k], dict2[k]))
        elif k in dict1:
            yield (k, no_conflict(dict1[k]))
        else:
            yield (k, no_conflict(dict2[k]))

dict1 = {1:{"a":"A"}, 2:{"b":"B"}}
dict2 = {2:{"c":"C"}, 3:{"d":"D"}}

#this helper function allows for recursion and the use of reduce
def f2(x, y):
    return dict(mergedicts(x, y, f2, lambda x: x))

from functools import reduce  # `reduce` is no longer a builtin in Python 3

print(dict(mergedicts(dict1, dict2, f2, lambda x: x)))
print(dict(reduce(f2, [dict1, dict2])))

Passing functions as parameters is the key to extending jterrace's solution so that it behaves like all the other recursive solutions.
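
As an illustration of how the conflict function can be swapped, here is a hedged sketch with a resolver that recurses into dicts and otherwise keeps the right-hand value. `prefer_right` is a hypothetical name, and `mergedicts` is repeated only so the snippet runs on its own under Python 3.

```python
from functools import reduce


def mergedicts(dict1, dict2, conflict, no_conflict):
    # same generator shape as above, repeated so this sketch is self-contained
    for k in set(dict1) | set(dict2):
        if k in dict1 and k in dict2:
            yield k, conflict(dict1[k], dict2[k])
        elif k in dict1:
            yield k, no_conflict(dict1[k])
        else:
            yield k, no_conflict(dict2[k])


def prefer_right(x, y):
    """Hypothetical resolver: recurse into dicts, otherwise keep the right value."""
    if isinstance(x, dict) and isinstance(y, dict):
        return dict(mergedicts(x, y, prefer_right, lambda v: v))
    return y


merged = reduce(prefer_right, [{1: {"a": "A"}}, {1: {"a": "B", "b": "C"}}, {2: "D"}])
print(merged == {1: {"a": "B", "b": "C"}, 2: "D"})  # True
```

Unlike `f2`, this resolver does not assume that every conflicting pair consists of two dicts, so conflicting leaf values such as "A" and "B" are resolved instead of raising an error.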