How to recursively replace characters in the keys of a nested dictionary?

Question

36

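I'm trying to create a generic function that replaces dots in the keys of a nested dictionary. I have a non-generic function that goes three levels deep, but there must be a generic way to do this. Any help is appreciated!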

output = {'key1': {'key2': 'value2', 'key3': {'key4 with a .': 'value4', 'key5 with a .': 'value5'}}} 

def print_dict(d):
    # Non-generic version: handles exactly three levels of nesting.
    new = {}
    for key, value in d.items():
        new_key = key.replace(".", "-")
        if isinstance(value, dict):
            new[new_key] = {}
            for key2, value2 in value.items():
                new_key2 = key2.replace(".", "-")
                if isinstance(value2, dict):
                    new[new_key][new_key2] = {}
                    for key3, value3 in value2.items():
                        new[new_key][new_key2][key3.replace(".", "-")] = value3
                else:
                    new[new_key][new_key2] = value2
        else:
            new[new_key] = value
    return new

print(print_dict(output))

Update: to answer my own question, I put together a solution using json object_hooks:

import json

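def remove_dots(obj):
    # Iterate over a snapshot of the keys: adding and deleting entries while
    # iterating obj.keys() directly can skip keys or raise RuntimeError on Python 3.
    for key in list(obj.keys()):
        new_key = key.replace(".", "-")
        if new_key != key:
            obj[new_key] = obj[key]
            del obj[key]
    return obj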

output = {'key1': {'key2': 'value2', 'key3': {'key4 with a .': 'value4', 'key5 with a .': 'value5'}}}
new_json = json.loads(json.dumps(output), object_hook=remove_dots) 

print(new_json)

- Bas Tichelaar

9

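An answer to your own question should be posted as an answer, not edited into the question. - Oleh Prypin

Use my solution, because it is ten times faster than the others. - horejsek

Very nicely done. object_hook really simplifies the whole process, especially in my case, where I use a "key" named "include" that has to recursively load additional JSON files to form a multidimensional dictionary. - Talk2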

1

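For some unexplained reason, the object_hook approach with the remove_dots() method above replaced only some of the key names. I still have some keys that kept their dots. Could this be related to some odd ordering issue in obj.keys()? Do I need to build an ordered dictionary? I thought Python 3 had no dictionary ordering problems? - Craig Jackson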
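One plausible explanation for the skipped keys (an assumption, not something confirmed in this thread): remove_dots adds and deletes entries while iterating obj.keys(), and the Python documentation warns that iterating a dictionary view while adding or deleting entries may raise a RuntimeError or fail to iterate over all entries. Below is a minimal sketch that sidesteps the issue by building a new dict inside the hook instead of mutating the one being iterated. The name remove_dots_copy is made up for this example.

```python
import json

def remove_dots_copy(obj):
    # Build a fresh dict instead of mutating obj while iterating over it.
    return {key.replace(".", "-"): value for key, value in obj.items()}

output = {'key1': {'key2': 'value2', 'key3': {'key4 with a .': 'value4', 'key5 with a .': 'value5'}}}
new_json = json.loads(json.dumps(output), object_hook=remove_dots_copy)
print(new_json)
```

Because json.loads calls the hook for every decoded object, innermost first, nested dictionaries are covered without explicit recursion.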

9 Answers

22

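Actually, all of the answers contain a mistake that can lead to the wrong types in the result.

I'd take @ngenain's answer and improve it a bit. My solution handles types derived from dict (OrderedDict, defaultdict, etc.) and covers not only list but also set and tuple. I also do a simple type check at the start of the function for the most common types to reduce the number of comparisons (this may give a bit of speed on large amounts of data).

Works for Python 3. For Py2, replace obj.items() with obj.iteritems().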

def change_keys(obj, convert):
    """
    Recursively goes through the dictionary obj and replaces keys with the convert function.
    """
    if isinstance(obj, (str, int, float)):
        return obj
    if isinstance(obj, dict):
        new = obj.__class__()
        for k, v in obj.items():
            new[convert(k)] = change_keys(v, convert)
    elif isinstance(obj, (list, set, tuple)):
        new = obj.__class__(change_keys(v, convert) for v in obj)
    else:
        return obj
    return new

If I understand the need correctly, most users want to convert the keys so they can be used with MongoDB, which does not allow dots in key names.
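As a rough usage sketch of the function above for the MongoDB case: the convert function dots_to_dashes and the sample data are made up for illustration, and change_keys is repeated only so the snippet runs on its own.

```python
from collections import OrderedDict

def change_keys(obj, convert):
    """Recursively rebuild obj, applying convert to every dictionary key."""
    if isinstance(obj, (str, int, float)):
        return obj
    if isinstance(obj, dict):
        new = obj.__class__()
        for k, v in obj.items():
            new[convert(k)] = change_keys(v, convert)
    elif isinstance(obj, (list, set, tuple)):
        new = obj.__class__(change_keys(v, convert) for v in obj)
    else:
        return obj
    return new

def dots_to_dashes(key):
    # Hypothetical convert function: replace dots in a key with dashes.
    return key.replace(".", "-")

data = OrderedDict([("a.b", ({"c.d": 1}, [{"e.f": 2}])), ("g", "h.i")])
result = change_keys(data, dots_to_dashes)
print(type(result).__name__)
print(result)
```

The container types survive (the result is still an OrderedDict and the tuple stays a tuple), and string values such as "h.i" are left alone. One caveat worth keeping in mind: obj.__class__() is called without arguments, so a defaultdict would come back without its default_factory.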

- baldr

1

This is the best one. It supports both Python 2 and Python 3, but the line "if isinstance(obj, (str, int, float))" is not needed. It works without that line too. - F.Tamy

6

Nice answer. For completeness, I would add a convert function to your answer: def convert(k): return k.replace('.', '-') - John

@F.Tamy... it saves time when processing large dictionaries. - ZF007

You can write type(obj)(something) instead of obj.__class__(something). - funnydman

8

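I used @horejsek's code, but adapted it to accept nested dictionaries that contain lists, and to take a function that rewrites the key string.

I had a similar problem to solve: I wanted to replace keys in the lowercase-with-underscores convention with keys in the camel case convention, and vice versa.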

def change_dict_naming_convention(d, convert_function):
    """
    Convert a nested dictionary from one convention to another.
    Args:
        d (dict): dictionary (nested or not) to be converted.
        convert_function (func): function that takes the string in one convention and returns it in the other one.
    Returns:
        Dictionary with the new keys.
    """
    new = {}
    for k, v in d.items():  # d.iteritems() on Python 2
        new_v = v
        if isinstance(v, dict):
            new_v = change_dict_naming_convention(v, convert_function)
        elif isinstance(v, list):
            new_v = list()
            for x in v:
                new_v.append(change_dict_naming_convention(x, convert_function))
        new[convert_function(k)] = new_v
    return new

- jllopezpino

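It works unless d is not a dictionary, in which case you cannot call d.items(). My dictionary contained an array of strings, and it failed when recursing. Check at the top of the function whether isinstance(d, dict), and if that is false, just return d. Then it should work for anything. - Mnebuerquo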
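The guard described in the comment above could look roughly like this. It is a sketch rather than the answer author's code, and the convert function snake_to_camel and the sample data are invented for the example.

```python
def change_dict_naming_convention(d, convert_function):
    # Guard from the comment: anything that is not a dict is returned unchanged.
    if not isinstance(d, dict):
        return d
    new = {}
    for k, v in d.items():
        new_v = v
        if isinstance(v, dict):
            new_v = change_dict_naming_convention(v, convert_function)
        elif isinstance(v, list):
            new_v = [change_dict_naming_convention(x, convert_function) for x in v]
        new[convert_function(k)] = new_v
    return new

def snake_to_camel(name):
    # Hypothetical convert function: "user_name" -> "userName".
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)

data = {"user_name": "ann", "tag_list": ["a_b", "c_d"], "home_address": {"zip_code": "0"}}
result = change_dict_naming_convention(data, snake_to_camel)
print(result)
```

With the guard in place, the strings inside "tag_list" pass through untouched instead of raising an AttributeError.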

7

Here is a simple recursive solution that handles nested lists and dictionaries.

def change_keys(obj, convert):
    """
    Recursively goes through the dictionary obj and replaces keys with the convert function.
    """
    if isinstance(obj, dict):
        new = {}
        for k, v in obj.iteritems():  # Python 2 API; on Python 3 use obj.items()
            new[convert(k)] = change_keys(v, convert)
    elif isinstance(obj, list):
        new = []
        for v in obj:
            new.append(change_keys(v, convert))
    else:
        return obj
    return new

- ngenain

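Nice, but it forces classes derived from dict back to plain dict. For example, you may lose the key order of an OrderedDict. I've posted an improved answer based on yours. - baldr

For Python 3, use obj.items() instead. - DropItLikeItsHot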

2

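You need to delete the original key, but doing so in the loop body raises RuntimeError: dictionary changed size during iteration.

To get around this, iterate over a copy of the original object's keys while modifying the original: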

def change_keys(obj):
    # list(obj) takes a snapshot of the keys, so obj can be modified inside the loop.
    for k in list(obj):
        if hasattr(obj[k], '__getitem__'):
            change_keys(obj[k])
        if '.' in k:
            obj[k.replace('.', '$')] = obj[k]
            del obj[k]

>>> foo = {'foo': {'bar': {'baz.121': 1}}}
>>> change_keys(foo)
>>> foo
{'foo': {'bar': {'baz$121': 1}}}

- bk0

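It gives the following error on the line if hasattr(obj[k], '__getitem__'): TypeError: string indices must be integers. - Gürol Canbek

Instead of hasattr(...), try from collections import Mapping and then if isinstance(obj[k], Mapping).... The change serves the same purpose (trying to determine whether the value is a [nested] dictionary), but should be more robust. - lnNoam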
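A sketch of that suggestion applied to the in-place version. Two things here are not from the thread: the import uses collections.abc, since importing Mapping directly from collections stopped working in Python 3.10, and the sample data includes a string value to exercise the case that tripped up hasattr.

```python
from collections.abc import Mapping

def change_keys(obj):
    # Iterate over a snapshot of the keys so entries can be replaced in the loop.
    for k in list(obj):
        # Only recurse into dict-like values; strings and numbers are skipped.
        if isinstance(obj[k], Mapping):
            change_keys(obj[k])
        if '.' in k:
            obj[k.replace('.', '$')] = obj[k]
            del obj[k]

foo = {'foo': {'bar': {'baz.121': 'a string value'}}, 'top.level': 1}
change_keys(foo)
print(foo)
```

Like the original, this mutates foo in place and returns nothing, and it does not descend into lists.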

1

You can convert everything to JSON, run the replacement over the whole string, and load the JSON back.

import json

def nested_replace(data, old, new):
    json_string = json.dumps(data)
    replaced = json_string.replace(old, new)
    fixed_json = json.loads(replaced)
    return fixed_json

Or as a one-liner:

def short_replace(data, old, new):
    return json.loads(json.dumps(data).replace(old, new))

- Ariel Voskov

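This replaces occurrences of the string in values as well as in keys. The original question asked for a solution that applies to keys. If the replacement were switched to a regex approach, it could probably be limited to keys. It is a brute-force approach, but not very memory efficient. - ingyhere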
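To make the point in the comment above concrete, here is a small sketch comparing the string-replace one-liner with a keys-only variant. Note that this swaps the regex idea for json's object_pairs_hook parameter, which hands each decoded object to a callback as a list of (key, value) pairs. The helper name replace_in_keys and the sample data are made up for the example.

```python
import json

def short_replace(data, old, new):
    return json.loads(json.dumps(data).replace(old, new))

def replace_in_keys(data, old, new):
    # The hook sees each decoded JSON object as a list of (key, value) pairs.
    def hook(pairs):
        return {key.replace(old, new): value for key, value in pairs}
    return json.loads(json.dumps(data), object_pairs_hook=hook)

data = {"a.b": {"c.d": "v1.0"}}
print(short_replace(data, ".", "-"))    # values are rewritten too
print(replace_in_keys(data, ".", "-"))  # only keys are rewritten
```

A related pitfall when the replaced character is a dot: a float such as 1.5 is serialized as 1.5, so a blanket string replace turns it into 1-5, which is no longer valid JSON and makes json.loads raise an error.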

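In my case I'm converting XML to JSON and trying to strip the @ symbols, and I don't have to worry about affecting values. For me this solution is concise and sufficient. - fish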

0

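I'm guessing you ran into the same problem I did: getting an exception when trying to insert a dictionary whose keys contain dots (.) into a MongoDB collection.

This solution is essentially the same as most of the other answers, but it is more compact, and perhaps less readable, because it uses a single statement that calls itself recursively. For Python 3.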

def replace_keys(my_dict):
    return { k.replace('.', '(dot)'): replace_keys(v) if type(v) == dict else v for k, v in my_dict.items() }

- Mats Bengtsson

Indeed hard to read. I wouldn't use it. - Guido van Steen

0

While jllopezpino's answer only works when you start with a dictionary, mine works whether the original variable is a list or a dictionary.

import re

def fix_camel_cases(data):
    def convert(name):
        # https://dev59.com/IXM_5IYBdhLWcg3w6HvT
        s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
        return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()

    if isinstance(data, dict):
        new_dict = {}
        for key, value in data.items():
            value = fix_camel_cases(value)
            snake_key = convert(key)
            new_dict[snake_key] = value
        return new_dict

    if isinstance(data, list):
        new_list = []
        for value in data:
            new_list.append(fix_camel_cases(value))
        return new_list

    return data

- James Lin

0

Here is a one-liner variant using a dict comprehension, for those who like concise code:

def print_dict(d):
    return {k.replace('.', '-'): print_dict(v) for k, v in d.items()} if isinstance(d, dict) else d

I have only tested this in Python 2.7.

- ecoe

- horejsek · Accepted Answer

Yes, there is a better way:

def print_dict(d):
    new = {}
    for k, v in d.items():  # d.iteritems() on Python 2
        if isinstance(v, dict):
            v = print_dict(v)
        new[k.replace('.', '-')] = v
    return new

(Edit: this is recursion; see Wikipedia for more.)
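A quick sketch of why the recursive version removes the three-level limit from the question: the function calls itself once per nested dictionary, so the depth of the input does not matter (up to Python's recursion limit). The five-level sample data is invented for the example, and d.items() is used so the snippet runs on Python 3.

```python
def print_dict(d):
    new = {}
    for k, v in d.items():
        if isinstance(v, dict):
            v = print_dict(v)  # recurse into the nested dictionary
        new[k.replace('.', '-')] = v
    return new

deep = {'a.1': {'b.2': {'c.3': {'d.4': {'e.5': 'leaf.value'}}}}}
converted = print_dict(deep)
print(converted)
```

Only keys are rewritten, so 'leaf.value' keeps its dot. Dictionaries nested inside lists are not visited by this version.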