Traversing a nested dictionary of arbitrary depth (the dictionary represents a directory tree).

Question

7

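I am a beginner who has just started learning Python.

This question came up because I want users to be able to select a set of files from a directory (including any subdirectories), but unfortunately, on Windows 7, selecting multiple files in Tkinter's default file dialog is broken (http://bugs.python.org/issue8010).

So I am trying an alternative approach (still using Tkinter) to represent the directory structure: building a facsimile of the directory structure out of labels and indented checkboxes, organized as a tree. For example, a directory like this: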

\SomeRootDirectory
    \foo.txt
    \bar.txt
    \Stories
        \Horror
            \scary.txt
            \Trash
                \notscary.txt
        \Cyberpunk
    \Poems
        \doyoureadme.txt

would look like this (where # represents a checkbox):

SomeRootDirectory
    # foo.txt
    # bar.txt
    Stories
        Horror
            # scary.txt
            Trash
                # notscary.txt
        Cyberpunk
    Poems
        # doyoureadme.txt

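Building the raw dictionary from the directory structure is easy using a recipe I found on ActiveState (see below), but I get stuck when I try to iterate over the nested dictionary I am left with.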
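The ActiveState recipe itself is not reproduced in this thread. As a rough illustration only, a nested dictionary of this shape could be built along the following lines. The function name `build_tree` is hypothetical, and the sketch assumes that files map to None and directories map to nested dicts (so an empty directory becomes an empty dict).

```python
import os

def build_tree(root):
    """Return a nested dict describing the contents of the directory `root`.

    Assumption for this sketch: files map to None, sub-directories map to
    nested dicts, and an empty directory becomes an empty dict.
    """
    tree = {}
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isdir(path):
            # Recurse: the sub-directory is described by its own dict.
            tree[name] = build_tree(path)
        else:
            tree[name] = None
    return tree
```

Wrapping the result as `{os.path.basename(root): build_tree(root)}` would give a single top-level key, similar to the `SomeRootDirectory` example above.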

- floer32

5

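When you iterate over the dictionary's key/value pairs (inside a function that takes a dictionary as an argument), you can check whether the value is itself a dictionary. If it is, call your function again, i.e. use recursion here, passing that value to the function as the dictionary; otherwise, process the value. This should solve the variable-depth iteration problem. - avasal

5 Answers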
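The approach described in this comment might be sketched as follows. The function name `iter_files` and the small `example` dictionary are made up for illustration; the sketch assumes that every non-dict value marks a file.

```python
def iter_files(tree, prefix=()):
    """Yield a tuple of path components for every non-dict value in `tree`."""
    for name, value in tree.items():
        path = prefix + (name,)
        if isinstance(value, dict):
            # The value is itself a dictionary: call the function again on it.
            for sub_path in iter_files(value, path):
                yield sub_path
        else:
            # Anything else is treated as a file and reported.
            yield path

# Hypothetical sample data, not taken from the question.
example = {"Root": {"a.txt": None, "Sub": {"b.txt": None, "Empty": {}}}}

for parts in iter_files(example):
    print("/".join(parts))
```

Because the function only ever looks at one level and delegates everything below to a further call, no nesting depth has to be known in advance.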

4

This is preliminary code. Please look it over and tell me where you run into problems.

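# Maps an item's index to its name; -1 stands for the top of the tree.
Parents = {-1: "Root"}

def add_dir(level, parent, index, k):
    print("Directory")
    print("Level=%d, Parent=%s, Index=%d, value=%s" % (level, Parents[parent], index, k))

def add_file(parent, index, k):
    print("File")
    print("Parent=%s, Index=%d, value=%s" % (Parents[parent], index, k))

def f(level=0, parent=-1, index=0, di=None):
    # A truthy value (a non-empty dict) is treated as a directory;
    # None or an empty dict is treated as a file.
    # Note: index is local to each call, so values can repeat across branches.
    for k in di or {}:
        index += 1
        if di[k]:
            Parents[index] = k
            add_dir(level, parent, index, k)
            f(level + 1, index, index, di[k])
        else:
            add_file(parent, index, k)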

a={
    'SomeRootDirectory': {
        'foo.txt': None,
        'bar.txt': None,
        'Stories': {
            'Horror': {
                'scary.txt' : None,
                'Trash' : {
                    'notscary.txt' : None,
                    },
                },
            'Cyberpunk' : None
            },
        'Poems' : {
            'doyoureadme.txt' : None
        }
    }
}

f(di=a)

- spicavigo

Wow, this is great. I updated my question to include my solution (a slightly tweaked version that is basically ready to use with Tkinter, I think). - floer32

2

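I know this is an old question, but I was looking for a simple, clean way to walk nested dictionaries, and this was the closest thing my limited searching turned up. oadams's answer is not enough if you want more than just the file names, and spicavigo's answer looks complicated.

I ended up writing my own function that works the way os.walk does for directories, except that it returns all of the key/value information.

It returns an iterator; for each directory in the nested-dictionary "tree", the iterator yields (path, sub-dicts, values), where:

- path is the path to the dict
- sub-dicts is a tuple of (key, dict) pairs for each sub-dict in this dict
- values is a tuple of (key, value) pairs for each (non-dict) item in this dict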

def walk(d):
    '''
    Walk a tree (nested dicts).
    
    For each 'path', or dict, in the tree, returns a 3-tuple containing:
    (path, sub-dicts, values)
    
    where:
    * path is the path to the dict
    * sub-dicts is a tuple of (key,dict) pairs for each sub-dict in this dict
    * values is a tuple of (key,value) pairs for each (non-dict) item in this dict
    '''
    # nested dict keys
    nested_keys = tuple(k for k in d.keys() if isinstance(d[k],dict))
    # key/value pairs for non-dicts
    items = tuple((k,d[k]) for k in d.keys() if k not in nested_keys)
    
    # return path, key/sub-dict pairs, and key/value pairs
    yield ('/', [(k,d[k]) for k in nested_keys], items)
    
    # recurse each subdict
    for k in nested_keys:
        for res in walk(d[k]):
            # for each result, stick key in path and pass on
            res = ('/%s' % k + res[0], res[1], res[2])
            yield res

Here is the code I used to test it, although it also contains a few unrelated (but neat) things:

import simplejson as json
from collections import defaultdict

# see https://gist.github.com/2012250
tree = lambda: defaultdict(tree)

def walk(d):
    '''
    Walk a tree (nested dicts).
    
    For each 'path', or dict, in the tree, returns a 3-tuple containing:
    (path, sub-dicts, values)
    
    where:
    * path is the path to the dict
    * sub-dicts is a tuple of (key,dict) pairs for each sub-dict in this dict
    * values is a tuple of (key,value) pairs for each (non-dict) item in this dict
    '''
    # nested dict keys
    nested_keys = tuple(k for k in d.keys() if isinstance(d[k],dict))
    # key/value pairs for non-dicts
    items = tuple((k,d[k]) for k in d.keys() if k not in nested_keys)
    
    # return path, key/sub-dict pairs, and key/value pairs
    yield ('/', [(k,d[k]) for k in nested_keys], items)
    
    # recurse each subdict
    for k in nested_keys:
        for res in walk(d[k]):
            # for each result, stick key in path and pass on
            res = ('/%s' % k + res[0], res[1], res[2])
            yield res

# use fancy tree to store arbitrary nested paths/values
mem = tree()

root = mem['SomeRootDirectory']
root['foo.txt'] = None
root['bar.txt'] = None
root['Stories']['Horror']['scary.txt'] = None
root['Stories']['Horror']['Trash']['notscary.txt'] = None
root['Stories']['Cyberpunk']
root['Poems']['doyoureadme.txt'] = None

# convert to json string
s = json.dumps(mem, indent=2)

# print(mem)
print(s)
print('')

# json.loads converts to nested dicts, need to walk them
for (path, dicts, items) in walk(json.loads(s)):
    # this will print every path
    print('[%s]' % path)
    for key, val in items:
        # this will print every key,value pair (skips empty paths)
        print('%s = %s' % (path + key, val))
    print('')

The output looks like this:

{
  "SomeRootDirectory": {
    "foo.txt": null,
    "Stories": {
      "Horror": {
        "scary.txt": null,
        "Trash": {
          "notscary.txt": null
        }
      },
      "Cyberpunk": {}
    },
    "Poems": {
      "doyoureadme.txt": null
    },
    "bar.txt": null
  }
}

[/]

[/SomeRootDirectory/]
/SomeRootDirectory/foo.txt = None
/SomeRootDirectory/bar.txt = None

[/SomeRootDirectory/Stories/]

[/SomeRootDirectory/Stories/Horror/]
/SomeRootDirectory/Stories/Horror/scary.txt = None

[/SomeRootDirectory/Stories/Horror/Trash/]
/SomeRootDirectory/Stories/Horror/Trash/notscary.txt = None

[/SomeRootDirectory/Stories/Cyberpunk/]

[/SomeRootDirectory/Poems/]
/SomeRootDirectory/Poems/doyoureadme.txt = None

- bj0

0

You can traverse a nested dictionary using recursion.

def walk_dict(dictionary):
    for key in dictionary:
        if isinstance(dictionary[key], dict):
            # nested dictionary: recurse into it
            walk_dict(dictionary[key])
        else:
            # do something with dictionary[key]
            pass

Hope this helps :)

- FintanH

0

a={
    'SomeRootDirectory': {
        'foo.txt': None,
        'bar.txt': None,
        'Stories': {
            'Horror': {
                'scary.txt' : None,
                'Trash' : {
                    'notscary.txt' : None,
                    },
                },
            'Cyberpunk' : None
            },
        'Poems' : {
            'doyoureadme.txt' : None
        }
    }
}

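def dict_paths(dictionary, level=0, parents=None, paths=None):
    # Use None defaults so separate calls do not share (and keep growing) the same lists.
    if parents is None:
        parents = []
    if paths is None:
        paths = []
    for key in dictionary:
        # Trim the ancestor list back to the current depth.
        parents = parents[0:level]
        paths.append(parents + [key])
        if dictionary[key]:
            # Non-empty dict: descend with this key as the newest ancestor.
            parents.append(key)
            dict_paths(dictionary[key], level + 1, parents, paths)
    return paths

dp = dict_paths(a)
for p in dp:
    print('/'.join(p))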

- wagner

An explanation would be nice! - gsamaras

- oadams · Accepted Answer

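Here is a function that prints all of your file names. It goes through all the keys in the dictionary, and if a key maps to something that is not a dictionary (in your case, a file name), it prints the name. Otherwise, it calls the function on the dictionary that the key maps to.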

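def print_all_files(directory):
    for filename in directory.keys():
        if not isinstance(directory[filename], dict):
            # Not a dictionary, so this key is a file name.
            print(filename)
        else:
            # A dictionary is a sub-directory: recurse into it.
            print_all_files(directory[filename])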

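This code can be modified to do whatever you want; it is just an example of how recursion lets you avoid fixing on a specific depth.

The key thing to understand is that each time print_all_files is called, it has no knowledge of how deep it is in the tree. It only looks at the files that are directly there and prints their names. If there are directories, it simply runs itself on them.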
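If the depth is needed, for instance to indent the checkbox rows described in the question, it can be passed along as an extra argument. The following is only a sketch under assumptions: the name `render` and its parameters are hypothetical, an empty directory is represented as an empty dict, and the row order simply follows the dictionary's iteration order.

```python
def render(tree, depth=0, indent="    "):
    """Return display lines: directories by name, files prefixed with '# '."""
    lines = []
    for name, value in tree.items():
        if isinstance(value, dict):
            lines.append(indent * depth + name)
            # Each recursive call receives its own depth; no level is hard-coded.
            lines.extend(render(value, depth + 1, indent))
        else:
            # '#' stands in for a checkbox, as in the question.
            lines.append(indent * depth + "# " + name)
    return lines

# The question's example tree, with the empty Cyberpunk directory as {}.
example = {
    "SomeRootDirectory": {
        "foo.txt": None,
        "bar.txt": None,
        "Stories": {
            "Horror": {"scary.txt": None, "Trash": {"notscary.txt": None}},
            "Cyberpunk": {},
        },
        "Poems": {"doyoureadme.txt": None},
    }
}

print("\n".join(render(example)))
```

Each returned line could then be turned into a label or a checkbox widget; that part is left out here.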