What is the right algorithm for building a list of directory paths?

3

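My situation:

I have a list of tuples (shown here as two-item lists). The first item of each tuple is the level of a folder within the directory, and the second item is the folder's name. The tuples are also ordered according to their position in the directory tree.

Here is what the list looks like: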

    single_paths = [
                      [0, "1st Top Level Folder"], 
                      [1, "1st Child To 1st Top Level Folder"],
                      [2, "1st Grandchild To 1st Child Folder"],
                      [2, "2nd Grandchild To 1st Child Folder"],
                      [1, "2nd Child To 1st Top Level Folder"],
                      [2, "1st Grandchild To 2nd Child Folder"],
                      [0, "2nd Top Level Folder"],
                      [1, "1st Child To 2nd Top Level Folder"],
                      [0, "3rd Top Level Folder"],
                   ]

Visual representation of the directory tree:

[image: directory tree diagram]

What I want to achieve: a list of all possible paths, which looks like this:

possible_paths = [
                    ["1st Top Level Folder"],
                    ["1st Top Level Folder", "1st Child To 1st Top Level Folder"],
                    ["1st Top Level Folder", "1st Child To 1st Top Level Folder", "1st Grandchild To 1st Child Folder"],
                    ["1st Top Level Folder", "1st Child To 1st Top Level Folder", "2nd Grandchild To 1st Child Folder"],
                    ["1st Top Level Folder", "2nd Child To 1st Top Level Folder"],
                    ["1st Top Level Folder", "2nd Child To 1st Top Level Folder", "1st Grandchild To 2nd Child Folder"],
                    ["2nd Top Level Folder"],
                    ["2nd Top Level Folder", "1st Child To 2nd Top Level Folder"],
                    ["3rd Top Level Folder"],
                 ]

Which algorithm would you recommend for this? I have spent 3 days on it and can't seem to get the right result. Thanks a lot in advance for your help.


I think this is a good application for a Trie. - Andrew Holmgren
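As a rough illustration of the Trie idea from this comment, here is a minimal sketch that stores the folders in nested dictionaries and then walks them to list every path. The names `build_trie` and `walk` are hypothetical, and the sketch assumes that sibling folders have distinct names (duplicates under the same parent would be merged) and that a level never jumps by more than one.

```python
def build_trie(entries):
    """Build a nested-dict trie from (level, name) pairs given in tree order."""
    root = {}
    # ancestors[d] is the dict that holds the children of the most recent
    # node at depth d - 1; ancestors[0] is the root itself.
    ancestors = [root]
    for level, name in entries:
        del ancestors[level + 1:]  # forget everything deeper than the parent
        node = ancestors[level].setdefault(name, {})
        ancestors.append(node)
    return root


def walk(trie, prefix=()):
    """Yield every path in the trie, each parent before its children."""
    for name, children in trie.items():
        path = [*prefix, name]
        yield path
        yield from walk(children, path)


sample = [(0, "a"), (1, "b"), (1, "c"), (0, "d")]
for path in walk(build_trie(sample)):
    print(path)
```

The trie keeps the whole tree in memory, which is more than the question strictly needs, but it may be convenient if the tree has to be queried again later.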
4 Answers

4

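Since the levels are ordered, you only need to go back up to the right level whenever an entry's level is not deeper than the previous one: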

possible_paths = []
for i, (level, name) in enumerate(single_paths):
    if level == 0:
        cur_path = []
    elif level <= single_paths[i-1][0]:
        cur_path = cur_path[:-(1 + single_paths[i-1][0] - level)]
    cur_path.append(name)
    possible_paths.append(cur_path[:])

So elegant! Thanks a lot :) - Amrovic

3

I'll post my answer too, only because I had already finished it before noticing that there were 2 nearly identical answers.

result = []
cur_level = -1
cur_path = []
for level, name in single_paths:
    if level<=cur_level:
        cur_path = cur_path[:level]
    cur_path.append(name)
    result.append(cur_path.copy())
    cur_level = level

Ah, I think using :level is better ;) @kosciej16 - Z Li
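The two truncation styles seen so far, `cur_path[:-(1 + prev_level - level)]` and `cur_path[:level]`, give the same result: right after an entry at level p is appended, `cur_path` holds exactly p + 1 names, so dropping 1 + p - level names from the end leaves the first `level` names. A small check, using a hypothetical helper `slices_agree`, can make this concrete:

```python
def slices_agree(cur_path):
    """Compare both truncation styles for every level that may follow cur_path."""
    prev_level = len(cur_path) - 1  # level of the entry that was appended last
    return all(
        cur_path[:-(1 + prev_level - level)] == cur_path[:level]
        for level in range(prev_level + 1)
    )


print(slices_agree(["top", "child", "grandchild"]))  # prints True
```

The `[:level]` form does not need the previous entry's level for the slice itself, which is probably why it reads as the simpler of the two.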

2
I would recommend the simplest algorithm ;)
single_paths = [
    [0, "1st Top Level Folder"],
    [1, "1st Child To 1st Top Level Folder"],
    [2, "1st Grandchild To 1st Child Folder"],
    [2, "2nd Grandchild To 1st Child Folder"],
    [1, "2nd Child To 1st Top Level Folder"],
    [2, "1st Grandchild To 2nd Child Folder"],
    [0, "2nd Top Level Folder"],
    [1, "1st Child To 2nd Top Level Folder"],
    [0, "3rd Top Level Folder"],
]
stack = []
for node in single_paths:
    # Pop until the top of the stack is a strict ancestor of this node.
    while stack and stack[-1][0] >= node[0]:
        stack.pop()
    stack.append(node)
    print(stack) # you can store it too, res.append([el[1] for el in stack])

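In general, the stack holds all the nodes on the current path. If the next node has a higher level than the top of the stack, we simply append it to the path; otherwise we remove nodes from the path until the top of the stack is at a level below that of the node being processed.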
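The stack idea described above could be packaged as a generator that yields only the folder names of each path. This is a minimal sketch: the function name `iter_paths` and the sample data are made up for illustration, and the entries are assumed to arrive in tree order as (level, name) pairs.

```python
def iter_paths(entries):
    """Yield the full path (a list of names) for each (level, name) entry."""
    stack = []  # (level, name) pairs on the path to the current entry
    for level, name in entries:
        # Unwind until the top of the stack is a strict ancestor.
        while stack and stack[-1][0] >= level:
            stack.pop()
        stack.append((level, name))
        yield [n for _, n in stack]


sample = [(0, "a"), (1, "b"), (2, "c"), (1, "d"), (0, "e")]
for path in iter_paths(sample):
    print("/".join(path))
# prints: a, a/b, a/b/c, a/d, e (one per line)
```

Calling `list(iter_paths(single_paths))` on the list from the question should give the `possible_paths` structure the question asks for.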


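I also considered using a stack, but then realized the list is already ordered, so you only need to compute the difference between levels. - Z Li
Yes, I just saw your answer, it's really elegant ;) - kosciej16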

2
You can use recursion:
from itertools import groupby
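def to_tree(d, j=[]):
    # Split d into alternating runs of level-0 entries and deeper entries.
    k = [(a, list(b)) for a, b in groupby(d, key=lambda x: not x[0])]
    for i in range(0, len(k), 2):
        for _, p in k[i][-1]:
            yield j + p
        if i + 1 < len(k):
            # The deeper run belongs to the last level-0 entry (p); shift it up one level.
            yield from to_tree([[a - 1, b] for a, b in k[i + 1][-1]], j + p)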

data = [[0, '1st Top Level Folder'], [1, '1st Child To 1st Top Level Folder'], [2, '1st Grandchild To 1st Child Folder'], [2, '2nd Grandchild To 1st Child Folder'], [1, '2nd Child To 1st Top Level Folder'], [2, '1st Grandchild To 2nd Child Folder'], [0, '2nd Top Level Folder'], [1, '1st Child To 2nd Top Level Folder'], [0, '3rd Top Level Folder']]
r = list(to_tree([[a, [b]] for a, b in data]))

Output:

[['1st Top Level Folder'], 
 ['1st Top Level Folder', '1st Child To 1st Top Level Folder'], 
 ['1st Top Level Folder', '1st Child To 1st Top Level Folder', '1st Grandchild To 1st Child Folder'], 
 ['1st Top Level Folder', '1st Child To 1st Top Level Folder', '2nd Grandchild To 1st Child Folder'], 
 ['1st Top Level Folder', '2nd Child To 1st Top Level Folder'], 
 ['1st Top Level Folder', '2nd Child To 1st Top Level Folder', '1st Grandchild To 2nd Child Folder'], 
 ['2nd Top Level Folder'], 
 ['2nd Top Level Folder', '1st Child To 2nd Top Level Folder'], 
 ['3rd Top Level Folder']]

Shorter solution (uses the := operator, so it needs Python 3.8+):

p = {}
r = [(p:={a:[b],'r':[b]})['r'] if not a else (p:={**p,a:(v:=(p[a-1]+[b])),'r':v})['r'] 
     for a, b in data]

Output:

[['1st Top Level Folder'], 
 ['1st Top Level Folder', '1st Child To 1st Top Level Folder'], 
 ['1st Top Level Folder', '1st Child To 1st Top Level Folder', '1st Grandchild To 1st Child Folder'], 
 ['1st Top Level Folder', '1st Child To 1st Top Level Folder', '2nd Grandchild To 1st Child Folder'], 
 ['1st Top Level Folder', '2nd Child To 1st Top Level Folder'], 
 ['1st Top Level Folder', '2nd Child To 1st Top Level Folder', '1st Grandchild To 2nd Child Folder'], 
 ['2nd Top Level Folder'], 
 ['2nd Top Level Folder', '1st Child To 2nd Top Level Folder'], 
 ['3rd Top Level Folder']]
