Python: convert a flat list of dicts into a hierarchical tree

3

I have been trying to convert the following list:

lst = [
    {"id": 0, "job": "CEO", "ManagerID": 0, "name": "John Smith"},
    {"id": 1, "job": "Medical Manager", "ManagerID": 0, "name": "Medic 1"},
    {"id": 2, "job": "Medical Assist", "ManagerID": 1, "name": "Medic 2"},
    {"id": 3, "job": "ICT Manager", "ManagerID": 0, "name": "ICT 1"},
    {"id": 4, "job": "ICT Assist", "ManagerID": 3, "name": "ICT 2"},
    {"id": 5, "job": "ICT Junior", "ManagerID": 4, "name": "ICT 3"}
]

into the following format:

output = [
    {"id": 0, "job": "CEO", "ManagerID": 0, "name": "John Smith", "children" : [
        { "id":1, "job": "Medical Manager", "name": "Medic 1", "children" : [
            {"id": 2, "job": "Medical Assist", "name": "Medic 2"}
            ]
        },
        {"id": 3, "job": "ICT Manager", "name": "ICT 1", "children":[
            {"id": 4, "job": "ICT Assist", "name": "ICT 2", "children" : [
                {"id": 5, "job": "ICT Junior", "name": "ICT 3"}
            ]}
        ]}
    ],
}]

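There is a single root node (ManagerID = 0), and all other nodes branch off from it.

I tried adapting code from another question, but I couldn't produce the required format.

I've been using the following code, but it still duplicates parent nodes.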

classes = [] #everyones id
for item in lst:
    name = item['id']
    if name not in classes:
        classes.append(name)

treenodes = {}
root_node = None

for item in lst: # Create  tree nodes
    item['children'] = []
    name = item['id']
    treenodes[name] = item
    parent = item['ManagerID']
    if parent not in classes: # parent is root node, create
        if parent not in treenodes:
            node = {}
            node['ManagerID'] = 0 #set manager to root
            node['children'] = []
            node['id'] = parent
            root_node = node
            treenodes[parent] = node

# Connect parents and children
for item in lst: # Create  tree nodes
    parent = item['ManagerID']
    parent_node = treenodes[parent]
    parent_node['children'].append(item)

output = treenodes

Any help is much appreciated.


The CEO gets to keep their "ManagerID" key in the output, but the other levels don't. Is there a specific reason for that? - Martijn Pieters
Related, possibly a duplicate: https://stackoverflow.com/questions/444296/how-to-efficiently-build-a-tree-from-a-flat-structure
2 Answers

7
Here is a recursive version for building the hierarchy.

Recursive version

from pprint import pprint


def to_lookup(employees):
    employee_lookup = dict()
    for employee in employees:
        if employee["id"] != employee["ManagerID"]:
            manager_id = employee["ManagerID"]
            children = employee_lookup.get(manager_id)
            if not children:
                children = employee_lookup[manager_id] = list()
            children.append(employee.copy())
        else:
            manager = employee.copy()
    return manager, employee_lookup


def build_hierarchy(manager, employee_lookup):
    employees = employee_lookup.get(manager["id"], list())
    for employee in employees:
        build_hierarchy(employee, employee_lookup)
    if employees:
        manager['children'] = employees
    return manager


employees = [
    {"id": 0, "job": "CEO", "ManagerID": 0, "name": "John Smith"},
    {"id": 1, "job": "Medical Manager", "ManagerID": 0, "name": "Medic 1"},
    {"id": 2, "job": "Medical Assist", "ManagerID": 1, "name": "Medic 2"},
    {"id": 3, "job": "ICT Manager", "ManagerID": 0, "name": "ICT 1"},
    {"id": 4, "job": "ICT Assist", "ManagerID": 3, "name": "ICT 2"},
    {"id": 5, "job": "ICT Junior", "ManagerID": 4, "name": "ICT 3"}
]

manager, employee_lookup = to_lookup(employees)
hierarchy = build_hierarchy(manager, employee_lookup)
pprint(hierarchy)

Output

{'ManagerID': 0,
 'children': [{'ManagerID': 0,
               'children': [{'ManagerID': 1,
                             'id': 2,
                             'job': 'Medical Assist',
                             'name': 'Medic 2'}],
               'id': 1,
               'job': 'Medical Manager',
               'name': 'Medic 1'},
              {'ManagerID': 0,
               'children': [{'ManagerID': 3,
                             'children': [{'ManagerID': 4,
                                           'id': 5,
                                           'job': 'ICT Junior',
                                           'name': 'ICT 3'}],
                             'id': 4,
                             'job': 'ICT Assist',
                             'name': 'ICT 2'}],
               'id': 3,
               'job': 'ICT Manager',
               'name': 'ICT 1'}],
 'id': 0,
 'job': 'CEO',
 'name': 'John Smith'}

Performance test

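# Reuses to_lookup and build_hierarchy from the recursive version above.
import datetime
import random

hierarchy_size = 2000000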

employees = [
    {"id": 0, "ManagerID": 0},
]
for idx in range(1, hierarchy_size):
    manager_id = random.randint(0, idx - 1)
    employees.append({"id": idx, "ManagerID": manager_id})

start = datetime.datetime.now()

manager, employee_lookup = to_lookup(employees)
hierarchy = build_hierarchy(manager, employee_lookup)

print(datetime.datetime.now() - start)
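
The recursive build_hierarchy uses one Python stack frame per level of the tree, so a very deep reporting chain could run into the interpreter's recursion limit. A minimal non-recursive sketch follows, assuming every ManagerID refers to an id that is present in the list and that the root is the single self-managed entry. The name build_hierarchy_iterative is hypothetical and not part of the answer above:

```python
def build_hierarchy_iterative(employees):
    """Build the tree without recursion; the root is the self-managed entry."""
    # Copy each record so the dicts in the input list are left untouched.
    nodes = {emp["id"]: dict(emp) for emp in employees}
    root = None
    for node in nodes.values():
        if node["id"] == node["ManagerID"]:
            root = node  # the self-referencing entry is the root
        else:
            # Attach to the manager; "children" is created only when needed,
            # so leaf entries end up without a "children" key.
            nodes[node["ManagerID"]].setdefault("children", []).append(node)
    return root


employees = [
    {"id": 0, "job": "CEO", "ManagerID": 0, "name": "John Smith"},
    {"id": 1, "job": "Medical Manager", "ManagerID": 0, "name": "Medic 1"},
    {"id": 2, "job": "Medical Assist", "ManagerID": 1, "name": "Medic 2"},
    {"id": 3, "job": "ICT Manager", "ManagerID": 0, "name": "ICT 1"},
    {"id": 4, "job": "ICT Assist", "ManagerID": 3, "name": "ICT 2"},
    {"id": 5, "job": "ICT Junior", "ManagerID": 4, "name": "ICT 3"},
]
hierarchy = build_hierarchy_iterative(employees)
```

Because every node is looked up by id in a single dict, each record is visited once regardless of how deep the hierarchy is.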

Thanks, Andre, for trying to solve the problem. I've accepted @Martjin's answer as the solution because he helped me understand my results. - Dan Walters

1
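Your code actually works, but you need to take the treenodes[0] entry (the CEO). The remaining key-value pairs in treenodes are only bookkeeping, so that the manager of a given employee entry is easy to find.
If the root node isn't guaranteed to have ID 0, you can use the fact that the CEO is marked as managing themselves: the root node is the one whose manager ID points at its own ID. More commonly, the root node has no parent ID at all.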
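
Both root conventions can be handled by one small lookup. This is only a sketch: find_root is a hypothetical helper, and representing "no parent ID" as a missing or None ManagerID is an assumption, since the question's data only uses the self-reference convention:

```python
def find_root(records):
    """Return the single root record; raise ValueError if there is not exactly one."""
    roots = [
        rec for rec in records
        # Convention 1: the root is its own manager.
        # Convention 2 (assumed): the root has no manager at all (missing or None).
        if rec.get("ManagerID") is None or rec.get("ManagerID") == rec["id"]
    ]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {len(roots)}")
    return roots[0]


self_managed = [{"id": 7, "ManagerID": 7}, {"id": 8, "ManagerID": 7}]
no_parent = [{"id": 7, "ManagerID": None}, {"id": 8, "ManagerID": 7}]
root_a = find_root(self_managed)
root_b = find_root(no_parent)
```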
You also add the CEO to their own children list (the CEO's manager ID is their own ID), so your tree contains a recursive reference.
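
The self-reference is easy to see on a two-entry input. The snippet below is a reduced sketch of the question's linking step, not the full original code:

```python
lst = [
    {"id": 0, "ManagerID": 0},
    {"id": 1, "ManagerID": 0},
]

treenodes = {}
for item in lst:
    item["children"] = []
    treenodes[item["id"]] = item

# Same linking step as in the question: every item, including the CEO,
# is appended to the children list of its manager.
for item in lst:
    treenodes[item["ManagerID"]]["children"].append(item)

ceo = treenodes[0]
is_own_child = ceo["children"][0] is ceo
print(is_own_child)
```

Skipping the append when an entry's id equals its ManagerID is enough to avoid the cycle.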
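The code you found is not the clearest or most efficient. I'd build a dictionary mapping each id to a copy of the object (so the dictionaries in your original lst are left unchanged), then loop over that structure and add each entry to its manager's entry. I use the "root node references itself" rule (the manager ID equals the employee's own ID):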
employees = {}
managers = set()
root_id = None
for emp in lst:
    id, mid = emp['id'], emp['ManagerID']
    # create a copy of emp, and add a "children" list
    employees[id] = {**emp, 'children': []}
    managers.add(mid)
    if id == mid:
        # the root of the tree references itself as the manager
        root_id = id

# add empty manager entries for missing manager IDs, reporting to root ID.
for id in managers - employees.keys():
    employees[id] = {
        'id': id, 'ManagerID': root_id, 'children': [],
        'job': None, 'name': None
    }

for id, emp in employees.items():
    manager = employees[emp.pop('ManagerID')]
    if id != root_id:  # don't add the root to anything
        manager['children'].append(emp)

output = employees[root_id]

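The code above uses a set to track the manager IDs that have been seen, so you can easily add entries for missing managers (reporting to the CEO in this case).
For your input, it produces: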
{'id': 0, 'job': 'CEO', 'name': 'John Smith', 'children':
    [{'id': 1, 'job': 'Medical Manager', 'name': 'Medic 1', 'children':
        [{'id': 2, 'job': 'Medical Assist', 'name': 'Medic 2', 'children': []}],
     },
     {'id': 3, 'job': 'ICT Manager', 'name': 'ICT 1', 'children':
        [{'id': 4, 'job': 'ICT Assist', 'name': 'ICT 2', 'children':
            [{'id': 5, 'job': 'ICT Junior', 'name': 'ICT 3', 'children': []}]
         }]
     }]
}
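
The result above keeps an empty children list on leaf entries, whereas the format requested in the question omits the key for leaves. If that difference matters, one possible post-processing step is sketched here; drop_empty_children is a hypothetical helper and assumes the tree contains no cycles:

```python
def drop_empty_children(node):
    """Remove empty "children" lists in place, walking the tree with a stack."""
    stack = [node]
    while stack:
        current = stack.pop()
        children = current.get("children")
        if children:
            stack.extend(children)  # visit every child later
        else:
            current.pop("children", None)  # leaf: drop the empty list
    return node


tree = {"id": 0, "children": [
    {"id": 1, "children": [{"id": 2, "children": []}]},
    {"id": 3, "children": []},
]}
drop_empty_children(tree)
```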

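Thank you very much for explaining the treenodes in my original code! My data source isn't as tidy as the example I gave, but I found the complete nested structure at position [62] (out of [2000] positions). - Dan Walters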
