如何找到联通组件?

16

我正在为类 Graph 编写一个名为 get_connected_components 的函数:

def get_connected_components(self):
    path=[]
    for i in self.graph.keys():
        q=self.graph[i]
        while q:
            print(q)
            v=q.pop(0)
            if not v in path:
                path=path+[v]
    return path

我的图表是:

{0: [(0, 1), (0, 2), (0, 3)], 1: [], 2: [(2, 1)], 3: [(3, 4), (3, 5)], \
4: [(4, 3), (4, 5)], 5: [(5, 3), (5, 4), (5, 7)], 6: [(6, 8)], 7: [], \
8: [(8, 9)], 9: []}

其中键是节点,而值则是边。我的函数给出了这个联通分量:

[(0, 1), (0, 2), (0, 3), (2, 1), (3, 4), (3, 5), (4, 3), (4, 5), (5, 3), \
(5, 4), (5, 7), (6, 8), (8, 9)]

但我会有两个不同的连通分量,例如:

[[(0, 1), (0, 2), (0, 3), (2, 1), (3, 4), (3, 5), (4, 3), (4, 5), \
(5, 3), (5, 4), (5, 7)],[(6, 8), (8, 9)]]

我不理解我哪里犯了错误。 有人能帮我吗?


3
请注意,您的表述包含冗余信息,例如在“3:[(3,4),(3,5)]”中。我们已经知道该边是从3开始的! - Quentin Pradet
你建议我更改字典中的值,仅放置连接的节点而不是边缘吗? - fege
3
你创建自己的图表有原因吗?强大的networkx库内置了连接组件算法。 - jterrace
3
是的,我希望创建自己的图表来提高我的Python编程技能。 - fege
1
作为这篇帖子和许多其他帖子的读者,我总是欢迎新的实现,即使很简单,也不要参考现有的包。 - Developer
显示剩余3条评论
4个回答

26

我喜欢这个算法:

def connected_components(neighbors):
    seen = set()
    def component(node):
        nodes = set([node])
        while nodes:
            node = nodes.pop()
            seen.add(node)
            nodes |= neighbors[node] - seen
            yield node
    for node in neighbors:
        if node not in seen:
            yield component(node)

它不仅简短而优雅,而且速度快。像这样使用它(Python 2.7):

old_graph = {
    0: [(0, 1), (0, 2), (0, 3)],
    1: [],
    2: [(2, 1)],
    3: [(3, 4), (3, 5)],
    4: [(4, 3), (4, 5)],
    5: [(5, 3), (5, 4), (5, 7)],
    6: [(6, 8)],
    7: [],
    8: [(8, 9)],
    9: []}

edges = {v for k, vs in old_graph.items() for v in vs}
graph = defaultdict(set)

for v1, v2 in edges:
    graph[v1].add(v2)
    graph[v2].add(v1)

components = []
for component in connected_components(graph):
    c = set(component)
    components.append([edge for edges in old_graph.values()
                            for edge in edges
                            if c.intersection(edge)])

print(components)

结果为:

[[(0, 1), (0, 2), (0, 3), (2, 1), (3, 4), (3, 5), (4, 3), (4, 5), (5, 3), (5, 4), (5, 7)],
 [(6, 8), (8, 9)]]

感谢aparpara发现了这个bug。


1
严格来说,它是不正确的。例如对于{1: {2}, 2: set(), 3: {2}},它给出了2个组件,而被接受的答案正确地给出了1个。但是,在应用算法之前,可以通过使图成为双向的来轻松解决这个问题。 - aparpara
@aparpara:已修复。 - pillmuncher

5

让我们简化图形表示:

myGraph = {0: [1,2,3], 1: [], 2: [1], 3: [4,5],4: [3,5], 5: [3,4,7], 6: [8], 7: [],8: [9], 9: []}

这里我们有一个返回字典的函数,其中键是根节点,值是连接组件:

def getRoots(aNeigh):
    def findRoot(aNode,aRoot):
        while aNode != aRoot[aNode][0]:
            aNode = aRoot[aNode][0]
        return (aNode,aRoot[aNode][1])
    myRoot = {} 
    for myNode in aNeigh.keys():
        myRoot[myNode] = (myNode,0)  
    for myI in aNeigh: 
        for myJ in aNeigh[myI]: 
            (myRoot_myI,myDepthMyI) = findRoot(myI,myRoot) 
            (myRoot_myJ,myDepthMyJ) = findRoot(myJ,myRoot) 
            if myRoot_myI != myRoot_myJ: 
                myMin = myRoot_myI
                myMax = myRoot_myJ 
                if  myDepthMyI > myDepthMyJ: 
                    myMin = myRoot_myJ
                    myMax = myRoot_myI
                myRoot[myMax] = (myMax,max(myRoot[myMin][1]+1,myRoot[myMax][1]))
                myRoot[myMin] = (myRoot[myMax][0],-1) 
    myToRet = {}
    for myI in aNeigh: 
        if myRoot[myI][0] == myI:
            myToRet[myI] = []
    for myI in aNeigh: 
        myToRet[findRoot(myI,myRoot)[0]].append(myI) 
    return myToRet  

让我们试试吧:

print getRoots(myGraph)

{8: [6, 8, 9], 1: [0, 1, 2, 3, 4, 5, 7]}


5

前面的回答很好。不过,我花了一点时间才明白发生了什么。因此,我重构了代码,以便更容易理解。我把代码留在这里,以防有人也觉得更容易(它在Python 3.6中运行)。

def get_all_connected_groups(graph):
    already_seen = set()
    result = []
    for node in graph:
        if node not in already_seen:
            connected_group, already_seen = get_connected_group(node, already_seen)
            result.append(connected_group)
    return result


def get_connected_group(node, already_seen):
        result = []
        nodes = set([node])
        while nodes:
            node = nodes.pop()
            already_seen.add(node)
            nodes = nodes or graph[node] - already_seen
            result.append(node)
        return result, already_seen


graph = {
     0: {0, 1, 2, 3},
     1: set(),
     2: {1, 2},
     3: {3, 4, 5},
     4: {3, 4, 5},
     5: {3, 4, 5, 7},
     6: {6, 8},
     7: set(),
     8: {8, 9},
     9: set()}

components = get_all_connected_groups(graph)
print(components)

结果:

Out[0]: [[0, 1, 2, 3, 4, 5, 7], [6, 8, 9]] 

此外,我简化了输入和输出。我认为打印出所有在一组中的节点更加清晰易懂。

3
“nodes = nodes or graph[node] - already_seen” 应改为“nodes.update(graph[node] - already_seen)”。 - pandasEverywhere
如果 graph[5] = {3, 4, 5, 8},那么我们不应该得到一个连通分量吗? - Quetzalcoatl
1
我将“nodes = nodes or graph[node] - already_seen”这行代码中的错误更正为“nodes.update(n for n in graph[node] if n not in already_seen)”。 - Rahul

1
如果您使用邻接表来表示图形,则可以使用此生成器函数(实现BFS)获取所有连接组件:
from collections import deque

def connected_components(graph):
    seen = set()

    for root in range(len(graph)):
        if root not in seen:
            seen.add(root)
            component = []
            queue = deque([root])

            while queue:
                node = queue.popleft()
                component.append(node)
                for neighbor in graph[node]:
                    if neighbor not in seen:
                        seen.add(neighbor)
                        queue.append(neighbor)
            yield component

演示:

graph = [
    [1, 2, 3],  # neighbors of node "0"
    [0, 2],     # neighbors of node "1"
    [0, 1],     # ...
    [0, 4, 5],
    [3, 5],
    [3, 4, 7],
    [8],
    [5],
    [9, 6],
    [8]
]

print(list(connected_components(graph)))  # [[0, 1, 2, 3, 4, 5, 7], [6, 8, 9]]

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接