如何高效枚举父图的所有子图?在我的特定情况下,父图是分子图,因此它将是连通的,并且通常包含少于100个顶点。
编辑:我只对连通的子图感兴趣。
如何高效枚举父图的所有子图?在我的特定情况下,父图是分子图,因此它将是连通的,并且通常包含少于100个顶点。
编辑:我只对连通的子图感兴趣。
GenerateConnectedSubgraphs(verticesNotYetConsidered, subsetSoFar, neighbors):
if subsetSoFar is empty:
let candidates = verticesNotYetConsidered
else
let candidates = verticesNotYetConsidered intersect neighbors
if candidates is empty:
yield subsetSoFar
else:
choose a vertex v from candidates
GenerateConnectedSubgraphs(verticesNotYetConsidered - {v},
subsetSoFar,
neighbors)
GenerateConnectedSubgraphs(verticesNotYetConsidered - {v},
subsetSoFar union {v},
neighbors union N(v))
class Graph(object):
def __init__(self, vertices):
self.vertices = frozenset(vertices)
# add edge logic here and to methods, etc. etc.
def subgraphs(self):
cache = set()
def helper(graph):
yield graph
for element in graph:
if {{REMOVING ELEMENT WOULD DISCONNECT GRAPH}}:
# you fill in above function; easy if
# there is 0 or 1 ring in molecule
# (keep track if molecule has ring, e.g.
# self.numRings, maybe even more data)
# if you know there are 0 rings the operation
# takes O(1) time
continue
subgraph = Graph(graph.vertices-{element})
if not subgraph in cache:
cache.add(subgraph)
for s in helper(subgraph):
yield s
for graph in helper(self):
yield graph
def __eq__(self, other):
return self.vertices == other.vertices
def __hash__(self):
return hash(self.vertices)
def __iter__(self):
return iter(self.vertices)
def __repr__(self):
return 'Graph(%s)' % repr(set(self.vertices))
G = Graph({1,2,3,4,5})
for subgraph in G.subgraphs():
print(subgraph)
结果:
Graph({1, 2, 3, 4, 5})
Graph({2, 3, 4, 5})
Graph({3, 4, 5})
Graph({4, 5})
Graph({5})
Graph(set())
Graph({4})
Graph({3, 5})
Graph({3})
Graph({3, 4})
Graph({2, 4, 5})
Graph({2, 5})
Graph({2})
Graph({2, 4})
Graph({2, 3, 5})
Graph({2, 3})
Graph({2, 3, 4})
Graph({1, 3, 4, 5})
Graph({1, 4, 5})
Graph({1, 5})
Graph({1})
Graph({1, 4})
Graph({1, 3, 5})
Graph({1, 3})
Graph({1, 3, 4})
Graph({1, 2, 4, 5})
Graph({1, 2, 5})
Graph({1, 2})
Graph({1, 2, 4})
Graph({1, 2, 3, 5})
Graph({1, 2, 3})
Graph({1, 2, 3, 4})
有一个名为gspan [1]的算法,它已被用于计算频繁子图,也可以用于枚举所有子图。您可以在这里找到它的实现[2]。
思路如下:图形由所谓的DFS代码表示。 DFS代码对应于图G上的深度优先搜索,并具有以下形式的条目 (i,j,l(v_i),l(v_i,v_j),l(v_j)),对于图的每个边缘(v_i,v_j),其中顶点下标对应于DFS发现顶点的顺序。可以在所有DFS代码集合上定义总序(如[1]中所做的那样),并且因此通过计算表示此图的所有DFS代码的最小值来获得给定图的规范图标签。这意味着如果两个图具有相同的最小DFS代码,则它们是同构的。现在,从长度为1的所有可能DFS代码开始(每个边缘一个),可以通过逐步添加一个边缘到代码中来枚举图形的所有子图,从而产生一个枚举树,其中每个节点对应于一个图形。如果枚举操作小心地完成(即与DFS代码的顺序兼容),则首先遇到最小的DFS代码。因此,每当遇到不是最小的DFS代码时,可以修剪其相应的子树。有关详细信息,请参阅[1]。[1] https://sites.cs.ucsb.edu/~xyan/papers/gSpan.pdf [2] http://www.nowozin.net/sebastian/gboost/
[1] https://sites.cs.ucsb.edu/~xyan/papers/gSpan.pdf [2] http://www.nowozin.net/sebastian/gboost/