Python中用于从A到B路径的最小割点算法

Question

Python中用于从A到B路径的最小割点算法

pythonalgorithmgraph-algorithmmathematical-optimizationpython-multithreading

4

我正在尝试解决与图论相关的问题，但似乎无法记住/找到/理解正确/最佳方法，因此我想向专家们请教...

我有一组来自两个节点（示例代码中的1和10）的路径列表。我试图找到去除节点的最小数量以切断所有路径。我也只能删除某些节点。

我目前已经实现了它（如下所示）作为一种暴力搜索。在我的测试集上，这很好用，但是在将规模扩大到具有100K路径和100可用节点的图形时，这将成为一个问题（阶乘问题）。目前，我不关心删除节点的顺序，但我会在某些时候考虑到这一点（在下面的代码中将集合切换为列表）。

我相信应该有一种使用最大流/最小割算法来解决这个问题的方法。但我读到的所有内容都以某种方式超出了我的头脑。这类工作已经过了几年（SEVERAL），我似乎什么都想不起来了。

所以我的问题是：

1）是否有更好的方法来解决这个问题，而不是测试所有组合并取最小集？

2）如果有的话，可以解释一下还是最好给出伪代码来帮助解释？我猜可能有一个库已经以某种方式执行了这项工作（我最近一直在寻找和使用networkX，但也可选其他库）

3）如果没有（或者即使是），有多线程/进程解决方案的建议吗？我想尽可能地从计算机中获得每一点性能。（我已经找到了一些好的线程关于这个问题，只是还没有时间实现，所以我想在同一时间问一下。我首先想让所有东西都正常工作，然后再进行优化。）

4）使代码更符合“Pythonic”的一般建议（可能还有助于性能）。我知道可以做出改进，并且对Python仍然很陌生。

谢谢帮助。

#!/usr/bin/env python


def bruteForcePaths(paths, availableNodes, setsTested, testCombination, results, loopId):

    #for each node available, we are going to
    # check if we have already tested set with node
    # if true- move to next node
    # if false- remove the paths effected,
    #           if there are paths left,
    #               record combo, continue removing with current combo,
    #           if there are no paths left,
    #               record success, record combo, continue to next node

    #local copy
    currentPaths = list(paths)
    currentAvailableNodes = list(availableNodes)
    currentSetsTested = set(setsTested)
    currentTestCombination= set(testCombination)

    currentLoopId = loopId+1

    print "loop ID: %d" %(currentLoopId)
    print "currentAvailableNodes:"
    for set1 in currentAvailableNodes:
        print "  %s" %(set1)

    for node in currentAvailableNodes:
        #add to the current test set
        print "%d-current node: %s current combo: %s" % (currentLoopId, node, currentTestCombination)
        currentTestCombination.add(node)
        # print "Testing: %s" % currentTestCombination
        # print "Sets tested:"
        # for set1 in currentSetsTested:
        #     print "  %s" % set1

        if currentTestCombination in currentSetsTested:
            #we already tested this combination of nodes so go to next node
            print "Already test: %s" % currentTestCombination
            currentTestCombination.remove(node)
            continue

        #get all the paths that don't have node in it
        currentRemainingPaths = [path for path in currentPaths if not (node in path)]

        #if there are no paths left
        if len(currentRemainingPaths) == 0:
            #save this combination
            print "successful combination: %s" % currentTestCombination
            results.append(frozenset(currentTestCombination))
            #add to remember we tested combo
            currentSetsTested.add(frozenset(currentTestCombination))
            #now remove the node that was add, and go to the next one
            currentTestCombination.remove(node)
        else:
            #this combo didn't work, save it so we don't test it again
            currentSetsTested.add(frozenset(currentTestCombination))
            newAvailableNodes = list(currentAvailableNodes)
            newAvailableNodes.remove(node)
            bruteForcePaths(currentRemainingPaths,
                            newAvailableNodes,
                            currentSetsTested,
                            currentTestCombination,
                            results,
                            currentLoopId)

            currentTestCombination.remove(node)

    print "-------------------"
    #need to pass "up" the tested sets from this loop
    setsTested.update(currentSetsTested)

    return None

if __name__ == '__main__':

    testPaths = [
        [1,2,14,15,16,18,9,10],
        [1,2,24,25,26,28,9,10],
        [1,2,34,35,36,38,9,10],
        [1,3,44,45,46,48,9,10],
        [1,3,54,55,56,58,9,10],
        [1,3,64,65,66,68,9,10],

        [1,2,14,15,16,7,10],
        [1,2,24,7,10],
        [1,3,34,35,7,10],

        [1,3,44,35,6,10],
        ]


    setsTested = set()
    availableNodes = [2, 3, 6, 7, 9]
    results = list()
    currentTestCombination = set()

    bruteForcePaths(testPaths, availableNodes, setsTested, currentTestCombination, results, 0)

    print "results:"
    for result in sorted(results, key=len):
        print result

更新：我使用itertools重构了代码来生成组合，这使得代码更清晰、更快(应该更容易进行多进程处理)。现在尝试实现建议中的主导节点并多进程化函数。

def bruteForcePaths3(paths, availableNodes, results):

    #start by taking each combination 2 at a time, then 3, etc
    for i in range(1,len(availableNodes)+1):
        print "combo number: %d" % i

        currentCombos = combinations(availableNodes, i)

        for combo in currentCombos:
            #get a fresh copy of paths for this combiniation
            currentPaths = list(paths)
            currentRemainingPaths = []
            # print combo

            for node in combo:
                #determine better way to remove nodes, for now- if it's in, we remove
                currentRemainingPaths = [path for path in currentPaths if not (node in path)]
                currentPaths = currentRemainingPaths

            #if there are no paths left
            if len(currentRemainingPaths) == 0:
                #save this combination
                print combo
                results.append(frozenset(combo))

    return None

- user2084330

3个回答

0

如果问题中没有提供路径，我同意通过http://en.wikipedia.org/wiki/Max-flow_min-cut_theorem以及足够聪明的网络构建来解决这个问题。然而，因为您没有给出什么是合理路径和不合理路径的提示，我担心一个恶意对手可能会找到一些奇怪的路径集合，这些路径并不来自于任何可能的网络。

最坏情况下，这可能会使你的问题像http://en.wikipedia.org/wiki/Set_cover_problem一样困难，也就是说，有人在Set Cover问题中找到一组路径和节点，可以生产一个路径割问题，其解决方案可以转化为原始的Set Cover问题。

如果是这样 - 我甚至没有尝试证明 - 那么你的问题就是NP-Complete，但由于你只有100个节点，因此可能会在Set Cover的许多论文中找到一些方法，这些方法在实践中可以工作，或者可以为您提供足够好的近似值。除了维基百科文章外，http://www.cs.sunysb.edu/~algorith/files/set-cover.shtml 还指向两个实现，并且快速搜索在http://www.ise.ufl.edu/glan/files/2011/12/EJORpaper.pdf 的一篇论文开头找到以下摘要：

SCP是强NP难题（Garey和Johnson，1979），已经开发了许多算法来解决SCP。精确算法（Fisher和Kedia，1990；Beasley和JØrnsten，1992；Balas和Carrera，1996）大多基于分支定界和分支定限。Caprara等人（2000）比较了不同的SCP精确算法。他们表明，SCP的最佳精确算法是CPLEX。由于精确方法需要大量计算工作来解决大规模SCP实例，因此通常使用启发式算法在合理的时间内找到良好或接近最优解。贪心算法可能是快速解决大型组合问题的最自然的启发式方法。至于SCP，最简单的方法是Chvatal（1979）的贪心算法。尽管简单、快速且易于编码，但贪心算法很少能生成高质量的解决方案...

- mcdowella

我只添加了路径部分，因为我认为这可以帮助聚焦问题。我有一个完整的图形（类似于 Twitter 或 Facebook 的典型社交网络图形）。这会改变你的答案吗？ - user2084330

它改变了我的答案，以至于我将其作为单独的答案提交。 - mcdowella

0

编辑：如果你想要删除实际上的所有路径，而不是给定列表中的路径，则像mcdowella所解释的最大流技术比这种方法更好。

正如mcdowella所提到的，一般情况下，该问题是NP难的。然而，从你的例子看来，一个精确的方法可能是可行的。

首先，你可以删除所有不可删除的路径上的顶点。然后通过消除优势顶点来减小实例。例如，每条包含15的路径也包含2，因此删除15永远没有意义。在这个例子中，如果所有顶点都可用，那么2、3、9和35支配所有其他顶点，所以问题只剩下4个顶点。

然后从最短路径中取一个顶点，递归地分为两种情况：删除它（删除包含它的所有路径）或者不删除它（从所有路径中删除它）。（如果该路径长度为1，则省略第二种情况。）然后再次检查优势性。

这在最坏情况下呈指数级增长，但对于你的例子可能已经足够了。

- Falk Hüffner

谢谢您的建议。我会考虑进行这些更改。我刚刚重新设计了该函数，使其不再是递归的。这样可以让代码更加简洁和快速。我很快就会发布新版本。 - user2084330

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- mcdowella · Accepted Answer

这是一个忽略路径列表的答案。它只需要一个网络、一个源节点和一个目标节点，找到网络中不是源或目标的最小节点集，以便删除这些节点会将源与目标断开连接。如果想要找出最小的边集，可以通过搜索最大流最小割来实现。请注意，在维基百科文章http://en.wikipedia.org/wiki/Max-flow_min-cut_theorem#Generalized_max-flow_min-cut_theorem 中提到存在广义最大流最小割定理，考虑了顶点容量和边容量，这至少是令人鼓舞的。此外，请注意，边容量以Cuv给出，其中Cuv是从u到v的最大容量。在图中，它们似乎被画成u/v。因此，正向方向的边容量可能不同于反向方向的边容量。

为了把最小顶点割问题伪装成最小边割问题，我建议利用这种不对称性。首先给所有现有的边一个巨大的容量——例如图中节点数的100倍。现在将每个顶点X替换为两个顶点Xi和Xo，我称之为入射和出射顶点。对于每个X和Y之间的边，在Xo和Yi之间创建一条边，向前具有现有容量，向后具有0容量——这些是单向边。现在为每个X创建一条Xi和Xo之间的边，向前容量为1，向后容量为0。

现在在结果图上运行最大流最小割算法。由于所有原始链接都具有巨大的容量，所以最小割必须由容量为1的链接组成（实际上，最小割被定义为将节点集分成两部分：你真正想要的是Xi、Xo节点对的集合，其中Xi在一半，Xo在另一半，但你可以很容易地从另一个得到一个）。如果您断开这些链接，就像标准的最大流最小割一样，您会将图形断开成两个部分，因此删除这些节点将使源与目标断开连接。由于您有最小割，这是最小的这样一组节点。

如果您能找到最大流最小割的代码，比如http://www.cs.sunysb.edu/~algorith/files/network-flow.shtml所指向的代码，我希望它能给出最小割。如果没有，例如，如果您通过解决线性规划问题来完成它，因为您恰好有一个线性规划求解器可用，那么请注意，从http://www.cse.yorku.ca/~aaw/Wang/MaxFlowMinCutAlg.html可以看出，当图被修改以减去实际使用的边容量时，最小割的一半是源可达的节点集——因此，只要给出最大流使用的边容量，就可以很容易地找到它。