如何快速计算一组集合的所有交集的包含顺序

3
这是如何快速在Python中获取所有集合的交点的后续内容:
我有一个整数有限集合Ai的有限集合A = {A1,...,Ak},我想在Python中计算以下内容:
  1. 所有子集A的交集:F = {B的交集:B是A的子集}。这是上述问题的一个相当快速的解决方案。

  2. a. 所有对(X,Y)的组合,其中X,Y是F中的集合且X是Y的子集。

    b. 所有对(X,Y)的组合,其中X,Y是F中的集合且X是Y的子集,并且没有集合Z在F中满足X子集Z子集Y。换句话说,X和Y之间不存在包含顺序中的任何集合Z。这样的一对(X,Y)被称为覆盖

为什么我想这样做?-- 我想计算10^7个多面体的面格。在我的情景中,上述集合A包含600个集合。它确实是著名的600细胞,目前的计算需要大约6秒钟,如果可能的话,我希望将其降低10倍。

获取2.a.所需的6秒钟只需执行以下操作:

# this is John Coleman's function from above question's answer
def allIntersections(frozenSets):
    universalSet = frozenset.union(*frozenSets)
    intersections = set([universalSet])
    for s in frozenSets:
        moreIntersections = set(s & t for t in intersections)
        intersections.update(moreIntersections)
    return intersections

def all_intersections(lists):
    sets = allIntersections([frozenset(s) for s in lists])
    return [list(s) for s in sets]


A = [[19, 40, 41, 48], [19, 44, 45, 49], [23, 42, 43, 50], [23, 46, 47, 51], [19, 40, 41, 52], [19, 44, 45, 53], [23, 42, 43, 54], [23, 46, 47, 55], [2, 25, 36, 56], [0, 24, 32, 56], [24, 25, 56, 57], [24, 32, 56, 57], [16, 32, 56, 57], [1, 24, 32, 57], [25, 36, 56, 57], [16, 36, 56, 57], [3, 25, 36, 57], [8, 28, 34, 58], [10, 29, 38, 58], [28, 29, 58, 59], [28, 34, 58, 59], [20, 34, 58, 59], [29, 38, 58, 59], [20, 38, 58, 59], [9, 28, 34, 59], [11, 29, 38, 59], [6, 27, 37, 60], [4, 26, 33, 60], [5, 26, 33, 61], [26, 27, 60, 61], [26, 33, 60, 61], [16, 33, 60, 61], [27, 37, 60, 61], [7, 27, 37, 61], [16, 37, 60, 61], [12, 30, 35, 62], [14, 31, 39, 62], [30, 35, 62, 63], [20, 39, 62, 63], [20, 35, 62, 63], [30, 31, 62, 63], [31, 39, 62, 63], [15, 31, 39, 63], [13, 30, 35, 63], [0, 24, 32, 64], [1, 24, 32, 64], [8, 28, 34, 65], [9, 28, 34, 65], [3, 25, 36, 66], [2, 25, 36, 66], [11, 29, 38, 67], [10, 29, 38, 67], [4, 26, 33, 68], [5, 26, 33, 68], [12, 30, 35, 69], [13, 30, 35, 69], [6, 27, 37, 70], [7, 27, 37, 70], [15, 31, 39, 71], [14, 31, 39, 71], [4, 33, 68, 72], [0, 32, 64, 72], [18, 64, 72, 73], [32, 64, 72, 73], [32, 33, 72, 73], [1, 32, 64, 73], [18, 68, 72, 73], [5, 33, 68, 73], [33, 68, 72, 73], [2, 36, 66, 74], [6, 37, 70, 74], [3, 36, 66, 75], [7, 37, 70, 75], [36, 66, 74, 75], [37, 70, 74, 75], [36, 37, 74, 75], [22, 66, 74, 75], [22, 70, 74, 75], [12, 35, 69, 76], [8, 34, 65, 76], [18, 65, 76, 77], [34, 65, 76, 77], [34, 35, 76, 77], [18, 69, 76, 77], [35, 69, 76, 77], [13, 35, 69, 77], [9, 34, 65, 77], [10, 38, 67, 78], [14, 39, 71, 78], [38, 67, 78, 79], [22, 71, 78, 79], [22, 67, 78, 79], [38, 39, 78, 79], [39, 71, 78, 79], [15, 39, 71, 79], [11, 38, 67, 79], [0, 40, 48, 80], [19, 40, 48, 80], [19, 48, 49, 80], [8, 44, 49, 80], [19, 44, 49, 80], [2, 40, 52, 81], [10, 44, 53, 81], [19, 52, 53, 81], [19, 40, 52, 81], [19, 44, 53, 81], [19, 40, 80, 81], [19, 44, 80, 81], [23, 42, 50, 82], [23, 50, 51, 82], [1, 42, 50, 82], [23, 46, 51, 82], [9, 46, 51, 82], [23, 54, 55, 83], [3, 42, 54, 83], [23, 42, 54, 83], [23, 42, 82, 83], [11, 46, 55, 83], [23, 46, 55, 83], [23, 46, 82, 83], [19, 45, 49, 84], [12, 45, 49, 84], [4, 41, 48, 84], [19, 41, 48, 84], [19, 48, 49, 84], [19, 45, 84, 85], [19, 41, 84, 85], [6, 41, 52, 85], [19, 41, 52, 85], [14, 45, 53, 85], [19, 45, 53, 85], [19, 52, 53, 85], [23, 43, 50, 86], [5, 43, 50, 86], [23, 50, 51, 86], [23, 47, 51, 86], [13, 47, 51, 86], [7, 43, 54, 87], [23, 43, 54, 87], [23, 43, 86, 87], [23, 54, 55, 87], [23, 47, 86, 87], [15, 47, 55, 87], [23, 47, 55, 87], [8, 28, 65, 88], [0, 24, 64, 88], [9, 28, 65, 89], [28, 65, 88, 89], [17, 28, 88, 89], [17, 24, 88, 89], [1, 24, 64, 89], [24, 64, 88, 89], [64, 65, 88, 89], [4, 26, 68, 90], [12, 30, 69, 90], [5, 26, 68, 91], [13, 30, 69, 91], [26, 68, 90, 91], [21, 26, 90, 91], [68, 69, 90, 91], [30, 69, 90, 91], [21, 30, 90, 91], [10, 29, 67, 92], [2, 25, 66, 92], [29, 67, 92, 93], [66, 67, 92, 93], [11, 29, 67, 93], [17, 29, 92, 93], [25, 66, 92, 93], [17, 25, 92, 93], [3, 25, 66, 93], [14, 31, 71, 94], [6, 27, 70, 94], [21, 31, 94, 95], [21, 27, 94, 95], [15, 31, 71, 95], [31, 71, 94, 95], [70, 71, 94, 95], [27, 70, 94, 95], [7, 27, 70, 95], [2, 25, 56, 96], [0, 80, 88, 96], [0, 40, 56, 96], [2, 40, 81, 96], [2, 40, 56, 96], [0, 40, 80, 96], [40, 80, 81, 96], [2, 81, 92, 96], [17, 25, 92, 96], [2, 25, 92, 96], [0, 24, 88, 96], [0, 24, 56, 96], [24, 25, 56, 96], [17, 24, 88, 96], [17, 24, 25, 96], [28, 29, 58, 97], [80, 88, 96, 97], [80, 81, 96, 97], [44, 80, 81, 97], [8, 28, 88, 97], [8, 28, 58, 97], [8, 44, 58, 97], [8, 80, 88, 97], [8, 44, 80, 97], [81, 92, 96, 97], [17, 29, 92, 97], [17, 92, 96, 97], [17, 28, 29, 97], [17, 28, 88, 97], [17, 88, 96, 97], [10, 29, 92, 97], [10, 29, 58, 97], [10, 44, 58, 97], [10, 44, 81, 97], [10, 81, 92, 97], [6, 41, 85, 98], [6, 41, 60, 98], [4, 41, 60, 98], [6, 85, 94, 98], [4, 41, 84, 98], [4, 84, 90, 98], [41, 84, 85, 98], [6, 27, 94, 98], [6, 27, 60, 98], [26, 27, 60, 98], [4, 26, 90, 98], [4, 26, 60, 98], [21, 27, 94, 98], [21, 26, 90, 98], [21, 26, 27, 98], [14, 45, 85, 99], [21, 30, 31, 99], [14, 31, 62, 99], [30, 31, 62, 99], [14, 45, 62, 99], [21, 90, 98, 99], [21, 30, 90, 99], [84, 90, 98, 99], [45, 84, 85, 99], [84, 85, 98, 99], [12, 30, 62, 99], [12, 45, 62, 99], [12, 45, 84, 99], [12, 30, 90, 99], [12, 84, 90, 99], [85, 94, 98, 99], [21, 94, 98, 99], [14, 85, 94, 99], [14, 31, 94, 99], [21, 31, 94, 99], [3, 83, 93, 100], [1, 42, 82, 100], [3, 42, 57, 100], [1, 42, 57, 100], [42, 82, 83, 100], [3, 42, 83, 100], [1, 82, 89, 100], [1, 24, 89, 100], [17, 24, 89, 100], [1, 24, 57, 100], [17, 25, 93, 100], [3, 25, 57, 100], [3, 25, 93, 100], [17, 24, 25, 100], [24, 25, 57, 100], [17, 93, 100, 101], [82, 83, 100, 101], [11, 83, 93, 101], [83, 93, 100, 101], [11, 29, 59, 101], [11, 29, 93, 101], [17, 29, 93, 101], [9, 82, 89, 101], [82, 89, 100, 101], [17, 89, 100, 101], [11, 46, 83, 101], [11, 46, 59, 101], [9, 46, 59, 101], [9, 46, 82, 101], [46, 82, 83, 101], [9, 28, 59, 101], [17, 28, 29, 101], [28, 29, 59, 101], [17, 28, 89, 101], [9, 28, 89, 101], [5, 43, 86, 102], [5, 86, 91, 102], [7, 43, 61, 102], [5, 43, 61, 102], [21, 27, 95, 102], [7, 27, 95, 102], [7, 27, 61, 102], [5, 26, 61, 102], [26, 27, 61, 102], [21, 26, 27, 102], [21, 26, 91, 102], [5, 26, 91, 102], [43, 86, 87, 102], [7, 43, 87, 102], [7, 87, 95, 102], [86, 91, 102, 103], [86, 87, 102, 103], [15, 31, 63, 103], [30, 31, 63, 103], [15, 31, 95, 103], [87, 95, 102, 103], [15, 87, 95, 103], [15, 47, 63, 103], [15, 47, 87, 103], [47, 86, 87, 103], [13, 30, 63, 103], [13, 30, 91, 103], [13, 86, 91, 103], [13, 47, 63, 103], [13, 47, 86, 103], [21, 91, 102, 103], [21, 30, 91, 103], [21, 30, 31, 103], [21, 95, 102, 103], [21, 31, 95, 103], [0, 48, 72, 104], [4, 33, 72, 104], [4, 33, 60, 104], [4, 41, 60, 104], [4, 48, 72, 104], [4, 41, 48, 104], [32, 33, 72, 104], [0, 32, 72, 104], [0, 32, 56, 104], [0, 40, 56, 104], [40, 41, 48, 104], [0, 40, 48, 104], [16, 32, 56, 104], [16, 32, 33, 104], [16, 33, 60, 104], [40, 41, 104, 105], [40, 41, 52, 105], [41, 60, 104, 105], [16, 60, 104, 105], [40, 56, 104, 105], [16, 56, 104, 105], [2, 40, 56, 105], [2, 40, 52, 105], [2, 36, 56, 105], [16, 36, 56, 105], [16, 37, 60, 105], [16, 36, 37, 105], [2, 52, 74, 105], [36, 37, 74, 105], [2, 36, 74, 105], [6, 52, 74, 105], [6, 41, 52, 105], [6, 41, 60, 105], [6, 37, 60, 105], [6, 37, 74, 105], [12, 35, 76, 106], [12, 45, 62, 106], [12, 35, 62, 106], [8, 44, 49, 106], [8, 49, 76, 106], [12, 49, 76, 106], [44, 45, 49, 106], [12, 45, 49, 106], [20, 35, 62, 106], [8, 44, 58, 106], [20, 34, 58, 106], [8, 34, 58, 106], [20, 34, 35, 106], [8, 34, 76, 106], [34, 35, 76, 106], [20, 62, 106, 107], [20, 38, 39, 107], [20, 39, 62, 107], [10, 38, 78, 107], [38, 39, 78, 107], [10, 53, 78, 107], [20, 58, 106, 107], [20, 38, 58, 107], [10, 38, 58, 107], [44, 58, 106, 107], [10, 44, 58, 107], [10, 44, 53, 107], [14, 39, 62, 107], [14, 39, 78, 107], [14, 53, 78, 107], [14, 45, 53, 107], [44, 45, 106, 107], [44, 45, 53, 107], [14, 45, 62, 107], [45, 62, 106, 107], [16, 32, 57, 108], [1, 32, 57, 108], [16, 32, 33, 108], [16, 33, 61, 108], [5, 33, 61, 108], [1, 32, 73, 108], [32, 33, 73, 108], [1, 50, 73, 108], [5, 33, 73, 108], [5, 50, 73, 108], [1, 42, 50, 108], [1, 42, 57, 108], [5, 43, 61, 108], [5, 43, 50, 108], [42, 43, 50, 108], [7, 37, 61, 109], [3, 36, 57, 109], [3, 42, 57, 109], [7, 43, 61, 109], [42, 43, 108, 109], [43, 61, 108, 109], [42, 57, 108, 109], [16, 36, 57, 109], [16, 36, 37, 109], [16, 57, 108, 109], [16, 61, 108, 109], [16, 37, 61, 109], [36, 37, 75, 109], [7, 37, 75, 109], [3, 36, 75, 109], [3, 42, 54, 109], [42, 43, 54, 109], [7, 43, 54, 109], [3, 54, 75, 109], [7, 54, 75, 109], [34, 35, 77, 110], [13, 35, 63, 110], [13, 35, 77, 110], [13, 47, 63, 110], [9, 34, 77, 110], [9, 51, 77, 110], [13, 51, 77, 110], [9, 46, 51, 110], [13, 47, 51, 110], [46, 47, 51, 110], [20, 35, 63, 110], [20, 34, 35, 110], [9, 34, 59, 110], [20, 34, 59, 110], [9, 46, 59, 110], [11, 38, 59, 111], [11, 38, 79, 111], [15, 47, 63, 111], [11, 55, 79, 111], [15, 47, 55, 111], [15, 55, 79, 111], [11, 46, 59, 111], [46, 47, 55, 111], [11, 46, 55, 111], [38, 39, 79, 111], [15, 39, 79, 111], [15, 39, 63, 111], [20, 38, 39, 111], [20, 39, 63, 111], [20, 38, 59, 111], [47, 63, 110, 111], [20, 59, 110, 111], [20, 63, 110, 111], [46, 59, 110, 111], [46, 47, 110, 111], [8, 65, 88, 112], [18, 65, 76, 112], [8, 65, 76, 112], [8, 49, 76, 112], [0, 64, 88, 112], [64, 65, 88, 112], [18, 64, 65, 112], [18, 64, 72, 112], [0, 64, 72, 112], [0, 48, 72, 112], [8, 49, 80, 112], [8, 80, 88, 112], [48, 49, 80, 112], [0, 48, 80, 112], [0, 80, 88, 112], [4, 68, 90, 113], [4, 84, 90, 113], [12, 84, 90, 113], [18, 68, 69, 113], [68, 69, 90, 113], [12, 69, 90, 113], [4, 68, 72, 113], [18, 68, 72, 113], [18, 69, 76, 113], [12, 69, 76, 113], [4, 48, 84, 113], [4, 48, 72, 113], [12, 49, 76, 113], [48, 49, 84, 113], [12, 49, 84, 113], [18, 76, 112, 113], [49, 76, 112, 113], [18, 72, 112, 113], [48, 49, 112, 113], [48, 72, 112, 113], [2, 66, 92, 114], [66, 67, 92, 114], [52, 53, 81, 114], [2, 52, 81, 114], [2, 81, 92, 114], [22, 66, 67, 114], [22, 67, 78, 114], [2, 66, 74, 114], [2, 52, 74, 114], [22, 66, 74, 114], [10, 53, 81, 114], [10, 53, 78, 114], [10, 67, 78, 114], [10, 81, 92, 114], [10, 67, 92, 114], [6, 85, 94, 115], [6, 52, 85, 115], [52, 53, 85, 115], [14, 85, 94, 115], [14, 53, 85, 115], [52, 53, 114, 115], [6, 52, 74, 115], [52, 74, 114, 115], [14, 71, 94, 115], [70, 71, 94, 115], [6, 70, 94, 115], [6, 70, 74, 115], [22, 74, 114, 115], [22, 70, 74, 115], [22, 70, 71, 115], [14, 71, 78, 115], [53, 78, 114, 115], [14, 53, 78, 115], [22, 78, 114, 115], [22, 71, 78, 115], [18, 64, 65, 116], [18, 65, 77, 116], [50, 51, 82, 116], [1, 50, 82, 116], [9, 51, 82, 116], [9, 51, 77, 116], [9, 65, 77, 116], [18, 64, 73, 116], [1, 64, 73, 116], [1, 50, 73, 116], [64, 65, 89, 116], [1, 64, 89, 116], [9, 65, 89, 116], [1, 82, 89, 116], [9, 82, 89, 116], [18, 73, 116, 117], [18, 77, 116, 117], [51, 77, 116, 117], [18, 69, 77, 117], [13, 51, 86, 117], [13, 51, 77, 117], [13, 69, 77, 117], [18, 68, 69, 117], [18, 68, 73, 117], [13, 69, 91, 117], [13, 86, 91, 117], [68, 69, 91, 117], [5, 86, 91, 117], [5, 68, 73, 117], [5, 68, 91, 117], [50, 51, 116, 117], [5, 50, 86, 117], [50, 51, 86, 117], [5, 50, 73, 117], [50, 73, 116, 117], [11, 55, 83, 118], [3, 66, 75, 118], [3, 54, 75, 118], [3, 54, 83, 118], [54, 55, 83, 118], [11, 55, 79, 118], [11, 67, 79, 118], [66, 67, 93, 118], [3, 83, 93, 118], [3, 66, 93, 118], [11, 67, 93, 118], [11, 83, 93, 118], [22, 66, 67, 118], [22, 67, 79, 118], [22, 66, 75, 118], [54, 55, 87, 119], [54, 55, 118, 119], [55, 79, 118, 119], [54, 75, 118, 119], [22, 75, 118, 119], [22, 79, 118, 119], [22, 71, 79, 119], [22, 70, 71, 119], [70, 71, 95, 119], [22, 70, 75, 119], [7, 54, 75, 119], [7, 54, 87, 119], [7, 70, 75, 119], [7, 87, 95, 119], [7, 70, 95, 119], [15, 71, 79, 119], [15, 55, 79, 119], [15, 55, 87, 119], [15, 71, 95, 119], [15, 87, 95, 119]]
from itertools import combinations
F = all_intersections(A) # all intersections: function from other question
                         # takes 415 ms
F = sorted(F,lambda x,y: cmp(len(x),len(y)))
pairs = [ (x,y) for x,y in combinations(F,2) if set(y).issuperset(x) ]
                         # takes ~6 sec

一个例子是顶点标记为{1,2,3,4}的正方形:集合A是{{1,2},{2,3},{3,4},{4,1}},交集F是{{},{1},{2},{3},{4},{1,2},{2,3},{3,4},{4,1},{1,2,3,4}},需要的对是
({},{1}),({},{2}),({},{3}),({},{4}),
({1},{1,2}),({1},{4,1}),
({2},{1,2}),({2},{2,3}),
({3},{2,3}),({3},{3,4}),
({4},{3,4}),({4},{4,1}),
({1,2},{1,2,3,4}),({2,3},{1,2,3,4}),({3,4},{1,2,3,4}),({4,1},{1,2,3,4})

一旦给定集合F,我认为最好的方法就是比较元素。但我更多地考虑使用关于刚刚相交的东西的知识同时计算(1)和(2)的算法。
根据下面David K的解决方案,给定why,还有两个可以使用的假设:
1. 结果排序是分级的,具有唯一的最小值和唯一的最大值。也就是说,每个最大链F0 < F1 < ... < Fm的覆盖关系都具有相同的长度,其中F0是空集,Fm是输入集A的并集。我们称Fi集合为i级,这是一个明确定义的概念,由于其分级性质。 2. 每个M级集合都是恰好2个M+1级集合的交集。
非常感谢!

“all_intersections(A)” 在输入上需要多长时间?我假设“all_intersections”是一个 NP 问题,最坏情况相对于输入大小呈指数级增长,对吗? - beoliver
想法#1:首先按大小和字典顺序对F的成员进行排序。 大小排序意味着您只需要测试i <j的情况下X_i是否是X_j的子集,这大约是测试的一半;每个恒定大小块内的lex排序意味着相邻的集合具有相等的前缀,因此通过预计算告诉您集合X_i和X_{i-1}的最长公共前缀的表lcp[i],每个失败的子集测试都可以潜在地排除连续的后续超集候选块。 - j_random_hacker
想法#2(仅在对于每个集合X通常存在一个适当的超集Y,使得| Y | = | X | + 1时才有效):要为给定的集合X找到“最小超集”Y(同时回答2a),您可以取X并尝试将每个可能的其他整数插入其中,并查看是否存在这个更大的集合(例如通过维护包含F中所有集合的哈希表(在Python中可能称为字典)):如果是,则必须是最小超集。(如果不是,则需要尝试添加每个可能的一对其他整数等,或者退回到当前方法。) - j_random_hacker
想法#3:将这个小代码片段重写为C++或其他编译语言。如果你只是想要一个小的常数因子改进,通常这是可行的方法。 - j_random_hacker
我的观点是,在“为什么我想要这样做”之前的问题陈述中,只说明了A是一个由整数有限集合组成的有限集合,因此解决该问题的算法必须适用于_任何_由整数有限集合组成的有限集合。并非所有这样的集合(我相信)都会导致欧拉偏序。如果您限制只接受答案为欧拉偏序的输入,则可以利用欧拉偏序的属性进行较少的计算。例如,您只需要检查一级差异的集合包含关系。 - David K
显示剩余20条评论
3个回答

2
这是一个函数,利用输入列表是抽象多面体的面的假设。该函数不是采用集合交集的所有组合,而是假定输入是M维面(秩为M的多面体)在秩为M + 1的多面体中的完整列表。然后,它执行一个循环,每次迭代都会获取一个完整的M维面列表,并生成一个完整的(M-1)维面列表,同时累积这两个面列表的所有包含关系对。
函数的主循环相交每一对M维面,并构建一个结构列表,列出每个相交点和包含它的M维面。这些相交点包括所有(M-1)维面,但也包括一些低秩面。可以通过观察每个低秩面都是(M-1)维面的子集来识别低秩面。因此,任何是另一个相交的子集的相交都将被消除。
运行时间的大致分配是40%用于对面对进行相交、40%用于跟踪哪些M维面包含了每个结果的相交部分,10%用于消除秩小于M-1的面,以及10%用于将包含关系对写入输出列表。我的电脑比你的慢(原始函数需要6或6.5秒,而新函数需要8秒左右),但新函数的最终结果是每个秩和下一个秩之间所有包含关系对的列表,比生成所有包含关系(包括跨越多个秩的)的原始函数快10x-15x。
请注意,并非每个整数列表的列表都是新函数的有效输入,因为存在点集的集合,不是抽象多面体的面。我没有编写检查输入正确性的代码。
为了检查输出的正确性,我在原始函数中添加了一些(相当慢)的代码,以查找所有满足形式为(s,u)和(u,t)的一对(s,t),并返回已删除所有这些对的修订列表。我还修改了新旧函数,通过在它们放入输出的每个整数列表上调用sorted(),以使输出列表正确比较。然后,我确认两种函数产生了相同的输出。
顺便说一句,我怀疑这个函数可能不够pythonic。欢迎提出改进建议。
from collections import defaultdict
import sys

def generatePairs(A):
    # It is assumed that A consists exactly of all the facets of an abstract
    # polytope of rank N; that is, the abstract polytope is a graded poset
    # in which the minimal element is the empty set and has rank -1, the
    # maximal element is the polytope's body, which has rank N, and A
    # contains all facets of the polytope, which have rank N - 1.
    # Then within the graded poset,
    # each element of rank 0 is a point and has cardinality 1;
    # each element of rank 1 is an edge and has cardiality 2;
    # each element of rank M (where M > 1) is a rank-M polytope and has
    # cardinality at least M + 1, but may have greater cardinality.

    # We start with the facets (rank N-1).
    rank_to_intersect = [frozenset(s) for s in A]

    # Construct the body (rank N).
    polytope_body = list(frozenset.union(*rank_to_intersect))
    body_size = len(polytope_body)

    # covering_pairs will be all the pairs of polytopes (s,t) such that
    # rank(s) + 1 == rank(t) and s is a subset of t. Initially we populate
    # it with just the pairs whose ranks are respectively N-1 and N.
    covering_pairs = [(s, polytope_body) for s in A]

    while (len(rank_to_intersect) > 0) and (len(rank_to_intersect[0]) > 2):
        # For some integer M such that M > 1, rank_to_intersect contains all
        # the polytopes of rank M. At the end of each iteration of the loop,
        # rank_to_intersect will contain all the polytopes of rank M - 1.
        # Also, all the pairs (x,y) where rank(x) = M - 1 and rank(y) = M
        # will have been added to covering_pairs.

        container_map = defaultdict(list)
        while rank_to_intersect:
            s = rank_to_intersect.pop()
            for t in rank_to_intersect:
                x = s & t
                if len(x) > 1:
                    container_map[x].extend([s, t])
                    # Note that the list container_map[x]
                    # may contain duplicates

        # The keys of container_map, consisting of all pairwise
        # intersections of polytopes of rank M, include all polytopes
        # of rank M - 1 but also some polytopes of lower ranks.
        # Any polytope of a lower rank, however, is a subset of
        # a polytope of rank M - 1 that is also in the list.

        min_size   = min([len(s) for s in container_map.keys()])
        max_size   = max([len(s) for s in container_map.keys()])
        size_range = range(min_size, max_size + 1)
        candidates = dict([(i, []) for i in size_range])
        for s in container_map.keys():
            candidates[len(s)].append(s)

        # Repopulate rank_to_intersect with the polytopes of rank M - 1.
        for set_size in size_range:
            larger_sizes = range(set_size + 1, max_size + 1)
            for s in candidates[set_size]:
                if not any(any(t >= s for t in candidates[i])
                           for i in larger_sizes):
                    # We now know that s has rank M - 1, not a lower rank.
                    rank_to_intersect.append(s)

        # Add all the (rank-(M - 1), rank-M) pairs to covering_pairs.
        for s in rank_to_intersect:
            # container_map[s] may contain duplicates; avoid them.
            containers = frozenset(container_map[s])
            covering_pairs.extend([(list(s), list(t)) for t in containers])

    # At the end of the loop, rank_to_intersect contains the rank-1
    # polytopes, that is, the edges.
    # Each edge contains each of its two endpoints.
    points_with_duplicates = []
    for e in rank_to_intersect:
        covering_pairs.extend([([p], list(e)) for p in e])
        points_with_duplicates.extend(e)

    # List the containment pairs of the empty set without duplicating points.
    points = frozenset(points_with_duplicates)
    covering_pairs.extend([([], [p]) for p in points])

    return covering_pairs

太棒了,感谢你解决了这个问题!它确实产生了正确的输出(在600个单元格上与我今天实现的另一个算法进行了比较)。我实现的算法使用了来自sagemath.org的Graph类(因此在这里并不是很合适),而且比你的慢了大约2.5倍。明天我会仔细阅读你的代码,并在需要时发表评论。 - Christian
你所做的假设有一个注释:你只假设输出是一个排名的格子,并且每个排名为M的面是两个排名为M+1的面的交集。这适用于比多面体更广泛的结构类别。例如,(不一定是你关心的)兼容定向拟阵的面格类也满足这个假设,以及非多面体三角化球体的面格类也满足这个假设。 - Christian
你所使用的而我忽略了的重大改进是,你只需要相交所有N+1级面的每一对(实际上你昨天已经写过了,但我没有意识到这是一个巨大的优势),就可以得到所有N级面。 - Christian
棘手的部分是要区分所需等级的面孔和低等级的交叉点。在尝试之前,我并不完全确定这一步骤对运行时间的影响,具体取决于如何执行该步骤(这将极大地影响运行时间)。此外,感谢您对算法可能适用的问题类别的评论 - 知道算法适用性的更精确界限从来没有坏处。 - David K
我发布了另一种相当不同的算法,如果您想看看的话--欢迎反馈和改进! - Christian

0

请问以下更改的运行时间是多少?

def all_intersections(lists):
    sets = allIntersections([frozenset(s) for s in lists])
    return list(sets)


A = ...
F = all_intersections(A) 
F.sort(key=len)
pairs = [(x,y) for x,y in combinations(F,2) if y.issuperset(x)]

感谢您的使用!与set(y).issuperset(x)相比,y.issuperset(x)提供了2倍的因子(约为3秒),所以这确实值得做。但我更希望有一种聪明的方法来构建集合F,以便在计算过程中提供这些“pairs”,从而使我可以在总共不到1秒的时间内完成整个计算。 - Christian
我知道,抱歉,我非常忙,没有太多时间去深思熟虑! - beoliver

0

这里有一个替代算法,它只使用了上面的分级假设。其思想是

  1. 除了 A 的联合部分,每个交点都有一个可快速计算的覆盖

  2. 这将产生一个从顶部元素开始的顺序生成树,由 A 的联合给出

  3. 然后按顺序计算排名,并仅比较相差1的排名元素

    from collections import defaultdict
    
    # 这是John Coleman从https://dev59.com/TZffa4cB1Zd3GeqP_rRg中提取出的函数
    # 计算包括空交集(对应于`A`中所有集合的联合)在内的所有交集。
    def all_intersections(A):
        # 使用frozensets进行交集运算
        A = map(frozenset,A)
        A.sort(key=len)
    
        # `A`的联合作为基础集合
        universalSet = frozenset.union(*A)
    
        # 逐个计算交集
        intersections = set([universalSet])
        for s in A:
            moreIntersections = set(s & t for t in intersections)
            intersections.update(moreIntersections)
        return intersections
    
    def ranked_pieces(intersections):
        # 这是为了避免下面长度测试的检查
        lens = { s:len(s) for s in intersections }
        V = sorted(intersections, lambda x,y:cmp(lens[x],lens[y]))
    
        m = len(V)
    
        lower_covs = defaultdict(set)
        # 我们首先计算生成树…
        for i,x in enumerate(V):
            for j in range(i+1,m):
                y = V[j]
                if lens[x] < lens[y] and y.issuperset(x):
                    lower_covs[y].add(x)
                    # 由于V是根据大小排序的,
                    # 我们肯定已经找到了一个覆盖并停止了。
                    break
    
        # …然后计算级别集合
        level = set([V[-1]])
        level_sets = [level]
    
        while level:
            level = set.union(*[lower_covs[v] for v in level])
            level_sets.append(level)
    
        # 级别集合现在是倒序排序的
        # 即level_sets [0]是最大的集合,
        # 而level_sets [-1]是空集
        return level_sets
    
    def ranked_pieces_to_covers(ranked_pieces):
        # 这是因为我们希望使用元组而不是集合
        # 并且我们只想计算一次
        back_ref = {f:tuple(sorted(f)) for i in range(len(ranked_pieces)) for f in ranked_pieces[i]}
        covs = []
        for i in range(len(ranked_pieces)-1):
            high = ranked_pieces[i]
            low  = ranked_pieces[i+1]
    
            for x in low:
                for y in high:
                    if y.issuperset(x):
                        covs.append((back_ref[x], back_ref[y]))
        return covs
    
    def generate_covers(A):
        return ranked_pieces_to_covers(ranked_pieces(all_intersections(A)))
    

对于600维数据的测试,David K的算法速度要快得多:

    sage: %time X = generatePairs(A)
    CPU times: user 363 ms, sys: 28.2 ms, total: 392 ms
    Wall time: 373 ms
    sage: %time Y = generate_covers(A)
    CPU times: user 1.09 s, sys: 25.3 ms, total: 1.12 s
    Wall time: 1.08 s
    sage: set(X) == set(Y)
    True

但是,集合越“密集”,另一个算法就变得越好:

10个顶点的单纯形:

    sage: m = 10; A = [ [i for i in range(m) if i != j] for j in range(m) ]
    sage: %time X = generatePairs(A)
    CPU times: user 151 ms, sys: 4.33 ms, total: 156 ms
    Wall time: 144 ms
    sage: %time Y = generate_covers(A)
    CPU times: user 118 ms, sys: 17.3 ms, total: 136 ms
    Wall time: 106 ms
    sage: set(X) == set(Y)
    True

以及14个顶点上的单纯形:

    sage: m = 14; A = [ [i for i in range(m) if i != j] for j in range(m) ]
    sage: %time X = generatePairs(A)
    CPU times: user 56.3 s, sys: 136 ms, total: 56.5 s
    Wall time: 56.6 s
    sage: %time Y = generate_covers(A)
    CPU times: user 31 s, sys: 65.4 ms, total: 31 s
    Wall time: 31 s

我主要在这里发布算法以供文档使用,但我非常感谢改进建议。


网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接