Unique combinations from a list of tuples

Score: 11

Given a list of triples, e.g. [(1,2,3), (4,5,6), (7,8,9)], how do I compute all possible combinations, including combinations of subsets?

In this case, the result should look like this:

[
(1), (1,4), (1,5), (1,6), (1,7), (1,8), (1,9), (1,4,7), (1,4,8), (1,4,9), (1,5,7), (1,5,8), (1,5,9), (1,6,7), (1,6,8), (1,6,9),
(2), ...,
(3), ...,
(4), (4,7), (4,8), (4,9), 
(5), (5,7), (5,8), (5,9), 
(6), (6,7), (6,8), (6,9), 
(7), (8), (9)
]
  • Tuples that contain the same elements are considered identical
  • Combinations whose elements come from the same input tuple are not allowed (for example, (1,2), (4,6) and (7,8,9) must not appear in the solution); see the sketch below this list
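
To make rule 2 concrete, here is a minimal sketch of the constraint (the helper respects_rule_2 is hypothetical and not part of the question): a combination is valid only if it uses at most one element from each input tuple.

def respects_rule_2(combo, groups):
    # at most one element of the candidate may come from any single input tuple
    return all(sum(x in group for x in combo) <= 1 for group in groups)

groups = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
assert respects_rule_2((1, 4, 7), groups)
assert not respects_rule_2((1, 2), groups)  # two elements taken from the same tuple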

But if the second rule does not allow (1,2), why is (1)(9) part of the solution? - Drey
It looks like there are three groups of tuples: 1) [(x,) for x in the_list[0]], 2) [(x,y) for x in the_list[0] for y in the_list[1]], and 3) [(x,y,z) for x in the_list[0] for y in the_list[1] for z in the_list[2]] - chepner
Possible duplicate of Picking unordered combinations from pools with overlap. - Joseph Wood
5 Answers

Score: 10

You can use recursion inside a generator:

data = [(1,2,3), (4,5,6), (7,8,9)]

def combos(d, c = []):
    # yield every ordering (permutation) of the groups in d
    if len(c) == len(d):
        yield c
    else:
        for i in d:
            if i not in c:
                yield from combos(d, c + [i])

def product(d, c = []):
    # for one ordering of the groups, yield every non-empty "prefix pick":
    # one element from the first group, optionally one from the second, and so on
    if c:
        yield tuple(c)
    if d:
        for i in d[0]:
            yield from product(d[1:], c + [i])

result = sorted({i for b in combos(data) for i in product(b)})
# keep only the first tuple for each set of elements, dropping the reorderings
# produced by different group orderings
final_result = [a for i, a in enumerate(result) if all(len(c) != len(a) or len(set(c) & set(a)) != len(a) for c in result[:i])]

Output:

[(1,), (1, 4), (1, 4, 7), (1, 4, 8), (1, 4, 9), (1, 5), (1, 5, 7), (1, 5, 8), (1, 5, 9), (1, 6), (1, 6, 7), (1, 6, 8), (1, 6, 9), (1, 7), (1, 8), (1, 9), (2,), (2, 4), (2, 4, 7), (2, 4, 8), (2, 4, 9), (2, 5), (2, 5, 7), (2, 5, 8), (2, 5, 9), (2, 6), (2, 6, 7), (2, 6, 8), (2, 6, 9), (2, 7), (2, 8), (2, 9), (3,), (3, 4), (3, 4, 7), (3, 4, 8), (3, 4, 9), (3, 5), (3, 5, 7), (3, 5, 8), (3, 5, 9), (3, 6), (3, 6, 7), (3, 6, 8), (3, 6, 9), (3, 7), (3, 8), (3, 9), (4,), (4, 7), (4, 8), (4, 9), (5,), (5, 7), (5, 8), (5, 9), (6,), (6, 7), (6, 8), (6, 9), (7,), (8,), (9,)]
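
As a quick sanity check (a sketch, not part of the original answer): each of the three input triples contributes either nothing or exactly one of its three elements, so there are 4 ** 3 - 1 = 63 non-empty combinations, which matches the length of the output above.

assert len(final_result) == 63  # 4 ** 3 - 1: skip or pick one from each of the three triples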

Score: 6
Here is a non-recursive solution using simple for loops. Uniqueness is enforced by applying set to the list of output tuples.
lsts = [(1,2,3), (4,5,6), (7,8,9)]

res = [[]]
for lst in lsts:
    # extend every selection found so far with one element of the current tuple
    res += [(*r, x) for r in res for x in lst]

# print({tuple(lst) for lst in res[1:]})
# {(5, 9), (4, 7), (6, 9), (1, 4, 7), (2, 6, 9), (4, 8), (3, 4, 7), (2,
# 8), (2, 6, 8), (9,), (2, 5, 8), (1, 6), (3, 6, 8), (2, 5, 9), (3, 5,
# 9), (3, 7), (2, 5), (3, 6, 9), (5, 8), (1, 6, 8), (3, 5, 8), (2, 6,
# 7), (4, 9), (6, 7), (1,), (2, 9), (1, 6, 9), (3,), (1, 5), (5,), (3,
# 6), (7,), (3, 6, 7), (1, 5, 9), (2, 6), (2, 4, 7), (1, 5, 8), (3, 4,
# 8), (8,), (3, 4, 9), (1, 4), (1, 6, 7), (3, 9), (1, 9), (2, 5, 7), (3,
# 5), (2, 7), (2, 4, 9), (6, 8), (1, 5, 7), (2,), (2, 4, 8), (5, 7), (1,
# 4, 8), (3, 5, 7), (4,), (3, 8), (1, 8), (1, 4, 9), (6,), (1, 7), (3,
# 4), (2, 4)}

Brilliant! Just three lines of pure Python, no imports, the fastest solution so far, and currently the only one that includes the empty set (which, in my opinion, should be part of the solution). - wovano
By the way: you mention enforcing uniqueness by applying set, but I believe the list is already correct (i.e., it contains only unique values, no duplicates). - wovano
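
A quick check of that point (a sketch, not from the original answer, assuming the loop above has been run): the list built by the loop already contains 63 distinct selections, so applying set only changes the container type rather than removing duplicates.

unique = {tuple(r) for r in res[1:]}
assert len(unique) == len(res[1:]) == 63  # nothing is deduplicated away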

Score: 3

Using itertools:

import itertools as it

def all_combinations(groups):
    result = set()
    # each product takes exactly one element from every group; its
    # sub-combinations of length 1..len(groups) are the valid selections
    for prod in it.product(*groups):
        for length in range(1, len(groups) + 1):
            result.update(it.combinations(prod, length))
    return result

all_combinations([(1,2,3), (4,5,6), (7,8,9)])
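
The function returns a set, so sorting it (a usage sketch, not part of the original answer) gives output comparable to the other answers:

print(sorted(all_combinations([(1, 2, 3), (4, 5, 6), (7, 8, 9)])))
# [(1,), (1, 4), (1, 4, 7), ..., (8,), (9,)]  (63 tuples in total)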

Score: 1
Another version:
from itertools import product, combinations

lst = [(1,2,3), (4,5,6), (7,8,9)]

def generate(lst):
    for idx in range(len(lst)):
        for val in lst[idx]:
            yield (val,)
            # pair val with one element from each of j of the later groups
            for j in range(1, len(lst)):
                for c in combinations(lst[idx+1:], j):
                    yield from tuple((val,) + i for i in product(*c))

l = [*generate(lst)]
print(l)

Output:

[(1,), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (1, 4, 7), (1, 4, 8), (1, 4, 9), (1, 5, 7), (1, 5, 8), (1, 5, 9), (1, 6, 7), (1, 6, 8), (1, 6, 9), (2,), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), (2, 9), (2, 4, 7), (2, 4, 8), (2, 4, 9), (2, 5, 7), (2, 5, 8), (2, 5, 9), (2, 6, 7), (2, 6, 8), (2, 6, 9), (3,), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (3, 4, 7), (3, 4, 8), (3, 4, 9), (3, 5, 7), (3, 5, 8), (3, 5, 9), (3, 6, 7), (3, 6, 8), (3, 6, 9), (4,), (4, 7), (4, 8), (4, 9), (5,), (5, 7), (5, 8), (5, 9), (6,), (6, 7), (6, 8), (6, 9), (7,), (8,), (9,)]

Score: 1

Thanks to @wovano for clarifying rule 2. That makes the solution much shorter:

from itertools import chain, combinations, product

blubb = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# every subset of the groups, from the empty subset up to all of them
set_combos = chain.from_iterable(combinations(blubb, i) for i in range(len(blubb) + 1))
# for each subset of groups, take one element from every selected group
result_func = list(chain.from_iterable(map(lambda x: product(*x), set_combos)))

As a bonus, here is a speed comparison. @hilberts_drinking_problem's solution is very nice, but it carries some overhead.
def pure_python(list_of_tuples):
    res = [tuple()]
    for lst in list_of_tuples:
        res += [(*r, x) for r in res for x in lst]
    return res


def with_itertools(list_of_tuples):
    set_combos = chain.from_iterable(combinations(list_of_tuples, i) for i in range(len(list_of_tuples) + 1))
    return list(chain.from_iterable(map(lambda x: product(*x), set_combos)))


assert sorted(with_itertools(blubb), key=str) == sorted(pure_python(blubb), key=str)

Both compute the same result, but...

%timeit with_itertools(blubb)
7.18 µs ± 11.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit pure_python(blubb)
10.5 µs ± 46 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


Although itertools is only slightly faster on a small sample, the gap becomes large as the input grows.
import toolz
large_blubb = list(toolz.partition_all(3, range(3*10)))
assert sorted(with_itertools(large_blubb), key=str) == sorted(pure_python(large_blubb), key=str)

%timeit with_itertools(large_blubb)
106 ms ± 307 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit pure_python(large_blubb)
262 ms ± 1.85 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

This makes the itertools solution about 2.5 times faster.


Edit: corrected the solution for rule 2 and added a speed comparison. Edit: fixed another bug - the speed comparison is now accurate.


It looks like you did not fully understand the question. Please check the last requirement again: "_Combinations derived from the same tuple cannot be used_". This means you should choose 0 or 1 elements from (1,2,3), 0 or 1 elements from (4,5,6), and 0 or 1 elements from (7,8,9). See the output of the other answers for reference. Your solution also returns invalid combinations, e.g. (1,2,4). - wovano
Thanks for the explanation, @wovano! I have edited the answer. It should now follow the rules. - Drey
Nice improvement!! I would change the third line to set_combos = chain.from_iterable(combinations(blubb, i) for i in range(len(blubb) + 1)) so that it works for inputs of any length rather than a fixed length of 3. It also adds the empty list, so you don't have to add it manually (i.e., remove the [[]] + on the next line). Your solution is faster for large input sets, well done :-) - wovano
Thanks for the suggestion. If I understood correctly, I used len(blubb[0])+1 instead of len(blubb)+1 - Drey
No, it really has to be len(blubb) and not len(blubb[0]), because you are creating combinations of the groups (the groups of numbers, not Python sets), so you have to specify the number of groups, i.e. len(blubb) (see the sketch after this comment thread). Note: why the name blubb? A more descriptive name might help ;-) - wovano
Oh yes. There was indeed a bug in with_itertools. It is still faster, but not by an order of magnitude. - Drey
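
To illustrate that last point (a sketch added for clarity, not part of the original comments): combinations(blubb, i) selects i of the groups, so i has to range over the number of groups, len(blubb):

from itertools import combinations

blubb = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
print(list(combinations(blubb, 2)))
# [((1, 2, 3), (4, 5, 6)), ((1, 2, 3), (7, 8, 9)), ((4, 5, 6), (7, 8, 9))]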
