How to get the Cartesian product of multiple lists

Question

521

How can I get the Cartesian product (every possible combination of values) from a group of lists?

For example, given:

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]

How do I get this?

[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), (2, 'a', 5), ...]

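A common application of this technique is to avoid deeply nested loops; see "Avoiding nested for loops" for a more specific case. Similarly, the technique can be used to "explode" a dictionary with list values; see "Combine Python Dictionary Permutations into List of Dictionaries".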

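If you want the Cartesian product of the same list with itself, itertools.product can handle that as well. See "Operation on every pair of element in a list" or "How can I get 'permutations with repetitions' from a list (Cartesian product of a list with itself)?".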

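Many people who already know about itertools.product struggle with the fact that it expects each input sequence as a separate argument, rather than, for example, a list of lists. The accepted answer shows how to handle this with *. However, using * to unpack arguments here is fundamentally no different from using it in any other function call. See "Expanding tuples into arguments" for that topic (and use it to close duplicate questions as appropriate).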
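The note above about the product of a list with itself can be illustrated with the `repeat` keyword of itertools.product. This is a minimal sketch; the `digits` list is a made-up input, not part of the question.

```python
from itertools import product

# Hypothetical input: one list whose product with itself we want.
digits = [0, 1]

# repeat=2 behaves like product(digits, digits).
pairs = list(product(digits, repeat=2))
same_pairs = list(product(digits, digits))

print(pairs)
```

Using `repeat` avoids writing the same argument several times when the number of copies is only known at run time.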

- ʞɔıu

40

Note that "every possible combination" is not quite the same as "Cartesian product", because duplicates are allowed in a Cartesian product. - Kenan Banks

10

Is there a non-duplicate version of the Cartesian product? - KJW

23

@KJW Yes, set(cartesian product). - NoBugs

14

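There should be no duplicates in a Cartesian product unless the input lists themselves contain duplicates. If you want no duplicates in the Cartesian product, use set(inputlist) on every input list, not on the result. - CamilB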

9

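Mathematically, a Cartesian product is a set, so it contains no duplicates. On the other hand, the output of itertools.product will contain duplicates if the inputs do. So, strictly speaking, itertools.product is not the Cartesian product unless you wrap the inputs in set, as @CamilB mentioned. - Cameron Bieganek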
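To make the comments above concrete, here is a small sketch comparing the raw output with the output after deduplicating each input. The list `lists_with_dupes` is a made-up example, and `dict.fromkeys` is used instead of `set` only to keep the original order of the values.

```python
from itertools import product

# Hypothetical inputs: the first list contains a duplicate value.
lists_with_dupes = [[1, 1, 2], ['a', 'b']]

# Duplicated inputs lead to duplicated tuples in the output.
raw = list(product(*lists_with_dupes))

# Deduplicate every input first; dict.fromkeys keeps first-seen order.
unique_inputs = [list(dict.fromkeys(lst)) for lst in lists_with_dupes]
deduped = list(product(*unique_inputs))

print(len(raw), len(deduped))
```

Deduplicating the inputs is cheaper than deduplicating the result, because the product is never larger than it needs to be.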

19 Answers

124

>>> import itertools
>>> for i in itertools.product([1,2,3],['a','b'],[4,5]):
...     print(i)
...
(1, 'a', 4)
(1, 'a', 5)
(1, 'b', 4)
(1, 'b', 5)
(2, 'a', 4)
(2, 'a', 5)
(2, 'b', 4)
(2, 'b', 5)
(3, 'a', 4)
(3, 'a', 5)
(3, 'b', 4)
(3, 'b', 5)
>>>

- Jason Baker

44

For Python 2.5 and earlier:

>>> [(a, b, c) for a in [1,2,3] for b in ['a','b'] for c in [4,5]]
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), 
 (2, 'a', 5), (2, 'b', 4), (2, 'b', 5), (3, 'a', 4), (3, 'a', 5), 
 (3, 'b', 4), (3, 'b', 5)]

Here is a recursive version of product() (for illustration only):

def product(*args):
    if not args:
        return iter(((),)) # yield tuple()
    return (items + (item,) 
            for items in product(*args[:-1]) for item in args[-1])

Example:

>>> list(product([1,2,3], ['a','b'], [4,5])) 
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), 
 (2, 'a', 5), (2, 'b', 4), (2, 'b', 5), (3, 'a', 4), (3, 'a', 5), 
 (3, 'b', 4), (3, 'b', 5)]
>>> list(product([1,2,3]))
[(1,), (2,), (3,)]
>>> list(product([]))
[]
>>> list(product())
[()]

- J.F. Sebastian

1

The recursive version does not work if some of the args are iterators. - jfs
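The limitation noted in the comment above arises because the last argument is iterated once per partial tuple, and an iterator is exhausted after the first pass. One possible workaround, sketched here under the hypothetical name `product_safe`, is to copy every argument into a tuple before recursing.

```python
def product_safe(*args):
    # Materialize every input once so it can be iterated repeatedly.
    pools = [tuple(arg) for arg in args]

    def _product(pools):
        if not pools:
            return iter(((),))  # a single empty tuple
        return (items + (item,)
                for items in _product(pools[:-1]) for item in pools[-1])

    return _product(pools)


# Both inputs are one-shot iterators.
result = list(product_safe(iter([1, 2]), iter('ab')))
print(result)
```

The cost is that each input is held in memory in full, which is the same trade-off itertools.product itself makes.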

41

I would use a list comprehension:

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]

cart_prod = [(a,b,c) for a in somelists[0] for b in somelists[1] for c in somelists[2]]

- user1035648

33

Because the code seems to be hard-wired to the number of lists. - Bằng Rikimaru

@Bằng Rikimaru How is the list comprehension fixed? lst = [i for i in itertools.product(*somelists)] - Lucas Schwartz

1

@LucasSchwartz This answer does not use itertools; it uses chained loops inside a list comprehension. Your solution is essentially a different answer. - Lucas Lima

33

Use itertools.product:

import itertools
result = list(itertools.product(*somelists))

- SilentGhost

7

What is the purpose of the * before somelists? - Vineet Kumar Doshi

1

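"product(somelists)" computes a Cartesian product over the sublists themselves: Python treats "[1, 2, 3]" as a single element, then takes the next element after the following comma and line break, so the first product term is ([1, 2, 3],), the second is ([4, 5],), and the result looks like "[([1, 2, 3],), ([4, 5],), ([6, 7],)]". To get the Cartesian product of the elements inside the sublists, use the asterisk to tell Python to unpack the outer structure. For dictionaries, use **. More details here: https://dev59.com/PnRC5IYBdhLWcg3wJNcN#400753. - hhh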

@VineetKumarDoshi See https://dev59.com/HHVD5IYBdhLWcg3wQZUg. - Solomon Ucko
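The difference described in the comments above can be seen by calling itertools.product with and without the asterisk. This sketch reuses the `somelists` value from the question; the names `without_star` and `with_star` are only for illustration.

```python
from itertools import product

somelists = [[1, 2, 3], ['a', 'b'], [4, 5]]

# Without *: product sees ONE iterable whose items are the sublists.
without_star = list(product(somelists))

# With *: each sublist becomes its own positional argument.
with_star = list(product(*somelists))

print(without_star)
print(len(with_star))
```

The first call yields one 1-tuple per sublist, while the second yields the 3-tuples the question asks for.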

18

Here is a recursive generator that does not store any temporary lists.

def product(ar_list):
    if not ar_list:
        yield ()
    else:
        for a in ar_list[0]:
            for prod in product(ar_list[1:]):
                yield (a,)+prod

print(list(product([[1,2],[3,4],[5,6]])))

Output:

[(1, 3, 5), (1, 3, 6), (1, 4, 5), (1, 4, 6), (2, 3, 5), (2, 3, 6), (2, 4, 5), (2, 4, 6)]

- Anurag Uniyal

2

They are stored on the stack. - Quentin Pradet

@QuentinPradet Do you mean that a generator like def f(): while True: yield 1 keeps growing its stack as we iterate over it? - Anurag Uniyal

@QuentinPradet Yes, but even then only the stack for the maximum depth is needed, not the whole list, so in this case a stack of depth three. - Anurag Uniyal

True, sorry. A benchmark could be interesting. :) - Quentin Pradet

We now have yield from, which makes this simpler. - njzk2
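As a sketch of what the last comment may have in mind, the recursive generator can delegate to a generator expression with `yield from` (Python 3.3+). The name `product_gen` is hypothetical and is chosen only to avoid shadowing the function in the answer.

```python
def product_gen(ar_list):
    # Base case: the product of no lists is a single empty tuple.
    if not ar_list:
        yield ()
        return
    first, rest = ar_list[0], ar_list[1:]
    for a in first:
        # Delegate to the sub-product, prefixing each tuple with a.
        yield from ((a,) + prod for prod in product_gen(rest))


result = list(product_gen([[1, 2], [3, 4], [5, 6]]))
print(result)
```

The behavior is the same as the original generator; `yield from` only replaces the innermost explicit loop.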

11

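In Python 2.6 and above you can use `itertools.product`. In older versions of Python you can use the following code from the documentation (almost equivalent; see the documentation), at least as a starting point: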

def product(*args, **kwds):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    # list(...) is needed on Python 3, where map() returns an iterator
    pools = list(map(tuple, args)) * kwds.get('repeat', 1)
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)

Both return an iterator, so if you really need a list for further processing, use list(result).

- user3850

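According to the documentation, the actual itertools.product implementation does not build intermediate results, which could be expensive. With this technique, memory use can get out of hand quickly even for moderately sized lists. - Kenan Banks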

5

I can only point the OP to the documentation, not read it for him. - user3850

1

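The code in the documentation is meant to demonstrate what the product function does, not to serve as a workaround for earlier versions of Python. - Kenan Banks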
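The point about intermediate results can be illustrated by taking only the first few tuples of a very large product. This is a minimal sketch with made-up sizes; itertools.product copies each input into a tuple up front but produces the output tuples one at a time.

```python
from itertools import islice, product

# 1000**4 = 10**12 tuples in total; far too many to store in a list.
big = product(range(1000), repeat=4)

# islice pulls just the first three tuples from the lazy iterator.
first_three = list(islice(big, 3))

print(first_three)
```

The list-building version from the documentation would try to construct every partial result before yielding anything, which is why it does not scale to inputs like this.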

9

Although there are many answers already, I would like to share some of my thoughts:

Iterative approach

def cartesian_iterative(pools):
  result = [[]]
  for pool in pools:
    result = [x+[y] for x in result for y in pool]
  return result

Recursive approach

def cartesian_recursive(pools):
  if len(pools) > 2:
    pools[0] = product(pools[0], pools[1])
    del pools[1]
    return cartesian_recursive(pools)
  else:
    pools[0] = product(pools[0], pools[1])
    del pools[1]
    return pools
def product(x, y):
  return [xx + [yy] if isinstance(xx, list) else [xx] + [yy] for xx in x for yy in y]

Lambda approach

from functools import reduce  # reduce is not a builtin in Python 3

def cartesian_reduct(pools):
  return reduce(lambda x,y: product(x,y) , pools)

- weiyixie

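In the "Iterative approach", why is result declared as result = [[]]? I know it is a list of lists, but usually, even when declaring a list of lists, we use [] rather than [[]]. - Sachin S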

1

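I am a bit of a newbie when it comes to Pythonic solutions. Could you, or someone passing by, please write the list comprehension in the "Iterative approach" as separate loops? - Johnny Boy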

1

What is pools? Is it the list of lists that I want the product of? - blkpingu

1

Can someone please explain this line: return [xx + [yy] if isinstance(xx, list) else [xx] + [yy] for xx in x for yy in y] - CyTex

There are no thoughts here, just a pile of code. Is this a bogus answer? - Peter Mortensen
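Two of the questions in the comments above (why the seed is `[[]]`, and how the comprehension looks as plain loops) can be explored with a short sketch. The function `cartesian_loops` and its `seed` parameter are hypothetical; they only exist so both seeds can be tried side by side.

```python
def cartesian_loops(pools, seed):
    # Same logic as the list comprehension, written as explicit loops.
    result = seed
    for pool in pools:
        extended = []
        for x in result:          # every partial combination so far
            for y in pool:        # every value of the current list
                extended.append(x + [y])
        result = extended
    return result


pools = [[1, 2], ['a', 'b']]
with_empty_combo = cartesian_loops(pools, [[]])
with_nothing = cartesian_loops(pools, [])

print(with_empty_combo)
print(with_nothing)
```

Starting from `[[]]` gives the loops one empty combination to extend; starting from `[]` gives them nothing to extend, so the result stays empty.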

9

Recursive approach:

def rec_cart(start, array, partial, results):
  if len(partial) == len(array):
    results.append(partial)
    return 

  for element in array[start]:
    rec_cart(start+1, array, partial+[element], results)

rec_res = []
some_lists = [[1, 2, 3], ['a', 'b'], [4, 5]]  
rec_cart(0, some_lists, [], rec_res)
print(rec_res)

Iterative approach:

def itr_cart(array):
  results = [[]]
  for i in range(len(array)):
    temp = []
    for res in results:
      for element in array[i]:
        temp.append(res+[element])
    results = temp

  return results

some_lists = [[1, 2, 3], ['a', 'b'], [4, 5]]  
itr_res = itr_cart(some_lists)
print(itr_res)

- Jai

3

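You can use itertools.product from the standard library to get the Cartesian product. Other useful, related utilities in itertools include permutations, combinations, and combinations_with_replacement. Here is a link to a Python CodePen for the snippet below: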

from itertools import product

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]

result = list(product(*somelists))
print(result)

- chriskoch
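For comparison, the related itertools helpers named in this answer can be tried on a small input. This is a minimal sketch; the `items` list is made up for illustration.

```python
from itertools import combinations, combinations_with_replacement, permutations

# Hypothetical input.
items = ['a', 'b', 'c']

# Ordered pairs without repeating an element.
perms = list(permutations(items, 2))

# Unordered pairs without repeating an element.
combs = list(combinations(items, 2))

# Unordered pairs where an element may be paired with itself.
combs_wr = list(combinations_with_replacement(items, 2))

print(len(perms), len(combs), len(combs_wr))
```

Unlike product, all three helpers work on a single iterable and take the tuple length as their second argument.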

- Triptych · Accepted Answer

633

Use itertools.product, which has been available since Python 2.6.

import itertools

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]
for element in itertools.product(*somelists):
    print(element)

This is the same as:

for element in itertools.product([1, 2, 3], ['a', 'b'], [4, 5]):
    print(element)

- Triptych

35

Just wanted to add that the '*' character is required if you use the variable somelists as provided by the OP. - brian buck

1

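@jaska: product() generates nitems_in_a_list ** nlists elements in the result (reduce(mul, map(len, somelists))). There is no reason to believe that yielding a single element is not O(nlists) (amortized), i.e. the time complexity is the same as for simple nested for loops. For example, for the input in the question, nlists=3, the total number of elements in the result is 3*2*2, and each element has nlists items (3 in this case). - jfs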
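The element count given in the comment above can be checked directly. This sketch uses the `somelists` value from the question and the same reduce(mul, ...) expression; the variable names are only for illustration.

```python
from functools import reduce
from itertools import product
from operator import mul

somelists = [[1, 2, 3], ['a', 'b'], [4, 5]]

# Product of the input lengths: 3 * 2 * 2.
expected_count = reduce(mul, map(len, somelists), 1)

# Count the generated tuples without storing them.
actual_count = sum(1 for _ in product(*somelists))

print(expected_count, actual_count)
```

The initial value 1 passed to reduce makes the expression also work for an empty list of lists, where product yields exactly one empty tuple.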

5

What does the * before somelists mean? What does it do? - Vineet Kumar Doshi

13

@VineetKumarDoshi: Here it is used to unpack a list into multiple arguments for the function call. Read more about it here: https://dev59.com/HHVD5IYBdhLWcg3wQZUg - Moberg

2

Just a detail, but note that itertools.product() can also handle generators, not only list-like objects. - normanius
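The comment above can be illustrated with a generator function and a generator expression as inputs. This is a small sketch; `squares` is a hypothetical helper. Note that itertools.product fully consumes each input before it produces its first tuple, so it is not suitable for infinite generators.

```python
from itertools import product


def squares(n):
    # Hypothetical generator: yields 0, 1, 4, ... for n values.
    for i in range(n):
        yield i * i


# Both arguments are one-shot generators, not lists.
result = list(product(squares(3), (c for c in 'xy')))

print(result)
```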
