Generator that produces combinations in a specific order

Question


4

I have a recursive generator that produces every combination of numbers in the range 0 to top-1. Here is the code:

def f(width, top):
  if width == 0:
    yield []
  else:
    for v in range(top):
      for subResult in f(width - 1, top):
        yield [ v ] + subResult

Called as f(3, 3), it yields the following values:

[0, 0, 0], [0, 0, 1], [0, 0, 2], [0, 1, 0], [0, 1, 1], [0, 1, 2],
[0, 2, 0], [0, 2, 1], [0, 2, 2], [1, 0, 0], [1, 0, 1], [1, 0, 2],
[1, 1, 0], [1, 1, 1], [1, 1, 2], [1, 2, 0], [1, 2, 1], [1, 2, 2],
[2, 0, 0], [2, 0, 1], [2, 0, 2], [2, 1, 0], [2, 1, 1], [2, 1, 2],
[2, 2, 0], [2, 2, 1], [2, 2, 2]

(Try calling it as list(f(3,3)) to get these as a list.)

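What I need is the same values, but in a different order: I would like them sorted by their maximum value, i.e. first the value [0, 0, 0], then all values that have 1 as their maximum, i.e. [0, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0], ..., then those containing a 2, i.e. [0, 0, 2], [0, 1, 2], [0, 2, 0], [0, 2, 1], [0, 2, 2], [2, 0, 0], ..., and so on.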

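The generator should (of course) never yield the same value twice, and it must be possible to call it with very large arguments such as f(4, 1000) and then simply not drain it completely (so generating all values first and then sorting them by their maximum is out of the question).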

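The only approach I could think of is to first generate all values for f(w, 0), then for f(w, 1), then for f(w, 2), always skipping the values that have already been yielded, but I have a nagging feeling that there might be a better way: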

def g(width, top):
  for t in range(top):
    for v in f(width, t+1):
      if t in v:
        yield v
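
A minimal sketch of how this skip-based g can be consumed lazily, and of how much work it discards along the way. f and g are repeated so the snippet runs on its own; generated_vs_yielded is a hypothetical helper name introduced here only for illustration.

```python
from itertools import islice

def f(width, top):
    # all lists of length `width` over 0..top-1, in lexicographic order
    if width == 0:
        yield []
    else:
        for v in range(top):
            for sub_result in f(width - 1, top):
                yield [v] + sub_result

def g(width, top):
    # for each maximum t, walk f(width, t+1) and keep only the lists containing t
    for t in range(top):
        for v in f(width, t + 1):
            if t in v:
                yield v

def generated_vs_yielded(width, top):
    # hypothetical helper: g walks 1**w + 2**w + ... + top**w candidates
    # in order to yield top**w results
    generated = sum(t ** width for t in range(1, top + 1))
    return generated, top ** width

# Only the first few values are computed; the generator is never drained.
first_five = list(islice(g(4, 1000), 5))
print(first_five)
print(generated_vs_yielded(3, 3))
```

The skipping costs extra work, but the generator stays lazy, so a call such as g(4, 1000) can be abandoned at any point.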

Any ideas?

- Alfe

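Do you have a preferred order for two lists with the same maximum? - Apiwat Chantawibul

Honestly, your implementation of g is pretty much how I would do it. There are ways to avoid the skipping, but the added complexity may not be worth it. - Ilmari Karonen

A next-permutation version has been coded (see the answers). - גלעד ברקן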

6 Answers

2

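I found a solution myself. I iterate over the top value first and then generate all values that contain this top value one or more times. To do so, I loop over the number of occurrences of the top value (from 1 to width). For each such count, I loop over all combinations of positions these top values can occupy. I then fill those positions with the top value and fill the remaining positions from a plain product of the values below the top value.

The code: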

from itertools import product, combinations

def h(width, top):
  for t in range(top):
    for topAmount in range(1, width+1):  # how many top values are present?
      for topPositions in combinations(range(width), topAmount):
        for fillers in product(
            *[ range(t) for x in range(width-len(topPositions)) ]):
          fillers = list(fillers)
          yield [ t if i in topPositions else fillers.pop()
              for i in range(width) ]
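
As a hedged sanity check of the construction described above: the number of lists of length width whose maximum is exactly t should be (t+1)^width - t^width, and the position-based construction counts the same lists as the sum over k of C(width, k) * t^(width-k). The sketch below repeats h with snake_case names and compares its output against both counts. check_h is a hypothetical helper name, and math.comb assumes Python 3.8 or later.

```python
from itertools import combinations, product
from math import comb

def h(width, top):
    for t in range(top):
        for top_amount in range(1, width + 1):  # how many top values are present?
            for top_positions in combinations(range(width), top_amount):
                for fillers in product(range(t), repeat=width - top_amount):
                    fillers = list(fillers)
                    yield [t if i in top_positions else fillers.pop()
                           for i in range(width)]

def check_h(width, top):
    # hypothetical helper: no duplicates, nothing missing, grouped by maximum
    values = [tuple(v) for v in h(width, top)]
    assert len(values) == len(set(values)) == top ** width
    maxes = [max(v) for v in values]
    assert maxes == sorted(maxes)
    for t in range(top):
        by_positions = sum(comb(width, k) * t ** (width - k)
                           for k in range(1, width + 1))
        assert maxes.count(t) == by_positions == (t + 1) ** width - t ** width
    return True

print(check_h(3, 3))
```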

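But I am still hoping you can come up with something more elegant. To me this still feels like a brute-force approach, and the way I build the yielded values is certainly not the cheapest I have seen.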

- Alfe

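Using combinations makes the code more compact; otherwise this solution is very similar to my idea. - Apiwat Chantawibul

Right. I did not refresh before posting, so I had not seen your answer when I posted. The more compact version I have come up with now is better, but the idea behind our solutions is the same :-) (so when accepting an answer, I will always prefer yours). - Alfe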

1

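Here is an algorithm that generates the next permutation in lexicographic order (by the way, I also like to think of each set as a number in a different base, e.g. base 1, base 2, and so on):

While not all digits are at the maximum, increment the digits to the right of the leftmost maximum as follows: increment the rightmost digit that is not at the maximum and set all digits to its right to zero. If those digits are all at the maximum already, increment the first digit to the left of the leftmost maximum. If that digit has now reached the maximum, set all digits to its right to zero; otherwise, set the rightmost digit to the maximum and the digits in between to zero.

Python code: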

def nextP(perm,top):
  if all (i == top for i in perm):
    return None

  left_max = perm.index(top)

  if all (i == top for i in perm[left_max:]):
    perm[left_max - 1] = perm[left_max - 1] + 1
    perm[left_max:] = [0] * (len(perm) - left_max - 1) + ([0] if perm[left_max - 1] == top else [top])
  else:
    right_max = len(perm) - next(x[0] for x in enumerate(perm[left_max + 1:][::-1]) if x[1] < top) - 1
    perm = perm[:right_max] + [perm[right_max] + 1] + [0] * (len(perm) - right_max - 1)

  return perm

Example:

permutation = [0,0,2]

while permutation:
  print(permutation)
  permutation = nextP(permutation,2)

[0, 0, 2]
[0, 1, 2]
[0, 2, 0]
[0, 2, 1]
[0, 2, 2]
[1, 0, 2]
[1, 1, 2]
[1, 2, 0]
[1, 2, 1]
[1, 2, 2]
[2, 0, 0]
[2, 0, 1]
[2, 0, 2]
[2, 1, 0]
[2, 1, 1]
[2, 1, 2]
[2, 2, 0]
[2, 2, 1]
[2, 2, 2]
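
The example above covers a single maximum. A sketch of how the same successor rule could be chained over every maximum to obtain the full generator: next_p restates nextP without mutating its argument, and by_max is a hypothetical wrapper name. It assumes width is at least 1.

```python
def next_p(perm, top):
    """Successor of `perm` among the lists whose maximum is `top`, or None at the end.

    Same rule as nextP, restated so the input list is never mutated.
    Assumes `top` occurs in `perm`.
    """
    if all(d == top for d in perm):
        return None
    n = len(perm)
    left_max = perm.index(top)
    if all(d == top for d in perm[left_max:]):
        # everything from the leftmost maximum onward is maxed:
        # bump the digit just before it
        bumped = perm[left_max - 1] + 1
        last = 0 if bumped == top else top  # keep at least one `top` in the list
        return perm[:left_max - 1] + [bumped] + [0] * (n - left_max - 1) + [last]
    # otherwise increment the rightmost non-maxed digit right of the leftmost maximum
    offset = next(i for i, d in enumerate(reversed(perm[left_max + 1:])) if d < top)
    right = n - offset - 1
    return perm[:right] + [perm[right] + 1] + [0] * (n - right - 1)

def by_max(width, top):
    # hypothetical wrapper: start each group at its smallest member [0, ..., 0, t]
    for t in range(top):
        perm = [0] * (width - 1) + [t]
        while perm is not None:
            yield perm
            perm = next_p(perm, t)

for value in by_max(2, 3):
    print(value)
```

Each value is derived from the previous one only, so nothing is generated and then skipped.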

- גלעד ברקן

1

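First note that you can easily generate the list of unique solutions whose maximum is 2 from the list of unique solutions whose maximum is 1: just increment every possible combination of the 1s. For example, from [1,0,1] you simply generate [2,0,1], [1,0,2] and [2,0,2]. This suggests the following solution: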

import itertools

def g(n) :
    if n == 0 :
        yield [ 0,0,0 ]
    else :
        for x in g(n-1) : # for each solution containing `1` as the maximum
            idx = [ i for (i,xi) in enumerate(x) if xi == n-1 ] # locate the '1' to be incremented
            for j in range(1,len(idx)+1) : # increment one '1', then two '1', then three '1', etc
                for tup in itertools.combinations( idx, j ) : # all possible combinations of j '1'
                    y = list(x)
                    for t in tup : # prepare the new solution
                        y[t] += 1
                    yield y

Example:

list( g(0) )

[[0, 0, 0]]

list( g(1) )

[[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]

list( g(2) )

[[2, 0, 0],
 [0, 2, 0],
 [0, 0, 2],
 [2, 1, 0],
 [1, 2, 0],
 [2, 2, 0],
 [2, 0, 1],
 [1, 0, 2],
 [2, 0, 2],
 [0, 2, 1],
 [0, 1, 2],
 [0, 2, 2],
 [2, 1, 1],
 [1, 2, 1],
 [1, 1, 2],
 [2, 2, 1],
 [2, 1, 2],
 [1, 2, 2],
 [2, 2, 2]]
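
The function above is hard-coded to lists of length 3 and returns one maximum per call. A sketch of how it might be generalized: g_width takes the width as a parameter, and g_all chains the groups for every maximum below top. Both names are hypothetical and not part of the answer.

```python
from itertools import combinations

def g_width(width, n):
    # lists of length `width` whose maximum is exactly n
    if n == 0:
        yield [0] * width
        return
    for x in g_width(width, n - 1):  # each solution with maximum n-1
        idx = [i for i, xi in enumerate(x) if xi == n - 1]  # positions holding n-1
        for j in range(1, len(idx) + 1):  # raise one of them, then two, ...
            for tup in combinations(idx, j):
                y = list(x)
                for t in tup:
                    y[t] += 1
                yield y

def g_all(width, top):
    # hypothetical driver: every list over 0..top-1, grouped by maximum
    for n in range(top):
        yield from g_width(width, n)

print(list(g_width(2, 1)))
```

Every list with maximum n has exactly one parent with maximum n-1 (replace each n by n-1), which is why no duplicates appear.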

- usual me

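I really like this approach! Especially its recursive nature. It also mixes combinations with products (even if not explicitly), which is another hint that this may be the most efficient way to solve the problem! :-) - Alfe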

1

The idea of a growing cube

(Updated from the "diagonal" idea)

When I drew this task on paper, I came up with the following:

 |0|1|2|3|
-|-|-|-|-|
0|a|b|c|d|
-|-|-|-|-|
1|b|b|c|d|
-|-|-|-|-|
2|c|c|c|d|
-|-|-|-|-|
3|d|d|d|d|
-|-|-|-|-|

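It shows only 2D; in reality there are as many dimensions as there are digits.

The letters a, b, c, d denote the groups in which you want to get the combinations.

What I mean is that these groups form the corner surfaces of a growing n-dimensional cube.

All combinations are represented by the coordinates of all points in this cube (including the interior). Note that our coordinates take discrete values (0, 1, 2, ...), so their number is finite.

If you find a rule for scanning all coordinates on the surface of this growing cube, you have the generator you are looking for.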
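
One possible scanning rule, offered only as a sketch and not as part of the idea above: a point lies on the surface for value t exactly when its largest coordinate is t. Fixing the first position that holds t, every coordinate before it must be below t and every coordinate after it may be anything up to t. The names shell and cube are hypothetical.

```python
from itertools import product

def shell(width, t):
    # all points whose maximum coordinate is exactly t
    for first in range(width):  # first position that holds t
        for left in product(range(t), repeat=first):  # strictly below t
            for right in product(range(t + 1), repeat=width - first - 1):  # up to t
                yield list(left) + [t] + list(right)

def cube(width, top):
    # grow the cube one shell at a time
    for t in range(top):
        yield from shell(width, t)

for point in cube(2, 3):
    print(point)
```

Each point is produced once because its first occurrence of t is unique, so no value has to be skipped. The order inside a shell differs from plain lexicographic order.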

- Jan Vlcinsky

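Sounds promising. Alas, this mere idea still lacks the detail I would need to understand your approach well enough for it to be useful :) - Alfe

I mean: yes, of course, that is the order in which I want the values (in n-dimensional space), but can you provide a nicer algorithm for producing them (nicer than my g)? - Alfe

On second thought, that diagonal is not the order I want. You put (1,1) in group _c_ together with (0,2) and (2,0), but it should be in group _b_ with (1,0) and (0,1). So we need a square (cube) shape, one square containing the other. With that correction it is still a nice idea, and maybe it leads to a better solution. - Alfe

Based on your graphical approach I found a solution :) See my answer (coming soon). - Alfe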

1

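@Alfe Looking forward to it. I have reached the limit of my abilities for today, but I am curious what comes next. By the way, you can guess what my favorite book is. - Jan Vlcinsky

Visualization often helps, yes, and I always prefer those approaches. But I also find that choosing this way sometimes makes one too reluctant to like the problems that do not fit it ;-) - Alfe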

1

I believe your function f produces the same values as itertools.product; that is, I think you can replace f with the following code:

from itertools import product

def f(width, top):
    for p in product(range(top), repeat=width):
        yield list(p)

To order the values as described in your question, you can simply use itertools.groupby:

from itertools import groupby
from collections import defaultdict

def group_by_max_value(x, y):
    grouped = defaultdict(list)
    for k, g in groupby(f(x, y), key=max):
        grouped[k].extend(list(g))
    return [grouped[k] for k in sorted(grouped.keys())]

Modified function definition, so that the sorted values can be yielded without first generating the entire sequence:

from itertools import groupby, product
from collections import defaultdict

def lazy_group_by_max_value(width, top):
    grouped = defaultdict(list)
    # using `itertools.product` with a `range` object
    # guarantees that the product-tuples are emitted
    # in sorted order.
    ps = product(range(top), repeat=width)
    for k, g in groupby(ps, key=max):
        xs = list(g)
        grouped[k].extend(xs)
        # if xs[-1] is of the form (0, 0, .., 0), (1, 1, .., 1), .., (n, n, .., n) etc
        # then we have found all the maxes for `k`, because all future
        # sequences will contain at least one value which is greater than k.
        if set(xs[-1]) == {k}:
            # `pop` (ie. remove) the values from `grouped`
            # which are associated with key `k`.
            all_maxes_for_k = grouped.pop(k)
            for coll in all_maxes_for_k:
                yield coll
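
A rough sketch of how far the underlying product has to advance before the function above can release the group for maximum k. The group is released once (k, ..., k) appears, and in lexicographic order that tuple sits at the position computed below. flush_position is a hypothetical helper name, not part of the answer.

```python
from itertools import product

def flush_position(width, top, k):
    # 1-based position of (k, ..., k) in product(range(top), repeat=width):
    # read as a base-`top` number, the tuple equals k * (1 + top + ... + top**(width-1))
    return k * sum(top ** i for i in range(width)) + 1

# brute-force cross-check on a small case
small = list(product(range(3), repeat=3))
print(small.index((1, 1, 1)) + 1, flush_position(3, 3, 1))

# the large call mentioned in the question
print(flush_position(4, 1000, 1))
```

Under this estimate the buffered approach keeps its output order, but for large arguments it has to consume and hold a large part of the product before the second group appears.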

- superjump

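@superjump The OP does not want to sort; your sorted and max are doing sorting. - Jan Vlcinsky

@JanVlcinsky The OP said: "I would like them sorted by their maximum value." Am I misunderstanding? - superjump

@superjump Yes, sorting by maximum is one requirement, but there is also the requirement that generating all values first and then sorting them by their maximum is out of the question. - Jan Vlcinsky

I also said that I cannot collect all values first and then apply the appropriate sorting. I only stated the "sort order" to make clear in which order I want them yielded. - Alfe

@Alfe, does the modified function do what you want? - superjump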

- Apiwat Chantawibul · Accepted Answer

def h(width,top,top_count):
    """
    Producing lists of length 'width' containing numbers from 0 to top-1.
    Where top-1 only occur exactly top_count times.
    """
    if width == 0:
        yield []
    elif width == top_count:
        yield [top-1]*top_count
    else:
        for x in range(top-1):
            for result in h(width-1,top,top_count):
                yield [x]+result
        if top_count > 0:
            for result in h(width-1,top,top_count-1):
                yield [top-1]+result


def m(width,top):
    yield [0]*width
    for current_top in range(2,top+1):
        for top_count in range(1,width+1):
            print("=== h{}".format((width,current_top,top_count)))
            for result in h(width,current_top,top_count):
                print(result)
                yield result

ans = [x for x in m(3,3)]

Result:

=== h(3, 2, 1)
[0, 0, 1]
[0, 1, 0]
[1, 0, 0]
=== h(3, 2, 2)
[0, 1, 1]
[1, 0, 1]
[1, 1, 0]
=== h(3, 2, 3)
[1, 1, 1]
=== h(3, 3, 1)
[0, 0, 2]
[0, 1, 2]
[0, 2, 0]
[0, 2, 1]
[1, 0, 2]
[1, 1, 2]
[1, 2, 0]
[1, 2, 1]
[2, 0, 0]
[2, 0, 1]
[2, 1, 0]
[2, 1, 1]
=== h(3, 3, 2)
[0, 2, 2]
[1, 2, 2]
[2, 0, 2]
[2, 1, 2]
[2, 2, 0]
[2, 2, 1]
=== h(3, 3, 3)
[2, 2, 2]

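The print calls were added to show each call to function h and its results.

The docstring of function h should explain the general idea clearly enough.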
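
A minimal sketch of the same generator without the tracing output, together with a count check for each h call. m_quiet and expected_count are hypothetical names, and math.comb assumes Python 3.8 or later. The count follows from choosing the positions of top-1 and filling the remaining positions with values below top-1.

```python
from math import comb

def h(width, top, top_count):
    """Lists of length `width` over 0..top-1 in which top-1 occurs exactly top_count times."""
    if width == 0:
        yield []
    elif width == top_count:
        yield [top - 1] * top_count
    else:
        for x in range(top - 1):
            for result in h(width - 1, top, top_count):
                yield [x] + result
        if top_count > 0:
            for result in h(width - 1, top, top_count - 1):
                yield [top - 1] + result

def m_quiet(width, top):
    # same traversal as m, without the print calls
    yield [0] * width
    for current_top in range(2, top + 1):
        for top_count in range(1, width + 1):
            yield from h(width, current_top, top_count)

def expected_count(width, top, top_count):
    # choose where top-1 goes, then fill the rest with values below top-1
    return comb(width, top_count) * (top - 1) ** (width - top_count)

print(len(list(h(3, 3, 1))), expected_count(3, 3, 1))
```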