寻找按偏好排序的组合生成算法。

Question

寻找按偏好排序的组合生成算法。

4

我正在尝试找到一种算法来识别如下有序组合的结果：

有 N 场比赛，每场比赛有三个互斥的结果（胜利、失败或平局），总共有 3N 个可能的结果和3^N种组合。
每个可能的结果都被分配了一个唯一的排名，最喜欢的结果排名为1，最不喜欢的结果排名为3N。
找出每场比赛的前 M 种结果组合，从包含最喜欢排名结果的组合开始。

例如，假设 N = 3，结果按以下方式排名：

Contest 1 Win = 1
Contest 1 Tie = 4
Contest 1 Loss = 7
Contest 2 Win = 2
Contest 2 Tie = 5
Contest 2 Loss = 8
Contest 3 Win = 3
Contest 3 Tie = 6
Contest 3 Loss = 9

考虑到这些排名，组合应按以下顺序排序:

Contest 1 Win  (1), Contest 2 Win  (2), Contest 3 Win  (3)
Contest 1 Win  (1), Contest 2 Win  (2), Contest 3 Tie  (6)
Contest 1 Win  (1), Contest 2 Win  (2), Contest 3 Loss (9)
Contest 1 Win  (1), Contest 2 Tie  (5), Contest 3 Win  (3)
Contest 1 Win  (1), Contest 2 Loss (8), Contest 3 Win  (3)
Contest 1 Win  (1), Contest 2 Tie  (5), Contest 3 Tie  (6)
Contest 1 Win  (1), Contest 2 Tie  (5), Contest 3 Loss (9)
Contest 1 Win  (1), Contest 2 Loss (8), Contest 3 Win  (6)
Contest 1 Win  (1), Contest 2 Loss (8), Contest 3 Loss (9)
Contest 1 Tie  (4), Contest 2 Win  (2), Contest 3 Win  (3)
Contest 1 Loss (7), Contest 2 Win  (2), Contest 3 Win  (3)
Contest 1 Tie  (4), Contest 2 Win  (2), Contest 3 Tie  (6)
Contest 1 Tie  (4), Contest 2 Win  (2), Contest 3 Loss (9)
Contest 1 Loss (7), Contest 2 Win  (2), Contest 3 Tie  (6)
Contest 1 Loss (7), Contest 2 Win  (2), Contest 3 Loss (9)
Contest 1 Tie  (4), Contest 2 Tie  (5), Contest 3 Win  (3)
Contest 1 Tie  (4), Contest 2 Loss (8), Contest 3 Win  (3)
Contest 1 Loss (7), Contest 2 Tie  (5), Contest 3 Win  (3)
Contest 1 Loss (7), Contest 2 Loss (8), Contest 3 Win  (3)
Contest 1 Tie  (4), Contest 2 Tie  (5), Contest 3 Tie  (6)
Contest 1 Tie  (4), Contest 2 Tie  (5), Contest 3 Loss (9)
Contest 1 Tie  (4), Contest 2 Loss (8), Contest 3 Tie  (6)
Contest 1 Tie  (4), Contest 2 Loss (8), Contest 3 Loss (9)
Contest 1 Loss (7), Contest 2 Tie  (5), Contest 3 Tie  (6)
Contest 1 Loss (7), Contest 2 Tie  (5), Contest 3 Loss (9)
Contest 1 Loss (7), Contest 2 Loss (8), Contest 3 Tie  (6)
Contest 1 Loss (7), Contest 2 Loss (8), Contest 3 Loss (9)

我正在寻找一种方法，可以为任意大的N生成这些组合，尽管我并不希望得到所有的组合。例如，当N = 256且总共有3^256个组合时，我希望找到前500个组合。

- Tim Dean

“7,2,3”不应该在“4,5,3”之前吗？因为第一个数字中最大的是2，而第二个数字中最大的是3，且2 < 3。如果不是这样，我真的不明白你是如何得出输出结果的。 - Bernhard Barker

@Dukeling - 你是正确的。我想现在我弄对了。很明显，我无法像将其放入算法中那样高效地在脑海中解决这个问题 ;) - Tim Dean

@PieterGeerkens - 好的，我可以用三进制表示来模拟，但这怎么帮助我按等级排序组合呢？ - Tim Dean

@TimDean：从0数到3的N次方，然后执行与评分矩阵的点积。然后排序。 - Pieter Geerkens

1

@PieterGeerkens（请阅读问题的最后一句话）生成多达3^256个组合将需要永远的时间，无论你如何高效地完成它。 - Bernhard Barker

显示剩余5条评论

4个回答

1

首先，让我们重新表述问题以抽象出细节并确保我们在谈论同一个问题。

长度为N的元组有3^N个。每个元组(a_1,a_2,...,a_N)的组成部分a_i是介于1到3N之间的不同整数，并且对于固定的i，a_i只能取值于基数为3的子集S_i中。对于[1,3N]中的每个值，元组中只有一个位置可以承担该值。

现在，用自然数排序将元组S的组成部分排序后得到sorted(S)。如果sorted(S)在字典序中小于sorted(T)，则称元组S小于元组T。

问题在于在存在的3^N个元组中按照给定顺序找到前M个元组，其中M << 3^N。

我看到的解决原则本质上是带有修剪的回溯。

为了以关键方式修剪搜索空间，计算不大于M的最高3次幂。假设这个幂是H。我们有3^(H+1) > M >= 3^H。此时，您知道您的M个元组位于一个集合中，其中(N-H-1)个元组成分取最小可能值。可以按以下方式找到并固定这些组件：首先，取可以取值1的组件i，并将其固定为1。然后，在[1,3N]的值中未被组件i占用的值中选择最小值，并将唯一能够取该值的组件j固定为该值。以相同方式继续固定(N-H-1)个组件。之后，您已确定了至多3M个元组的集合。您可以通过对剩余的H+1个组件进行详尽的搜索来生成该完整元组集，并对该集进行排序，从而获得前M个元组。

- naitoon

1

感谢@naitoon - 我真的很喜欢这个解决方案。虽然Claudiu提供的解决方案是一个非常好的通用解决方案，但你的解决方案可能更适合我的特定用途。具体而言，我认为我可以将此解决方案调整为在3^n个可能解的任意范围内查找解决方案。 - Tim Dean

感谢TimDean。也许你可以将@Claudiu的解决方案与这些关于修剪的建议结合起来，以获得一个非常高效的算法，按顺序枚举范围内的所有元组。有关修剪的考虑将确定Claudiu描述的给定范围的枚举的初始和最终状态。 - naitoon

0

您可以在O(N.M)的时间内构建解决方案。这里的解决方案将是一个包含M个排列及其相关分数的列表，按分数降序排列。

如果N为1，则解决方案很简单：根据它们的得分对您获得的3个结果进行排序（如果M＜3，则减少解决方案）。也就是说，如果结果得分为胜利、失败和平局，则解决方案为：

sorted([('W', win), ('L', lose), ('D', draw)])

（当然是按得分降序排序）。

现在是迭代步骤。给定大小为N-1的问题的大小为M的解决方案，然后考虑具有结果得分（胜利、失败、平局）的新比赛。

只需将胜利、失败、平局得分添加到每个M解决方案中，并重新排序即可。也就是说，假设解决方案为[(res_1, score_1), (res_2, score_2), ... (res_M, score_M)]，则：

wins = [(res_1 + 'W', score_1+win), (res_2 + 'W', score_2+win), ... (res_M + 'W', score_M+win)]
loses = [(res_1 + 'L', score_1+lose), (res_2 + 'L', score_2+lose), ... (res_M + 'L', score_M+lose)]
draws = [(res_1 + 'D', score_1+draw), (res_2 + 'D', score_2+draw), ... (res_M + 'D', score_M+draw)]

新的解决方案将是sorted(wins + loses + draws)[:M]，这是来自组合解决方案的前M个最大解决方案。请注意，您可以使用归并排序的合并步骤在O(M)时间内完成此排序步骤。

以下是演示算法的粗略代码。对于紧凑的解决方案，列表切片和字符串附加需要被替换为O(1)的内容，并且需要使用合并步骤而不是一般排序。

def results(outcomes, M):
    if len(outcomes) == 1:
        return sorted(zip(outcomes[0], 'WLD'), reverse=True)[:M]
    tr = results(outcomes[1:], M)
    result = []
    for outscore, outcome in zip(outcomes[0], 'WLD'):
        result.extend((sc + outscore, pr + outcome) for sc, pr in tr)
    return sorted(result, reverse=True)[:M]

# Outcome scores for each contest, in order (win, lose, draw).
testcase = [(1, 7, 4), (2, 8, 5), (3, 9, 6)]
print results(testcase, 5)

输出结果为：

[(24, 'LLL'), (21, 'LLD'), (21, 'LDL'), (21, 'DLL'), (18, 'WLL')]

- Paul Hankin

谢谢您的建议，但这似乎并不符合我的要求。这个方法似乎是按照组合中所有结果的排名总和排序的组合列表。正如我在N=3时展示的组合顺序所示，组合的顺序并不是基于总和的排序。 - Tim Dean

0

GCC 4.7.3：g++ -Wall -Wextra -std=c++0x perm.cpp

#include <algorithm>
#include <cassert>
#include <functional>
#include <iostream>
#include <iterator>
#include <string>

using ints = std::vector<int>;

template <typename T>
bool equal(const std::vector<T>& a, const std::vector<T>& b) {
  return std::equal(std::begin(a), std::end(a), std::begin(b)); }

bool is_strictly_monotonic(const ints& p) {
  return std::is_sorted(std::begin(p), std::end(p), std::less_equal<int>()); }

ints gen_seq(int size, int init) {
  ints v(size);
  for (auto& e : v) { e = ++init; }
  return v; }

ints begin_combination(const ints& p, int mod) {
  return gen_seq(p.size(), -1); }

// Not the same as a normal STL end. This is actually the last combination.
ints end_combination(const ints& p, int mod) {
  return gen_seq(p.size(), mod - p.size()); }

// This is the most complicated bit of code, but what it does is
// straightforward. This function treats the input sequence as the
// (radix m+1) digits in a counter. It increments the counter by one, while
// maintaining the constraint that the digits in the sequence be strictly
// monotonic. This means that some numbers in the regular counter sequence
// will be skipped, for example the sequence after {1, 2, 9} (radix 10)
// is {1, 3, 4}. Like all digital counters it wraps on overflow.
void inc_monotonic_seq(ints& p, const int m) {
  assert(is_strictly_monotonic(p));

  int i = p.size() - 1;
  // scan right to left for number to increment
  while (i != -1) {
    if (p[i] < m) {
      ++p[i];
      ints::size_type j = i + 1;
      // propogate carry left to right
      while (j != p.size()) {
        p[j] = p[j - 1] + 1;
        if (m < p[j]) { break; }
        ++j; }
      if (j == p.size()) { break; } }
    --i; }

  // wrap around
  if (i == -1) { p = begin_combination(p, m); }

  assert(is_strictly_monotonic(p));
}

// A combination is valid if each contest is represented once.
bool is_valid_combination(const ints& p, const ints& contests) {
  auto t(p);
  for (auto& e : t) { e = contests[e]; }
  std::sort(std::begin(t), std::end(t));
  return is_strictly_monotonic(t); }

// This is the second most complicated bit of code. It calculates the
// combination following p in the ordered combination sequence. Combinations
// are ordered lexically by the sort of their elements, for example:
// {6, 1, 2} < {3, 1, 5} because {1, 2, 6} < {1, 3, 5}. Further, it enforces
// the constraint that each digit in the combination must be drawn from the
// contest in that position.
bool next_combination(ints& p, const ints& contests) {
  std::sort(std::begin(p), std::end(p));
  const auto mx = end_combination(p, contests.size() - 1);

  do {
    if (equal(p, mx)) { return false; }
    inc_monotonic_seq(p, contests.size() - 1);
  } while (!is_valid_combination(p, contests));

  // Sort in contest order: contest0, contest1, ...
  for (int i = 0; i < p.size(); ++i) {
    while (i != contests[p[i]]) {
      std::swap(p[i], p[contests[p[i]]]); } }

  return true;
}

int main() {
  const int N = 256; // number of contests
  const int M = 500; // number of most preferably ranked outcomes to display

  // This is the third most complicated bit of code. The following vector is
  // a map from priorities to contests. For example, the following 3 contest
  // win-tie-loss priorities {{0, 3, 6}, {1, 4, 7}, {2, 4, 8}} are stored as
  // {0, 1, 2, 0, 1, 2, 0, 1, 2}. This inversion is possible because the
  // contest outcome priorities are required to be disjoint since there is
  // a total ordering over the outcomes. Note, the highest priority is 0
  // (not 1 as in the specification).
  ints contests(3 * N); // map priorities to contests
  int c  = 0;
  for (auto& e : contests) { e = c % N; ++c; }

  // Highest priority combination.
  ints p(N);
  p = begin_combination(p, contests.size() - 1);

  int total = 1;
  do {
    // Finally, doing some sort of mapping here from priorities to strings,
    // as in the problem specification, is trivially done.
    std::copy(std::begin(p), std::end(p), std::ostream_iterator<int>(std::cout, " "));
    std::cout << "\n";
    if (M < ++total) { break; }
  } while(next_combination(p, contests));

  return 0;
}

一些注释：

如果按照现有写法，使用 N=256 和 M=500 运行代码会发现速度太慢了。我相信有聪明的方法可以避免所有排序操作。
如果你想要 M=500，那么 N 并不需要很大，N=6 就足够了（3^6=729）。如果 N 很大，你会看到一个长的恒定前缀/头部，后面跟着一个短的变化尾部。如果你思考一下，就不难理解其中的原因。

- Adam Burry

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Claudiu · Accepted Answer

这个算法似乎可行。下面是Python实现代码。

基本上，我会按照给定结果的值对输入进行排序：

Contest 1 Win = 1
Contest 2 Win = 2
Contest 3 Win = 3
Contest 1 Tie = 4
Contest 2 Tie = 5
Contest 3 Tie = 6
Contest 1 Loss = 7
Contest 2 Loss = 8
Contest 3 Loss = 9

我称之为“排序”。然后，我生成一个空结果列表：

[None, None, None]

递归算法非常简单，具体如下：

如果所有插槽都已经填满，则打印结果并返回。
否则，按升序迭代未使用的排序。如果排序是未使用过的插槽，则填充插槽、标记排序为已用，并进行递归。否则，继续迭代。

以下是代码。还有一个额外的技巧避免重复，例如我们刚刚填满了排序#6，那么我们只会使用排序#7、8和9。

#rankings as tuple of (winval, tieval, lossval) for each
#contest
rankings = [(1, 4, 7), (2, 5, 8), (3, 6, 9)]

#first sort the rankings by their values into
#list of (contestnum, w/t/l, value)
orderings = []
for i, (w, t, l) in enumerate(rankings):
    orderings.append((i, 'w', w))
    orderings.append((i, 't', t))
    orderings.append((i, 'l', l))
orderings.sort(key=lambda (i,res,val): val)

#now, find solution recursively as follows:
#- if list is full then print result & return
#- else, iterate thru the rankings & recur for each unused slot

def solve(orderings, slots, used_orderings, first_ordering):
    if all(slot is not None for slot in slots):
        yield slots
        return

    i = first_ordering
    while i < len(orderings):
        slot, result, value = orderings[i]

        if used_orderings[i]:
            i += 1
            continue
        if slots[slot] is not None:
            i += 1
            continue

        slots[slot] = (result, value)
        used_orderings[i] = True
        for solution in solve(orderings, slots, used_orderings, i):
            yield solution
        #backtrack
        slots[slot] = None
        used_orderings[i] = False

        i += 1

#print the first 40 solutions
num_solutions = 0
for solution in solve(orderings, [None]*len(rankings), [False]*len(orderings), 0):
    print "Solution #%d: %s" % (num_solutions+1, solution)
    num_solutions += 1
    if num_solutions >= 40:
        break

以下是给定输入所打印的结果，与问题相匹配：

Solution #1: [('w', 1), ('w', 2), ('w', 3)]
Solution #2: [('w', 1), ('w', 2), ('t', 6)]
Solution #3: [('w', 1), ('w', 2), ('l', 9)]
Solution #4: [('w', 1), ('t', 5), ('w', 3)]
Solution #5: [('w', 1), ('l', 8), ('w', 3)]
Solution #6: [('w', 1), ('t', 5), ('t', 6)]
Solution #7: [('w', 1), ('t', 5), ('l', 9)]
Solution #8: [('w', 1), ('l', 8), ('t', 6)]
Solution #9: [('w', 1), ('l', 8), ('l', 9)]
Solution #10: [('t', 4), ('w', 2), ('w', 3)]
Solution #11: [('l', 7), ('w', 2), ('w', 3)]
Solution #12: [('t', 4), ('w', 2), ('t', 6)]
Solution #13: [('t', 4), ('w', 2), ('l', 9)]
Solution #14: [('l', 7), ('w', 2), ('t', 6)]
Solution #15: [('l', 7), ('w', 2), ('l', 9)]
Solution #16: [('t', 4), ('t', 5), ('w', 3)]
Solution #17: [('t', 4), ('l', 8), ('w', 3)]
Solution #18: [('l', 7), ('t', 5), ('w', 3)]
Solution #19: [('l', 7), ('l', 8), ('w', 3)]
Solution #20: [('t', 4), ('t', 5), ('t', 6)]
Solution #21: [('t', 4), ('t', 5), ('l', 9)]
Solution #22: [('t', 4), ('l', 8), ('t', 6)]
Solution #23: [('t', 4), ('l', 8), ('l', 9)]
Solution #24: [('l', 7), ('t', 5), ('t', 6)]
Solution #25: [('l', 7), ('t', 5), ('l', 9)]
Solution #26: [('l', 7), ('l', 8), ('t', 6)]
Solution #27: [('l', 7), ('l', 8), ('l', 9)]

如果我随机生成了256个比赛的排名，似乎程序立即运行。