寻找给定排列的索引

Question

寻找给定排列的索引

algorithmpermutationspacecombinatorics

10

我正在按某种顺序逐个阅读数字 0，1，...，(N - 1)。我的目标是找到给定排列的字典序索引，仅使用O(1)空间。

此问题以前已被提出，但我找到的所有算法都使用了 O(N)空间。我开始认为这是不可能的。但如果能减少分配数量，这将对我非常有帮助。

- Mugen

“给定排列的词典序指的是什么？” - Cratylus

你知道哪些O(n)算法？你确定它们不适合或者很容易修改使其适合吗？ - hugomg

1

排列p的索引定义为所有下标j的总和，使得p_j>p_(j+1)，其中1<=j<=n。这是您所指的吗？您的帖子过于简洁，留下了太多问题。还请包括您所指的解决方案的链接。 - kasavbere

1

@Mugen 我已经添加了步骤说明和伪代码来帮助这个过程。 - Rubens

1

@j_random_hacker，我研究了OP的使用方式，显然这是可接受的用法。我在stackoverflow之外找到了一个实现方法：http://www.geekviewpoint.com/java/numbers/permutation_index. - kasavbere

显示剩余3条评论

7个回答

3

如果您正在寻找一种获取唯一组合的词典索引或排名而不是排列的方法，则您的问题属于二项式系数。二项式系数处理在N个项目中选择K个唯一组合的问题。

我已经用C#编写了一个类来处理与二项式系数相关的常见函数。它执行以下任务：

以漂亮的格式将任何N选K的所有K索引输出到文件中。 K索引可以替换为更具描述性的字符串或字母。
将K索引转换为排序二项式系数表中条目的正确词典序索引或排名。这种技术比依赖迭代的旧技术快得多。它通过使用Pascal三角形固有的数学属性来实现，与迭代整个集合相比非常高效。
将排序的二项式系数表中的索引转换为相应的K索引。我认为它也比旧的迭代解决方案更快。
使用Mark Dominus方法计算二项式系数，这样不太可能溢出，并且可以处理更大的数字。
该类是用.NET C#编写的，并提供了一种使用通用列表来管理与问题相关的对象的方法（如果有）。此类的构造函数接受一个名为InitTable的bool值，当为true时，将创建一个通用列表来保存要管理的对象。如果此值为false，则不会创建该表。不需要创建表格即可使用上述4种方法。提供了访问器方法来访问表格。
有一个相关的测试类，展示了如何使用该类及其方法。它已经通过了2个案例的广泛测试，没有已知的错误。

要阅读关于这个类的内容并下载代码，请参见将二项式系数制表化。

以下经过测试的代码将迭代每个唯一组合：

public void Test10Choose5()
{
   String S;
   int Loop;
   int N = 10;  // Total number of elements in the set.
   int K = 5;  // Total number of elements in each group.
   // Create the bin coeff object required to get all
   // the combos for this N choose K combination.
   BinCoeff<int> BC = new BinCoeff<int>(N, K, false);
   int NumCombos = BinCoeff<int>.GetBinCoeff(N, K);
   // The Kindexes array specifies the indexes for a lexigraphic element.
   int[] KIndexes = new int[K];
   StringBuilder SB = new StringBuilder();
   // Loop thru all the combinations for this N choose K case.
   for (int Combo = 0; Combo < NumCombos; Combo++)
   {
      // Get the k-indexes for this combination.  
      BC.GetKIndexes(Combo, KIndexes);
      // Verify that the Kindexes returned can be used to retrive the
      // rank or lexigraphic order of the KIndexes in the table.
      int Val = BC.GetIndex(true, KIndexes);
      if (Val != Combo)
      {
         S = "Val of " + Val.ToString() + " != Combo Value of " + Combo.ToString();
         Console.WriteLine(S);
      }
      SB.Remove(0, SB.Length);
      for (Loop = 0; Loop < K; Loop++)
      {
         SB.Append(KIndexes[Loop].ToString());
         if (Loop < K - 1)
            SB.Append(" ");
      }
      S = "KIndexes = " + SB.ToString();
      Console.WriteLine(S);
   }
}

您应该能够轻松地将此类移植到您选择的语言中。为了实现您的目标，您可能不需要移植类的通用部分。根据您处理的组合数量，您可能需要使用比4字节整数更大的字长。

- Bob Bryan

您的方法能否用于这个问题？提前感谢您的帮助。https://math.stackexchange.com/questions/4346937/index-of-a-permutation-that-has-repetition?noredirect=1#comment9072818_4346937 - Eftekhari

1

@Eftekhari 看起来那个问题是关于排列而不是组合的。组合基于二项式定理。区别在于组合不允许重复，但对于排列来说这是可以的。例如，一个组合可能是(2,1,0)。而一个排列可能是(1,1,1)。因此，我的类不适用于排列。但是，我不明白为什么不能使用简单的公式来计算已知维数和它们各自长度的任何给定排列的秩。 - Bob Bryan

你能想到任何编程语言的答案吗？ - Eftekhari

2

在geekviewpoint上有一个解决这个问题的Java方案。它对为什么是正确的进行了很好的解释，代码易于理解。 http://www.geekviewpoint.com/java/numbers/permutation_index。它还有一个单元测试，可以使用不同的输入运行代码。

- kasavbere

1

正是我想要的！然而，没有描述反向操作（从索引中检索排列）。 - Lucas Sousa

0

这个想法并没有什么新意，但是使用Numpy实现了一个完全矩阵化的方法，没有显式的循环或递归（易于调整）：

import numpy as np
import math
vfact = np.vectorize(math.factorial, otypes='O')

def perm_index(p):
    return np.dot( vfact(range(len(p)-1, -1, -1)),
                   p-np.sum(np.triu(p>np.vstack(p)), axis=0) )

- Thomas Baruchel

0

如果您想假设算术运算是恒定时间，这里有一种方法：

def permutationIndex(numbers):
  n=len(numbers)
  result=0
  j=0
  while j<n:
    # Determine factor, which is the number of possible permutations of
    # the remaining digits.
    i=1
    factor=1
    while i<n-j:
      factor*=i
      i+=1
    i=0
    # Determine index, which is how many previous digits there were at
    # the current position.
    index=numbers[j]
    while i<j:
      # Only the digits that weren't used so far are valid choices, so
      # the index gets reduced if the number at the current position
      # is greater than one of the previous digits.
      if numbers[i]<numbers[j]:
        index-=1
      i+=1
    # Update the result.
    result+=index*factor
    j+=1
  return result

我特意写出了某些计算，这些计算可以使用一些Python内置操作更简单地完成，但我想让它更明显，即没有额外的非常量空间被使用。

正如maxim1000所指出的那样，表示结果所需的位数随着n的增加而快速增长，因此最终需要大整数，这些大整数不再具有恒定时间算术，但我认为这段代码解决了你问题的本质。

- Vaughn Cato

0

有N!个排列。为了表示索引，您至少需要N位。

- maxim1000

2

不，你需要至少 ceil(log2(N!)) 位。 - Todd Lehman

0

我刚刚使用Visual Basic编写了一段代码，我的程序可以直接计算给定索引的每个指数或每个相应排列，最多可达17个元素（这个限制是由于我的编译器对17！以上数字的科学记数法的近似）。

如果您有兴趣，我可以发送程序或在某个地方发布以供下载。它运行良好，可以用于测试和比较您的代码输出。

我使用了James D. McCaffrey的方法，称为factoradic，您可以在这里阅读相关内容，还有这里（页面末尾的讨论部分）。

- NP2P

这是一个免费程序的页面链接：http://lotterie.xoom.it/virgiliowizard/factorindex-1-0-english。 - NP2P

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Rubens · Accepted Answer

考虑以下数据：

chars = [a, b, c, d]
perm = [c, d, a, b]
ids = get_indexes(perm, chars) = [2, 3, 0, 1]

重复排列的可能解决方案如下：

len = length(perm)         (len = 4)
num_chars = length(chars)  (len = 4)

base = num_chars ^ len     (base = 4 ^ 4 = 256)
base = base / len          (base = 256 / 4 = 64)

id = base * ids[0]         (id = 64 * 2 = 128)
base = base / len          (base = 64 / 4 = 16)

id = id + (base * ids[1])  (id = 128 + (16 * 3) = 176)
base = base / len          (base = 16 / 4 = 4)

id = id + (base * ids[2])  (id = 176 + (4 * 0) = 176)
base = base / len          (base = 4 / 4 = 1)

id = id + (base * ids[3])  (id = 176 + (1 * 1) = 177)

反向过程：

id = 177
(id / (4 ^ 3)) % 4 = (177 / 64) % 4 =   2 % 4 = 2 -> chars[2] -> c
(id / (4 ^ 2)) % 4 = (177 / 16) % 4 =  11 % 4 = 3 -> chars[3] -> d
(id / (4 ^ 1)) % 4 = (177 / 4)  % 4 =  44 % 4 = 0 -> chars[0] -> a
(id / (4 ^ 0)) % 4 = (177 / 1)  % 4 = 177 % 4 = 1 -> chars[1] -> b

可能的排列数由num_chars ^ num_perm_digits给出，其中num_chars是可能字符的数量，num_perm_digits是排列中数字的数量。

这需要O(1)的空间，将初始列表视为固定成本；它需要O(N)时间，其中N是您的排列将具有的数字数。

根据上述步骤，您可以执行以下操作：

function identify_permutation(perm, chars) {

    for (i = 0; i < length(perm); i++) {
        ids[i] = get_index(perm[i], chars);
    }

    len = length(perm);
    num_chars = length(chars);

    index = 0;
    base = num_chars ^ len - 1;
    base = base / len;
    for (i = 0; i < length(perm); i++) {
        index += base * ids[i];
        base = base / len;
    }

}

这是一个伪代码，但也很容易转换为任何语言（: