Algorithm to apply a permutation in constant memory space

Question

arrays algorithm permutation in-place

21

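I saw this question in a programming interview book; I am simplifying it here.

Assume you have an array A of length n, and a permutation array P of length n as well. Your method returns an array in which the elements of A appear in the order given by the indices in P.

Quick example: your method takes A = [a, b, c, d, e] and P = [4, 3, 2, 0, 1]. It then returns [e, d, c, a, b]. You may use only constant space (i.e. you cannot allocate another array, which would take O(n) space).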
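To pin down the convention used in the example, here is a minimal reference sketch. It deliberately breaks the space constraint (it builds a new list), so it is not a solution, only a statement of the expected result: position i of the output holds A[P[i]]. The function name is made up for illustration.

```python
def apply_permutation_copy(A, P):
    """Reference behaviour only: builds a new list, so it uses O(n) extra space."""
    return [A[i] for i in P]


print(apply_permutation_copy(list("abcde"), [4, 3, 2, 0, 1]))
# ['e', 'd', 'c', 'a', 'b']
```

An in-place solution can be checked against this reference on small inputs.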

Any ideas?

- ahmet alp balkan

How do you use constant space? Your input arrays already take O(n) space, which is not constant. - Patrick Kostjens

6

You can reuse the input arrays. In effect, the problem asks to apply the permutation "in place". - Sergey Kalinichenko

2

The return value given in the original question is wrong. A = [a b c d e] P = [4 3 2 0 1] should return [d e c b a], not [e d c a b] as stated in the question. - Pradeep Krishnaraj
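The two results in this comment and in the question correspond to two readings of P. A small sketch, assuming the comment means "the element at index i moves to index P[i]" (the helper names are hypothetical, and both helpers use O(n) space purely for illustration):

```python
def scatter(A, P):
    """Alternative reading: the element at index i moves to index P[i]."""
    result = [None] * len(A)
    for i, target in enumerate(P):
        result[target] = A[i]
    return result


def inverse(P):
    """Inverse permutation: inverse(P)[P[i]] == i."""
    inv = [0] * len(P)
    for i, target in enumerate(P):
        inv[target] = i
    return inv


A = list("abcde")
P = [4, 3, 2, 0, 1]
print(scatter(A, P))               # ['d', 'e', 'c', 'b', 'a']
print([A[i] for i in inverse(P)])  # same list, read through the inverse of P
```

Under this reading the comment's result is right; under the question's reading (output position i takes A[P[i]]) the question's result is right. The two differ only by inverting P.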

1

Related question, from a TCS perspective: https://cstheory.stackexchange.com/questions/6711/complexity-of-applying-a-permutation-in-place - Dan Stahlke

9 Answers

8

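Yet another unnecessary answer! This one explicitly preserves the permutation array P, which was necessary in my situation, but it pays for that in running time. It also does not need to track which elements are already correctly placed. I understand that a previous answer gives an O(N) solution, so I guess this one is just for amusement!

We get best-case complexity O(N), worst case O(N^2), and average case O(NlogN). For large arrays (N~10000 or greater), the average case is essentially O(N).

Here is the core algorithm in Java (I mean pseudo-code *cough cough*):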

int ind=0;
float temp=0;

for(int i=0; i<(n-1); i++){
  // get next index
  ind = P[i];
  while(ind<i)
    ind = P[ind];

  // swap elements in array
  temp = A[i];
  A[i] = A[ind];
  A[ind] = temp;
}

Here is an example run of the algorithm (similar to previous answers):

Suppose A = [a, b, c, d, e]

and P = [2, 4, 3, 0, 1]

then the expected result is [c, e, d, a, b]

i=0:  [a, b, c, d, e] // (ind=P[0]=2)>=0 no while loop, swap A[0]<->A[2]
       ^     ^
i=1:  [c, b, a, d, e] // (ind=P[1]=4)>=1 no while loop, swap A[1]<->A[4]
          ^        ^
i=2:  [c, e, a, d, b] // (ind=P[2]=3)>=2 no while loop, swap A[2]<->A[3]
             ^  ^
i=3a: [c, e, d, a, b] // (ind=P[3]=0)<3 uh-oh! enter while loop...
                ^
i=3b: [c, e, d, a, b] // loop iteration: ind<-P[0]. now have (ind=2)<3
       ?        ^
i=3c: [c, e, d, a, b] // loop iteration: ind<-P[2]. now have (ind=3)>=3
             ?  ^
i=3d: [c, e, d, a, b] // good index found. Swap A[3]<->A[3]
                ^
done.
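For readers who want to run the trace, here is a direct Python port of the loop above. It is a sketch under the same convention as the example (output position i takes A[P[i]]); the function name is made up, and P is only read, never written.

```python
def permute_keep_p(A, P):
    """Apply P to A in place without modifying P (port of the loop above)."""
    n = len(A)
    for i in range(n - 1):
        # Positions below i are already final; follow P until we land on an
        # index that has not been finalized yet.
        ind = P[i]
        while ind < i:
            ind = P[ind]
        A[i], A[ind] = A[ind], A[i]
    return A


A = list("abcde")
P = [2, 4, 3, 0, 1]
print(permute_keep_p(A, P))  # ['c', 'e', 'd', 'a', 'b']
print(P)                     # [2, 4, 3, 0, 1]
```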

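This algorithm can bounce around in that while loop over indices j<i, at most i times during the ith iteration. In the worst case (I think!) each iteration of the outer for loop causes i extra assignments in the while loop, so we get an arithmetic series, which adds an N^2 term to the complexity! However, running this for a range of N and averaging the number of "extra" assignments needed by the while loop (averaged over many permutations for each N) strongly suggests to me that the average case is O(NlogN).
Thanks!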
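The experiment described above can be sketched roughly as follows. This is an assumed reconstruction, not the author's script: it counts only the extra hops inside the while loop and prints their average next to N ln N so the growth can be eyeballed. The sizes, trial count, and seed are arbitrary choices.

```python
import math
import random


def extra_lookups(P):
    """Count how many times the inner while loop body runs for permutation P."""
    count = 0
    for i in range(len(P) - 1):
        ind = P[i]
        while ind < i:
            ind = P[ind]
            count += 1
    return count


def average_extra_lookups(n, trials, seed=0):
    """Average extra_lookups over `trials` uniformly random permutations of size n."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        P = list(range(n))
        rng.shuffle(P)
        total += extra_lookups(P)
    return total / trials


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        avg = average_extra_lookups(n, trials=5)
        print(n, avg, avg / (n * math.log(n)))
```

For reference, the identity permutation needs no extra hops at all, while P = [n-1, 0, 1, ..., n-2] needs 1 + 2 + ... + (n-2) of them, which is the quadratic case mentioned above.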

- RinRisson

@AhmadS Your sample program outputs [c, a, b, d] = [A[2], A[0], A[1], A[3]]. Isn't that the expected output? - Dan Stahlke

3

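@RinRisson has given the only completely correct answer so far! Every other answer needs some extra storage: O(n) stack space, or the assumption that the permutation P happens to be stored next to O(n) unused but mutable flag bits, and so on.

Here is RinRisson's correct answer written out in C++. It passes every test I have thrown at it, including an exhaustive test of every possible permutation of length 0 through 11.

Note that you do not even need the permutation to be materialized; we can treat it as a completely black-box function OldIndex -> NewIndex: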

#include <type_traits>
#include <utility>

template<class RandomIt, class F>
void permute(RandomIt first, RandomIt last, const F& p)
{
    using IndexType = std::decay_t<decltype(p(0))>;

    IndexType n = last - first;
    for (IndexType i = 0; i + 1 < n; ++i) {
        IndexType ind = p(i);
        while (ind < i) {
            ind = p(ind);
        }
        using std::swap;
        swap(*(first + i), *(first + ind));
    }
}

Or, with a more STL-ish interface layered on top:

#include <cassert>
#include <iterator>

template<class RandomIt, class ForwardIt>
void permute(RandomIt first, RandomIt last, ForwardIt pfirst, ForwardIt plast)
{
    assert(std::distance(first, last) == std::distance(pfirst, plast));
    permute(first, last, [&](auto i) { return *std::next(pfirst, i); });
}
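The black-box idea carries over to other languages. A minimal Python sketch, assuming p is any callable such that the element originally at index p(i) should end up at index i (the same role p plays in the C++ code above); the function name is hypothetical:

```python
def permute_with(A, p):
    """Apply the permutation described by the callable p to A in place."""
    n = len(A)
    for i in range(n - 1):
        ind = p(i)
        # Skip over positions that were already finalized in earlier iterations.
        while ind < i:
            ind = p(ind)
        A[i], A[ind] = A[ind], A[i]
    return A


letters = list("abcde")
# Rotate left by two without ever building a permutation array.
print(permute_with(letters, lambda i: (i + 2) % 5))  # ['c', 'd', 'e', 'a', 'b']
```

Because p is only called and never stored, the extra space is a few integers no matter how the permutation is defined.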

- Quuxplusone

2

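The simplest case is when a single swap takes an element to its destination index. For example, with array=abcd and perm=1032 you only need two direct swaps: a with b, and c with d.

In other cases, we keep swapping an element until the cycle closes. For example, with abcd and 3021, starting from the first element, we swap a and d. We check whether the cycle leads back to index 0 by looking at perm[perm[0]]. It does not, so we swap a with the element at array[perm[perm[0]]], which is b. We check again with perm[perm[perm[0]]]; this time it is 0, so we stop.

We repeat this for each array index, taking care to walk each cycle only once. Every position receives its final element through a single swap, so the number of swaps is O(N) and the storage is O(1). Checking that a cycle has not been walked before, without extra memory, costs additional index lookups (up to O(N^2) in the worst case for the version below).

def permute(array, perm):
    for i in range(len(array)):
        # Walk a cycle only from its smallest index, so each cycle is
        # processed exactly once without any visited flags.
        p = perm[i]
        while p > i:
            p = perm[p]
        if p < i:
            continue

        # The element that started at index i travels along the cycle.
        p = i
        while perm[p] != i:
            array[p], array[perm[p]] = array[perm[p]], array[p]
            p = perm[p]

    return array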

- Pari Rajaram

1

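Code-only answers are discouraged. Please add some explanation of how this solves the problem, or how it differs from the existing answers. From Review - Nick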

0

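Consequently, you can put the desired element at the front of the array and work with the remaining array of size (n-1) in the next iteration step.

The permutation array must be adjusted accordingly to reflect the shrinking array. Namely, if the element you placed at the front was found at position "X", you need to decrease by one every index greater than or equal to X in the permutation table.

Taking your example: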

array                   permutation -> adjusted permutation

A  =  {[a  b  c  d  e]}                 [4 3 2 0 1]
A1 =  { e [a  b  c  d]}   [3 2 0 1] ->    [3 2 0 1] (decrease all indexes >= 4)
A2 =  { e  d [a  b  c]}     [2 0 1] ->      [2 0 1] (decrease all indexes >= 3)
A3 =  { e  d  c [a  b]}       [0 1] ->        [0 1] (decrease all indexes >= 2)
A4 =  { e  d  c  a [b]}         [1] ->          [0] (decrease all indexes >= 0)

Another example:

A0 = {[a  b  c  d  e]}                  [0 2 4 3 1]
A1 = { a [b  c  d  e]}     [2 4 3 1] ->   [1 3 2 0] (decrease all indexes >= 0)
A2 = { a  c [b  d  e]}       [3 2 0] ->     [2 1 0] (decrease all indexes >= 1)
A3 = { a  c  e [b  d]}         [1 0] ->       [1 0] (decrease all indexes >= 2)
A4 = { a  c  e  d [b]}           [0] ->         [0] (decrease all indexes >= 1)

The algorithm, though not the fastest, avoids extra memory allocation while still keeping track of the initial order of the elements.
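The steps in the two tables can be sketched as follows, assuming a plain array (so moving an element to the front means shifting the elements before it one slot to the right). The function name is made up. Note that this version overwrites P with the adjusted indices as it goes, exactly as the tables do, and takes O(n^2) time.

```python
def permute_by_moving_to_front(a, p):
    """In-place sketch: at step k, p[k] is an index into the unprocessed part a[k:]."""
    n = len(a)
    for k in range(n):
        x = p[k]              # position of the wanted element within a[k:]
        elem = a[k + x]
        # Shift a[k : k + x] one slot to the right to make room at index k.
        for j in range(k + x, k, -1):
            a[j] = a[j - 1]
        a[k] = elem
        # Remaining indices that pointed past the removed slot move down by one.
        for j in range(k + 1, n):
            if p[j] > x:
                p[j] -= 1
    return a


print(permute_by_moving_to_front(list("abcde"), [4, 3, 2, 0, 1]))  # ['e', 'd', 'c', 'a', 'b']
print(permute_by_moving_to_front(list("abcde"), [0, 2, 4, 3, 1]))  # ['a', 'c', 'e', 'd', 'b']
```

Only indices strictly greater than X are decremented here; X itself no longer occurs among the remaining entries, so this matches the ">= X" rule in the text.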

- pegazik

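@Ziyao Wei, you say "after one cycle"; how do you know what the "next element not in the correct position" is? Unless you store information about the elements already placed, but that requires extra space. My algorithm is slightly more complex, but it does not break down after one closed loop. - pegazik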

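I see a problem with your algorithm. In the first step you do e [a b c d], which basically requires shifting elements. That is an O(n) operation, and you repeat it n times, so your algorithm becomes O(n^2). - ahmet alp balkan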

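It depends on how the underlying array structure is implemented. For a doubly linked list you need to change at most 3 links in each iteration step, which means it would be only O(n) even with the index manipulation. Anyway, the task was to do better than linear extra space; nothing was said about complexity ;-) Still, I agree that the modification of Ziyao's algorithm is faster and simpler. - pegazik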

0

Here is a simple C/C++ code sample that adds to Ziyao Wei's answer. Code is not allowed in comments, so it is posted as an answer:

for (int i = 0; i < count; ++i)
{
    // Skip to the next non-processed item
    if (destinations[i] < 0)
        continue;

    int currentPosition = i;

    // destinations[X] = Y means "an item on position Y should be at position X"
    // So we should move an item that is now at position X somewhere
    // else - swap it with item on position Y. Then we have a right
    // item on position X, but the original X-item now on position Y,
    // maybe should be occupied by someone else (an item Z). So we
    // check destinations[Y] = Z and move the X-item further until we got
    // destinations[?] = X which mean that on position ? should be an item
    // from position X - which is exactly the X-item we've been kicking
    // around all this time. Loop closed.
    // 
    // Each permutation has one or more such loops, they obviously
    // don't intersect, so we may mark each processed position as such
    // and once the loop is over go further down by an array from
    // position X searching for a non-marked item to start a new loop.
    while (destinations[currentPosition] != i)
    {
        const int target = destinations[currentPosition];

        std::swap(items[currentPosition], items[target]);
        destinations[currentPosition] = -1 - target;

        currentPosition = target;
    }

    // Mark last current position as swapped before moving on
    destinations[currentPosition] = -1 - destinations[currentPosition];
}

for (int i = 0; i < count; ++i)
    destinations[i] = -1 - destinations[i];

(for C, replace std::swap with something else)

- Andrian Nord

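There are a couple of mistakes in this implementation. First, inside the while loop all references to "i" should be "currentPosition"; in addition, resetting the destinations array needs to check that the value is negative. - James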

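Yes, thanks for pointing that out. Somehow I posted the wrong version. There were actually four mistakes: 1. the for loop meant to skip negative indices could run past the last item; 2. on leaving the while loop, the last processed item was not inverted (it is better to invert everything and then check in the loop, which is faster on big arrays); 3. as you correctly pointed out, i was mistakenly used in the while loop body instead of currentPosition; 4. the formula for inverting indices was wrong. I have updated the post, thanks. - Andrian Nord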

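This is effectively cheating. It uses the negative range of each int to store additional information, so it does allocate an "O(n)" array, just implicitly. It obviously does not work for permuting negative integers. - Jan Schultke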

1

@JanSchultke This works on the permutation array, so the indices are very likely just positive. The items can be anything. - Andrian Nord

0

Trace back what we have already swapped by checking the index.

Java, O(N) swaps, O(1) space:

    static void swap(char[] arr, int x, int y) {
        char tmp = arr[x];
        arr[x] = arr[y];
        arr[y] = tmp;
    }
    public static void main(String[] args) {
        int[] intArray = new int[]{4,2,3,0,1};
        char[] charArray = new char[]{'A','B','C','D','E'};
        for(int i=0; i<intArray.length; i++) {
            int index_to_swap = intArray[i];
            // Check index if it has already been swapped before
            while (index_to_swap < i) {
                // trace back the index
                index_to_swap = intArray[index_to_swap];
            }
            swap(charArray, index_to_swap, i);
        }
    }

- Ilya Kharlamov

0

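Here is a clearer version that takes a swapElements function accepting indices, e.g. std::swap(Item[cycle], Item[P[cycle]]). Essentially, it runs through all elements and follows their cycles if they have not been visited yet. Instead of the second check !visited[P[cycle]], we could also compare against the first element of the cycle, as is done elsewhere above.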

 std::vector<bool> visited(n, false);  // needs <vector>; this is O(n) extra space
 for (int i = 0; i < n; i++)   {
     int cycle = i;
     while(! visited[cycle] && ! visited[P[cycle]]) {
         swapElements(cycle,P[cycle]);
         visited[cycle]=true;
         cycle = P[cycle];
     }
 }

- phil

-1

I agree with many of the solutions here, but below is a very short code snippet that permutes along one permutation cycle:

def _swap(a, i, j):
    a[i], a[j] = a[j], a[i]


def apply_permutation(a, p):
    idx = 0
    while p[idx] != 0:
        _swap(a, idx, p[idx])
        idx = p[idx]

So the following snippet

a = list(range(4))
p = [1, 3, 2, 0]
apply_permutation(a, p)
print(a)

outputs [1, 3, 2, 0]

- Elior Malul

This is not a complete answer, because it does not handle any permutation that is not a single cycle. For example, apply_permutation(a, [1, 0, 3, 2]) will give the wrong answer. - Quuxplusone
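The failure mentioned in this comment is easy to reproduce. The sketch below restates the answer's loop (with the swap inlined) and compares its result with the expected one, computed here with a throwaway copy purely for checking:

```python
def apply_permutation(a, p):
    # Same walk as in the answer: follows only the cycle containing index 0.
    idx = 0
    while p[idx] != 0:
        a[idx], a[p[idx]] = a[p[idx]], a[idx]
        idx = p[idx]


a = list("abcd")
p = [1, 0, 3, 2]
expected = [a[i] for i in p]  # computed before the in-place call
apply_permutation(a, p)
print(a)         # ['b', 'a', 'c', 'd']
print(expected)  # ['b', 'a', 'd', 'c']
```

The cycle through indices 2 and 3 is never visited, so c and d stay where they were.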

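Yes, but you can always keep an auxiliary array that marks which items have already been swapped. Once you complete a cycle, you can move on to the items that have not been touched yet (according to the auxiliary array) and that are not part of the cycle you just completed. - Elior Malul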

1

The title of the question is "Algorithm to apply a permutation in constant memory space". - Quuxplusone

Content provided by Stack Overflow.

- zw324 · Accepted Answer

There is a trivial O(n^2) algorithm, but you can do this in O(n). For example:

A = [a, b, c, d, e]

P = [4, 3, 2, 0, 1]

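We can swap each element of A with the element that P requires at its position. After each swap, one more element is in its correct position. We do this in a circular fashion, following each position in turn (the elements being swapped are marked with ^):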

[a, b, c, d, e] <- P[0] = 4 != 0 (where a initially was), swap 0 (where a is) with 4
 ^           ^
[e, b, c, d, a] <- P[4] = 1 != 0 (where a initially was), swap 4 (where a is) with 1
    ^        ^
[e, a, c, d, b] <- P[1] = 3 != 0 (where a initially was), swap 1 (where a is) with 3
    ^     ^
[e, d, c, a, b] <- P[3] = 0 == 0 (where a initially was), finish step

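After one cycle, we find the next element in the array that is not in its correct position and repeat. In the end you get the desired result, and since each position is touched a constant number of times (at most one swap per position), the running time is O(n).

You can record which elements are already in their correct place in one of two ways:

Set the corresponding entry in P to -1, which cannot be undone: after the operations above, P becomes [-1, -1, 2, -1, -1], meaning that only the element at index 2 might still be out of place; a further step confirms it is in the right position and terminates the algorithm.
Replace the corresponding entry p in P with -p - 1: P becomes [-5, -4, 2, -1, -2], which can easily be restored in O(n) time.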
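The second, recoverable marking scheme can be sketched as below. This is one possible reading of the description, not necessarily the exact code the answer had in mind; the function name is made up. It assumes P holds non-negative indices in a signed integer type, so a negative entry unambiguously means "already handled".

```python
def apply_permutation_marking(a, p):
    """Apply p to a in place; p is temporarily overwritten and then restored."""
    n = len(a)
    for start in range(n):
        if p[start] < 0:
            continue  # already handled as part of an earlier cycle
        cur = start
        while p[cur] != start:
            nxt = p[cur]
            a[cur], a[nxt] = a[nxt], a[cur]
            p[cur] = -nxt - 1       # mark as handled, recoverably
            cur = nxt
        p[cur] = -p[cur] - 1        # mark the position that closes the cycle
    for i in range(n):
        p[i] = -p[i] - 1            # undo the marking in O(n)
    return a


A = list("abcde")
P = [4, 3, 2, 0, 1]
print(apply_permutation_marking(A, P))  # ['e', 'd', 'c', 'a', 'b']
print(P)                                # [4, 3, 2, 0, 1]
```

Whether borrowing the sign of each entry counts as constant space is debatable, since it stores one flag per element inside P.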