Permutations with repeated items, with or without std::next_permutation, that keep the duplicates in their original order

Question

9

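I have been using std::next_permutation to implement permutations with repeated items.

But I found that it (std::next_permutation) changes the positions of the repeated items.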

e.g.
[0] 0 1 2 2'
[1] 2' 0 1 2
[2] 2' 1 0 2
[3] 2' 1 2 0 
...

How can I implement permutations with repeated items, with or without std::next_permutation, so that the order of the repeated items does not change?

e.g.
[0] 0 1 2 2'
[1] 1 0 2 2'
[2] 1 2 0 2'
[3] 1 2 2' 0
...

- kyong kyong

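Are you looking for std::shuffle? - JHBonarius

@JHBonarius: I think the goal here is a stable permutation, where every permutation is visited but identical items keep their original order: in the second example, 2 prime always comes after 2. - M Oehm

@MOehm That's right, I want to keep their original order. - kyong kyong

@JHBonarius Thank you for your comment :) I don't want to swap the positions of identical items. - kyong kyong

How large are the arrays we are talking about? Would simply re-sorting the equal elements help? - Aconcagua

2 Answers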

0

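Here we can work with indices instead of values.

We permute the indices and output only those permutations that meet the requirement.

Once we take the order-preserving requirement into account, this becomes simple.

Let's look at "0, 1, 2, 2". It has a duplicated number at (zero-based) indices 2 and 3. If we now permute the 4 indices, we can check whether the requirement is met.

To do that, after permuting the indices, we look up the original indices of the duplicates.

Example: if the permutation is "0,1,3,2", we know that the duplicates were originally at 2 and 3. So we search for indices 2 and 3 and find that they now sit at new positions 3 and 2. We do not want to show this one.

For the implementation, we store the indices of equal numbers in a std::vector, and associate that vector with the value in a std::unordered_map.

Another example:

At the start, the std::unordered_map holds the following data: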

Value  Vector with positions
 0          0
 1          1
 2          2,3

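Now, as we iterate over all permutations, we search for the original indices of the duplicated values. So we search the index permutation for 2 and 3 and find their new positions. These are stored in a std::vector as well.

The original positions were collected in ascending order, so the new positions have to be ascending too. A vector that now holds "3,2" is not ascending, and that violates the requirement.

Of course, this also works with more than one group of duplicated values.

One possible implementation of the approach described above: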


#include <iostream>
#include <vector>
#include <algorithm>
#include <unordered_map>
#include <numeric>
#include <iterator>

int main() {
    // Test Data
    std::vector data{ 0,1,2,2 };

    // Find duplicated values and their positions
    std::unordered_map<int, std::vector<size_t>> valuesAndPositionsOriginal{};
    for (size_t index{}; index < data.size(); ++index)
        valuesAndPositionsOriginal[data[index]].push_back(index);

    // We will work and do permutations of indices
    std::vector<size_t> indices(data.size());
    std::iota(indices.begin(), indices.end(), 0);

    // Here we will store the current positions of the duplicates after a permutation
    std::vector<size_t> currentPositions{};

    do {
        // If any set of duplicates will be reversed, then we will not show it
        bool allOk{ true };

        // For this permutation, check the current indices against the original ones
        for (const auto& [value, positions] : valuesAndPositionsOriginal) {

            // Need only to do something, if there are duplicates, so if value was there more than once
            if (positions.size() > 1) {

                currentPositions.clear();
                // Where is the index from the original position now?

                for (const size_t pos : positions)
                    currentPositions.push_back(std::distance(indices.begin(), std::find(indices.begin(), indices.end(), pos)));

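                // The original positions are ascending, so the duplicates kept
                // their order only if the current positions are ascending too
                if (!std::is_sorted(currentPositions.begin(), currentPositions.end()))
                    allOk = false;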
            }
        }
        // Show result
        if (allOk) {
            for (const size_t index : indices)
                std::cout << data[index] << ' ';
            std::cout << '\n';
        }

    } while (std::next_permutation(indices.begin(), indices.end()));
}

For large vectors this will be slow. Maybe I can come up with a mathematical solution...
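
To put a number on "slow": the filter visits all n! permutations of the indices, but only n! / (m1! * m2! * ...) of them keep the duplicates in order, where m1, m2, ... are the multiplicities of the values. For 0, 1, 2, 2 that is 24 visited and 12 kept. The sketch below is not a mathematical solution, only a cheaper form of the check: it walks the permuted indices once and rejects the arrangement as soon as a value shows up with a smaller original index than its previous occurrence. The function names keeps_duplicate_order and count_stable_permutations are made up for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <numeric>
#include <vector>

// Hypothetical helper: true if equal values of `data` appear in their
// original relative order when read in the order given by `indices`.
bool keeps_duplicate_order(const std::vector<int>& data,
                           const std::vector<std::size_t>& indices) {
    // For each value, the original index of its most recent occurrence so far
    std::map<int, std::size_t> last_seen;
    for (std::size_t original_index : indices) {
        const int value = data[original_index];
        const auto found = last_seen.find(value);
        if (found != last_seen.end() && found->second > original_index)
            return false;  // a later duplicate has overtaken an earlier one
        last_seen[value] = original_index;
    }
    return true;
}

// Hypothetical helper: counts the arrangements that pass the check.
std::size_t count_stable_permutations(const std::vector<int>& data) {
    std::vector<std::size_t> indices(data.size());
    std::iota(indices.begin(), indices.end(), std::size_t{0});
    std::size_t count = 0;
    do {
        if (keeps_duplicate_order(data, indices))
            ++count;
    } while (std::next_permutation(indices.begin(), indices.end()));
    return count;
}
```

For { 0, 1, 2, 2 } the count comes out as 12, matching the formula. The total work is still proportional to n!, so this only trims the cost per permutation.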

- A M

Content provided by Stack Overflow.

- M Oehm · Accepted Answer

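The reference implementation of next_permutation finds the rightmost section of the array that is in reverse order. If that section is the whole array, this is the lexicographically greatest permutation and permuting stops. Otherwise, it takes the first unsorted item (the one just left of that section), finds the rightmost item that is greater than it, swaps the two, and then reverses the section on the right.

Swapping items and reversing the list are good opportunities to "jump over" items and lose the stability of the permutation.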
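
The steps just described can be sketched with plain indices on a std::vector<int>. This is an illustration of the classic, non-stable algorithm, not the library's actual source; plain_next_permutation is a name invented here.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative re-implementation of the classic algorithm (not stable).
bool plain_next_permutation(std::vector<int>& v) {
    if (v.size() < 2)
        return false;

    // 1. Walk left from the end while the items are non-increasing.
    //    Afterwards v[i..] is the rightmost reverse-sorted section.
    std::size_t i = v.size() - 1;
    while (i > 0 && !(v[i - 1] < v[i]))
        --i;

    if (i == 0) {
        // The whole array is reverse-sorted: this was the last permutation.
        // Like std::next_permutation, wrap around to the first one.
        std::reverse(v.begin(), v.end());
        return false;
    }

    // 2. v[i - 1] is the first unsorted item. Find the rightmost item
    //    that is greater than it and swap the two.
    const std::size_t pivot = i - 1;
    std::size_t j = v.size() - 1;
    while (!(v[pivot] < v[j]))
        --j;
    std::swap(v[pivot], v[j]);

    // 3. Reverse the section to the right of the pivot.
    std::reverse(v.begin() + static_cast<std::ptrdiff_t>(i), v.end());
    return true;
}
```

Step 2 picks an item by position and step 3 flips a whole section; neither looks at which of several equal items it is moving. That is where the original order of duplicates gets lost.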

Stable swap

One way to make this algorithm stable is to do a "stable swap". Say we have this list:

1  1' 1" 2  2'

We want to swap the outermost items. After the swap, the list should be:

2  1  1' 2' 1"

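We can achieve this with two cyclic swaps. We pick up the 1 and move towards the 2'; whenever we see another 1, we put down the one we are holding and pick up the one we found (first 1', then 1"), and so on. The same goes for bubbling the 2' up towards the 1.

This stable swap might look like this: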

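#include <algorithm>  // std::min, std::max, std::is_sorted_until, std::upper_bound
#include <iterator>   // std::make_reverse_iterator
#include <utility>    // std::swap

namespace stable {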
    template<class T>
    void iter_swap(T a, T b)
    {
        T lo = std::min(a, b);
        T hi = std::max(a, b);

        if (*lo != *hi) {
            auto loval = *lo;
            auto hival = *hi;

            for (auto it = lo + 1; it < hi; ++it) {
                if (loval == *it) {
                    std::swap(loval, *it);
                }
            }

            // Walk from hi - 1 down to lo without stepping before lo
            for (auto it = hi; it != lo; ) {
                --it;
                if (hival == *it) {
                    std::swap(hival, *it);
                }
            }

            *lo = hival;
            *hi = loval;
        }
    }
}
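
To watch the two passes at work on the list above, here is an index-based restatement of the same idea for a std::vector. The Tagged struct, same_key and stable_swap are names invented for this sketch; the tag member only makes the original order visible and takes no part in comparisons.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical item type: `key` is compared, `tag` records the original
// order among equal keys (0 = plain, 1 = ', 2 = ").
struct Tagged {
    int key;
    int tag;
};

bool same_key(const Tagged& a, const Tagged& b) { return a.key == b.key; }

// Swaps v[a] and v[b] with the two-pass scheme described above.
void stable_swap(std::vector<Tagged>& v, std::size_t a, std::size_t b) {
    const std::size_t lo = std::min(a, b);
    const std::size_t hi = std::max(a, b);
    if (same_key(v[lo], v[hi]))
        return;

    Tagged loval = v[lo];
    Tagged hival = v[hi];

    // Carry the lower item to the right; at every item with the same key,
    // put down the one in hand and pick up the one found there.
    for (std::size_t i = lo + 1; i < hi; ++i)
        if (same_key(loval, v[i]))
            std::swap(loval, v[i]);

    // Carry the upper item to the left in the same way.
    for (std::size_t i = hi; i-- > lo + 1; )
        if (same_key(hival, v[i]))
            std::swap(hival, v[i]);

    v[lo] = hival;
    v[hi] = loval;
}
```

Starting from 1 1' 1" 2 2' and calling stable_swap(v, 0, 4) leaves 2 1 1' 2' 1", the list shown earlier.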

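Of course, the swap is now an O(N) operation instead of the usual O(1). It is even worse for the reverse operation, for which I have used a naive implementation. I guess there is room for improvement here.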

namespace stable {
    template<class T>
    void reverse(T first, T last)
    {
        while (first != last && first != --last) {
            stable::iter_swap(first++, last);
        }
    }
}

Now use these two stable variants in the original next_permutation algorithm:

namespace stable {
    template<class T>
    bool next_permutation(T first, T last)
    {
        auto r_first = std::make_reverse_iterator(last);
        auto r_last = std::make_reverse_iterator(first);
        auto left = std::is_sorted_until(r_first, r_last);

        if (left != r_last){
            auto right = std::upper_bound(r_first, left, *left);
            stable::iter_swap(left, right);
        }

        stable::reverse(left.base(), last);

        return left != r_last;
    }
}

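This algorithm is not very efficient. But then, permutations of large sets are rare. The advantage of this variant is that it works out of the box: if you have a class that supports the comparisons <, == and !=, you are good.

(There should be a variant where a less-than comparison function is passed as third parameter. I guess you would have to replace a == b with !(a < b) && !(b < a) and a != b with a < b || b < a for this to work.)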
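
A minimal sketch of that substitution, assuming the comparison is passed as a callable named less: equality becomes "neither item orders before the other". The names equivalent and first_char_less are invented for this example, and the stable functions above would still need an extra parameter to make use of them.

```cpp
#include <string>

// Hypothetical helper: a and b count as equal if neither orders before the other.
template<class T, class Less>
bool equivalent(const T& a, const T& b, Less less) {
    return !less(a, b) && !less(b, a);
}

// Example comparison: order strings by their first character only,
// so "2" and "2'" compare as equal. Assumes non-empty strings.
bool first_char_less(const std::string& a, const std::string& b) {
    return a[0] < b[0];
}
```

Inside stable::iter_swap, the test *lo != *hi would then read !equivalent(*lo, *hi, less), and loval == *it would read equivalent(loval, *it, less).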

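I wrote a short demo with a wrapper struct around a string, where the comparison is done on the first character.

Permute and correct

If you need better efficiency, I think a better approach is to use the regular std::next_permutation first and then "tidy up" the permuted array in a second pass, overwriting each element with the occurrence of the same element that belongs there in the original order.

Doing so requires setting up some extra data. Each group of identical elements should probably get a unique, comparable and hashable key, which can be used for comparison and for storing the original elements in a map.

Here is an implementation of that idea: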

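#include <algorithm>      // std::stable_sort, std::next_permutation
#include <unordered_map>
#include <utility>        // std::make_pair
#include <vector>

template<class Iter, typename Key>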
class Permuter {
public:
    Permuter(Iter begin_, Iter end_,
        Key (*key_)(const typename Iter::value_type& ref))
    : begin(begin_), end(end_), key(key_), less(Less(key_))
    {
        Iter it = begin_;
        
        while (it != end_) {
            orig.push_back(*it++);
        }
        
        std::stable_sort(orig.begin(), orig.end(), less);
        
        typename std::vector<typename Iter::value_type>::iterator vec;
        vec = orig.begin();
        
        while (vec != orig.end()) {
            Key k = key(*vec);

            if (map.find(k) == map.end()) {
                map.insert(std::make_pair(k, vec));
            }
            
            vec++;
        }        
    }
    
    bool next()
    {
        if (std::next_permutation(begin, end, less)) {
            auto mmap = map;
            auto it = begin;
            
            while (it != end) {
                *it = *mmap[key(*it)]++;
                it++;
            }

            return true;
        }
        
        return false;
    }

private:
    struct Less {
        Key (*key)(const typename Iter::value_type& iter);

        Less(Key (*key_)(const typename Iter::value_type& iter))
        : key(key_) {}

        bool operator()(const typename Iter::value_type& a,
                      const typename Iter::value_type& b) const
        {
            return (key(a) < key(b));
        }
    };

    Iter begin;
    Iter end;
    Key (*key)(const typename Iter::value_type& iter);
    std::vector<typename Iter::value_type> orig;
    std::unordered_map<Key,
        typename std::vector<typename Iter::value_type>::iterator > map;
    Less less;
};

The idea is to create a Permuter instance around an existing bidirectionally iterable collection and then call its next method:

Permuter<std::vector<Stuff>::iterator, int>
    perm(stuff.begin(), stuff.end(), stuff_key); 

do {
    // do something with std::vector<Stuff> stuff
} while (perm.next());

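Here, the function stuff_key returns an int key for each const Stuff& item, which is used for sorting and for insertion into the unordered map. The Permuter keeps a copy of the original array. That copy is stable-sorted first; then, for each key, an iterator to the first element of its run of identical elements is stored. After permuting, this map is used to overwrite the elements in the container so that identical elements appear in their original order.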
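
The correction step on its own can be sketched without the class. This is a simplified variant, not the Permuter above: it works on a std::vector of key/label pairs, groups the original items by key in a std::map, and rebuilds its bookkeeping on every call, so it says nothing about performance. Item, key_less, restore_original_order and next_stable are names invented here, and items is assumed to start out as a copy of original.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Item = std::pair<int, std::string>;  // (key, label), hypothetical

bool key_less(const Item& a, const Item& b) { return a.first < b.first; }

// Overwrites every item with the next unused original item of the same key.
void restore_original_order(std::vector<Item>& items,
                            const std::vector<Item>& original) {
    std::map<int, std::vector<Item>> groups;  // key -> items in original order
    for (const Item& item : original)
        groups[item.first].push_back(item);

    std::map<int, std::size_t> used;  // key -> how many were handed out
    for (Item& item : items) {
        const int key = item.first;
        item = groups[key][used[key]++];
    }
}

// Permutes by key only, then tidies up the duplicates.
bool next_stable(std::vector<Item>& items, const std::vector<Item>& original) {
    const bool more = std::next_permutation(items.begin(), items.end(), key_less);
    restore_original_order(items, original);
    return more;
}
```

Called in a do { ... } while (next_stable(items, original)); loop on 0 1 2 2', this visits 12 arrangements with 2 always ahead of 2'.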

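I wrote a short demo where the key of a string is its first letter, so the example is similar to the one above.

Performance

Some quick and unscientific measurements show interesting results: the stable swap is only a little slower, by about 10%, than std::next_permutation, which does not maintain stability. The Permuter is much slower; it takes up to twice as long.

I had expected it to be the other way round, but it is easy to see why the Permuter is slow: for each correction step after a permutation, it creates a new unordered map (a copy of the stored one) and tears it down when done. That must be wasteful. (Storing the original and current iterators as a pair in the map did not help either. There may be better ways, but I do not know how to keep this approach generic without a map.)

The stable swap may also benefit from good locality: it needs no additional data and all accesses go to the original array only.

So I am quite happy with the stable swap. Its implementation is not very complicated, and client code uses it just like std::next_permutation.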