Is partial_sort_copy the fastest C++ partial sort algorithm?

Question


9

Consider the following function, median:

  real_t median(const std::initializer_list<real_t> vars) {
    real_t tmp[15];
    const unsigned x = vars.size() / 2;
    if (x & 1) {
      std::partial_sort_copy(vars.begin(), vars.end(), &tmp[0], &tmp[x]);
      return tmp[x];
    }
    const unsigned y = x + 1;
    std::partial_sort_copy(vars.begin(), vars.end(), &tmp[0], &tmp[y]);
    return (tmp[x] + tmp[y]) / 2;
  }

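I am using a partial sort to reduce the complexity, since I only need half of the list sorted.

In addition, I am assuming that std::partial_sort_copy is faster than std::partial_sort or std::nth_element, because no shuffling is needed inside the sorting algorithm (It1 != It2). Is my assumption correct?

Note: assume real_t may be a double, so please don't criticise the use of division.

Note 2: I am compiling with -pedantic, and vars is known never to hold more than 15 elements.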

- kvanbere

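You could try the others and see. - pm100

1

This Q&A should still exist on the internet for others to find. Whether or not I end up having to benchmark it myself to find out, we shall see. - kvanbere

1

Surely you realise that even bubble-sorting the entire list would be good enough for a list of only 15 elements... - Euro Micelli

First, you can test it. Different implementations differ in speed. Second, the asymptotic complexity of nth_element is linear, while the other two are linearithmic. Third, partial_sort_copy needs to allocate memory and copy the elements, so it may be slower. - Siyuan Ren

partial_sort_copy throws more memory at the problem, but that is a double-edged sword. If you overflow the CPU cache, a larger working set will reduce performance. - Potatoswatter

3 Answers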

6

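If I could choose, I would go with partial quicksort (link: info on partial quicksort).

But if you are comparing only these two, then partial sort is better than partial sort copy. There is more information on both methods here (links: info on partial sort, info on partial sort copy).

Here you can also find a code example of the partial quicksort algorithm, implemented in C and MATLAB (link: example - partial quicksort).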

- Avanz

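Those slides are a bit complicated and not easy to follow. Does it have a standard library implementation? - kvanbere

Regarding your edit, I get the impression that for larger data sets std::nth_element is actually faster than both, since it has O(n) complexity rather than O(n log m). - kvanbere

1

Just added a link with an example implementation of partial quicksort. Please take a look. - Avanz

Note that the worst-case time complexity of std::nth_element would be O(N * N), not O(n). Reference: http://www-home.fh-konstanz.de/~bittel/prog2/Praktikum/musser97introspective.pdf - Avanz

What makes you believe this quicksort is faster than any std::partial_sort implementation? partial_sort is allowed to use quicksort, but usually does not, because other algorithms are generally considered faster on average. - Potatoswatter

partial_sort and nth_element cannot work on an initializer_list, because an initializer list cannot be modified. You need to copy it into another container first, use partial_sort_copy, or pass a different kind of argument (a different container type, or begin/end non-const random-access iterators). - Michael Burr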

1

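Have you tested your code?

std::partial_sort_copy(vars.begin(), vars.end(), &tmp[0], &tmp[x]); will not copy anything into tmp[x], because &tmp[x] is treated as the end of a half-open range (that is, it sits just past the last valid element). So your return statement accesses an indeterminate or default-constructed array element.

Try the following: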
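To make the half-open range concrete, here is a minimal sketch. The helper name smallest_into_buffer and its sentinel parameter are hypothetical, and double stands in for real_t. It fills the buffer with a sentinel first, so the slots that std::partial_sort_copy never writes stay recognisable.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical helper (not from the question): copies the `count` smallest
// values of `vars`, in ascending order, into the front of a buffer that was
// pre-filled with `sentinel`, and returns the whole buffer.
std::array<double, 15> smallest_into_buffer(std::initializer_list<double> vars,
                                            std::size_t count,
                                            double sentinel) {
    assert(count <= 15);

    std::array<double, 15> tmp;
    tmp.fill(sentinel);

    // The destination is the half-open range [begin, begin + count):
    // at most `count` slots are written, and tmp[count] is never touched.
    std::partial_sort_copy(vars.begin(), vars.end(),
                           tmp.begin(), tmp.begin() + count);
    return tmp;
}
```

Calling smallest_into_buffer({5.0, 1.0, 4.0, 2.0, 3.0}, 2, -1.0) should leave the two smallest values in slots 0 and 1 and the sentinel in slot 2, which is the slot the question's code reads from.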

real_t median(const std::initializer_list<real_t> vars) 
{
    real_t tmp[15];

    size_t siz = vars.size();
    if ((siz == 0) || (15 < siz)) return 0;     // or throw some sort of exception or ???

    const unsigned x = vars.size() / 2;
    std::partial_sort_copy(vars.begin(), vars.end(), &tmp[0], &tmp[x+1]);

    if (siz % 2 == 0) {
        return (tmp[x-1] + tmp[x]) / 2;
    }

    return tmp[x];
}

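Note that, given an initializer_list as the data source, algorithms that modify in place, such as nth_element or partial_sort, will not work, because an initializer list cannot be modified (whether or not the parameter is marked const, the iterators of an initializer_list are const-qualified). So a copy has to be made in order to find the median with the standard algorithm functions: either copy the list before calling the algorithm, or use an algorithm variant that performs the copy as part of its work, such as partial_sort_copy().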
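The copy-first route could look roughly like the sketch below. It is an illustration, not code from this answer: the name median_nth is hypothetical, double stands in for real_t, and the 15-element bound is taken from the question. Whether it is faster than partial_sort_copy would still need measuring.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical alternative: copy the read-only initializer_list into a local
// buffer, then select the median in place with std::nth_element.
double median_nth(std::initializer_list<double> vars) {
    const std::size_t n = vars.size();
    assert(n > 0 && n <= 15);  // same 15-element bound as the question

    std::array<double, 15> tmp;
    std::copy(vars.begin(), vars.end(), tmp.begin());

    const auto first = tmp.begin();
    const auto mid = first + n / 2;
    const auto last = first + n;

    // After this call *mid holds the element that a full sort would put there,
    // and everything in [first, mid) is less than or equal to it.
    std::nth_element(first, mid, last);
    if (n % 2 != 0)
        return *mid;

    // Even count: the lower median is the largest element left of mid.
    const double lower = *std::max_element(first, mid);
    return (lower + *mid) / 2;
}
```

For example, median_nth({4.0, 1.0, 3.0, 2.0}) averages the two middle values 2.0 and 3.0.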

- Michael Burr

No, I haven't tested it yet. This is pre-optimisation ;) Thanks for your answer, it was very helpful. - kvanbere


- Peter Atashian · Accepted Answer

Using the following code:

#include <algorithm>
#include <array>
#include <chrono>
#include <cstdlib>   // rand
#include <iostream>
#include <string>

volatile int answer;

const int size = 15;

std::array<std::array<int, size>, 0x100> fresh_data;
std::array<std::array<int, size>, 0x100> data;

void naive(int n) {
    auto & a = data[n];
    std::sort(a.begin(), a.end());
    answer = a[size / 2];
}

void fancy(int n) {
    auto & a = data[n];
    std::partial_sort(a.begin(), a.begin() + (size / 2 + 1), a.end());
    answer = a[size / 2 ];
}

void ghoul(int n) {
    auto & a = data[n];
    std::array<int, size / 2 + 1> temp;
    std::partial_sort_copy(a.begin(), a.end(), temp.begin(), temp.end());
    answer = temp[size / 2];

}

void nthel(int n) {
    auto & a = data[n];
    std::nth_element(a.begin(), a.begin() + size / 2, a.end());
    answer = a[size / 2];
}

void gen_data() {
    for (auto & a : fresh_data)
    for (auto & b : a)
        b = rand();
}

void regen_data() {
    data = fresh_data;
}


template <typename T>
void test(T f, std::string n) {
    regen_data();
    auto a = std::chrono::high_resolution_clock::now();
    for (auto rep = 0; rep < 10000; ++rep)
        for (auto i = 0; i < 0x100; ++i)
            f(i);
    auto b = std::chrono::high_resolution_clock::now();
    std::cout << n << ": " << std::chrono::duration_cast<std::chrono::milliseconds>(b - a).count() << std::endl;
}

int main() {
    gen_data();
    test(naive, "             std::sort");
    test(fancy, "     std::partial_sort");
    test(ghoul, "std::partial_sort_copy");
    test(nthel, "      std::nth_element");
}

I got the following results (times in milliseconds):

             std::sort: 141
     std::partial_sort: 359
std::partial_sort_copy: 831
      std::nth_element: 149

Tested on a machine with an AMD Phenom II x4 2.5GHz, using Visual Studio 2013 in 64-bit release mode.
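Before comparing timings, it may be worth confirming that the four strategies return the same median. The sketch below is an illustrative cross-check rather than part of the benchmark above: cross_checked_median and kSize are hypothetical names, and each strategy works on its own copy of the input so that none of them sees data already reordered by another.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

constexpr int kSize = 15;  // same element count as the benchmark

// Hypothetical cross-check: computes the median of `input` with each of the
// four strategies, asserts that they agree, and returns the median.
int cross_checked_median(const std::array<int, kSize>& input) {
    auto a = input;  // full sort
    std::sort(a.begin(), a.end());
    const int by_sort = a[kSize / 2];

    auto b = input;  // sort only the lower half plus the median
    std::partial_sort(b.begin(), b.begin() + (kSize / 2 + 1), b.end());
    const int by_partial = b[kSize / 2];

    // Copying variant: the input itself is left untouched.
    std::array<int, kSize / 2 + 1> temp;
    std::partial_sort_copy(input.begin(), input.end(), temp.begin(), temp.end());
    const int by_copy = temp[kSize / 2];

    auto c = input;  // selection only
    std::nth_element(c.begin(), c.begin() + kSize / 2, c.end());
    const int by_nth = c[kSize / 2];

    assert(by_sort == by_partial && by_partial == by_copy && by_copy == by_nth);
    return by_sort;
}
```

Passing any permutation of the values 1 to 15 should return 8 from all four strategies.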