C++20 version of std::accumulate

Question

6

I am trying to understand this code, but I can't figure out why this version

for (; first != last; ++first) 
    init = std::move(init) + *first;

is faster than this one

for (; first != last; ++first)
    init += *first;

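I took both of them from std::accumulate. The assembly for the first version is longer than for the second. Even though the first version creates an rvalue reference to init, it still creates a temporary by adding *first and then assigns it to init, which is the same thing the second case does: create a temporary and assign it to init. So why is using std::move better than "appending the value" with the += operator?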

Edit

I was looking at the C++20 version of the accumulate code, and they say that before C++20, accumulate looked like this:

template<class InputIt, class T>
T accumulate(InputIt first, InputIt last, T init)
{
    for (; first != last; ++first) {
        init = init + *first;
    }
    return init;
}

Since C++20, it has become

template<class InputIt, class T>
constexpr // since C++20
T accumulate(InputIt first, InputIt last, T init)
{
    for (; first != last; ++first) {
        init = std::move(init) + *first; // std::move since C++20
    }
    return init;
}

I want to know whether using std::move brings any real improvement.

Edit 2

OK, here is my example code:

#include <utility>
#include <string>
#include <chrono>
#include <iostream>

using ck = std::chrono::high_resolution_clock;

std::string
test_no_move(std::string str) {

    std::string b = "t";
    int count = 0;

    while (++count < 100000)
        str = str + b;              // Without std::move

    return str;
}

std::string
test_with_move(std::string str) {

    std::string b = "t";
    int count = 0;

    while (++count < 100000)
        str = std::move(str) + b;   // With std::move

    return str;
}

int main()
{
    std::string result;
    auto start = ck::now();
    result = test_no_move("test");
    auto finish = ck::now();

    std::cout << "Test without std::move " << std::chrono::duration_cast<std::chrono::microseconds>(finish - start).count() << std::endl;

    start = ck::now();
    result = test_with_move("test");
    finish = ck::now();

    std::cout << "Test with std::move " << std::chrono::duration_cast<std::chrono::microseconds>(finish - start).count() << std::endl;

    return 0;
}

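If you run it, you will see that the std::move version is much faster than the other one, but if you try it with built-in types, the std::move version comes out slower. So my question is: since this situation is probably the same as in std::accumulate, why do they say that the C++20 version of accumulate with std::move is faster than the version without it? Why do I get such an improvement from std::move with something like a string, but not with something like an int? In both cases the program creates a temporary string, str + b (or std::move(str) + b), and then moves it into str. I mean, it is the same operation. So why is the std::move version faster?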

Thanks for your patience. I hope I have made myself clear this time.

- Sam

2

The second version was never part of the C++ standard. std::accumulate has always used operator+() or the BinaryOperation template argument. - Ruslan

2

std::accumulate is a template, so several steps have to happen before there is any assembly to look at. Can you include a [mcve]? - 463035818_is_not_a_number

2

Please post the complete benchmark code along with how you compile and run it. - Maxim Egorushkin

1

For built-in types, there should be no difference in the generated assembly. - Maxim Egorushkin

1

http://quick-bench.com/ is a good tool for comparing performance. - Jarod42

1 Answer

- Evg · Accepted Answer

For types with non-trivial move semantics, it can be faster. Consider accumulating a std::vector<std::string> of sufficiently long strings:

std::vector<std::string> strings(100, std::string(100, ' '));

std::string init;
init.reserve(10000);
auto r = accumulate(strings.begin(), strings.end(), std::move(init));

For accumulate without std::move,

std::string operator+(const std::string&, const std::string&);

will be used. On each iteration it allocates heap storage for the resulting string, only to discard it on the next iteration.

For accumulate with std::move,

std::string operator+(std::string&&, const std::string&);

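will be used. In contrast to the previous case, the buffer of the first argument can be reused. If the initial string has sufficient capacity, no additional memory is allocated during accumulation.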

Simple demo

without std::move
n_allocs = 199

with std::move
n_allocs = 0

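For built-in types like int, a move is just a copy; there is nothing to move. In an optimized build you will most likely get exactly the same assembly. If your benchmark shows any speedup or slowdown, you are most likely not measuring correctly (no optimization, noise, code optimized away, and so on).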