Why are istream/ostream slow?

Question

46

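At 50:40 of http://channel9.msdn.com/Events/GoingNative/2013/Writing-Quick-Code-in-Cpp-Quickly, Andrei Alexandrescu makes a joke about how inefficient/slow istream is.

I ran into a problem in the past where ostream was slow and fwrite was significantly faster (saving several seconds when running the main loop), but I never understood why, nor did I look into it.

What makes istream and ostream in C++ slow? Or at least slower than other things (like fread/fget, fwrite) that would satisfy the same needs.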

- user34537

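If I remember correctly, C++ streams have to sync with the C I/O "constructs" (for compatibility reasons). I believe you can make them faster by turning off that synchronization (granted, you will have to refrain from doing things like printf afterwards). - Borgleader

@Borgleader: What C "constructs" would ostream sync to (it was a file output stream, not std::out), and why is it slower than C fwrite? - user34537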

3

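Take a look at this answer: https://dev59.com/_mox5IYBdhLWcg3wLhei#9371717 - Borgleader

@Borgleader: That definitely answers the cin question. +1 - user34537

Related: https://dev59.com/W-o6XIcBkEYKwwoYTS1D - Ben Voigt

Possible duplicate of "Why is reading lines from stdin much slower in C++ than Python?" - 7hi4g0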

5 Answers

43

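There are several reasons why iostreams are slow by design:

Shared formatting state: every formatted output operation has to check all the formatting state that may have been changed earlier by I/O manipulators. For this reason iostreams are inherently slower than printf-like APIs, where all formatting information is local (especially those with format string compilation, as in Rust or {fmt}, which avoids parsing overhead).
Uncontrolled use of locales: all formatting goes through an inefficient locale layer even when you do not want it, for example when writing a JSON file. See N4412: Shortcomings of iostreams.
Inefficient codegen: formatting a message with iostreams normally consists of multiple formatting function calls, because arguments and I/O manipulators are interleaved with the parts of the message. For example, there are three such calls (godbolt) in: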
```
std::cout << "The answer is " << answer << ".\n";
```
compared to a single call (godbolt) in the equivalent printf version:
```
printf("The answer is %d.\n", answer);
```
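Each of these formatting function calls has substantial overhead (see above).
Extra buffering and synchronization: this can be disabled with sync_with_stdio(false), at the cost of poor interoperability with other I/O facilities.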

- vitaut

14

Perhaps this can give you some idea of what you are dealing with:

#include <stdio.h>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <fstream>
#include <time.h>
#include <string>
#include <algorithm>

// Count occurrences of c by reading one character at a time with getc.
unsigned count1(FILE *infile, char c) {
    int ch;
    unsigned count = 0;

    while (EOF != (ch=getc(infile)))
        if (ch == c)
            ++count;
    return count;
}

// Count occurrences of c by reading 8 KB blocks with fread.
unsigned int count2(FILE *infile, char c) {
    static char buffer[8192];
    size_t size;  // fread returns size_t, not int
    unsigned int count = 0;

    while (0 < (size = fread(buffer, 1, sizeof(buffer), infile)))
        for (size_t i=0; i<size; i++)
            if (buffer[i] == c)
                ++count;
    return count;
}

// Unformatted input: iterate directly over the stream buffer.
unsigned count3(std::istream &infile, char c) {
    return std::count(std::istreambuf_iterator<char>(infile),
                    std::istreambuf_iterator<char>(), c);
}

// Formatted input: each step goes through operator>> (and skips whitespace).
unsigned count4(std::istream &infile, char c) {
    return std::count(std::istream_iterator<char>(infile),
                    std::istream_iterator<char>(), c);
}

// Count occurrences of c by reading 8 KB blocks with istream::read.
unsigned int count5(std::istream &infile, char c) {
    static char buffer[8192];
    unsigned int count = 0;

    while (infile.read(buffer, sizeof(buffer)))
        count += std::count(buffer, buffer+infile.gcount(), c);
    // read() fails on the final, partial block; gcount() still reports how
    // many characters it delivered, so count those as well.
    count += std::count(buffer, buffer+infile.gcount(), c);
    return count;
}

// Formatted input, one character at a time with operator>>.
unsigned count6(std::istream &infile, char c) {
    unsigned int count = 0;
    char ch;

    while (infile >> ch)
        if (ch == c)
            ++count;
    return count;
}

// Run f on t, counting 'N' characters, and print the count and CPU time used.
template <class F, class T>
void timer(F f, T &t, std::string const &title) {
    unsigned count;
    clock_t start = clock();
    count = f(t, 'N');
    clock_t stop = clock();
    std::cout << std::left << std::setw(30) << title << "\tCount: " << count;
    std::cout << "\tTime: " << double(stop-start)/CLOCKS_PER_SEC << "\n";
}

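int main() {
    char const *name = "equivs2.txt";

    FILE *infile=fopen(name, "r");
    if (infile == NULL) {
        // Without the input file there is nothing to measure.
        perror(name);
        return 1;
    }

    // Runs labeled "ignore" are warm-up passes (presumably to populate the
    // file cache); their timings are not meant to be compared.
    timer(count1, infile, "ignore");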

    rewind(infile);
    timer(count1, infile, "using getc");

    rewind(infile);
    timer(count2, infile, "using fread");

    fclose(infile);

    std::ifstream in2(name);
    timer(count3, in2, "ignore");

    in2.clear();
    in2.seekg(0);
    timer(count3, in2, "using streambuf iterators");

    in2.clear();
    in2.seekg(0);
    timer(count4, in2, "using stream iterators");

    in2.clear();
    in2.seekg(0);
    timer(count5, in2, "using istream::read");

    in2.clear();
    in2.seekg(0);
    timer(count6, in2, "using operator>>");

    return 0;
}

Running this, I get results like this (with MS VC++):

ignore                          Count: 1300     Time: 0.309
using getc                      Count: 1300     Time: 0.308
using fread                     Count: 1300     Time: 0.028
ignore                          Count: 1300     Time: 0.091
using streambuf iterators       Count: 1300     Time: 0.091
using stream iterators          Count: 1300     Time: 0.613
using istream::read             Count: 1300     Time: 0.028
using operator>>                Count: 1300     Time: 0.619

and this (with MinGW):

ignore                          Count: 1300     Time: 0.052
using getc                      Count: 1300     Time: 0.044
using fread                     Count: 1300     Time: 0.036
ignore                          Count: 1300     Time: 0.068
using streambuf iterators       Count: 1300     Time: 0.068
using stream iterators          Count: 1300     Time: 0.131
using istream::read             Count: 1300     Time: 0.037
using operator>>                Count: 1300     Time: 0.121

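As the results show, it is not really a matter of iostreams being categorically slow. Rather, a great deal depends on exactly how you use iostreams (and, to a lesser extent, FILE * as well). There is also a fairly substantial difference between these two implementations.

Nonetheless, the fastest versions in each case (fread and istream::read) are essentially tied. With VC++, getc is quite a bit slower than either istream::read or istreambuf_iterator.

Bottom line: getting good performance from iostreams requires a little more care than with FILE *, but it is certainly possible. They also give you more options: convenience when you do not care much about speed, and, with a little extra work, performance directly competitive with the best you can get from C-style I/O.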

- Jerry Coffin

2

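Since my edit was rejected: your istream::read version has a bug. The last chunk of characters is not checked, see here. - Darklighter

Handy. Also, if you copy count6 to a new count7 that uses "while (infile.get(ch))", you will see that it is twice as fast as operator>> but still twice as slow as getc. - Nick Westgate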

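@NickWestgate: Yes, no matter how many I add, there are at least three more that could be added. If (for example) another method were faster than anything else, I would probably add it, but another one in the middle of the pack does not seem worth bothering with... - Jerry Coffin

This would be useful for those (like me) who are comparing the current state of some code against the other options. I am pretty disappointed that istream::get spends a lot of time entering and leaving critical sections in some single-threaded code I maintain. ;-) Anyway, thanks for the handy test suite. - Nick Westgate

File I/O is inherently noisy on Windows, and probably on Linux too, because of caching. - gast128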

1

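Although this question is quite old, I am amazed that nobody has mentioned iostream object construction.

That is, whenever you create an STL iostream (or one of the other stream variants), if you step into the code, the constructor calls an internal Init function. In there, operator new is called to create a new locale object. Likewise, that object is destroyed when the stream is destroyed.

This is ugly, in my opinion. It certainly contributes to slow object construction/destruction, because memory is allocated/deallocated under a system lock at some point.

Further, some of the STL streams let you specify an allocator, so why is the locale not created with the specified allocator?

When using streams in a multithreaded environment, you can also imagine the bottleneck imposed by calling operator new every time a new stream object is constructed.

A hideous mess if you ask me, as I am finding out for myself right now!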

- dicksters

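Karl Knechtel says here: "(...) This task is almost certainly I/O bound, and there is way too much FUD going around about the cost of creating std::string objects in C++ or using <iostream> in and of itself." - Marc.2377

Someone had exactly the same idea as you and commented on it here: Somebody else. - dicksters

The rationale the LLVM project gives for never using <iostream> is interesting. - Chris Kitching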

0

On a similar topic, STL says: "You can call setvbuf() to enable buffering on stdout."

https://web.archive.org/web/20170329163751/https://connect.microsoft.com/VisualStudio/feedback/details/642876/std-wcout-is-ten-times-slower-than-wprintf-performance-bug-in-c-library

- AndrewDover

- Dietmar Kühl · Accepted Answer

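Actually, IOStreams do not have to be slow! It is a matter of implementing them in a reasonable way to make them fast. However, most standard C++ libraries do not seem to pay much attention to their IOStreams implementation. A long time ago, when my CXXRT was still maintained, it was about as fast as stdio, when used correctly!

Note, however, that IOStreams set a few performance traps for their users. The following guidelines apply to all IOStream implementations, but especially to those tailored to be fast:

When using std::cin, std::cout, etc., you need to call std::sync_with_stdio(false)! Without this call, any use of the standard stream objects is required to synchronize with C's standard streams. Of course, when using std::sync_with_stdio(false), it is assumed that you do not mix std::cin with stdin, std::cout with stdout, and so on.

Do not use std::endl, because it forces a flush each time it is used, which leads to many unnecessary flushes of any buffer. Likewise, do not set std::ios_base::unitbuf or use std::flush unnecessarily.

When creating your own stream buffers (OK, few users do), make sure they use an internal buffer! Processing individual characters jumps through multiple conditions and a virtual function, which makes it hideously slow.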