C++中将字符串转换为整数的最有效方法（比atoi更快）

Question

C++中将字符串转换为整数的最有效方法（比atoi更快）

32

正如标题所述，我正在寻找比atoi更高效的解决方案。目前，我所知道的最快速的方法是

atoi(mystring.c_str())

最后，我希望一种不依赖于Boost的解决方案。是否有好的性能技巧来实现这个？

附加信息：整数不会超过20亿，始终为正数，字符串中没有小数位。

- user788171

11

你会很难击败 atoi。 - Joel

5

这个问题的答案可能会有些取决于你允许的整数范围。你想转换“任何”整数吗，还是你的输入范围更具体？你确定'mystring'只包含没有其他字符的整数吗？它可以是负数吗？ - paddy

我添加了一些额外的信息，常规大小的整数，始终为正，字符串中没有小数。 - user788171

7

你得到了很好的答案，但我一直在想 - 你是否真正知道 atoi 单独使用会消耗你整体时间的相当比例？人们经常问这样的问题，实际上有其他的事情可以产生更大的加速效果，但他们不知道如何找到这些事情。 - Mike Dunlavey

11个回答

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- johnnycrash · Answer 1

以下是我的内容。Atoi 是我能想到的最快的方法。我使用 msvc 2010 进行了编译，所以可能可以将两个模板结合起来。在 msvc 2010 中，当我结合模板时，会使得提供 cb 参数的情况变慢。

Atoi 处理了几乎所有特殊的 atoi 情况，并且与这种方法一样快甚至更快。

int val = 0;
while( *str ) 
    val = val*10 + (*str++ - '0');

这是代码：

#define EQ1(a,a1) (BYTE(a) == BYTE(a1))
#define EQ1(a,a1,a2) (BYTE(a) == BYTE(a1) && EQ1(a,a2))
#define EQ1(a,a1,a2,a3) (BYTE(a) == BYTE(a1) && EQ1(a,a2,a3))

// Atoi is 4x faster than atoi.  There is also an overload that takes a cb argument.
template <typename T> 
T Atoi(LPCSTR sz) {
    T n = 0;
    bool fNeg = false;  // for unsigned T, this is removed by optimizer
    const BYTE* p = (const BYTE*)sz;
    BYTE ch;
    // test for most exceptions in the leading chars.  Most of the time
    // this test is skipped.  Note we skip over leading zeros to avoid the 
    // useless math in the second loop.  We expect leading 0 to be the most 
    // likely case, so we test it first, however the cpu might reorder that.
    for ( ; (ch=*p-'1') >= 9 ; ++p) { // unsigned trick for range compare
      // ignore leading 0's, spaces, and '+'
      if (EQ1(ch, '0'-'1', ' '-'1', '+'-'1'))
        continue;
      // for unsigned T this is removed by optimizer
      if (!((T)-1 > 0) && ch==BYTE('-'-'1')) {
        fNeg = !fNeg;
        continue;
      }
      // atoi ignores these.  Remove this code for a small perf increase.
      if (BYTE(*p-9) > 4)  // \t, \n, 11, 12, \r. unsigned trick for range compare
        break;
    }
    // deal with rest of digits, stop loop on non digit.
    for ( ; (ch=*p-'0') <= 9 ; ++p) // unsigned trick for range compare
      n = n*10 + ch; 
    // for unsigned T, (fNeg) test is removed by optimizer
    return (fNeg) ? -n : n;
}

// you could go with a single template that took a cb argument, but I could not
// get the optimizer to create good code when both the cb and !cb case were combined.
// above code contains the comments.
template <typename T>
T Atoi(LPCSTR sz, BYTE cb) {
    T n = 0;
    bool fNeg = false; 
    const BYTE* p = (const BYTE*)sz;
    const BYTE* p1 = p + cb;
    BYTE ch;
    for ( ; p<p1 && (ch=*p-'1') >= 9 ; ++p) {
      if (EQ1(ch,BYTE('0'-'1'),BYTE(' '-'1'),BYTE('+'-'1')))
        continue;
      if (!((T)-1 > 0) && ch == BYTE('-'-'1')) {
        fNeg = !fNeg;
        continue;
      }
      if (BYTE(*p-9) > 4)  // \t, \n, 11, 12, \r
        break;
    }
    for ( ; p<p1 && (ch=*p-'0') <= 9 ; ++p)
      n = n*10 + ch; 
    return (fNeg) ? -n : n;
}