I have:
$ ruby -v
ruby 2.0.0p648 (2015-12-16 revision 53162) [universal.x86_64-darwin16]
Suppose you have a sequence of integers 1..n. A Ruby beginner might sum it like this:
$ ruby -e 's=0
for i in 1..500000
s+=i
end
puts s'
125000250000
Now suppose I have the same sequence coming from stdin:
$ seq 1 500000 | ruby -lne 'BEGIN{s=0}
s+=$_.to_i
END{puts s} '
125000250000
So far, so good. Now increase the final value from 500,000 to 5,000,000:
$ ruby -e 's=0
for i in 1..5000000
s+=i
end
puts s'
12500002500000 <=== CORRECT
$ seq 1 5000000 | ruby -lne 'BEGIN{s=0}
s+=$_.to_i
END{puts s} '
500009500025 <=== WRONG!
It produces a different sum. awk and perl both produce the correct result with the same sequence:
$ seq 1 5000000 | awk '{s+=$1} END{print s}'
12500002500000
$ seq 1 5000000 | perl -nle '$s+=$_; END{print $s}'
12500002500000
Why is Ruby's sum incorrect? I don't think this is an overflow problem, because the same input works fine in awk and perl.
Conclusion:
Thanks to David Aldridge for the diagnosis.
OS X and BSD seq switch to float output at 1,000,000, while GNU seq supports arbitrary precision integers. OS X seq is useless as a source of integers greater than 1,000,000. Example on OS X:
$ seq 999999 1000002
999999
1e+06
1e+06
1e+06
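To see how the float output leads to a wrong total, here is a minimal sketch that imitates the pipeline in pure Ruby. It assumes that `format('%g', n)` produces the same text that BSD seq prints (it does for the four values shown above). `simulated_to_i_sum` is a hypothetical helper name, not something from the original script.

```ruby
# Hypothetical helper: mimic summing BSD-style seq output with String#to_i.
# Assumption: format('%g', n) yields the same text that BSD seq prints.
def simulated_to_i_sum(range)
  range.reduce(0) do |sum, n|
    text = format('%g', n) # e.g. 1000001 becomes "1e+06"
    sum + text.to_i        # to_i stops at the "e", so "1e+06" counts as 1
  end
end

# Below 1,000,000 every value keeps all of its digits, so the sum is right.
puts simulated_to_i_sum(1..500_000)

# From 1,000,000 on, each value contributes only its leading digit.
puts simulated_to_i_sum(999_999..1_000_002)
```

Under that assumption, every line from 1,000,000 to 5,000,000 adds a single digit between 1 and 5 to the total instead of its real value, which is why the sum comes out far too small rather than overflowing.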
The Ruby method .to_i silently converts a partial string to an integer, and that was the 'bug' in this case. Example:
irb(main):002:0> '5e+06'.to_i #=> 5
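As a small illustration of the difference, the sketch below runs the same string through the three conversions discussed here. The variable name `text` is just for illustration.

```ruby
text = '5e+06'

# String#to_i reads leading digits and stops at the first character
# that cannot be part of an integer, without raising.
puts text.to_i        # 5

# String#to_f understands the exponent, so converting via a float works.
puts text.to_f.to_i   # 5000000

# Kernel#Integer is strict: anything that is not a whole integer literal
# raises ArgumentError instead of being truncated.
begin
  Integer(text)
rescue ArgumentError => e
  puts "Integer() refused it: #{e.message}"
end
```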
The 'correct' line in the script is either $_.to_f.to_i, which goes through a float, or Integer($_), which raises an ArgumentError on input like 5e+06 so the script does not fail silently. awk and perl parse 5e+06 as a float; Ruby does not do so implicitly:
$ echo '5e+06' | awk '{print $1+0}'
5000000
$ echo '5e+06' | ruby -lne 'print $_.to_i+0'
5
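The two fixes can be sketched as small helpers that sum any line-based input. `strict_sum` and `lenient_sum` are hypothetical names, and `StringIO` stands in for `$stdin` so the example runs on its own. Passing base 10 to `Integer` is an extra choice made here, so that a line such as 010 is not read as octal.

```ruby
require 'stringio'

# Strict variant: any line that is not a plain decimal integer raises
# ArgumentError, so bad input is noticed instead of being truncated.
def strict_sum(io)
  io.each_line.reduce(0) { |sum, line| sum + Integer(line.strip, 10) }
end

# Lenient variant: accept float notation such as "1e+06" and truncate.
# It cannot restore digits the producer already dropped.
def lenient_sum(io)
  io.each_line.reduce(0) { |sum, line| sum + Float(line.strip).to_i }
end

puts strict_sum(StringIO.new("1\n2\n3\n"))
puts lenient_sum(StringIO.new("999999\n1e+06\n"))

begin
  strict_sum(StringIO.new("999999\n1e+06\n"))
rescue ArgumentError => e
  puts "strict_sum stopped: #{e.message}"
end
```

In a real pipeline you would pass `$stdin` instead of a `StringIO`. Note that the lenient variant still gives a wrong total with BSD seq above 1,000,000, because 1000001 has already been printed as 1e+06 before Ruby sees it; the strict variant at least makes that visible.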
And thanks to Stefan Schüßler for opening a Ruby feature request regarding .to_i behavior.
(1..5000000).sum returns the result almost instantly. - steenslag