partition
比 split
更快,因为它在第一个匹配后就停止检查。
使用带有 index
的常规 slice
比正则表达式 slice
更快。
当匹配前字符串的部分变得更大时,正则表达式切片也会明显减慢。在 ~ 10 个字符后它比原始分割更慢,然后从那时起变得更糟。如果您有一个没有 +
或 *
匹配的正则表达式,我认为它会好一些。
require 'benchmark'
n=1000000
def bench n,email
printf "\n%s %s times\n", email, n
Benchmark.bm do |x|
x.report('split ') do n.times{ email.split('@')[0] } end
x.report('partition') do n.times{ email.partition('@').first } end
x.report('slice reg') do n.times{ email[/[^@]+/] } end
x.report('slice ind') do n.times{ email[0,email.index('@')] } end
end
end
bench n, 'a@be.pl'
bench n, 'some_name@regulardomain.com'
bench n, 'some_really_long_long_email_name@regulardomain.com'
bench n, 'some_name@rediculously-extra-long-silly-domain.com'
bench n, 'some_really_long_long_email_name@rediculously-extra-long-silly-domain.com'
bench n, 'a'*254 + '@' + 'b'*253
bench n, 'a'*1000 + '@' + 'b'*1000
结果 1.9.3p484:
a@be.pl 1000000 times
user system total real
split 0.405000 0.000000 0.405000 ( 0.410023)
partition 0.375000 0.000000 0.375000 ( 0.368021)
slice reg 0.359000 0.000000 0.359000 ( 0.357020)
slice ind 0.312000 0.000000 0.312000 ( 0.309018)
some_name@regulardomain.com 1000000 times
user system total real
split 0.421000 0.000000 0.421000 ( 0.432025)
partition 0.374000 0.000000 0.374000 ( 0.379021)
slice reg 0.421000 0.000000 0.421000 ( 0.411024)
slice ind 0.312000 0.000000 0.312000 ( 0.315018)
some_really_long_long_email_name@regulardomain.com 1000000 times
user system total real
split 0.593000 0.000000 0.593000 ( 0.589034)
partition 0.531000 0.000000 0.531000 ( 0.529030)
slice reg 0.764000 0.000000 0.764000 ( 0.771044)
slice ind 0.484000 0.000000 0.484000 ( 0.478027)
some_name@rediculously-extra-long-silly-domain.com 1000000 times
user system total real
split 0.483000 0.000000 0.483000 ( 0.481028)
partition 0.390000 0.016000 0.406000 ( 0.404023)
slice reg 0.406000 0.000000 0.406000 ( 0.411024)
slice ind 0.312000 0.000000 0.312000 ( 0.344020)
some_really_long_long_email_name@rediculously-extra-long-silly-domain.com 1000000 times
user system total real
split 0.639000 0.000000 0.639000 ( 0.646037)
partition 0.609000 0.000000 0.609000 ( 0.596034)
slice reg 0.764000 0.000000 0.764000 ( 0.773044)
slice ind 0.499000 0.000000 0.499000 ( 0.491028)
a<254>@b<253> 1000000 times
user system total real
split 0.952000 0.000000 0.952000 ( 0.960055)
partition 0.733000 0.000000 0.733000 ( 0.731042)
slice reg 3.432000 0.000000 3.432000 ( 3.429196)
slice ind 0.624000 0.000000 0.624000 ( 0.625036)
a<1000>@b<1000> 1000000 times
user system total real
split 1.888000 0.000000 1.888000 ( 1.892108)
partition 1.170000 0.016000 1.186000 ( 1.188068)
slice reg 12.885000 0.000000 12.885000 ( 12.914739)
slice ind 1.108000 0.000000 1.108000 ( 1.097063)
2.1.3p242的差异大约相同,但在所有方面都比它快10-30%,除了正则表达式拆分之外,它会更慢。