Capturing group vs. non-capturing group

Question

11

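I tried to measure the performance of capturing and non-capturing groups in a regular expression.

The difference between the capturing group and the non-capturing group turned out to be very small.

Is this result normal?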

[root@Sensor ~]# ll -h sample.log
-rw-r--r-- 1 root root 21M Oct 20 23:01 sample.log

[root@Sensor ~]# time grep -ciP '(get|post).*' sample.log
20000

real    0m0.083s
user    0m0.070s
sys     0m0.010s

[root@Sensor ~]# time grep -ciP '(?:get|post).*' sample.log
20000

real    0m0.083s
user    0m0.077s
sys     0m0.004s

- Mr.kang

A non-capturing group takes slightly less time than a capturing group because no text is saved in a buffer. - Wiktor Stribiżew

1

If you want to save time, drop the .*, since it always matches and you are not capturing it. - Andy Lester
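
As a rough illustration of the two comments above, here is a minimal Python sketch. It is an assumption-laden stand-in: Python's `re` module is not the PCRE engine behind `grep -P`, and the three sample lines are hypothetical, not taken from sample.log. It only shows that the capturing group stores the matched text while the non-capturing group stores nothing, and that dropping the trailing `.*` does not change which lines are counted.

```python
import re

# Hypothetical sample lines standing in for sample.log.
lines = [
    "GET /index.html HTTP/1.1",
    "post /login HTTP/1.1",
    "HEAD /favicon.ico HTTP/1.1",
]

capturing = re.compile(r"(get|post).*", re.IGNORECASE)
non_capturing = re.compile(r"(?:get|post).*", re.IGNORECASE)
no_tail = re.compile(r"get|post", re.IGNORECASE)  # trailing .* removed

# A capturing group stores the matched text; a non-capturing group stores nothing.
saved = capturing.search(lines[0]).groups()
unsaved = non_capturing.search(lines[0]).groups()

# Counting matching lines, as grep -c does, gives the same number for all three.
counts = [
    sum(1 for line in lines if pattern.search(line))
    for pattern in (capturing, non_capturing, no_tail)
]
print(saved, unsaved, counts)
```

When only a line count is needed, the shortest pattern does the least work for the same answer.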

2 Answers

4

The difference seems to be larger when more capturing groups are used.

Thanks, everyone. :)

[root@Sensor ~]# time grep -ciP "(get|post)\s[^\s]+" sample.log
20000

real    0m0.057s
user    0m0.051s
sys     0m0.005s
[root@Sensor ~]# time grep -ciP "(?:get|post)\s[^\s]+" sample.log
20000

real    0m0.061s
user    0m0.053s
sys     0m0.006s
[root@Sensor ~]# time grep -ciP "(get|post)\s[^\s]+(get|post)" sample.log
1880

real    0m0.839s
user    0m0.833s
sys     0m0.005s
[root@Sensor ~]# time grep -ciP "(?:get|post)\s[^\s]+(?:get|post)" sample.log
1880

real    0m0.744s
user    0m0.741s
sys     0m0.003s

- Mr.kang

- Pi Marillion · Accepted Answer

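In general, non-capturing groups perform better than capturing groups, because they require less memory allocation and do not make a copy of the group match. However, there are three important caveats:

For simple, short expressions with short matches, the difference is typically very small.
Starting a program like grep takes a significant amount of time and memory by itself, which may overwhelm any small improvement gained from using non-capturing groups.
Some languages implement capturing and non-capturing groups in the same way, so the latter give no performance improvement.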
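
One way to take process start-up out of the measurement is to time the patterns inside a single running process. The following is a minimal sketch under stated assumptions: it uses Python's `re` and `timeit` rather than `grep -P`, the log lines are made up, and `count_matches` is a hypothetical helper. The timings it prints depend on the machine and on the regex engine, so they say nothing definitive about PCRE.

```python
import re
import timeit

# Hypothetical log lines; the original sample.log is not available.
lines = ["GET /index.html HTTP/1.1", "POST /login HTTP/1.1"] * 5000

patterns = {
    "capturing": re.compile(r"(get|post)\s[^\s]+", re.IGNORECASE),
    "non-capturing": re.compile(r"(?:get|post)\s[^\s]+", re.IGNORECASE),
}


def count_matches(pattern, data):
    """Count lines containing a match, similar to grep -c."""
    return sum(1 for line in data if pattern.search(line))


results = {}
for name, pattern in patterns.items():
    # Keep the best of several repeats to reduce noise from other processes.
    best = min(
        timeit.repeat(lambda: count_matches(pattern, lines), number=5, repeat=3)
    )
    results[name] = (count_matches(pattern, lines), best)
    print(f"{name}: {results[name][0]} matches, best of 3: {best:.4f}s for 5 passes")
```

Because both patterns are compiled once and run in the same process, any remaining gap reflects the matching work itself rather than program start-up.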