What does this Perl regular expression m/(.*?):(.*?)$/g mean?

5

I'm editing a Perl file, but I don't understand this regex comparison. Could someone please explain it to me?

if ($lines =~ m/(.*?):(.*?)$/g) { } ..

What is happening here? $lines is a line from a text file.

- perlnewb

It looks like the first (.*?) will always match the empty string. - Ivan Nevostruev

1

Not always. It will match all the characters before the first colon. - CanSpice

5 Answers

10

There is a tool that helps with understanding regular expressions: YAPE::Regex::Explain.

Ignoring the g modifier, which is not needed here:

use strict;
use warnings;
use YAPE::Regex::Explain;

my $re = qr/(.*?):(.*?)$/;
print YAPE::Regex::Explain->new($re)->explain();

__END__

The regular expression:

(?-imsx:(.*?):(.*?)$)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  :                        ':'
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

See perldoc perlre for details.

- toolic

3

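This was written by someone who either knows too much about regular expressions or not enough about the $' and $` variables.

It could also have been written as: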

if ($lines =~ /:/) {
    ... # use $` ($PREMATCH)  instead of $1
    ... # use $' ($POSTMATCH) instead of $2
}

or

if ( ($var1,$var2) = split /:/, $lines, 2 and defined($var2) ) {
    ... # use $var1, $var2 instead of $1,$2
}
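
A minimal sketch of the split variant above, wrapped in a hypothetical helper named split_first_colon (the name and the sample strings are illustrative, not from the answer). It shows how a limit of 2 keeps everything after the first colon in one piece and how a line without a colon is rejected:

```perl
use strict;
use warnings;

# Hypothetical helper: split a line at its first colon.
# Returns (left, right), or an empty list when the line has no colon.
sub split_first_colon {
    my ($line) = @_;
    # LIMIT of 2 means at most two fields, so later colons stay in $right.
    my ($left, $right) = split /:/, $line, 2;
    return defined $right ? ($left, $right) : ();
}

for my $line ("abc:def:ghi", "abc:", "no colon here") {
    my @parts = split_first_colon($line);
    if (@parts) {
        print "'$line' -> left='$parts[0]' right='$parts[1]'\n";
    }
    else {
        print "'$line' -> no colon, skipped\n";
    }
}
```

Note that split does not strip a trailing newline the way the regex's $ anchor effectively does, so a line read from a file would normally be chomped first.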

- mob

1

If you want to use /:/, use the /p flag from Perl 5.10 along with the ${^PREMATCH} and ${^POSTMATCH} variables. I'd rather use split, though, since that's what is really happening. - brian d foy
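
A short sketch of the /p approach mentioned in the comment, assuming Perl 5.10 or later. The helper name around_first_colon and the sample string are made up for illustration:

```perl
use strict;
use warnings;
use 5.010;  # the /p flag and ${^PREMATCH}/${^POSTMATCH} need Perl 5.10+

# Hypothetical helper: return the text before and after the first colon,
# or an empty list if there is no colon.
sub around_first_colon {
    my ($line) = @_;
    if ($line =~ /:/p) {
        return (${^PREMATCH}, ${^POSTMATCH});
    }
    return;
}

my ($before, $after) = around_first_colon("abc:def:ghi");
print "before=[$before] after=[$after]\n";   # before=[abc] after=[def:ghi]
```

One difference from the original regex: ${^POSTMATCH} includes a trailing newline if the line still has one, whereas $2 from (.*?)$ does not, so chomp the line first if that matters.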

2

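(.*?) captures any characters, but as few of them as possible.

So it looks for a pattern like <something>:<somethingelse><end of line>, and if there are multiple : in the string, the first one will be used as the divider between <something> and <somethingelse>.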
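
To illustrate why the first colon becomes the divider, here is a small comparison of the non-greedy pattern from the question with a greedy variant. The helper names split_lazy and split_greedy are hypothetical, and the greedy pattern is shown only for contrast:

```perl
use strict;
use warnings;

# Hypothetical helper: the pattern from the question (without /g).
# Returns ($1, $2), or an empty list if the line does not match.
sub split_lazy {
    my ($line) = @_;
    return $line =~ m/(.*?):(.*?)$/ ? ($1, $2) : ();
}

# Same pattern with a greedy first group, for comparison only.
sub split_greedy {
    my ($line) = @_;
    return $line =~ m/(.*):(.*?)$/ ? ($1, $2) : ();
}

my @lazy   = split_lazy("abc:def:ghi");    # divides at the first colon
my @greedy = split_greedy("abc:def:ghi");  # divides at the last colon
print "lazy:   [$lazy[0]] [$lazy[1]]\n";     # lazy:   [abc] [def:ghi]
print "greedy: [$greedy[0]] [$greedy[1]]\n"; # greedy: [abc:def] [ghi]
```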

- Amber

2

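That line performs a regular expression match on $lines against the regex m/(.*?):(.*?)$/g. It will effectively return true if a match can be found in $lines and false if one cannot be found. The =~ operator is explained as follows:

Binary "=~" binds a scalar expression to a pattern match. Certain operations search or modify the string $_ by default. This operator makes that kind of operation work on some other string. The right argument is a search pattern, substitution, or transliteration. The left argument is what is supposed to be searched, substituted, or transliterated instead of the default $_. When used in scalar context, the return value generally indicates the success of the operation.

The regex itself is: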

m/    #Perform a "match" operation
(.*?) #Match zero or more repetitions of any characters, but match as few as possible (ungreedy)
:     #Match a literal colon character
(.*?) #Match zero or more repetitions of any characters, but match as few as possible (ungreedy)
$     #Match the end of string
/g    #Perform the regex globally (find all occurrences in $lines)

So if $lines matches that regex, execution enters the conditional block; otherwise the condition is false and the block is skipped.
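
One caveat about the /g modifier is worth a sketch. In the boolean (scalar) context of an if, /g does not collect every occurrence; each evaluation looks for the next match starting at pos() of the string, and a failed match resets that position. The helper matches_with_g and the sample string are assumptions for illustration only:

```perl
use strict;
use warnings;

# Hypothetical helper: apply the question's pattern with /g in scalar
# (boolean) context, the same way the original `if` does. The line is
# passed by reference so the match position stays attached to the
# caller's variable.
sub matches_with_g {
    my ($line_ref) = @_;
    return $$line_ref =~ m/(.*?):(.*?)$/g ? 1 : 0;
}

my $lines  = "key:value";
my $first  = matches_with_g(\$lines);  # succeeds; pos($lines) is now at the end
my $second = matches_with_g(\$lines);  # resumes at pos($lines), finds no colon, fails
my $third  = matches_with_g(\$lines);  # the failure reset pos($lines), so this succeeds
print "$first $second $third\n";       # prints "1 0 1"
```

If the same variable is only tested once per line read, as seems to be the case in the question, the /g has no visible effect, which is why it can be dropped.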

- eldarerathis


- CanSpice · Accepted Answer

Breaking it down into its parts:

$lines =~ m/ (.*?)      # Match any character (except newlines)
                        # zero or more times, not greedily, and
                        # stick the results in $1.
             :          # Match a colon.
             (.*?)      # Match any character (except newlines)
                        # zero or more times, not greedily, and
                        # stick the results in $2.
             $          # Match the end of the line.
           /gx;

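This regex will match strings like ":" (it matches zero characters, then a colon, then zero characters before the end of the line; $1 and $2 are empty strings), or "abc:" ($1 = "abc", $2 is an empty string), or "abc:def:ghi" ($1 = "abc" and $2 = "def:ghi").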

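If you pass in a line that doesn't match (which appears to happen only when the string contains no colon), the code inside the braces is not run. If the line does match, the code inside the braces can use the special $1 and $2 variables (at least until the next regular expression shows up, if there is one inside the braces).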
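
The cases above can be checked with a small sketch. The helper name describe and its output format are hypothetical; the pattern is the one from the question, minus the /g:

```perl
use strict;
use warnings;

# Hypothetical helper: returns "[$1|$2]" for a matching line and
# "no match" otherwise.
sub describe {
    my ($line) = @_;
    if ($line =~ m/(.*?):(.*?)$/) {
        # Copy the captures right away; a later successful match
        # would overwrite $1 and $2.
        my ($key, $value) = ($1, $2);
        return "[$key|$value]";
    }
    return "no match";
}

print describe(":"), "\n";              # [|]
print describe("abc:"), "\n";           # [abc|]
print describe("abc:def:ghi"), "\n";    # [abc|def:ghi]
print describe("abc:def\n"), "\n";      # [abc|def]  ($ matches before the trailing newline)
print describe("no colon here"), "\n";  # no match
```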
