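I'm working on a new Java project and am therefore reading through the existing code. In a very important part of the code I found the following regular expressions, and I can't work out what they do. Can anyone explain them in plain English?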
1)
[^,]*|.+(,).+
2)
(\()?\d+(?(1)\))
Next time you need a regular expression explained, you can use the following explain.pl service by Rick Measham:
Regex: [^,]*|.+(,).+

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  [^,]*                    any character except: ',' (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  .+                       any character except \n (1 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    ,                        ','
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  .+                       any character except \n (1 or more times
                           (matching the most amount possible))

Regex: (\()?\d+(?(1)\))

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (                        group and capture to \1 (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \(                       '('
--------------------------------------------------------------------------------
  )?                       end of \1 (NOTE: because you're using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \1)
--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  (?(1)                    if back-reference \1 matched, then:
--------------------------------------------------------------------------------
    \)                       ')'
--------------------------------------------------------------------------------
   |                        else:
--------------------------------------------------------------------------------
                             succeed
--------------------------------------------------------------------------------
  )                        end of conditional on \1

Java does not support conditionals! An equivalent regex for the second pattern that avoids the conditional might be:
\d+|\(\d+\)
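That is, one or more digits, either with or without surrounding parentheses.
Conditionals are supported by the JGsoft engine, Perl, PCRE, and the .NET framework.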
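To see the lack of conditional support first-hand, a minimal sketch like the one below can be used. It simply asks `java.util.regex` to compile both versions of the second pattern and reports whether each is accepted. The class and method names (`ConditionalSupportCheck`, `compiles`) are made up for illustration and do not come from the original code.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper, for illustration only.
final class ConditionalSupportCheck {

    // Returns true if java.util.regex accepts the pattern, false if it rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String conditional = "(\\()?\\d+(?(1)\\))"; // original pattern, uses (?(1)...)
        String alternation = "\\d+|\\(\\d+\\)";     // rewrite without the conditional

        System.out.println("conditional compiles: " + compiles(conditional));
        System.out.println("alternation compiles: " + compiles(alternation));
    }
}
```

If the original pattern really is compiled somewhere in the Java code base, it should fail with a `PatternSyntaxException` at that point, so it is worth checking whether that code path is ever reached.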
Here is a test harness for the first pattern.
    import java.util.regex.*;
    //...

    Pattern p = Pattern.compile("[^,]*|.+(,).+");
    String[] tests = {
        "",       // [] is a match with no commas
        "abc",    // [abc] is a match with no commas
        ",abc",   // [,abc] is not a match
        "abc,",   // [abc,] is not a match
        "ab,c",   // [ab,c] is a match with separating comma
        "ab,c,",  // [ab,c,] is a match with separating comma
        ",",      // [,] is not a match
        ",,",     // [,,] is not a match
        ",,,",    // [,,,] is a match with separating comma
    };
    for (String test : tests) {
        Matcher m = p.matcher(test);
        System.out.format("[%s] is %s %n", test,
            !m.matches() ? "not a match"
            : m.group(1) != null
                ? "a match with separating comma"
                : "a match with no commas"
        );
    }

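Group \1 can be used to tell the two cases apart.

Here is a similar test harness for the second pattern, rewritten without the conditional (which Java does not support):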
    Pattern p = Pattern.compile("\\d+|(\\()\\d+\\)");
    String[] tests = {
        "",       // [] is not a match
        "0",      // [0] is a match without parentheses
        "(0)",    // [(0)] is a match with surrounding parentheses
        "007",    // [007] is a match without parentheses
        "(007)",  // [(007)] is a match with surrounding parentheses
        "(007",   // [(007] is not a match
        "007)",   // [007)] is not a match
        "-1",     // [-1] is not a match
    };
    for (String test : tests) {
        Matcher m = p.matcher(test);
        System.out.format("[%s] is %s %n", test,
            !m.matches() ? "not a match"
            : m.group(1) != null
                ? "a match with surrounding parentheses"
                : "a match without parentheses"
        );
    }

As stated earlier, this matches one or more digits, optionally surrounded by parentheses (group \1 distinguishes the two cases).
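Note that the harnesses rely on `Matcher.matches()`, which requires the whole input to match. If the existing code calls `find()` instead, the first pattern behaves quite differently, because `[^,]*` is happy to match an empty string at the very start of any input. The sketch below illustrates the difference; the class and method names are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class MatchesVersusFind {
    static final Pattern FIRST = Pattern.compile("[^,]*|.+(,).+");

    // True only if the entire input matches the pattern.
    static boolean wholeMatch(String input) {
        return FIRST.matcher(input).matches();
    }

    // Returns the first substring located by find(), or null if there is none.
    static String firstFound(String input) {
        Matcher m = FIRST.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        for (String input : new String[] {",abc", "abc,", "ab,c"}) {
            System.out.format("[%s] matches()=%b, find() gives [%s]%n",
                input, wholeMatch(input), firstFound(input));
        }
    }
}
```

So when reading the existing code, it is worth checking which of the two methods is actually called on the matcher before relying on the explanations here.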
1)
[^,]* means any number of characters that are not a comma
.+(,).+ means 1 or more characters followed by a comma followed by 1 or more characters
| means either the first one or the second one
2)
(\()? means zero or one '(' note* backslash is to escape '('
\d+ means 1 or more digits
(?(1)\)) means if back-reference \1 matched, then ')' note* no else is given
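Note that parentheses are used to capture parts of the match, but they do not capture if they are escaped with a backslash; an escaped parenthesis matches a literal '(' or ')'.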
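A small sketch of the capturing versus escaped parentheses distinction, using `Matcher.groupCount()` to count the capturing groups in a pattern. The class and method names are hypothetical, and the third pattern is an extra example that does not appear in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class EscapedParens {

    // Number of capturing groups declared in the pattern.
    static int groups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Contents of group 1 after a whole-string match, or null if there is no match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(groups("\\(\\d+\\)"));    // only escaped parentheses
        System.out.println(groups("(\\()?\\d+"));    // one capturing group around a literal '('
        System.out.println(firstGroup("\\((\\d+)\\)", "(42)")); // captures the digits only
    }
}
```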
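The (?(1)...) construct is an if-then-else conditional, which Java does not support. See my answer. - polygenelubricants

1) Anything that does not start with a comma, or something that has a comma in it.
2) Any number that ends with 1 and is inside parentheses, possibly with the parenthesis closed before the number and opened again.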
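1) [^,]* does not mean "anything that does not start with a comma"; that would be [^,].* (newlines aside). Instead, [^,]* means "anything that does not contain a comma".
2) As pointed out in the other answers, the (?(1)...) part is a conditional expression (which Java lacks), not a group containing the character 1. See http://www.regular-expressions.info/conditional.html. - Christian Semrau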
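The difference between [^,]* and [^,].* described in point 1 can be checked with a few whole-string matches. This is only a sketch; the class and method names and the sample inputs are chosen for illustration.

```java
// Hypothetical demo class, for illustration only.
final class NoCommaVersusNoLeadingComma {

    // "[^,]*": the whole input contains no comma at all (may be empty).
    static boolean containsNoComma(String input) {
        return input.matches("[^,]*");
    }

    // "[^,].*": the input starts with a non-comma character;
    // the rest may contain commas, but not newlines.
    static boolean startsWithNonComma(String input) {
        return input.matches("[^,].*");
    }

    public static void main(String[] args) {
        for (String input : new String[] {"", "abc", "a,b", ",ab", "a\nb"}) {
            System.out.format("[%s] no comma: %b, non-comma start: %b%n",
                input.replace("\n", "\\n"),
                containsNoComma(input), startsWithNonComma(input));
        }
    }
}
```

The last input shows the newline caveat: [^,] matches a newline character, while . does not unless the DOTALL flag is set.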