Multiline regular expression

4

I'm new to regular expressions, and I have the following string:

ELEMENTS'"MCMCU","MCSTYL","MCDC","MCLDM","MCCO","MCAN8","MCAN8O","MCCNTY","MCADDS","MCFMOD","MCDL01","MCDL02","MCDL03","MCDL04","MCRP01","MCRP02","MCRP03","MCRP04","MCRP05","MCRP06","MCRP07","MCRP08","MCRP09","MCRP10","MCRP11","MCRP12","MCRP13","MCRP14\
","MCRP15","MCRP16","MCRP17","MCRP18","MCRP19","MCRP20","MCRP21","MCRP22","MCRP23","MCRP24","MCRP25","MCRP26","MCRP27","MCRP28","MCRP29","MCRP30","MCTA","MCTXJS","MCTXA1","MCEXR1","MCTC01","MCTC02","MCTC03","MCTC04","MCTC05","MCTC06","MCTC07","MCTC08","\
MCTC09","MCTC10","MCND01","MCND02","MCND03","MCND04","MCND05","MCND06","MCND07","MCND08","MCND09","MCND10","MCCC01","MCCC02","MCCC03","MCCC04","MCCC05","MCCC06","MCCC07","MCCC08","MCCC09","MCCC10","MCPECC","MCALS","MCISS","MCGLBA","MCALCL","MCLMTH","MCL\
F","MCOBJ1","MCOBJ2","MCOBJ3","MCSUB1","MCTOU","MCSBLI","MCANPA","MCCT","MCCERT","MCMCUS","MCBTYP","MCPC","MCPCA","MCPCC","MCINTA","MCINTL","MCD1J","MCD2J","MCD3J","MCD4J","MCD5J","MCD6J","MCFPDJ","MCCAC","MCPAC","MCEEO","MCERC","MCUSER","MCPID","MCUPMJ\
","MCJOBN","MCUPMT","MCBPTP","MCAPSB","MCTSBU"'

I want to extract "text1","text2",...,"textn". I tried:

Pattern p = Pattern.compile("^ELEMENTS\\s'\".*\"'$",Pattern.MULTILINE);
Matcher m = p.matcher(s);

But it only works for single-line strings, not for multi-line ones.


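Why do you have \\s after ELEMENTS when there doesn't seem to be a space/tab/etc. there? Also, you probably want to drop the multiline mode entirely, since it seems to do the opposite of what you're trying to do. - Jerry
@Jerry, there is a space in the original string, and I was confused about what Multiline does. - Mikou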
1 Answer

6

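Caveat: Pattern.MULTILINE is not what you think it is. If you want to match content in input that spans several lines, you need Pattern.DOTALL: it tells the regex engine that the dot should also match line terminators, which it does not do by default. (Negated character classes such as [^"] already match newlines without any flag.)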
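As a rough illustration of the difference, here is a minimal sketch. It assumes a shortened, made-up stand-in for the question's input (SAMPLE) and adds one capture group to the question's pattern so the quoted list can be returned; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotallDemo {
    // Hypothetical, shortened stand-in for the question's string; note the embedded newline.
    static final String SAMPLE = "ELEMENTS '\"MCMCU\",\"MCSTYL\",\n\"MCDC\",\"MCLDM\"'";

    // The question's pattern plus a capture group (an addition for this sketch)
    // around the quoted list. Returns the captured list, or null if nothing matches.
    static String extract(String input, int flags) {
        Matcher m = Pattern.compile("^ELEMENTS\\s'(\".*\")'$", flags).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // With MULTILINE the dot still stops at the newline, so nothing matches.
        System.out.println(extract(SAMPLE, Pattern.MULTILINE)); // null
        // With DOTALL the dot crosses the newline and the whole list is captured.
        System.out.println(extract(SAMPLE, Pattern.DOTALL));
    }
}
```

The captured text still contains the line break itself, so any line-continuation characters in the real input would need to be stripped in a separate step.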

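What Pattern.MULTILINE does is change the behavior of the ^ and $ anchors: in addition to matching at the beginning and end of the input (their default behavior), ^ also matches right after a line terminator and $ right before one.

For example, given the following input:

Hello\nworld\n

you get this:

 Hello \n world \n
|                    # `^` without Pattern.MULTILINE
               |  |  # `$` without Pattern.MULTILINE
|        |           # `^` with Pattern.MULTILINE
      |        |  |  # `$` with Pattern.MULTILINE

Two Java details show up here: even without MULTILINE, $ also matches just before a line terminator that ends the input, and with MULTILINE, ^ does not match at the very end of the input, even after a trailing newline.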
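The anchor positions can be checked directly. The following is a small sketch (class and method names are made up for illustration) that lists every index at which a zero-width pattern matches:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorPositions {
    // Returns the start index of every match of the regex in the input.
    static List<Integer> positions(String regex, int flags, String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        while (m.find()) {
            result.add(m.start());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "Hello\nworld\n"; // 12 characters, indices 0..11
        System.out.println(positions("^", 0, input));                  // [0]
        System.out.println(positions("$", 0, input));                  // [11, 12]
        System.out.println(positions("^", Pattern.MULTILINE, input));  // [0, 6]
        System.out.println(positions("$", Pattern.MULTILINE, input));  // [5, 11, 12]
    }
}
```

Note that java.util.regex does not match ^ at the very end of the input, even after a trailing newline, and that the default $ also matches just before a final line terminator.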

Yes, the name MULTILINE is confusing. So are the /m and /s modifiers in Perl-like regex engines.
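In Java the same two modes can also be switched on inside the pattern itself, with the embedded flags (?m) for MULTILINE and (?s) for DOTALL. A brief sketch, with hypothetical helper names:

```java
import java.util.regex.Pattern;

class InlineFlags {
    // True if the regex matches the entire input.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if the regex matches somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "a\nb";
        System.out.println(fullMatch("a.b", input));     // false: dot stops at the newline
        System.out.println(fullMatch("(?s)a.b", input)); // true: (?s) behaves like DOTALL
        System.out.println(found("^b$", input));         // false: ^ only matches at index 0
        System.out.println(found("(?m)^b$", input));     // true: (?m) behaves like MULTILINE
    }
}
```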

