java.lang.StackOverflowError when using String.replaceAll in Java

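I need to replace the word 'OR' with '||' in a given string. It should be replaced only when it appears as a whole word in the input string, and it must not be replaced when it appears inside quotes. For example, if the input string is: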

application.path="EXCEL.exe" OR application.path="EXCELSIOR.exe" OR application.path="XYZ OR ABC.exe"

The output should be:

application.path="EXCEL.exe" || application.path="EXCELSIOR.exe" || application.path="XYZ OR ABC.exe"

Note that the OR in EXCELSIOR.exe and the OR in "XYZ OR ABC.exe" are not replaced.
Here is the Java code I am using:
String inputStr = "(quote.AGE was 24 AND (application.path = \"**\\acad.exe\" OR application.path = \"**\\dxfdwg.exe\" OR application.path = \"**\\EXCELSIOR.EXE\" OR application.path = \"**\\iges.exe\" OR application.path = \"**\\notepad.exe\" OR application.path = \"**\\run_journal.exe\" OR application.path = \"**\\AcroRd32.exe\" OR application.path = \"**\\dllhost.exe\" OR application.path = \"**\\powerpnt.exe\" OR application.path = \"**\\Edge.exe\" OR application.path = \"**\\step203ug.exe\" OR application.path = \"**\\step214ug.exe\" OR application.path = \"**\\VisView.exe\" OR application.path = \"**\\Teamcenter.exe\" OR application.path = \"**\\ug_convert_part.exe\" OR application.path = \"**\\ugraf.exe\" OR application.path = \"**\\ugtopv.exe\" OR application.path = \"**\\wmplayer.exe\" OR application.path = \"**\\winword.exe\" OR application.path = \"**\\wordpad.exe\" OR application.path = \"**\\vlc.exe\" OR application.path = \"**\\dwgviewr.exe\" OR application.name = \"RMS\" OR application.path = \"**\\acrobat.exe\" OR application.path = \"**\\Alias.exe\" OR application.path = \"**\\awtessd.exe\" OR application.path = \"**\\proe.exe\" OR application.path = \"**\\STPViewer.exe\" OR application.path = \"**\\gom_inspect.exe\" OR application.path = \"**\\gom_cad_server2.exe\" OR application.path = \"**\\sldworks.exe\" OR application.path = \"**\\sldworks_fs.exe\" OR application.path = \"**\\sldProcMon.exe\" OR application.path = \"**\\AdapplicationMgr.exe\" OR application.path = \"**\\AdapplicationMgrSvc.exe\" OR application.path = \"**\\SE3Dtrans.exe\" OR application.path = \"**\\stamp.exe\" OR application.path = \"**\\psolid.exe\" OR application.path = \"**\\mpid.exe\" OR application.path = \"**\\mpirun.exe\" OR application.path = \"**\\FS.exe\" OR application.path = \"**\\xtop.exe\" OR application.path = \"**\\pro_comm_msg.exe\" OR application.path = \"**\\nmsd.exe\" OR application.path = \"**\\creoagent.exe\" OR application.path = \"**\\parametric.exe\" 
OR application.path = \"**\\PDFEditor.exe\" OR application.path = \"**\\CNEXT.exe\" OR application.path = \"**\\drafter.exe\" OR application.path = \"**\\convert.exe\" OR application.path = \"**\\ActCut3D.exe\" OR application.path = \"**\\ppcbasic.exe\" OR application.path = \"**\\deltamesh_stamping.exe\" OR application.path = \"Xasfsf\" OR application.path = \"sfdsdf\"))";
String replacedStr = inputStr.replaceAll("(?m)\\bOR\\b(?=(?:\"[^\"]*\"|[^\"])*$)", "||");

This works for shorter strings, but once the length exceeds 2000 characters it throws the following error:
Exception in thread "main" java.lang.StackOverflowError
    at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3796)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4604)
    at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
    at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
    at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
    at java.util.regex.Pattern$BranchConn.match(Pattern.java:4568)
    at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4604)
I have read in other posts (thread1, thread2) that Java does not handle regular expressions well on long strings. Can someone suggest how to improve my regex to avoid the StackOverflowError?

Would you use String.replaceAll()? - John
1 Answer


Can someone suggest how to improve my regex to avoid the StackOverflowError?

Yes, I can offer two solutions; you just need to look at your problem from a different angle.

Here is a quick analysis of your problem and a quick solution. You can use this regex instead: (.*?\"\s+)\bOR\b(\s+application.*?)

Solution 1

String inputStr = "..."; // placeholder for that long String from the question
String regex = "(.*?\"\\s+)\\bOR\\b(\\s+application.*?)";
String replacedStr = inputStr.replaceAll(regex, "$1||$2");

System.out.println(replacedStr);

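I noticed that every OR you want to replace comes right after a closing quote plus whitespace and is followed by whitespace and the word application. My regex matches exactly that OR and replaces it.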

For the short example it produces the output below, and the long example is handled the same way:

application.path="EXCEL.exe" || application.path="EXCELSIOR.exe" || application.path="XYZ OR ABC.exe"
                             ^^                                  ^^
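
To make Solution 1 easy to try, here is a minimal self-contained sketch that applies the same regex to the short example. The class name ReplaceOrDemo and the method replaceOr are made up for illustration. The pattern still relies on the assumption stated above: each OR to replace sits between a closing quote and the word application.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex itself comes from Solution 1.
public class ReplaceOrDemo {
    // A closing quote plus whitespace, the whole word OR,
    // then whitespace followed by "application".
    private static final Pattern OR_BETWEEN_CLAUSES =
            Pattern.compile("(.*?\"\\s+)\\bOR\\b(\\s+application.*?)");

    // Replaces each matched OR with ||, keeping the captured text on both sides.
    public static String replaceOr(String input) {
        return OR_BETWEEN_CLAUSES.matcher(input).replaceAll("$1||$2");
    }

    public static void main(String[] args) {
        String input = "application.path=\"EXCEL.exe\" OR application.path=\"EXCELSIOR.exe\""
                + " OR application.path=\"XYZ OR ABC.exe\"";
        // The OR inside "XYZ OR ABC.exe" is not preceded by a closing quote,
        // so it is left alone.
        System.out.println(replaceOr(input));
    }
}
```

If an OR can appear in other positions (for example before a different keyword than application), this pattern will miss it, so check it against your real inputs.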

Solution 2

If you are using Java 9+, you can use the regex application.path=(\"(.*?)\") to match everything of the form application.path="something here" and join the results with ||.

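import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

String regex = "application.path=(\"(.*?)\")";
String text = Pattern.compile(regex)
        .matcher(inputStr).results().map(MatchResult::group) // results() needs Java 9+
        .collect(Collectors.joining(" || "));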
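
Here is a minimal runnable sketch of Solution 2, with a hypothetical class JoinClausesDemo and method joinClauses added only so it can be executed. Two things are worth knowing before using it. The regex as written expects no spaces around the = sign, while the long string in the question has application.path = "...", so you would probably need something like \s*=\s* there (an assumption, not tested against the full input). Also, anything the regex does not match is dropped from the result instead of being copied through.

```java
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; only the regex and the stream pipeline come from Solution 2.
public class JoinClausesDemo {
    private static final Pattern CLAUSE =
            Pattern.compile("application.path=(\"(.*?)\")");

    // Collects every application.path="..." clause and joins them with " || ".
    public static String joinClauses(String input) {
        return CLAUSE.matcher(input)
                .results()                  // Java 9+: a stream of MatchResult
                .map(MatchResult::group)    // the full text of each match
                .collect(Collectors.joining(" || "));
    }

    public static void main(String[] args) {
        String input = "application.path=\"EXCEL.exe\" OR application.path=\"EXCELSIOR.exe\""
                + " OR application.path=\"XYZ OR ABC.exe\"";
        System.out.println(joinClauses(input));

        // Text outside the matched clauses (the prefix and the parentheses) is lost.
        String wrapped = "quote.AGE was 24 AND (application.path=\"a.exe\" OR application.path=\"b.exe\")";
        System.out.println(joinClauses(wrapped));
    }
}
```

This rebuilds the expression from the matched clauses; it does not edit the original string in place. That is fine for the short example but changes the long one, which also contains quote.AGE, AND and application.name.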

Thanks to @YCF_L for the solution. - Santy
