String.replace() does not replace all occurrences

3

I have a very long string that looks something like this:

355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399,....

When I try to remove the number 382 from the string with the following code:
String str = "355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399,....";
str = str.replace(",382,", ",");

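it seems that not all occurrences are replaced. The string originally contained more than 3000 occurrences, and about 630 are still left after the replacement.

Is String.replace() limited in what it can do? If so, is there another way to achieve what I need?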

6 Answers

3

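I think the problem is the first argument you pass to replace(), specifically the commas (,) before and after 382. If you have "382,382,383", you will only match the inner ",382," and leave the first 382 behind. Try this instead:

str = str.replace("382,", "");

However, this will not match a "382" at the very end of the string, because it has no comma after it.
A complete solution may need two method calls:
str = str.replace("382", "");        // Remove all instances of 382
str = str.replaceAll(",,+", ",");    // Collapse runs of two or more commas into one

This combines the two approaches:
str = str.replaceAll("382,?", "");   // Remove 382 and an optional comma after it

Note: if 382 is at the end, both of the last two approaches leave a trailing comma.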
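To make the trailing-comma caveat concrete, here is a minimal sketch. The class name, the method names, and the sample input are made up for illustration, and stripTrailingComma is a hypothetical cleanup step that is not part of the answer above.

```java
class TrailingCommaDemo {
    // The combined approach: remove every "382" plus an optional comma after it.
    static String removeWithOptionalComma(String csv) {
        return csv.replaceAll("382,?", "");
    }

    // Hypothetical cleanup: drop one comma left dangling at the end, if any.
    static String stripTrailingComma(String csv) {
        return csv.endsWith(",") ? csv.substring(0, csv.length() - 1) : csv;
    }

    public static void main(String[] args) {
        String input = "380,382,382,383,382";          // sample data, 382 is the last element
        String removed = removeWithOptionalComma(input);
        System.out.println(removed);                     // prints "380,383,"
        System.out.println(stripTrailingComma(removed)); // prints "380,383"
    }
}
```

Like the answer itself, this assumes the data holds only three-digit numbers, so "382" never appears inside a longer number.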

1
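How could that happen 630 times? And you would end up with two commas next to each other. - Zarwan
We would need to see the data, but my guess is that there are about 630 cases where 382 is adjacent to itself. Your point about consecutive commas is correct. I have adjusted my answer accordingly. - dave
2
replace() does not use regular expressions; it does a plain-text search. - Bohemian
What about numbers like 324, 343, 1382, 340? This would also match 1382. - Lizzy
1
@Lizzie That is true, but the data in the original post contains only three-digit numbers. - dave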

3
You also need to replace the trailing comma, if there is one (there won't be if 382 is the last item in the list):
str = str.replaceAll("\\b382,?", "");

Note the \b word boundary, which prevents matching the 382 inside "-,1382,-".

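The above converts:

382,111,382,1382,222,382

to:

111,1382,222,

The final comma remains because the last 382 has no comma after it to consume, so strip it separately if it matters.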
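As a rough illustration of what the word boundary buys, the sketch below runs the pattern with and without \b on the sample list. The class and method names are hypothetical and exist only for this comparison.

```java
class WordBoundaryDemo {
    // Without a boundary, "382,?" also matches the tail of longer numbers such as 1382.
    static String removeUnanchored(String csv) {
        return csv.replaceAll("382,?", "");
    }

    // With \b in front, 382 only matches where a number starts.
    static String removeWithBoundary(String csv) {
        return csv.replaceAll("\\b382,?", "");
    }

    public static void main(String[] args) {
        String input = "382,111,382,1382,222,382";
        System.out.println(removeUnanchored(input));   // prints "111,1222," (1382 was damaged)
        System.out.println(removeWithBoundary(input)); // prints "111,1382,222,"
    }
}
```

A leading \b alone does not protect numbers that merely begin with 382: "3820" would still lose its first three digits. If such values can occur, a second \b after 382 is probably needed as well.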

1
Try this.
str = str.replaceAll(",382,", ",");

2
Why would this be any better? With that argument there is no difference between replace and replaceAll. - resueman

1
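First, drop the leading comma from your search string. Then remove the duplicate commas by using a Java regular expression to replace each run of commas with a single comma.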
 String input = "355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399";
    String result = input.replace("382,", ","); // search without the leading comma; each "382," becomes ","
    String result2 = result.replaceAll("[,]+", ","); // collapse runs of commas into a single comma

    System.out.println(result2);

1
As dave said, the problem is that the occurrences of your pattern overlap. In the string "...,382,382,..." there are two occurrences of ",382,":
"...,382,382,..."
    -----         first occurrence
        -----     second occurrence

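These two occurrences overlap at the comma, so Java can replace only one of them. While it is looking for occurrences, it does not consider what you are replacing the pattern with, so it does not see the new occurrence of ",382," that is created when the first occurrence is replaced with a comma.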
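The overlap is easy to reproduce on a short input. The following is a minimal sketch with made-up names and sample data; it applies the single replace() call from the question and counts how many 382 entries survive.

```java
class OverlapDemo {
    // Counts the elements of a comma-separated list that equal target.
    static int countElements(String csv, String target) {
        int count = 0;
        for (String element : csv.split(",")) {
            if (element.equals(target)) {
                count++;
            }
        }
        return count;
    }

    // The single-pass replacement from the question.
    static String singlePass(String csv) {
        return csv.replace(",382,", ",");
    }

    public static void main(String[] args) {
        String input = "381,382,382,382,383";   // three adjacent 382 entries
        String output = singlePass(input);
        System.out.println(output);                        // prints "381,382,383"
        System.out.println(countElements(output, "382"));  // prints 1
    }
}
```

The first and third 382 are removed, but the middle one shares both of its commas with its neighbours, so it is never matched.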
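If your data does not contain numbers with more than three digits, you can do this:
str = str.replace("382,", "");

and then handle an occurrence at the end as a special case. But if your data may contain larger numbers, "...,1382,..." will be replaced with "...,1,...", which is probably not what you want.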

Here are two solutions that do not have the problems above.

First, simply repeat the replacement until nothing changes any more:

String oldString = str;
str = str.replace(",382,", ",");
while (!str.equals(oldString)) {
    oldString = str;
    str = str.replace(",382,", ",");
}

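After this, you still have to deal with a possible occurrence at the very beginning or end of the string, where 382 is not surrounded by commas on both sides.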
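One possible way to cover the ends of the string, sketched here as an assumption rather than as part of the answer, is to wrap the list in sentinel commas before looping and to strip them afterwards. The class name, method name, and sample input are hypothetical.

```java
class RepeatedReplaceDemo {
    // Removes every element equal to target by repeating the plain-text
    // replacement until the string stops changing.
    static String removeAll(String csv, String target) {
        String pattern = "," + target + ",";
        // Sentinel commas let the pattern match at the start and end too.
        String current = "," + csv + ",";
        String previous;
        do {
            previous = current;
            current = current.replace(pattern, ",");
        } while (!current.equals(previous));
        if (current.equals(",")) {
            return "";   // every element was removed
        }
        // Strip the two sentinel commas again.
        return current.substring(1, current.length() - 1);
    }

    public static void main(String[] args) {
        System.out.println(removeAll("382,380,382,382,381,382", "382")); // prints "380,381"
        System.out.println(removeAll("111,1382,222", "382"));            // prints "111,1382,222"
    }
}
```

Because the search string always includes both commas, longer numbers such as 1382 are left alone.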
Second, if you are using Java 8, you can do more of the work yourself and use Java streams:
str = Arrays.stream(str.split(","))
    .filter(s -> !s.equals("382"))
    .collect(Collectors.joining(","));

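This first splits the string at ",", then filters out every piece equal to "382", and finally joins the remaining pieces with "," again. (Neither code snippet has been tested.)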
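For reference, a self-contained version of the stream approach might look like the sketch below. The wrapper class, the method name, and the sample input are illustrative; the snippet itself needs the two imports shown.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class StreamFilterDemo {
    // Splits on commas, drops every element equal to target, and rejoins.
    static String removeElement(String csv, String target) {
        return Arrays.stream(csv.split(","))
                .filter(element -> !element.equals(target))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        String input = "382,111,382,1382,222,382";
        System.out.println(removeElement(input, "382")); // prints "111,1382,222"
    }
}
```

Since whole elements are compared with equals, this handles 382 at the start, at the end, and next to itself, and it never touches 1382.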

0

The old-fashioned way:

    String str = ",abc,null,null,0,0,7,8,9,10,11,12,13,14";
    String newStr = "", word = "";
    for (int i=0; i<str.length(); i++) {
        if (str.charAt(i) == ',') {
            if (word.equals("null") || word.equals("0"))
                word = "";
            newStr += word+",";
            word = "";
        } else {
            word += str.charAt(i);
            if (i == str.length()-1)
                newStr += word;
        }
    }
    System.out.println(newStr);

Output: ,abc,,,,,7,8,9,10,11,12,13,14

