Java regex (?i) vs Pattern.CASE_INSENSITIVE

Question


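I am using "\\b(\\w+)(\\W+\\1\\b)+" together with input = input.replaceAll(regex, "$1"); to find repeated words in a string and remove the duplicates. For example, the input string "for for for" becomes "for".

However, it fails to turn "Hello hello" into "Hello", even though I have used Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

I can fix it by using "(?i)\\b(\\w+)(\\W+\\1\\b)+", but I want to know why this is necessary. Why do I have to use the (?i) flag when I have already specified Pattern.CASE_INSENSITIVE?

Here is the full code: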

import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateWords {

    public static void main(String[] args) {

        String regex = "\\b(\\w+)(\\W+\\1\\b)+";
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

        Scanner in = new Scanner(System.in);
        int numSentences = Integer.parseInt(in.nextLine());

        while (numSentences-- > 0) {
            String input = in.nextLine();

            Matcher m = p.matcher(input);

            // Check for subsequences of input that match the compiled pattern
            while (m.find()) {
                input = input.replaceAll(regex, "$1");
            }

            // Prints the modified sentence.
            System.out.println(input);
        }
        in.close();
    }
}

- Paddy

1 Answer


- anubhava · Accepted Answer

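Your problem is that you define the regex with the CASE_INSENSITIVE flag but never use the compiled pattern for the replacement. input.replaceAll(regex, "$1") is String.replaceAll, which compiles the regex string again with no flags, so the flag set on Pattern p has no effect there.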
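The difference between the two replaceAll methods can be seen in a small sketch. The class name FlagDemo and the helper names viaString and viaMatcher are made up for illustration; only the regex and the flag come from the question.

```java
import java.util.regex.Pattern;

public class FlagDemo {

    // The regex from the question, with no inline flags.
    static final String REGEX = "\\b(\\w+)(\\W+\\1\\b)+";

    // String.replaceAll compiles REGEX with default (case-sensitive) flags.
    static String viaString(String input) {
        return input.replaceAll(REGEX, "$1");
    }

    // Matcher.replaceAll reuses the Pattern, including its CASE_INSENSITIVE flag.
    static String viaMatcher(String input) {
        Pattern p = Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE);
        return p.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(viaString("Hello hello"));  // prints "Hello hello"
        System.out.println(viaMatcher("Hello hello")); // prints "Hello"
    }
}
```

Both helpers reduce "for for for" to "for"; they only differ once the repeated words differ in case.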

You can also use (?i) in the middle of the regex so that only the back-reference \1 is matched case-insensitively, like this:

String repl = "Hello hello".replaceAll("\\b(\\w+)(\\W+(?i:\\1)\\b)+", "$1");
//=> Hello

Then use Matcher.replaceAll on the compiled pattern instead of String.replaceAll.
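As a rough illustration of how inline flags behave, the sketch below compares the scoped form (?i:...) with a leading (?i). Because the flag is part of the regex string itself, it also applies when the string is passed to String.replaceAll. The class and method names are hypothetical, and the "a(?i:b)c" pattern is an extra example that is not part of the original answer.

```java
import java.util.regex.Pattern;

public class InlineFlagDemo {

    // Scoped flag: only the back-reference is compared case-insensitively.
    static String dedupeScoped(String input) {
        return input.replaceAll("\\b(\\w+)(\\W+(?i:\\1)\\b)+", "$1");
    }

    // Leading flag: everything after (?i) is case-insensitive.
    static String dedupeGlobal(String input) {
        return input.replaceAll("(?i)\\b(\\w+)(\\W+\\1\\b)+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(dedupeScoped("Hello hello"));        // prints "Hello"
        System.out.println(dedupeGlobal("Hello hello HELLO"));  // prints "Hello"

        // The scoped group affects only the "b"; "a" and "c" stay case-sensitive.
        System.out.println(Pattern.matches("a(?i:b)c", "aBc")); // prints "true"
        System.out.println(Pattern.matches("a(?i:b)c", "ABc")); // prints "false"
    }
}
```

For the duplicate-word regex the two forms give the same result, since \w and \W do not depend on letter case; the scoped form just limits the flag to the part that needs it.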

Working code:

import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateWords {

    public static void main(String[] args) {

        String regex = "\\b(\\w+)(\\W+(?i:\\1)\\b)+";
        Pattern p = Pattern.compile(regex);

        // OR this one also works
        // String regex = "\\b(\\w+)(\\W+\\1\\b)+";
        // Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

        Scanner in = new Scanner(System.in);
        int numSentences = Integer.parseInt(in.nextLine());

        while (numSentences-- > 0) {
            String input = in.nextLine();

            Matcher m = p.matcher(input);

            // Check for subsequences of input that match the compiled pattern
            if (m.find()) {
                input = m.replaceAll("$1");
            }

            // Prints the modified sentence.
            System.out.println(input);
        }
        in.close();
    }
}