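I want to split a text into sentences. A sentence ends with . (period), ? or !, followed by one or more whitespace characters, and the next sentence starts with an uppercase letter.
For example:
First sentence. Second sentence!
How can I do that?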
You can split on a regular expression that matches the whitespace, using a lookbehind to check for the sentence terminator:
string[] sentences = Regex.Split(input, @"(?<=[\.!\?])\s+");
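This splits on the whitespace characters and keeps each terminator attached to its sentence, because the lookbehind matches without consuming it.
Example: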
string input = "First sentence. Second sentence! Third sentence? Yes.";
string[] sentences = Regex.Split(input, @"(?<=[\.!\?])\s+");
foreach (string sentence in sentences) {
    Console.WriteLine(sentence);
}
Output:
First sentence.
Second sentence!
Third sentence?
Yes.
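The pattern above does not check the question's last condition, that the next sentence starts with an uppercase letter. A minimal sketch of one way to add it, assuming a lookahead for an uppercase letter (\p{Lu}) is acceptable; the names SentenceSplitter and Split are hypothetical and not part of the original answer:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SentenceSplitter
{
    // Hypothetical helper: splits on whitespace that follows '.', '!' or '?'
    // and is immediately followed by an uppercase letter.
    public static string[] Split(string input)
    {
        // (?<=[.!?])  lookbehind: a terminator precedes the whitespace (not consumed)
        // \s+         the whitespace that is removed by the split
        // (?=\p{Lu})  lookahead: the next character is an uppercase letter
        return Regex.Split(input, @"(?<=[.!?])\s+(?=\p{Lu})");
    }
}
```

With this sketch, SentenceSplitter.Split("First sentence. Second sentence! is this one? Yes.") should not split before the lowercase "is", so the middle element stays "Second sentence! is this one?". Abbreviations followed by a capitalized word (for example "Mr. Smith") would still be split.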
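// Note: String.Split removes the separator characters themselves.
char[] separators = new char[] {'!', '.', '?'};
string[] sentences1 = "First sentence. Second sentence!".Split(separators);
// or, using the params overload:
string[] sentences2 = "First sentence. Second sentence!".Split('!', '.', '?');
// Both yield "First sentence", " Second sentence" and a trailing empty string.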