Match everything before a specific word in a multiline string

Question

4

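I'm trying to use a regular expression to strip some junk text from a string, but I can't get it to work. I'm no regex expert (far from it), and I've searched for similar examples, but none of them seems to solve my problem.

I need a regex that matches everything from the beginning of the string up to a specific word, but not the word itself.

Here's an example: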

<p>This is the string I want to process with as you can see also contains HTML tags like <i>this</i> and <strong>this</strong></p>
<p>I want to remove everything in the string BEFORE the word "giraffe" (but not "giraffe" itself and keep everything after it.</p>

So how do I match everything in the string that comes before the word "giraffe"?

Thanks!

- Joakim Megert

5 Answers

4

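Why use a regular expression at all?

string s = "blagiraffe";
// Keep everything from the first "giraffe" onward (IndexOf returns -1 if the word is absent).
s = s.Substring(s.IndexOf("giraffe"));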
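One caveat with the plain-string approach: `IndexOf` returns -1 when the word is missing, and `Substring(-1)` throws `ArgumentOutOfRangeException`. A minimal sketch of a guarded version follows; the class and method names (`PrefixTrimmer`, `TrimBefore`) are hypothetical, and returning the input unchanged when the word is absent is an assumption about the desired behavior, not something the question specifies.

```csharp
using System;

public static class PrefixTrimmer
{
    // Returns the part of `input` that starts at the first occurrence of `word`.
    // Assumption: if `word` does not occur, the input is returned unchanged.
    public static string TrimBefore(string input, string word)
    {
        // Ordinal comparison: exact, culture-independent, case-sensitive match.
        int index = input.IndexOf(word, StringComparison.Ordinal);
        return index < 0 ? input : input.Substring(index);
    }
}
```

For example, `PrefixTrimmer.TrimBefore("blagiraffe", "giraffe")` returns `"giraffe"`. Because no regex is involved, line breaks in the input need no special handling.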

- Jaster

1

Try this:

    var s =
         @"<p>This is the string I want to process with as you can see also contains HTML tags like <i>this</i> and <strong>this</strong></p>
         <p>I want to remove everything in the string BEFORE the word ""giraffe"" (but not ""giraffe"" itself and keep everything after it.</p>";
    var ex = new Regex("giraffe.*$", RegexOptions.Multiline);
    Console.WriteLine(ex.Match(s).Value);

This snippet produces the following output:

giraffe" (but not "giraffe" itself and keep everything after it.</p>
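The output above ends at the closing `</p>` because the word sits on the last line of the sample. With `RegexOptions.Multiline`, `$` matches at the end of each line and `.` still stops at `\n`, so the match only runs to the end of the line containing the word. If text on later lines should be kept too, `RegexOptions.Singleline` is one option. The sketch below contrasts the two; the class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class TailMatcher
{
    // Multiline: '$' matches at the end of every line and '.' does not match '\n',
    // so the match stops at the end of the line that contains the word.
    public static string ToEndOfLine(string input) =>
        Regex.Match(input, "giraffe.*$", RegexOptions.Multiline).Value;

    // Singleline: '.' also matches '\n', so the greedy '.*' runs to the end of the input.
    public static string ToEndOfInput(string input) =>
        Regex.Match(input, "giraffe.*$", RegexOptions.Singleline).Value;
}
```

Given `"a giraffe b\nc d"`, `ToEndOfLine` returns `"giraffe b"` while `ToEndOfInput` returns `"giraffe b\nc d"`. Note that in .NET `.` does match `\r`, so with Windows line endings the Multiline result would carry a trailing `\r`.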

- Sergey Kalinichenko

0

A lookahead does the trick:

^.*(?=\s+giraffe)

- dorsh

0

You can use a pattern with a lookahead, like this:

^.*?(?=giraffe)
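Applied to the question's two-line input, the lookahead patterns need one more ingredient: by default `.` does not match `\n`, so `^.*?` cannot reach a word that sits on a later line. A minimal sketch of how the lazy lookahead might be used with `RegexOptions.Singleline` follows. The names `LookaheadTrimmer` and `RemovePrefix` are hypothetical, and swapping `^` for `\A` is a choice made here so the anchor stays at the start of the input regardless of options.

```csharp
using System.Text.RegularExpressions;

public static class LookaheadTrimmer
{
    // \A anchors at the very start of the input.
    // '.*?' lazily consumes characters (newlines too, under Singleline)
    // until the lookahead sees "giraffe" immediately ahead.
    private static readonly Regex Prefix =
        new Regex(@"\A.*?(?=giraffe)", RegexOptions.Singleline);

    // Removes everything before the first "giraffe"; the word itself is kept
    // because a lookahead does not consume what it matches.
    public static string RemovePrefix(string input) => Prefix.Replace(input, "");
}
```

`LookaheadTrimmer.RemovePrefix("line one\nsee the giraffe run")` returns `"giraffe run"`. When the word is absent the pattern does not match and the input comes back unchanged. The lazy `.*?` stops at the first occurrence; a greedy `.*` would instead run to the last one.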

- Sam Greenhalgh


- Tim Pietzcker · Accepted Answer

resultString = Regex.Replace(subjectString, 
    @"\A             # Start of string
    (?:              # Match...
     (?!""giraffe"") #  (unless we're at the start of the string ""giraffe"")
    .                #  any character (including newlines)
    )*               # zero or more times", 
    "", RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace);

This should work.
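Note that the pattern as written looks for the quoted `"giraffe"`, which is how the word appears in the sample text. Below is a hedged sketch of the same idea (a negative lookahead checked before every consumed character) generalized to an arbitrary word. The names `TemperedTrimmer` and `RemoveBefore` are hypothetical, and using `Regex.Escape` is an addition made here so that words containing regex metacharacters are treated literally.

```csharp
using System.Text.RegularExpressions;

public static class TemperedTrimmer
{
    // Builds \A(?:(?!word).)* : from the start of the input, consume one
    // character at a time for as long as `word` does not begin at the
    // current position.
    public static string RemoveBefore(string input, string word)
    {
        string pattern = @"\A(?:(?!" + Regex.Escape(word) + ").)*";
        // Singleline lets '.' match newlines, so the prefix may span lines.
        return Regex.Replace(input, pattern, "", RegexOptions.Singleline);
    }
}
```

`TemperedTrimmer.RemoveBefore("a\nb giraffe c", "giraffe")` returns `"giraffe c"`. One behavior to be aware of: if the word never occurs, the pattern consumes the entire input, so the result is an empty string rather than the original text.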