Regex: capturing matched curly braces

Question

php regex

6

I want to capture matching curly braces.

For example:

Some example text with \added[author]{text with curly braces{some text}..}

Some example text with \added[author]{text without curly braces}

Some example text with \added[author]{text with {}and {} and {}curly braces{some text}..}

Some example text with \added[author]{text with {}and {} and {}curly braces{some text}..} and extented text with curly braces {}

Expected output:

Some example text with text with curly braces{some text}..

Some example text with text without curly braces

Some example text with text with {}and {} and {}curly braces{some text}..

Some example text with text with {}and {} and {}curly braces{some text}.. and extented text with curly braces {}

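In other words, I want to capture the text between \added[]{ and its corresponding closing brace }. My regex is the problem: I don't know how to capture the text between the matching braces.

I tried the following: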

       "/\\\\added\\[.*?\\]{(.[^{]*?)}/s"

I know it ignores any { inside the text, but I don't know how to write a regex that matches paired braces.
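
The limitation described above can be reproduced with a small sketch. It assumes the attempted pattern is used with preg_replace and '$1' as the replacement, which the question does not state; the variable names are only illustrative.

```php
<?php
// The pattern from the question, written as a PHP double-quoted string.
$attempt = "/\\\\added\\[.*?\\]{(.[^{]*?)}/s";

// Two of the sample inputs from the question.
$flat   = 'Some example text with \added[author]{text without curly braces}';
$nested = 'Some example text with \added[author]{text with curly braces{some text}..}';

// Assumption: '$1' is the intended replacement (not stated in the question).
$flatResult   = preg_replace($attempt, '$1', $flat);
$nestedResult = preg_replace($attempt, '$1', $nested);

echo $flatResult, "\n";    // the wrapper is removed
echo $nestedResult, "\n";  // unchanged: [^{]*? cannot step over the inner {
```

With no inner braces the wrapper is stripped, but as soon as the body contains a { the pattern finds no match at all and the string comes back untouched.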

- Learning

Please help answer my question at http://stackoverflow.com/questions/33841196/how-to-match-text-inside-starting-and-closing-curly-brace-the-tags-and-the-spec. Thanks! - WebICT By Leo

4 Answers

2

To match paired braces, you need to use a recursive subpattern.

Example:

$regex = <<<'REGEX'
/
\\added\[.*?\]                # Initial \added[author]

(                             # Group to be recursed on.
    {                         # Opening brace.

    (                         # Group for use in replacement.

        ((?>[^{}]+)|(?1))*    # Any number of substrings which can be either:
                              # - a sequence of non-braces, or
                              # - a recursive match on the first capturing group.
    )

    }                         # Closing brace.
)
/xs
REGEX;

$strings = [
    'Some example text with \added[author]{text with curly braces{some text}..}',
    'Some example text with \added[author]{text without curly braces}',
    'Some example text with \added[author]{text with {}and {} and {}curly braces{some text}..}',
    'Some example text with \added[author]{text with {}and {} and {}curly braces{some text}..} and extented text with curly braces {}'
];

foreach ($strings as $string) {
    echo preg_replace($regex, '$2', $string), "\n";
}

Output:

Some example text with text with curly braces{some text}..
Some example text with text without curly braces
Some example text with text with {}and {} and {}curly braces{some text}..
Some example text with text with {}and {} and {}curly braces{some text}.. and extented text with curly braces {}
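
As a side illustration that is not part of the answer's own code, the recursion idea can be seen in isolation with a much smaller pattern. This sketch uses (?R), which recurses into the whole pattern rather than into a numbered group, and a made-up sample string.

```php
<?php
// Minimal balanced-brace matcher, separate from the answer's regex:
//   \{            an opening brace
//   [^{}]++       a run of non-brace characters (possessive)
//   (?R)          or a recursive match of the whole pattern, i.e. a nested {...}
//   \}            the closing brace that pairs with the opener
$balanced = '/\{(?:[^{}]++|(?R))*\}/';

$sample = 'a {b {c} d} e {f} g';   // hypothetical input

preg_match_all($balanced, $sample, $matches);

print_r($matches[0]);
```

This prints the two top-level groups, {b {c} d} and {f}; the nested {c} is consumed by the recursive call instead of ending the outer match early.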

- user3942918

Great! Thank you very much. Could you explain this regex? - Learning

1

Use the following regex:

\\\\added\\[[^\\]]*][^\\{]*{((?:(?:[^\\{\\}]*\\{[^\\}\\{]*\\})*||[^\\}]*)*)}

- nkit

If the input is

Some example text with \added[author]{text with {}and {} and {}curly braces{some text}..} and extented text with curly braces {}

then the output should be: Some example text with text with {}and {} and {}curly braces{some text}.. and extented text with curly braces {}. The problem is that it captures up to the last closing brace rather than the matching one. - Learning

0

Use this regex.

/\\added[^]]*]{([^}]*}[^}]*)}/s

Demo here

- Shrinivas Shukla

Thanks! It only works for the example I gave... but the goal of the regex is to match paired braces. Please see my updated post for more examples. - Learning

- fronthem · Accepted Answer

This should work.

/\\added\[.*\]\{(.*(?:.*\{.*\}.*)*)\}/gU

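Explanation

\\added is the LaTeX tag,

\[.*\] is the tag's option,

\{ is the opening brace,

(.*(?:.*\{.*\}.*)*) is the captured text; here we also guard against nested {...} or multiple {...} inside the target tag,

\} is the closing brace.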

Strategy

I don't treat the pairs of braces as nested (recursive).

{ { {...} } }
c b a   a b c

Here we have the pairs a, b and c,

but I think of them like this instead!

{ { {...} } }   
a b c   a b c

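See: demo

The last two examples in my demo also show that it works.

Important: the U modifier must be used here so that the quantifiers are non-greedy; otherwise my regex will not work properly.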
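
For checking what any of the patterns in this thread should return, the nested pairing in the first diagram (c b a, then a b c) can also be followed with a plain depth counter instead of a regex. This is only a sketch of that alternative technique, not the method of any answer here: unwrapAdded is a hypothetical helper name, and it assumes the [...] option never contains a ] character.

```php
<?php
// Hypothetical helper: removes each \added[...]{...} wrapper and keeps the body,
// pairing the opening brace with its closing brace by counting depth.
function unwrapAdded(string $text): string
{
    $out = '';
    $pos = 0;
    $len = strlen($text);

    // Find the next "\added[...]{" opener, starting the search at $pos.
    while (preg_match('/\\\\added\[[^\]]*\]\{/', $text, $m, PREG_OFFSET_CAPTURE, $pos)) {
        $start     = $m[0][1];                    // offset of the backslash
        $bodyStart = $start + strlen($m[0][0]);   // first character after the opening {
        $depth     = 1;
        $i         = $bodyStart;

        // Walk forward until the brace that brings the depth back to zero.
        while ($i < $len && $depth > 0) {
            if ($text[$i] === '{') {
                $depth++;
            } elseif ($text[$i] === '}') {
                $depth--;
            }
            $i++;
        }

        if ($depth !== 0) {
            break;   // unbalanced input: leave the remainder untouched
        }

        // Text before the tag, then the body without its outer braces.
        $out .= substr($text, $pos, $start - $pos)
              . substr($text, $bodyStart, $i - 1 - $bodyStart);
        $pos = $i;
    }

    return $out . substr($text, $pos);
}

echo unwrapAdded('Some example text with \added[author]{text with curly braces{some text}..}'), "\n";
```

Run on the first sample from the question, this prints the expected line, Some example text with text with curly braces{some text}.., because the inner } only lowers the depth from 2 to 1 and does not end the body.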