Capturing text inside square brackets with a regular expression.

Question

3

I saw the question here, Regex to capture {}, which is similar to what I want, but I can't get it to work.

My data is:

[Honda] Japanese manufacturer [VTEC] Name of electronic lift control

I would like the output to be:

[Honda], [VTEC]

My expression is:

m = re.match('(\[[^\[\]]*\])', '[Honda] Japanese manufacturer [VTEC] Name of electronic lift control')

I expected:

m.group(0) to output [Honda]
m.group(1) to output [VTEC]

However, both output [Honda]. How can I access the second match?

- A G

3 Answers

2

You can use re.findall to get all the matches. You get a list back, and you don't need the capture group:

m = re.findall(r'\[[^\[\]]*\]', '[Honda] Japanese manufacturer [VTEC] Name of electronic lift control')

This gives ['[Honda]', '[VTEC]'], and you can access each element with:

print(m[0])
# => [Honda]

print(m[1])
# => [VTEC]
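
To get exactly the "[Honda], [VTEC]" string asked for in the question, the list can be joined afterwards. The following is only a minimal sketch; the names `pattern`, `matches` and `joined` are made up for illustration, and re.finditer is used instead of re.findall merely to show that match objects are available as well:

```python
import re

s = '[Honda] Japanese manufacturer [VTEC] Name of electronic lift control'

# Compile once; the raw string keeps the backslashes intact for the regex engine.
pattern = re.compile(r'\[[^\[\]]*\]')

# finditer yields one match object per non-overlapping match, left to right.
matches = [m.group(0) for m in pattern.finditer(s)]

# Join the bracketed pieces into the single string the question asks for.
joined = ', '.join(matches)
print(joined)  # [Honda], [VTEC]
```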

- Jerry

0

If you are considering options other than re:

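s = "[Honda] Japanese manufacturer [VTEC] Name of electronic lift control"
result = []
temp_str = ""
inside = False  # True while we are between '[' and ']'
for ch in s:
    if ch == '[':
        # Opening bracket: start collecting a fresh piece of text
        inside = True
        temp_str = ""
    elif ch == ']' and inside:
        # Closing bracket: store the collected text right away, so a
        # bracket at the very end of the string is not lost
        result.append(temp_str)
        inside = False
    elif inside:
        temp_str += ch

print(result)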

Output:

['Honda', 'VTEC']
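
The same regex-free idea can also be sketched with str.find instead of a character-by-character loop. The helper name `extract_bracketed` is hypothetical, and the sketch assumes brackets are not nested (an inner '[' would simply end up in the extracted text):

```python
def extract_bracketed(text):
    """Return the substrings found between each '[' and the next ']'."""
    found = []
    start = text.find('[')
    while start != -1:
        end = text.find(']', start + 1)
        if end == -1:
            break  # unmatched '[': nothing more to extract
        found.append(text[start + 1:end])
        # Continue scanning after the closing bracket
        start = text.find('[', end + 1)
    return found


s = "[Honda] Japanese manufacturer [VTEC] Name of electronic lift control"
print(extract_bracketed(s))  # ['Honda', 'VTEC']
```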

- venpa

- Martijn Pieters · Accepted Answer

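Your expression contains only one group, so that is the only group you get. Group 1 is the capturing group and group 0 is the entire matched text; in your expression they are the same. If you omit the (...) parentheses, you will only have group 0.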
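
A small sketch of the group numbering described above, using the pattern from the question. The variable names are invented for illustration. Note also that re.match only ever returns a single match, anchored at the start of the string, so it never reaches [VTEC] on its own:

```python
import re

s = '[Honda] Japanese manufacturer [VTEC] Name of electronic lift control'

# One pair of parentheses: group 0 (whole match) and group 1 are identical.
m = re.match(r'(\[[^\[\]]*\])', s)
whole, first_group = m.group(0), m.group(1)
print(whole, first_group)  # [Honda] [Honda]

# No parentheses: only group 0 exists, so asking for group 1 raises IndexError.
m2 = re.match(r'\[[^\[\]]*\]', s)
try:
    m2.group(1)
    has_group_1 = True
except IndexError:
    has_group_1 = False
print(has_group_1)  # False
```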

If you want all the matches, use re.findall(). It returns a list of the matched groups (or of group 0, if your expression has no capture groups):

>>> import re
>>> re.findall(r'\[[^\[\]]*\]', '[Honda] Japanese manufacturer [VTEC] Name of electronic lift control')
['[Honda]', '[VTEC]']
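
As a hedged illustration of the "list of matched groups" behaviour: if a capture group is placed inside the brackets, re.findall returns only the group contents, which is one way to drop the brackets themselves. The names `with_brackets` and `inner_only` are just for this sketch:

```python
import re

s = '[Honda] Japanese manufacturer [VTEC] Name of electronic lift control'

# No capture group: findall returns the whole match (group 0) each time.
with_brackets = re.findall(r'\[[^\[\]]*\]', s)
print(with_brackets)  # ['[Honda]', '[VTEC]']

# One capture group inside the brackets: findall returns group 1 each time.
inner_only = re.findall(r'\[([^\[\]]*)\]', s)
print(inner_only)  # ['Honda', 'VTEC']
```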