Split a string based on a regex without consuming characters

Question

5

I would like to split the following string:

text="one,two;three.four:"

into the list:

textOut=["one", ",two", ";three", ".four", ":"]

This is what I have tried:

import re
textOut = re.split(r'(?=[.:,;])', text)

But this does not split anything.

- XAnguera

Just a note on the wording: this is not really about "consuming characters", it is more about keeping the delimiters. - Kobi K

2 Answers

1

I would use re.findall here instead of re.split:

>>> from re import findall
>>> text = "one,two;three.four:"
>>> findall(r"(?:^|\W)\w*", text)
['one', ',two', ';three', '.four', ':']
>>>

Below is a breakdown of the regex pattern used above:

(?:      # The start of a non-capturing group
^|\W     # The start of the string or a non-word character (symbol)
)        # The end of the non-capturing group
\w*      # Zero or more word characters (characters that are not symbols)
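As a rough sketch, the same breakdown can be kept inside the pattern itself with re.VERBOSE. The names PIECE and split_keep_delims are made up for illustration and are not part of the original answer. Note that \W matches any non-word character, so spaces and other symbols count as delimiters too, not just the four punctuation marks from the question.

```python
import re

# Same pattern as in the answer, written with re.VERBOSE so that
# whitespace and comments inside the regex are ignored by the engine.
PIECE = re.compile(
    r"""
    (?: ^ | \W )   # start of the string, or a single non-word character
    \w*            # zero or more word characters following it
    """,
    re.VERBOSE,
)


def split_keep_delims(text):
    """Hypothetical helper: return pieces, each led by its delimiter."""
    return PIECE.findall(text)


print(split_keep_delims("one,two;three.four:"))
# ['one', ',two', ';three', '.four', ':']

# Consecutive delimiters each start their own piece.
print(split_keep_delims("a,,b"))
# ['a', ',', ',b']
```

If only specific delimiters should start a new piece, replacing \W with an explicit character class such as [.,;:] is probably the safer choice.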

For more information, see here.

- user2555451

- timgeb · Accepted Answer

I don't know what else may occur in your string, but would this do the trick?

>>> import re
>>> s = 'one,two;three.four:'
>>> [x for x in re.findall(r'[.,;:]?\w*', s) if x]
['one', ',two', ';three', '.four', ':']
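
For what it's worth, the lookahead split from the question should behave as intended on Python 3.7 or later, where re.split accepts patterns that can match an empty string. On older versions such empty matches are ignored, which would explain why nothing was split. A minimal sketch, assuming Python 3.7+; the helper name split_before is hypothetical:

```python
import re


def split_before(text):
    """Hypothetical helper: split just before each of . , ; : and keep them.

    Assumes Python 3.7+, where re.split splits on empty (zero-width) matches.
    """
    # (?=[.,;:]) is a zero-width lookahead: it matches the empty position
    # right before a delimiter without consuming the delimiter itself.
    parts = re.split(r"(?=[.,;:])", text)
    # A delimiter at the very start yields a leading empty string; drop it.
    return [p for p in parts if p]


print(split_before("one,two;three.four:"))
# ['one', ',two', ';three', '.four', ':']
```

Unlike the findall approaches, this keeps any other characters (spaces, for example) inside the pieces rather than dropping them or treating them as delimiters.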