I need to split a string on commas, but I run into a problem in this case:
TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD
I want to split it and get the following:
var[0] = "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))"
var[1] = "SECOND"
var[2] = "THIRD"
Thank you.
def top_level_split(s):
    """
    Split `s` by top-level commas only. Commas within parentheses are ignored.
    """
    # Parse the string, tracking whether the current character is within
    # parentheses.
    balance = 0
    parts = []
    part = ''
    for c in s:
        part += c
        if c == '(':
            balance += 1
        elif c == ')':
            balance -= 1
        elif c == ',' and balance == 0:
            # Drop the trailing comma and surrounding whitespace
            parts.append(part[:-1].strip())
            part = ''
    # Capture last part
    if len(part):
        parts.append(part.strip())
    return parts

my_list = top_level_split("TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD")
print(my_list)
,(?!(?:[^(]*\([^)]*\))*[^()]*\))
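This regex matches a comma, with an assertion that the comma is not inside parentheses. It does this with a negative lookahead that first consumes any matching ( and ) pairs and then looks for one more ). This assumes the parentheses are balanced and unescaped.
Code: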
>>> import re
>>> s = 'TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD'
>>> print(re.split(r',(?!(?:[^(]*\([^)]*\))*[^()]*\))', s))
['TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))', ' SECOND ', ' THIRD']
Or:
>>> s = 'TEXT EXAMPLE (THIS, IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD'
>>> print(re.split(r',(?!(?:[^(]*\([^)]*\))*[^()]*\))', s))
['TEXT EXAMPLE (THIS, IS (A EXAMPLE, BUT NOT WORKS, FOR ME))', ' SECOND ', ' THIRD']
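The lookahead in the pattern above is dense, so here is a sketch of the same pattern written with re.VERBOSE and commented piece by piece. The name TOP_LEVEL_COMMA and the short input 'a, (b (c))' are illustrative and not part of the original answer. The last call shows a case the pattern appears not to handle: a nested group that follows a top-level comma.

```python
import re

# Same pattern as above, written with re.VERBOSE so each piece can be commented.
TOP_LEVEL_COMMA = re.compile(r"""
    ,                 # a comma ...
    (?!               # ... NOT followed by:
        (?:           #   zero or more chunks made of
            [^(]*     #     text without an opening parenthesis
            \(        #     an opening parenthesis
            [^)]*     #     text without a closing parenthesis
            \)        #     the first closing parenthesis after it
        )*
        [^()]*        #   then text without any parenthesis
        \)            #   and one more closing parenthesis
    )
""", re.VERBOSE)

s = 'TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD'
parts = [p.strip() for p in TOP_LEVEL_COMMA.split(s)]
print(parts)
# ['TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))', 'SECOND', 'THIRD']

# Illustrative input: the comma is at the top level, but the nested group
# after it satisfies the lookahead, so no split happens.
tricky = TOP_LEVEL_COMMA.split('a, (b (c))')
print(tricky)
# ['a, (b (c))']
```

For inputs where nested groups can follow a top-level comma, a depth-counting loop such as top_level_split is the safer choice.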
( and ). I did try to explain it in the answer, though. - anubhava

import re

text = "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD"
array = re.split(r',(?!.*\))', text)
for item in array:
    # Print each item with leading and trailing spaces removed
    print(item.strip(" "))
Result:
TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))
SECOND
THIRD
You can simply use rsplit, provided you know that exactly two top-level commas follow the parenthesized part:
l1 = "TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME)), SECOND , THIRD".rsplit(",", 2)
for line in l1:
    print(line)
TEXT EXAMPLE (THIS IS (A EXAMPLE, BUT NOT WORKS, FOR ME))
SECOND
THIRD
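Since rsplit with a fixed maxsplit always cuts at the last two commas, it depends on the number of trailing fields being known in advance. A small sketch with made-up inputs (not from the original post) shows what happens when that count is off:

```python
# rsplit(",", 2) cuts at the last two commas, wherever they happen to be.
ok = [p.strip() for p in "HEAD (a, b), SECOND , THIRD".rsplit(",", 2)]
print(ok)
# ['HEAD (a, b)', 'SECOND', 'THIRD']

# Only one top-level comma here, so the second cut lands inside the parentheses.
bad = [p.strip() for p in "HEAD (a, b), SECOND".rsplit(",", 2)]
print(bad)
# ['HEAD (a', 'b)', 'SECOND']
```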
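With re.split, your specific example could use something like ,(?!.*\)) (i.e. a comma that is not followed by a closing parenthesis), but that may not work in the general case. - jonrsharpe
var.split(',') cannot work; if you don't know which commas are relevant, putting the pieces back together will have the same problem. - jonrsharpe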
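To illustrate the caveat about the general case: ,(?!.*\)) refuses to split at any comma that has a closing parenthesis anywhere to its right, so a parenthesized group late in the string suppresses earlier top-level commas. A minimal sketch, using an invented input and a hypothetical helper split_outside_parens that counts depth the same way the loop-based answer does (it assumes balanced parentheses):

```python
import re

def split_outside_parens(s):
    """Split on commas at parenthesis depth 0 (assumes balanced parentheses)."""
    parts, depth, start = [], 0, 0
    for i, c in enumerate(s):
        if c == '(':
            depth += 1
        elif c == ')':
            depth -= 1
        elif c == ',' and depth == 0:
            # Top-level comma: close the current part and start the next one
            parts.append(s[start:i].strip())
            start = i + 1
    parts.append(s[start:].strip())
    return parts

s = "A (x, y), B, C (z)"  # illustrative input, not from the question

naive = re.split(r',(?!.*\))', s)
print(naive)
# ['A (x, y), B, C (z)']

by_depth = split_outside_parens(s)
print(by_depth)
# ['A (x, y)', 'B', 'C (z)']
```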