I have a bunch of strings:
"10people"
"5cars"
..
How can I split each of these strings like this?
['10','people']
['5','cars']
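The text can contain any number of digits and letters.
I was thinking of writing some kind of regular expression, but I am sure there is a simpler way in Python.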
(\d+)([a-zA-Z]+)
import re
a = ["10people", "5cars"]
[re.match('^(\\d+)([a-zA-Z]+)$', x).groups() for x in a]
Result:
[('10', 'people'), ('5', 'cars')]
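One caveat with the list comprehension above: re.match() returns None for a string that does not fit the pattern, so calling .groups() on it raises AttributeError. A minimal sketch of a guarded version follows; the helper name split_number_word and the extra "cars" test input are made up for illustration, and re.fullmatch() stands in for the explicit ^...$ anchors.

```python
import re

# Same pattern as in the answer above, compiled once for reuse.
PATTERN = re.compile(r'(\d+)([a-zA-Z]+)')

def split_number_word(text):
    """Return (digits, letters), or None if text does not fit the pattern."""
    m = PATTERN.fullmatch(text)
    return m.groups() if m else None

# "cars" has no leading digits, so it yields None instead of raising.
items = ["10people", "5cars", "cars"]
pairs = [split_number_word(x) for x in items]
print(pairs)  # [('10', 'people'), ('5', 'cars'), None]
```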
>>> re.findall(r'(\d+|[a-zA-Z]+)', '12fgsdfg234jhfq35rjg')
['12', 'fgsdfg', '234', 'jhfq', '35', 'rjg']
>>> re.findall(r"\d+|[a-zA-Z]+","10people")
['10', 'people']
>>> re.findall(r"\d+|[a-zA-Z]+","10people5cars")
['10', 'people', '5', 'cars']
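If the goal is number/word pairs rather than a flat list, the same re.findall() call can be given two capture groups, in which case it returns one tuple per match. This is only a sketch; the variable names and the dictionary step are illustrative additions, not part of the answers here.

```python
import re

text = "10people5cars"

# With two groups in the pattern, findall returns a list of 2-tuples.
pairs = re.findall(r'(\d+)([a-zA-Z]+)', text)
print(pairs)  # [('10', 'people'), ('5', 'cars')]

# Optional: turn the pairs into a word -> count mapping.
counts = {word: int(number) for number, word in pairs}
print(counts)  # {'people': 10, 'cars': 5}
```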
Split the string with the pattern /(?<=[0-9])(?=[a-z])|(?<=[a-z])(?=[0-9])/i.
re.split() never splits on an empty pattern match. - Tim Pietzcker
re.sub(r'(?<=\d)(?=\D)|(?<=\D)(?=\d)','!SPLIT_ME!',s).split(r'!SPLIT_ME!') ;) - Alan Moore
>>> import re
>>> s = '10cars'
>>> m = re.match(r'(\d+)([a-z]+)', s)
>>> print(m.group(1))
10
>>> print(m.group(2))
cars
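A note on the earlier comment that re.split() never splits on an empty match: that described older Python versions. Since Python 3.7, re.split() does split on patterns that can match an empty string, so the lookaround pattern can be used directly. The sketch below assumes that behavior and falls back to re.findall() on older interpreters; the function name split_on_boundaries is made up for illustration.

```python
import re
import sys

# Zero-width boundary between a digit and a non-digit, in either order.
BOUNDARY = re.compile(r'(?<=\d)(?=\D)|(?<=\D)(?=\d)')

def split_on_boundaries(s):
    """Split s wherever digits meet non-digits."""
    if sys.version_info >= (3, 7):
        # Python 3.7+ splits on empty matches.
        return BOUNDARY.split(s)
    # Older versions: collect the runs instead of splitting between them.
    # (For an empty string this returns [] rather than [''].)
    return re.findall(r'\d+|\D+', s)

print(split_on_boundaries("10people5cars"))  # ['10', 'people', '5', 'cars']
```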
If, like me, you go out of your way to avoid regular expressions because they are ugly, here is a non-regex approach:
data = "5people10cars"
numbers = "".join(ch if ch.isdigit() else "\n" for ch in data).split()
names = "".join(ch if not ch.isdigit() else "\n" for ch in data).split()
final = list(zip(numbers, names))  # list() because zip is lazy in Python 3
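Another regex-free option, offered here as an alternative sketch rather than as part of the answer above, is itertools.groupby(), which groups consecutive characters by whether they are digits. The pairing step assumes the string starts with a number and alternates strictly between numbers and words.

```python
from itertools import groupby

data = "5people10cars"

# Consecutive characters with the same str.isdigit() result form one chunk.
chunks = [''.join(group) for _, group in groupby(data, key=str.isdigit)]
print(chunks)  # ['5', 'people', '10', 'cars']

# Pair even-indexed chunks (numbers) with odd-indexed chunks (names).
pairs = list(zip(chunks[0::2], chunks[1::2]))
print(pairs)  # [('5', 'people'), ('10', 'cars')]
```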
# Python 3: str.maketrans replaces the Python 2 string.maketrans function.
allchars = ''.join(chr(i) for i in range(32,256))
digExtractTrans = str.maketrans(allchars, ''.join(ch if ch.isdigit() else ' ' for ch in allchars))
alpExtractTrans = str.maketrans(allchars, ''.join(ch if ch.isalpha() else ' ' for ch in allchars))
data = "5people10cars"
numbers = data.translate(digExtractTrans).split()
names = data.translate(alpExtractTrans).split()
You only need to build the translation tables once; after that, call translate and split as often as you like.
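The build-once, reuse-often idea might look like the following in Python 3. This is a sketch under a few assumptions: the names ASCII, DIGITS_ONLY, LETTERS_ONLY and numbers_and_names are made up for illustration, only printable ASCII is covered, and each input is assumed to start with a number so that the zipped pairs line up.

```python
# Printable ASCII only; characters outside this range pass through unchanged.
ASCII = ''.join(chr(i) for i in range(32, 127))

# Build both tables once, at import time.
DIGITS_ONLY = str.maketrans(ASCII, ''.join(c if c.isdigit() else ' ' for c in ASCII))
LETTERS_ONLY = str.maketrans(ASCII, ''.join(c if c.isalpha() else ' ' for c in ASCII))

def numbers_and_names(text):
    """Return (number, name) pairs using the prebuilt tables."""
    numbers = text.translate(DIGITS_ONLY).split()
    names = text.translate(LETTERS_ONLY).split()
    return list(zip(numbers, names))

# The same tables serve every call.
for s in ["5people10cars", "10people", "5cars"]:
    print(numbers_and_names(s))
```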