Skip a specific word if it is not ended by another specific word
This is my string:
x = 'john got shot dead. john with his .... ? , john got killed or died in 1990. john with his wife dead or died'
I need to print and count all the words between john and dead, death, or died. If a john is not ended by any of died, dead, or death, skip it and start counting again from the next john. My code:
import re

x = re.sub(r'[^\w]', ' ', x)  # replace dots, commas and other special symbols with spaces
for i in re.findall(r'(?<=john)' + '(.*?)' + '(?=dead|died|death)', x):
    print(i)
    print(len([word for word in i.split()]))
My output:
got shot
2
with his john got killed or
6
with his wife
3
The output I want:
got shot
2
got killed or
3
with his wife
3
I don't know what mistake I am making. This is just a sample input. I have to check 20,000 inputs at a time.
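john with his .... ? , john got killed or died
Here the first john is not ended by dead, died, or death, so counting should start from the second john. The output I want is "got killed or", not "with his john got killed or". - Ganesh_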
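One possible way to get the behaviour described above, as a minimal sketch rather than a definitive answer: the lazy `(.*?)` in the question is free to run across a second john, so the idea here is to forbid that by checking, before each character is consumed, that another john does not start at that position. The helper name `words_between` and the constant `PATTERN` are made up for this illustration; the cleanup step is the same `re.sub` used in the question.

```python
import re

# Hypothetical names for illustration. The pattern is compiled once so it
# can be reused across many inputs.
# (?:(?!john).)*? consumes characters lazily, but refuses to step onto the
# start of another "john", so a span always begins at the last "john"
# that precedes the ending word.
PATTERN = re.compile(r'john((?:(?!john).)*?)(?=dead|died|death)')


def words_between(text):
    """Return one list of words per john ... dead/died/death span."""
    # Same cleanup as in the question: non-word characters become spaces.
    cleaned = re.sub(r'[^\w]', ' ', text)
    return [chunk.split() for chunk in PATTERN.findall(cleaned)]


x = ('john got shot dead. john with his .... ? , john got killed or died '
     'in 1990. john with his wife dead or died')

spans = words_between(x)
for words in spans:
    print(' '.join(words))
    print(len(words))
```

Like the pattern in the question, this matches john, dead, died and death as plain substrings, so it would also fire inside longer words such as "johnson" or "deadline". If that matters for the real data, adding `\b` word boundaries around the words is probably worth trying.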