Python 3.6
I want to remove a set of strings from a string. Here is my first attempt:
string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']
result = list(filter(lambda x: x not in items_to_remove, string.split(' ')))
print(result)
Output:
['test']
But this doesn't work if x is not nicely spaced. I feel there must be a built-in solution, hmm, there has to be a better way!
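One reading of "not nicely spaced" is input with repeated spaces between words. The sketch below uses a hypothetical input to show how `split(' ')` then leaves empty strings in the result, while `split()` with no argument splits on any run of whitespace. It is only an illustration of that one case, not a fix for words joined by punctuation.

```python
string = 'this  is a   test string'  # hypothetical input with uneven spacing
items_to_remove = ['this', 'is', 'a', 'string']

# split(' ') splits on every single space, so runs of spaces yield empty strings
naive = [w for w in string.split(' ') if w not in items_to_remove]

# split() with no argument splits on runs of whitespace and drops the empties
robust = [w for w in string.split() if w not in items_to_remove]

print(naive)
print(robust)
```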
I had a look at this discussion on Stack Overflow, which asks exactly the same question as mine...
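So as not to waste my effort, I timed all the solutions. I believe the simplest, fastest, and most pythonic one is the plain for loop, which is not the conclusion reached in the other post...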
result = string
for i in items_to_remove:
    result = result.replace(i, '')
Test code:
import timeit
t1 = timeit.timeit('''
string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']
result = list(filter(lambda x: x not in items_to_remove, string.split(' ')))
''', number=1000000)
print(t1)
t2 = timeit.timeit('''
string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']
def sub(m):
    return '' if m.group() in items_to_remove else m.group()
result = re.sub(r'\w+', sub, string)
''',setup= 'import re', number=1000000)
print(t2)
t3 = timeit.timeit('''
string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']
result = re.sub(r'|'.join(items_to_remove), '', string)
''',setup= 'import re', number=1000000)
print(t3)
t4 = timeit.timeit('''
string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']
result = string
for i in items_to_remove:
    result = result.replace(i, '')
''', number=1000000)
print(t4)
Output:
1.9832003884248448
4.408749988641971
2.124719851741177
1.085117268194475
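Each timed statement above also rebuilds the string and the list, and the regex variants pay for pattern lookup on every run. A possible variation, sketched here, moves that work into `setup` so only the removal itself is timed. The names `pattern`, `t_regex`, and `t_loop` and the smaller `number` are choices made for this sketch, and the resulting timings will differ by machine.

```python
import timeit

# Shared setup: build the inputs and precompile the pattern once, outside the timed code
setup = '''
import re
string = 'this is a test string'
items_to_remove = ['this', 'is', 'a', 'string']
pattern = re.compile('|'.join(map(re.escape, items_to_remove)))
'''

# Time only the substitution with the precompiled pattern
t_regex = timeit.timeit("pattern.sub('', string)", setup=setup, number=100000)

# Time only the replace loop
t_loop = timeit.timeit('''
result = string
for i in items_to_remove:
    result = result.replace(i, '')
''', setup=setup, number=100000)

print(t_regex, t_loop)
```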
The for loop also replaces substrings. Try changing the order of items_to_remove to ['is', 'this', 'a', 'string'] and you'll see what I mean. - zwer
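The substring problem raised in this comment can be reproduced with a short sketch. The input containing 'island' is hypothetical, and the word-boundary regex is one possible way to remove whole words only, not a method taken from the post. `re.escape` is used in case an item contains regex metacharacters.

```python
import re

string = 'this island is a test string'  # hypothetical input containing 'island'
items_to_remove = ['is', 'this', 'a', 'string']

# str.replace removes every occurrence, including those inside other words
replaced = string
for item in items_to_remove:
    replaced = replaced.replace(item, '')

# \b anchors each alternative to word boundaries, so only whole words match
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, items_to_remove)) + r')\b')
# Collapse the leftover whitespace after the removals
whole_words = ' '.join(pattern.sub('', string).split())

print(repr(replaced))     # 'this' and 'island' are mangled
print(repr(whole_words))  # 'island' survives intact
```

Because 'is' is removed first, 'this' has already become 'th' by the time the loop reaches it, which is why the order of the list changes the result.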