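str.startswith() and the in operator both return a boolean. The in operator is a membership test; for strings, it checks whether one string occurs anywhere inside another.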
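- This can be done with a list-comprehension or with filter.
- A list-comprehension with in is the fastest of the implementations tested.
- If case does not matter, consider mapping all the words to lowercase first: l = list(map(str.lower, l)).
- Tested in Python 3.11.0.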
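As a small sketch of the case-insensitive note above: instead of lowercasing the whole list up front, the comparison can lowercase each value on the fly, which keeps the original spelling in the result. The sample list here is made up for illustration.

```python
# Hypothetical sample data, mixed case on purpose.
l = ['Ones', 'twos', 'THREES', 'Threefold']
wanted = 'three'

# Lowercase both sides only for the comparison, so the
# original strings are what ends up in the result.
result = [v for v in l if v.lower().startswith(wanted.lower())]
print(result)  # ['THREES', 'Threefold']
```

Mapping the list to lowercase once, as in the bullet above, is the simpler choice when the original casing is not needed afterwards.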
filter:
- filter() returns a filter object, so wrap it in list() to show all the matching values in a list.
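l = ['ones', 'twos', 'threes']
wanted = 'three'
# Option 1: keep the values that start with wanted
result = list(filter(lambda x: x.startswith(wanted), l))
# Option 2: keep the values that contain wanted anywhere (overwrites option 1)
result = list(filter(lambda x: wanted in x, l))
print(result)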
[out]:
['threes']
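To illustrate why list() is needed, here is a minimal sketch showing that the filter object is a one-shot iterator rather than a list. The variable names first_pass and second_pass are only for this illustration.

```python
l = ['ones', 'twos', 'threes']
wanted = 'three'

matches = filter(lambda x: x.startswith(wanted), l)
print(type(matches).__name__)  # filter

# Nothing is evaluated until the iterator is consumed.
first_pass = list(matches)
# The iterator is now exhausted, so a second pass yields nothing.
second_pass = list(matches)

print(first_pass)   # ['threes']
print(second_pass)  # []
```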
list-comprehension:
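l = ['ones', 'twos', 'threes']
wanted = 'three'
# Option 1: keep the values that start with wanted
result = [v for v in l if v.startswith(wanted)]
# Option 2: keep the values that contain wanted anywhere (overwrites option 1)
result = [v for v in l if wanted in v]
print(result)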
[out]:
['threes']
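Both conditions give the same output for the sample list, but they are not interchangeable: startswith only matches a prefix, while in matches the substring at any position. A short sketch with a made-up list where the two differ:

```python
# Hypothetical list where the substring appears mid-word.
l = ['ones', 'threes', 'seventy-three']
wanted = 'three'

prefix_matches = [v for v in l if v.startswith(wanted)]
substring_matches = [v for v in l if wanted in v]

print(prefix_matches)     # ['threes']
print(substring_matches)  # ['threes', 'seventy-three']
```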
Which implementation is faster?
- Tested in Jupyter Lab using the words corpus from nltk v3.7, which has 236736 words
- Words with 'three':
['three', 'threefold', 'threefolded', 'threefoldedness', 'threefoldly', 'threefoldness', 'threeling', 'threeness', 'threepence', 'threepenny', 'threepennyworth', 'threescore', 'threesome']
from nltk.corpus import words
%timeit list(filter(lambda x: x.startswith(wanted), words.words()))
%timeit list(filter(lambda x: wanted in x, words.words()))
%timeit [v for v in words.words() if v.startswith(wanted)]
%timeit [v for v in words.words() if wanted in v]
%timeit results, in the same order as the statements above:
62.8 ms ± 816 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
53.8 ms ± 982 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
56.9 ms ± 1.33 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
47.5 ms ± 1.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
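For anyone who wants to repeat the comparison outside Jupyter and without nltk, the following is a rough sketch using the standard-library timeit module. The synthetic word list is an assumption made for this example, not the corpus used above, so the absolute numbers will differ and the ranking may vary by machine and Python version.

```python
import timeit

# Hypothetical synthetic data standing in for the nltk corpus.
word_list = [f'{prefix}{i}' for i in range(10_000) for prefix in ('one', 'two', 'three')]
wanted = 'three'

candidates = {
    'filter + startswith': lambda: list(filter(lambda x: x.startswith(wanted), word_list)),
    'filter + in': lambda: list(filter(lambda x: wanted in x, word_list)),
    'comprehension + startswith': lambda: [v for v in word_list if v.startswith(wanted)],
    'comprehension + in': lambda: [v for v in word_list if wanted in v],
}

runs = 20
for name, func in candidates.items():
    seconds = timeit.timeit(func, number=runs)
    print(f'{name}: {seconds / runs * 1e3:.3f} ms per call')

# Sanity check: on this data every variant selects the same words.
results = {name: func() for name, func in candidates.items()}
```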