Your approach is good. I think using a sentinel is elegant. A perhaps more Pythonic way to do it is with nested generator expressions:
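from itertools import zip_longest

def zip_discard_gen(*iterables, sentinel=object()):
    # zip_longest pads the shorter inputs with the sentinel;
    # the inner generator then drops those padding entries.
    return ((entry for entry in iterable if entry is not sentinel)
            for iterable in zip_longest(*iterables, fillvalue=sentinel))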
This needs fewer imports, because neither partial() nor ne() is required.
It is also slightly faster:
data = [(11, 12, 13),
        (21, 22, 23, 24),
        (31, 32),
        (41, 42, 43, 44)]
%timeit [list(x) for x in zip_discard(*data)]
10000 loops, best of 3: 17.5 µs per loop
%timeit [list(x) for x in zip_discard_gen(*data)]
100000 loops, best of 3: 14.2 µs per loop
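To make the result concrete, here is a minimal, self-contained sketch of what the generator version yields for the sample data. The inner generators are lazy, so each one is turned into a list before printing; the name `result` is only for illustration.

```python
from itertools import zip_longest

def zip_discard_gen(*iterables, sentinel=object()):
    # Pad with a unique sentinel, then filter the padding out again.
    return ((entry for entry in iterable if entry is not sentinel)
            for iterable in zip_longest(*iterables, fillvalue=sentinel))

data = [(11, 12, 13),
        (21, 22, 23, 24),
        (31, 32),
        (41, 42, 43, 44)]

# Materialize the lazy inner generators so they can be inspected.
result = [list(column) for column in zip_discard_gen(*data)]
print(result)
# [[11, 21, 31, 41], [12, 22, 32, 42], [13, 23, 43], [24, 44]]
```

Columns that run past the end of a shorter input simply come out shorter instead of being padded.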
Edit
The version using list comprehensions is a bit faster still:
def zip_discard_compr(*iterables, sentinel=object()):
    return [[entry for entry in iterable if entry is not sentinel]
            for iterable in zip_longest(*iterables, fillvalue=sentinel)]
Timing:
%timeit zip_discard_compr(*data)
100000 loops, best of 3: 6.73 µs per loop
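The `%timeit` lines are IPython magics. Outside IPython, a comparable measurement could be sketched with the standard `timeit` module, roughly as below. The repetition count `n` and the variable names are arbitrary choices for illustration, and the absolute numbers will differ from machine to machine.

```python
from itertools import zip_longest
from timeit import timeit

def zip_discard_gen(*iterables, sentinel=object()):
    return ((entry for entry in iterable if entry is not sentinel)
            for iterable in zip_longest(*iterables, fillvalue=sentinel))

def zip_discard_compr(*iterables, sentinel=object()):
    return [[entry for entry in iterable if entry is not sentinel]
            for iterable in zip_longest(*iterables, fillvalue=sentinel)]

data = [(11, 12, 13),
        (21, 22, 23, 24),
        (31, 32),
        (41, 42, 43, 44)]

n = 10000  # number of calls per measurement (arbitrary)

# The generator version is consumed into lists so both do the same work.
t_gen = timeit(lambda: [list(x) for x in zip_discard_gen(*data)], number=n)
t_compr = timeit(lambda: zip_discard_compr(*data), number=n)

print("generator version:          %.2f µs per call" % (t_gen / n * 1e6))
print("list comprehension version: %.2f µs per call" % (t_compr / n * 1e6))
```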
Python 2 version:
from itertools import izip_longest

SENTINEL = object()

def zip_discard_compr(*iterables):
    sentinel = SENTINEL
    return [[entry for entry in iterable if entry is not sentinel]
            for iterable in izip_longest(*iterables, fillvalue=sentinel)]
Timing
This version returns the same data structure as Tadhg McDonald-Jensen's zip_varlen:
def zip_discard_gen(*iterables, sentinel=object()):
    return (tuple([entry for entry in iterable if entry is not sentinel])
            for iterable in zip_longest(*iterables, fillvalue=sentinel))
It is about twice as fast:
%timeit list(zip_discard_gen(*data))
100000 loops, best of 3: 9.37 µs per loop
%timeit list(zip_varlen(*data))
10000 loops, best of 3: 18 µs per loop
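One practical consequence of padding with a unique `object()` sentinel instead of `None` is that genuine `None` entries in the input survive. The following sketch uses the tuple-returning version with made-up sample rows (`rows` is a hypothetical input, not from the question):

```python
from itertools import zip_longest

def zip_discard_gen(*iterables, sentinel=object()):
    return (tuple([entry for entry in iterable if entry is not sentinel])
            for iterable in zip_longest(*iterables, fillvalue=sentinel))

# Hypothetical input that contains real None values.
rows = [(1, None, 3),
        (4, 5),
        (None,)]

result = list(zip_discard_gen(*rows))
print(result)
# [(1, 4, None), (None, 5), (3,)]
```

Only the padding added by zip_longest is removed; the `None` values that were in the data are kept.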
op.is_not instead of op.ne. - Bakuriu
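The comment presumably refers to the predicate used to filter out the sentinel in the question's partial()/ne() approach, which is not shown here. As a rough illustration of why an identity test is safer than an inequality test, consider a hypothetical class `AlwaysEqual` whose `__eq__` always returns True:

```python
from functools import partial
from operator import is_not, ne

class AlwaysEqual:
    """Hypothetical object that claims to be equal to everything."""
    def __eq__(self, other):
        return True

sentinel = object()
entry = AlwaysEqual()
column = [entry, sentinel]

# ne() falls back to the entry's comparison methods, so the entry
# looks "not different" from the sentinel and is dropped as well.
kept_with_ne = list(filter(partial(ne, sentinel), column))

# is_not() compares identity only, so just the sentinel is dropped.
kept_with_is_not = list(filter(partial(is_not, sentinel), column))

print(len(kept_with_ne))      # 0
print(len(kept_with_is_not))  # 1
```

The `entry is not sentinel` test in the comprehensions above has the same identity semantics as `is_not`.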