What is an efficient way to call multiple reduce functions in Python?

3

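I want to run multiple reduce functions over an iterable in Python (2.7). For example, calling min and max on an iterable of integers. But of course you can't call reduce(min, it) and reduce(max, it) on the same one-shot iterable (a generator, say), because it is exhausted after the first call. So you might think of doing something like this: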

reduce(lambda a, b: (min(a[0], b[0]), max(a[1], b[1])), ((x, x) for x in it))
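To make the exhaustion problem and the pairing trick above concrete, here is a minimal sketch. The `make_iter` helper and the sample values are hypothetical, chosen only for illustration; the `functools` import is an assumption added so the snippet also runs on Python 3 (on 2.7 `reduce` is already a builtin).

```python
from functools import reduce  # builtin on Python 2; this import works on 2.6+ and 3

def make_iter():
    # Hypothetical helper: a one-shot generator standing in for `it`.
    return (x for x in [3, 1, 4, 1, 5])

it = make_iter()
lowest = reduce(min, it)       # consumes the whole generator
try:
    reduce(max, it)            # nothing left: reduce() on an empty iterable
    second_pass_failed = False
except TypeError:
    second_pass_failed = True

# Single pass: pair each item with itself, then reduce the pairs.
pair = reduce(lambda a, b: (min(a[0], b[0]), max(a[1], b[1])),
              ((x, x) for x in make_iter()))
```

Here the second `reduce` raises `TypeError` because the generator is already empty, while the paired version yields both results from one pass.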

You might think that's pretty neat, so you generalize it into the following:

from itertools import izip

def multireduce(iterable, *funcs):
    """:Return: The tuple resulting from calling ``reduce(func, iterable)`` for each `func` in `funcs`."""
    return reduce(lambda a, b: tuple(func(aa, bb) for func, aa, bb in izip(funcs, a, b)), ((item,) * len(funcs) for item in iterable))

(You like unit tests, so you include something like this:)

import unittest
class TestMultireduce(unittest.TestCase):
    def test_multireduce(self):
        vecs = (
            ((1,), (min,), (1,)),
            (xrange(10), (min, max), (0, 9)),
            (xrange(101), (min, max, lambda x, y: x + y,), (0, 100, (100 * 101) // 2))
        )
        for iterable, funcs, expected in vecs:
            self.assertSequenceEqual(tuple(multireduce(iterable, *funcs)), expected)

But when you try it out, you find that it is really slow:

%timeit reduce(min, xrange(1000000)) ; reduce(max, xrange(1000000))
10 loops, best of 3: 140 ms per loop
%timeit reduce(lambda a, b: (min(a[0], b[0]), max(a[1], b[1])), ((x, x) for x in xrange(1000000)))
1 loop, best of 3: 682 ms per loop
%timeit multireduce(xrange(1000000), min, max)
1 loop, best of 3: 1.99 s per loop

Ouch. So you come to Stack Overflow looking for Python optimization wisdom...

1 Answer

1

Well, there's this, though it somewhat defeats the purpose of iterables...

from itertools import imap, tee

def multireduce(iterable, *funcs):
    """:Return: The tuple resulting from calling ``reduce(func, iterable)`` for each `func` in `funcs`."""
    return tuple(imap(reduce, funcs, tee(iterable, len(funcs))))

But it's pretty fast on my test case:

%timeit multireduce(xrange(1000000), min, max)
10 loops, best of 3: 166 ms per loop
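
A note on why the `tee` version "defeats the purpose of iterables": each `reduce` runs to completion before the next one starts, so `tee` has to buffer every item for the copies that have not been consumed yet. If memory matters more than speed, one possible alternative is a plain loop that keeps one accumulator per function. The following is only a sketch, not part of the answer above and not benchmarked here; the name `multireduce_loop` is hypothetical.

```python
def multireduce_loop(iterable, *funcs):
    """Hypothetical single-pass variant: keeps one accumulator per function."""
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        # Mirror reduce(), which raises TypeError on empty input with no initial value.
        raise TypeError("multireduce_loop() of empty iterable")
    accs = [first] * len(funcs)
    for item in it:
        # Fold the current item into every accumulator before moving on.
        accs = [func(acc, item) for func, acc in zip(funcs, accs)]
    return tuple(accs)
```

Called as `multireduce_loop(iter(range(10)), min, max)`, it reads the iterator once and holds only `len(funcs)` values at a time.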
