第二个代码示例较慢,因为您正在向39个工作进程池提交单个对。只有一个工作进程将处理您的请求,其他38个将无事可做!由于需要将数据从主线程传输到工作进程中,因此速度会变慢。
您可以“缓冲”一些对,然后执行这组对以平衡内存使用情况,但仍然可以利用多进程环境的优势。
import itertools
from multiprocessing import Pool
def foo(x):
return sum(x)
cpus = 3
pool = Pool(cpus)
buff_size = 10*cpus
buff = []
for i, r in enumerate(itertools.product(range(20), range(10))):
if (i % buff_size) == (buff_size-1):
print pool.map(foo, buff)
buff = []
else:
buff.append(r)
if len(buff) > 0:
print pool.map(foo, buff)
buff = []
以上代码的输出结果将会是这样的:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 5, 6, 7, 8, 9, 10, 11, 12, 13]
[6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 8, 9, 10, 11, 12, 13, 14, 15, 16]
[9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 14, 15, 16, 17, 18, 19, 20, 21, 22]
[15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 17, 18, 19, 20, 21, 22, 23, 24, 25]
[18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28]
调整 buff_size
的乘数以获得适合您系统的正确平衡!
Pool
,以便Pool
工作进程可以可预测地清理。只需将pool = Pool(39)
更改为with Pool(39) as pool:
并缩进使用池的下面的行;当退出块时,工作进程会立即清理。 - ShadowRanger