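It's just a matter of timing. Windows needs to spawn 4 processes in the Pool, and each of them then has to start up, initialize, and get ready to consume from the Queue. On Windows, this requires each child process to re-import the __main__ module and to unpickle the Queue instances that the Pool uses internally. That takes a non-trivial amount of time. Long enough, in fact, that both of your map_async() calls are executed before all of the processes in the Pool are even up and running. You can see this if you add some tracing to the function run by each worker in the Pool: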
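# Fragment of the worker loop in multiprocessing/pool.py with two print()
# calls added; current_process comes from the multiprocessing package.
while maxtasks is None or (maxtasks and completed < maxtasks):
    try:
        print("getting {}".format(current_process()))
        task = get()
        print("got {}".format(current_process()))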
Output:
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
process id = 5145
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
process id = 5145
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
result = [121]
result1 = [100]
getting <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-3, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-4, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
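As you can see, Worker-1 starts up and consumes both tasks before workers 2-4 even try to consume from the Queue. If you add a sleep call after instantiating the Pool in the main process, but before calling map_async, you will see different processes handle each request: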
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-3, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-4, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
process id = 5183
got <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
process id = 5184
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
result = [121]
result1 = [100]
got <ForkServerProcess(ForkServerPoolWorker-3, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-4, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
(Note that the extra "getting/got" statements you see are sentinels being sent to each process to shut them down gracefully.)
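The sleep experiment can be sketched as a small script. This is only an illustration under assumptions: the function names square_with_pid and run_two_requests are made up here, the inputs 11 and 10 are inferred from the results shown in the output, and the 2-second delay is an arbitrary value that should be long enough on most machines. Which worker picks up each task is not deterministic, so the PIDs will differ from run to run.

```python
import multiprocessing as mp
import os
import time


def square_with_pid(x):
    """Return (worker PID, x squared) so the parent can see which worker ran the task."""
    return os.getpid(), x * x


def run_two_requests(delay, method="spawn"):
    """Create a 4-worker pool, wait `delay` seconds, then submit two one-item maps."""
    ctx = mp.get_context(method)
    with ctx.Pool(4) as pool:
        # Hypothetical delay: gives the workers time to start before tasks are queued.
        time.sleep(delay)
        first = pool.map_async(square_with_pid, [11])
        second = pool.map_async(square_with_pid, [10])
        # Each map was given a single item, so take element 0 of each result list.
        return first.get(timeout=60)[0], second.get(timeout=60)[0]


if __name__ == "__main__":
    for delay in (0.0, 2.0):
        (pid_a, res_a), (pid_b, res_b) = run_two_requests(delay)
        print("delay={}s: result={} (pid {}), result1={} (pid {})".format(
            delay, res_a, pid_a, res_b, pid_b))
```

With no delay, both tasks will often report the same PID; with the delay, they are more likely to be handled by different workers, though neither outcome is guaranteed.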
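I was able to reproduce this behavior on Linux using Python 3.x with the "spawn" and "forkserver" contexts, but not with "fork". Presumably that is because forking the child processes is much quicker than spawning them and re-importing __main__.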
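One rough way to check the start-up cost of each start method is to time how long a fresh Pool takes to return its first result. The sketch below is an assumption-laden illustration, not a benchmark: seconds_until_first_result is a hypothetical helper, os.getpid is used as the task only because it is a picklable built-in, and for "forkserver" the first measurement also includes launching the fork server itself.

```python
import multiprocessing as mp
import os
import time


def seconds_until_first_result(method, workers=4):
    """Time from Pool creation until one worker has started and completed a task."""
    ctx = mp.get_context(method)
    start = time.perf_counter()
    with ctx.Pool(workers) as pool:
        # apply() blocks until some worker is up and has run the call.
        pool.apply(os.getpid)
        return time.perf_counter() - start


if __name__ == "__main__":
    # get_all_start_methods() lists only the methods supported on this platform.
    for method in mp.get_all_start_methods():
        elapsed = seconds_until_first_result(method)
        print("{:<10} {:.3f}s".format(method, elapsed))
```

The absolute numbers depend heavily on the machine and on how much work importing __main__ involves, so treat them as a relative comparison only.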