Multiprocessing in Python (Windows) using a process pool

Question

Tags: python, numpy, multiprocessing, pool

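I need to parallelize my study to make it run faster. I am not familiar with Python's multiprocessing library and have not managed to get it running yet.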

Right now I am investigating whether each (origin, target) pair remains at certain locations between the frames of my study. A few points:

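This is a single function that I want to run faster (it is not multiple processes).
The process is sequential, i.e. each frame is compared with the previous one.
This code is a simplified form of the original code. The code outputs a residence_list.
I am using Windows.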

Could someone check the code (the multiprocessing part) and help me improve it so that it works? Thanks.

import numpy as np
from multiprocessing import Pool, freeze_support


def Main_Residence(total_frames, origin_list, target_list):
    Previous_List = {}
    residence_list = []

    for frame in range(total_frames):     #Each frame

        Current_List = {}               #Dict of pair and their residence for frames
        for origin in range(origin_list):

            for target in range(target_list):
                Pair = (origin, target)         #Each pair

                if Pair in Current_List.keys():     #If already considered, continue
                    continue
                else:
                    if origin == target:
                        if (Pair in Previous_List.keys()):            #If remained from the previous frame, add residence
                            print "Origin_Target remained: ", Pair
                            Current_List[Pair] = (Previous_List[Pair] + 1)
                        else:                                           #If new, add it to the current
                            Current_List[Pair] = 1

        for pair in Previous_List.keys():                        #Add those that exited from residence to the list
            if pair not in Current_List.keys():
                residence_list.append(Previous_List[pair])

        Previous_List = Current_List
    return residence_list

if __name__ == '__main__':
    pool = Pool(processes=5)
    Residence_List = pool.apply_async(Main_Residence, args=(20, 50, 50))
    print Residence_List.get(timeout=1)
    pool.close()
    pool.join()
    freeze_support()

Residence_List = np.array(Residence_List) * 5

- mah65

1 Answer

- shmee · Accepted Answer

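Multiprocessing does not make sense in the context presented here. You are creating five subprocesses (plus three threads inside the pool that manage workers, tasks and results) to execute one function once. All of this comes at a cost, both in system resources and in execution time, while four of your worker processes do nothing at all. Multiprocessing does not speed up the execution of a single function call. For your specific example code, simply executing Main_Residence(20, 50, 50) in the main process will clearly be faster than this approach.
For multiprocessing to make sense in such a context, the work at hand needs to be broken down into a set of homogeneous tasks that can be processed in parallel, with their results merged afterwards.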
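The split, process and merge pattern described above can be sketched as follows. This is only an illustration under assumptions: the workload (summing squares over a range) and the names sum_of_squares, split_range and parallel_sum_of_squares are hypothetical and have nothing to do with the code in the question.

```python
from multiprocessing import Pool


def sum_of_squares(bounds):
    """One homogeneous task: sum i*i over the half-open range [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def split_range(stop, parts):
    """Split range(stop) into `parts` contiguous (start, stop) chunks."""
    step, extra = divmod(stop, parts)
    chunks, start = [], 0
    for index in range(parts):
        # The first `extra` chunks take one additional element each.
        end = start + step + (1 if index < extra else 0)
        chunks.append((start, end))
        start = end
    return chunks


def parallel_sum_of_squares(stop, processes=3):
    """Process the chunks in parallel, then merge the partial sums."""
    pool = Pool(processes=processes)
    try:
        partial_sums = pool.map(sum_of_squares, split_range(stop, processes))
    finally:
        pool.close()
        pool.join()
    return sum(partial_sums)


if __name__ == '__main__':
    # The guard matters on Windows, where each worker re-imports this module.
    print(parallel_sum_of_squares(1000000))
```

Each chunk is independent of the others, which is what allows the pool to work on them at the same time. A computation in which every step depends on the previous one, such as comparing each frame with the one before it, does not split this way without restructuring.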

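As an example (not necessarily a good one): if you want to calculate the largest prime factor for each number in a sequence, you can delegate the calculation for any specific number to a worker process in a pool. Several workers then carry out these individual calculations in parallel.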

from datetime import datetime
from multiprocessing import Pool


def largest_prime_factor(n):
    # Trial division; for n >= 2 this returns (n, largest prime factor of n).
    p = n
    i = 2
    while i * i <= n:
        if n % i:
            i += 1
        else:
            n //= i
    return p, n


if __name__ == '__main__':
    pool = Pool(processes=3)
    start = datetime.now()
    # this delegates half a million individual tasks to the pool, i.e.
    # largest_prime_factor(0), largest_prime_factor(1), ..., largest_prime_factor(499999)
    pool.map(largest_prime_factor, range(500000))
    pool.close()
    pool.join()
    # single-argument print() works in both Python 2 and Python 3
    print("pool elapsed %s" % (datetime.now() - start))
    start = datetime.now()
    # same work just in the main process
    [largest_prime_factor(i) for i in range(500000)]
    print("single elapsed %s" % (datetime.now() - start))

Output:

pool elapsed 0:00:04.664000
single elapsed 0:00:08.939000

(The largest_prime_factor function is taken from @Stefan in this answer.)

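As you can see, the pool is only roughly twice as fast as single-process execution of the same amount of work, even though it runs in three processes in parallel. That is due to the overhead introduced by multiprocessing and the pool.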

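So, you stated that the code in your example has been simplified. You will have to analyze your original code to see whether it can be broken down into homogeneous tasks that can be passed to a pool for processing. If that is possible, multiprocessing might help you speed up your program. If not, multiprocessing will likely cost you time rather than save it.

Edit:
Since you asked for suggestions on the code: I can hardly say anything about your function. You said yourself that it is just a simplified example, provided as an MCVE (much appreciated, by the way! Most people do not take the time to strip their code down to the bare minimum). Requests for a code review are better suited to Codereview anyway.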

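Play around a bit with the available methods of task delegation. In my prime factor example, using apply_async came with a massive penalty: execution time increased ninefold compared to using map. But my example uses just a simple iterable, whereas yours needs three arguments per task. This could be a case for starmap, but that is only available as of Python 3.3.
In any case, the structure and nature of your task data basically determine the correct method to use.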
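To illustrate the starmap suggestion, here is a minimal sketch for Python 3.3 or later. The worker count_diagonal_pairs and the helper run_tasks are hypothetical stand-ins, not the function from the question; the point is only that each task is a tuple which starmap unpacks into positional arguments.

```python
from multiprocessing import Pool


def count_diagonal_pairs(total_frames, origin_count, target_count):
    """Hypothetical stand-in for a task that takes three arguments."""
    return total_frames * min(origin_count, target_count)


def run_tasks(tasks, processes=3):
    """Delegate each argument tuple to the pool; starmap unpacks the tuple."""
    with Pool(processes=processes) as pool:
        return pool.starmap(count_diagonal_pairs, tasks)


if __name__ == '__main__':
    # One tuple per task, in the same order as the worker's parameters.
    tasks = [(20, 50, 50), (10, 30, 40), (5, 8, 2)]
    print(run_tasks(tasks))
```

On Python 2, where starmap does not exist, a common workaround is a small module-level wrapper that accepts one tuple and unpacks it itself, used together with map.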

I did some quick testing of your example function with multiprocessing. The input was defined like this:

inp = [(20, 50, 50)] * 5000  # that makes 5000 tasks against your Main_Residence

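I ran that in Python 3.6 with three subprocesses against your function, unaltered except for the removal of the print statement (I/O is costly). I used starmap, apply, starmap_async and apply_async, and each time also iterated through the results to account for the blocking get() on the async results.
Here is the output: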

starmap elapsed 0:01:14.506600
apply elapsed 0:02:11.290600
starmap async elapsed 0:01:27.718800
apply async elapsed 0:01:12.571200
# btw: 5k calls to Main_Residence in the main process looks as bad 
# as using apply for delegation
single elapsed 0:02:12.476800

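As you can see, the execution times differ although all four methods do the same amount of work; the apply_async you picked appears to be the fastest method.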
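A comparison like the one above could be set up roughly as follows. This is a sketch under assumptions, not the script that produced those timings: work is a hypothetical placeholder for Main_Residence, the helper names are invented for illustration, and the task count is reduced so the run finishes quickly. It requires Python 3.3 or later.

```python
from datetime import datetime
from multiprocessing import Pool


def work(total_frames, origin_count, target_count):
    """Hypothetical CPU-bound placeholder for Main_Residence."""
    return sum(1 for _ in range(total_frames)
               for origin in range(origin_count)
               for target in range(target_count)
               if origin == target)


def via_starmap(pool, tasks):
    return pool.starmap(work, tasks)


def via_apply(pool, tasks):
    # apply blocks until each single task has finished
    return [pool.apply(work, args) for args in tasks]


def via_starmap_async(pool, tasks):
    return pool.starmap_async(work, tasks).get()


def via_apply_async(pool, tasks):
    pending = [pool.apply_async(work, args) for args in tasks]
    # get() blocks, so collect results only after everything is submitted
    return [result.get() for result in pending]


def timed(label, func, *args):
    """Run func(*args), print the elapsed wall-clock time, return the result."""
    start = datetime.now()
    results = func(*args)
    print(label, "elapsed", datetime.now() - start)
    return results


if __name__ == '__main__':
    tasks = [(20, 50, 50)] * 200  # far fewer tasks than the 5000 used above
    with Pool(processes=3) as pool:
        for method in (via_starmap, via_apply, via_starmap_async, via_apply_async):
            timed(method.__name__, method, pool, tasks)
    timed("single", lambda: [work(*args) for args in tasks])
```

Absolute numbers depend heavily on the machine, the task size and the number of tasks, so the ranking of the methods may well differ from the figures quoted above.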

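Coding style. Your code looks quite unconventional :) You use Capitalized_Words_With_Underscores for your names (both function and variable names), which is pretty much a no-no in Python. Also, assigning the name Previous_List to a dictionary is questionable. Have a look at PEP 8, especially the Naming Conventions section, to see the commonly accepted coding style for Python.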
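As a rough illustration of those naming conventions, the function from the question might look like this in PEP 8 style. The behaviour is intended to be unchanged apart from the removed print; the new names (main_residence, origin_count, target_count, previous_residence, current_residence) are merely suggestions.

```python
def main_residence(total_frames, origin_count, target_count):
    """PEP 8 styled sketch of Main_Residence from the question."""
    previous_residence = {}  # pair -> number of consecutive frames seen
    residence_list = []

    for _frame in range(total_frames):
        current_residence = {}
        for origin in range(origin_count):
            for target in range(target_count):
                pair = (origin, target)
                if pair in current_residence or origin != target:
                    continue
                # Extend the count if the pair was present in the previous frame.
                current_residence[pair] = previous_residence.get(pair, 0) + 1

        # Pairs that disappeared contribute their final residence time.
        for pair, frames in previous_residence.items():
            if pair not in current_residence:
                residence_list.append(frames)

        previous_residence = current_residence
    return residence_list
```

Lowercase names with underscores are used for functions and variables, and the dictionaries are no longer called lists. Note that with this simplified logic every pair is present in every frame, so the returned list stays empty.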

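Judging by the way your print looks, you are still using Python 2. I know that in corporate or institutional environments that is sometimes all you have available. Still, keep in mind that the clock for Python 2 is ticking.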