ProcessPoolExecutor: handling the BrokenProcessPool exception

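This document (https://pymotw.com/3/concurrent.futures/) says:

"The ProcessPoolExecutor works in the same way as ThreadPoolExecutor, but uses processes instead of threads. This allows CPU-intensive operations to use a separate CPU and not be blocked by the CPython interpreter's global interpreter lock."

Sounds great! It also says:

"If something happens to one of the worker processes to cause it to exit unexpectedly, the ProcessPoolExecutor is considered "broken" and will no longer schedule tasks."

Sounds bad :( So my question is: what counts as "unexpected"? Does it mean the exit signal is not 1? Can I safely exit the thread and keep processing the queue? Here is an example: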

from concurrent import futures
import os
import signal


with futures.ProcessPoolExecutor(max_workers=2) as ex:
    print('getting the pid for one worker')
    f1 = ex.submit(os.getpid)
    pid1 = f1.result()

    print('killing process {}'.format(pid1))
    os.kill(pid1, signal.SIGHUP)

    print('submitting another task')
    f2 = ex.submit(os.getpid)
    try:
        pid2 = f2.result()
    except futures.process.BrokenProcessPool as e:
        print('could not start new tasks: {}'.format(e))

1 Answer

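I haven't seen this happen in real life, but judging from the code, the pool is marked broken when the file descriptors returned by wait() do not include the result_queue reader, that is, when only a worker's sentinel became ready.

From concurrent.futures.process: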

    reader = result_queue._reader

    while True:
        _add_call_item_to_queue(pending_work_items,
                                work_ids_queue,
                                call_queue)

        sentinels = [p.sentinel for p in processes.values()]
        assert sentinels
        ready = wait([reader] + sentinels)
        if reader in ready:  # <===================================== THIS
            result_item = reader.recv()
        else:
            # Mark the process pool broken so that submits fail right now.
            executor = executor_reference()
            if executor is not None:
                executor._broken = True
                executor._shutdown_thread = True
                executor = None
            # All futures in flight must be marked failed
            for work_id, work_item in pending_work_items.items():
                work_item.future.set_exception(
                    BrokenProcessPool(
                        "A process in the process pool was "
                        "terminated abruptly while the future was "
                        "running or pending."
                    ))
                # Delete references to object. See issue16284
                del work_item
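
To make the else branch above concrete, here is a minimal sketch (mine, not taken from the library source) that runs three kinds of task, each in a fresh single-worker pool. The names classify and raise_error are hypothetical helpers, and the "fork" start method is an assumption that limits the sketch to Unix-like systems. As far as I can tell, an exception raised inside a task travels back through the result queue and leaves the pool usable, whereas a worker that exits on its own, whatever the exit code, leaves only its sentinel ready and breaks the pool.

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def raise_error():
    # An ordinary exception is caught inside the worker and sent back
    # through the result queue, so the worker process itself survives.
    raise ValueError("task failed")


def classify(fn, *args):
    """Run fn in a fresh single-worker pool and report what happened."""
    # Assumption: "fork" is available (Linux and other Unix-like systems).
    ctx = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as ex:
        try:
            ex.submit(fn, *args).result()
        except BrokenProcessPool:
            return "pool broken"
        except Exception:
            # The task failed; check that the pool still accepts work.
            ex.submit(os.getpid).result()
            return "task error, pool usable"
    return "ok"


if __name__ == "__main__":
    print(classify(os.getpid))    # a task that returns normally
    print(classify(raise_error))  # a task that raises
    print(classify(os._exit, 0))  # the worker exits without sending a result
```

BrokenProcessPool is caught first because it is itself an Exception subclass; swapping the two except clauses would hide the broken-pool case.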

The wait function is platform-dependent, but assuming Linux, it looks like this (from multiprocessing.connection, with all timeout-related code removed):

    def wait(object_list, timeout=None):
        '''
        Wait till an object in object_list is ready/readable.

        Returns list of those objects in object_list which are ready/readable.
        '''
        with _WaitSelector() as selector:
            for obj in object_list:
                selector.register(obj, selectors.EVENT_READ)

            while True:
                ready = selector.select(timeout)
                if ready:
                    return [key.fileobj for (key, events) in ready]
                else:
                    # some timeout code
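
The same wait call can be tried outside the executor. The following is a rough sketch under my own assumptions: child and what_is_ready are hypothetical names, the "fork" start method is assumed (Unix-like systems only), and a plain Pipe stands in for the executor's result queue. It waits on a pipe reader plus a process sentinel and reports which branch of the "if reader in ready" test would be taken.

```python
import multiprocessing
import os
from multiprocessing.connection import wait


def child(conn, send_result):
    if send_result:
        conn.send("done")  # behaves like a worker reporting a result
    else:
        os._exit(1)        # dies without writing anything


def what_is_ready(send_result):
    """Start one child and report which object made wait() return."""
    # Assumption: "fork" is available (Linux and other Unix-like systems).
    ctx = multiprocessing.get_context("fork")
    reader, writer = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=child, args=(writer, send_result))
    proc.start()
    # The parent deliberately keeps its copy of the write end open, so a
    # dead child does not show up as end-of-file on the reader.
    ready = wait([reader, proc.sentinel])
    if reader in ready:
        reader.recv()
        outcome = "result"
    else:
        outcome = "worker died"
    proc.join()
    reader.close()
    writer.close()
    return outcome


if __name__ == "__main__":
    print(what_is_ready(True))
    print(what_is_ready(False))
```

When the child sends before exiting, the data is already in the pipe by the time the sentinel fires, so the reader is in the ready list. When the child dies silently, only the sentinel is ready, which is the situation the executor treats as a broken pool.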


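As for continuing to process work after a worker is killed: a broken executor does not recover, so one option is to discard it and create a new one. This is only a sketch of that idea, not an established recipe. make_pool and demo are hypothetical names, the "fork" start method is assumed, and SIGKILL is used instead of the question's SIGHUP so the signal cannot be ignored.

```python
import multiprocessing
import os
import signal
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def make_pool():
    # Assumption: "fork" is available (Linux and other Unix-like systems).
    ctx = multiprocessing.get_context("fork")
    return ProcessPoolExecutor(max_workers=2, mp_context=ctx)


def demo():
    ex = make_pool()
    pid1 = ex.submit(os.getpid).result()
    os.kill(pid1, signal.SIGKILL)  # abrupt kill, as in the question
    try:
        # Either submit() or result() may raise once the pool is broken.
        pid2 = ex.submit(os.getpid).result()
    except BrokenProcessPool:
        # The old executor stays broken; replace it and resubmit the task.
        ex.shutdown(wait=True)
        ex = make_pool()
        pid2 = ex.submit(os.getpid).result()
    ex.shutdown(wait=True)
    return pid1, pid2


if __name__ == "__main__":
    print(demo())
```

Resubmitting is only reasonable when the task itself is safe to run again; a task that kills its own worker would break every replacement pool too, so a real retry loop would need a limit.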