Python Process Pool with custom Process not able to respawn child processes

Question

I have overridden multiprocess.Process (multiprocess is a fork of the multiprocessing library) like so: When I create a normal Process using this class, everything works perfectly, including creating logs. Now I want to create a Process Pool of such customized processes, but I encountered a problem with respawning such processes after t…

Accepted Answer

"This gives me such output"

The error occurs because you are replacing ctx.Process (a class) with an instance of your own subclass. Instances are not callable unless they define a __call__ method. Even if you replaced it with the subclass itself, it wouldn't work: you would get a recursion or attribute error, because you are replacing a class with a subclass of that same class.

"Why there is no logging? If I don't use pool, logs appear correctly."

This is because you never successfully patched the pool class to use your subclass of Process. This also ties into your second question (read on).

"Why after four processes being executed, the new ones that should be respawned have problem to be created? (Not callable error). If I remove the maxtasksperchild argument it works perfectly (0, 1, 4, 9, 16, 25...)"

This happens because the pool creates its processes when you start the context manager itself (on the line with ctx.Pool(processes=4, maxtasksperchild=1) as pool). Since you apply your patch after the processes have started, it has no effect unless the pool needs to start processes again, and this is where maxtasksperchild comes in. If you provide maxtasksperchild, the pool will attempt to start another process once a worker has finished its task, and because of the faulty patch that attempt raises the error.
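The "not callable" failure can be reproduced without a pool at all. The sketch below is only an illustration: it uses the standard library's multiprocessing (assumed here to behave like multiprocess for this purpose), and LoggingProcess and describe_patch are hypothetical names, not code from the question.

```python
import multiprocessing


class LoggingProcess(multiprocessing.Process):
    """Hypothetical stand-in for the customized Process subclass."""

    def __init__(self, *args, test_name="", **kwargs):
        super().__init__(*args, **kwargs)
        self._test_name = test_name


def describe_patch(candidate):
    """Report whether `candidate` could be called like a process factory."""
    return "callable" if callable(candidate) else "not callable"


# A class can be called to build new process objects...
class_result = describe_patch(LoggingProcess)
# ...but an already constructed instance cannot, unless it defines __call__.
instance_result = describe_patch(LoggingProcess(test_name="demo"))

try:
    # Calling an instance is the kind of call a respawn ends up making
    # when an instance was patched in where a class was expected.
    LoggingProcess(test_name="demo")()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(class_result, instance_result, error_message)
```

No process is started here; the objects are only constructed, which is enough to show why the instance cannot take the place of the class.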
If you don't set maxtasksperchild, the pool won't care about the patch you applied, since it never has to start a process again.

Regardless, here's a better patch to do what you want:

    from multiprocess.pool import Pool
    from functools import partial
    import multiprocess
    import time


    class Process(multiprocess.Process):
        def __init__(self, *args, test_name='', **kwargs) -> None:
            multiprocess.Process.__init__(self, *args, **kwargs)
            self._parent_conn, self._child_conn = multiprocess.Pipe()
            self._exception = None
            self._test_name = test_name

        def run(self) -> None:
            # Put your own implementation here; delegating to the base
            # class keeps the pool's worker loop running.
            multiprocess.Process.run(self)


    def _Process(ctx, *args, **kwds):
        # Build workers from our subclass instead of ctx.Process
        return ctx.MyProcess(*args, **kwds)


    def worker(x):
        print(x ** 2)
        time.sleep(1)


    if __name__ == "__main__":
        ctx = multiprocess.get_context()

        # Some patching: we add our subclass as an attribute to the context
        ctx.MyProcess = Process

        # Fix test_name to be passed as a kwarg whenever the pool starts a
        # process. Pretty lazy, but it gets the job done.
        test_name = 'test_name'
        Pool.Process = partial(_Process, test_name=test_name)

        with ctx.Pool(processes=4, maxtasksperchild=1) as pool:
            nums = range(10)
            pool.map(worker, nums)

Note that test_name is now an optional keyword argument, which is what makes it work with functools.partial. You will probably want to add checks that the value is actually passed and is valid.
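To see what functools.partial contributes to the patch above, the following sketch replays the factory call with stand-ins instead of real processes. RecordingProcess and the SimpleNamespace context are hypothetical, and the final call only approximates how the pool invokes its factory (a context first, then target and args as keywords).

```python
from functools import partial
from types import SimpleNamespace


class RecordingProcess:
    """Hypothetical stand-in that only records how it was constructed."""

    def __init__(self, *args, test_name="", **kwargs):
        self.test_name = test_name
        self.kwargs = kwargs


def _Process(ctx, *args, **kwds):
    # Same shape as the factory in the answer: look the class up on the context.
    return ctx.MyProcess(*args, **kwds)


# A bare namespace plays the role of the multiprocess context here.
ctx = SimpleNamespace(MyProcess=RecordingProcess)

# partial() pins test_name, so a caller that knows nothing about it
# still ends up passing it to the constructor.
factory = partial(_Process, test_name="test_name")

# Roughly how the pool calls its factory when it (re)spawns a worker.
proc = factory(ctx, target=print, args=(1,))
print(proc.test_name, sorted(proc.kwargs))
```

Because the pool never supplies test_name itself, the argument has to be a keyword with a default; a required positional parameter would collide with the arguments the pool passes.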

Answer