I thought that SharedMemory would keep the values of the target array, but when I actually tried it, it seems it doesn't.
```python
from multiprocessing import Process, Semaphore, shared_memory
import numpy as np
import time

dtype_eV = np.dtype({
    'names': ['idx', 'value', 'size'],
    'formats': ['int32', 'float64', 'float64']
})

def worker_writer(id, number, a, shm):
    exst_shm = shared_memory.SharedMemory(name=shm)
    b = np.ndarray(a.shape, dtype=a.dtype, buffer=exst_shm.buf)
    for i in range(5):
        time.sleep(0.5)
        b['idx'][i] = i

def worker_reader(id, number, a, shm):
    exst_shm = shared_memory.SharedMemory(name=shm)
    b = np.ndarray(a.shape, dtype=a.dtype, buffer=exst_shm.buf)
    for i in range(5):
        time.sleep(1)
        print(b['idx'][i], b['value'][i])

if __name__ == "__main__":
    a = np.zeros(5, dtype=dtype_eV)
    a['value'] = 100
    shm = shared_memory.SharedMemory(create=True, size=a.nbytes)
    c = np.ndarray(a.shape, dtype=a.dtype, buffer=shm.buf)
    th1 = Process(target=worker_writer, args=(1, 50000000, a, shm.name))
    th2 = Process(target=worker_reader, args=(2, 50000000, a, shm.name))
    th1.start()
    th2.start()
    th1.join()
    th2.join()

'''
result:
0 0.0
1 0.0
2 0.0
3 0.0
4 0.0
'''
```
In the code above, the two processes can share one array (a) and access it. But the value that was assigned before sharing (a['value'] = 100) is missing. Is that just how it works, or is there a way to keep the value even after sharing?
Answer
Here's an example of how to use `shared_memory` with numpy. It was pasted together from several of my other answers, but there are a couple of pitfalls to keep in mind with `shared_memory`:
- When you create a numpy ndarray from a shm object, it doesn't prevent the shm from being garbage collected. The unfortunate side effect of this is that the next time you try to access the array, you get a segfault. For another question I created a quick ndarray subclass that just attaches the shm as an attribute, so a reference sticks around and it doesn't get GC'd (the pattern that goes wrong is sketched right after this list).
- Another pitfall is that on Windows, the OS tracks when to delete the memory rather than giving you control over it. That means that even if you don't call unlink, the memory will get deleted once there are no active references to that particular segment of memory (identified by its name). The way to solve this is to make sure you keep an shm open in the main process that outlives all child processes. Calling close and unlink at the end keeps that reference around until the end, and makes sure you don't leak memory on other platforms.
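To make the first pitfall concrete, here is a minimal sketch of the pattern it warns about (the function name and block name are made up for illustration): the SharedMemory object is only referenced locally, so nothing keeps it alive after the function returns.

```python
from multiprocessing.shared_memory import SharedMemory
import numpy as np

def attach_view(name, shape, dtype):
    # hypothetical helper: attach to an existing block by name
    shm = SharedMemory(name=name)
    # the array references shm.buf, but nothing holds on to shm itself
    return np.ndarray(shape, dtype=dtype, buffer=shm.buf)

# view = attach_view('existing_block_name', (10,), 'f4')
# Once attach_view returns, the local shm reference is gone; this is the
# situation described above where a later access to view can segfault.
# The SHMArray class below avoids it by keeping shm attached to the array.
```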
```python
import numpy as np
import multiprocessing as mp
from multiprocessing.shared_memory import SharedMemory

class SHMArray(np.ndarray):  # copied from https://numpy.org/doc/stable/user/basics.subclassing.html#slightly-more-realistic-example-attribute-added-to-existing-array
    '''an ndarray subclass that holds on to a ref of shm so it doesn't get garbage collected too early.'''
    def __new__(cls, input_array, shm=None):
        obj = np.asarray(input_array).view(cls)
        obj.shm = shm
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.shm = getattr(obj, 'shm', None)

def child_func(name, shape, dtype):
    shm = SharedMemory(name=name)
    arr = SHMArray(np.ndarray(shape, buffer=shm.buf, dtype=dtype), shm)
    arr[:] += 5
    shm.close()  # be sure to clean up your shm's locally when they're not needed (referring to arr after this will segfault)

if __name__ == "__main__":
    shape = (10,)  # 1d array 10 elements long
    dtype = 'f4'   # 32 bit floats
    dummy_array = np.ndarray(shape, dtype=dtype)  # dummy array to calculate nbytes
    shm = SharedMemory(create=True, size=dummy_array.nbytes)
    arr = np.ndarray(shape, buffer=shm.buf, dtype=dtype)  # create the real arr backed by the shm
    arr[:] = 0
    print(arr)  # should print arr full of 0's

    p1 = mp.Process(target=child_func, args=(shm.name, shape, dtype))
    p1.start()
    p1.join()

    print(arr)  # should print arr full of 5's
    shm.close()   # be sure to clean up your shm's
    shm.unlink()  # call unlink when the actual memory can be deleted
```
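To tie this back to the original question: a freshly created SharedMemory block is zero-filled, and nothing from a is copied into it automatically, which is why the reader printed 0.0 instead of 100. Below is a minimal sketch (reusing the asker's dtype and variable names) of copying the initial data into the shm-backed array before handing the name to child processes.

```python
import numpy as np
from multiprocessing.shared_memory import SharedMemory

dtype_eV = np.dtype({
    'names': ['idx', 'value', 'size'],
    'formats': ['int32', 'float64', 'float64']
})

if __name__ == "__main__":
    a = np.zeros(5, dtype=dtype_eV)
    a['value'] = 100

    shm = SharedMemory(create=True, size=a.nbytes)
    c = np.ndarray(a.shape, dtype=a.dtype, buffer=shm.buf)
    c[:] = a  # copy the existing values into the shared block

    print(c['value'])  # [100. 100. 100. 100. 100.]

    shm.close()
    shm.unlink()
```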