I am trying to run my simulations in a thread pool and store the results of each repetition in a global numpy array. However, I run into problems doing that, and I am observing really interesting behavior with the following simplified code (Python 3.7):
```python
import random

import numpy as np
from multiprocessing import Pool, Lock

log_mutex = Lock()
repetition_count = 5
thread_count = 5
data_array = np.zeros(shape=(repetition_count, 3, 200), dtype=float)

def record_results(repetition_index, data_array, log_mutex):
    log_mutex.acquire()
    print("Start record {}".format(repetition_index))
    # Do some stuff and modify data_array, e.g.:
    data_array[repetition_index, 0, 53] = 12.34
    print("Finish record {}".format(repetition_index))
    log_mutex.release()

def run(repetition_index):
    global log_mutex
    global data_array
    # do some simulation
    record_results(repetition_index, data_array, log_mutex)

if __name__ == "__main__":
    random.seed()
    with Pool(thread_count) as p:
        print(p.map(run, range(repetition_count)))
```
The issue is: I get the correct "Start record & Finish record" outputs, e.g. Start record 1 … Finish record 1. However, the slices of the numpy array modified by each thread are not kept in the global variable. In other words, the elements that were modified by thread 1 are still zero afterwards, and thread 4 likewise overwrites different parts of the array without effect.
One additional remark: the address of the global array, which I retrieve with
print(hex(id(data_array)))
is the same for all threads inside their log_mutex.acquire() ... log_mutex.release() blocks.
Am I missing something? Is there a separate copy of the global data_array stored for each thread? I am observing behavior like that, but it should not happen when I use the global keyword, or am I wrong?
Answer
Looks like you're running the run function using multiple processes, not multiple threads. Try something like this instead:
```python
import numpy as np
from threading import Thread, Lock

log_mutex = Lock()
repetition_count = 5
data_array = np.zeros(shape=(repetition_count, 3, 200), dtype=float)

def record_results(repetition_index, data_array, log_mutex):
    log_mutex.acquire()
    print("Start record {}".format(repetition_index))
    # Do some stuff and modify data_array, e.g.:
    data_array[repetition_index, 0, 53] = 12.34
    print("Finish record {}".format(repetition_index))
    log_mutex.release()

def run(repetition_index):
    global log_mutex
    global data_array
    record_results(repetition_index, data_array, log_mutex)

if __name__ == "__main__":
    threads = []
    for i in range(repetition_count):
        t = Thread(target=run, args=[i])
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
```
Update:
To do this with multiple processes, you would need to use multiprocessing.RawArray to instantiate your array; the size of the array is the product repetition_count * 3 * 200. Within each process, create a view on the array using np.frombuffer, and reshape it accordingly. While this will be very fast, I discourage this style of programming as it relies on global shared memory objects, which are error-prone in larger programs.
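To make that concrete, here is a minimal sketch of the RawArray approach, reusing the names from your snippet; the initializer function and the shared_buffer global are illustrative names, not part of your original code:

```python
import numpy as np
from multiprocessing import Pool, RawArray

repetition_count = 5
shape = (repetition_count, 3, 200)

# Module-level handle to the shared memory, set by the pool initializer.
shared_buffer = None

def init_worker(buffer):
    global shared_buffer
    shared_buffer = buffer

def run(repetition_index):
    # Re-create a numpy view on the shared memory inside each process.
    data_array = np.frombuffer(shared_buffer, dtype=np.float64).reshape(shape)
    data_array[repetition_index, 0, 53] = 12.34

if __name__ == "__main__":
    # 'd' is a C double, matching np.float64.
    buffer = RawArray('d', repetition_count * 3 * 200)
    with Pool(initializer=init_worker, initargs=(buffer,)) as p:
        p.map(run, range(repetition_count))
    result = np.frombuffer(buffer, dtype=np.float64).reshape(shape)
```

All processes write into the same underlying buffer, so the writes survive, but every worker now depends on module-level shared state.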
If possible, I suggest removing the global data_array and instead instantiating an array in each call to record_results, which you would return from run. The p.map call will then return a list of arrays, which you can convert to a numpy array, recovering the shape and contents of the global data_array in your original implementation. This incurs a communication cost, but it is a cleaner approach to managing concurrency and eliminates the need for locks.
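A minimal sketch of that return-based approach, again reusing the names from your snippet:

```python
import numpy as np
from multiprocessing import Pool

repetition_count = 5

def record_results(repetition_index):
    # Each call builds its own slice instead of writing to a shared global.
    result = np.zeros(shape=(3, 200), dtype=float)
    result[0, 53] = 12.34
    return result

def run(repetition_index):
    # do some simulation, then return this repetition's results
    return record_results(repetition_index)

if __name__ == "__main__":
    with Pool() as p:
        results = p.map(run, range(repetition_count))
    # Stack the per-repetition slices back into the original layout.
    data_array = np.array(results)  # shape (repetition_count, 3, 200)
```

No lock is needed because no two workers ever touch the same object; the parent process assembles the final array from the returned pieces.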
It's generally a good idea to minimize inter-process communication, but unless performance is critical, I don't think shared memory is the right solution here. With p.map, you'll want to avoid returning large objects, but the objects in your snippet are very small: each returned slice is 3 × 200 doubles, i.e. 600 * 8 bytes.