Process a lot of data without waiting for a chunk to finish

Question

I am confused by map, imap, apply_async, apply, Process, etc. from the multiprocessing Python package. What I would like to do: I have 100 simulation script files that need to be run through a simulation program. I would like Python to run as many as it can in parallel, then as soon as one is finished, grab a…

Accepted Answer

It works correctly using map. The trouble is simply that every worker sleeps for 5 seconds, so they all finish at the same time. Try this code, where each run sleeps for a random time, to see the effect correctly:

    import multiprocessing as mp
    import random
    import time


    def run_sim(x):
        # Simulate a run that takes a random 3 to 10 seconds
        t = random.randint(3, 10)
        print("Running Sim: ", x, " - sleep ", t)
        time.sleep(t)
        return x


    def main():
        # x => my simulation files
        x = list(range(100))
        # Run in parallel, leaving one core free but keeping at least one worker
        with mp.Pool(max(1, mp.cpu_count() - 1)) as pool:
            # map blocks until every task is done and returns results in input order
            result = pool.map(run_sim, x)
        print("Results: ", result)


    if __name__ == "__main__":
        main()
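If you want to handle each result as soon as its simulation finishes, rather than after the whole batch, `Pool.imap_unordered` is one option, since `map` only returns once every task is done. Below is a minimal sketch, not a definitive implementation. The assumptions are mine: `run_sim` is the same kind of stand-in as above, the eight integers in `files` are hypothetical placeholders for the simulation files, and the sleep is shortened so the demo finishes quickly.

```python
import multiprocessing as mp
import random
import time


def run_sim(x):
    # Stand-in for one simulation run; the short random sleep is an
    # assumption chosen only so the demo finishes quickly.
    time.sleep(random.uniform(0.05, 0.3))
    return x


def main():
    files = list(range(8))  # hypothetical stand-ins for the simulation files
    workers = max(1, mp.cpu_count() - 1)
    with mp.Pool(workers) as pool:
        # imap_unordered yields each result as soon as its worker finishes,
        # in completion order rather than input order.
        for done in pool.imap_unordered(run_sim, files):
            print("Finished sim:", done)


if __name__ == "__main__":
    main()
```

The results arrive in whatever order the workers finish, so return the input identifier from `run_sim` (as done here) if you need to know which file a result belongs to. If the order must match the input, `pool.imap` yields lazily too, but waits for each result in turn.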

Answer