I'm new to Python and want to learn how to do parallel processing. I found the following example:
import multiprocessing as mp
import numpy as np

np.random.RandomState(100)
arr = np.random.randint(0, 10, size=[20, 5])
data = arr.tolist()

def howmany_within_range_rowonly(row, minimum=4, maximum=8):
    count = 0
    for n in row:
        if minimum <= n <= maximum:
            count = count + 1
    return count

pool = mp.Pool(mp.cpu_count())
results = pool.map(howmany_within_range_rowonly, [row for row in data])
pool.close()
print(results[:10])
But when I run it, I get this error:
RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.
What should I do?
Answer
If you move everything that currently runs at module level (in global scope) inside an if __name__ == "__main__": block, as follows, you should find that your program behaves as you expect:
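import multiprocessing as mp
import numpy as np

def howmany_within_range_rowonly(row, minimum=4, maximum=8):
    # Count the values in row that lie within [minimum, maximum].
    count = 0
    for n in row:
        if minimum <= n <= maximum:
            count = count + 1
    return count

if __name__ == "__main__":
    # Draw from a seeded generator; creating a RandomState without
    # using it does not seed np.random.
    rng = np.random.RandomState(100)
    arr = rng.randint(0, 10, size=[20, 5])
    data = arr.tolist()

    pool = mp.Pool(mp.cpu_count())
    results = pool.map(howmany_within_range_rowonly, [row for row in data])
    pool.close()
    pool.join()  # wait for the worker processes to exit
    print(results[:10])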
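Without this protection, your multiprocessing code runs whenever the module is imported, not only when it is run as a script. That matters here because, when child processes are not started with fork (as the error message says; this is the default on Windows, and on macOS since Python 3.8), each worker process starts a fresh interpreter and imports your main module. The unguarded module-level code would then try to create another Pool inside every worker while that worker is still starting up, which is what the RuntimeError reports. The guard ensures the Pool is created only in the main process.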
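As a minimal sketch of the same idea, here is one way to structure the script so that nothing but definitions runs at import time. The helper names count_rows_in_parallel and main are hypothetical (they are not in your example), and a small hard-coded list stands in for the NumPy data so the result is the same on every run:

```python
import multiprocessing as mp


def howmany_within_range_rowonly(row, minimum=4, maximum=8):
    """Count the values in `row` that fall within [minimum, maximum]."""
    return sum(1 for n in row if minimum <= n <= maximum)


def count_rows_in_parallel(data, processes=2):
    """Hypothetical helper: apply the counting function to every row."""
    # The context manager shuts the pool down when the block is left;
    # map() has already returned all results by then.
    with mp.Pool(processes) as pool:
        return pool.map(howmany_within_range_rowonly, data)


def main():
    # Hard-coded sample data (an assumption made for this sketch).
    data = [[1, 4, 8, 9], [5, 5, 5, 5], [0, 1, 2, 3]]
    print(count_rows_in_parallel(data))  # prints [2, 4, 0]


if __name__ == "__main__":
    # Only the process that runs this file as a script gets here;
    # worker processes that re-import the module skip this call.
    main()
```

Keeping the worker function at module level matters too: the pool has to be able to find it by name when a worker imports the module, so it should not be defined inside the guarded block.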