I have a very specific problem with Python parallelisation; let's see if I can explain it. I want to execute a function foo() using the multiprocessing library for parallelisation.
import multiprocessing

# Create the n processes (4 in this case) and start them.
# Note the trailing comma: args must be a tuple, and (i) is just i.
threads = [multiprocessing.Process(target=foo, args=(i,)) for i in range(n)]
for th in threads:
    th.start()
The foo() function is a recursive function that explores a tree depth-first until one specific event happens. Depending on how it expands through the tree, this event can occur after a few steps (for example, 5) or after millions. Each tree node holds a set of elements, and at each step I select a random element from this set with rand_element = random.sample(node.set_of_elements, 1)[0] and make a recursive call according to it, i.e., two different random elements lead to different tree paths.
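The kind of random descent described above can be sketched roughly as follows. This is not the original foo(): the names random_descent, children and is_goal are made up for illustration, the walk is written as a loop rather than recursively (so that millions of steps do not hit the recursion limit), and it assumes the elements are sortable so that random.choice can be used on a sequence.

```python
import random

def random_descent(start, children, is_goal, rng, max_steps=1_000_000):
    """Follow one random path from `start` until is_goal(node) is true.

    `children(node)` is assumed to return the set of elements to choose from.
    Returns the final node and the number of steps taken.
    """
    node, steps = start, 0
    while not is_goal(node) and steps < max_steps:
        options = children(node)
        if not options:
            break  # dead end: nothing left to choose from
        # sorted() turns the set into a sequence that rng.choice accepts.
        node = rng.choice(sorted(options))
        steps += 1
    return node, steps

# Toy tree (an assumption for the example): node n has children 2n and 2n+1.
def toy_children(n):
    return {2 * n, 2 * n + 1}

final_node, n_steps = random_descent(
    1, toy_children, lambda n: n >= 1000, random.Random()
)
print(f"Number of steps: {n_steps}")
```

Passing an explicit random.Random instance makes it easy to give each process its own generator, which helps when checking whether the processes really behave independently.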
The problem is that, for some unknown reason, the processes apparently do not behave independently. For example, if I run 4 processes in parallel, sometimes they return this result:
1, Number of steps: 5
2, Number of steps: 5
3, Number of steps: 5
4, Number of steps: 5
That is to say, all the processes take the “good path” and end in very few steps. Other times, they return this:
1, Number of steps: 6516
2, Number of steps: 8463
3, Number of steps: 46114
4, Number of steps: 56312
That is to say, all the processes take “bad paths”. I haven’t had a single execution in which at least one process takes the “good path” and the rest take the “bad path”.
If I run foo() multiple times sequentially, more than half of the executions end in fewer than 5000 steps, but when running concurrently I don’t see this proportion: all the processes end either fast or slow.
How is it possible?
Sorry if I can’t give you more precise details about the program and execution, but it is too big and complex to explain here.
Answer
I have found the solution; I am posting it in case someone finds it helpful.
The problem was that at some point inside foo() I had used my_set.pop() instead of my_set.remove(random.sample(my_set, 1)[0]). The first one, my_set.pop(), doesn’t actually return a random element. In Python 3.6 a set has a definite internal order, much like a list. The key is that this order looks random, but my_set.pop() does not draw a (pseudo)random element from it: it always returns the first element in that order. In my case all the processes shared that order, so my_set.pop() returned the same first element in all of them.
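To illustrate the difference, here is a minimal sketch (not the original code) of the two behaviours. The helper name pop_random is made up for this example and it assumes the set elements are sortable. It uses random.choice over a sorted list rather than random.sample(my_set, 1), because passing a set directly to random.sample is no longer accepted on recent Python versions (3.11 and later).

```python
import random

def pop_random(my_set, rng=random):
    """Remove and return a pseudo-random element of my_set (hypothetical helper)."""
    # sorted() turns the set into a sequence; this assumes orderable elements.
    element = rng.choice(sorted(my_set))
    my_set.remove(element)
    return element

# Two sets built the same way in the same interpreter have the same internal
# order, so pop() gives the same element for both: arbitrary, but not random.
a = set(range(100, 110))
b = set(range(100, 110))
same_pop = a.pop() == b.pop()
print("pop() agrees on both sets:", same_pop)

# pop_random() draws from the random generator instead.
c = set(range(100, 110))
picked = pop_random(c)
print("randomly removed element:", picked)
```

If the processes are copies of the same parent and each holds an identically built set, pop() behaves in every one of them the way it does for a and b here, which matches the all-fast or all-slow pattern described in the question.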