
When processes are spawned, does a Lock have a different id?

I’m trying to figure out how Lock works under the hood. I run this code on macOS, which uses “spawn” as the default method to start new processes.

from multiprocessing import Process, Lock, set_start_method
from time import sleep


def f(lock, i):
    lock.acquire()
    print(id(lock))
    try:
        print('hello world', i)
        sleep(3)
    finally:
        lock.release()

if __name__ == '__main__':
    # set_start_method("fork")
    lock = Lock()
    for num in range(3):
        p = Process(target=f, args=(lock, num))
        p.start()
        p.join()

Output:

140580736370432
hello world 0
140251759281920
hello world 1
140398066042624
hello world 2

The Lock works in my code. However, the ids of the lock confuse me. Since the ids are different, is it still one and the same lock, or are there multiple locks that somehow communicate behind the scenes? Does id() still hold the memory position under multiprocessing? I quote: “CPython implementation detail: id is the address of the object in memory.”

If I use the “fork” method, set_start_method("fork"), it prints identical ids, which makes total sense to me.


Answer

id() is implemented as, but is not required to be, the memory address of the given object. With fork, the child process starts as a copy of the parent's address space, and memory pages are only physically copied once one side modifies them (copy-on-write). The lock therefore sits at the same virtual address in the child, so id() returns the same value and, as far as the child is concerned, it “is” the same object.

With spawn, an entirely new process is created and the __main__ file is imported like a library into its namespace, so all the same functions, classes, and module-level variables are accessible (minus anything produced by the code under if __name__ == "__main__":). Python then creates a connection between the processes (a pipe) over which it sends which function to call and the arguments to call it with. Everything passing through this pipe must be pickled and then unpickled.

Locks specifically are re-created when unpickling by asking the operating system for a lock with a specific name (the name was created in the parent process when the lock was created, and it is sent across using pickle). This is how the two locks are synchronized: both are backed by an object that the operating system controls. Python then stores this lock along with some other data (the PyObject, as it were) in the memory of the new process. Calling id() now returns the location of this struct, which is different because it was created by a different process in a different chunk of memory.
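The pickle round trip is enough on its own to produce a new id, and that part can be seen without starting any process. The sketch below is only an illustration: it uses a plain dict as a stand-in (the names original and rebuilt are made up for this example), because a multiprocessing.Lock refuses to be pickled outside of process creation.

```python
import pickle

# A plain dict stands in for "an argument passed to a spawned child".
original = {"owner": "parent", "values": [1, 2, 3]}

# Serialize and rebuild, roughly what happens to arguments under spawn.
rebuilt = pickle.loads(pickle.dumps(original))

print("equal contents:", rebuilt == original)
print("same object:   ", rebuilt is original)
print("same id:       ", id(rebuilt) == id(original))
```

The rebuilt object compares equal to the original but is a separate Python object, so its id differs. A spawned child's copy of the lock is in the same position, with the extra twist that both copies refer to one operating-system lock.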

Here’s a quick example to convince you that a spawned lock is still synchronized:

from multiprocessing import Process, Lock, set_start_method

def foo(lock):
    with lock:
        print(f'child process lock id: {id(lock)}')

if __name__ == "__main__":
    set_start_method("spawn")
    lock = Lock()
    print(f'parent process lock id: {id(lock)}')
    lock.acquire()  # lock the lock so the child has to wait
    p = Process(target=foo, args=(lock,))
    p.start()
    input('press enter to unlock the lock')
    lock.release()
    p.join()

The different ids are the different PyObject locations, and they have little to do with the underlying mutex. I am not aware of a direct way to inspect the underlying lock that the operating system manages.
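Since the underlying lock can't easily be inspected, one way to check the behaviour without a keyboard prompt is to have a spawned child attempt a non-blocking acquire and report back. This is a minimal sketch, not the only way to do it: try_lock and run_demo are hypothetical helper names, and the 30-second queue timeout is an arbitrary safety margin.

```python
import multiprocessing as mp


def try_lock(lock, results):
    """Attempt a non-blocking acquire and report whether it succeeded."""
    acquired = lock.acquire(False)  # False = do not wait for the lock
    if acquired:
        lock.release()
    results.put(acquired)


def run_demo():
    # Use an explicit spawn context instead of changing the global default.
    ctx = mp.get_context("spawn")
    lock = ctx.Lock()
    results = ctx.Queue()

    # First child runs while the parent is holding the lock.
    lock.acquire()
    child = ctx.Process(target=try_lock, args=(lock, results))
    child.start()
    while_held = results.get(timeout=30)
    child.join()
    lock.release()

    # Second child runs after the parent has released it.
    child = ctx.Process(target=try_lock, args=(lock, results))
    child.start()
    after_release = results.get(timeout=30)
    child.join()
    return while_held, after_release


if __name__ == "__main__":
    while_held, after_release = run_demo()
    print("child acquired while parent held the lock:", while_held)
    print("child acquired after parent released it:  ", after_release)
```

If the child's lock really is tied to the parent's, the first attempt should fail and the second should succeed, even though id(lock) differs in each process.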

User contributions licensed under: CC BY-SA