Share RLock between multiple instances of Python with multiprocessing

Consider this MWE:

JavaScript

When this script is executed, it instantiates the_setup and serves it. Then I want clients to be able to do things like this from other scripts:

JavaScript

However, I get RuntimeError: RLock objects should only be shared between processes through inheritance. If the line with the_setup.hold_hardware(): is removed, it works fine, but then I cannot guarantee that the hardware wasn’t used by someone else in the middle.

Is it possible to do what I want? I.e. having the_setup running 24/7 and allowing interaction with it at any time from other Python instances. How?

Answer

Update: I have included a patch for multiprocessing.managers that lets managers work with RLocks. Skip ahead to the section “Patching multiprocessing to work with RLocks”.

Multiprocessing does not like it when you pass objects used for synchronization inside other such objects. That means you cannot put semaphores, locks, queues, or pipes inside other queues and pipes (those offered by the multiprocessing library). When you create a manager using BaseManager, it uses pipes internally to communicate between the client processes and the manager process. So when you call the_setup.hold_hardware(), you are essentially attempting to pass an RLock through a pipe, which, as discussed, does not work.
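The failure can be reproduced without any manager. The sketch below is only an illustration (the helper name try_pickle_rlock is made up for this example): it pickles a multiprocessing.RLock by hand, which is roughly what happens when one is sent through a pipe or queue, and returns the resulting error message.

```python
import multiprocessing
import pickle


def try_pickle_rlock():
    """Pickle a multiprocessing.RLock by hand and return the error text, if any.

    try_pickle_rlock is a hypothetical helper for this illustration only.
    """
    lock = multiprocessing.RLock()
    try:
        # Sending an object through a pipe or queue pickles it first.
        pickle.dumps(lock)
    except RuntimeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(try_pickle_rlock())
```

The message returned should be the same one quoted in the question, since the lock refuses to be pickled unless a child process is being spawned at that moment.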

Why workarounds don’t work

The simplest fix one might think of here is to create a manager and use manager.RLock. This uses threading.RLock instead of the one available through multiprocessing (and can therefore be shared using queues/pipes), but works in a multiprocessing environment because access to it is synchronized through pipes (again, it is using a manager).

Hence, this code should at least execute:

server.py

JavaScript

client.py

JavaScript

Note that the manager’s and the client’s authkey must be set to the same value; otherwise there will be an authentication error when attempting to unpickle the lock.

But regardless, even though the code will run, it won’t do what you think it should do. It will block when trying to run the_setup.do_thing_with_hardware(). This is because the RLock actually lives inside the manager process, and when you get the lock using with the_setup.hold_hardware() you are actually getting a proxy of the lock instead (try print(type(the_setup.hold_hardware()))). When you attempt to acquire the lock using the proxy, the appropriate function name is sent to the manager process and is executed on the managed object in one of the server’s threads. A threading.RLock identifies its owner by the thread that executed the call, not by the client process that requested it, which defeats the whole purpose of an RLock and makes the RLock offered by managers useless here.
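The thread-based ownership behind this blocking can be seen in isolation. The following is a minimal sketch, not the manager code itself: the helper names acquire_from_other_thread and demo are invented for the illustration, and a plain second thread stands in for a different server thread inside the manager process.

```python
import threading


def acquire_from_other_thread(lock):
    """Try a non-blocking acquire from a fresh thread and report the result."""
    outcome = []

    def worker():
        got = lock.acquire(blocking=False)
        outcome.append(got)
        if got:
            lock.release()

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return outcome[0]


def demo():
    lock = threading.RLock()
    lock.acquire()
    # Same thread: the lock is re-entrant, so a second acquire succeeds.
    same_thread = lock.acquire(blocking=False)
    lock.release()
    # Different thread: the lock is held by someone else, so this fails.
    other_thread = acquire_from_other_thread(lock)
    lock.release()
    return same_thread, other_thread


if __name__ == "__main__":
    print(demo())
```

With a blocking acquire instead of blocking=False, the second thread would simply wait, which is the hang described above.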

Patching multiprocessing to work with RLocks

If you really want to use RLocks inside managers, then you will need to make your own managers and proxies, by subclassing BaseManager and BaseProxy, to relay process identifiers (like pids) which can then be used to create RLocks. Consider the below “patch”:

JavaScript

Here, PIDProxy is identical to BaseProxy, except that in addition to the method name and arguments to be called on the managed object, it also sends an extra identifier with the value str(os.getpid()). Similarly, ForwardPIDManager is identical to BaseManager, except that it uses a modified subclass of multiprocessing.managers.Server, rather than the default parent, to start the server process. PIDServer modifies its parent to accept an extra variable (the process identifier) when unpacking requests from proxies and stores its value inside the current thread’s local storage (created after executing init inside the server process; see the documentation of threading.local for more on thread-local storage). This happens before the requested function is executed, meaning that all methods of the managed object will have access to this storage. Lastly, MyAutoProxy and MyMakeProxyType override the default functions to create proxies using our subclasses instead.

All you need to do now is to use ForwardPIDManager instead of BaseManager, and specify the proxytype kwarg explicitly as MyAutoProxy. All managed objects will then have access to the pid of the process that called the function by doing

JavaScript

from within the functions on the managed object.

Creating RLock

Using this, we can create our own implementation of RLocks, which closely mirrors that of threading.RLock, but uses this pid_registry.forwarded_for to verify owners instead of threading.get_ident:

JavaScript

Keep in mind that ManagerRLock objects are not picklable, and therefore should not be passed by value from the manager to proxies. You can, however, expose its methods (acquire, release) to proxies without risk (example in the section below). Also, these locks are meant to be created directly, so do not use another manager to create them.
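To make the ownership idea concrete, here is a rough sketch of a re-entrant lock whose owner is whatever identifier a callable reports, rather than the current thread. It is not the ManagerRLock from this answer: the class name OwnerKeyedRLock and the get_owner parameter are assumptions made for this illustration. In the patched manager, get_owner would presumably return pid_registry.forwarded_for.

```python
import threading


class OwnerKeyedRLock:
    """Re-entrant lock whose owner is the value returned by get_owner().

    Hypothetical illustration only. Like the lock described in the text,
    it assumes each owner issues its calls one at a time.
    """

    def __init__(self, get_owner):
        self._get_owner = get_owner      # callable returning the caller's id
        self._block = threading.Lock()   # the underlying non-re-entrant lock
        self._owner = None
        self._count = 0

    def acquire(self, blocking=True, timeout=-1):
        me = self._get_owner()
        if self._count > 0 and self._owner == me:
            # Re-entrant acquire by the current owner.
            self._count += 1
            return True
        got = self._block.acquire(blocking, timeout)
        if got:
            self._owner = me
            self._count = 1
        return got

    def release(self):
        if self._count == 0 or self._owner != self._get_owner():
            raise RuntimeError("cannot release a lock owned by another caller")
        self._count -= 1
        if self._count == 0:
            self._owner = None
            self._block.release()


if __name__ == "__main__":
    caller = {"pid": "100"}              # stand-in for the forwarded pid
    lock = OwnerKeyedRLock(lambda: caller["pid"])
    print(lock.acquire(), lock.acquire())    # owner "100" acquires twice
    caller["pid"] = "200"
    print(lock.acquire(blocking=False))      # a different owner is refused
```

A plain threading.Lock is used underneath because it may be released from a thread other than the one that acquired it, which matters when different server threads handle the requests.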

Example implementation

Using our patch and ManagerRLock, our implementation of TheSetup class becomes like this:

JavaScript

One important thing to notice here is that you are not exposing the whole lock, only its methods (look at hold_hardware and release_hardware). This is important because ManagerRLock, just like multiprocessing.RLock, is unpicklable. One side effect of this is that you can’t directly use a context manager for the lock, and you would have to do

JavaScript

inside client.py instead of using a with block. However, if this bothers you, you can create a wrapper for the lock from within the client and use that instead (example given in the next section).

Lastly, also notice how we call init from inside the main module itself. This is because init must be executed in the server process, and since we are retrieving the server and calling serve_forever inside the main process itself, we must execute init there too. If instead you are using the .start() method of managers to create the server in another process, here is the equivalent code:

JavaScript

Final solution

Combine the whole patch provided above (classes PIDProxy, PIDServer, ForwardPIDManager, and functions init, MyMakeProxyType, MyAutoProxy), the ManagerRLock class, and the code related to your implementation (class TheSetup along with the if __name__... block) into one single server.py file. Then, an example client.py which uses the RLock could look like this:

client.py

JavaScript

Notice the use of LockWrapper as a way to use the lock with a context manager.
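For reference, a wrapper of this kind can be very small. The sketch below is an assumption about its shape, not the original listing: it takes any object with hold_hardware and release_hardware methods (the names used in this answer) and calls them on entering and leaving the with block. FakeSetup is a hypothetical stand-in for the proxy so that the example runs without a server.

```python
class LockWrapper:
    """Context manager that calls hold_hardware/release_hardware on a setup.

    The constructor signature is an assumption for this sketch.
    """

    def __init__(self, setup):
        self._setup = setup

    def __enter__(self):
        self._setup.hold_hardware()
        return self._setup

    def __exit__(self, exc_type, exc_value, traceback):
        # Always release, even if the body raised; do not swallow exceptions.
        self._setup.release_hardware()
        return False


class FakeSetup:
    """Hypothetical stand-in for the the_setup proxy; it only records calls."""

    def __init__(self):
        self.calls = []

    def hold_hardware(self):
        self.calls.append("hold")

    def do_thing_with_hardware(self):
        self.calls.append("work")

    def release_hardware(self):
        self.calls.append("release")


def demo():
    setup = FakeSetup()
    with LockWrapper(setup) as held:
        held.do_thing_with_hardware()
    return setup.calls


if __name__ == "__main__":
    print(demo())
```

In a real client, the object passed to LockWrapper would be the the_setup proxy obtained from the manager.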

A word about threads…

The solution relies on threading.local, so trying to access pid_registry.forwarded_for from inside another thread will fail. Hence, if you are using threading inside your shared class, make sure you explicitly pass the pid to the thread when starting it.
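The thread-local behaviour, and the suggested fix of handing the pid to the new thread, can be sketched as follows. Only pid_registry and forwarded_for come from this answer; the helper functions and the pid value "1234" are made up for the illustration.

```python
import threading

# Stand-in for the registry the patch creates in the server process.
pid_registry = threading.local()


def read_forwarded_for():
    """Return the pid visible in the calling thread, or None if unset."""
    return getattr(pid_registry, "forwarded_for", None)


def worker_with_pid(pid):
    """Store the pid passed in explicitly, then read it back."""
    pid_registry.forwarded_for = pid
    return read_forwarded_for()


def run_in_thread(func, *args):
    """Run func(*args) in a new thread and return its result."""
    result = []
    thread = threading.Thread(target=lambda: result.append(func(*args)))
    thread.start()
    thread.join()
    return result[0]


def demo():
    pid_registry.forwarded_for = "1234"  # set in the request-handling thread
    # A new thread does not see the value set above.
    unseen = run_in_thread(read_forwarded_for)
    # Passing the pid as an argument makes it available in the new thread.
    passed = run_in_thread(worker_with_pid, pid_registry.forwarded_for)
    return unseen, passed


if __name__ == "__main__":
    print(demo())
```

Reading pid_registry.forwarded_for directly in the new thread would raise AttributeError; getattr with a default is used here only so the demonstration does not crash.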

Additionally, ManagerRLock expects the processes that access it to be single-threaded (there can be more than one process). This means that if you are running multiple processes, each of which is also multi-threaded, then using ManagerRLock might be unsafe (untested). However, if this is the case, you can trivially extend the patch by passing not only the pid but also the thread identifier (from inside PIDProxy) and storing it inside pid_registry (from inside PIDServer). You will then have access, inside ManagerRLock, to both the thread and the pid that sent the request to the manager, and you can decide the current owner of the lock based on both of these values.

Edit: A quick note about proxies: if you want to use your own proxy (rather than MyAutoProxy), you can do so; just make sure that it subclasses PIDProxy.
