Store objects between remote functions in Ray

Question

I'm writing a project which writes using the same data a ton of times, and I have been using ray to scale this up in a cluster setting, however the files are too large to send back and forth/save on the ray object store all the time. Is there a way to save the python objects on the local nodes

Accepted Answer

Writing to files always tends to be tricky in distributed systems since regular file systems aren&#8217;t shared between machines. Ray generally doesn&#8217;t interfere with the file system, but I think you have a few options here.Expand the object store size: You can change the plasma store size and where it&#8217;s stored to a larger file by setting the --object-store-memory and --plasma-directory flags.Use a distributed file system: Distributed filesystems like NFS allow you to share part of your filesystem across machines. If you manually set up an NFS share, you can direct Ray to write to a file within NFS.Don&#8217;t use a filesystem: While this is technically a non-answer, this is arguably the most typical approach to distributed systems. Instead of writing to your filesystem, consider writing to S3 or similar KV store or Blob Store.Downsides of these approaches:The biggest downside of (1) is that if you aren&#8217;t careful, you could badly affect your performance.The biggest downside of (2) is that it can be slow. In particular, if you need to read and write data from multiple nodes. A secondary downside is that you will have to setup NFS yourself.The biggest downside to (3) is that you&#8217;re now relying on an external service, and it arguably isn&#8217;t a direct solution to your problem.

Advertisement

Answer