Writing dask bag to DB using custom function

Question

I'm running a function on a dask bag to dump data into a NoSQL DB. Now when I look at the dask task graph, after each partition completes the write_to_db function, it is shown as memory instead of released. My questions: How to tell dask that there is no return value and hence mark memory as relea…

Accepted Answer

Yes, Dask is holding the implicit return None values as results in memory, but these are small, and I wouldn't worry. The output of your compute() will be a set of Nones (actually, to keep the bag paradigm, you might want to make this a list).

Dask does not release the GIL for you, but the DB function you call might; read the docs of that project. If it does not release the GIL, you might see better performance with more processes and fewer threads per process.

This seems like a fine way to go. A version using dask.delayed would likely be about the same number of lines.
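
To make the first point concrete, here is a minimal sketch of a per-partition writer that returns a small list rather than None, so the result keeps the bag shape. It is only an illustration under assumptions: the body of write_to_db and the FAKE_DB list are hypothetical stand-ins for the asker's function and a real NoSQL client, and the threaded scheduler is chosen just so the in-memory stand-in is shared between workers.

```python
import dask.bag as db

# Hypothetical stand-in for a NoSQL collection; a real writer would call
# the database driver's bulk-insert method instead of extending a list.
FAKE_DB = []


def write_to_db(partition):
    """Write one partition and return a one-element list with its size."""
    records = list(partition)
    FAKE_DB.extend(records)   # replace with the real driver call
    return [len(records)]     # a list (not None) keeps the bag paradigm


bag = db.from_sequence(range(10), npartitions=3)

# The threaded scheduler keeps everything in one process, so FAKE_DB is
# shared; with a real database any scheduler could be used.
counts = bag.map_partitions(write_to_db).compute(scheduler="threads")
total_written = sum(counts)
```

The values held in memory per partition are then a single integer each, which is about as cheap as the implicit None. If the database driver turns out not to release the GIL, the same call can be tried with scheduler="processes" and the two timings compared.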

Answer