
Fastest way to store a numpy array in redis

I’m using redis on an AI project.

The idea is to have multiple environment simulators running policies on a lot of CPU cores. The simulators write experience (a list of state/action/reward tuples) to a Redis server (the replay buffer). A training process then reads the experience as a dataset to generate a new policy. The new policy is deployed to the simulators, the data from the previous run is deleted, and the process continues.

The bulk of the experience is captured in the “state”, which is normally represented as a large numpy array of dimension, say, 80 x 80. The simulators generate these as fast as the CPU will allow.

To this end, does anyone have good ideas or experience of the best/fastest/simplest way to write a lot of numpy arrays to Redis? This is all on the same machine for now, but later it could be on a set of cloud servers. Code samples welcome!


Answer

I don’t know if it is fastest, but you could try something like this…

Storing a Numpy array to Redis goes like this – see function toRedis():

  • get shape of Numpy array and encode
  • append the Numpy array as bytes to the shape
  • store the encoded array under supplied key

Retrieving a Numpy array goes like this – see function fromRedis():

  • retrieve from Redis the encoded string corresponding to supplied key
  • extract the shape of the Numpy array from the string
  • extract data and repopulate Numpy array, reshape to original shape
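The two lists above can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact code the answer originally showed: it assumes 2-D arrays, a fixed dtype known to both sides (`np.uint16` is an arbitrary choice here), a header of two big-endian 32-bit integers for the shape, and a client object `r` that offers `set` and `get` the way a redis-py client does.

```python
import struct

import numpy as np


def toRedis(r, a, key):
    """Store the 2-D Numpy array `a` in Redis under `key`."""
    h, w = a.shape
    # Header: height and width as two big-endian unsigned 32-bit ints (8 bytes).
    shape = struct.pack('>II', h, w)
    # tobytes() returns the raw data in C (row-major) order.
    r.set(key, shape + a.tobytes())


def fromRedis(r, key, dtype=np.uint16):
    """Retrieve a 2-D Numpy array stored by toRedis(); dtype is an assumption."""
    encoded = r.get(key)
    # The first 8 bytes hold the shape, the rest is the array data.
    h, w = struct.unpack('>II', encoded[:8])
    # frombuffer() does not copy, so the returned array is read-only.
    return np.frombuffer(encoded, dtype=dtype, offset=8).reshape(h, w)
```

With redis-py, `r` would typically be created with `redis.Redis(host='localhost', port=6379, db=0)`, keeping the default `decode_responses=False` so that `get` returns raw bytes. Call `.copy()` on the result if you need a writable array.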


You could add more flexibility by encoding the dtype of the Numpy array along with the shape. I didn’t do that because you may already know that all your arrays are of one specific type, in which case the extra code would just be bigger and harder to read for no reason.
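If you do want the dtype to travel with the data, one possible layout is sketched below. The function names and the header format (shape, then a one-byte length, then the dtype string from `a.dtype.str`) are assumptions made for illustration, and the sketch only targets simple numeric dtypes on 2-D arrays.

```python
import struct

import numpy as np

HEADER = '>IIB'  # height, width, length of the dtype string (hypothetical layout)


def encode_with_dtype(a):
    """Serialise a 2-D array as header + dtype string + raw data."""
    h, w = a.shape
    dt = a.dtype.str.encode('ascii')  # e.g. b'<f4' for little-endian float32
    return struct.pack(HEADER, h, w, len(dt)) + dt + a.tobytes()


def decode_with_dtype(buf):
    """Rebuild the array produced by encode_with_dtype()."""
    h, w, n = struct.unpack_from(HEADER, buf, 0)
    start = struct.calcsize(HEADER)
    dtype = np.dtype(buf[start:start + n].decode('ascii'))
    # The data begins right after the dtype string.
    return np.frombuffer(buf, dtype=dtype, offset=start + n).reshape(h, w)
```

The resulting bytes can be stored and fetched with the client's `set` and `get` exactly as before; only the payload format changes.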

Rough benchmark on modern iMac:
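A rough timing of this kind can be taken with a small harness like the one below. It is only a sketch: the helper names, the `bench` key, the 80 x 80 `np.uint16` test array and the iteration count are all assumptions, and the figures you get will depend on your machine and Redis setup.

```python
import struct
import time

import numpy as np


def encode(a):
    """Shape header (two big-endian uint32) followed by the raw array bytes."""
    h, w = a.shape
    return struct.pack('>II', h, w) + a.tobytes()


def decode(buf, dtype):
    """Inverse of encode() for a known dtype."""
    h, w = struct.unpack('>II', buf[:8])
    return np.frombuffer(buf, dtype=dtype, offset=8).reshape(h, w)


def time_round_trips(r, n=1000, shape=(80, 80), dtype=np.uint16):
    """Return the mean seconds per write + read round trip through client `r`."""
    rng = np.random.default_rng(0)
    a = rng.integers(0, 256, size=shape).astype(dtype)
    start = time.perf_counter()
    for _ in range(n):
        r.set('bench', encode(a))
        b = decode(r.get('bench'), dtype)
    elapsed = time.perf_counter() - start
    # Sanity check that the last round trip returned the same data.
    assert np.array_equal(a, b)
    return elapsed / n
```

Passing a redis-py client, for example `time_round_trips(redis.Redis())`, times the full path including the network hop to the server.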


Keywords: Python, Numpy, Redis, array, serialise, serialize, key, incr, unique
