Save a list of dictionaries with numpy arrays

I have a dataset that is a list of dictionaries.

Each element of the list is a dictionary with two keys: “sample”, whose value is a numpy array of shape (2048, 3), and “category”, which is the class of that sample. The dataset has 8000 elements.

I tried to save in JSON but it said it can’t serialize numpy arrays.
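
For illustration, the structure and the JSON failure described above can be reproduced roughly as follows. This is only a sketch: the random values, the integer category labels, and the reduced length of 5 (instead of 8000) are assumptions, not the actual data.

```python
import json
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the real dataset: 5 elements instead of 8000,
# random points, and integer class labels (all assumptions).
dataset = [
    {"sample": rng.random((2048, 3)), "category": i % 2}
    for i in range(5)
]

# The json module has no encoder for numpy arrays, so this raises TypeError.
try:
    json.dumps(dataset)
    json_error = None
except TypeError as exc:
    json_error = str(exc)

print(json_error)
```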

What’s the best way to save this list? I can’t use np.save("file", dataset) because there is a dictionary and I can’t use JSON because there is the numpy array. Should I use HDF5? What format should I use if I have to use the dataset for machine learning? Thanks!

Answer

Creating an example specific to your data requires more details about the dictionaries in the list. I created an example that assumes every dictionary has:

  • A category key with a unique value. The value is used as the dataset name in the file.
  • A sample key holding the array you want to save.

The approach is to create some data, write it to an HDF5 file with the h5py package (one HDF5 dataset per dictionary, named after its category value), then read the data back into a new list of dictionaries. It is a good starting point for your problem.
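
A minimal sketch of that round trip, assuming the h5py and numpy packages are installed. The file name dataset.h5, the string labels class_0 … class_3, and the list length of 4 are made up for the example.

```python
import os
import tempfile

import h5py
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: every category value is a unique string (assumption).
dataset = [
    {"sample": rng.random((2048, 3)), "category": f"class_{i}"}
    for i in range(4)
]

# Hypothetical file location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "dataset.h5")

# Write one HDF5 dataset per dictionary, named after its category value.
with h5py.File(path, "w") as h5f:
    for item in dataset:
        h5f.create_dataset(item["category"], data=item["sample"])

# Read everything back into a new list of dictionaries.
restored = []
with h5py.File(path, "r") as h5f:
    for name, dset in h5f.items():
        # dset[()] reads the whole dataset into a numpy array.
        restored.append({"sample": dset[()], "category": name})

print(len(restored), restored[0]["sample"].shape)
```

Because the category doubles as the dataset name, it comes back as a string regardless of its original type, so convert it after reading if you need integers.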

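A second method is needed when category values aren’t unique, because dataset names within an HDF5 file must be unique and so the category can no longer serve as the name.
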
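
One possible way to handle repeated categories, which may differ from the original listing, is to name each dataset by its position in the list and store the category as an attribute on that dataset. In this sketch the names sample_00000, sample_00001, … and the file name are assumptions.

```python
import os
import tempfile

import h5py
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data with repeated category values (assumption).
categories = ["chair", "table", "chair", "lamp"]
dataset = [
    {"sample": rng.random((2048, 3)), "category": cat}
    for cat in categories
]

# Hypothetical file location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "dataset_by_index.h5")

# Name each dataset by its list index and keep the category as an attribute.
with h5py.File(path, "w") as h5f:
    for i, item in enumerate(dataset):
        dset = h5f.create_dataset(f"sample_{i:05d}", data=item["sample"])
        dset.attrs["category"] = item["category"]

# Zero-padded names sort in list order, so the original order is preserved.
restored = []
with h5py.File(path, "r") as h5f:
    for name in sorted(h5f):
        dset = h5f[name]
        restored.append(
            {"sample": dset[()], "category": dset.attrs["category"]}
        )

print([d["category"] for d in restored])
```

The zero padding keeps the names in list order when sorted; five digits is enough for the 8000 elements mentioned in the question.
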
User contributions licensed under: CC BY-SA