Extracting datasets from 1 HDF5 file to multiple files

Question

I previously asked a question about generating images from an HDF5 file. Now I have another problem: generating new .h5 files from an existing one. For instance, I have [ABC.h5], which contains a dataset of images and their ground-truth density maps. The keys are [images, density_maps]. I want to have [GT_001.h5], [GT_002.h5], ... instead of the single .h5 file. This

Accepted Answer

This answer is based on my interpretation of your data. If it doesn't solve your problem, please correct the summary below.

First, please be careful with the term "dataset". It has a specific meaning in h5py. You use "dataset" to refer to a set of data used for training and testing a CNN, which gets confusing when there are also datasets in an HDF5 file.

Based on your explanation, this is my understanding of the different files you have for training and testing.

Your original set of training and testing data in the CRSNet:

image files: IMG_###.jpg
ground truth density map files: IMG_###.h5 with attributes: name="density"; shape=(544, 932); type="

You have pairs of image and density files: one .jpg and one .h5 file each for IMG_001 through IMG_NNN.

Your new set of training and testing data:

H5 Filename: [ABC.h5]
H5 Dataset 1: name="images", shape=(300, 380, 676, 1), type="|u1"
H5 Dataset 2: name="density_maps", shape=(300, 380, 676, 1), type="

You have extracted the data from the "images" dataset in this .h5 file to create IMG_###.jpg (like your original set of training and testing data). Now you want to extract arrays from the "density_maps" dataset in the .h5 file to create IMG_###.h5.

If so, the process is the same as the image extraction procedure. The only difference is that you write the data to a .h5 file instead of a .jpg file. See the pseudo-code below.

    import h5py

    with h5py.File('yourfile.h5', 'r') as h5r:
        for i in range(h5r['density_maps'].shape[0]):
            # read the i-th density map from the stacked dataset
            dmap_arr = h5r['density_maps'][i, :]
            # write it to its own file; numbering starts at IMG_000.h5
            with h5py.File(f'IMG_{i:03}.h5', 'w') as h5w:
                h5w.create_dataset('density_maps', data=dmap_arr)

Note: when you read dmap_arr, you may get shape=(380, 676, 1). If so, you can drop the trailing axis with .reshape(380, 676), like this:

    dmap_arr = h5r['density_maps'][i, :].reshape(380, 676)
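The loop in the answer can also be wrapped in a small helper that uses the GT_001.h5, GT_002.h5, ... naming from the question and drops the trailing length-1 axis without hard-coding the image size. This is only a sketch under assumptions: the function name split_density_maps and its parameters (key, out_key, prefix, start) are hypothetical and chosen for illustration, and it assumes the source dataset is stacked along its first axis as described above.

```python
from pathlib import Path

import h5py


def split_density_maps(src_path, out_dir, key="density_maps",
                       out_key="density_maps", prefix="GT_", start=1):
    """Write each map stored in src_path[key] to its own HDF5 file.

    Hypothetical helper: names and defaults are illustrative assumptions.
    Returns the list of output paths in the order they were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with h5py.File(src_path, "r") as h5r:
        dset = h5r[key]
        for i in range(dset.shape[0]):
            # Read one map at a time instead of loading the whole stack.
            dmap = dset[i]
            # Drop a trailing channel axis of length 1,
            # e.g. (380, 676, 1) -> (380, 676).
            if dmap.ndim == 3 and dmap.shape[-1] == 1:
                dmap = dmap[..., 0]
            # File numbers begin at `start`, so index 0 becomes GT_001.h5.
            out_path = out_dir / f"{prefix}{i + start:03d}.h5"
            with h5py.File(out_path, "w") as h5w:
                h5w.create_dataset(out_key, data=dmap)
            written.append(out_path)
    return written
```

For example, split_density_maps("ABC.h5", "gt_files") would write one file per density map into a gt_files folder. If the code that consumes the files expects the dataset name "density", as in the original IMG_###.h5 files, pass out_key="density".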

Answer