I have been trying to load the EMNIST-letters dataset, without much success. The file has an odd structure, and I can't wrap my head around what is happening. Here is what I mean:
I downloaded the .mat version from here.
I can load the data using:
import scipy.io
mat = scipy.io.loadmat('letter_data.mat')  # renamed for convenience
It is a dictionary with the following keys:
dict_keys(['__header__', '__version__', '__globals__', 'dataset'])
The only key of interest is dataset, which I haven't been able to extract data from. Printing its shape gives this:
>>> print(mat['dataset'].shape)
(1, 1)
I dug deeper and deeper to find a shape that looks somewhat like a real dataset and came across this:
>>> print(mat['dataset'][0][0][0][0][0][0].shape)
(124800, 784)
This is exactly what I wanted, but I can't find the labels or the test data. I have tried many things but can't seem to understand the structure of this dataset.
If someone could tell me what is going on here, I would appreciate it.
Answer
Because of the way the dataset is structured, the array of image arrays can be accessed with mat['dataset'][0][0][0][0][0][0] and the array of label arrays with mat['dataset'][0][0][0][0][0][1]. For instance, print(mat['dataset'][0][0][0][0][0][0][0]) will print the pixel values of the first image, and print(mat['dataset'][0][0][0][0][0][1][0]) will print the first image's label.
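The long index chains above arise because scipy.io.loadmat returns each MATLAB struct as a NumPy structured array of shape (1, 1), whose field names are listed in its dtype.names. A minimal sketch of a helper that walks such a structure and reports where each plain array lives might look like this. The helper names walk_struct and make_struct are hypothetical, and the field names train, test, images and labels in the stand-in data are assumptions for illustration, not confirmed by the file itself; run the helper on the real mat['dataset'] to see the actual names.

```python
import numpy as np


def walk_struct(node, prefix="dataset"):
    """Return {dotted_path: shape} for every plain array nested in a loadmat struct."""
    shapes = {}
    is_struct = isinstance(node, np.ndarray) and node.dtype.names and node.size > 0
    if is_struct:
        # A MATLAB struct arrives as a structured array, usually of shape (1, 1).
        record = node.flat[0]
        for name in node.dtype.names:
            shapes.update(walk_struct(record[name], f"{prefix}.{name}"))
    else:
        shapes[prefix] = np.shape(node)
    return shapes


def make_struct(**fields):
    """Hypothetical stand-in that mimics how loadmat represents a MATLAB struct."""
    struct = np.empty((1, 1), dtype=[(key, "O") for key in fields])
    for key, value in fields.items():
        struct[key][0, 0] = value
    return struct


# Small fake dataset; the field names and sizes here are assumptions.
fake = make_struct(
    train=make_struct(images=np.zeros((5, 784), np.uint8),
                      labels=np.ones((5, 1), np.uint8)),
    test=make_struct(images=np.zeros((2, 784), np.uint8),
                     labels=np.ones((2, 1), np.uint8)),
)

for path, shape in walk_struct(fake).items():
    print(path, shape)

# With the real file (requires scipy and the downloaded .mat):
#   import scipy.io
#   mat = scipy.io.loadmat('letter_data.mat')
#   print(walk_struct(mat['dataset']))
```

If the real file does have a second top-level field holding the test split, the same pattern as above with index 1 in the third position (mat['dataset'][0][0][1][0][0][0]) would reach its first array, but check the printed paths before relying on that.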
For a less…convoluted dataset, I'd actually recommend the CSV version of the EMNIST dataset on Kaggle: https://www.kaggle.com/crawford/emnist. Each row is a separate image with 785 columns: the first column is the class label, and each of the remaining 784 columns holds one pixel value of a 28 x 28 image.
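As a rough sketch of reading that CSV layout with NumPy alone: the function name load_emnist_csv is hypothetical, and the code assumes the file has no header row and that every value fits in an unsigned byte. The demo feeds it two fabricated rows through io.StringIO so it runs without the download.

```python
import io

import numpy as np


def load_emnist_csv(source):
    """Split rows of `label, pixel_1, ..., pixel_784` into images and labels."""
    # Assumes no header row; ndmin=2 keeps a single-row file two-dimensional.
    data = np.loadtxt(source, delimiter=",", dtype=np.uint8, ndmin=2)
    labels = data[:, 0]
    images = data[:, 1:].reshape(-1, 28, 28)
    return images, labels


# Two fabricated rows standing in for the real file.
row_a = ",".join(["3"] + ["0"] * 784)
row_b = ",".join(["7"] + ["255"] * 784)
images, labels = load_emnist_csv(io.StringIO(row_a + "\n" + row_b))

print(images.shape, labels)
```

For the real data, pass the path of the downloaded CSV file in place of the StringIO object.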