I’m following this tutorial from Nabeel Ahmed on building your own emotion detector with Keras (I’m a noob), and I’ve found a strange behaviour that I’d like to understand. The input data is a set of 48×48 images, each labelled with an integer between 0 and 6 that stands for the emotion present in the image.
train_X.shape -> (28709, 2304)  // training data, 28709 images of 48x48
train_Y.shape -> (28709,)       // the emotion present in each image as an integer, 1 = happiness, 2 = sadness, etc.
val_X.shape   -> (3589, 2304)
val_Y.shape   -> (3589,)
In order to feed the data into the model, train_X and val_X are reshaped (as the tutorial explains):
train_X.shape -> (28709, 48, 48, 1)
val_X.shape   -> (3589, 48, 48, 1)
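For reference, that reshape can be sketched with NumPy as below. The arrays are zero-filled stand-ins with the shapes from the question, not the real dataset, and the trailing 1 is assumed to be the single grayscale channel that Conv2D expects.

```python
import numpy as np

# Stand-in arrays with the shapes from the question (not the real data).
train_X = np.zeros((28709, 2304), dtype=np.uint8)
val_X = np.zeros((3589, 2304), dtype=np.uint8)

# 2304 = 48 * 48, so each row unflattens to one 48x48 image with 1 channel.
# -1 lets NumPy infer the number of images from the array size.
train_X = train_X.reshape(-1, 48, 48, 1)
val_X = val_X.reshape(-1, 48, 48, 1)

print(train_X.shape)  # (28709, 48, 48, 1)
print(val_X.shape)    # (3589, 48, 48, 1)
```

Only the inputs change shape here; train_Y and val_Y stay as flat vectors of integer labels, which matters for the choice of loss function.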
The model, as it is in the tutorial, is this one:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Activation, BatchNormalization, Conv2D,
                                     Dense, Dropout, Flatten, MaxPooling2D)

model = Sequential()
input_shape = (48, 48, 1)

# 1st convolution layer
model.add(Conv2D(64, (5, 5), input_shape=input_shape, activation='relu', padding='same'))
model.add(Conv2D(64, (5, 5), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

# 2nd convolution layer
model.add(Conv2D(128, (5, 5), activation='relu', padding='same'))
model.add(Conv2D(128, (5, 5), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

# 3rd convolution layer
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

# Fully connected head
model.add(Flatten())
model.add(Dense(128))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.2))

################################################################
model.add(Dense(7))  # <- problematic line
################################################################
model.add(Activation('softmax'))

my_optimiser = tf.keras.optimizers.Adam(
    learning_rate=0.001,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07,
    amsgrad=False,
    name='Adam')

model.compile(loss='categorical_crossentropy',
              metrics=['accuracy'],
              optimizer=my_optimiser)
However, when I try to train it with the tutorial snippet, I get an error pointing at the validation_data line:
history = model.fit(train_X, train_Y,
                    batch_size=64,
                    epochs=80,
                    verbose=1,
                    validation_data=(val_X, val_Y),
                    shuffle=True)

ValueError: Shapes (None, 1) and (None, 7) are incompatible
After reviewing the code and the documentation for the fit method, my only idea was to change the 7 in the last Dense layer of the model to 1, which mysteriously works. I’d like to know what is happening here, if anyone could give me a hint.
Answer
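You seem to be working with sparse integer labels, where each sample belongs to one of seven classes {0, 1, 2, 3, 4, 5, 6}, so I would recommend using SparseCategoricalCrossentropy instead of CategoricalCrossentropy as your loss function. Just change this parameter (loss='sparse_categorical_crossentropy' in model.compile), keep Dense(7) as the last layer, and your model should work fine. Changing the layer to Dense(1) only hides the shape mismatch: a softmax over a single unit always outputs 1, so the model cannot learn to tell the classes apart. If you want to use CategoricalCrossentropy, you will have to one-hot encode your labels, for example with: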
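train_Y = tf.keras.utils.to_categorical(train_Y, num_classes=7)
val_Y = tf.keras.utils.to_categorical(val_Y, num_classes=7)  # validation labels are passed to fit too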
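To illustrate why the two losses expect different label shapes, here is a minimal NumPy sketch (not Keras code). The helpers one_hot, softmax, categorical_ce and sparse_ce are hypothetical names written only for this example, and the scores are random made-up values. Under those assumptions it shows that integer labels with the sparse formula give the same loss as one-hot labels with the categorical formula, and that a softmax over a single output unit is always 1.

```python
import numpy as np

NUM_CLASSES = 7

def one_hot(labels, num_classes):
    """Turn integer labels of shape (n,) into one-hot rows of shape (n, num_classes)."""
    return np.eye(num_classes)[labels]

def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_ce(one_hot_labels, probs):
    """Mean cross-entropy for one-hot labels of shape (n, num_classes)."""
    return float(-np.mean(np.sum(one_hot_labels * np.log(probs), axis=1)))

def sparse_ce(int_labels, probs):
    """Mean cross-entropy for integer labels of shape (n,)."""
    picked = probs[np.arange(len(int_labels)), int_labels]
    return float(-np.mean(np.log(picked)))

# Made-up example: 4 samples, 7 class scores each.
rng = np.random.default_rng(0)
probs = softmax(rng.normal(size=(4, NUM_CLASSES)))
labels = np.array([0, 3, 6, 1])

encoded = one_hot(labels, NUM_CLASSES)
print(encoded.shape)  # (4, 7): same trailing dimension as the Dense(7) output

# Both formulas give the same value; only the label format differs.
print(np.isclose(categorical_ce(encoded, probs), sparse_ce(labels, probs)))

# Softmax over a single unit is always 1, whatever the logit is.
single_unit = softmax(rng.normal(size=(4, 1)))
print(single_unit.ravel())
```

So the choice between the two losses comes down to how the labels are stored: integer labels of shape (n,) go with the sparse loss, one-hot labels of shape (n, 7) with the categorical one.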