Keras model fits on data with the wrong shape

Question

I've created the following model: and the following dummy data: with the shapes of (4, None, 2) and (4, 3). Looking at the model structure, one can see that the model has 3 outputs of shape (None, 1). I was wondering how come the fit works, when I expected them to be of shape (4, 3, 1) and not (4,

Accepted Answer

Both are correct. Take a look at this and this. As you can see, this says 'Squeeze or expand last dimension if needed', so if the dimensions match after doing that, it's all good.

First of all, remember that everything depends on your loss function. Below I will show one example:

    # Get preds from the model m for the input x.
    preds = m(x)

    # Let's print the shapes
    preds = np.array(preds)
    x.shape, y.shape, y2.shape, preds.shape
    # Result -> (TensorShape([4, None, 2]), (4, 3), (4, 5), (3, 4, 1))

    # Let's take an individual look at y2[0] and preds[0]
    y2[0], preds[0]
    '''
    (array([1, 1, 1, 1, 1]),
     array([[-0.1815457 ],
            [-1.0390669 ],
            [ 0.27160883],
            [-0.3232715 ]], dtype=float32))

    So, the thing to notice here is what happens if we do y2[0] - preds[0].
    As the shapes are different, the arrays are first broadcast, and y2[0] becomes:
    [[1, 1, 1, 1, 1],
     [1, 1, 1, 1, 1],
     [1, 1, 1, 1, 1],
     [1, 1, 1, 1, 1]]
    and preds[0] becomes:
    array([[-0.1815457 , -0.1815457 , -0.1815457 , -0.1815457 , -0.1815457 ],
           [-1.03906691, -1.03906691, -1.03906691, -1.03906691, -1.03906691],
           [ 0.27160883,  0.27160883,  0.27160883,  0.27160883,  0.27160883],
           [-0.32327151, -0.32327151, -0.32327151, -0.32327151, -0.32327151]])
    '''

    # Doing y2[0] - preds[0]
    y2[0] - preds[0]
    '''
    Due to the broadcasting mentioned above, the result is:
    array([[1.1815457 , 1.1815457 , 1.1815457 , 1.1815457 , 1.1815457 ],
           [2.03906691, 2.03906691, 2.03906691, 2.03906691, 2.03906691],
           [0.72839117, 0.72839117, 0.72839117, 0.72839117, 0.72839117],
           [1.32327151, 1.32327151, 1.32327151, 1.32327151, 1.32327151]])
    '''

    # Now we take the mean
    np.mean(y2[0] - preds[0])
    # Result -> 1.3180688247084618

    # After doing the whole process with the whole y2 and preds
    temp = y2 - preds
    np.mean(temp)
    # Result -> 1.9192037958030899

    # So that was the case of y2. Now let's see the case of y.
    # Speeding things up, if I were to do y[0] - preds[0]
    y[0] - preds[0]
    '''
    The result will be:
    array([[1.1815457 , 1.1815457 , 1.1815457 ],
           [2.03906691, 2.03906691, 2.03906691],
           [0.72839117, 0.72839117, 0.72839117],
           [1.32327151, 1.32327151, 1.32327151]])

    Can you see the answer? As soon as we take the mean, the result is equal to the one for y2.
    '''
    np.mean(y[0] - preds[0])
    # Result -> 1.3180688247084618

And hence both are working fine in this case.
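The broadcasting step described above can be reproduced with plain NumPy, without the model. This is only a minimal sketch: the names y_row, y2_row and preds0 are hypothetical stand-ins for y[0], y2[0] and preds[0], the targets are assumed to be all ones as in the printed y2[0], and the prediction values are copied from the output shown in the answer.

```python
import numpy as np

# Hypothetical stand-ins for y[0] and y2[0]; assumed to be all ones,
# matching the dummy targets printed in the answer.
y_row = np.ones(3)    # shape (3,)
y2_row = np.ones(5)   # shape (5,)

# Stand-in for preds[0]: one output head for 4 samples, shape (4, 1).
# Values copied from the printed output in the answer.
preds0 = np.array([[-0.1815457],
                   [-1.0390669],
                   [0.27160883],
                   [-0.3232715]])

# A (3,) or (5,) array minus a (4, 1) array broadcasts to (4, 3) or (4, 5).
diff_y = y_row - preds0
diff_y2 = y2_row - preds0

print(diff_y.shape, diff_y2.shape)

# Every column of each difference is identical, so the mean does not
# depend on how many columns the target contributed.
print(np.isclose(diff_y.mean(), diff_y2.mean()))
```

Because the target width only changes how many identical columns are averaged, both shapes give the same mean here. With targets that are not constant along the last axis, the two means would generally differ, so this broadcast can hide a genuine shape mismatch.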

Answer