Simplest example that replicates the error:
```python
import tensorflow as tf

def loss(y, logits):
    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
    return loss

Input = tf.keras.layers.Input(dtype=tf.float32, shape=(20,), name="X")
hidden = tf.keras.layers.Dense(40, activation=tf.keras.activations.relu, name="hidden1")(Input)
logits = tf.keras.layers.Dense(10, name="outputs")(hidden)

optimizer = tf.keras.optimizers.Adam()
model = tf.keras.Model(inputs=Input, outputs=logits)
model.summary()
model.compile(optimizer=optimizer, loss=loss)
```
I understand that in this case the model's output has shape (batch_size, 10) while my labels have shape (batch_size,). This is why I use tf.nn.sparse_softmax_cross_entropy_with_logits.
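For context, the sparse variant expects integer class indices with one dimension fewer than the logits. A minimal standalone sketch of the intended shapes:

```python
import tensorflow as tf

logits = tf.random.normal((32, 10))                           # (batch_size, num_classes)
labels = tf.random.uniform((32,), maxval=10, dtype=tf.int32)  # (batch_size,)

# Works: rank of labels (1) == rank of logits (2) - 1
per_example = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
print(per_example.shape)  # (32,)
```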
Before I even get to feed any labels to the model, compilation fails with the following error:
```
C:\Stas\Development\Anaconda3\lib\site-packages\tensorflow_core\python\ops\nn_ops.py in sparse_softmax_cross_entropy_with_logits(_sentinel, labels, logits, name)
   3445       raise ValueError("Rank mismatch: Rank of labels (received %s) should "
   3446                        "equal rank of logits minus 1 (received %s)." %
-> 3447                        (labels_static_shape.ndims, logits.get_shape().ndims))
   3448   if (static_shapes_fully_defined and
   3449       labels_static_shape != logits.get_shape()[:-1]):

ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).
```
After some investigation, I see that compilation fails because TensorFlow somehow thinks that my "target_output" has shape (None, None) while my output has shape (None, 10); since the two ranks are equal, sparse cross-entropy cannot be applied.
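The rank check from the traceback can be reproduced in isolation: if the labels tensor carries an extra trailing dimension, its rank matches that of the logits and the op refuses to run. A standalone sketch:

```python
import tensorflow as tf

logits = tf.random.normal((32, 10))                                 # rank 2
bad_labels = tf.random.uniform((32, 1), maxval=10, dtype=tf.int32)  # rank 2, not rank 1

# Raises: ValueError: Rank mismatch: Rank of labels (received 2) should
# equal rank of logits minus 1 (received 2).
tf.nn.sparse_softmax_cross_entropy_with_logits(labels=bad_labels, logits=logits)
```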
I learned that in TF 2.1 it was possible to pass target_output directly as a parameter to compile, which is no longer possible.
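From what I can tell, the old-style call looked roughly like this (a sketch of the target_tensors parameter that was removed from Model.compile around TF 2.2; not runnable on current TF):

```python
# Pre-removal API (no longer supported):
# target = tf.keras.layers.Input(shape=(), dtype=tf.int32, name="y")
# model.compile(optimizer=optimizer, loss=loss, target_tensors=[target])
```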
What would be the correct way for me to proceed?
Answer
According to the docs, you just have to make sure your labels have shape [batch_size]. Here is a working example using tf.squeeze:
```python
import tensorflow as tf

def loss(y, logits):
    # Keras hands the labels over as (batch_size, 1); squeeze the trailing
    # dimension so they become (batch_size,), as the sparse loss requires.
    y = tf.squeeze(y, axis=-1)
    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
    return loss

Input = tf.keras.layers.Input(dtype=tf.float32, shape=(20,), name="X")
hidden = tf.keras.layers.Dense(40, activation=tf.keras.activations.relu, name="hidden1")(Input)
logits = tf.keras.layers.Dense(10, name="outputs")(hidden)

optimizer = tf.keras.optimizers.Adam()
model = tf.keras.Model(inputs=Input, outputs=logits)
model.summary()
model.compile(optimizer=optimizer, loss=loss)

x = tf.random.normal((50, 20))
y = tf.random.uniform((50, 1), maxval=10, dtype=tf.int32)
model.fit(x, y, epochs=2)
```
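If you'd rather not keep a custom loss function, the built-in sparse loss performs the same squeeze internally and should be a drop-in replacement:

```python
# Built-in equivalent: accepts integer labels of shape (batch_size,) or
# (batch_size, 1) and applies the softmax internally when from_logits=True.
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
model.fit(x, y, epochs=2)
```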