TensorFlow accuracy from model.predict does not match final epoch val_accuracy of model.fit

Question

I am trying to match the accuracy of a model.predict call to the final val_accuracy of model.fit(). I am using a tf dataset. The dataset setup for train_ds is similar. I prefetch both… Then I get the labels for the val_ds so I can use them later. My model compiles fine. It seems to fit fine. The last epoch outp…

Accepted Answer

What you are missing is that your validation dataset is reshuffled on every iteration. tf.keras.utils.image_dataset_from_directory has shuffle=True by default, and the shuffle method of a TensorFlow dataset has an argument reshuffle_each_iteration, which is None by default and behaves like True. The dataset is therefore reshuffled every time you iterate over it, so labels collected in one pass do not line up with predictions made in a separate model.predict pass.

The seed=38 parameter makes the split reproducible: it determines which samples are reserved for training and which for validation. In other words, with a fixed seed we know which samples end up in the validation dataset and which in the training dataset. It does not fix the order in which they are iterated.

As an example:

    import tensorflow as tf

    dataset = tf.data.Dataset.range(6)
    dataset = dataset.shuffle(6, reshuffle_each_iteration=None, seed=154).batch(2)

    print("First time iteration:")
    for x in dataset:
        print(x)

    print("\n")

    print("Second time iteration")
    for x in dataset:
        print(x)

This will print:

    First time iteration:
    tf.Tensor([2 1], shape=(2,), dtype=int64)
    tf.Tensor([3 0], shape=(2,), dtype=int64)
    tf.Tensor([5 4], shape=(2,), dtype=int64)

    Second time iteration
    tf.Tensor([4 3], shape=(2,), dtype=int64)
    tf.Tensor([0 5], shape=(2,), dtype=int64)
    tf.Tensor([2 1], shape=(2,), dtype=int64)

The relevant shuffle call can be found in the source code of tf.keras.utils.image_dataset_from_directory.

If you want to match predictions with their respective labels, loop over the dataset once and collect both in the same pass:

    import numpy as np

    predictions = []
    labels = []
    # Each batch yields inputs and labels together, so their order always agrees.
    for x, y in val_ds:
        predictions.append(np.argmax(model(x), axis=-1))
        labels.append(y.numpy())

    predictions = np.concatenate(predictions, axis=0)
    labels = np.concatenate(labels, axis=0)

Then you can check the accuracy by comparing predictions with labels.
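The final accuracy check can be sketched roughly as follows. This is a minimal NumPy-only illustration, not the asker's actual pipeline: the helper name accuracy_from_batches and the toy probabilities are made up for the example, and it assumes integer class labels with one row of class scores per sample. In practice the (scores, labels) pairs would come from iterating val_ds and calling the model on each batch.

```python
import numpy as np


def accuracy_from_batches(batches):
    """Compute accuracy from (class_scores, labels) pairs gathered in one pass.

    Hypothetical helper: `batches` stands in for iterating a validation
    dataset and calling the model on each batch.
    """
    predictions = []
    labels = []
    for scores, y in batches:
        # The predicted class is the index of the highest score in each row.
        predictions.append(np.argmax(scores, axis=-1))
        labels.append(np.asarray(y))
    predictions = np.concatenate(predictions, axis=0)
    labels = np.concatenate(labels, axis=0)
    # Fraction of samples whose predicted class equals the true label.
    return float(np.mean(predictions == labels))


# Made-up model outputs: two batches of two samples each, three classes.
toy_batches = [
    (np.array([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]), np.array([1, 0])),
    (np.array([[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]]), np.array([2, 0])),
]

toy_accuracy = accuracy_from_batches(toy_batches)
print(toy_accuracy)
```

Because predictions and labels are taken from the same batch, reshuffling between iterations cannot misalign them, and the result should be comparable to the val_accuracy reported by model.fit for the same weights.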

