
Behavior of steps_per_epoch and validation_steps in Keras Model

I’m a little confused about the behavior of steps_per_epoch and validation_steps in the fit function. In particular, if I set steps_per_epoch to be smaller than total_records/batch_size, does a) the model train on the same subset of the training data in every epoch, or b) the model use different training data in each epoch and eventually cover all of the training data?

The same question for validation_steps.

For example, if I have 10000 rows for training with a batch size of 10 and steps_per_epoch set to 10, would the model train only on the first 100 rows in every epoch, or would it train on a different 100 rows in each epoch and go through all the rows after 100 epochs?

I tried to search for it in the documentation and online but have not found any mention of this.

Thanks.


Answer

Since the docs state that the shuffle parameter in model.fit(...) is ignored when the input is a generator and has no effect when steps_per_epoch is not None, it is essentially up to your data generator to shuffle the rows every time it is called; otherwise you will always get the same results. Consider, for example, how the ImageDataGenerator works:

import tensorflow as tf

BATCH_SIZE = 2

flowers = tf.keras.utils.get_file(
    'flower_photos',
    'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
    untar=True)

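# Rescale pixel values to [0, 1] and apply random rotations of up to 20 degrees
img_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, rotation_range=20)

# shuffle=True reorders the images on each pass; set shuffle=False to compare
ds = img_gen.flow_from_directory(flowers, batch_size=BATCH_SIZE, shuffle=True)

# Print the shape and labels of the first batch only
for x, y in ds:
    print(x.shape, y)
    break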

Using shuffle=True results in different values every time, whereas shuffle=False always returns the same values.
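To make the numbers from the question concrete, here is a minimal sketch in plain Python. It does not call Keras; it only simulates a generator that is drawn from steps_per_epoch times per epoch without being restarted in between. That is how I understand fit to consume a Python generator, but please verify it for your version. The names batch_index_generator and simulate_epochs are hypothetical helpers made up for this illustration.

```python
import random


def batch_index_generator(num_rows, batch_size, shuffle=False, seed=0):
    """Yield lists of row indices forever, one batch at a time (hypothetical helper)."""
    rng = random.Random(seed)
    indices = list(range(num_rows))
    while True:
        if shuffle:
            rng.shuffle(indices)  # reshuffle at the start of each full pass
        for start in range(0, num_rows, batch_size):
            yield indices[start:start + batch_size]


def simulate_epochs(generator, steps_per_epoch, epochs):
    """Draw steps_per_epoch batches per epoch from one shared iterator.

    Assumption: the iterator is NOT restarted between epochs.
    """
    rows_per_epoch = []
    for _ in range(epochs):
        seen = set()
        for _ in range(steps_per_epoch):
            seen.update(next(generator))
        rows_per_epoch.append(seen)
    return rows_per_epoch


# The setup from the question: 10000 rows, batch size 10, steps_per_epoch=10
gen = batch_index_generator(num_rows=10000, batch_size=10)
history = simulate_epochs(gen, steps_per_epoch=10, epochs=100)

print(sorted(history[0])[:3], sorted(history[1])[:3])  # [0, 1, 2] [100, 101, 102]
print(len(set().union(*history)))                      # 10000
```

Under that assumption, each epoch sees the next 100 rows and all 10000 rows are covered after 100 epochs (option b in the question). If the generator were recreated at the start of every epoch with shuffle=False, each epoch would see rows 0 to 99 again.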

User contributions licensed under: CC BY-SA