I am new to the tf.data API and am trying to use it to load images from disk for the Dogs vs. Cats Redux: Kernels Edition Kaggle competition. To do this, I first created a pandas DataFrame named train_df with two columns: file_path, containing the relative path of each image, and target, containing the target label, 0 (for cat) or 1 (for dog). Here’s what the first 10 rows of the DataFrame look like:
Then, I tried loading the images with the following code:
    import tensorflow as tf

    BATCH_SIZE = 128
    IMG_HEIGHT = 224
    IMG_WIDTH = 224

    def read_images(X, y):
        X = tf.io.read_file(X)
        X = tf.io.decode_image(X, expand_animations=False, dtype=tf.float32, channels=3)
        X = tf.image.resize(X, [IMG_HEIGHT, IMG_WIDTH])
        X = tf.keras.applications.efficientnet.preprocess_input(X, data_format="channels_last")
        return (X, y)

    def build_data_pipeline(X, y):
        data = tf.data.Dataset.from_tensor_slices((X, y))
        data = data.map(read_images)
        data = data.batch(BATCH_SIZE)
        data = data.prefetch(tf.data.AUTOTUNE)
        return data

    tf_data = build_data_pipeline(train_df["file_path"], train_df["target"])
After this, I tried training my model using the following code:

    model.fit(tf_data, epochs=10)

but got a training accuracy of only 50%, whereas with ImageDataGenerator I get an accuracy of 99%. So the problem seems to lie somewhere in the data loading part, but I have not been able to find it.
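A quick way to narrow down a data-loading problem like this might be to pull one batch from the dataset and look at its dtype, shape, and value range, then compare that range with what the model expects. The sketch below is only an illustration: `summarize_batch` is a hypothetical helper, and the synthetic `fake_data` dataset stands in for `tf_data` so that the snippet runs without any image files.

```python
import tensorflow as tf


def summarize_batch(dataset):
    """Return dtype, shape, and value range of the first batch of images."""
    images, labels = next(iter(dataset))
    return {
        "dtype": images.dtype.name,
        "shape": tuple(images.shape),
        "min": float(tf.reduce_min(images)),
        "max": float(tf.reduce_max(images)),
        # Fraction of positive labels in the batch (1 = dog in this post).
        "label_mean": float(tf.reduce_mean(tf.cast(labels, tf.float32))),
    }


# Synthetic stand-in for tf_data (an assumption for this sketch):
# 8 random "images" with pixel values in [0, 1) and alternating labels.
fake_images = tf.random.uniform([8, 224, 224, 3], minval=0.0, maxval=1.0)
fake_labels = tf.constant([0, 1, 0, 1, 0, 1, 0, 1])
fake_data = tf.data.Dataset.from_tensor_slices((fake_images, fake_labels)).batch(4)

stats = summarize_batch(fake_data)
print(stats)
```

Running the same helper on `tf_data` would show whether the pixel values coming out of the pipeline are in [0, 1] or [0, 255].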
I used EfficientNetB0 with ImageNet-pretrained weights as a feature extractor and a single-neuron layer at the end as the classifier.
Pretrained EfficientNetB0 model:
    pretrained_model = tf.keras.applications.EfficientNetB0(
        input_shape=(IMG_HEIGHT, IMG_WIDTH, 3),
        include_top=False,
        weights="imagenet"
    )

    for layer in pretrained_model.layers:
        layer.trainable = False
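To see which preprocessing the backbone already applies internally, one option is to list its first few layers and look for built-in rescaling or normalization layers. This is a minimal sketch, not part of the original notebook: `backbone` and `preprocessing_layers` are names chosen for illustration, and `weights=None` is used only to avoid downloading the ImageNet weights for a structural check.

```python
import tensorflow as tf

# weights=None only skips the weight download; the layer structure,
# which is what this check looks at, is built either way.
backbone = tf.keras.applications.EfficientNetB0(
    input_shape=(224, 224, 3), include_top=False, weights=None
)

# Print the first few layers of the model.
for layer in backbone.layers[:5]:
    print(layer.name, type(layer).__name__)

# Collect the names of any built-in preprocessing layers.
preprocessing_layers = [
    layer.name
    for layer in backbone.layers
    if isinstance(layer, (tf.keras.layers.Rescaling, tf.keras.layers.Normalization))
]
print(preprocessing_layers)
```

If this list is non-empty, the model rescales its inputs itself, which matters for how the images should be prepared in the data pipeline.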
Dense layer with one neuron at the end of the EfficientNetB0:
    pretrained_output = pretrained_model.get_layer('top_activation').output
    x = tf.keras.layers.GlobalAveragePooling2D()(pretrained_output)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.models.Model(pretrained_model.input, x)
Compiling the model:
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"]
    )
Answer
In the above notebook, change the input reading function read_images as follows:
    def read_images(X, y):
        X = tf.io.read_file(X)
        X = tf.image.decode_jpeg(X, channels=3)
        X = tf.image.resize(X, [IMG_HEIGHT, IMG_WIDTH])  # /255.0
        return (X, y)
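Also note that the tf.keras.applications.EfficientNetBx models have a built-in normalization layer and expect pixel values in the [0, 255] range, so it is better not to normalize the data in the above function (i.e. do not divide by 255.0). The original function had the same effect as dividing by 255.0: calling tf.io.decode_image with dtype=tf.float32 rescales the pixels to [0, 1], and the EfficientNet preprocess_input function is a pass-through that does not undo it.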
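The difference between the two decoding approaches can be checked on a small synthetic image. The sketch below is only an illustration: the gradient image `ramp` is made up for the example and encoded in memory, so `jpeg_bytes` stands in for the result of tf.io.read_file on a real JPEG file.

```python
import numpy as np
import tensorflow as tf

# Synthetic 32x32 RGB image encoded as JPEG (a stand-in for bytes read from disk).
ramp = np.linspace(0, 255, 32 * 32 * 3).astype(np.uint8).reshape(32, 32, 3)
jpeg_bytes = tf.io.encode_jpeg(tf.constant(ramp))

# Approach from the question: requesting float32 rescales pixels to [0, 1].
as_float = tf.io.decode_image(
    jpeg_bytes, channels=3, dtype=tf.float32, expand_animations=False
)

# Approach from the answer: decode_jpeg yields uint8, and resize converts
# to float32 without rescaling, so values stay in [0, 255].
as_uint8 = tf.image.decode_jpeg(jpeg_bytes, channels=3)
resized = tf.image.resize(as_uint8, [224, 224])

# The EfficientNet preprocess_input returns its input unchanged.
passed = tf.keras.applications.efficientnet.preprocess_input(resized)

print("decode_image float32 max:", float(tf.reduce_max(as_float)))
print("decode_jpeg + resize max:", float(tf.reduce_max(resized)))
```

The first maximum should be at most 1.0, while the second should be far above 1.0, which is the range the built-in normalization of EfficientNet is designed for.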