TensorFlow: How do I generate a dataset from two arrays?

I’ve been trying to build a custom dataset from two arrays: one with the shape (128,128,6) (satellite data with 6 channels) and the other with the shape (128,128,1) (a binary mask). I have been using the function tf.data.Dataset.from_tensor_slices:

train_dataset = tf.data.Dataset.from_tensor_slices((train_raster, train_mask))

What I get is this:

<PrefetchDataset element_spec=(TensorSpec(shape=(128, 128, 6), dtype=tf.float32, name=None), TensorSpec(shape=(128, 128, 1), dtype=tf.float32, name=None))>

However, when I try to run this through my model I get this error:

ValueError: Shapes (None, 128, 128, 1) and (None, 2) are incompatible

The (None, 2) shape comes from my model, since its output is one of 2 classes.

In a tutorial I’ve seen the dataset as <PrefetchDataset shapes: ((None, 128, 128, 3), (None, 128, 128, 1)), types: (tf.float32, tf.float32)>. Is there a difference, and if so, how do I fix it? It seems like only one of the two tensors is being run through the model, but I don’t quite understand why.

Model definition:

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), padding='same', activation=tf.nn.relu, input_shape=(128, 128, 6)),
    tf.keras.layers.MaxPooling2D((2, 2), strides=2),
    tf.keras.layers.Conv2D(64, (3,3), padding='same', activation=tf.nn.relu),
    tf.keras.layers.MaxPooling2D((2, 2), strides=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(2, activation=tf.nn.sigmoid)
])


Answer

Adding @Kaveh’s comments to the answer section for the benefit of the community, as they fixed the user’s issue. (Thank you, @Kaveh.)

Your last layer outputs 2 neurons (presumably intended for the binary mask), but the labels in your dataset have the shape (128,128,1), which leads to the error. When you pass train_dataset directly to your model, Keras treats the first element of each tuple (the first array) as the input and the second element (the (128,128,1) array) as the labels.
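A minimal sketch of where the two shapes in the error come from. The zero-filled arrays are hypothetical stand-ins for the real satellite tiles (only the names train_raster and train_mask come from the question), and the model is the one from the question with the input shape declared through tf.keras.Input instead of input_shape.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: 4 samples of zeros instead of real tiles.
train_raster = np.zeros((4, 128, 128, 6), dtype=np.float32)
train_mask = np.zeros((4, 128, 128, 1), dtype=np.float32)

train_dataset = tf.data.Dataset.from_tensor_slices((train_raster, train_mask))

# Each element is a (features, labels) tuple. Keras feeds the first part to
# the model and compares the model output against the second part.
feature_spec, label_spec = train_dataset.element_spec

# batch() adds the leading None dimension seen in the tutorial's dataset.
batched = train_dataset.batch(2)
batched_label_shape = batched.element_spec[1].shape
print("labels per batch:", batched_label_shape)

# The classifier from the question: Flatten + Dense collapse the image
# down to 2 numbers per sample.
classifier = tf.keras.Sequential([
    tf.keras.Input(shape=(128, 128, 6)),
    tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2), strides=2),
    tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2), strides=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(2, activation='sigmoid'),
])

# Run one sample through the model to see its actual output shape.
prediction = classifier(train_raster[:1])
print("model output:", prediction.shape)
```

The label shape per batch is (None, 128, 128, 1) while the model produces 2 values per sample, which is the pair of shapes the ValueError reports.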

If you print your model summary, you will see that the model output is just two numbers per sample, (None, 2), whereas your labels require an output of shape (None, 128, 128, 1). So your dataset is fine; you need to change the model architecture so that it predicts one value per pixel, for example with a U-Net-style architecture.
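As an illustration of that suggestion, here is a small encoder-decoder with skip connections, loosely in the spirit of U-Net. It is a sketch under assumptions, not the architecture from any particular tutorial: the function name build_segmentation_model, the filter counts, the depth, and the loss are all illustrative choices.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_segmentation_model(input_shape=(128, 128, 6)):
    """Hypothetical small U-Net-style model that predicts one value per pixel."""
    inputs = tf.keras.Input(shape=input_shape)

    # Encoder: convolutions followed by downsampling (128 -> 64 -> 32).
    c1 = layers.Conv2D(32, 3, padding='same', activation='relu')(inputs)
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = layers.Conv2D(64, 3, padding='same', activation='relu')(p1)
    p2 = layers.MaxPooling2D(2)(c2)

    # Bottleneck at 32x32 resolution.
    b = layers.Conv2D(128, 3, padding='same', activation='relu')(p2)

    # Decoder: upsample back (32 -> 64 -> 128) and concatenate the
    # encoder features of matching resolution (skip connections).
    u2 = layers.Conv2DTranspose(64, 2, strides=2, padding='same')(b)
    u2 = layers.Concatenate()([u2, c2])
    d2 = layers.Conv2D(64, 3, padding='same', activation='relu')(u2)
    u1 = layers.Conv2DTranspose(32, 2, strides=2, padding='same')(d2)
    u1 = layers.Concatenate()([u1, c1])
    d1 = layers.Conv2D(32, 3, padding='same', activation='relu')(u1)

    # 1x1 convolution with sigmoid: one mask probability per pixel.
    outputs = layers.Conv2D(1, 1, activation='sigmoid')(d1)
    return tf.keras.Model(inputs, outputs)


model = build_segmentation_model()
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Check the output shape on a dummy batch of 2 samples.
dummy_output = model(tf.zeros((2, 128, 128, 6)))
print(dummy_output.shape)
```

Because the output now has the same spatial shape as the mask, the labels from train_dataset line up with the predictions. The dataset still needs to be batched before training, for example model.fit(train_dataset.batch(16), epochs=5), where the batch size and epoch count are placeholders.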

User contributions licensed under: CC BY-SA