I have two NumPy arrays (X, Y) that I want to convert to a TensorFlow dataset. According to the documentation, it should be possible to run
    train_dataset = tf.data.Dataset.from_tensor_slices((X, Y))
    model.fit(train_dataset)
When doing this however I get the error:
ValueError: Shapes (15, 1) and (768, 15) are incompatible
This would make sense if the shapes of the NumPy arrays were incompatible with the expected inputs/outputs.
But if I pass the NumPy arrays directly with model.fit(X, Y),
it runs without any problems, so the shapes seem to be okay.
In a next step I checked the output sizes:
    >>> train_dataset.batch(4)
    <BatchDataset shapes: ((None, 768), (None, 15)), types: (tf.int64, tf.uint8)>
The input layer of the neural network expects (None, None) and the output is (None, 15), so this also seems to match.
My dataset is rather large, so it is difficult to share, but here is a minimal reproducible example that shows the problem. It raises the same error, and fitting with just the NumPy arrays works.
    import tensorflow as tf
    from tensorflow.keras.layers import *
    from tensorflow.keras import Model
    import numpy as np

    a = np.random.randint(10, size=(10, 20, 1))
    b = np.random.rand(10, 15)
    train_dataset = tf.data.Dataset.from_tensor_slices((a, b))

    inp = Input(shape=(None,), dtype="int32")
    embedding = Embedding(12, 300, trainable=False, mask_zero=True)(inp)
    gru = Bidirectional(GRU(128, recurrent_dropout=0.5))(embedding)
    out = Dense(64, activation=tf.nn.relu)(gru)
    out = Dropout(0.5)(out)
    out = Dense(15, activation='sigmoid')(out)
    m = Model(inputs=inp, outputs=out)
    m.compile("adam", 'categorical_crossentropy')

    m.fit(a, b)
    m.fit(train_dataset)
Can someone point me in the right direction on how to solve this?
The TensorFlow version is 2.3.1.
Answer
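It will work if you batch your dataset. model.fit(a, b) splits the NumPy arrays into batches for you (batch_size defaults to 32), but a tf.data.Dataset is consumed as-is, and from_tensor_slices yields single examples with no batch dimension. Note also that train_dataset.batch(4) returns a new dataset rather than modifying train_dataset, so the result has to be assigned:

    train_dataset = tf.data.Dataset.from_tensor_slices((a,b)).batch(4)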
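
To see the difference, it may help to compare what each dataset yields. The following is only a sketch that reuses the array shapes from the question's minimal example; the names unbatched, batched and spec_shapes are made up for this illustration. It inspects element_spec before and after batching and does not build or train the model.

```python
import numpy as np
import tensorflow as tf

# Same shapes as the minimal example in the question.
a = np.random.randint(10, size=(10, 20, 1))
b = np.random.rand(10, 15)

# Each element is ONE example: there is no leading batch dimension.
unbatched = tf.data.Dataset.from_tensor_slices((a, b))

# batch() returns a new dataset; `unbatched` itself is left unchanged.
batched = unbatched.batch(4)


def spec_shapes(dataset):
    """Return the (input, target) element shapes of a dataset as tuples."""
    x_spec, y_spec = dataset.element_spec
    return tuple(x_spec.shape.as_list()), tuple(y_spec.shape.as_list())


unbatched_shapes = spec_shapes(unbatched)
batched_shapes = spec_shapes(batched)

print(unbatched_shapes)  # ((20, 1), (15,))
print(batched_shapes)    # ((None, 20, 1), (None, 15))
```

Only the batched dataset has the leading None (batch) dimension that model.fit expects when it is given a dataset, which is why m.fit(batched) works while m.fit(unbatched) raises the shape error.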