
Model was constructed with shape (None, 65536) but it was called on an input with incompatible shape (None, 65536, None)

For reference, the full error is:

WARNING:tensorflow:Model was constructed with shape (None, 65536) for input KerasTensor(type_spec=TensorSpec(shape=(None, 65536), dtype=tf.float32, name='input_1'), name='input_1', description="created by layer 'input_1'"), but it was called on an input with incompatible shape (None, 65536, None).

I am using kymatio to classify audio signals. Before constructing the model, I use TensorFlow's tf.keras.utils.audio_dataset_from_directory to create the training and test sets.

The audio samples are of shape (65536,) before the sets are created. To create the sets I use the following code:

import tensorflow as tf

T = 2**16         # output_sequence_length: samples per clip
J = 8             # maximum scattering scale (2**J)
Q = 12            # wavelets per octave
log_eps = 1e-6    # small constant to stabilize the log
SEED = 42

train_dataset = tf.keras.utils.audio_dataset_from_directory(
    '../train',
    labels='inferred',
    label_mode='int',
    class_names=['x', 'y', 'z', 'xy', 'xz', 'yz', 'xyz'],
    batch_size=32,
    output_sequence_length=T,
    ragged=False,
    shuffle=True,
    seed=SEED,
    follow_links=False
)

The element_spec of the train_dataset is (TensorSpec(shape=(None, 65536, None), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None)).

So at some point the shape in the TensorSpec changes to (None, 65536, None), and I don't understand why.
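Inspecting a single batch confirms the extra trailing axis (its actual size depends on the number of channels in the files):

for audio_batch, label_batch in train_dataset.take(1):
    print(audio_batch.shape)   # e.g. (32, 65536, 1) if the files are mono
    print(label_batch.shape)   # (32,)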

The model is constructed as follows and the error points to model.fit(...).

from tensorflow.keras import layers
from kymatio.keras import Scattering1D

x_in = layers.Input(shape=(T,))
x = Scattering1D(J, Q=Q)(x_in)
x = layers.Lambda(lambda x: x[..., 1:, :])(x)                      # drop the zeroth-order scattering coefficients
x = layers.Lambda(lambda x: tf.math.log(tf.abs(x) + log_eps))(x)   # log-compress the coefficients
x = layers.GlobalAveragePooling1D(data_format='channels_first')(x)
x = layers.BatchNormalization(axis=1)(x)
x_out = layers.Dense(7, activation='softmax')(x)
model = tf.keras.models.Model(x_in, x_out)
model.summary()
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_dataset, epochs=50)


Answer

Check the docs regarding tf.keras.utils.audio_dataset_from_directory:

[…] audio has shape (batch_size, sequence_length, num_channels)

Just use tf.squeeze to remove the additional dimension if you are only working with single-channel audio:

train_dataset = train_dataset.map(lambda x, y: (tf.squeeze(x, axis=-1), y))
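As a quick check, printing the element_spec after the map should now show (None, 65536) for the audio tensors (the exact repr varies with the TensorFlow version):

print(train_dataset.element_spec)
# (TensorSpec(shape=(None, 65536), dtype=tf.float32, name=None),
#  TensorSpec(shape=(None,), dtype=tf.int32, name=None))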

If you want to keep the dimension, try:

x_in = layers.Input(shape=(T, 1))
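Note that Scattering1D, as far as I can tell, operates along the last (time) axis, so the extra channel axis still has to be removed before the scattering layer. A minimal sketch, assuming single-channel audio, that drops it inside the model rather than in the dataset pipeline:

x_in = layers.Input(shape=(T, 1))   # keep the channel axis at the input
x = layers.Reshape((T,))(x_in)      # drop it again before the scattering layer
x = Scattering1D(J, Q=Q)(x)
# ... rest of the model unchanged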

I would recommend going through this tutorial.
