I’m trying to build a custom loss function in Keras v2.4.3 (as explained in this answer):
```python
def vae_loss(x: tf.Tensor, x_decoded_mean: tf.Tensor, original_dim=original_dim):
    z_mean = encoder.get_layer('mean').output
    z_log_var = encoder.get_layer('log-var').output
    xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
    kl_loss = -0.5 * K.sum(
        1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    vae_loss = K.mean(xent_loss + kl_loss)
    return vae_loss
```
But it’s behaving much differently than I expected (perhaps because of my Keras version?). I’m getting this error:
TypeError: Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
I think that’s because encoder.get_layer('mean').output returns a KerasTensor object instead of a tf.Tensor object (as the other answer indicates).
What am I doing wrong here? How can I access the output of a given layer from inside a custom loss function?
Answer
I think the simplest solution is model.add_loss(). This functionality enables you to pass multiple inputs to your custom loss.
To make a reliable example, I build a simple VAE and add the VAE loss using model.add_loss().
The full model structure is like below:
```python
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model

def sampling(args):
    # Reparameterization trick: z = mean + sigma * epsilon
    z_mean, z_log_var = args
    batch_size = tf.shape(z_mean)[0]
    epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., stddev=1.)
    return z_mean + K.exp(0.5 * z_log_var) * epsilon

def vae_loss(x, x_decoded_mean, z_log_var, z_mean):
    # Reconstruction term + KL divergence term
    xent_loss = original_dim * K.binary_crossentropy(x, x_decoded_mean)
    kl_loss = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var))
    vae_loss = K.mean(xent_loss + kl_loss)
    return vae_loss

def get_model():
    ### encoder ###
    inp = Input(shape=(n_features,))
    enc = Dense(64)(inp)
    z = Dense(32, activation="relu")(enc)
    z_mean = Dense(latent_dim)(z)
    z_log_var = Dense(latent_dim)(z)
    encoder = Model(inp, [z_mean, z_log_var])

    ### decoder ###
    inp_z = Input(shape=(latent_dim,))
    dec = Dense(64)(inp_z)
    out = Dense(n_features)(dec)
    decoder = Model(inp_z, out)

    ### encoder + decoder ###
    z_mean, z_log_var = encoder(inp)
    z = Lambda(sampling)([z_mean, z_log_var])
    pred = decoder(z)

    vae = Model(inp, pred)
    vae.add_loss(vae_loss(inp, pred, z_log_var, z_mean))  # <======= add_loss
    vae.compile(loss=None, optimizer='adam')

    return vae, encoder, decoder
```
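As a side note, the kl_loss term above is the closed-form KL divergence between N(mean, sigma^2) and a standard normal prior. If you want to convince yourself the expression is right, here is a pure-Python sanity check against the textbook formula (the function names here are mine, for illustration only, not part of the model code):

```python
import math

def kl_to_standard_normal(mu, log_var):
    # Per-dimension KL( N(mu, sigma^2) || N(0, 1) ), written exactly like
    # the kl_loss term in vae_loss: -0.5 * (1 + log_var - mu^2 - exp(log_var))
    return -0.5 * (1.0 + log_var - mu**2 - math.exp(log_var))

def kl_textbook(mu, sigma):
    # Standard closed form: log(1/sigma) + (sigma^2 + mu^2)/2 - 1/2
    return math.log(1.0 / sigma) + (sigma**2 + mu**2) / 2.0 - 0.5

mu, log_var = 0.3, -0.7
sigma = math.exp(0.5 * log_var)  # log_var parameterizes sigma^2 = exp(log_var)
print(abs(kl_to_standard_normal(mu, log_var) - kl_textbook(mu, sigma)) < 1e-12)  # True
```

This also explains why the network predicts z_log_var rather than sigma directly: exp(log_var) is always positive, so no activation constraint is needed on the variance head.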
The running notebook is available here: https://colab.research.google.com/drive/18day9KMEbH8FeYNJlCum0xMLOtf1bXn8?usp=sharing