I’m trying to build a custom loss function in Keras v2.4.3 (as explained in this answer):
def vae_loss(x: tf.Tensor, x_decoded_mean: tf.Tensor,
             original_dim=original_dim):
    z_mean = encoder.get_layer('mean').output
    z_log_var = encoder.get_layer('log-var').output
    xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
    kl_loss = - 0.5 * K.sum(
        1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    vae_loss = K.mean(xent_loss + kl_loss)
    return vae_loss
But it behaves very differently than I expected (perhaps because of my Keras version?). I’m getting this error:
TypeError: Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
And I think that’s because encoder.get_layer('mean').output is returning a KerasTensor object instead of a tf.Tensor object (as the other answer indicates).
What am I doing wrong here? How can I access the output of a given layer from inside a custom loss function?
Answer
I think this is very simple using model.add_loss(). This functionality lets you pass multiple inputs to your custom loss.
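As for why the attempt in the question fails: as far as I can tell, the output attribute of a layer in a functional model is a symbolic placeholder that describes the graph, not a tensor holding values for the current batch, so a loss function called during training cannot compute with it. Here is a minimal sketch of the difference, assuming TensorFlow 2.4 or later; the tiny encoder and its sizes are made up purely for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy encoder: 4 input features, 2 latent means.
inp = tf.keras.Input(shape=(4,))
z_mean = tf.keras.layers.Dense(2, name="mean")(inp)
encoder = tf.keras.Model(inp, z_mean)

# Symbolic: part of the model graph, carries shape/dtype but no batch values.
symbolic = encoder.get_layer("mean").output

# Concrete: calling the model on real data returns a tensor with values.
concrete = encoder(np.zeros((3, 4), dtype="float32"))

print(type(symbolic).__name__)
print(concrete.shape)
```

Symbolic tensors like this can still be combined while the functional model is being built, which is what add_loss() takes advantage of.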
To give a reliable example, I built a simple VAE and added the VAE loss using model.add_loss().
The full model structure is shown below:
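import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model

# n_features, original_dim (the same value as n_features here) and latent_dim
# must be defined before calling get_model().

def sampling(args):
    # Reparameterization trick: z = mean + std * epsilon
    z_mean, z_log_var = args
    batch_size = tf.shape(z_mean)[0]
    epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., stddev=1.)
    return z_mean + K.exp(0.5 * z_log_var) * epsilon

def vae_loss(x, x_decoded_mean, z_log_var, z_mean):
    # Reconstruction term per sample: mean cross-entropy over features, rescaled
    xent_loss = original_dim * K.mean(K.binary_crossentropy(x, x_decoded_mean), axis=-1)
    # KL term per sample: sum over latent dimensions only, not over the batch
    kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    return K.mean(xent_loss + kl_loss)

def get_model():
    ### encoder ###
    inp = Input(shape=(n_features,))
    enc = Dense(64)(inp)
    z = Dense(32, activation="relu")(enc)
    z_mean = Dense(latent_dim)(z)
    z_log_var = Dense(latent_dim)(z)
    encoder = Model(inp, [z_mean, z_log_var])

    ### decoder ###
    inp_z = Input(shape=(latent_dim,))
    dec = Dense(64)(inp_z)
    # sigmoid keeps outputs in (0, 1), as binary cross-entropy expects
    out = Dense(n_features, activation="sigmoid")(dec)
    decoder = Model(inp_z, out)

    ### encoder + decoder ###
    z_mean, z_log_var = encoder(inp)
    z = Lambda(sampling)([z_mean, z_log_var])
    pred = decoder(z)
    vae = Model(inp, pred)
    vae.add_loss(vae_loss(inp, pred, z_log_var, z_mean))  # <======= add_loss
    vae.compile(loss=None, optimizer='adam')  # the loss comes entirely from add_loss
    return vae, encoder, decoder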
The running notebook is available here: https://colab.research.google.com/drive/18day9KMEbH8FeYNJlCum0xMLOtf1bXn8?usp=sharing
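To sanity-check what the two terms in vae_loss compute, here is a plain-Python sketch for a single sample, with no Keras involved. It assumes the reconstruction term is the mean cross-entropy over features scaled by original_dim and the KL term is summed over the latent dimensions; the helper names and the input numbers are hypothetical and chosen only for illustration.

```python
import math

def bce_per_sample(x, x_hat, eps=1e-7):
    """Binary cross-entropy averaged over the features of one sample."""
    total = 0.0
    for target, prob in zip(x, x_hat):
        prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))
    return total / len(x)

def kl_per_sample(z_mean, z_log_var):
    """KL divergence between N(z_mean, exp(z_log_var)) and N(0, 1), summed over latent dims."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(z_mean, z_log_var))

x = [1.0, 0.0]       # illustrative target values
x_hat = [0.9, 0.2]   # illustrative decoder outputs in (0, 1)
original_dim = len(x)

xent = original_dim * bce_per_sample(x, x_hat)
kl_standard = kl_per_sample([0.0, 0.0], [0.0, 0.0])  # posterior equals the prior
kl_shifted = kl_per_sample([1.0, 0.0], [0.0, 0.0])   # mean moved away from zero

print(xent, kl_standard, kl_shifted)
```

The KL term vanishes when the encoder outputs a standard normal and grows as the mean moves away from zero, which is the regularizing behavior the loss is meant to have.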