I would like to add a custom metric to a Keras model. I am debugging my otherwise working code and cannot find a way to perform the operations I need.
The problem is a multiclass classification through multinomial logistic regression. The custom metric I would like to implement is:
(1/Number_of_Classes)*(TruePositivesClass1/TotalElementsClass1 + TruePositivesClass2/TotalElementsClass2 + ... + TruePositivesClassN/TotalElementsClassN)
where Number_of_Classes must be calculated from the batch, i.e. something like len(np.unique(y_true)),
and every term of the summation would be something like
np.sum((y_true==class_i) & (y_pred==class_i)) / np.sum(y_true==class_i)
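As a rough illustration of what the per-class term seems to intend, here is a minimal NumPy sketch. It assumes `y_true` and `y_pred` are 1-D arrays of integer class labels (not one-hot vectors); the function name `mean_per_class_recall` and the sample arrays are made up for the example.

```python
import numpy as np

def mean_per_class_recall(y_true, y_pred):
    # Classes are taken from the batch itself, as described above.
    classes = np.unique(y_true)
    recalls = []
    for class_i in classes:
        in_class = (y_true == class_i)
        # Elements of class_i that were also predicted as class_i.
        true_positives = np.sum(in_class & (y_pred == class_i))
        recalls.append(true_positives / np.sum(in_class))
    return float(np.mean(recalls))

# Hypothetical batch: 3 of 4 correct for each of the two classes.
y_true = np.array([1, 0, 1, 0, 1, 0, 0, 1])
y_pred = np.array([1, 0, 0, 0, 1, 1, 0, 1])
result = mean_per_class_recall(y_true, y_pred)
print(result)
```

Because the classes come from `np.unique(y_true)`, a class that is absent from the batch never produces a division by zero.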
In terms of a confusion matrix (in the minimal form of 2 classes):
        True  False
True      15      3
False     12      1
the formula would be 0.5*(15/(15+12)) + 0.5*(1/(1+3)) ≈ 0.4028
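The arithmetic can be checked with a few lines of NumPy. This is only a sketch: it assumes the columns of the matrix hold the true classes, since the denominators 15+12 and 1+3 are column sums, and the name `cm` is illustrative.

```python
import numpy as np

# Confusion matrix from the example above.
cm = np.array([[15, 3],
               [12, 1]])

# Diagonal entries are the correctly classified counts per class;
# column sums are used as per-class totals (assumption, see lead-in).
per_class = np.diag(cm) / cm.sum(axis=0)

# Unweighted mean over the classes, i.e. the 1/Number_of_Classes factor.
metric = per_class.mean()
print(per_class, metric)
```

If the rows were the true classes instead, `cm.sum(axis=1)` would be the appropriate denominator.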
The code could be something like this (Unique and Count are placeholders for operations I don't know how to write):
def custom_metric(y_true, y_pred):
    # How do I calculate the unique classes present in the batch?
    unique_classes = Unique(y_true)
    summation = 0
    for class_i in unique_classes:
        # number of elements of y_pred correctly predicted as class_i
        true_predictions = Count(y_pred, class_i)
        # total number of elements of class_i in the batch y_true
        true_values = Count(y_true, class_i)
        summation += true_predictions / true_values
    # divide by the number of classes, as in the formula above
    return summation / len(unique_classes)
My preprocessed data is a NumPy array like x=[v1,v2,v3,v4,...,vn], and my
target column is a NumPy array y=[1, 0, 1, 0, 1, 0, 0, 1 ,..., 0, 1].
They are then converted to tensors:
x_train = tf.convert_to_tensor(x)
y_train = tf.convert_to_tensor(tf.keras.utils.to_categorical(y))
Then, they are converted to tensorflow dataset objects:
train_ds = tf.data.Dataset.zip((tf.data.Dataset.from_tensor_slices(x_train),
                                tf.data.Dataset.from_tensor_slices(y_train)))
Later, I take an iterator:
train_itr = iter(
    train_ds.shuffle(len(y_train) * 5, reshuffle_each_iteration=True).batch(len(y_train)))
and last, I take one element from the iterator and train:
x_train, y_train = train_itr.get_next()
model.fit(x=x_train, y=y_train, batch_size=batch_size, epochs=epochs,
          callbacks=[custom_callback], validation_data=test_itr.get_next())
Since the objects come from dataset iterators, I can't find functions to operate on them as I would like in order to compute the custom metric described above.
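For context, here is a small sketch of what one element pulled from such an iterator looks like in eager mode. The random features and the 8-sample label array are stand-ins invented for the example, and the dataset is built with a single `from_tensor_slices` call instead of `zip`, which is a simplification rather than the code above.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: 8 samples, 3 features, binary labels.
x = np.random.rand(8, 3).astype(np.float32)
y = np.array([1, 0, 1, 0, 1, 0, 0, 1])

x_train = tf.convert_to_tensor(x)
y_train = tf.convert_to_tensor(tf.keras.utils.to_categorical(y))

# One dataset of (features, one-hot label) pairs, batched into a single batch.
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_itr = iter(train_ds.shuffle(len(y_train)).batch(len(y_train)))

# The iterator yields ordinary eager tensors, not dataset objects.
x_batch, y_batch = train_itr.get_next()
print(type(y_batch).__name__, y_batch.shape)

# Eager tensors can be converted to NumPy for inspection.
labels = np.argmax(y_batch.numpy(), axis=1)
print(np.unique(labels))
```

In eager mode the batch is a regular tensor, so both TensorFlow ops and `.numpy()` are available on it.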
Answer
So you want to calculate the average recall over the classes present in the batch. Here is my example code using NumPy and TensorFlow:
import tensorflow as tf
import numpy as np
y_t = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]], dtype=np.float32)
y_p = np.array([[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]], dtype=np.float32)
def average_recall(y_true, y_pred):
    # Get indexes of both labels and predictions
    labels = np.argmax(y_true, axis=1)
    predictions = np.argmax(y_pred, axis=1)
    # Get confusion matrix from labels and predictions
    confusion_matrix = tf.math.confusion_matrix(labels, predictions).numpy()
    # Get number of all true positives in each class
    all_true_positives = np.diag(confusion_matrix)
    # Get number of all elements in each class
    all_class_sum = np.sum(confusion_matrix, axis=1)
    # Get rid of classes that don't show in batch
    zero_index = np.where(all_class_sum == 0)[0]
    all_true_positives = np.delete(all_true_positives, zero_index)
    all_class_sum = np.delete(all_class_sum, zero_index)
    print("confusion_matrix:\n {},\n all_true_positives:\n {},\n all_class_sum:\n {}".format(
        confusion_matrix, all_true_positives, all_class_sum))
    # Average TruePositives / TotalElements wrt all classes that show in batch
    return np.mean(all_true_positives / all_class_sum)
avg_recall = average_recall(y_t, y_p)
print(avg_recall)
Outputs:
confusion_matrix:
 [[1 0 0 0]
 [1 1 0 0]
 [0 0 0 0]
 [0 0 0 2]],
 all_true_positives:
 [1 1 2],
 all_class_sum:
 [1 2 2]
0.8333333333333334
Implementation using only TensorFlow:
import tensorflow as tf
y_t = tf.constant([[1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]], dtype=tf.float32)
y_p = tf.constant([[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]], dtype=tf.float32)
def average_recall(y_true, y_pred):
    # Get indexes of both labels and predictions
    labels = tf.argmax(y_true, axis=1)
    predictions = tf.argmax(y_pred, axis=1)
    # Get confusion matrix from labels and predictions
    confusion_matrix = tf.math.confusion_matrix(labels, predictions)
    # Get number of all true positives in each class
    all_true_positives = tf.linalg.diag_part(confusion_matrix)
    # Get number of all elements in each class
    all_class_sum = tf.reduce_sum(confusion_matrix, axis=1)
    # Get rid of classes that don't show in batch
    mask = tf.not_equal(all_class_sum, tf.constant(0))
    all_true_positives = tf.boolean_mask(all_true_positives, mask)
    all_class_sum = tf.boolean_mask(all_class_sum, mask)
    print("confusion_matrix:\n {},\n all_true_positives:\n {},\n all_class_sum:\n {}".format(
        confusion_matrix, all_true_positives, all_class_sum))
    # Average TruePositives / TotalElements wrt all classes that show in batch
    return tf.reduce_mean(all_true_positives / all_class_sum)
avg_recall = average_recall(y_t, y_p)
print(avg_recall)
Outputs:
confusion_matrix:
 [[1 0 0 0]
 [1 1 0 0]
 [0 0 0 0]
 [0 0 0 2]],
 all_true_positives:
 [1 1 2],
 all_class_sum:
 [1 2 2]
tf.Tensor(0.8333333333333334, shape=(), dtype=float64)
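To use the TensorFlow-only version as a Keras metric, a variant without the print call may be easier to work with. The sketch below assumes one-hot labels and a model whose last dimension has a fixed size. It passes `num_classes` and `dtype` to `tf.math.confusion_matrix` so the matrix has a static shape and float entries. The name `average_recall_metric` and these choices are assumptions, not part of the answer above.

```python
import tensorflow as tf

def average_recall_metric(y_true, y_pred):
    # Class indexes from one-hot labels and from predicted scores.
    labels = tf.argmax(y_true, axis=1)
    predictions = tf.argmax(y_pred, axis=1)
    # Static number of classes taken from the output width (assumption).
    num_classes = y_pred.shape[-1]
    cm = tf.math.confusion_matrix(
        labels, predictions, num_classes=num_classes, dtype=tf.float32)
    true_positives = tf.linalg.diag_part(cm)
    class_totals = tf.reduce_sum(cm, axis=1)
    # Keep only the classes that actually appear in the batch.
    present = class_totals > 0
    recalls = tf.boolean_mask(true_positives, present) / tf.boolean_mask(class_totals, present)
    return tf.reduce_mean(recalls)

y_t = tf.constant([[1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0],
                   [0, 0, 0, 1], [0, 0, 0, 1]], dtype=tf.float32)
y_p = tf.constant([[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0],
                   [0, 0, 0, 1], [0, 0, 0, 1]], dtype=tf.float32)
value = average_recall_metric(y_t, y_p)
print(float(value))
```

A function with this `(y_true, y_pred)` signature can be passed to `model.compile` as `metrics=[average_recall_metric]`. Keras then reports a running average of the per-batch values, which is not necessarily the same as the recall computed over the whole epoch.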
Reference:
Calculate precision and recall for multiclass classification using confusion matrix
 
						