[TensorFlow 2.0] Evaluation metrics

A loss function serves as the optimization target during training and can also be used to judge model quality. In practice, though, people usually assess a model from other angles as well.

That is the role of evaluation metrics. Most loss functions can also serve as evaluation metrics; MAE, MSE, and CategoricalCrossentropy, for example, are all commonly used as metrics.

The reverse does not hold: metrics such as AUC, Accuracy, and Precision generally cannot be used as loss functions, because a loss function usually needs to be continuous and differentiable, while an evaluation metric does not.

When compiling a model, you can specify several evaluation metrics at once by passing them as a list.
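As a minimal sketch of passing a list of metrics, the snippet below compiles a tiny binary classifier with one string metric and two metric objects. The toy data, the one-layer model, and the metric names "auc" and "precision" are made up for illustration and are not from the original article.

```python
import numpy as np
import tensorflow as tf

# Toy binary-classification data (invented for this sketch).
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 4)).astype("float32")
y = (x[:, 0] > 0).astype("float32").reshape(-1, 1)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Several metrics go into one list; strings and metric objects can be mixed.
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=[
        "accuracy",
        tf.keras.metrics.AUC(name="auc"),
        tf.keras.metrics.Precision(name="precision"),
    ],
)

model.fit(x, y, batch_size=16, epochs=1, verbose=0)

# return_dict=True yields one entry per metric, keyed by metric name.
results = model.evaluate(x, y, verbose=0, return_dict=True)
print(sorted(results))
```

Giving each metric object an explicit name makes the reported keys predictable.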

If necessary, you can also define custom evaluation metrics.

A custom metric in function form takes two tensors, y_true and y_pred, as input parameters and returns a scalar as the metric value.

Alternatively, you can subclass tf.keras.metrics.Metric and override the initialization method, the update_state method, and the result method to implement the metric's calculation logic. This gives you a metric in class form.

Training usually runs batch by batch, while an evaluation metric is usually only meaningful once a whole epoch has been processed, so the class form is the more common one. You write an initialization method that creates the intermediate variables needed to compute the metric, an update_state method that updates those variables after each batch, and a result method that outputs the final metric value.

If you write a metric in function form, the only option is to average the per-batch values over the epoch and take that average as the result for the whole epoch. This usually deviates from the value obtained by computing the metric over the whole epoch's data at once.
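The gap between the two approaches can be seen with a small plain-Python sketch. StreamingPrecision is a hypothetical stand-in that only mimics the update_state / result pattern; it is not the tf.keras class, and the two batches are invented for illustration.

```python
class StreamingPrecision:
    """Plain-Python analogue of a class-form metric (not the tf.keras API)."""

    def __init__(self):
        # Intermediate state that persists across batches.
        self.tp = 0
        self.fp = 0

    def update_state(self, y_true, y_pred):
        # Accumulate counts for one batch of 0/1 labels and 0/1 predictions.
        for t, p in zip(y_true, y_pred):
            if p == 1 and t == 1:
                self.tp += 1
            elif p == 1 and t == 0:
                self.fp += 1

    def result(self):
        total = self.tp + self.fp
        return self.tp / total if total else 0.0


def batch_precision(y_true, y_pred):
    # Precision of a single batch, computed in isolation.
    m = StreamingPrecision()
    m.update_state(y_true, y_pred)
    return m.result()


batches = [
    ([1, 0], [1, 0]),              # TP=1, FP=0 -> precision 1.0
    ([1, 0, 0, 0], [1, 1, 1, 1]),  # TP=1, FP=3 -> precision 0.25
]

# Function-form behaviour: average the per-batch values.
mean_of_batches = sum(batch_precision(t, p) for t, p in batches) / len(batches)

# Class-form behaviour: accumulate state, compute once at the end.
metric = StreamingPrecision()
for t, p in batches:
    metric.update_state(t, p)
epoch_value = metric.result()

print(mean_of_batches)  # 0.625
print(epoch_value)      # 0.4
```

Averaging the batch values gives 0.625, while the precision over all six samples is 2 / 5 = 0.4, which is why metrics built from counts are better kept as accumulated state.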

One, commonly used built-in evaluation metrics

  • MeanSquaredError (mean squared error, used for regression, abbreviated as MSE, function form is mse)

  • MeanAbsoluteError (mean absolute error, used for regression, abbreviated as MAE, function form is mae)

  • MeanAbsolutePercentageError (mean absolute percentage error, used for regression, abbreviated as MAPE, function form is mape)

  • RootMeanSquaredError (root mean squared error, used for regression)

  • Accuracy (accuracy, used for classification, can be specified by the string "Accuracy"; Accuracy = (TP + TN) / (TP + TN + FP + FN); requires both y_true and y_pred to be encoded as class indices)

  • Precision (precision, used for binary classification, Precision = TP / (TP + FP))

  • Recall (recall, used for binary classification, Recall = TP / (TP + FN))

  • TruePositives (number of true positives, used for binary classification)

  • TrueNegatives (number of true negatives, used for binary classification)

  • FalsePositives (number of false positives, used for binary classification)

  • FalseNegatives (number of false negatives, used for binary classification)

  • AUC (area under the ROC curve (TPR vs FPR), used for binary classification; intuitively, the probability that a randomly drawn positive sample receives a higher predicted value than a randomly drawn negative sample)

  • CategoricalAccuracy (categorical accuracy, same meaning as Accuracy, requires y_true (label) to be one-hot encoded)

  • SparseCategoricalAccuracy (sparse categorical accuracy, same meaning as Accuracy, requires y_true (label) to be encoded as class indices)

  • MeanIoU (mean Intersection-over-Union, commonly used for image segmentation)

  • TopKCategoricalAccuracy (multi-class top-K accuracy, requires y_true (label) to be one-hot encoded)

  • SparseTopKCategoricalAccuracy (sparse multi-class top-K accuracy, requires y_true (label) to be encoded as class indices)

  • Mean (mean value)

  • Sum (sum)
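To make the formulas in the list concrete, here is a small plain-Python sketch that computes Accuracy, Precision, and Recall from confusion counts, and AUC from its pairwise interpretation. The helper names, the four-sample data, and the "score > 0.5 means positive" rule are assumptions for illustration; this is not how the tf.keras classes are implemented.

```python
def confusion_counts(y_true, y_score, threshold=0.5):
    """Count TP, TN, FP, FN, treating score > threshold as a positive prediction."""
    tp = tn = fp = fn = 0
    for t, s in zip(y_true, y_score):
        pred = 1 if s > threshold else 0
        if pred == 1 and t == 1:
            tp += 1
        elif pred == 0 and t == 0:
            tn += 1
        elif pred == 1 and t == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def pairwise_auc(y_true, y_score):
    """Share of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 0, 0]
y_score = [0.9, 0.6, 0.7, 0.2]

tp, tn, fp, fn = confusion_counts(y_true, y_score)
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(accuracy)                       # 0.75
print(round(precision, 4))            # 0.6667
print(recall)                         # 1.0
print(pairwise_auc(y_true, y_score))  # 0.75
```

Of the four positive/negative pairs, only (0.6, 0.7) is ranked the wrong way round, which gives the AUC of 3 / 4.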

Two, custom evaluation metrics

We take the KS metric, which is widely used in financial risk control, as an example of a custom evaluation metric.

The KS metric applies to binary classification problems and is calculated as KS = max(TPR - FPR),

where TPR = TP / (TP + FN) and FPR = FP / (FP + TN).

The TPR curve is in fact the cumulative distribution function (CDF) of the positive samples, and the FPR curve is the cumulative distribution function (CDF) of the negative samples.

The KS metric is therefore the maximum difference between the cumulative distribution curves of the positive and negative samples.
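The definition can be checked by hand with a plain-Python sketch that tries every distinct score as a threshold. The function name ks_statistic and the "score >= threshold means positive" rule are assumptions for illustration; the labels and scores are the same 14 samples used in this article's TensorFlow examples, flattened into lists.

```python
def ks_statistic(y_true, y_score):
    """KS = max over thresholds of (TPR - FPR), predicting positive when score >= threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = 0.0  # a threshold above every score gives TPR = FPR = 0
    for thr in sorted(set(y_score)):
        tp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == 0)
        best = max(best, tp / n_pos - fp / n_neg)
    return best


y_true = [1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_score = [0.6, 0.1, 0.4, 0.5, 0.7, 0.7, 0.7,
           0.4, 0.4, 0.5, 0.8, 0.3, 0.5, 0.3]

print(ks_statistic(y_true, y_score))  # 0.625
```

The maximum is reached at threshold 0.6, where 5 of the 8 positives and none of the 6 negatives are predicted positive, so TPR - FPR = 0.625 - 0 = 0.625.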

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import layers,models,losses,metrics
 
# Custom evaluation metric in function form
@tf.function
def ks(y_true,y_pred):
    # Flatten labels and predictions to 1-D
    y_true = tf.reshape(y_true,(-1,))
    y_pred = tf.reshape(y_pred,(-1,))
    length = tf.shape(y_true)[0]
    # Sort all samples by predicted score, highest first
    t = tf.math.top_k(y_pred,k = length,sorted = True)
    y_pred_sorted = tf.gather(y_pred,t.indices)
    y_true_sorted = tf.gather(y_true,t.indices)
    # Cumulative share of positives and of negatives along the sorted scores
    cum_positive_ratio = tf.truediv(
        tf.cumsum(y_true_sorted),tf.reduce_sum(y_true_sorted))
    cum_negative_ratio = tf.truediv(
        tf.cumsum(1 - y_true_sorted),tf.reduce_sum(1 - y_true_sorted))
    # KS is the largest gap between the two cumulative curves
    ks_value = tf.reduce_max(tf.abs(cum_positive_ratio - cum_negative_ratio))
    return ks_value

y_true = tf.constant([[1],[1],[1],[0],[1],[1],[1],[0],[0],[0],[1],[0],[1],[0]])
y_pred = tf.constant([[0.6],[0.1],[0.4],[0.5],[0.7],[0.7],[0.7],
                      [0.4],[0.4],[0.5],[0.8],[0.3],[0.5],[0.3]])
tf.print(ks(y_true,y_pred))

0.625

# Custom evaluation metric in class form
class KS(metrics.Metric):

    def __init__(self, name = "ks", **kwargs):
        super(KS,self).__init__(name=name,**kwargs)
        # 101 score bins (0.00 to 1.00 in steps of 0.01) that count
        # positive and negative samples per bin
        self.true_positives = self.add_weight(
            name = "tp",shape = (101,), initializer = "zeros")
        self.false_positives = self.add_weight(
            name = "fp",shape = (101,), initializer = "zeros")

    @tf.function
    def update_state(self,y_true,y_pred,sample_weight=None):
        # sample_weight is accepted so callers may pass it, but it is ignored here
        y_true = tf.cast(tf.reshape(y_true,(-1,)),tf.bool)
        # Map each score in [0,1] to an integer bin index in [0,100]
        y_pred = tf.cast(100*tf.reshape(y_pred,(-1,)),tf.int32)

        for i in tf.range(0,tf.shape(y_true)[0]):
            if y_true[i]:
                self.true_positives[y_pred[i]].assign(
                    self.true_positives[y_pred[i]]+1.0)
            else:
                self.false_positives[y_pred[i]].assign(
                    self.false_positives[y_pred[i]]+1.0)
        return (self.true_positives,self.false_positives)
 
    @tf.function
    def result(self):
        cum_positive_ratio = tf.truediv(
            tf.cumsum(self.true_positives),tf.reduce_sum(self.true_positives))
        cum_negative_ratio = tf.truediv(
            tf.cumsum(self.false_positives),tf.reduce_sum(self.false_positives))
        ks_value = tf.reduce_max(tf.abs(cum_positive_ratio - cum_negative_ratio)) 
        return ks_value
 
y_true = tf.constant([[1],[1],[1],[0],[1],[1],[1],[0],[0],[0],[1],[0],[1],[0]])
y_pred = tf.constant([[0.6],[0.1],[0.4],[0.5],[0.7],[0.7],
                      [0.7],[0.4],[0.4],[0.5],[0.8],[0.3],[0.5],[0.3]])
 
myks = KS()
myks.update_state(y_true,y_pred)
tf.print(myks.result())

0.625

 

References:

Open source e-book address: https://lyhue1991.github.io/eat_tensorflow2_in_30_days/

GitHub project address: https://github.com/lyhue1991/eat_tensorflow2_in_30_days

Origin www.cnblogs.com/xiximayou/p/12689915.html