1. Code
1.1 Source code (copy and use directly)
Copy the code below directly into your model. It has been tested; feel free to discuss it in the comments.
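import keras.backend as K


def my_tp_tn_fp_fn(y_true, y_pred):
    # Counts of actual positives and actual negatives (y_true holds 0/1 labels).
    true_posi_sum = K.cast(K.sum(y_true), "int32")
    true_nag_sum = K.cast(K.sum(y_true - 1), "int32") * (-1)
    # Count of predicted positives, thresholding y_pred at 0.5.
    pred_posi_sum = K.sum(K.cast(K.greater(y_pred, 0.5), "int32"))
    # True positives: the label is 1 and the prediction exceeds 0.5.
    tp = K.sum(K.cast(K.greater(K.clip(y_true * y_pred, 0.0, 1.0), 0.50), "int32"))
    # The remaining counts follow from the totals.
    fn = true_posi_sum - tp
    fp = pred_posi_sum - tp
    tn = true_nag_sum - fp
    tp = K.cast(tp, "float32")
    tn = K.cast(tn, "float32")
    fp = K.cast(fp, "float32")
    fn = K.cast(fn, "float32")
    return tp, tn, fp, fn


def keras_hanmingloss(y_true, y_pred):
    tp, tn, fp, fn = my_tp_tn_fp_fn(y_true, y_pred)
    num_wrong = fp + fn  # labels predicted incorrectly
    total = tp + tn + fp + fn  # all labels
    # K.epsilon() is a small constant added to the numerator.
    hanming_loss = (num_wrong + K.epsilon()) / total
    return hanming_loss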
1.2 Construction ideas
- Flow chart
flowchart TD
step1[Build a function that computes TP, TN, FP, FN]
step2[Construct the loss function from the Hamming loss definition]
step3[Validate against the hamming_loss API of scikit-learn]
step1 --> step2
step2 --> step3
- First compute TP, TN, FP, and FN.
- Then construct the loss function from TP, TN, FP, and FN using the Hamming loss formula.
- Validate the result against the hamming_loss API of the scikit-learn library.
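The three steps above can be sketched without Keras in plain Python. This is only an illustration under stated assumptions: the function names `confusion_counts`, `hamming_loss_from_counts`, and `hamming_loss_direct` and the sample data are made up for this example, the small epsilon term from the Keras version is left out, and the check in step 3 uses a direct element-wise mismatch count rather than scikit-learn.

```python
def confusion_counts(y_true, y_pred, threshold=0.5):
    """Step 1: count TP, TN, FP, FN over flat lists of 0/1 labels and scores."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        predicted = 1 if p > threshold else 0
        if t == 1 and predicted == 1:
            tp += 1
        elif t == 0 and predicted == 0:
            tn += 1
        elif t == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def hamming_loss_from_counts(y_true, y_pred, threshold=0.5):
    """Step 2: Hamming loss = wrong labels / all labels."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, threshold)
    return (fp + fn) / (tp + tn + fp + fn)


def hamming_loss_direct(y_true, y_pred, threshold=0.5):
    """Step 3 (reference): count mismatching positions directly."""
    mismatches = sum(
        1 for t, p in zip(y_true, y_pred) if t != (1 if p > threshold else 0)
    )
    return mismatches / len(y_true)


# Hypothetical data: two samples with three labels each, flattened row by row.
y_true = [1, 0, 1, 0, 1, 0]
y_pred = [0.9, 0.2, 0.4, 0.7, 0.8, 0.1]

print(confusion_counts(y_true, y_pred))          # (2, 2, 1, 1)
print(hamming_loss_from_counts(y_true, y_pred))  # 2 wrong out of 6 labels
print(hamming_loss_direct(y_true, y_pred))
```

Both routes give the same value because every mismatching position is either a false positive or a false negative.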
1.3 Verification with sklearn
- Hamming loss computed by the custom code, as shown in the figure below:
0.06353355
- Hamming loss computed by sklearn, as shown in the figure below:
0.06353354978354979
(The custom code computes in float32 and sklearn in float64, so the sklearn value carries more digits of precision.)
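The precision difference can be reproduced with the standard library alone. The sketch below rounds the float64 value quoted above to float32 and prints both; the helper name `to_float32` is made up for this example, and it only illustrates the rounding, not how Keras or sklearn compute the loss.

```python
import struct


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


sklearn_value = 0.06353354978354979  # float64 value quoted above
as_float32 = to_float32(sklearn_value)

print(f"{sklearn_value:.17f}")
print(f"{as_float32:.8f}")  # 0.06353355

# The two values differ only by float32 rounding error.
rounding_error = abs(as_float32 - sklearn_value)
print(rounding_error)
```

Near 0.0635 adjacent float32 values are about 7.5e-9 apart, so the two results agree to roughly seven significant digits.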
1.4 Experimental results
- Results in the training phase
- Results in the evaluation phase
2. Hamming loss
2.1 Introduction
- Hamming distance: the number of positions at which two arrays of the same size hold different elements
- Hamming loss: the Hamming distance divided by the total number of array elements
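1. Hamming distance is used in error-control coding for data transmission. It is the number of positions at which two strings of the same length hold different characters; we write d(x, y) for the Hamming distance between two words x and y.
2. XOR the two strings and count the number of 1s in the result; that count is the Hamming distance.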
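Both descriptions can be sketched in a few lines of Python. The function names `hamming_distance` and `hamming_distance_xor` are chosen for this example only; the XOR variant assumes the two bit strings are given as integers.

```python
def hamming_distance(a, b):
    """Count positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def hamming_distance_xor(x, y):
    """Hamming distance of two bit strings stored as integers: XOR, then count the 1s."""
    return bin(x ^ y).count("1")


print(hamming_distance("karolin", "kathrin"))      # 3
print(hamming_distance_xor(0b1011101, 0b1001001))  # 2
print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))  # 2
```

Dividing the last result by the sequence length (4) gives the Hamming loss of that pair, 0.5.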
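2.2 Formula
- $\hat{Y}$ is the predicted result and $Y$ is the true result; $y_{ij}$ denotes the $i$-th element in the $j$-th column, where the maximum value of $i$ is $n$ and the maximum value of $j$ is $m$. In other words, both are matrices of size $n \times m$, and the Hamming loss is the fraction of positions where they differ:
$$\mathrm{HammingLoss}(Y, \hat{Y}) = \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} \mathbb{1}\left[ y_{ij} \neq \hat{y}_{ij} \right]$$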
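A small worked example may help here. The notation is an assumption made for this illustration: $Y$ is the true label matrix, $\hat{Y}$ the prediction after thresholding at 0.5, both of size $n \times m$ with $n = 2$ and $m = 3$, and the matrices themselves are made up.

```latex
Y = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},
\qquad
\hat{Y} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \end{pmatrix}

% The matrices differ at positions (1,3) and (2,1), so the Hamming distance is 2:
\mathrm{HammingLoss}(Y, \hat{Y})
  = \frac{1}{n m} \sum_{i=1}^{n} \sum_{j=1}^{m} \mathbb{1}\left[ y_{ij} \neq \hat{y}_{ij} \right]
  = \frac{2}{2 \cdot 3}
  = \frac{1}{3}

% The same value in terms of the confusion counts (TP = 2, TN = 2, FP = 1, FN = 1):
\frac{FP + FN}{TP + TN + FP + FN} = \frac{1 + 1}{2 + 2 + 1 + 1} = \frac{1}{3}
```

Position (1,3) is a false negative and position (2,1) is a false positive, which is why counting mismatches and summing FP and FN give the same numerator.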
2.3 Application scenarios
- Typically used in multi-label classification tasks
- Typically used as a loss function or an evaluation metric
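3. References
- CSDN: TP、TN、FP、FN超级详细解析 (A very detailed explanation of TP, TN, FP, and FN)
- Baidu Baike: 汉明距离 (Hamming distance)
- CSDN: 可能是最全的机器学习模型评估指标总结 (Possibly the most complete summary of machine learning model evaluation metrics)
- Unknown source: 常用数学符号的 LaTeX 表示方法 (LaTeX notation for common mathematical symbols)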