mmcls multi-label classification in practice (3): multi-label classification metrics

The previous two articles introduced multi-label dataset preparation and ResNet multi-label classification, respectively. Next, I will introduce the evaluation metrics for multi-label classification and share some practical details, i.e., what you can do to improve results.

Before that, I want to cover the difference between the loss functions cross_entropy and binary_cross_entropy. The formula for BCE is shown below, where y is the label, x is the predicted value, and w is the weight; y can be a soft label.
![BCE and cross-entropy formulas](https://img-blog.csdnimg.cn/9d6048fc37264cfe8b1d88a819109c84.png)

The mathematical formula of cross entropy is shown above, where P is the target distribution, Q is the prediction, and H is the cross-entropy loss. In PyTorch, calling F.cross_entropy(input, target) computes this loss. Decomposed, it is the formula below: first take the softmax of the prediction along the last dim, then take its log, and finally multiply by the corresponding target and negate.

    loss = (-label * F.log_softmax(pred, dim=-1)).sum(dim=-1).mean()

The biggest difference between binary_cross_entropy and cross_entropy is that binary_cross_entropy normalizes the prediction with a sigmoid instead of a softmax. This matters a lot for multi-label classification: softmax forces the predictions to sum to 1, but in multi-label classification one image has multiple attributes, so the softmax.sum()=1 assumption does not hold. Therefore, F.binary_cross_entropy_with_logits is used for multi-label classification.
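To make the decomposition above concrete, here is a minimal sketch (with made-up random inputs) verifying that the manual softmax-log-multiply formula matches F.cross_entropy, and that the multi-label BCE applies an independent sigmoid per class:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
pred = torch.randn(4, 5)            # logits for 4 samples, 5 classes
target = torch.randint(0, 5, (4,))  # single-label targets

# F.cross_entropy == softmax -> log -> multiply by target -> negate -> mean
manual = (-F.one_hot(target, 5) * F.log_softmax(pred, dim=-1)).sum(dim=-1).mean()
builtin = F.cross_entropy(pred, target)
print(torch.allclose(manual, builtin))  # True

# Multi-label: each class gets its own sigmoid; labels are 0/1 per class
multi_target = torch.randint(0, 2, (4, 5)).float()
bce = F.binary_cross_entropy_with_logits(pred, multi_target)
manual_bce = -(multi_target * F.logsigmoid(pred)
               + (1 - multi_target) * F.logsigmoid(-pred)).mean()
print(torch.allclose(bce, manual_bce))  # True
```

Note that log(1 - sigmoid(x)) equals logsigmoid(-x), which is the numerically stable form used above.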

Multi-label classification evaluation metrics

    import numpy as np

    eps = np.finfo(np.float32).eps  # avoid division by zero

    # tp, fp, fn: 0/1 arrays of shape (num_samples, num_classes)
    precision_class = tp.sum(axis=0) / np.maximum(tp.sum(axis=0) + fp.sum(axis=0), eps)
    recall_class = tp.sum(axis=0) / np.maximum(tp.sum(axis=0) + fn.sum(axis=0), eps)
    CP = precision_class.mean() * 100.0  # per-class (macro) precision
    CR = recall_class.mean() * 100.0     # per-class (macro) recall
    CF1 = 2 * CP * CR / np.maximum(CP + CR, eps)
    OP = tp.sum() / np.maximum(tp.sum() + fp.sum(), eps) * 100.0  # overall precision
    OR = tp.sum() / np.maximum(tp.sum() + fn.sum(), eps) * 100.0  # overall recall
    OF1 = 2 * OP * OR / np.maximum(OP + OR, eps)

The evaluation metrics for multi-label classification are fairly simple. I generally pay most attention to CP and CR, the per-class average precision and recall. Computing them requires a threshold thr (usually thr=0.5): tp counts positive samples whose predicted score exceeds thr, fp counts negative samples whose predicted score exceeds thr, and fn counts positive samples whose predicted score falls below thr. Then precision = tp/(tp+fp) and recall = tp/(tp+fn).
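The following sketch shows how the tp/fp/fn arrays used above can be derived from sigmoid scores with thr=0.5. The scores and labels here are made-up toy values, not from the article's dataset:

```python
import numpy as np

# Hypothetical example: scores are sigmoid outputs (num_samples x num_classes),
# gts are 0/1 ground-truth labels of the same shape.
scores = np.array([[0.9, 0.2, 0.7],
                   [0.1, 0.8, 0.4]])
gts = np.array([[1, 0, 0],
                [0, 1, 1]])
thr, eps = 0.5, np.finfo(np.float32).eps

pos = scores >= thr
tp = (pos & (gts == 1)).astype(float)   # predicted positive, actually positive
fp = (pos & (gts == 0)).astype(float)   # predicted positive, actually negative
fn = (~pos & (gts == 1)).astype(float)  # predicted negative, actually positive

precision_class = tp.sum(axis=0) / np.maximum(tp.sum(axis=0) + fp.sum(axis=0), eps)
recall_class = tp.sum(axis=0) / np.maximum(tp.sum(axis=0) + fn.sum(axis=0), eps)
CP = precision_class.mean() * 100.0
CR = recall_class.mean() * 100.0
print(CP, CR)
```

With these toy inputs, class 3 has one false positive and one false negative, so both CP and CR drop to about 66.7.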

Multi-label classification tips
I believe everyone runs into problems such as insufficient data and uneven sample distributions when doing projects. Below are some tips from my own experience.
1. When the amount of data is limited (fewer than about 10k training samples), a small model will outperform a large one (the large model overfits).
2. Using ClassBalancedDataset to repeatedly sample categories with few samples can effectively improve recall.
3. Loss: I experimented with focal loss and asymmetric loss, and the results were very poor (high recall, but absurdly low precision). Both focal and asymmetric loss reduce the proportion of negative samples in the loss in order to balance positives and negatives. Asymmetric loss splits gamma into gamma+ and gamma-; compared with gamma=2 in focal loss, it suppresses the negative-sample loss even more aggressively. Because my class distribution is extremely imbalanced (in my experiments the positive-sample loss was dozens or even hundreds of times the negative-sample loss), the model focused almost exclusively on positive samples: it only learned what is correct and did not care about what is wrong. Hence recall improved, but precision was absurdly low. Focal loss has the same problem, though alpha can alleviate it slightly.
4. Tweaking the model structure is also worth trying: attention mechanisms, dropout, a different framework, and so on.
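To illustrate how gamma and alpha interact in tip 3, here is a minimal sketch of a common binary focal loss formulation for multi-label logits (this is my own sketch of the standard formulation, not the exact loss used in the experiments above):

```python
import torch
import torch.nn.functional as F

def focal_bce_loss(pred, target, gamma=2.0, alpha=0.25):
    """Binary focal loss sketch for multi-label classification.

    pred: raw logits; target: 0/1 labels of the same shape.
    (1 - p_t)**gamma shrinks the loss of well-classified (mostly easy
    negative) samples; alpha re-weights positives vs. negatives.
    """
    p = torch.sigmoid(pred)
    bce = F.binary_cross_entropy_with_logits(pred, target, reduction='none')
    p_t = p * target + (1 - p) * (1 - target)          # prob of the true class
    alpha_t = alpha * target + (1 - alpha) * (1 - target)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

# Sanity check: with gamma=0 and alpha=0.5, this reduces to 0.5 * plain BCE
pred = torch.randn(4, 5)
target = torch.randint(0, 2, (4, 5)).float()
plain = 0.5 * F.binary_cross_entropy_with_logits(pred, target)
print(torch.allclose(focal_bce_loss(pred, target, gamma=0.0, alpha=0.5), plain))
```

Raising gamma (or, in asymmetric loss, gamma- specifically) drives the negative-sample term toward zero, which is exactly how precision can collapse while recall rises on a heavily imbalanced dataset.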


Origin blog.csdn.net/litt1e/article/details/125458965