The concept and implementation of Top-K accuracy (source code explanation)

1. Concept

Classification tasks are commonly evaluated with four metrics: accuracy, precision, recall, and F-score. So what is Top-K accuracy? In brief, Top-K accuracy is the proportion of samples whose correct label appears among the K classes with the highest predicted probability. The accuracy we usually talk about is therefore Top-1 accuracy. Let's illustrate with an example:

Suppose we have a 10-class classifier for handwritten digit recognition. An image whose correct label is 3 is fed to the classifier, which outputs the following probability distribution:

p=[0.1,0.05,0.1,0.2,0.35,0.01,0.03,0.05,0.01,0.1]

The largest probability, 0.35, belongs to label 4, so under the usual standard (Top-1 accuracy) the classifier's prediction for this image is incorrect. Under the Top-2 standard, however, the prediction is correct, because the two largest probabilities in p include the true label: the label with the highest probability (0.35) is wrong, but the label with the second-highest probability (0.2) is 3, which is right. This sample therefore counts as a correct prediction when computing Top-2 accuracy.
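
The check above can be reproduced in a few lines of plain Python. This is a minimal sketch; the helper name `in_top_k` is an illustrative choice, not a library function.

```python
# Probability distribution from the example; the correct label is 3.
p = [0.1, 0.05, 0.1, 0.2, 0.35, 0.01, 0.03, 0.05, 0.01, 0.1]
label = 3

def in_top_k(probs, label, k):
    """Return True if `label` is among the k highest-probability classes."""
    # Sort class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return label in ranked[:k]

print(in_top_k(p, label, 1))  # False: the top-1 class is 4
print(in_top_k(p, label, 2))  # True: the top-2 classes are 4 and 3
```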

In short, Top-K accuracy checks whether the K most likely classes in the prediction include the true label. If they do, the prediction counts as correct; otherwise it counts as an error. It follows that Top-K accuracy can only stay the same or increase as K grows, never decrease. In the extreme case where K equals the number of classes, the accuracy is necessarily 1 (that is, 100%). In practice, we usually only look at a model's Top-1, Top-3, and Top-5 accuracy.
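
To see why the accuracy cannot drop as K grows, note that a sample counted as correct for some K stays correct for every larger K. A small sketch in plain Python, reusing the distribution p from the example above (the variable names are illustrative):

```python
p = [0.1, 0.05, 0.1, 0.2, 0.35, 0.01, 0.03, 0.05, 0.01, 0.1]
label = 3

# Class indices ordered from most to least probable.
ranked = sorted(range(len(p)), key=lambda i: p[i], reverse=True)
# 1-based rank of the correct label.
rank = ranked.index(label) + 1

# hits[k - 1] tells whether this sample counts as correct under Top-k.
hits = [label in ranked[:k] for k in range(1, len(p) + 1)]

print(rank)  # 2
print(hits)  # False for k = 1, True for every k >= 2
```

Since every sample switches from incorrect to correct at most once (at K equal to the rank of its true label), the fraction of correct samples is non-decreasing in K.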

2. Implementation

Let's take a look at the implementation:

Inputs to the function:

output: the model output, i.e., the score the model assigns to each class. Shape: [batch_size, num_classes]

target: the ground-truth class labels. Shape: [batch_size]

topk: the values of k for which top-k accuracy is computed, given as a tuple. The default is (1, 5), meaning the function returns the top-1 and top-5 accuracy.
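
Because only the ordering of the scores matters, `output` does not need to be normalized probabilities: raw logits yield the same top-k indices as their softmax, since softmax preserves the ordering within each row. A quick sketch with made-up logits (the values are arbitrary and chosen without ties):

```python
import torch

# Hypothetical raw scores (logits) for 2 samples and 4 classes.
logits = torch.tensor([[2.0, -1.0, 0.5, 3.0],
                       [0.1, 1.2, -0.3, 0.8]])
probs = torch.softmax(logits, dim=1)

# Indices of the 2 highest-scoring classes in each row.
idx_logits = logits.topk(2, dim=1).indices
idx_probs = probs.topk(2, dim=1).indices

print(torch.equal(idx_logits, idx_probs))  # True
```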

Let's give an example first:

import torch
output = torch.Tensor([[0.1,0.05,0.1,0.2,0.35,0.01,0.03,0.05,0.01,0.1],
        [0.2,0.05,0.1,0.35,0.2,0.01,0.02,0.04,0.01,0.0],
        [0.1,0.05,0.1,0.15,0.05,0.01,0.03,0.4,0.01,0.1],
        [0.1,0.05,0.1,0.15,0.05,0.01,0.08,0.1,0.01,0.35]])  # probability distribution predicted by the model
target = torch.Tensor([[4],[3],[7],[3]])  # ground-truth class indices
# Predicted class (index of the largest value in each row of output): torch.Tensor([[4],[3],[7],[9]])
topk = (1,3)  # compute top-1 and top-3 accuracy here
maxk = max(topk)  # the largest k; the top maxk predictions cover every smaller k
batch_size = target.size(0)  # the batch size equals the number of samples, 4
values, pred = output.topk(maxk, 1, True, True)  # topk returns two tensors, values and indices: the k largest values and their indices
print(values)
print(pred)  # size: batch_size x maxk = 4 x 3

output of topk:

tensor([[0.3500, 0.2000, 0.1000],
        [0.3500, 0.2000, 0.2000],
        [0.4000, 0.1500, 0.1000],
        [0.3500, 0.1500, 0.1000]])
tensor([[4, 3, 0],
        [3, 0, 4],
        [7, 3, 0],
        [9, 3, 0]])
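
The positional arguments in `output.topk(maxk, 1, True, True)` are easier to read when written as keywords. The sketch below assumes the parameter names documented for `torch.topk` (`k`, `dim`, `largest`, `sorted`) and uses a smaller, made-up tensor without ties so that the result is unambiguous:

```python
import torch

scores = torch.tensor([[0.1, 0.6, 0.3],
                       [0.7, 0.2, 0.1]])

# k=2: keep the two best classes; dim=1: search along the class dimension;
# largest=True: take the largest values; sorted=True: return them in descending order.
values, indices = scores.topk(k=2, dim=1, largest=True, sorted=True)

print(values)   # rows: [0.6, 0.3] and [0.7, 0.2]
print(indices)  # rows: [1, 2] and [0, 1]
```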

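pred stores, for each sample, the indices of the three classes with the highest predicted probability. When several classes share the same probability (such as the two 0.2 values in the second row), the order in which their indices appear should not be relied on. Next, we reshape target so that it can be compared with pred: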

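pred = pred.t()  # transpose; size: maxk x batch_size = 3 x 4
correct = pred.eq(target.view(1, -1).expand_as(pred))
# eq returns a boolean tensor marking where the elements are equal
# expand_as expands the tensor to the size of pred
# view() works like reshape in NumPy: it changes the shape of the tensor
print(pred)  # size: maxk x batch_size = 3 x 4
print(target.view(1, -1).expand_as(pred))  # expanded to the same size as pred
print(correct)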

The size of correct is max(topk) x batch_size. Each row corresponds to a rank: row 1 holds the most probable class, row 2 the second most probable, and so on, so the first n rows of correct cover the n most probable predictions. Each column corresponds to a sample: if any of the first n rows in a column is True, the top-n prediction for that sample is correct. Take the fourth sample (the fourth column): its true label is 3, but the model's most probable class (row 1) is 9 and its second most probable class (row 2) is 3. The top-1 entry is therefore False, so the sample does not count toward top-1 accuracy. Within the first three rows there is a True, so the sample does count toward top-3 accuracy.

# pred after transposing
tensor([[4, 3, 7, 9],
        [3, 0, 3, 3],
        [0, 4, 0, 0]])
# target after reshaping and expanding
tensor([[4., 3., 7., 3.],
        [4., 3., 7., 3.],
        [4., 3., 7., 3.]])
# correct: element-wise comparison of pred and target
tensor([[ True,  True,  True, False],
        [False, False, False,  True],
        [False, False, False, False]])
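
The same per-sample information can also be obtained without transposing, by comparing each row of top-k indices with its label and asking whether any of the first k entries match. This is an alternative formulation, not the one used in this article, and it assumes integer labels:

```python
import torch

# Top-3 indices per sample, as printed by topk in the walkthrough.
pred = torch.tensor([[4, 3, 0],
                     [3, 0, 4],
                     [7, 3, 0],
                     [9, 3, 0]])
target = torch.tensor([4, 3, 7, 3])  # integer class labels

# Reshape target from [4] to [4, 1] so it broadcasts against every column of pred.
matches = pred == target.unsqueeze(1)  # shape [4, 3]

top1_hit = matches[:, :1].any(dim=1)  # label found in the best guess
top3_hit = matches[:, :3].any(dim=1)  # label found among the best three

print(top1_hit)  # tensor([ True,  True,  True, False])
print(top3_hit)  # tensor([True, True, True, True])
```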

Then we compute the top-k accuracy for each k:

res = []
for k in topk:
    correct_k = correct[:k].reshape(-1).float().sum(0, keepdim=True)
    res.append(correct_k.mul_(100.0 / batch_size))  # express as a percentage
print(res)

output:

[tensor([75.]), tensor([100.])]

Finally, we wrap everything into a function that can be copied and reused directly:

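def accuracy(output, target, topk=(1,5)):
    """Computes the accuracy over the k top predictions for the specified values of k"""
    with torch.no_grad():
        maxk = max(topk)
        batch_size = target.size(0)

        _, pred = output.topk(maxk, 1, True, True)
        # topk takes the k largest elements along a dimension, sorted in descending order
        # It returns two tensors, values and indices: the k largest values and their indices
        pred = pred.t()  # transpose to [maxk, batch_size]
        correct = pred.eq(target.view(1, -1).expand_as(pred))
        # eq returns a boolean tensor marking where the elements are equal
        # expand_as expands the tensor to the size of its argument
        # view() works like reshape in NumPy: it changes the shape of the tensor

        res = []
        for k in topk:
            # reshape (not view) is used because correct[:k] may be non-contiguous,
            # in which case view(-1) raises a RuntimeError in recent PyTorch versions
            correct_k = correct[:k].reshape(-1).float().sum(0, keepdim=True)
            res.append(correct_k.mul_(100.0 / batch_size))
        return res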
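
A usage sketch: the block below repeats a compact copy of the function so that it runs on its own, then calls it on the four-sample batch from the walkthrough. The integer label tensor is an assumption made here; the walkthrough used a float tensor, which compares equal in the same way.

```python
import torch

def accuracy(output, target, topk=(1, 5)):
    """Top-k accuracy in percent for each k in topk (compact copy for this example)."""
    with torch.no_grad():
        maxk = max(topk)
        batch_size = target.size(0)
        _, pred = output.topk(maxk, 1, True, True)
        pred = pred.t()  # [maxk, batch_size]
        correct = pred.eq(target.view(1, -1).expand_as(pred))
        # reshape handles the non-contiguous slice correct[:k]
        return [correct[:k].reshape(-1).float().sum(0, keepdim=True).mul_(100.0 / batch_size)
                for k in topk]

output = torch.tensor([[0.1, 0.05, 0.1, 0.2, 0.35, 0.01, 0.03, 0.05, 0.01, 0.1],
                       [0.2, 0.05, 0.1, 0.35, 0.2, 0.01, 0.02, 0.04, 0.01, 0.0],
                       [0.1, 0.05, 0.1, 0.15, 0.05, 0.01, 0.03, 0.4, 0.01, 0.1],
                       [0.1, 0.05, 0.1, 0.15, 0.05, 0.01, 0.08, 0.1, 0.01, 0.35]])
target = torch.tensor([4, 3, 7, 3])  # integer class labels

top1, top3 = accuracy(output, target, topk=(1, 3))
print(top1.item(), top3.item())  # 75.0 100.0
```

Note that the largest k in topk must not exceed num_classes; otherwise the call to topk fails. With the default topk=(1, 5), the model therefore needs at least five classes.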


 


Origin blog.csdn.net/qq_54708219/article/details/129428423