Confusion Matrix in Image Segmentation and Computing Indicators Using Confusion Matrix

Table of contents

1. Introduction

2. Create a confusion matrix

2.1 The update method

2.2 The compute method

2.3 The __str__ method

3. Test

4. Complete code


1. Introduction

In semantic segmentation, performance indicators can be calculated from the confusion matrix.

The method implemented here differs from the one used in image classification; if needed, refer to the separate article: Confusion Matrix

The test data used here is as follows (it appears again in the test code of Section 3):
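Concretely, the ground-truth and prediction tensors used throughout (taken from the test in Section 3) are:

```python
import torch

# Ground-truth labels and network predictions (integer class maps),
# as used in the test in Section 3.
true = torch.LongTensor([[1, 2, 1],
                         [0, 2, 2],
                         [0, 1, 1]])
pred = torch.LongTensor([[1, 2, 0],
                         [1, 2, 1],
                         [0, 2, 2]])

# 4 of the 9 pixels agree between label and prediction
print((true == pred).sum().item())  # 4
```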

2. Create a confusion matrix

The implementation of the confusion matrix class is as follows:

__init__ initializes the confusion matrix.

update updates the values of the confusion matrix.

reset resets the values of the matrix to zero.

compute calculates the corresponding performance indicators from the confusion matrix built by update.

__str__ returns the string that print outputs after the confusion matrix class is instantiated.

All of the explanations below use the test data shown above.

2.1 The update method

The update method is shown in full in the complete code of Section 4.

a is the ground-truth label and b is the value predicted by the network. Note: the prediction here is an integer array, just like the label.

First, the __init__ method stores the size of the confusion matrix as n (the number of segmentation classes + 1 for the background); the confusion matrix mat is then created and initialized to zero on the first call to update.

Next, k is a boolean mask over the ground-truth label a.

Its purpose is to set the region of no interest to False; the actual segmentation labels should be numbered consecutively as 1, 2, 3, because by convention 0 is the background and 255 marks the region of no interest.

For example, when there are 2 segmentation classes (1, 2), adding the background makes n = 3 (0, 1, 2), and the region of no interest is labeled 255. The mask k = (a >= 0) & (a < n) then sets the pixels labeled 0, 1, 2 to True and the pixels labeled 255 to False, which satisfies the segmentation requirements while filtering out the uninteresting 255 region.

Therefore, when the dataset loads the data, the foreground classes should be numbered starting from 1, 2, 3.
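A minimal sketch of this masking step, assuming n = 3 classes (background 0 plus foregrounds 1 and 2) and 255 as the ignore label:

```python
import torch

n = 3                                   # number of classes including the background
a = torch.LongTensor([0, 1, 2, 255])    # labels, with 255 marking an ignored pixel

k = (a >= 0) & (a < n)   # True for valid class labels, False for 255
print(k)       # tensor([ True,  True,  True, False])
print(a[k])    # tensor([0, 1, 2]) -- the 255 pixel is dropped
```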

Then, the following operation updates the confusion matrix, whose rows correspond to the true labels and whose columns correspond to the predictions.

inds first flattens a and b into one-dimensional vectors; multiplying the true label by n shifts it to the start of its own group of n entries, adding b selects the position within that group, bincount counts how often each combination occurs, and reshape finally turns the counts into an n × n matrix. You can step through it in a debugger to see the details.

For example, if true = 1 is predicted as pred = 0 exactly once, the confusion matrix holds a 1 at row 1, column 0.
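The index trick can be traced step by step on the flattened test data (a sketch; the values below come from the tensors in Section 3):

```python
import torch

n = 3
a = torch.LongTensor([1, 2, 1, 0, 2, 2, 0, 1, 1])  # true labels, flattened
b = torch.LongTensor([1, 2, 0, 1, 2, 1, 0, 2, 2])  # predictions, flattened

# n * a picks the row block, + b picks the column within it
inds = n * a + b
mat = torch.bincount(inds, minlength=n**2).reshape(n, n)
print(mat)
# tensor([[1, 1, 0],
#         [1, 1, 2],
#         [0, 1, 2]])
```

Row 1, column 0 is indeed 1: exactly one pixel with true label 1 was predicted as class 0.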

2.2 The compute method

compute uses the confusion matrix produced by update to calculate the performance indicators of the segmentation task. For background on these indicators, see: Common evaluation indicators for semantic segmentation

In the confusion matrix, the rows are the true labels and the columns are the predictions.

Pixel accuracy = sum of the diagonal of the confusion matrix / sum of the entire confusion matrix

acc here is the recall of each class = each diagonal value / the number of pixels whose true label is that class (the rows of the matrix are the true labels, so sum over the rows)

Recall measures how much of the label was recovered: the recovered pixels are the correctly predicted ones, so recall is the proportion of correctly predicted pixels among all pixels of that class in the label

iou is each diagonal value / (the corresponding row sum + the corresponding column sum − the diagonal value, which would otherwise be counted twice)
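As a sketch, the three formulas can be evaluated directly on the confusion matrix derived above for the test data:

```python
import torch

# Confusion matrix for the test data (rows = true, columns = pred)
h = torch.tensor([[1., 1., 0.],
                  [1., 1., 2.],
                  [0., 1., 2.]])

acc_global = torch.diag(h).sum() / h.sum()                   # pixel accuracy
recall = torch.diag(h) / h.sum(1)                            # per-class recall (row sums)
iou = torch.diag(h) / (h.sum(1) + h.sum(0) - torch.diag(h))  # per-class IoU

print(acc_global)  # tensor(0.4444)
print(recall)      # tensor([0.5000, 0.2500, 0.6667])
print(iou)         # tensor([0.3333, 0.1667, 0.4000])
```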

2.3 The __str__ method

The __str__ method of a Python class returns the string that is printed when an instance of the class is passed to print.

Therefore, the __str__ method of the confusion matrix class returns the performance indicators calculated by compute.

Because __str__ automatically calls compute, and compute depends on update, be sure to call update to fill in the confusion matrix before printing the instance.

The recall and IoU here are per-class values, so they are returned as lists.

3. Test

The code for the test is as follows:

The samples tested are the true and pred tensors shown in the test code below.

Here we manually calculate the segmentation indicators and verify them against the confusion matrix output.

First, the pixel accuracy: 4 of the 9 pixels are predicted correctly, so 4 / 9 = 0.4444

Then the recall of each class (there are three classes: 0, 1, 2): for class 0: 1/2 = 0.5; for class 1: 1/4 = 0.25; for class 2: 2/3 = 0.6667

Then the IoU:

For class 0: 1/3 = 0.3333

For class 1: 1/6 = 0.1667

For class 2: 2/5 = 0.4

Finally, the mean IoU is the mean of the per-class IoU values: (0.3333 + 0.1667 + 0.4) / 3 = 0.9 / 3 = 0.3
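The last step can be checked with plain Python:

```python
# Verify the hand calculation of mean IoU from the per-class values
iou = [1 / 3, 1 / 6, 2 / 5]
miou = sum(iou) / len(iou)
print(round(miou, 4))  # 0.3
```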

4. Complete code

The code for the confusion matrix:

import torch


# Confusion matrix for semantic segmentation
class ConfusionMatrix(object):
    def __init__(self, num_classes):
        self.num_classes = num_classes      # number of classes (including the background)
        self.mat = None                     # the confusion matrix

    def update(self, a, b):     # accumulate the confusion matrix; a = true labels, b = predictions
        n = self.num_classes
        if self.mat is None:    # create the confusion matrix on first use
            self.mat = torch.zeros((n, n), dtype=torch.int64, device=a.device)
        with torch.no_grad():
            k = (a >= 0) & (a < n)      # mask out ignored pixels (e.g. label 255)
            # count how often true class a[k] was predicted as class b[k] (a neat index trick)
            inds = n * a[k].to(torch.int64) + b[k]
            self.mat += torch.bincount(inds, minlength=n**2).reshape(n, n)

    def reset(self):
        if self.mat is not None:
            self.mat.zero_()

    def compute(self):      # compute the performance indicators of the segmentation task
        h = self.mat.float()

        # global pixel accuracy (the diagonal holds the correctly predicted counts)
        acc_global = torch.diag(h).sum() / h.sum()
        acc = torch.diag(h) / h.sum(1)      # per-class recall
        iou = torch.diag(h) / (h.sum(1) + h.sum(0) - torch.diag(h))     # per-class IoU
        return acc_global, acc, iou

    def __str__(self):
        acc_global, acc, iou = self.compute()
        return (
            'global correct: {:.4f}\n'
            'recall: {}\n'
            'IoU: {}\n'
            'mean IoU: {:.4f}').format(
            acc_global.item(),
            ['{:.4f}'.format(i) for i in acc.tolist()],
            ['{:.4f}'.format(i) for i in iou.tolist()],
            iou.mean().item())

The test code:

confmat = ConfusionMatrix(num_classes=3)    # instantiate the confusion matrix

true = torch.LongTensor([[1, 2, 1], [0, 2, 2], [0, 1, 1]])
pred = torch.LongTensor([[1, 2, 0], [1, 2, 1], [0, 2, 2]])

confmat.update(true, pred)  # update the values of the confusion matrix
print(confmat)
'''
global correct: 0.4444
recall: ['0.5000', '0.2500', '0.6667']
IoU: ['0.3333', '0.1667', '0.4000']
mean IoU: 0.3000
'''

Origin blog.csdn.net/qq_44886601/article/details/130127502