Implementing the precision and recall evaluation metrics for YOLOv5

1. Principle and formulas

Precision = TP / (TP + FP)        Recall = TP / (TP + FN)

The important thing must be said three times: precision and accuracy are not the same thing! Precision and accuracy are not the same thing! Precision and accuracy are not the same thing! When measuring the performance of a model, we usually use precision and recall.
TP (true positive) is the number of positive samples correctly predicted as positive.
FP (false positive) is the number of negative samples incorrectly predicted as positive.
FN (false negative) is the number of positive samples incorrectly predicted as negative.

2. How do you code precision and recall yourself for a multi-object detection task?
(The prerequisite is that annotation information is available.)
1. Idea analysis: for a multi-object detection task, a TP (true positive) is a correctly predicted box. Each box predicted by the model is compared one by one, via IoU, against the annotated boxes of the image. If the maximum IoU with any annotation box exceeds the preset IoU threshold, and the class of the prediction box matches the class of that best-matching annotation box, the prediction box is counted as a true positive.
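The matching step described above can be sketched as follows. This is a minimal stand-in for the `box12_iou` helper used in the code later in this post (whose implementation is not shown there); the function and argument names here are illustrative, and boxes are assumed to be in (x1, y1, x2, y2) pixel coordinates:

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) in pixels."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_prediction(pred_box, gt_boxes, gt_classes):
    """Find the annotation box with the maximum IoU against pred_box.

    Returns (max IoU, class label of that best-matching annotation box);
    the caller then checks the IoU threshold and the class agreement.
    """
    best_iou, best_cls = 0.0, -1
    for box, cls in zip(gt_boxes, gt_classes):
        iou = box_iou(pred_box, box)
        if iou > best_iou:
            best_iou, best_cls = iou, cls
    return best_iou, best_cls
```

A prediction is then a TP when `best_iou` exceeds the IoU threshold and `best_cls` equals the predicted class.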
For recall, the denominator TP + FN is the number of positives correctly predicted as positive plus the number of positives missed (predicted as negative). Together that is all positive samples, i.e. the number of all annotated boxes.
For precision, the denominator TP + FP is the number of positives correctly predicted as positive plus the number of negatives incorrectly predicted as positive, i.e. true positives plus false positives. Together that is everything predicted as positive, i.e. the number of all prediction boxes.
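Putting the two formulas together, a minimal sketch with made-up counts (the numbers below are only an example):

```python
def precision_recall(tp, fp, fn):
    # Precision: fraction of predicted boxes that are correct -> TP / (TP + FP)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall: fraction of annotated boxes that were found -> TP / (TP + FN)
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall

# Example: 80 correct predictions, 20 wrong predictions, 20 missed boxes
p, r = precision_recall(80, 20, 20)
print(p, r)  # 0.8 0.8
```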
2. Implementation steps:
1) Feed data to the model and obtain inference information;

pred = model(img, augment=opt.augment)[0]

2) Apply NMS to obtain the final predictions; at this point all prediction boxes are available (`pred` is a list with one detection tensor per image, each row holding x1, y1, x2, y2, confidence, class);

pred = non_max_suppression(pred, opt.conf_thres, opt.iou_thres, classes=opt.classes, agnostic=opt.agnostic_nms)

3) Traverse the predictions, read the corresponding annotation information, and count TP, TP + FN, and TP + FP

for i, det in enumerate(pred):  # detections per image
			p, s, im0 = path[i], '%g: ' % i, im0s[i].copy()
			base_txt_name = os.path.splitext(os.path.basename(p))[0] + '.txt'  # label-file name derived from the image name
			txt_label = txtpath + "/" + base_txt_name   # path to this image's annotation file
			vecter_labels = []    # annotated boxes, scaled to the original image size
			cls_labels = []       # class labels of the annotated boxes
			person_labels = []    # labels of class 0 (person)
			fire_labels = []      # labels of class 1 (fire)
			smoke_labels = []     # labels of class 2 (smoke)
			with open(txt_label, 'r+', encoding='UTF-8') as f1:       # open the annotation file and read the labels
				lines = f1.readlines()
				for j in range(len(lines)):
					vecter_label = []
					lines[j] = lines[j].replace('\n', '')
					line = lines[j].split(' ')   # YOLO label format: class cx cy w h (normalized)
					# convert normalized cx, cy, w, h to pixel x1, y1, x2, y2 on the original image
					x1_ = float(line[1])*im0.shape[1] - float(line[3])*im0.shape[1]/2
					vecter_label.append(x1_)
					y1_ = float(line[2])*im0.shape[0] - float(line[4])*im0.shape[0]/2
					vecter_label.append(y1_)
					x2_ = float(line[1])*im0.shape[1] + float(line[3])*im0.shape[1]/2
					vecter_label.append(x2_)
					y2_ = float(line[2])*im0.shape[0] + float(line[4])*im0.shape[0]/2
					vecter_label.append(y2_)
					vecter_labels.append(vecter_label)
					cls_labels.append(int(line[0]))
					if int(line[0]) == 0:
						person_labels.append(int(line[0]))
					if int(line[0]) == 1:
						fire_labels.append(int(line[0]))
					if int(line[0]) == 2:
						smoke_labels.append(int(line[0]))

			total_kuang = total_kuang + len(cls_labels)       # total number of annotated boxes
			person_kuang = person_kuang + len(person_labels)  # annotated boxes of class 0 (person)
			fire_kuang = fire_kuang + len(fire_labels)        # annotated boxes of class 1 (fire)
			smoke_kuang = smoke_kuang + len(smoke_labels)     # annotated boxes of class 2 (smoke)
			
			s += '%gx%g ' % img.shape[2:]  # print string
			gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
			
			if det is not None and len(det):     # iterate over the prediction boxes
				det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()
				# Print results
				for c in det[:, -1].unique():
					n = (det[:, -1] == c).sum()  # detections per class
					s += '%g %ss, ' % (n, names[int(c)])  # add to string

				for *xyxy, conf, cls in det:
					xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4))).view(-1).tolist()  # xywh in original-image pixels
					pre_xywh = []      # prediction box at the original image size, as x1, y1, x2, y2
					x1 = xywh[0] - xywh[2]/2
					pre_xywh.append(x1)
					y1 = xywh[1] - xywh[3]/2
					pre_xywh.append(y1)
					x2 = xywh[0] + xywh[2]/2
					pre_xywh.append(x2)
					y2 = xywh[1] + xywh[3]/2
					pre_xywh.append(y2)
					ious, i_libie = box12_iou(pre_xywh, vecter_labels, cls_labels)  # maximum IoU between this prediction and the image's annotation boxes, plus the class of that best-matching box
					if i_libie == int(cls):  # the predicted class matches the class of the max-IoU annotation box
						print("label=%s	IOU=%.2f	conf=%.2f" % (names[int(cls)], ious, conf))
						if i_libie == 0:       # class 0 (person)
							person_Positive_num = person_Positive_num + 1      # count class-0 prediction boxes
							if conf > 0.3 and ious > 0.4:
								person_Positive_true = person_Positive_true + 1    # count correct class-0 predictions
						if i_libie == 1:       # class 1 (fire)
							fire_Positive_num = fire_Positive_num + 1          # count class-1 prediction boxes
							if ious > 0.25:
								fire_Positive_true = fire_Positive_true + 1        # count correct class-1 predictions
						if i_libie == 2:       # class 2 (smoke)
							smoke_Positive_num = smoke_Positive_num + 1        # count class-2 prediction boxes
							if ious > 0.25:
								smoke_Positive_true = smoke_Positive_true + 1      # count correct class-2 predictions
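After the loop over all images finishes, the per-class precision and recall follow directly from the counters accumulated above (the counters are assumed to be initialized to 0 before the loop, which the excerpt does not show). A minimal sketch, where `*_Positive_true` plays the role of TP, `*_Positive_num` of TP + FP, and `*_kuang` of TP + FN:

```python
def class_metrics(tp, num_pred, num_gt):
    """precision = TP / (TP + FP), recall = TP / (TP + FN) for one class."""
    precision = tp / num_pred if num_pred else 0.0
    recall = tp / num_gt if num_gt else 0.0
    return precision, recall

# e.g. for class 0 (person), using the counters accumulated above:
# person_precision, person_recall = class_metrics(
#     person_Positive_true, person_Positive_num, person_kuang)
print(class_metrics(8, 10, 16))  # illustrative counts only
```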

Origin blog.csdn.net/jiafeier_555/article/details/111287054