Per-class object counting with YOLOv5, displayed on the image

        A reader sent me a private message asking how to use YOLOv5 to count detected objects by class, so this article adds some code on top of my previous object-counting post to implement per-class counting. Please read that post before this one; the link is as follows:

YOLOv5 achieves target count_Albert_yeager's blog

1. Classification implementation

        Take the COCO dataset as an example: it has 80 categories in total. Note that each category has an index: 'person' is 0, 'bicycle' is 1, 'car' is 2, and so on. These indices are used in the code that follows.
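
The index lookup described above can be sketched in a few lines. This is only an illustration: it assumes the names list follows the standard COCO class order used by the pretrained YOLOv5 models, it lists just the first 28 of the 80 names, and `class_index` is a hypothetical helper that is not part of YOLOv5.

```python
# Excerpt of the COCO class names in index order (first 28 of 80).
# Assumption: this matches the `names` list of the model you load.
COCO_NAMES_HEAD = [
    'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train',
    'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign',
    'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
    'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella',
    'handbag', 'tie',
]


def class_index(names, class_name):
    """Return the index of class_name in names (hypothetical helper)."""
    return names.index(class_name)


print(class_index(COCO_NAMES_HEAD, 'person'))  # 0
print(class_index(COCO_NAMES_HEAD, 'tie'))     # 27
```

If the class you want is not in your model's names list, `list.index` raises a `ValueError`, which is a useful early warning that the wrong weights are loaded.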

        Find the counting block written earlier (see the previous post for details) and replace it with the following code to implement per-class counting. I explain it in detail below.

# Write results + counting
# count=0
person_count = 0  # counter for class 0 ('person')
tie_count = 0  # counter for class 27 ('tie')
for *xyxy, conf, cls in reversed(det):
    if save_txt:  # Write to file
        xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
        line = (cls, *xywh, conf) if opt.save_conf else (cls, *xywh)  # label format
        with open(txt_path + '.txt', 'a') as f:
            f.write(('%g ' * len(line)).rstrip() % line + '\n')

    if save_img or view_img:  # Add bbox to image
        # c = int(cls)  # integer class index
        # label = '%s %.2f  num: %d' % (names[int(cls)], conf, person_count)
        label = f'{names[int(cls)]} {conf:.2f}'
        plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)
        ########################## per-class counting ##########################
        if int(cls) == 0:
            person_count += 1
        if int(cls) == 27:
            tie_count += 1
        # count = count+1

         The main additions are the two counter initializations before the loop and the two if checks at the end of it. The counters person_count and tie_count both start at 0; here I chose 'person' and 'tie' as the two classes to count, and you can add your own classes as needed. The label variable sets the format of the text drawn on each box: the uncommented line is the official format, and you can change it to whatever you like.

        Here is the key point. The two if checks test int(cls), which is the index of the detected class. In COCO, 'person' has index 0, so when int(cls) == 0 (a person was detected) person_count is incremented by 1; 'tie' has index 27, so when int(cls) == 27 (a tie was detected) tie_count is incremented by 1. That is all per-class counting takes.
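
The same per-class logic can also be written without one `if` per class. The sketch below is an illustration only: it assumes each detection row ends with the class index (as in the `*xyxy, conf, cls` unpacking above), uses plain Python lists with made-up values in place of the `det` tensor, and `count_classes` is a hypothetical helper that is not part of YOLOv5.

```python
from collections import Counter


def count_classes(det, wanted):
    """Count detections per class index, restricted to the indices in wanted.

    det: iterable of rows shaped like (x1, y1, x2, y2, conf, cls).
    wanted: collection of class indices to count, e.g. {0, 27}.
    """
    counts = Counter({idx: 0 for idx in wanted})  # every wanted class starts at 0
    for *xyxy, conf, cls in det:
        idx = int(cls)
        if idx in counts:  # ignore classes we were not asked to count
            counts[idx] += 1
    return counts


# Made-up detections: two persons (class 0), one tie (27), one car (2).
fake_det = [
    [10, 20, 110, 220, 0.91, 0.0],
    [30, 40, 130, 240, 0.88, 0.0],
    [50, 60, 70, 90, 0.75, 27.0],
    [5, 5, 300, 200, 0.66, 2.0],
]
counts = count_classes(fake_det, wanted={0, 27})
person_count, tie_count = counts[0], counts[27]
print(person_count, tie_count)  # 2 1
```

With this shape, counting another class only means adding its index to `wanted`, at the cost of reading the result from a dictionary instead of a named variable.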

         The inference model used here is yolov5s.pt, which was trained on the COCO dataset, so it recognizes the 80 COCO classes, with indices from 0 to 79. If you want to count classes from your own dataset, you must run inference with your own trained model. The index of a class is its position in the names array used for training, starting from 0 and increasing by one for each entry.

        To make this clearer, here is another example.

        In the dataset file I used for my own training, I want to detect five targets: H, E, W, S, and N. Following their order in the names array, 'H' has index 0, 'E' has index 1, 'W' has index 2, 'S' has index 3, and 'N' has index 4.
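
As a small sketch of that mapping, the snippet below derives each index from the position in the names array. The list literal is an assumption reconstructed from the order described above, not a copy of the actual dataset file.

```python
# Assumed names array for the five-class example (order matters).
names = ['H', 'E', 'W', 'S', 'N']

# Map each class name to its index, i.e. its position in names.
name_to_index = {name: idx for idx, name in enumerate(names)}

print(name_to_index)  # {'H': 0, 'E': 1, 'W': 2, 'S': 3, 'N': 4}
print(name_to_index['W'], name_to_index['S'])  # 2 3
```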

         If I want to count W and S, the code above should be changed to the following: first define two counters, W_count and S_count, then increment W_count when int(cls) == 2 and S_count when int(cls) == 3. Of course, remember to change the inference model to your own (change the default value of the '--weights' parameter).

# Write results + counting
# count=0
W_count = 0  # counter for class 2 ('W')
S_count = 0  # counter for class 3 ('S')
for *xyxy, conf, cls in reversed(det):
    if save_txt:  # Write to file
        xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
        line = (cls, *xywh, conf) if opt.save_conf else (cls, *xywh)  # label format
        with open(txt_path + '.txt', 'a') as f:
            f.write(('%g ' * len(line)).rstrip() % line + '\n')

    if save_img or view_img:  # Add bbox to image
        # c = int(cls)  # integer class index
        # label = '%s %.2f  num: %d' % (names[int(cls)], conf, W_count)
        label = f'{names[int(cls)]} {conf:.2f}'
        plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)
        ########################## per-class counting ##########################
        if int(cls) == 2:
            W_count += 1
        if int(cls) == 3:
            S_count += 1
        # count = count+1

2. Displaying the counts on detected images/videos

        To display the counting results on the image, use the cv2.putText() function. The specific steps are as follows:

        Add the following lines of code after "if save_img:". Here I only print the counts of people and ties; change this to suit your needs.

############################## display counts on image/video output ##############################
text = 'person_num:%d ' % (person_count)
cv2.putText(im0, text, (180, 50), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 255), 5)
text = 'tie_num:%d ' % (tie_count)
cv2.putText(im0, text, (180, 120), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 0, 0), 5)
####################################################################################

        Also note what each parameter of cv2.putText() means, then adjust the values gradually until the text looks good on your images.

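cv2.putText(im0, text, (40, 100), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 4)
# Image to draw on (im0)
# Text string to draw (text)
# Position of the text (x, y): the bottom-left corner of the text; the image's top-left corner is (0, 0)
# Font type (font): here one of OpenCV's built-in fonts
# Font scale (font_scale): 1 here
# Font color (font_color): red here, (0, 0, 255) in BGR order
# Line thickness (thickness): 4 here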
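
When more than two counts are shown, hard-coding a y-coordinate for every line gets tedious. The sketch below computes the text and position of each count line so that they stack vertically. It is an illustration under assumptions: `overlay_lines` and `draw_counts` are hypothetical helpers, the default origin and line spacing are taken from the person/tie example earlier in this section, and `draw_counts` needs OpenCV installed and an image array such as `im0`.

```python
def overlay_lines(counts, origin=(180, 50), line_height=70):
    """Return (text, (x, y)) pairs, one per counted class, stacked downwards.

    counts: dict mapping a display name to its count, e.g. {'person': 3}.
    origin: position of the first line (bottom-left corner of the text).
    line_height: vertical distance in pixels between consecutive lines.
    """
    x, y = origin
    lines = []
    for row, (name, value) in enumerate(counts.items()):
        lines.append(('%s_num:%d' % (name, value), (x, y + row * line_height)))
    return lines


def draw_counts(im0, counts, color=(0, 0, 255)):
    """Draw every count line on im0 with cv2.putText (requires OpenCV)."""
    import cv2  # imported here so overlay_lines works without OpenCV

    for text, org in overlay_lines(counts):
        cv2.putText(im0, text, org, cv2.FONT_HERSHEY_SIMPLEX, 2, color, 5)
    return im0


print(overlay_lines({'person': 3, 'tie': 1}))
# [('person_num:3', (180, 50)), ('tie_num:1', (180, 120))]
```

Inside detect.py, a call such as `draw_counts(im0, {'person': person_count, 'tie': tie_count})` could then stand in for the individual cv2.putText lines, assuming the counters are defined as in section 1.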

        The final result looks like this (video detection works the same way):

         

3. Displaying the counts in the real-time detection window

        To display the counting results in the real-time detection window, cv2.putText() is used again. The specific change is as follows:

        Add the following lines of code after "if view_img:" (here I only print the count of people; change this to suit your needs).

############################## display counts in the real-time detection window ##############################
text = 'person_num:%d ' % (person_count)
cv2.putText(im0, text, (180, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 4)
####################################################################################

        

        I hope this article helps. To those who have asked me questions in the comments of other posts or by private message: don't worry, I am working on them, and as soon as I have something finished I will post it and let you know (≧∇≦ )/

On the road of learning, let's encourage each other (๑•̀ㅂ•́)و✧


Origin blog.csdn.net/Albert_yeager/article/details/130694180