[Object detection] YOLOv5: Chinese label display / custom colors

Foreword

This article shows how to display the labels drawn by YOLOv5 in Chinese and how to assign custom label colors.
The version used here is YOLOv5 5.0.

Source code logic analysis

In detect.py, these two lines of code set the label name and color.

# Get names and colors
names = model.module.names if hasattr(model, 'module') else model.names
colors = [[random.randint(0, 255) for _ in range(3)] for _ in names]

Note that the class names are not read from a separate file at detection time; they are embedded in the saved model weights file.

Create a new file, load_model.py, and load the trained model:

import torch
ckpt1 = torch.load('runs/train/exp21/weights/best.pt')
print("Done")

Set a breakpoint on the print line and start debugging.

Inspecting ckpt1 in the debugger shows that the class names are stored inside the model.
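
Without a debugger, the same check can be sketched in code. This assumes the checkpoint is a dict whose 'model' entry carries a names attribute; the helper name names_from_checkpoint is made up for illustration, and a stand-in object replaces the real checkpoint so the snippet runs without torch.

```python
from types import SimpleNamespace


def names_from_checkpoint(ckpt):
    """Return the class names stored in a loaded checkpoint dict."""
    model = ckpt['model']
    # Same unwrapping as in detect.py: a parallel wrapper keeps the real model in .module
    model = model.module if hasattr(model, 'module') else model
    return list(model.names)


# Stand-in for torch.load('runs/train/exp21/weights/best.pt'); assumed layout, not a real checkpoint
fake_ckpt = {'model': SimpleNamespace(names=['truck', 'SUV'])}
print(names_from_checkpoint(fake_ckpt))
```

With a real checkpoint, fake_ckpt would be replaced by the object returned from torch.load.
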
As for the colors, the program generates three random RGB values per class on every run, so the colors are not stable between runs.
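
One simple way to stabilize the colors, short of picking them by hand as done later in this article, is to seed the random generator. A minimal sketch; make_colors and the seed value are illustrative and not part of YOLOv5.

```python
import random


def make_colors(names, seed=0):
    """Generate one three-value color per class, identical on every run for a given seed."""
    rng = random.Random(seed)  # private generator, leaves the global random state alone
    return [[rng.randint(0, 255) for _ in range(3)] for _ in names]


colors = make_colors(['truck', 'SUV'])
```

The colors are still arbitrary, but a given class keeps the same one from run to run.
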

Idea analysis

With the loading logic understood, there are two main ways to display Chinese labels.

Idea one

Idea 1: Change the names to Chinese directly in data.yaml.
With this approach, note that the file is not opened as UTF-8 by default, so the encoding used to read it has to be set explicitly.
In train.py, this line needs to be

with open(opt.data) as f:

changed to

with open(opt.data, encoding='UTF-8') as f:

In test.py, this line needs to be

with open(data) as f:

changed to

with open(data, encoding='UTF-8') as f:

This approach requires retraining the model, and some minor problems still remain afterwards.
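
The reason for the explicit encoding is that open() falls back to a platform-dependent default when none is given. The sketch below writes a small stand-in for data.yaml (the contents are made up) and reads it back with encoding='utf-8'; it uses only the standard library and does not parse the YAML.

```python
import os
import tempfile

yaml_text = "nc: 2\nnames: ['卡车', '越野车']\n"

# Write a small stand-in for data.yaml as UTF-8
path = os.path.join(tempfile.mkdtemp(), 'data.yaml')
with open(path, 'w', encoding='utf-8') as f:
    f.write(yaml_text)

# Reading with an explicit encoding gives the same text on every platform
with open(path, encoding='utf-8') as f:
    loaded = f.read()

print(loaded == yaml_text)
```
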

Idea two

Idea 2: Convert the text directly when rendering the label.
OpenCV does not support drawing Chinese text by default, so the following steps are required:

  1. Convert the OpenCV image to a PIL image;
  2. Use PIL to draw the text;
  3. Convert the PIL image back to an OpenCV image.

Implementation

The second idea is used here.

Download fonts

The first step is to download a font that supports Chinese. I use SimHei. Download link:
http://www.font5.com.cn/ziti_xiazai.php?id=151&part=1237887120&address=0

Confusion matrix font modification

In the utils/metrics.py file, add the code at the beginning:

plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

After that, this line needs to be

sn.set(font_scale=1.0 if self.nc < 50 else 0.8)  # for label size

changed to

sn.set(font='SimHei', font_scale=1.0 if self.nc < 50 else 0.8)  # for label size

Chinese label/color modification

In the Write results section of detect.py, add this part:

# Write results
for *xyxy, conf, cls in reversed(det):
    if save_txt:  # Write to file
        xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
        line = (cls, *xywh, conf) if opt.save_conf else (cls, *xywh)  # label format
        with open(txt_path + '.txt', 'a') as f:
            f.write(('%g ' * len(line)).rstrip() % line + '\n')

    if save_img or view_img:  # Add bbox to image
        # label = f'{names[int(cls)]} {conf:.2f}'
        # label = None  # set to None to hide the label
        # plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)

        # Add Chinese label
        label = '%s %.2f' % (names[int(cls)], conf)
        # Fixed colors
        color_dict = {'1': [0, 131, 252], '2': [190, 90, 92], '3': [142, 154, 78],
                      '4': [2, 76, 82], '5': [119, 80, 5], '6': [189, 163, 234]}
        # Chinese text to draw
        if names[int(cls)] == 'truck':
            ch_text = '%s %.2f' % ('卡车', conf)
            color_single = color_dict['1']
        elif names[int(cls)] == 'SUV':
            ch_text = '%s %.2f' % ('越野车', conf)
            color_single = color_dict['2']
        else:  # classes without a translation keep the English label and the random color
            ch_text = label
            color_single = colors[int(cls)]

        im0 = plot_one_box(xyxy, im0, label=label, ch_text=ch_text, color=color_single, line_thickness=3)

The colors are picked from a palette I put together.
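
As the number of classes grows, the if/elif chain can be replaced by a lookup table. The following is only a sketch: CLASS_STYLE, DEFAULT_COLOR and style_for are hypothetical names, and the gray fallback color is an assumption.

```python
# Hypothetical lookup table: English class name -> (Chinese name, fixed color)
CLASS_STYLE = {
    'truck': ('卡车', [0, 131, 252]),
    'SUV': ('越野车', [190, 90, 92]),
}
DEFAULT_COLOR = [128, 128, 128]  # assumed fallback for classes missing from the table


def style_for(name, conf):
    """Return (text, color) for one detection; unknown classes keep their English name."""
    zh_name, color = CLASS_STYLE.get(name, (name, DEFAULT_COLOR))
    return '%s %.2f' % (zh_name, conf), color


ch_text, color_single = style_for('truck', 0.873)
```

In detect.py the call would be style_for(names[int(cls)], conf), and adding a class then means adding one entry to the table.
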

After that, import the following in utils/plots.py:

from PIL import Image, ImageDraw, ImageFont

Add a cv2ImgAddText helper and modify the plot_one_box function:

def cv2ImgAddText(img, text, left, top, textColor=(0, 255, 0), textSize=25):
    # Convert the image from OpenCV format to PIL format
    if isinstance(img, np.ndarray):  # check whether it is an OpenCV (numpy) image
        img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    draw = ImageDraw.Draw(img)
    fontText = ImageFont.truetype("Font/simhei.ttf", textSize, encoding="utf-8")
    draw.text((left, top - 2), text, textColor, font=fontText)
    # Convert back from PIL format to OpenCV format
    return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)


def plot_one_box(x, img, color=None, label=None, ch_text=None, line_thickness=None):
    # Plots one bounding box on image img and returns the annotated image
    tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1  # line/font thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        c2 = c1[0] + t_size[0], c1[1] - t_size[1]
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled
        # cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA)
        # Draw with PIL; fall back to the English label when no Chinese text is given
        img = cv2ImgAddText(img, ch_text or label, c1[0], c2[1], (255, 255, 255), 25)
    return img  # returned even when there is no label, so callers can always reassign

View the effect

The labels are now displayed in Chinese. One small problem remains: the label background is too wide. I will optimize it later when I have time.


2022.8.9 More

Custom Width Optimization

I also studied the logic of this drawing function. It uses getTextSize to calculate the size of the text, which does not seem to support Chinese, so I adjust the label width manually according to the length of each category name.

The function below calls cv2.rectangle twice: the first call draws the target rectangle, and the second fills the label background. The main change is in the corner coordinates passed to the second call, where the width is reduced by an amount that can be tuned per category. The improved code is as follows:

def plot_one_box(x, img, color=None, label=None, ch_text=None, line_thickness=None):
    # Plots one bounding box on image img and returns the annotated image
    tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1  # line/font thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)  # draw the target box
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        # Amount by which the label background is shortened, per category
        if label.split()[0] == "类别一":
            sublength = 50
        elif label.split()[0] == "类别二":
            sublength = 30
        else:
            sublength = 0  # other categories keep the default width
        c2 = c1[0] + t_size[0] - sublength, c1[1] - t_size[1]
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # fill the label background
        img = cv2ImgAddText(img, ch_text or label, c1[0], c2[1], (255, 255, 255), 25)
    return img
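
Instead of a hand-tuned sublength per category, the background width could be estimated from the Chinese text itself. The sketch below is a rough heuristic, not a measurement: it assumes full-width (CJK) glyphs are about as wide as the font size and all other characters about half of that, which is only approximately true for a font like SimHei. estimate_text_width is a hypothetical helper.

```python
import math
import unicodedata


def estimate_text_width(text, font_size=25):
    """Rough pixel width: wide (CJK) characters count as font_size, all others as half of it."""
    wide = sum(unicodedata.east_asian_width(ch) in ('W', 'F') for ch in text)
    narrow = len(text) - wide
    return math.ceil(wide * font_size + narrow * font_size / 2)


width = estimate_text_width('卡车 0.95')  # 2 wide + 5 narrow characters
```

The second cv2.rectangle call could then use c1[0] + estimate_text_width(ch_text, 25) as the x coordinate of c2.
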

2022.8.13 More

Adding Chinese labels in YOLOv5-6.x

Some readers asked how to add Chinese labels to TPH-YOLOv5. The drawing code of TPH-YOLOv5 is the same as that of YOLOv5-6.0.

First look at the drawing source code, which is also located in utils/plots.py:

class Annotator:
    if RANK in (-1, 0):
        check_font()  # download TTF if necessary

    # YOLOv5 Annotator for train/val mosaics and jpgs and detect/hub inference annotations
    def __init__(self, im, line_width=None, font_size=None, font='Arial.ttf', pil=False, example='abc'):
        assert im.data.contiguous, 'Image not contiguous. Apply np.ascontiguousarray(im) to Annotator() input images.'
        self.pil = pil or not is_ascii(example) or is_chinese(example)
        if self.pil:  # use PIL
            self.im = im if isinstance(im, Image.Image) else Image.fromarray(im)
            self.draw = ImageDraw.Draw(self.im)
            self.font = check_font(font='Arial.Unicode.ttf' if is_chinese(example) else font,
                                   size=font_size or max(round(sum(self.im.size) / 2 * 0.035), 12))
        else:  # use cv2
            self.im = im
        self.lw = line_width or max(round(sum(im.shape) / 2 * 0.003), 2)  # line width

    def box_label(self, box, label='', color=(128, 128, 128), txt_color=(255, 255, 255)):
        # Add one xyxy box to image with label
        if self.pil or not is_ascii(label):
            self.draw.rectangle(box, width=self.lw, outline=color)  # box
            if label:
                w, h = self.font.getsize(label)  # text width, height
                outside = box[1] - h >= 0  # label fits outside box
                self.draw.rectangle([box[0],
                                     box[1] - h if outside else box[1],
                                     box[0] + w + 1,
                                     box[1] + 1 if outside else box[1] + h + 1], fill=color)
                self.draw.text((box[0], box[1] - h if outside else box[1]), label, fill=txt_color, font=self.font)
        else:  # cv2
            p1, p2 = (int(box[0]), int(box[1])), (int(box[2]), int(box[3]))
            cv2.rectangle(self.im, p1, p2, color, thickness=self.lw, lineType=cv2.LINE_AA)
            if label:
                tf = max(self.lw - 1, 1)  # font thickness
                w, h = cv2.getTextSize(label, 0, fontScale=self.lw / 3, thickness=tf)[0]  # text width, height
                outside = p1[1] - h - 3 >= 0  # label fits outside box
                p2 = p1[0] + w, p1[1] - h - 3 if outside else p1[1] + h + 3
                cv2.rectangle(self.im, p1, p2, color, -1, cv2.LINE_AA)  # filled
                cv2.putText(self.im, label, (p1[0], p1[1] - 2 if outside else p1[1] + h + 2), 0, self.lw / 3, txt_color,
                            thickness=tf, lineType=cv2.LINE_AA)

As can be seen, version 6.x optimizes the drawing: the input is checked first. If the label is Chinese, the self.pil or not is_ascii(label) branch is taken and the text is drawn with PIL; if it is English, the cv2 branch below is taken. In other words, if the labels were already Chinese when the model was trained, Chinese display is supported by default.
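
To make the branch selection concrete, here is a small sketch of the two checks. The is_ascii and is_chinese functions below are assumed stand-ins for the helpers that plots.py imports, and the real implementations may differ; uses_pil is a hypothetical name that simply combines the two conditions quoted above.

```python
import re


def is_ascii(s=''):
    # True if the string contains only ASCII characters
    s = str(s)
    return len(s.encode().decode('ascii', 'ignore')) == len(s)


def is_chinese(s=''):
    # True if the string contains at least one CJK unified ideograph
    return bool(re.search('[\u4e00-\u9fff]', str(s)))


def uses_pil(label, example='abc', pil=False):
    """Combine the constructor check and the per-label check from box_label."""
    self_pil = pil or not is_ascii(example) or is_chinese(example)
    return self_pil or not is_ascii(label)
```

With the default example='abc', an English label such as 'person 0.90' ends up on the cv2 path, while passing the Chinese names as example selects PIL for every label.
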

So what if the labels were English during training but should be displayed in Chinese? That is also easy.
Copy the drawing code of the first branch into the second branch and do the translation there. Note that, as above, the font still has to be downloaded and the font path changed.

Modified code:

class Annotator:
    if RANK in (-1, 0):
        check_font()  # download TTF if necessary

    # YOLOv5 Annotator for train/val mosaics and jpgs and detect/hub inference annotations
    def __init__(self, im, line_width=None, font_size=None, font='Font/simhei.ttf', pil=False, example='abc'):
        assert im.data.contiguous, 'Image not contiguous. Apply np.ascontiguousarray(im) to Annotator() input images.'
        self.pil = pil or not is_ascii(example) or is_chinese(example)
        if self.pil:  # use PIL
            self.im = im if isinstance(im, Image.Image) else Image.fromarray(im)
            self.draw = ImageDraw.Draw(self.im)
            self.font = check_font(font='Arial.Unicode.ttf' if is_chinese(example) else font,
                                   size=font_size or max(round(sum(self.im.size) / 2 * 0.035), 12))
        else:  # use cv2
            # Added: set up PIL drawing on this path as well
            self.im = im if isinstance(im, Image.Image) else Image.fromarray(im)
            self.draw = ImageDraw.Draw(self.im)
            self.font = check_font(font='Arial.Unicode.ttf' if is_chinese(example) else font,
                                   size=font_size or max(round(sum(self.im.size) / 2 * 0.035), 12))

            # self.im = im
        self.lw = line_width or max(round(sum(im.shape) / 2 * 0.003), 2)  # line width

    def box_label(self, box, label='', color=(128, 128, 128), txt_color=(255, 255, 255)):
        # Add one xyxy box to image with label
        if self.pil or not is_ascii(label):
            self.draw.rectangle(box, width=self.lw, outline=color)  # box
            if label:
                w, h = self.font.getsize(label)  # text width, height
                outside = box[1] - h >= 0  # label fits outside box
                self.draw.rectangle([box[0],
                                     box[1] - h if outside else box[1],
                                     box[0] + w + 1,
                                     box[1] + 1 if outside else box[1] + h + 1], fill=color)
                # self.draw.text((box[0], box[1]), label, fill=txt_color, font=self.font, anchor='ls')  # for PIL>8.0
                # print(label)
                self.draw.text((box[0], box[1] - h if outside else box[1]), label, fill=txt_color, font=self.font)
        else:  # cv2
            # p1, p2 = (int(box[0]), int(box[1])), (int(box[2]), int(box[3]))
            # cv2.rectangle(self.im, p1, p2, color, thickness=self.lw, lineType=cv2.LINE_AA)
            # if label:
            #     tf = max(self.lw - 1, 1)  # font thickness
            #     w, h = cv2.getTextSize(label, 0, fontScale=self.lw / 3, thickness=tf)[0]  # text width, height
            #     outside = p1[1] - h - 3 >= 0  # label fits outside box
            #     p2 = p1[0] + w, p1[1] - h - 3 if outside else p1[1] + h + 3
            #     cv2.rectangle(self.im, p1, p2, color, -1, cv2.LINE_AA)  # filled
            #     self.draw.text((box[0], box[1] - h if outside else box[1]), label, fill=txt_color, font=self.font)
            #     cv2.putText(self.im, label, (p1[0], p1[1] - 2 if outside else p1[1] + h + 2), 0, self.lw / 3, txt_color,
            #                 thickness=tf, lineType=cv2.LINE_AA)

            self.draw.rectangle(box, width=self.lw, outline=color)  # box
            if label:
                w, h = self.font.getsize(label)  # text width, height
                outside = box[1] - h >= 0  # label fits outside box
                self.draw.rectangle([box[0],
                                     box[1] - h if outside else box[1],
                                     box[0] + w + 1,
                                     box[1] + 1 if outside else box[1] + h + 1], fill=color)
                # self.draw.text((box[0], box[1]), label, fill=txt_color, font=self.font, anchor='ls')  # for PIL>8.0
                # print(label.split()[0])
                if label.split()[0] == "person":
                    label = "行人" + label.split()[1]
                elif label.split()[0] == "bus":
                    label = "公交车" + label.split()[1]
                self.draw.text((box[0], box[1] - h if outside else box[1]), label, fill=txt_color, font=self.font)

Note: The width of the color box has not been adjusted here; it can be adjusted in a similar way to the previous section.
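
The width mismatch comes from the order of operations: the code above measures the English label with self.font.getsize and only then swaps in the Chinese text. One possible fix is to translate first and measure the translated string. A sketch, with EN_TO_ZH and translate_label as hypothetical names; it only handles single-word class names.

```python
# Hypothetical translation table; extend it with your own classes
EN_TO_ZH = {'person': '行人', 'bus': '公交车'}


def translate_label(label, table=EN_TO_ZH):
    """Turn 'person 0.95' into '行人 0.95'; labels without a translation are returned unchanged."""
    parts = label.split(maxsplit=1)
    if not parts:
        return label
    parts[0] = table.get(parts[0], parts[0])
    return ' '.join(parts)
```

Calling label = translate_label(label) at the top of box_label, before the size is computed, would let the background follow the Chinese text.
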



2022.10.15 More

Postscript

I ran into the problem of adding Chinese labels again in a project. Looking back at my method from two months ago, it was too clumsy. One problem with adjusting the label widths one by one is that the displayed width differs on another computer with a different resolution. The root cause is that OpenCV cannot calculate the width of Chinese text, whereas PIL's font.getsize can. Since YOLOv5-6.0, Chinese has been handled this way.
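
One caveat when reusing this today: font.getsize was deprecated in Pillow 9.2 and removed in Pillow 10, where font.getbbox is the replacement. The wrapper below is a sketch that works with either; text_size is a hypothetical helper, and a stand-in font object is used so the snippet runs without Pillow or a font file.

```python
from types import SimpleNamespace


def text_size(font, text):
    """Return (width, height) of text for a PIL font, with or without getsize."""
    if hasattr(font, 'getbbox'):  # available since Pillow 8.0, the only option from 10.0 on
        left, top, right, bottom = font.getbbox(text)
        return right, bottom  # extent measured from the drawing origin, close to what getsize reported
    return font.getsize(text)


# Stand-in font (not a real PIL object): every character is 25 px wide
fake_font = SimpleNamespace(getbbox=lambda text: (0, 3, 25 * len(text), 28))
size = text_size(fake_font, '卡车')
```

In the Annotator class, w, h = text_size(self.font, label) could replace the direct getsize calls.
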

The following is the core method of version 6.0. In YOLOv5 5.0, the label-drawing function can be replaced directly with the 6.0 one.

In version 6.0, plots.py encapsulates the drawing functions in a class:

class Annotator:
    if RANK in (-1, 0):
        check_font()  # download TTF if necessary

    # YOLOv5 Annotator for train/val mosaics and jpgs and detect/hub inference annotations
    def __init__(self, im, line_width=None, font_size=None, font='Arial.ttf', pil=False, example='abc'):
        assert im.data.contiguous, 'Image not contiguous. Apply np.ascontiguousarray(im) to Annotator() input images.'
        self.pil = pil or not is_ascii(example) or is_chinese(example)
        if self.pil:  # use PIL
            self.im = im if isinstance(im, Image.Image) else Image.fromarray(im)
            self.draw = ImageDraw.Draw(self.im)
            self.font = check_font(font='Font/simhei.ttf' if is_chinese(example) else font,
                                   size=font_size or max(round(sum(self.im.size) / 2 * 0.035), 12))
        else:  # use cv2
            self.im = im
        self.lw = line_width or max(round(sum(im.shape) / 2 * 0.003), 2)  # line width

    def box_label(self, box, label='', color=(128, 128, 128), txt_color=(255, 255, 255)):
        # Add one xyxy box to image with label
        if self.pil or not is_ascii(label):
            self.draw.rectangle(box, width=self.lw, outline=color)  # box
            if label:
                w, h = self.font.getsize(label)  # text width, height
                outside = box[1] - h >= 0  # label fits outside box
                self.draw.rectangle([box[0],
                                     box[1] - h if outside else box[1],
                                     box[0] + w + 1,
                                     box[1] + 1 if outside else box[1] + h + 1], fill=color)
                # self.draw.text((box[0], box[1]), label, fill=txt_color, font=self.font, anchor='ls')  # for PIL>8.0
                self.draw.text((box[0], box[1] - h if outside else box[1]), label, fill=txt_color, font=self.font)
        else:  # cv2
            p1, p2 = (int(box[0]), int(box[1])), (int(box[2]), int(box[3]))
            cv2.rectangle(self.im, p1, p2, color, thickness=self.lw, lineType=cv2.LINE_AA)
            if label:
                tf = max(self.lw - 1, 1)  # font thickness
                w, h = cv2.getTextSize(label, 0, fontScale=self.lw / 3, thickness=tf)[0]  # text width, height
                outside = p1[1] - h - 3 >= 0  # label fits outside box
                p2 = p1[0] + w, p1[1] - h - 3 if outside else p1[1] + h + 3
                cv2.rectangle(self.im, p1, p2, color, -1, cv2.LINE_AA)  # filled
                cv2.putText(self.im, label, (p1[0], p1[1] - 2 if outside else p1[1] + h + 2), 0, self.lw / 3, txt_color,
                            thickness=tf, lineType=cv2.LINE_AA)

    def rectangle(self, xy, fill=None, outline=None, width=1):
        # Add rectangle to image (PIL-only)
        self.draw.rectangle(xy, fill, outline, width)

    def text(self, xy, text, txt_color=(255, 255, 255)):
        # Add text to image (PIL-only)
        w, h = self.font.getsize(text)  # text width, height
        self.draw.text((xy[0], xy[1] - h + 1), text, fill=txt_color, font=self.font)

    def result(self):
        # Return annotated image as array
        return np.asarray(self.im)

As the code shows, is_chinese(example) checks whether the labels are Chinese, and if so the PIL branch above is taken. Therefore, it is enough to convert the English labels into Chinese before detection.

Modify the detect.py part as follows:

# names normally holds the model's English names; here it is overridden with your own Chinese categories
# names = model.module.names if hasattr(model, 'module') else model.names
names = ['类别一', '类别二']

# Add this before the len(det) check
annotator = Annotator(im0, line_width=3, example=str(names))
if len(det):
    # Rescale boxes from img_size to im0 size
    det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()

    # Print results
    for c in det[:, -1].unique():
        n = (det[:, -1] == c).sum()  # detections per class
        s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string

    # Write results
    for *xyxy, conf, cls in reversed(det):
        if save_img:
            c = int(cls)
            label = f'{names[int(cls)]} {conf:.2f}'
            # print(label)
            annotator.box_label(xyxy, label, color=colors(c, True))
print(f'{s}')


# Do not forget to add this line before saving the image
im0 = annotator.result()
# Save results (image with detections)
if save_img:
    ...  # the original saving code follows unchanged

In this way, English can be replaced perfectly with Chinese.
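
Hard-coding names only works if the list has the same order as the classes in the model. A slightly safer variant, sketched below, translates the model's own names through a dictionary so the indices stay aligned; NAME_ZH and localize_names are hypothetical, and model_names stands in for model.names.

```python
# Hypothetical table; the keys must match the English names stored in the model
NAME_ZH = {'truck': '卡车', 'SUV': '越野车'}


def localize_names(names, table=NAME_ZH):
    """Translate class names one by one, keeping the original order and length."""
    return [table.get(name, name) for name in names]


model_names = ['truck', 'SUV', 'bus']  # stand-in for model.names
names = localize_names(model_names)
```

Classes missing from the table keep their English name, so a forgotten entry shows up as an untranslated label rather than a wrong one.
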

Origin blog.csdn.net/qq1198768105/article/details/126227069