[Sunrise X3] Multi-Task Learning with YOLOP

1. Introduction

A multi-task network handles several tasks with a single model: YOLOP performs object detection, drivable-area segmentation, and lane-line detection simultaneously in one network. This article briefly describes the network structure and then walks through deploying YOLOP to the X3 Pi with Horizon's AI toolchain.

YOLOP project: https://github.com/hustvl/YOLOP

Test code for this article: https://github.com/Rex-LK/ai_arm_learning

2. Network Structure

YOLOP's network structure is quite clean and largely follows the YOLO family, drawing mainly on YOLOv4. It breaks down into the following parts (a quick sanity check of the three heads is sketched after the diagram):

  • Backbone: feature extraction
  • Neck: feature fusion
  • Detect head: object detection
  • Drivable-area segmentation head
  • Lane-line segmentation head
[Figure: YOLOP network structure]
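Before any board-side work, the three-head layout can be confirmed by inspecting the official ONNX model with onnxruntime. A minimal sketch, assuming yolop-640-640.onnx sits in the working directory:

import onnxruntime as ort

sess = ort.InferenceSession("yolop-640-640.onnx",
                            providers=["CPUExecutionProvider"])
# Expect three outputs: detection, drivable-area seg, lane-line seg
for out in sess.get_outputs():
    print(out.name, out.shape)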

3. Model Quantization

The ONNX model shipped with the project uses opset_version=12, which the X3 toolchain does not support, so the model needs to be re-exported: change opset_version to 11 in export_onnx.py (the default input size is 640×640), as sketched below.
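The change itself is a single argument to torch.onnx.export. A minimal re-export sketch, assuming the YOLOP repository layout (lib.config, lib.models, weights/End-to-end.pth); the repo's export_onnx.py builds the network in more detail:

import torch
from lib.config import cfg
from lib.models import get_net

model = get_net(cfg)
ckpt = torch.load("weights/End-to-end.pth", map_location="cpu")
model.load_state_dict(ckpt["state_dict"])
model.eval()

dummy = torch.randn(1, 3, 640, 640)
torch.onnx.export(model, dummy, "yolop-640-640.onnx",
                  opset_version=11)  # the toolchain wants 11, not 12

With the model re-exported, quantization can proceed. Below is the quantization config file for YOLOP.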

model_parameters:
  onnx_model: 'yolop-640-640.onnx'
  output_model_file_prefix: 'yolop-640-640'
  march: 'bernoulli2'
input_parameters:
  input_type_train: 'rgb'
  input_layout_train: 'NCHW'
  input_type_rt: 'nv12'
  norm_type: 'data_mean_and_scale'
  mean_value: '123.675 116.28 103.53'
  scale_value: '0.0171 0.0175 0.0174'
  input_layout_rt: 'NCHW'
calibration_parameters:
  cal_data_dir: './calibration_data_rgb_f32'
  calibration_type: 'max'
  max_percentile: 0.9999
compiler_parameters:
  compile_mode: 'latency'  
  optimize_level: 'O3'
  debug: False
  core_num: 2 
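A few notes on the config: mean_value/scale_value are the standard ImageNet normalization, which the toolchain folds into the model input, and cal_data_dir must contain float32 RGB tensors at the model's input size. Below is a preparation sketch; the ./calibration_images folder name is an assumption, and any ~100 representative driving frames will do:

# Calibration-set preparation sketch: resize to the input size, convert
# BGR->RGB, and dump raw float32 NCHW tensors into cal_data_dir.
# With norm_type: data_mean_and_scale, the toolchain applies the
# mean/scale itself, so only raw pixels are saved here.
import os
import cv2
import numpy as np

src_dir = "./calibration_images"        # assumed folder of sample images
dst_dir = "./calibration_data_rgb_f32"
os.makedirs(dst_dir, exist_ok=True)

for name in os.listdir(src_dir):
    img = cv2.imread(os.path.join(src_dir, name))
    if img is None:
        continue
    img = cv2.resize(img, (640, 640))
    rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32)
    nchw = np.transpose(rgb, (2, 0, 1))[np.newaxis, ...]  # 1x3x640x640
    nchw.tofile(os.path.join(dst_dir, os.path.splitext(name)[0] + ".rgb"))

Quantization is then launched with hb_mapper makertbin --model-type onnx --config <config>.yaml, which produces the board-side yolop-640-640.bin model.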

4. On-Board Testing

Part of the test code is shown below; sketches of the helper functions it uses follow the listing.


import cv2
import numpy as np
import torch
from hobot_dnn import pyeasy_dnn  # Horizon X3 Python inference API

# get_hw, resize_unscale, bgr2nv12_opencv and non_max_suppression are
# helper functions from the test-code repo linked above.

def infer_yolop(weight, img_path):

    model_path = weight
    model = pyeasy_dnn.load(model_path)
    print("Load model_path done!")

    save_det_path = "./pictures/detect_onnx.jpg"
    save_da_path = "./pictures/da_onnx.jpg"
    save_ll_path = "./pictures/ll_onnx.jpg"
    save_merge_path = "./pictures/output_onnx.jpg"

    img_bgr = cv2.imread(img_path)
    height, width, _ = img_bgr.shape

    img0 = img_bgr.copy().astype(np.uint8)
    img_rgb = img_bgr[:, :, ::-1].copy()
    
    h, w = get_hw(model[0].inputs[0].properties)

    # Letterbox to the model input size, then convert to NV12 to match
    # input_type_rt in the quantization config
    canvas, r, dw, dh, new_unpad_w, new_unpad_h = resize_unscale(img0, (h, w))
    img_input = bgr2nv12_opencv(canvas)

    preds = model[0].forward(img_input)

    # Three heads: detection, drivable-area seg, lane-line seg;
    # [..., 0] strips the trailing dimension of the detection output
    det_out = preds[0].buffer[..., 0]
    da_seg_out = preds[1].buffer
    ll_seg_out = preds[2].buffer

    det_out = torch.from_numpy(det_out).float()
    boxes = non_max_suppression(det_out)[0]  # [n,6] [x1,y1,x2,y2,conf,cls]
    boxes = boxes.cpu().numpy().astype(np.float32)

    if boxes.shape[0] == 0:
        print("no bounding boxes detected.")
        return

    # scale coords to original size.
    boxes[:, 0] -= dw
    boxes[:, 1] -= dh
    boxes[:, 2] -= dw
    boxes[:, 3] -= dh
    boxes[:, :4] /= r

    print("detect {boxes.shape[0]} bounding boxes.")

    img_det = img_rgb[:, :, ::-1].copy()
    for i in range(boxes.shape[0]):
        x1, y1, x2, y2, conf, label = boxes[i]
        x1, y1, x2, y2, label = int(x1), int(y1), int(x2), int(y2), int(label)
        img_det = cv2.rectangle(img_det, (x1, y1), (x2, y2), (0, 255, 0), 2, 2)

    cv2.imwrite(save_det_path, img_det)

    # select da & ll segment area.
    da_seg_out = da_seg_out[:, :, dh:dh + new_unpad_h, dw:dw + new_unpad_w]
    ll_seg_out = ll_seg_out[:, :, dh:dh + new_unpad_h, dw:dw + new_unpad_w]

    da_seg_mask = np.argmax(da_seg_out, axis=1)[0]  # (new_unpad_h, new_unpad_w), values 0/1
    ll_seg_mask = np.argmax(ll_seg_out, axis=1)[0]  # (new_unpad_h, new_unpad_w), values 0/1
    print(da_seg_mask.shape)
    print(ll_seg_mask.shape)

    color_area = np.zeros((new_unpad_h, new_unpad_w, 3), dtype=np.uint8)
    color_area[da_seg_mask == 1] = [0, 255, 0]
    color_area[ll_seg_mask == 1] = [255, 0, 0]
    color_seg = color_area

    # convert to BGR
    color_seg = color_seg[..., ::-1]
    color_mask = np.mean(color_seg, 2)
    img_merge = canvas[dh:dh + new_unpad_h, dw:dw + new_unpad_w, :]
    img_merge = img_merge[:, :, ::-1]

    # merge: resize to original size
    img_merge[color_mask != 0] = \
        img_merge[color_mask != 0] * 0.5 + color_seg[color_mask != 0] * 0.5
    img_merge = img_merge.astype(np.uint8)
    img_merge = cv2.resize(img_merge, (width, height),
                           interpolation=cv2.INTER_LINEAR)
    for i in range(boxes.shape[0]):
        x1, y1, x2, y2, conf, label = boxes[i]
        x1, y1, x2, y2, label = int(x1), int(y1), int(x2), int(y2), int(label)
        img_merge = cv2.rectangle(img_merge, (x1, y1), (x2, y2), (0, 255, 0), 2, 2)

    # da: resize to original size
    da_seg_mask = da_seg_mask * 255
    da_seg_mask = da_seg_mask.astype(np.uint8)
    da_seg_mask = cv2.resize(da_seg_mask, (width, height),
                             interpolation=cv2.INTER_LINEAR)

    # ll: resize to original size
    ll_seg_mask = ll_seg_mask * 255
    ll_seg_mask = ll_seg_mask.astype(np.uint8)
    ll_seg_mask = cv2.resize(ll_seg_mask, (width, height),
                             interpolation=cv2.INTER_LINEAR)

    cv2.imwrite(save_merge_path, img_merge)
    cv2.imwrite(save_da_path, da_seg_mask)
    cv2.imwrite(save_ll_path, ll_seg_mask)

    print("detect done.")

The detection results are as follows:
[Figure: detection, drivable-area, and lane-line results]

As the figures show, all three vision tasks give solid results, especially at night; YOLOP proves to be a very capable model.

5. Summary

This article revisited YOLOP, a classic multi-task learning network, looked at how a single model produces several kinds of output at once, and completed its deployment on the X3. Interested readers can try it for themselves; other multi-task networks may be covered in a future article.

Reprinted from blog.csdn.net/weixin_42108183/article/details/128977842