Sunrise X3 Pi BPU Deployment Tutorial Series: Getting Out of the Model Deployment Novice Village with Ease

Installation preparation

This part introduces the environment preparation needed before using the toolchain. It consists of two parts: development machine setup (a personal computer) and development board setup (for example the Sunrise X3 Pi, which carries the BPU).

Development Machine Deployment (PC)

The development machines in the official sample tutorials all run Linux; in fact Windows works as well. The most recommended way is to use Docker. The model conversion process runs mainly on the CPU and does not use the GPU, so Docker is sufficient.

(1) Install docker

Since most users work on a personal computer, the environment setup below is based on Windows. The Docker Desktop Installer.exe installer is provided in the related resources (see the Horizon Developer Community); after installing it, start it as an administrator to get the following interface.

We can obtain the CentOS Docker image required for deployment from the Horizon OpenExplorer (Tiangong Kaiwu) CPU image page on Docker Hub. Use the latest image, v1.13.6. Run cmd in administrator mode and type docker; the Docker help information should be displayed.

Select the latest version, then enter the command docker pull openexplorer/ai_toolchain_centos_7:v1.13.6 in cmd; the image will be pulled automatically.

After the pull succeeds, you can see the toolchain image in Docker.
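For example, listing it from cmd (a quick check; the tag column should show v1.13.6):

docker images openexplorer/ai_toolchain_centos_7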

(2) Configure Tiangong Kaiwu OpenExplorer

The download of the OpenExplorer toolkit requires wget. A Windows build of wget is available from GNU Wget for Windows. After installing it, you can download the toolkit with the following command in cmd; after decompression, the toolkit contents are as follows. If you need other versions, refer to the resources download section of the official website.

In addition to mounting the OpenExplorer toolkit, Docker also needs to mount the dataset folder. You can download the official datasets with the following commands, or take them from the OpenExplorer/dataset folder in the related resources. Remember to decompress them after downloading.

# cifar
wget -c ftp://vrftp.horizon.ai/Open_Explorer/eval_dataset/cifar-10.tar.gz 
# cityscapes
wget -c ftp://vrftp.horizon.ai/Open_Explorer/eval_dataset/cityscapes.tar.gz 
# coco
wget -c ftp://vrftp.horizon.ai/Open_Explorer/eval_dataset/coco.tar.gz 
# imagenet
wget -c ftp://vrftp.horizon.ai/Open_Explorer/eval_dataset/imagenet.tar.gz 
# VOC
wget -c ftp://vrftp.horizon.ai/Open_Explorer/eval_dataset/VOC.tar.gz

(PS. Since I decompressed the package under Windows, some symbolic links were lost, so the necessary symlinks have to be rebuilt:)

# rebuild the model_zoo symlink
rm /open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/01_common/model_zoo
ln -s /open_explorer/ddk/samples/ai_toolchain/model_zoo /open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/01_common/model_zoo

(3) Start Docker

According to the tutorial you would execute run_docker.sh to start Docker, but you can also assemble the command yourself as described in this article. Before entering Docker, note down the following directories:

OpenExplorer (Tiangong Kaiwu) root directory: in my environment it is "D:\05 - 项目\01 - 旭日x3派\horizon_xj3_open_explorer_v2.2.3_20220617"; remember to keep the double quotes because of the spaces. This directory will be mounted at the /open_explorer directory inside Docker;

Dataset root directory: "D:\01 - datasets" in my environment (again in double quotes because of the spaces); this directory will be mounted at the /data/horizon_x3/data directory inside Docker;

*Auxiliary folder root directory: the official tutorial does not have this step; I mount this folder into Docker to act as something like a USB drive. In my environment it is "D:\05 - 项目\01 - 旭日x3派\BPUCodes": I can copy data into this folder under Windows, and the data is then available inside Docker at /data/horizon_x3/codes.

Then enter the following command in cmd (as administrator) to enter Docker (make sure the Docker Desktop application installed above is running). Note that cmd does not support this kind of line continuation, so delete the trailing \ characters and put everything on one line. Afterwards we can see the three directories mounted by the command.

docker run -it --rm \
-v "D:\05 - 项目\01 - 旭日x3派\horizon_xj3_open_explorer_v2.2.3_20220617":/open_explorer \
-v "D:\01 - datasets":/data/horizon_x3/data \
-v "D:\05 - 项目\01 - 旭日x3派\BPUCodes":/data/horizon_x3/codes \
openexplorer/ai_toolchain_centos_7:v1.13.6

At this point we have entered the complete toolchain development environment through the Docker image. You can type hb_mapper --help to verify that the help information is printed normally. hb_mapper is the most frequently used tool in the toolchain and will be introduced in detail in the model conversion section later.

In addition to copying files by mounting an additional folder, there is another way to copy files directly to the target directory.

Suppose we want to copy the file "C:\Users\Zhaoxi-Li\Downloads\Pangolin-0.8.tar.gz" to /root/downloads inside Docker (the directory must already exist). Open a new cmd with administrator privileges, run docker ps and note the CONTAINER ID, then use docker cp in the form docker cp <local file path> <container_id>:<path inside the container> to complete the copy.
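For example (677de3a8b719 is the container ID reported by docker ps in my session):

docker ps
docker cp "C:\Users\Zhaoxi-Li\Downloads\Pangolin-0.8.tar.gz" 677de3a8b719:/root/downloads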

Development board deployment (taking the Sunrise X3 Pi as an example)

Before using the board, be sure to follow the tutorial "Horizon's Newly Released AIoT Development Board - Sunrise X3 Pi" (you can search for it in the "Horizon Developer Community" - "Developer Forum") to complete the system bring-up.

Some supplementary tools of the toolchain are not included in the system image; they are shipped in the OpenExplorer release package. So, in the Docker container we just pulled, enter cd /open_explorer/ddk/package/board/ and execute bash install.sh 192.168.0.104, where 192.168.0.104 is the IP address of the development board (it can be checked with ifconfig on the board).
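That is, inside the container:

cd /open_explorer/ddk/package/board/
bash install.sh 192.168.0.104   # replace with your own board's IP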

This script mainly copies hrt_bin_dump and hrt_model_exec to the development board and adds several environment variables to /etc/profile on the board. The added content is as follows.

#Horizon Open Explorer ENV
export PATH=/userdata/.horizon/:/userdata/.horizon/ai_express_webservice_display/sbin/:$PATH
export HORIZON_APP_PATH=/userdata/.horizon/:$HORIZON_APP_PATH
#Horizon Open Explorer ENV

Enter hrt_model_exec on the development board; if you see output like the following, the board-side deployment is complete.

model deployment

The BPU toolchain pipeline is quite long, so it is necessary to understand what each stage means before deploying.

model preparation

Caffe models and ONNX models are supported, with Caffe having the best support. Our commonly used PyTorch models can be exported to ONNX. In fact, the dnn module integrated into OpenCV is also based on Caffe, so although Caffe is no longer popular in academia, it is still widely used in industry.

model validation

Verify that the layers used in the model are supported by the BPU. You need to use hb_mapper checker followed by a set of parameters describing the model (an example invocation is sketched after the note below). The parameters are as follows:

--model-type: input model type, onnx or caffe;

--march: chip type, this board can only fill in bernoulli2;

--proto: If the model is caffe, fill in the prototxt file required by caffe. The onnx model does not need to write this parameter;

--model: model file, caffe is *.caffemodel, onnx model is .onnx;

--input-shape: the name and shape of each model input. For example, if the input layer is named input1 with shape 1x3x128x128, write --input-shape input1 1x3x128x128. If the model has multiple inputs, e.g. a second input layer input2 with shape 1x96x28x28, write --input-shape input1 1x3x128x128 --input-shape input2 1x96x28x28. This parameter is optional: if omitted, the tool infers the shapes automatically; if specified, the specified values take precedence;

--output: sets the output log file (this parameter has been removed; the log is now written to hb_mapper_checker.log in the current directory by default);

Note: if the model check fails, the console prints clear ERROR messages, usually reporting that some layers are not supported by the BPU. In that case you can write a custom layer to solve it; a later example will show such a failure and its solution.
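For illustration, checking a hypothetical single-input ONNX model (the file name model.onnx and the input name input1 are placeholders) would look like this:

hb_mapper checker --model-type onnx --march bernoulli2 --model model.onnx --input-shape input1 1x3x128x128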

model conversion

After the model check passes, the model is converted with hb_mapper makertbin into a file that can run on the BPU, driven by a yaml configuration file that will be described in detail later. It takes two parameters (an example invocation follows the list):

--model-type: Specify caffe or onnx according to the model type;

--config: The configuration file for model compilation, the content is in yaml format, and the file name uses the .yaml suffix.
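In its simplest form the call looks like this (model_config.yaml is a placeholder for your own configuration file; a concrete example appears in the YOLOv3 section below):

hb_mapper makertbin --model-type caffe --config model_config.yaml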

Model performance, accuracy analysis and tuning

When I first started with the BPU, I kept wondering why the accuracy changes after converting a model. The reason is that the conversion turns the computation from float into int8, and this process necessarily loses some precision. If the accuracy gap is large, you need to tune the model following the official tutorial.

Yolov3 deployment example

The YOLOv3 sample lives under the /open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/04_detection/02_yolov3_darknet53/mapper path inside Docker; we use this official example to get a first look at the BPU workflow.

model preparation

The prototxt and caffemodel files are placed in the /open_explorer/ddk/samples/ai_toolchain/model_zoo/mapper/detection/yolov3_darknet53 path in docker.

model validation

After entering /open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/04_detection/02_yolov3_darknet53/mapper, run ./01_check.sh; if you see output like the following, the check has completed.

As mentioned above, model validation uses hb_mapper checker with a set of parameters; ./01_check.sh is essentially such a call (a sketch of the command is given after the parameter list).

Let's take you to understand these parameters:

--model-type: our model is Caffe, so fill in caffe;

--march: the Sunrise X3 Pi only accepts bernoulli2;

--proto: the prototxt file path, i.e. ../../../01_common/model_zoo/mapper/detection/yolov3_darknet53/yolov3_transposed.prototxt;

--model: the caffemodel file path, i.e. ../../../01_common/model_zoo/mapper/detection/yolov3_darknet53/yolov3.caffemodel;

--input-shape: not specified here; the tool can find it automatically.
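Putting these together, the core of ./01_check.sh boils down to a command along the following lines (a sketch reconstructed from the parameters above; see the script itself for the exact invocation):

hb_mapper checker --model-type caffe \
                  --march bernoulli2 \
                  --proto ../../../01_common/model_zoo/mapper/detection/yolov3_darknet53/yolov3_transposed.prototxt \
                  --model ../../../01_common/model_zoo/mapper/detection/yolov3_darknet53/yolov3.caffemodel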

 

model conversion

Before converting the model, you need to prepare calibration data. Run ./02_preprocess.sh to automatically extract data from the open_explorer package inside Docker, then run ./03_build.sh; it prints a long stream of log output and finishes after a while.

Here we can see that a similarity is evaluated for every layer of the network, which is why calibration data is needed: the BPU computes in INT8, so some loss of precision is unavoidable. These errors also propagate, so accuracy keeps dropping in later layers, and a very deep network will therefore also see a drop in overall accuracy.

In order to better understand these conversion processes, a complete interpretation of the process of preparing calibration data and model conversion will be given.

(1) Principle Interpretation: Prepare Calibration Data

This step calls the script ./02_preprocess.sh, whose core is the following Python call; you can read the source of data_preprocess.py yourself.

python3 ../../../data_preprocess.py \
  --src_dir ../../../01_common/calibration_data/coco \
  --dst_dir ./calibration_data_rgb_f32 \
  --pic_ext .rgb \
  --read_mode opencv

However, data_preprocess.py is not well suited for beginners to read: it has to be compatible with too many cases, so some very simple operations are written in a complicated way. So, around this model, I will show what actually needs to be prepared as calibration data.

First of all, we need to be clear about what kind of calibration data to prepare: the calibration data should store the image data at the target size, in the target color order (rgb or bgr, etc.) and in the target layout (CHW or HWC). Let's deal with these points and first build a basic processing flow:

① Load all image paths under a folder; the image directory is /open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/01_common/calibration_data/coco;

② Convert each image to the calibration format. From the prototxt we know the image size is 416x416; from the yaml file used by ./03_build.sh we know the input format is rgb and the data layout is CHW;

③ Use numpy.tofile to save each converted image to the target folder (the calibration_data folder must sit in the directory from which you run the conversion, since the yaml refers to it as ./calibration_data).

Let's start writing our own Python code, with comments for each step, so you can understand it directly.

# prepare_calibration_data.py
import os
import cv2
import numpy as np

src_root = '/open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/01_common/calibration_data/coco'
cal_img_num = 100  # desired number of calibration images
dst_root = '/open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/04_detection/02_yolov3_darknet53/mapper/calibration_data'


## 1. Take 100 images from the source folder as calibration data
num_count = 0
img_names = []
for src_name in sorted(os.listdir(src_root)):
    if num_count >= cal_img_num:
        break
    img_names.append(src_name)
    num_count += 1

# Create the destination folder if it does not exist yet
if not os.path.exists(dst_root):
    os.system('mkdir {0}'.format(dst_root))

## 2 Convert every image
# Based on the related code under /open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/01_common/python/data/ in the OE package
# That conversion code is well written and quite smart, but since it is not an official python package, I decided to rewrite it in a simpler way

## 2.1 Define the image resize function, which returns np.float32
# The image is scaled to the target size (W, H)
# Note that the aspect ratio is preserved during scaling; blank areas are filled with pad_value, 127 by default
def imequalresize(img, target_size, pad_value=127.):
    target_w, target_h = target_size
    image_h, image_w = img.shape[:2]
    img_channel = 3 if len(img.shape) > 2 else 1

    # Determine the scale factor and the final size
    scale = min(target_w * 1.0 / image_w, target_h * 1.0 / image_h)
    new_h, new_w = int(scale * image_h), int(scale * image_w)

    resize_image = cv2.resize(img, (new_w, new_h))

    # Prepare the image to be returned
    pad_image = np.full(shape=[target_h, target_w, img_channel], fill_value=pad_value)

    # Place resize_image at the center of pad_image
    dw, dh = (target_w - new_w) // 2, (target_h - new_h) // 2
    pad_image[dh:new_h + dh, dw:new_w + dw, :] = resize_image

    return pad_image

## 2.2 Start converting
for each_imgname in img_names:
    img_path = os.path.join(src_root, each_imgname)

    img = cv2.imread(img_path)  # BGR, HWC
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # RGB, HWC
    img = imequalresize(img, (416, 416))
    img = np.transpose(img, (2, 0, 1))  # RGB, CHW

    # Save the image to the destination folder
    dst_path = os.path.join(dst_root, each_imgname + '.rgbchw')
    print("write:%s" % dst_path)
    # Images are loaded as uint8 by default, but without this astype the model conversion fails:
    # during conversion the data is loaded back as float64, and I am not sure how it is loaded internally.
    img.astype(np.uint8).tofile(dst_path)

print('finish')
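The script can then be run directly inside the Docker container; since the paths are absolute, the working directory does not matter:

python3 prepare_calibration_data.py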

(2) Principle Interpretation: Conversion Configuration

The core of model conversion lies in the yaml file that configures the target. The official package also provides a yolov3_darknet53_config.yaml that users can try directly; every parameter is commented, and you can feel the developers' sincerity. However, the configuration file has too many parameters, and if you want to change something it is hard to know where to start.

The purpose of this tutorial is to get you started quickly, so I will not explain the meaning of every parameter for now and will just use the defaults. The template below compresses the 30+ configurable parameters down to 9, which makes it easy to configure simple models quickly. The yaml template applies to models with the following properties:

  • No custom layers, in other words, BPU supports all layers of the model;
  • There is only one input node, and the input is an image.

First copy this template to the root directory of the code, name it "yolov3_simple.yaml", and then configure specific parameters according to the following mind map.

model_parameters:
  # [parameters to configure], see the "model parameter group" part of the mind map
  prototxt: '***.prototxt'
  caffe_model: '****.caffemodel'
  onnx_model: '****.onnx'
  output_model_file_prefix: 'mobilenetv1'

  # default parameter, no need to understand it for now
  march: 'bernoulli2'

input_parameters:
  # [parameters to configure], see "input parameter group / original model parameters" in the mind map
  input_type_train: 'bgr'
  input_layout_train: 'NCHW'

  # [parameters to configure], see "input parameter group / converted model parameters" in the mind map
  input_type_rt: 'yuv444'

  # [parameters to configure], see "input parameter group / input data preprocessing" in the mind map
  norm_type: 'data_mean_and_scale'
  mean_value: '103.94 116.78 123.68'
  scale_value: '0.017'

  # default parameter, no need to understand it for now
  input_layout_rt: 'NHWC'


# calibration parameter group, all defaults
calibration_parameters:
  cal_data_dir: './calibration_data'
  calibration_type: 'max'
  max_percentile: 0.9999

# compiler parameter group, all defaults
compiler_parameters:
  compile_mode: 'latency'
  optimize_level: 'O3'
  debug: False # the official docs mark this as optional, but omitting it caused an error for me

The mind map is shown below. With this picture, please follow me to configure it step by step patiently. Only 9 configurations are required.

Model parameter group (model_parameters) configuration:

output_model_file_prefix: give the converted model a name, here 'yolov3_selfyaml'; note that the string must be wrapped in single quotes;

prototxt: Caffe's prototxt, here '../../../01_common/model_zoo/mapper/detection/yolov3_darknet53/yolov3_transposed.prototxt';

caffe_model: the Caffe model file, here '../../../01_common/model_zoo/mapper/detection/yolov3_darknet53/yolov3.caffemodel';

onnx_model: delete it, because we are using Caffe.

Input parameter group (input_parameters) configuration:

input_type_train: the input data format of the original floating-point model; multiple image formats are supported. Here it is set to 'rgb' (this is why BGR was converted to RGB when preparing the calibration data earlier);

input_layout_train: as seen from the prototxt above, the data layout is 'NCHW' (which is why the image data was transposed from HWC to CHW during calibration);

input_type_rt: the image format we expect the converted model to take as input. The image format used when deploying a model can differ from the one used for training; NV12 is the raw data format returned by some cameras, so as an experiment it is set to 'nv12' here;

norm_type: a network never takes raw image values as input; some normalization is always applied. The normalization used by this model corresponds to inpBlob = cv2.dnn.blobFromImage(frame, 1.0 / 255, (inWidth, inHeight), (0, 0, 0), swapRB=False, crop=False): there is no mean subtraction, only a scale factor. Therefore this attribute is set to 'data_scale';

mean_value: delete it, because the network has no mean term;

scale_value: the scale is 1.0 / 255, so it is set to 0.003921568627451.

Ultimately, our yaml file content looks like this:

model_parameters:
  prototxt: '../../../01_common/model_zoo/mapper/detection/yolov3_darknet53/yolov3_transposed.prototxt'
  caffe_model: '../../../01_common/model_zoo/mapper/detection/yolov3_darknet53/yolov3.caffemodel'
  output_model_file_prefix: 'yolov3_selfyaml'
  march: 'bernoulli2'
input_parameters:
  input_type_train: 'rgb'
  input_layout_train: 'NCHW'
  input_type_rt: 'nv12'
  norm_type: 'data_scale'
  scale_value: 0.003921568627451
  input_layout_rt: 'NHWC'
calibration_parameters:
  cal_data_dir: './calibration_data'
  calibration_type: 'max'
  max_percentile: 0.9999
compiler_parameters:
  compile_mode: 'latency'
  optimize_level: 'O3'
  debug: False

After that, use the calibration data prepared by us and the configured lightweight yaml for model conversion, and enter the command hb_mapper makertbin --config yolov3_simple.yaml --model-type caffe in the console.

model inference

In the official demo, 04_inference.sh directly calls the compiled model for inference, but I think that approach does not teach you how to deploy your own model later. So after reading the official inference demo, I wrote a complete inference flow myself. Model inference can be divided into the following three steps:

① Preprocess the data to generate the input required for inference;

② Run the model on the processed data and obtain the output;

③ Convert the output into the final result, i.e. the post-processing step.

(PS. The test image path used is /open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/01_common/test_data/det_images/kite.jpg)

From the previous section, the model conversion produces three key files:

yolov3_selfyaml_original_float_model.onnx: the model before quantization;

yolov3_selfyaml_quantized_model.onnx: the model after quantization;

yolov3_selfyaml.bin: the model file used for inference on the BPU; its output is consistent with that of yolov3_selfyaml_quantized_model.onnx.

Below is the code for running inference on one image. I encapsulated the image format conversion and the YOLO post-processing details in a small package (bputools); the related code has been uploaded to the community for reference.
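For reference, here is a possible implementation of the two format helpers used below (a sketch only; the versions shipped in the community bputools package may differ, imequalresize is the same function defined in the calibration script above, and the second argument of nv122yuv444 is assumed to be (width, height), matching the call in the script):

import cv2
import numpy as np

def bgr2nv12_opencv(image):
    # BGR (HWC) -> NV12: full-resolution Y plane followed by interleaved, subsampled UV
    height, width = image.shape[:2]
    yuv420p = cv2.cvtColor(image, cv2.COLOR_BGR2YUV_I420).reshape((height * width * 3 // 2,))
    y = yuv420p[:height * width]
    uv_planar = yuv420p[height * width:].reshape((2, height * width // 4))
    uv_packed = uv_planar.transpose((1, 0)).reshape((height * width // 2,))
    nv12 = np.zeros_like(yuv420p)
    nv12[:height * width] = y
    nv12[height * width:] = uv_packed
    return nv12

def nv122yuv444(nv12, target_size):
    # NV12 -> YUV444 (HWC): repeat every U/V sample over its 2x2 block and stack with Y
    width, height = target_size
    y = nv12[:height * width].reshape((height, width, 1))
    uv = nv12[height * width:].reshape((height // 2, width // 2, 2))
    uv444 = uv.repeat(2, axis=0).repeat(2, axis=1)
    return np.concatenate([y, uv444], axis=2)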

The following is the full code of inference_model.py, with comments at each key step:

From the code we can see that the model outputs three layers, with shapes (1, 13, 13, 255), (1, 26, 26, 255) and (1, 52, 52, 255). With the figure below, it is easy to map each of them to the corresponding layer of the network.

import numpy as np
import cv2
import os
from horizon_tc_ui import HB_ONNXRuntime
from bputools.format_convert import imequalresize, bgr2nv12_opencv, nv122yuv444
from bputools.yolo_postproc import modelout2predbbox, recover_boxes, nms, draw_bboxs

modelpath_prefix = '/open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample'

# img_path: full path of the test image
img_path = os.path.join(modelpath_prefix, '01_common/test_data/det_images/kite.jpg')
# model_path: full path of the quantized model
model_root = os.path.join(modelpath_prefix, '04_detection/02_yolov3_darknet53/mapper/model_output')
model_path = os.path.join(model_root, 'yolov3_selfyaml_quantized_model.onnx')

# 1. Load the model and get the required input height and width
sess = HB_ONNXRuntime(model_file=model_path)
sess.set_dim_param(0, 0, '?')
model_h, model_w = sess.get_hw()

# 2 Load the image; as configured above, the converted model takes NV12 as input,
# but for verification inside OE the image has to be further converted from NV12 to YUV444
imgOri = cv2.imread(img_path)
img = imequalresize(imgOri, (model_w, model_h))
nv12 = bgr2nv12_opencv(img)
yuv444 = nv122yuv444(nv12, [model_w, model_h])

# 3 Model inference
input_name = sess.input_names[0]
output_name = sess.output_names
output = sess.run(output_name, {input_name: np.array([yuv444])}, input_offset=128)
print(output_name)
print(output[0].shape, output[1].shape, output[2].shape)
# ['layer82-conv-transposed', 'layer94-conv-transposed', 'layer106-conv-transposed']
# (1, 13, 13, 255) (1, 26, 26, 255) (1, 52, 52, 255)

# 4 Post-processing of the detection results
# Recover the bounding boxes in the 416x416 input space from the raw output
pred_bbox = modelout2predbbox(output)
# Map the boxes back to the original resolution
bboxes = recover_boxes(pred_bbox, (imgOri.shape[0], imgOri.shape[1]),
                       input_shape=(model_h, model_w), score_threshold=0.3)
# Apply non-maximum suppression; the remaining boxes are the final detections
nms_bboxes = nms(bboxes, 0.45)
print("detected item num: {0}".format(len(nms_bboxes)))

# Draw the detection boxes
draw_bboxs(imgOri, nms_bboxes)
cv2.imwrite('detected.png', imgOri)

Running on the board

We copy the files shown in the figure below to the Sunrise X3 development board (for example by dragging them over in a file-transfer tool); note that inference_model_bpu.py differs slightly from the Docker version.
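If you prefer the command line to drag-and-drop, scp works as well; a sketch, assuming the board is reachable at the IP used earlier and logs in as the sunrise user (adjust the user name, IP and file list to your own setup):

scp -r yolov3_selfyaml.bin inference_model_bpu.py bputools COCO_val2014_000000181265.jpg sunrise@192.168.0.104:/home/sunrise/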

Note that some packages need to be installed before running: sudo pip3 install EasyDict pycocotools. Remember the sudo, so the packages are not installed only under the user directory; sudo is also required when running the BPU model.

The source code of inference_model_bpu.py is shown below. Unlike in Docker, NV12 does not need to be converted to YUV444; the model is run slightly differently, and the post-processing is almost unchanged.

import numpy as np
import cv2
import os
from hobot_dnn import pyeasy_dnn as dnn
from bputools.format_convert import imequalresize, bgr2nv12_opencv, nv122yuv444
from bputools.yolo_postproc import modelout2predbbox, recover_boxes, nms, draw_bboxs

def get_hw(pro):
    if pro.layout == "NCHW":
        return pro.shape[2], pro.shape[3]
    else:
        return pro.shape[1], pro.shape[2]

modelpath_prefix = ''

# img_path: full path of the test image
img_path = 'COCO_val2014_000000181265.jpg'
# model_path: full path of the quantized model
model_path = 'yolov3_selfyaml.bin'

# 1. Load the model and get the required input height and width
models = dnn.load(model_path)
model_h, model_w = get_hw(models[0].inputs[0].properties)

# 2 Load the image; as configured above, the converted model takes NV12 as input
# (on the board there is no need to convert NV12 further to YUV444)
imgOri = cv2.imread(img_path)
img = imequalresize(imgOri, (model_w, model_h))
nv12 = bgr2nv12_opencv(img)

# 3 Model inference
t1 = cv2.getTickCount()
outputs = models[0].forward(nv12)
t2 = cv2.getTickCount()
outputs = (outputs[0].buffer, outputs[1].buffer, outputs[2].buffer)
print(outputs[0].shape, outputs[1].shape, outputs[2].shape)
# (1, 13, 13, 255) (1, 26, 26, 255) (1, 52, 52, 255)
print('time consumption {0} ms'.format((t2-t1)*1000/cv2.getTickFrequency()))

# 4 Post-processing of the detection results
# Recover the bounding boxes in the 416x416 input space from the raw output
pred_bbox = modelout2predbbox(outputs)
# Map the boxes back to the original resolution
bboxes = recover_boxes(pred_bbox, (imgOri.shape[0], imgOri.shape[1]),
                       input_shape=(model_h, model_w), score_threshold=0.3)
# Apply non-maximum suppression; the remaining boxes are the final detections
nms_bboxes = nms(bboxes, 0.45)
print("detected item num: {0}".format(len(nms_bboxes)))

# Draw the detection boxes
draw_bboxs(imgOri, nms_bboxes)
cv2.imwrite('detected.png', imgOri)

Hand Keypoint Detection Network

Hand keypoint detection is a key step in gesture recognition. The code is based on Caffe and has no custom layers, so, as an introduction, I will use it to lead you through the BPU workflow.

model preparation

In the installation preparation, we mounted a directory with -v "D:\05 - 项目\01 - 旭日x3派\BPUCodes":/data/horizon_x3/codes. After downloading the code, arrange the relevant files as follows; you will then find these files inside Docker as well.

model validation

Before verification, switch the Docker working directory to the model root: cd /data/horizon_x3/codes/HandKeypointDetection/hand/. Model verification again requires hb_mapper checker followed by a set of parameters; let's configure them:

--model-type: Our models are Caffe, so fill in caffe

--march: the Sunrise X3 Pi only accepts bernoulli2

--proto: Fill in the prototxt file name, namely pose_deploy.prototxt

--model: Fill in the caffemodel file name, namely pose_iter_102000.caffemodel

--input-shape: open the prototxt file and search for the input attribute; you will find that the model has only one input, the input layer is named image, and the input image dimension is 1x3x368x368, so this parameter is written as image 1x3x368x368.

 

To sum up, the following commands need to be entered in docker to complete the model verification process:

hb_mapper checker \
--model-type caffe \
--march bernoulli2 \
--proto pose_deploy.prototxt \
--model pose_iter_102000.caffemodel \
--input-shape image 1x3x368x368

The output is as follows; you can see the status of the whole check and whether each node runs on the BPU or the CPU. The full console output is stored by default in hb_mapper_checker.log in the working directory.

 

model conversion

Different from the previous process, the yaml file is first configured here, and then the calibration data is prepared.

(1) Configure the yaml file

Model parameter group (model_parameters) configuration:

output_model_file_prefix: give the converted model a name, here 'handkpdet' (hand keypoint detection);

prototxt: Caffe's prototxt, here 'pose_deploy.prototxt';

caffe_model: the Caffe model file, here 'pose_iter_102000.caffemodel';

onnx_model: delete it, because we are using Caffe.

Input parameter group (input_parameters) configuration:

input_type_train: the input data format of the original floating-point model; multiple image formats are supported. Our model takes a color image, and since OpenCV loads images as BGR by default, it is set to 'bgr' here;

input_layout_train: as seen from the prototxt above, the data layout is 'NCHW';

input_type_rt: the image format we expect the converted model to take as input. The image format used when deploying a model can differ from the one used for training; NV12 is the raw data format returned by some cameras, but since our test still uses local images, it is kept as 'bgr' here;

norm_type: a network never takes raw image values as input; some normalization is always applied. The normalization used by this model corresponds to inpBlob = cv2.dnn.blobFromImage(frame, 1.0 / 255, (inWidth, inHeight), (0, 0, 0), swapRB=False, crop=False): there is no mean subtraction, only a scale factor. Therefore this attribute is set to 'data_scale';

mean_value: delete it, because the network has no mean term;

scale_value: the scale is 1.0 / 255, so it is set to '0.0039'.

Finally, the content of our yaml file handpoint.yaml is:

model_parameters:
  prototxt: 'pose_deploy.prototxt'
  caffe_model: 'pose_iter_102000.caffemodel'
  output_model_file_prefix: 'handkpdet'
  march: 'bernoulli2'
input_parameters:
  input_type_train: 'bgr'
  input_layout_train: 'NCHW'
  input_type_rt: 'bgr'
  norm_type: 'data_scale'
  scale_value: '0.0039'
  input_layout_rt: 'NHWC'
calibration_parameters:
  cal_data_dir: './calibration_data'
  calibration_type: 'max'
  max_percentile: 0.9999
compiler_parameters:
  compile_mode: 'latency'
  optimize_level: 'O3'
  debug: False

(2) Prepare calibration data

Since this model also has only one input, the code for preparing the calibration data can reuse the script from the previous section. Only two things need to change: the source data path, and the color conversion (the BGR-to-RGB step is removed). The dataset used is FreiHAND_pub_v2_eval.zip; a sketch of the adapted script follows.
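A minimal sketch of the adapted script, assuming the dataset is unpacked under the mounted codes directory and using the prototxt's 368x368 input size (for brevity a plain cv2.resize replaces the aspect-preserving imequalresize from the previous section; either works for calibration):

# prepare_hand_calibration_data.py -- a sketch under the assumptions stated above
import os
import cv2
import numpy as np

src_root = '/data/horizon_x3/codes/HandKeypointDetection/hand/FreiHAND_pub_v2_eval/evaluation/rgb'
dst_root = '/data/horizon_x3/codes/HandKeypointDetection/hand/calibration_data'
cal_img_num = 100  # number of calibration images

os.makedirs(dst_root, exist_ok=True)
for each_imgname in sorted(os.listdir(src_root))[:cal_img_num]:
    img = cv2.imread(os.path.join(src_root, each_imgname))  # BGR, HWC -- no BGR-to-RGB here
    img = cv2.resize(img, (368, 368))      # prototxt input is 1x3x368x368
    img = np.transpose(img, (2, 0, 1))     # BGR, CHW, matching input_layout_train 'NCHW'
    img.astype(np.uint8).tofile(os.path.join(dst_root, each_imgname + '.bgrchw'))

print('finish')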

 

Inside Docker, the calibration data looks like the figure below, 100 files in total.

(3) Start converting

The data is ready; enter the command hb_mapper makertbin --config handpoint.yaml --model-type caffe to start converting our model! After waiting for a while, the conversion succeeds. From the results we can see that the loss of the model is not very high!! I feel like this is going to work, (☆▽☆).

model inference

Since this model is similar to the previous one and also takes an image as input, only two points need to be covered:

  • Image preprocessing. The yaml above states that the quantized model takes BGR input in NHWC layout; since OpenCV loads images as BGR in HWC order by default, all that is needed is a resize to the model's input size.
  • Image post-processing. Post-processing generally has little to do with the inference platform; every deployment pipeline includes this step.

The full code for inference in docker looks like this:

import numpy as np
import cv2
import os
from horizon_tc_ui import HB_ONNXRuntime
import copy

# img_path: full path of the test image
img_path = '/data/horizon_x3/codes/HandKeypointDetection/hand/FreiHAND_pub_v2_eval/evaluation/rgb/00000253.jpg'
# model_path: full path of the quantized model
model_path = '/data/horizon_x3/codes/HandKeypointDetection/hand/model_output/handkpdet_quantized_model.onnx'

# 1. Load the model and get the required input height and width
sess = HB_ONNXRuntime(model_file=model_path)
sess.set_dim_param(0, 0, '?')
model_h, model_w = sess.get_hw()

# 2 Load the image; per the yaml above, the quantized model takes BGR NHWC input
imgOri = cv2.imread(img_path)
img = cv2.resize(imgOri, (model_w, model_h))

# 3 Model inference
input_name = sess.input_names[0]
output_name = sess.output_names
output = sess.run(output_name, {input_name: np.array([img])}, input_offset=128)
print(output_name)
print(output[0].shape)
# ['net_output']
# (1, 22, 46, 46)

# 4 Post-processing of the detection results
# Draw the keypoints
nPoints = 22
threshold = 0.1
POSE_PAIRS = [[0, 1], [1, 2], [2, 3], [3, 4], [0, 5], [5, 6], [6, 7], [7, 8], [0, 9], [9, 10], [10, 11], [11, 12],
              [0, 13], [13, 14], [14, 15], [15, 16], [0, 17], [17, 18], [18, 19], [19, 20]]

imgh, imgw = imgOri.shape[:2]
points = []
imgkp = copy.deepcopy(imgOri)
for i in range(nPoints):
    probMap = output[0][0, i, :, :]
    probMap = cv2.resize(probMap, (imgw, imgh))
    minVal, prob, minLoc, point = cv2.minMaxLoc(probMap)

    if prob > threshold:
        cv2.circle(imgkp, (int(point[0]), int(point[1])), 8, (0, 255, 255), thickness=-1, lineType=cv2.FILLED)
        cv2.putText(imgkp, "{}".format(i), (int(point[0]), int(point[1])), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2,
                    lineType=cv2.LINE_AA)
        points.append((int(point[0]), int(point[1])))
    else:
        points.append(None)

# Draw the skeleton
imgskeleton = copy.deepcopy(imgOri)
for pair in POSE_PAIRS:
    partA = pair[0]
    partB = pair[1]
    if points[partA] and points[partB]:
        cv2.line(imgskeleton, points[partA], points[partB], (0, 255, 255), 2)
        cv2.circle(imgskeleton, points[partA], 8, (0, 0, 255), thickness=-1, lineType=cv2.FILLED)
        cv2.circle(imgskeleton, points[partB], 8, (0, 0, 255), thickness=-1, lineType=cv2.FILLED)
# Save the keypoint and skeleton images
cv2.imwrite('handkeypoint.png', imgkp)
cv2.imwrite('imgskeleton.png', imgskeleton)

Running on the board

The program that runs on the development board differs little from the inference code above; just pay attention to the model's input data format. Note that the outputs here differ from those in Docker: the conversion output = (outputs[0].buffer,) makes them directly compatible with the post-processing code, which then produces the result images.

import numpy as np
import cv2
import os
from hobot_dnn import pyeasy_dnn as dnn
import copy

def get_hw(pro):
    if pro.layout == "NCHW":
        return pro.shape[2], pro.shape[3]
    else:
        return pro.shape[1], pro.shape[2]

# img_path: full path of the test image
img_path = '20220806023323.jpg'
# model_path: full path of the quantized model
model_path = 'handkpdet.bin'

# 1. Load the model and get the required input height and width
models = dnn.load(model_path)
model_h, model_w = get_hw(models[0].inputs[0].properties)

# 2 Load the image; per the yaml above, the quantized model takes BGR NHWC input
imgOri = cv2.imread(img_path)
img = cv2.resize(imgOri, (model_w, model_h))

# 3 Model inference
t1 = cv2.getTickCount()
outputs = models[0].forward(img)
t2 = cv2.getTickCount()
output = (outputs[0].buffer,)
print(outputs[0].buffer.shape)
# (1, 22, 46, 46)
print('time consumption {0} ms'.format((t2-t1)*1000/cv2.getTickFrequency()))


# 4 Post-processing of the detection results
# Draw the keypoints
nPoints = 22
threshold = 0.1
POSE_PAIRS = [[0, 1], [1, 2], [2, 3], [3, 4], [0, 5], [5, 6], [6, 7], [7, 8], [0, 9], [9, 10], [10, 11], [11, 12],
              [0, 13], [13, 14], [14, 15], [15, 16], [0, 17], [17, 18], [18, 19], [19, 20]]

imgh, imgw = imgOri.shape[:2]
points = []
imgkp = copy.deepcopy(imgOri)
for i in range(nPoints):
    probMap = output[0][0, i, :, :]
    probMap = cv2.resize(probMap, (imgw, imgh))
    minVal, prob, minLoc, point = cv2.minMaxLoc(probMap)

    if prob > threshold:
        cv2.circle(imgkp, (int(point[0]), int(point[1])), 8, (0, 255, 255), thickness=-1, lineType=cv2.FILLED)
        cv2.putText(imgkp, "{}".format(i), (int(point[0]), int(point[1])), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2,
                    lineType=cv2.LINE_AA)
        points.append((int(point[0]), int(point[1])))
    else:
        points.append(None)

# Draw the skeleton
imgskeleton = copy.deepcopy(imgOri)
for pair in POSE_PAIRS:
    partA = pair[0]
    partB = pair[1]
    if points[partA] and points[partB]:
        cv2.line(imgskeleton, points[partA], points[partB], (0, 255, 255), 2)
        cv2.circle(imgskeleton, points[partA], 8, (0, 0, 255), thickness=-1, lineType=cv2.FILLED)
        cv2.circle(imgskeleton, points[partB], 8, (0, 0, 255), thickness=-1, lineType=cv2.FILLED)
# Save the keypoint and skeleton images
cv2.imwrite('handkeypoint.png', imgkp)
cv2.imwrite('imgskeleton.png', imgskeleton)

I took two pictures myself for testing. The first row was taken at night, and the fingers look a bit off, haha. The overall detection takes about 480 ms. The network is not as deep as YOLO; perhaps it has more horizontal features.

 

Original author: Xiao Xixi
Original link: this article is reposted from the Horizon Developer Community (see the community post for detailed documentation and code)


Origin blog.csdn.net/xuguosheng1992/article/details/128133847