1. Introduction
I have recently tried several algorithms on the x3, but not yet a multi-target tracking algorithm. Over the past few days I chose yolov5-deepsort for a porting test, and while porting it I also reviewed how deepsort is implemented.
yolov5-deepsort address: yolov5-deepsort
Test code for this article: https://github.com/Rex-LK/ai_arm_learning/tree/master/x3/yolov5-deepsort-x3
Complete test code, video, and bin models (Baidu cloud): https://pan.baidu.com/s/19rabju72cYRZNwNCBpjIXQ?pwd=iyg8 Extraction code: iyg8
Anyone studying CV, or interested in learning it, is welcome to join the group to discuss and learn together. WeChat: Rex1586662742, QQ group: 468713665
2. Original code test
2.1 Test in torch environment
Before running on the x3, first make sure the code runs correctly in the pytorch environment. After cloning the repository, change the video path in demo.py and then run:
python demo.py
Download the yolov5 model and the deepsort model referenced in the repository. If you run the demo directly, then apart from missing-library problems, the following two errors are reported. Modify the code at the corresponding places.
- Problem 1
Change it to:
- Problem 2
Modify /home/rex/miniconda3/lib/python3.9/site-packages/torch/nn/modules/upsampling.py
2.2 Export onnx
Two onnx models need to be exported: the yolov5 detection model and the feature extraction model used by deepsort. Add the export code in the following two places.
- To export the yolov5s onnx model, modify the init_model function in AIDetector_pytorch.py:
def init_model(self):
    self.weights = 'weights/yolov5s.pt'
    self.device = 'cpu'
    self.device = select_device(self.device)
    model = attempt_load(self.weights, map_location=self.device)
    model.to(self.device).eval()
    model.float()
    # torch.save(model, 'test.pt')
    # Export the detector to onnx with a fixed 1x3x384x640 (N, C, H, W) input
    dummy = torch.zeros(1, 3, 384, 640).float()
    torch.onnx.export(
        model, (dummy,),
        "deepsort_yolov5.onnx",
        input_names=["image"],
        output_names=["output"],
        opset_version=11
    )
    self.m = model
    self.names = model.module.names if hasattr(
        model, 'module') else model.names

def preprocess(self, img):
    # ... (other preprocessing steps unchanged)
    # The model runs on the CPU, which does not support half()
    img = img.float()  # float32 instead of half precision
To export the deepsort onnx model, modify the __init__ and __call__ functions of Extractor in deep_sort/deep_sort/deep/feature_extractor.py:
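def __init__(self, model_path, use_cuda=True):
    # ... (rest of the constructor unchanged)
    self.device = "cpu"

def __call__(self, im_crops):
    im_batch = self._preprocess(im_crops)
    with torch.no_grad():
        im_batch = im_batch.to(self.device)
        features = self.net(im_batch)
        # Export the feature extractor. This runs on every call,
        # so remove it again once the onnx file has been written.
        torch.onnx.export(
            self.net, (im_batch,),
            "deepsort_feature.onnx",
            input_names=["image"],
            output_names=["output"],
            opset_version=11
        )
    return features.cpu().numpy()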
After making these changes, run python demo.py, and two onnx models are generated in the root directory. After exporting, it is recommended to simplify them with onnxsim, which can avoid some errors later on.
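The onnxsim step can be sketched roughly as follows. This is only a minimal sketch: simplified_path and simplify_onnx are hypothetical helper names, the "_sim" suffix is an arbitrary choice, and it assumes the onnx and onnxsim packages are installed.

```python
from pathlib import Path


def simplified_path(onnx_path: str) -> str:
    """Build the output name for the simplified model (a.onnx -> a_sim.onnx)."""
    p = Path(onnx_path)
    return str(p.with_name(p.stem + "_sim" + p.suffix))


def simplify_onnx(onnx_path: str) -> str:
    """Simplify one exported model with onnxsim and save it next to the original."""
    # Imported lazily so that simplified_path works without onnx/onnxsim installed.
    import onnx
    from onnxsim import simplify

    model = onnx.load(onnx_path)
    # simplify returns the simplified model and a flag telling whether
    # the simplified model passed onnxsim's own consistency check.
    model_sim, ok = simplify(model)
    if not ok:
        raise RuntimeError(f"onnxsim could not validate {onnx_path}")
    out_path = simplified_path(onnx_path)
    onnx.save(model_sim, out_path)
    return out_path
```

Calling simplify_onnx("deepsort_yolov5.onnx") and simplify_onnx("deepsort_feature.onnx") would then write the two simplified files next to the originals. The file names passed to the quantization step would have to be adjusted to match.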
3. Model quantization
3.1 yolov5s quantization
For the yolov5s quantization process, please refer to the previous article: https://developer.horizon.ai/forumDetail/118363914936418940
3.2 deepsort model quantization
Because the deepsort feature extraction model takes an input of h=128, w=64, first change target_size to (128, 64) in preprocess.py, then run sh 02_preprocess.sh, and finally run 03_build.sh to quantize the model. The configuration file for the deepsort feature extraction model is as follows:
model_parameters:
  onnx_model: 'deepsort_feature.onnx'
  output_model_file_prefix: 'deepsort_feature'
  march: 'bernoulli2'
input_parameters:
  input_type_train: 'rgb'
  input_layout_train: 'NCHW'
  input_type_rt: 'nv12'
  norm_type: 'data_mean_and_scale'
  mean_value: '123.675 116.28 103.53'
  scale_value: '58.395 57.12 57.375'
  input_layout_rt: 'NHWC'
calibration_parameters:
  cal_data_dir: './calibration_data_rgb_f32'
  calibration_type: 'max'
  max_percentile: 0.9999
compiler_parameters:
  compile_mode: 'latency'
  optimize_level: 'O3'
  debug: False
  core_num: 2
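The mean_value and scale_value numbers in the configuration above correspond to the usual ImageNet mean and standard deviation rescaled from [0, 1] to [0, 255]. The sketch below shows that arithmetic. It assumes the original deepsort extractor normalizes with these statistics, and the helper names are hypothetical. How the toolchain applies scale_value is also an assumption here: if it computes (data - mean) * scale_value rather than dividing by it, the reciprocal values printed on the last line would be the ones to use, so it is worth checking the toolchain documentation for the version in use.

```python
# ImageNet-style statistics on pixels scaled to [0, 1] (assumed to be what
# the original deepsort extractor uses; they match mean_value above).
MEAN_01 = (0.485, 0.456, 0.406)
STD_01 = (0.229, 0.224, 0.225)


def to_pixel_range(values, factor=255.0):
    """Rescale statistics from the [0, 1] range to the [0, 255] pixel range."""
    return [round(v * factor, 3) for v in values]


def reciprocals(values):
    """Per-channel 1/value, needed if normalization multiplies by the scale."""
    return [1.0 / v for v in values]


mean_value = to_pixel_range(MEAN_01)
std_value = to_pixel_range(STD_01)
scale_if_multiplied = reciprocals(std_value)

print("mean (0-255):", " ".join(str(v) for v in mean_value))
print("std (0-255): ", " ".join(str(v) for v in std_value))
print("1/std:       ", " ".join(f"{v:.6f}" for v in scale_if_multiplied))
```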
4. Run yolov5-deepsort on x3
With the two bin files ready, the video test can be run on the x3. Some of the code for yolo target detection and deepsort feature extraction is shown briefly below. The target detection model is in AIDetector_x3.py:
# yolov5 target detection
class Detector(baseDet):
    def __init__(self):
        super(Detector, self).__init__()
        self.init_model()
        self.build_config()

    def init_model(self):
        # Load the quantized model
        self.m = dnn.load('deepsort_yolov5.bin')

    def preprocess(self, img):
        img0 = img.copy()
        img = cv2.resize(img, (640, 384))
        nv12_data = bgr2nv12_opencv(img)
        return img0, nv12_data

    def detect(self, im):
        im0, nv12_data = self.preprocess(im)
        outputs = self.m[0].forward(nv12_data)
        pred = outputs[0].buffer
        pred = pred[:, :, :, 0]
        pred = torch.from_numpy(pred)
        # nms
        pred = non_max_suppression(pred, self.threshold, 0.4)
        pred_boxes = []
        # The model input here is h=384, w=640
        for det in pred:
            if det is not None and len(det):
                det[:, :4] = scale_coords(
                    (384, 640), det[:, :4], im0.shape).round()
                for *x, conf, cls_id in det:
                    lbl = self.names[int(cls_id)]
                    # Keep only the required classes
                    if lbl not in ['person', 'car', 'truck']:
                        continue
                    x1, y1 = int(x[0]), int(x[1])
                    x2, y2 = int(x[2]), int(x[3])
                    pred_boxes.append(
                        (x1, y1, x2, y2, lbl, conf))
        return im, pred_boxes
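One detail in the detector above may be worth checking. preprocess stretches the frame with a plain cv2.resize, while scale_coords in yolov5 is written for letterboxed input (a single uniform gain plus padding). For a frame whose aspect ratio differs from 640:384, the two do not match exactly and the boxes can be shifted by a few pixels. A minimal sketch of mapping boxes back under the plain-resize assumption is shown below. scale_boxes_plain_resize is a hypothetical helper, not part of the repository.

```python
def scale_boxes_plain_resize(boxes, model_hw, orig_hw):
    """Map (x1, y1, x2, y2) boxes from the model input back to the original frame.

    Assumes the frame was stretched with a plain resize (no letterbox padding),
    so x and y each get their own scale factor.
    """
    model_h, model_w = model_hw
    orig_h, orig_w = orig_hw
    sx = orig_w / model_w
    sy = orig_h / model_h
    scaled = []
    for x1, y1, x2, y2 in boxes:
        # Scale each coordinate, then clip it to the frame boundaries.
        nx1 = min(max(x1 * sx, 0), orig_w)
        ny1 = min(max(y1 * sy, 0), orig_h)
        nx2 = min(max(x2 * sx, 0), orig_w)
        ny2 = min(max(y2 * sy, 0), orig_h)
        scaled.append((round(nx1), round(ny1), round(nx2), round(ny2)))
    return scaled


if __name__ == "__main__":
    # A box covering the lower-right quarter of a 384x640 input,
    # mapped back to a 1080x1920 frame.
    print(scale_boxes_plain_resize([(320, 192, 640, 384)], (384, 640), (1080, 1920)))
```

The alternative is to keep scale_coords and letterbox the frame in preprocess instead, which preserves the aspect ratio the model was trained with.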
After yolov5 has detected the boxes of the target classes, the deepsort feature extraction model can be combined with a Kalman filter to track the targets. The feature extraction code is in deep_sort/deep_sort/deep/feature_extractor.py:
class Extractor(object):
    def __init__(self, model_path, use_cuda=True):
        self.net = dnn.load('deepsort_feature.bin')
        self.size = (64, 128)  # (w, h) expected by cv2.resize

    def _preprocess(self, im_crops):
        def _resize(im, size):
            return cv2.resize(im, size)
        ims = [_resize(im, self.size) for im in im_crops]
        return ims

    def __call__(self, im_crops):
        '''
        The original code runs inference with a dynamic batch. The x3 currently
        supports only a fixed batch, so inference is done here with batch=1.
        '''
        im_batch = self._preprocess(im_crops)
        # One 512-dimensional feature vector per crop
        sample = np.empty([len(im_batch), 512])
        for i, im in enumerate(im_batch):
            nv12_data = bgr2nv12_opencv(im)
            outputs = self.net[0].forward(nv12_data)
            pred = outputs[0].buffer
            pred = pred[:, :, 0, 0]
            sample[i, :] = pred
        return sample
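Both classes above call bgr2nv12_opencv, which is not shown in this article. The sketch below illustrates what such a helper typically does: convert BGR to planar I420 with OpenCV, then interleave the U and V planes to obtain NV12. The function names are hypothetical stand-ins and the repository's own helper may differ in detail.

```python
def i420_to_nv12(i420: bytes, width: int, height: int) -> bytes:
    """Repack planar I420 (Y, then U, then V) into NV12 (Y, then interleaved UV)."""
    if width % 2 or height % 2:
        raise ValueError("width and height must be even for 4:2:0 data")
    area = width * height
    if len(i420) != area * 3 // 2:
        raise ValueError("unexpected buffer size for I420 data")
    y = i420[:area]
    u = i420[area:area + area // 4]
    v = i420[area + area // 4:]
    uv = bytearray(area // 2)
    uv[0::2] = u  # even positions hold the U samples
    uv[1::2] = v  # odd positions hold the V samples
    return y + bytes(uv)


def bgr_to_nv12(image):
    """Hypothetical stand-in for bgr2nv12_opencv: BGR image -> flat NV12 array."""
    # Imported lazily so that i420_to_nv12 can be used without OpenCV or numpy.
    import cv2
    import numpy as np

    height, width = image.shape[:2]
    i420 = cv2.cvtColor(image, cv2.COLOR_BGR2YUV_I420)
    nv12 = i420_to_nv12(i420.tobytes(), width, height)
    return np.frombuffer(nv12, dtype=np.uint8)
```

The width and height must be even, which holds for both model inputs used here (640x384 and 64x128).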
5. Test results
Judging from the test, the accuracy looks good, but there is still a lot of room for improvement in speed. The test results are as follows.
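To see where the time goes, each stage (detection, feature extraction, tracking) can be timed separately. Because the feature extractor runs with batch=1 once per crop, its cost grows with the number of targets in the frame. The following is a minimal sketch using only the standard library. StageTimer and the stage names are hypothetical and not part of the repository.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per named stage across many frames."""

    def __init__(self):
        self.total = defaultdict(float)
        self.count = defaultdict(int)

    @contextmanager
    def measure(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.total[name] += time.perf_counter() - start
            self.count[name] += 1

    def mean_ms(self, name):
        """Average duration of one stage in milliseconds (0.0 if never measured)."""
        if self.count[name] == 0:
            return 0.0
        return 1000.0 * self.total[name] / self.count[name]


if __name__ == "__main__":
    timer = StageTimer()
    for _ in range(3):
        with timer.measure("detect"):
            time.sleep(0.01)  # placeholder for the detector call
    print(f"detect: {timer.mean_ms('detect'):.1f} ms per frame")
```

In the video loop, wrapping the detector call and the tracker update in separate measure blocks would show which stage dominates the per-frame time.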
6. Summary
This time, the yolov5-deepsort multi-target tracking algorithm was implemented on the x3, which made me more proficient at porting algorithms to the x3. I also reviewed how deepsort is implemented along the way. I hope to have the opportunity to study it in more depth in the future.