Object detection model training based on AI Studio and PaddleDetection

0 Preface

Object detection is a typical computer vision task: it locates the regions of interest in an image, assigns each one a category, and often also reports a confidence score for each labeled region. This article uses the detection of truck parts as a worked example.

1 Introduction to AI Studio

AI Studio is an AI learning and training community built on PaddlePaddle, Baidu's open-source deep learning platform. It gives developers an online training environment with free GPU compute and storage. The environment used for this training run is listed below (if you use your own GPU server instead, see https://blog.csdn.net/loutengyuan/article/details/126527326 ):

  • Python version: python 3.7
  • Framework version: PaddlePaddle 2.2.2
  • Notebook version: AI Studio Classic Edition
  • PaddleDetection version: PaddleDetection 2.3

2 Dataset preparation

For how the dataset was produced, see https://blog.csdn.net/loutengyuan/article/details/129751419 . The data consists of two folders: Annotations (the XML annotation files) and JPEGImages (the original image files):

├─Annotations
│ ├── image1.xml
│ ├── image2.xml
│ ├── image3.xml
│ ├── image4.xml
│ ├── image5.xml
│ └── image6.xml
└─JPEGImages
  ├── image1.jpg
  ├── image2.jpg
  ├── image3.jpg
  ├── image4.jpg
  ├── image5.jpg
  └── image6.jpg
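Before uploading, it can help to confirm that every image has an annotation with the same base name and vice versa. The following is a minimal sketch, assuming the folder names shown above and `.jpg`/`.xml` extensions; the helper name `find_unpaired` is hypothetical and not part of PaddleDetection.

```python
from pathlib import Path


def find_unpaired(dataset_dir):
    """Return (images without an annotation, annotations without an image).

    Both results are sorted lists of file stems (names without extension).
    """
    root = Path(dataset_dir)
    image_stems = {p.stem for p in (root / "JPEGImages").glob("*.jpg")}
    xml_stems = {p.stem for p in (root / "Annotations").glob("*.xml")}
    return sorted(image_stems - xml_stems), sorted(xml_stems - image_stems)
```

Calling `find_unpaired("truck_detect_voc")` on a clean dataset should return two empty lists; anything else names the files to fix before compressing the data.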

Compress the data, create a dataset on the AI Studio platform, and upload the compressed file to it. Once the dataset is ready, training can begin.

3 Model Training

3.1 Project creation

On the AI Studio platform, create an empty project with the environment configuration given above, and mount the dataset created in the previous step.

3.2 Project operation

AI Studio provides 8 points of compute for free every day. Here I chose the advanced configuration (V100 GPU, 32 GB), which costs 1 point per hour, so it can be used for 8 hours a day. Choose a configuration that suits the model you are training.

3.3 Training environment preparation

Once the project is running, prepare the training environment:

# Change the working directory
%cd /home/aistudio/

# Download the PaddleDetection source code
!git clone https://gitee.com/paddlepaddle/PaddleDetection

# Install the dependencies
!pip install -r PaddleDetection/requirements.txt

# Change the working directory
%cd /home/aistudio/PaddleDetection
# Build and install paddledet
!python setup.py install

# Install paddlex, which makes it easy to split the dataset
!pip install paddlex

# Check the CUDA version
!cat /usr/local/cuda/version.txt

# Install the paddlepaddle-gpu build that matches that version
!python -m pip install paddlepaddle-gpu==2.2.2.post101 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
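The last two commands are linked: the `post101` suffix in the wheel name corresponds to CUDA 10.1. A minimal sketch of that mapping follows, assuming that `version.txt` contains a line such as `CUDA Version 10.1.243` and that the suffix is simply `post` followed by the major and minor digits. Both helper names are hypothetical, and not every CUDA version has a matching build, so check the PaddlePaddle install page before relying on the result.

```python
import re


def cuda_post_suffix(version_text):
    """Derive a wheel suffix such as 'post101' from the text of version.txt.

    Assumption: the suffix is 'post' + CUDA major + CUDA minor.
    """
    match = re.search(r"CUDA Version (\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("no CUDA version found in: %r" % version_text)
    major, minor = match.groups()
    return "post" + major + minor


def pip_requirement(paddle_version, version_text):
    """Build the requirement string to pass to pip install."""
    return "paddlepaddle-gpu==%s.%s" % (paddle_version, cuda_post_suffix(version_text))
```

For the environment in this article, `pip_requirement("2.2.2", "CUDA Version 10.1.243")` reproduces the requirement used in the command above.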

3.4 Decompress the dataset

Unzip the mounted dataset archive:

# Unzip the data
!unzip -o /home/aistudio/data/data177504/truck_detect_voc_20230307.zip -d /home/aistudio/PaddleDetection/dataset/

After unzipping, the PaddleDetection/dataset/ directory contains a new truck_detect_voc folder, which holds the VOC-format training dataset prepared earlier.

3.5 Split Dataset

# Change the working directory
%cd /home/aistudio/PaddleDetection/dataset/
# Split the dataset
!paddlex --split_dataset --format VOC --dataset_dir truck_detect_voc --val_value 0.2 --test_value 0.1

The resulting train_list.txt, val_list.txt, and test_list.txt files have the following format:

JPEGImages/image1.jpg Annotations/image1.xml
JPEGImages/image2.jpg Annotations/image2.xml
JPEGImages/image3.jpg Annotations/image3.xml
JPEGImages/image4.jpg Annotations/image4.xml
JPEGImages/image5.jpg Annotations/image5.xml
JPEGImages/image6.jpg Annotations/image6.xml
......
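To make the list format and the 0.2/0.1 ratios concrete, here is a minimal sketch of such a split in plain Python. It is an illustration only, not PaddleX's implementation: the function name `split_voc`, the fixed seed, and the rounding of the split sizes are assumptions.

```python
import random


def split_voc(stems, val_ratio=0.2, test_ratio=0.1, seed=0):
    """Shuffle image stems and split them into train/val/test list files.

    Returns a dict mapping each list file name to its lines, where every
    line pairs an image path with its annotation path.
    """
    stems = sorted(stems)
    random.Random(seed).shuffle(stems)  # deterministic shuffle for a given seed
    n_val = int(len(stems) * val_ratio)
    n_test = int(len(stems) * test_ratio)

    def to_lines(chunk):
        return ["JPEGImages/%s.jpg Annotations/%s.xml" % (s, s) for s in chunk]

    return {
        "val_list.txt": to_lines(stems[:n_val]),
        "test_list.txt": to_lines(stems[n_val:n_val + n_test]),
        "train_list.txt": to_lines(stems[n_val + n_test:]),
    }
```

With ten images, this yields 2 validation lines, 1 test line, and 7 training lines, each in the two-column format shown above.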

3.6 Modify training configuration file

Copy the contents of yolov3_darknet53_270e_voc.yml in the PaddleDetection/configs/yolov3/ directory into a new file, yolov3_darknet53_270e_voc_truck_detect.yml, and modify it as follows:

_BASE_: [
  # Dataset configuration to train on
  '../datasets/voc_truck_detect.yml',
  '../runtime.yml',
  '_base_/optimizer_270e.yml',
  '_base_/yolov3_darknet53.yml',
  '_base_/yolov3_reader.yml',
]

snapshot_epoch: 5
# Path of the final output weights
weights: output/yolov3_darknet53_270e_voc_truck_detect/model_final

# set collate_batch to false because ground-truth info is needed
# on voc dataset and should not collate data in batch when batch size
# is larger than 1.
EvalReader:
  collate_batch: false
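Conceptually, the files listed under `_BASE_` are loaded first, and the keys written in this file then override the inherited values, with nested sections merged key by key. The sketch below illustrates that idea with a generic recursive merge. It is not PaddleDetection's loader, and the values in `base` are hypothetical stand-ins for what the base files might provide.

```python
def merge_config(base, override):
    """Return a new dict in which override's keys replace or extend base's.

    Nested dicts are merged recursively; all other values are replaced.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical inherited values (not copied from the real base files).
base = {"snapshot_epoch": 1, "EvalReader": {"batch_size": 1, "collate_batch": True}}
# The overrides written in the truck-detection config above.
child = {"snapshot_epoch": 5, "EvalReader": {"collate_batch": False}}

merged = merge_config(base, child)
```

In this sketch, `merged["EvalReader"]` keeps the inherited `batch_size` while `collate_batch` takes the overriding value, which is why the config only needs to list the keys it changes.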

Copy the contents of voc.yml in the PaddleDetection/configs/datasets/ directory into a new file, voc_truck_detect.yml, and modify it as follows:

# Use the VOC data format and metric
metric: VOC
map_type: 11point
# Number of object classes in the training dataset
num_classes: 5

TrainDataset:
  !VOCDataSet
    # Training dataset
    dataset_dir: /home/aistudio/PaddleDetection/dataset/truck_detect_voc
    anno_path: train_list.txt
    label_list: labels.txt
    data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']

EvalDataset:
  !VOCDataSet
    # Validation dataset
    dataset_dir: /home/aistudio/PaddleDetection/dataset/truck_detect_voc
    anno_path: val_list.txt
    label_list: labels.txt
    data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']

TestDataset:
  !ImageFolder
    anno_path: /home/aistudio/PaddleDetection/dataset/truck_detect_voc/labels.txt
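Since `num_classes` must agree with the classes that actually occur in the annotations, a quick count over the XML files is a useful sanity check. This is a minimal sketch, assuming standard VOC annotations in which each `object` element has a `name` child; the helper name `count_classes` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def count_classes(annotations_dir):
    """Count labeled objects per class name across all VOC XML files."""
    counts = Counter()
    for xml_path in Path(annotations_dir).glob("*.xml"):
        root = ET.parse(xml_path).getroot()
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name:
                counts[name.strip()] += 1
    return counts
```

For this article's configuration, `len(count_classes(".../truck_detect_voc/Annotations"))` would be expected to equal the `num_classes` value of 5 and the number of lines in labels.txt.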

3.7 Start Training/Continue Training

Run the following commands to start training or to resume it:

# Change the working directory
%cd /home/aistudio/
# The training machine has a single GPU, so this is set to 0; for multi-GPU training, set it to match your hardware (for example 0,1,2,3).
# %env is used instead of !export because a variable exported in a ! subshell does not persist to later commands.
%env CUDA_VISIBLE_DEVICES=0
# Start training
!python  PaddleDetection/tools/train.py -c PaddleDetection/configs/yolov3/yolov3_darknet53_270e_voc_truck_detect.yml
# Resume training; output/yolov3_darknet53_270e_voc_truck_detect/254 is a checkpoint saved by an earlier run
!python  PaddleDetection/tools/train.py -c PaddleDetection/configs/yolov3/yolov3_darknet53_270e_voc_truck_detect.yml -r output/yolov3_darknet53_270e_voc_truck_detect/254 --eval --amp

Training log:

/home/aistudio
W0329 19:48:38.160742  4144 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0329 19:48:38.217834  4144 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[03/29 19:48:42] ppdet.utils.download INFO: Downloading DarkNet53_pretrained.pdparams from https://paddledet.bj.bcebos.com/models/pretrained/DarkNet53_pretrained.pdparams
100%|████████████████████████████████| 158704/158704 [00:15<00:00, 10205.06KB/s]
[03/29 19:48:59] ppdet.utils.checkpoint INFO: Finish loading model weights: /home/aistudio/.cache/paddle/weights/DarkNet53_pretrained.pdparams
[03/29 19:48:59] ppdet.engine INFO: Epoch: [0] [   0/1885] learning_rate: 0.000000 loss_xy: 2.979628 loss_wh: 4.050179 loss_obj: 16323.830078 loss_cls: 4.739660 loss: 16335.599609 eta: 2 days, 3:14:35 batch_cost: 0.3625 data_cost: 0.0043 ips: 22.0712 images/s
[03/29 19:49:06] ppdet.engine INFO: Epoch: [0] [  20/1885] learning_rate: 0.000005 loss_xy: 4.550855 loss_wh: 4.956159 loss_obj: 1370.894165 loss_cls: 6.475554 loss: 1390.607056 eta: 1 day, 22:53:41 batch_cost: 0.3302 data_cost: 0.1292 ips: 24.2291 images/s
[03/29 19:49:13] ppdet.engine INFO: Epoch: [0] [  40/1885] learning_rate: 0.000010 loss_xy: 4.161049 loss_wh: 4.574360 loss_obj: 17.324505 loss_cls: 6.370095 loss: 32.187752 eta: 2 days, 2:58:15 batch_cost: 0.3909 data_cost: 0.1634 ips: 20.4679 images/s
[03/29 19:49:20] ppdet.engine INFO: Epoch: [0] [  60/1885] learning_rate: 0.000015 loss_xy: 4.205465 loss_wh: 4.053814 loss_obj: 12.500505 loss_cls: 6.107187 loss: 26.668793 eta: 2 days, 2:30:44 batch_cost: 0.3507 data_cost: 0.1385 ips: 22.8106 images/s
[03/29 19:49:28] ppdet.engine INFO: Epoch: [0] [  80/1885] learning_rate: 0.000020 loss_xy: 4.379153 loss_wh: 3.988246 loss_obj: 12.806162 loss_cls: 6.259327 loss: 27.258268 eta: 2 days, 2:54:20 batch_cost: 0.3687 data_cost: 0.1606 ips: 21.6999 images/s
[03/29 19:49:35] ppdet.engine INFO: Epoch: [0] [ 100/1885] learning_rate: 0.000025 loss_xy: 4.374406 loss_wh: 3.631483 loss_obj: 12.187485 loss_cls: 5.954453 loss: 26.480440 eta: 2 days, 3:06:37 batch_cost: 0.3675 data_cost: 0.1569 ips: 21.7681 images/s
[03/29 19:49:42] ppdet.engine INFO: Epoch: [0] [ 120/1885] learning_rate: 0.000030 loss_xy: 4.085462 loss_wh: 3.361600 loss_obj: 11.558670 loss_cls: 5.272717 loss: 24.563915 eta: 2 days, 2:53:52 batch_cost: 0.3526 data_cost: 0.1381 ips: 22.6892 images/s
[03/29 19:49:50] ppdet.engine INFO: Epoch: [0] [ 140/1885] learning_rate: 0.000035 loss_xy: 3.668382 loss_wh: 2.828609 loss_obj: 10.547073 loss_cls: 4.500311 loss: 21.663486 eta: 2 days, 2:57:11 batch_cost: 0.3630 data_cost: 0.1590 ips: 22.0411 images/s
[03/29 19:49:56] ppdet.engine INFO: Epoch: [0] [ 160/1885] learning_rate: 0.000040 loss_xy: 4.295379 loss_wh: 3.293975 loss_obj: 11.571999 loss_cls: 4.956268 loss: 23.901939 eta: 2 days, 2:29:34 batch_cost: 0.3344 data_cost: 0.1394 ips: 23.9222 images/s
[03/29 19:50:04] ppdet.engine INFO: Epoch: [0] [ 180/1885] learning_rate: 0.000045 loss_xy: 4.463886 loss_wh: 3.281782 loss_obj: 11.444370 loss_cls: 4.747983 loss: 24.420515 eta: 2 days, 2:47:36 batch_cost: 0.3766 data_cost: 0.1649 ips: 21.2408 images/s
[03/29 19:50:12] ppdet.engine INFO: Epoch: [0] [ 200/1885] learning_rate: 0.000050 loss_xy: 3.771811 loss_wh: 2.779631 loss_obj: 10.306545 loss_cls: 4.216544 loss: 20.910732 eta: 2 days, 3:07:28 batch_cost: 0.3831 data_cost: 0.1933 ips: 20.8822 images/s
[03/29 19:50:18] ppdet.engine INFO: Epoch: [0] [ 220/1885] learning_rate: 0.000055 loss_xy: 3.826384 loss_wh: 2.561981 loss_obj: 10.145305 loss_cls: 3.926377 loss: 20.392998 eta: 2 days, 2:55:39 batch_cost: 0.3465 data_cost: 0.1411 ips: 23.0862 images/s
[03/29 19:50:26] ppdet.engine INFO: Epoch: [0] [ 240/1885] learning_rate: 0.000060 loss_xy: 4.362038 loss_wh: 2.609330 loss_obj: 10.681217 loss_cls: 4.222875 loss: 21.947245 eta: 2 days, 2:53:34 batch_cost: 0.3576 data_cost: 0.1561 ips: 22.3724 images/s
[03/29 19:50:33] ppdet.engine INFO: Epoch: [0] [ 260/1885] learning_rate: 0.000065 loss_xy: 4.060377 loss_wh: 2.608733 loss_obj: 9.662804 loss_cls: 3.905273 loss: 19.952328 eta: 2 days, 2:49:26 batch_cost: 0.3540 data_cost: 0.1458 ips: 22.6006 images/s
[03/29 19:50:40] ppdet.engine INFO: Epoch: [0] [ 280/1885] learning_rate: 0.000070 loss_xy: 3.299124 loss_wh: 2.124129 loss_obj: 9.032528 loss_cls: 3.533381 loss: 17.795029 eta: 2 days, 2:54:45 batch_cost: 0.3687 data_cost: 0.1702 ips: 21.6977 images/s
[03/29 19:50:48] ppdet.engine INFO: Epoch: [0] [ 300/1885] learning_rate: 0.000075 loss_xy: 3.906075 loss_wh: 2.350418 loss_obj: 9.924751 loss_cls: 3.748957 loss: 19.654987 eta: 2 days, 2:59:08 batch_cost: 0.3683 data_cost: 0.1608 ips: 21.7204 images/s
[03/29 19:50:56] ppdet.engine INFO: Epoch: [0] [ 320/1885] learning_rate: 0.000080 loss_xy: 3.889876 loss_wh: 2.479131 loss_obj: 9.141432 loss_cls: 3.764449 loss: 19.268023 eta: 2 days, 3:40:00 batch_cost: 0.4384 data_cost: 0.2334 ips: 18.2463 images/s
[03/29 19:51:03] ppdet.engine INFO: Epoch: [0] [ 340/1885] learning_rate: 0.000085 loss_xy: 3.585271 loss_wh: 2.380863 loss_obj: 8.867376 loss_cls: 3.508474 loss: 17.948868 eta: 2 days, 3:29:26 batch_cost: 0.3447 data_cost: 0.1240 ips: 23.2089 images/s
[03/29 19:51:11] ppdet.engine INFO: Epoch: [0] [ 360/1885] learning_rate: 0.000090 loss_xy: 3.408739 loss_wh: 2.129457 loss_obj: 8.074052 loss_cls: 3.069366 loss: 16.955669 eta: 2 days, 3:31:29 batch_cost: 0.3691 data_cost: 0.1395 ips: 21.6747 images/s
[03/29 19:51:19] ppdet.engine INFO: Epoch: [0] [ 380/1885] learning_rate: 0.000095 loss_xy: 3.814119 loss_wh: 2.268811 loss_obj: 9.213679 loss_cls: 3.333805 loss: 19.176319 eta: 2 days, 3:55:20 batch_cost: 0.4186 data_cost: 0.2386 ips: 19.1128 images/s
[03/29 19:51:26] ppdet.engine INFO: Epoch: [0] [ 400/1885] learning_rate: 0.000100 loss_xy: 3.640779 loss_wh: 2.035562 loss_obj: 8.272722 loss_cls: 3.377781 loss: 17.540604 eta: 2 days, 3:37:56 batch_cost: 0.3267 data_cost: 0.1148 ips: 24.4881 images/s
[03/29 19:51:34] ppdet.engine INFO: Epoch: [0] [ 420/1885] learning_rate: 0.000105 loss_xy: 3.670832 loss_wh: 2.191157 loss_obj: 8.531160 loss_cls: 3.341545 loss: 18.225866 eta: 2 days, 3:54:07 batch_cost: 0.4060 data_cost: 0.2044 ips: 19.7058 images/s
[03/29 19:51:41] ppdet.engine INFO: Epoch: [0] [ 440/1885] learning_rate: 0.000110 loss_xy: 3.946175 loss_wh: 2.116696 loss_obj: 8.605946 loss_cls: 3.084132 loss: 17.808537 eta: 2 days, 3:57:42 batch_cost: 0.3771 data_cost: 0.1704 ips: 21.2162 images/s
[03/29 19:51:49] ppdet.engine INFO: Epoch: [0] [ 460/1885] learning_rate: 0.000115 loss_xy: 4.071396 loss_wh: 1.943774 loss_obj: 8.716118 loss_cls: 3.235123 loss: 18.250977 eta: 2 days, 4:04:14 batch_cost: 0.3860 data_cost: 0.1692 ips: 20.7261 images/s
[03/29 19:51:56] ppdet.engine INFO: Epoch: [0] [ 480/1885] learning_rate: 0.000120 loss_xy: 3.293082 loss_wh: 1.645013 loss_obj: 7.791773 loss_cls: 2.824843 loss: 15.625689 eta: 2 days, 3:52:38 batch_cost: 0.3361 data_cost: 0.1366 ips: 23.8052 images/s
[03/29 19:52:03] ppdet.engine INFO: Epoch: [0] [ 500/1885] learning_rate: 0.000125 loss_xy: 3.550079 loss_wh: 1.871112 loss_obj: 8.082129 loss_cls: 2.853400 loss: 16.612953 eta: 2 days, 3:51:59 batch_cost: 0.3657 data_cost: 0.1631 ips: 21.8731 images/s
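The log lines have a regular shape, so the loss values can be pulled out for plotting or monitoring. The sketch below parses one line with regular expressions and also cross-checks the reported ETA from the step count and batch cost. The function names are hypothetical, and the figure of 270 epochs is an assumption read off the `270e` in the config name.

```python
import re

HEADER_RE = re.compile(r"Epoch: \[(\d+)\] \[\s*(\d+)/(\d+)\]")
# Only numeric fields are captured; 'eta' is skipped because it is not a plain number.
FIELD_RE = re.compile(r"\b(learning_rate|loss\w*|batch_cost|data_cost|ips): ([0-9.]+)")


def parse_train_line(line):
    """Parse one ppdet.engine training log line into a dict, or return None."""
    header = HEADER_RE.search(line)
    if header is None:
        return None
    record = {
        "epoch": int(header.group(1)),
        "step": int(header.group(2)),
        "steps_per_epoch": int(header.group(3)),
    }
    for key, value in FIELD_RE.findall(line):
        record[key] = float(value)
    return record


def estimate_total_hours(steps_per_epoch, epochs, batch_cost):
    """Rough total training time, assuming a constant cost per batch."""
    return steps_per_epoch * epochs * batch_cost / 3600


sample = (
    "[03/29 19:48:59] ppdet.engine INFO: Epoch: [0] [   0/1885] "
    "learning_rate: 0.000000 loss_xy: 2.979628 loss_wh: 4.050179 "
    "loss_obj: 16323.830078 loss_cls: 4.739660 loss: 16335.599609 "
    "eta: 2 days, 3:14:35 batch_cost: 0.3625 data_cost: 0.0043 ips: 22.0712 images/s"
)
record = parse_train_line(sample)
hours = estimate_total_hours(record["steps_per_epoch"], 270, record["batch_cost"])
```

For the first log line, the estimate comes out at a little over 51 hours, which is consistent with the "2 days, 3:14:35" ETA printed by the trainer.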

The checkpoints saved during training and the final model are stored in the output/yolov3_darknet53_270e_voc_truck_detect/ directory.

3.8 Validate the model

# Change the working directory
%cd /home/aistudio/
# Evaluate an intermediate checkpoint
# !python PaddleDetection/tools/eval.py -c PaddleDetection/configs/yolov3/yolov3_darknet53_270e_voc_truck_detect.yml -o weights=output/yolov3_darknet53_270e_voc_truck_detect/9
# Evaluate the best model
# !python PaddleDetection/tools/eval.py -c PaddleDetection/configs/yolov3/yolov3_darknet53_270e_voc_truck_detect.yml -o weights=output/yolov3_darknet53_270e_voc_truck_detect/best_model
# Evaluate the final model
!python PaddleDetection/tools/eval.py -c PaddleDetection/configs/yolov3/yolov3_darknet53_270e_voc_truck_detect.yml -o weights=output/yolov3_darknet53_270e_voc_truck_detect/model_final

Evaluation log:

/home/aistudio
W0329 20:00:02.410827  5323 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0329 20:00:02.416776  5323 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[03/29 20:00:07] ppdet.utils.checkpoint INFO: Finish loading model weights: output/yolov3_darknet53_270e_voc_truck_detect/model_final.pdparams
[03/29 20:00:07] ppdet.engine INFO: Eval iter: 0
[03/29 20:00:10] ppdet.engine INFO: Eval iter: 100
[03/29 20:00:13] ppdet.engine INFO: Eval iter: 200
[03/29 20:00:15] ppdet.engine INFO: Eval iter: 300
[03/29 20:00:18] ppdet.engine INFO: Eval iter: 400
[03/29 20:00:21] ppdet.engine INFO: Eval iter: 500
[03/29 20:00:23] ppdet.engine INFO: Eval iter: 600
[03/29 20:00:26] ppdet.engine INFO: Eval iter: 700
[03/29 20:00:29] ppdet.engine INFO: Eval iter: 800
[03/29 20:00:32] ppdet.engine INFO: Eval iter: 900
[03/29 20:00:35] ppdet.engine INFO: Eval iter: 1000
[03/29 20:00:37] ppdet.engine INFO: Eval iter: 1100
[03/29 20:00:40] ppdet.engine INFO: Eval iter: 1200
[03/29 20:00:43] ppdet.engine INFO: Eval iter: 1300
[03/29 20:00:46] ppdet.engine INFO: Eval iter: 1400
[03/29 20:00:48] ppdet.engine INFO: Eval iter: 1500
[03/29 20:00:52] ppdet.engine INFO: Eval iter: 1600
[03/29 20:00:54] ppdet.engine INFO: Eval iter: 1700
[03/29 20:00:57] ppdet.engine INFO: Eval iter: 1800
[03/29 20:01:00] ppdet.engine INFO: Eval iter: 1900
[03/29 20:01:02] ppdet.engine INFO: Eval iter: 2000
[03/29 20:01:05] ppdet.engine INFO: Eval iter: 2100
[03/29 20:01:07] ppdet.engine INFO: Eval iter: 2200
[03/29 20:01:10] ppdet.engine INFO: Eval iter: 2300
[03/29 20:01:13] ppdet.engine INFO: Eval iter: 2400
[03/29 20:01:15] ppdet.engine INFO: Eval iter: 2500
[03/29 20:01:18] ppdet.engine INFO: Eval iter: 2600
[03/29 20:01:20] ppdet.engine INFO: Eval iter: 2700
[03/29 20:01:23] ppdet.engine INFO: Eval iter: 2800
[03/29 20:01:25] ppdet.engine INFO: Eval iter: 2900
[03/29 20:01:28] ppdet.engine INFO: Eval iter: 3000
[03/29 20:01:31] ppdet.engine INFO: Eval iter: 3100
[03/29 20:01:33] ppdet.engine INFO: Eval iter: 3200
[03/29 20:01:36] ppdet.engine INFO: Eval iter: 3300
[03/29 20:01:39] ppdet.engine INFO: Eval iter: 3400
[03/29 20:01:41] ppdet.engine INFO: Eval iter: 3500
[03/29 20:01:44] ppdet.engine INFO: Eval iter: 3600
[03/29 20:01:47] ppdet.engine INFO: Eval iter: 3700
[03/29 20:01:50] ppdet.engine INFO: Eval iter: 3800
[03/29 20:01:53] ppdet.engine INFO: Eval iter: 3900
[03/29 20:01:55] ppdet.engine INFO: Eval iter: 4000
[03/29 20:01:58] ppdet.engine INFO: Eval iter: 4100
[03/29 20:02:01] ppdet.engine INFO: Eval iter: 4200
[03/29 20:02:03] ppdet.engine INFO: Eval iter: 4300
[03/29 20:02:04] ppdet.metrics.metrics INFO: Accumulating evaluatation results...
[03/29 20:02:04] ppdet.metrics.metrics INFO: mAP(0.50, 11point) = 87.00%
[03/29 20:02:04] ppdet.engine INFO: Total sample number: 4310, averge FPS: 36.84345243569448
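When comparing several checkpoints, it is convenient to extract the final metrics from the log text instead of reading them by eye. The following minimal sketch parses the last two lines of the log above; the function name is hypothetical, and the pattern deliberately matches the log's own spelling "averge".

```python
import re

MAP_RE = re.compile(r"mAP\(([\d.]+), (\w+)\) = ([\d.]+)%")
# The spelling 'averge' is copied from the log output so that the pattern matches.
SUMMARY_RE = re.compile(r"Total sample number: (\d+), averge FPS: ([\d.]+)")


def parse_eval_summary(log_text):
    """Extract mAP, sample count, and FPS from an evaluation log."""
    result = {}
    map_match = MAP_RE.search(log_text)
    if map_match:
        result["iou_threshold"] = float(map_match.group(1))
        result["map_type"] = map_match.group(2)
        result["map_percent"] = float(map_match.group(3))
    summary = SUMMARY_RE.search(log_text)
    if summary:
        result["samples"] = int(summary.group(1))
        result["fps"] = float(summary.group(2))
    return result


log_tail = """\
[03/29 20:02:04] ppdet.metrics.metrics INFO: mAP(0.50, 11point) = 87.00%
[03/29 20:02:04] ppdet.engine INFO: Total sample number: 4310, averge FPS: 36.84345243569448
"""
summary = parse_eval_summary(log_tail)
eval_seconds = summary["samples"] / summary["fps"]
```

Dividing the sample count by the FPS gives roughly 117 seconds, which agrees with the span between the first and last timestamps in the log (20:00:07 to 20:02:04).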

3.9 Export model

# Change the working directory
%cd /home/aistudio/
# Export the model
# !python PaddleDetection/tools/export_model.py -c PaddleDetection/configs/yolov3/yolov3_darknet53_270e_voc_truck_detect.yml --output_dir=./inference_model_truck_detect -o weights=output/yolov3_darknet53_270e_voc_truck_detect/best_model.pdparams
!python PaddleDetection/tools/export_model.py -c PaddleDetection/configs/yolov3/yolov3_darknet53_270e_voc_truck_detect.yml --output_dir=./inference_model_truck_detect -o weights=output/yolov3_darknet53_270e_voc_truck_detect/model_final.pdparams

Export model log:

[03/14 11:41:15] ppdet.utils.checkpoint INFO: Finish loading model weights: output/yolov3_darknet53_270e_voc_truck_detect/model_final.pdparams
[03/14 11:41:15] ppdet.engine INFO: Export inference config file to ./inference_model_truck_detect/yolov3_darknet53_270e_voc_truck_detect/infer_cfg.yml
[03/14 11:41:22] ppdet.engine INFO: Export model and saved in ./inference_model_truck_detect/yolov3_darknet53_270e_voc_truck_detect

The exported model is stored in the /home/aistudio/inference_model_truck_detect/yolov3_darknet53_270e_voc_truck_detect directory.
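Before running prediction, it may be worth checking that the export directory is complete. This is a minimal sketch: `infer_cfg.yml` is named in the export log above, while `model.pdmodel` and `model.pdiparams` are assumed file names for this PaddleDetection version, so adjust the list if your export differs. The helper name is hypothetical.

```python
from pathlib import Path

# infer_cfg.yml appears in the export log; the other two names are assumptions.
EXPECTED_FILES = ("infer_cfg.yml", "model.pdmodel", "model.pdiparams")


def missing_export_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list from `missing_export_files("inference_model_truck_detect/yolov3_darknet53_270e_voc_truck_detect")` suggests the directory is ready to pass as `--model_dir`.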

3.10 Image Prediction

# Change the working directory
%cd /home/aistudio/
# Run prediction on an image
!python PaddleDetection/deploy/python/infer.py --model_dir=inference_model_truck_detect/yolov3_darknet53_270e_voc_truck_detect --image_file=mydata/11111.jpg --device=GPU

[Figure: prediction result for the input image]

3.11 Video Prediction

# Change the working directory
%cd /home/aistudio/
# Run prediction on a video
!python PaddleDetection/deploy/python/infer.py --model_dir=inference_model_truck_detect/yolov3_darknet53_270e_voc_truck_detect --video_file=mydata/uuuuuu.mp4 --device=GPU

[Figure: prediction result for the input video]

3.12 Compress the model

# Change the working directory
%cd /home/aistudio/
# Compress the exported model files
!zip -r -q -o inference_model_truck_detect_20230314_v1.9.zip inference_model_truck_detect/

To deploy the exported model, see https://blog.csdn.net/loutengyuan/article/details/126532324 .

Origin blog.csdn.net/loutengyuan/article/details/129840308