Human Keypoint Detection 2: PyTorch Implementation of Human Keypoint Detection (Human Pose Estimation), with Training Code

Table of contents

Human Keypoint Detection 2: PyTorch Implementation of Human Keypoint Detection (Human Pose Estimation), with Training Code

1. Introduction

2. Human keypoint detection methods

(1) Top-Down method

(2) Bottom-Up method

3. Human keypoint detection datasets

4. Human detection model training

5. Human keypoint detection model training

(1) Project installation

(2) Prepare Train and Test data

(3) Configuration files (configs)

(4) Start training

(5) Visualize the training process with Tensorboard

6. Human keypoint detection model results

7. Human keypoint detection (inference code) download

8. Human keypoint detection (training code) download

9. Human keypoint detection C++/Android version


1. Introduction

Human keypoint detection (Human Keypoints Detection), also known as human pose estimation (2D Pose), is a fairly fundamental task in computer vision and a prerequisite for human action recognition, behavior analysis, human-computer interaction, and so on. Generally speaking, human keypoint detection can be subdivided into single-person/multi-person keypoint detection and 2D/3D keypoint detection. In addition, some algorithms track the keypoints after detecting them, which is also called human pose tracking.

This project implements a human keypoint detection algorithm: it uses the YOLOv5 model for person detection and the HRNet, LiteHRNet and Mobilenet-v2 models for human keypoint detection. The project covers dataset preparation, model training and C++/Android deployment. This article is one of the series "Human Keypoint Detection (Human Pose Estimation)" and describes the PyTorch implementation of human keypoint detection (human pose estimation). To facilitate subsequent model engineering and deployment on the Android platform, the project supports training and testing of the high-precision HRNet detection model as well as the lightweight LiteHRNet and Mobilenet models, and provides multiple versions: Python/C++/Android.

The lightweight Mobilenet-v2 model achieves real-time detection on an ordinary Android phone, taking about 50ms on the CPU (4 threads) and about 30ms on the GPU, which basically meets the performance requirements of most applications. The table below lists the parameter counts and computational cost of HRNet and of the lightweight LiteHRNet and Mobilenet models, together with their detection accuracy.

Model          input-size   Params    FLOPs      AP
HRNet-w32      192×256      28.48M    5734.05M   0.7585
LiteHRNet18    192×256      1.10M     182.15M    0.6237
Mobilenet-v2   192×256      2.63M     529.25M    0.6181

First, a look at the human keypoint detection results:

Android human keypoint detection APP demo (download): https://download.csdn.net/download/guyuealian/88610359

[Respect the original; please cite the source when reposting] https://blog.csdn.net/guyuealian/article/details/134837816


For more articles in the "Human Keypoint Detection (Human Pose Estimation)" series, please refer to:

  


2. Human keypoint detection methods

Current mainstream human keypoint detection (human pose estimation) approaches fall into two categories: Top-Down (top-down) methods and Bottom-Up (bottom-up) methods.

(1) Top-Down method

Person detection and human keypoint detection (human pose estimation) are performed separately. First, person detection is run on the image to locate each human body; then each person is cropped out and its keypoints are estimated. Methods of this type are usually slower, but their pose estimation accuracy is higher. Current mainstream models include CPN, Hourglass, CPM, Alpha Pose, HRNet, etc.

(2) Bottom-Up method

First, all human keypoints in the image are estimated, and then they are grouped into individual person instances. Methods of this type are therefore usually faster at inference time but less accurate. A typical example is OpenPose, the winner of the 2016 COCO human keypoint detection challenge.

In general, Top-Down methods are more accurate, while Bottom-Up methods are faster; based on current research, Top-Down methods have been studied more extensively and achieve higher accuracy than Bottom-Up methods. This project adopts the Top-Down approach: the YOLOv5 model is used first for person detection, and then HRNet is used for human keypoint detection (human pose estimation).
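
To make the Top-Down flow concrete, below is a minimal sketch of the pipeline (detect persons, crop, estimate keypoints, map back to image coordinates). The function names detect_persons and estimate_keypoints are placeholders for illustration, not this project's actual API:
# Conceptual sketch of the Top-Down pipeline (person detector + pose model).
# detect_persons/estimate_keypoints are placeholders, not the project's real functions.
import cv2

def detect_persons(image):
    """Run a person detector (e.g. YOLOv5) and return a list of (x1, y1, x2, y2) boxes."""
    raise NotImplementedError  # replace with the project's YOLOv5 inference

def estimate_keypoints(person_crop):
    """Run the pose model (e.g. HRNet) on one cropped person, return [(x, y, score), ...]."""
    raise NotImplementedError  # replace with the HRNet/LiteHRNet/Mobilenet-v2 inference

def top_down_pose(image_path):
    image = cv2.imread(image_path)
    results = []
    for (x1, y1, x2, y2) in detect_persons(image):
        crop = image[y1:y2, x1:x2]            # crop each detected person
        kpts = estimate_keypoints(crop)
        # map keypoints from crop coordinates back to the original image
        results.append([(x + x1, y + y1, s) for (x, y, s) in kpts])
    return results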

This project is improved from the open-source HRNet; for the original HRNet project, see GitHub:

HRNet: https://github.com/leoxiaobin/deep-high-resolution-net.pytorch


3. Human keypoint detection datasets

This project mainly uses the COCO dataset and the MPII dataset. For a description of human keypoint detection datasets, please refer to "Human Keypoint Detection 1: Human Pose Estimation Datasets": https://blog.csdn.net/guyuealian/article/details/134703548


4. Human detection model training

This project adopts the Top-Down approach: the YOLOv5 model is used for person detection, and HRNet is used for human keypoint detection (human pose estimation). For the person detection model training method, please refer to:

Pedestrian detection (human detection) 2: YOLOv5 implements human detection (including human detection data set and training code)


5. Human keypoint detection model training

The basic structure of the project is as follows:

.
├── configs              # training configuration files
├── data                 # some data
├── libs                 # utility libraries
├── pose                 # pose estimation model files
├── work_space           # training output working directory
├── demo.py              # model inference demo file
├── README.md            # project documentation
├── requirements.txt     # project dependencies
└── train.py             # training file

(1) Project installation

Python 3.8 or Python 3.7 is recommended; higher versions may have compatibility differences. For the Python packages the project depends on, please refer to requirements.txt and install them with pip. The project code has been verified to run normally on Ubuntu and Windows systems, so please feel free to use it; if an exception occurs, it is most likely that the versions of the relevant dependency packages do not fully match.

numpy==1.21.6
matplotlib==3.2.2
Pillow==8.4.0
bcolz==1.2.1
easydict==1.9
onnx==1.8.1
onnx-simplifier==0.2.28
onnxoptimizer==0.2.0
onnxruntime==1.6.0
opencv-contrib-python==4.5.2.52
opencv-python==4.5.1.48
pandas==1.1.5
PyYAML==5.3.1
scikit-image==0.17.2
scikit-learn==0.24.0
scipy==1.5.4
seaborn==0.11.2
sklearn==0.0
tensorboard==2.5.0
tensorboardX==2.1
torch==1.7.1+cu110
torchvision==0.8.2+cu110
tqdm==4.55.1
xmltodict==0.12.0
pycocotools==2.0.2
pybaseutils==0.9.4
basetrainer

For environment setup, please refer to the project installation tutorial (beginners should read it first and configure the Python development environment).
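
After installing the dependencies, a quick sanity check can confirm that PyTorch and OpenCV are available and that CUDA is visible. This is just a minimal sketch, not part of the project code, and it assumes a CUDA build of torch such as the 1.7.1+cu110 version listed above:
# Minimal environment sanity check (not part of the project code).
import torch
import torchvision
import cv2

print("torch:", torch.__version__)            # expected 1.7.1+cu110 per requirements.txt
print("torchvision:", torchvision.__version__)
print("opencv:", cv2.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))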

(2) Prepare Train and Test data

Download the COCO dataset or the MPII dataset (the COCO dataset is recommended), then:

  • Download and decompress the COCO dataset locally. The storage directory structure is as follows (the original image directory and the annotation file are placed in the same directory); a quick sanity check of the annotation file is sketched after this list
─── COCO
    ├── train2017
    │   ├── images                           # COCO training set image directory
    │   └── person_keypoints_train2017.json  # COCO training set annotation file
    └── val2017
        ├── images                           # COCO validation set image directory
        └── person_keypoints_val2017.json    # COCO validation set annotation file

  • Download and decompress the MPII dataset locally. The storage directory structure is as follows
─── MPII
    ├── images      # MPII dataset original image directory
    ├── train.json  # MPII training set annotation file
    └── valid.json  # MPII validation set annotation file
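
To verify that a downloaded COCO-format keypoint annotation file (such as person_keypoints_val2017.json above) is readable, a small pycocotools check like the following can be used; this is a sketch only, and the path is an example that should be replaced with your local path:
# Quick sanity check of a COCO-format keypoint annotation file (example path, adjust to yours).
from pycocotools.coco import COCO

ann_file = "D:/COCO/val2017/person_keypoints_val2017.json"
coco = COCO(ann_file)
cat_ids = coco.getCatIds(catNms=["person"])
img_ids = coco.getImgIds(catIds=cat_ids)
print("images with person annotations:", len(img_ids))

# Inspect the keypoints of the first annotated image: a flat list of (x, y, visibility) triplets
ann_ids = coco.getAnnIds(imgIds=img_ids[0], catIds=cat_ids, iscrowd=False)
anns = coco.loadAnns(ann_ids)
print("keypoints of first person:", anns[0]["keypoints"])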

(3) Configuration files (configs)

The project supports training the HRNet model and the lightweight LiteHRNet and Mobilenet models, and provides the corresponding configuration files; you only need to modify the data paths in the configuration file you use. This article takes training HRNet-w32 as an example. Its configuration file is configs/coco/hrnet/w32_adam_192_192.yaml. Change the training dataset path TRAIN_FILE (multiple training datasets are supported) and the test dataset path TEST_FILE to your local data paths, and keep the other parameters at their defaults, as shown below:

WORKERS: 8
PRINT_FREQ: 10
DATASET:
  DATASET: 'custom_coco'
  TRAIN_FILE:
    - 'D:/COCO/train2017/person_keypoints_train2017.json'
  TEST_FILE: 'D:/COCO/val2017/person_keypoints_val2017.json'
  FLIP: true
  ROT_FACTOR: 45
  SCALE_FACTOR: 0.3
  SCALE_RATE: 1.25
  JOINT_IDS: [0,1]
  FLIP_PAIRS: [ ]
  SKELETON: [ ]

Descriptions of the main configuration parameters are given below:

Parameter      Type    Default       Description
WORKERS        int     8             Number of worker processes for data loading
PRINT_FREQ     int     10            Interval (in iterations) for printing LOG information
DATASET        str     custom_coco   Dataset type; currently only the COCO data format is supported
TRAIN_FILE     list    -             List of training annotation files (COCO data format); multiple datasets are supported
TEST_FILE      str     -             Test annotation file (COCO data format); only a single dataset is supported
FLIP           bool    True          Whether to also test on horizontally flipped images, which can improve accuracy
ROT_FACTOR     float   45            Maximum random rotation angle of the training data, used for data augmentation
SCALE_FACTOR   float   0.3           Random image scaling factor, used for data augmentation
SCALE_RATE     float   1.25          Image scaling ratio
JOINT_IDS      list    [ ]           [ ] means all keypoints; otherwise specify the IDs of the keypoints to train
FLIP_PAIRS     list    [ ]           Pairs of keypoint IDs that are swapped when the image is flipped
SKELETON       list    [ ]           List of keypoint pairs to connect when visualizing
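
Before launching training, it can help to confirm that the dataset paths written into the YAML file actually exist. The following is a minimal sketch using PyYAML (listed in requirements.txt); it assumes the key layout shown in the config snippet above:
# Check that TRAIN_FILE/TEST_FILE paths in the config exist (sketch only).
import os
import yaml

cfg_path = "configs/coco/hrnet/w32_adam_192_192.yaml"
with open(cfg_path, "r", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

train_files = cfg["DATASET"]["TRAIN_FILE"]   # a list of annotation files
test_file = cfg["DATASET"]["TEST_FILE"]      # a single annotation file
for path in list(train_files) + [test_file]:
    print(path, "->", "OK" if os.path.exists(path) else "MISSING")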

(4) Start training

After modifying the configuration file, you can start training:

  • Train the high-precision models HRNet-w48 or HRNet-w32
# High-precision model: HRNet-w48
python train.py  -c "configs/coco/hrnet/w48_adam_192_192.yaml" --workers=8 --batch_size=32 --gpu_id=0 --work_dir="work_space/person"
# High-precision model: HRNet-w32
python train.py  -c "configs/coco/hrnet/w32_adam_192_192.yaml" --workers=8 --batch_size=32 --gpu_id=0 --work_dir="work_space/person"
  • Train the lightweight model LiteHRNet
# Lightweight model: LiteHRNet
python train.py  -c "configs/coco/litehrnet/litehrnet18_192_192.yaml" --workers=8 --batch_size=32 --gpu_id=0 --work_dir="work_space/person"
  • Train the lightweight model Mobilenet-v2
# Lightweight model: Mobilenet
python train.py  -c "configs/coco/mobilenet/mobilenetv2_192_192.yaml" --workers=8 --batch_size=32 --gpu_id=0 --work_dir="work_space/person"

The table below lists the parameter counts and computational cost of HRNet and of the lightweight LiteHRNet and Mobilenet models, together with their detection accuracy (AP). The high-precision model HRNet-w32 reaches an AP of 0.7585, but its parameter count and computational cost are relatively large, making it unsuitable for mobile deployment. LiteHRNet18 and Mobilenet-v2 have relatively small parameter counts and computational cost and are suitable for mobile deployment; although LiteHRNet18 has a lower theoretical computational cost and fewer parameters than Mobilenet-v2, Mobilenet-v2 was found to run faster in actual tests. The lightweight Mobilenet-v2 model achieves real-time detection on an ordinary Android phone, taking about 50ms on the CPU (4 threads) and about 30ms on the GPU, which basically meets the performance requirements of most applications.

Model          input-size   Params    FLOPs      AP
HRNet-w32      192×256      28.48M    5734.05M   0.7585
LiteHRNet18    192×256      1.10M     182.15M    0.6237
Mobilenet-v2   192×256      2.63M     529.25M    0.6181

(5) Visualize the training process with Tensorboard

Tensorboard is used to visualize the training process. To use it, enter in the terminal:
# Basic usage
tensorboard --logdir=path/to/log/
# For example
tensorboard --logdir="work_space/person/hrnet_w32_16_192_256_mpii_20231127_113836_6644/log"

Open the link printed by TensorBoard in the terminal to view the training LOG information in your browser.
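
For reference, the scalars that appear in the log directory are typically written with a SummaryWriter. The sketch below uses tensorboardX (listed in requirements.txt) with placeholder values and tag names; it only illustrates the mechanism, not the exact tags that train.py writes:
# Sketch: how training metrics end up in a log dir that TensorBoard can read.
from tensorboardX import SummaryWriter

writer = SummaryWriter(log_dir="work_space/person/demo_run/log")   # hypothetical log dir
for epoch in range(3):
    train_loss = 1.0 / (epoch + 1)      # placeholder values for illustration only
    val_ap = 0.5 + 0.05 * epoch
    writer.add_scalar("train/loss", train_loss, epoch)
    writer.add_scalar("val/AP", val_ap, epoch)
writer.close()
# Then run: tensorboard --logdir=work_space/person/demo_run/log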


6. Human keypoint detection model results

The demo.py file is used for inference and for testing the model's performance. After filling in the configuration file, the model file and the test images, you can run the test. The command-line arguments of demo.py are described below:

Parameter          Type      Default      Description
-c, --config_file  str       -            Configuration file
-m, --model_file   str       -            Model file
target             str       -            Keypoint type, e.g. hand, coco_person, mpii_person
image_dir          str       data/image   Path to the test images
video_file         str,int   -            Test video file, or camera ID
out_dir            str       output       Directory for saving results; nothing is saved if empty
threshold          float     0.3          Keypoint detection confidence threshold
device             str       cuda:0       GPU ID
The following takes running HRNet-w32 as an example. For other models, just modify --config_file or --model_file.

  • Test images
python demo.py -c work_space/person/hrnet_w32_17_192_256_custom_coco_20231115_092948_1789/w32_adam_192_192.yaml -m work_space/person/hrnet_w32_17_192_256_custom_coco_20231115_092948_1789/model/best_model_195_0.7585.pth --image_dir data/test_images --out_dir output
  • Test video file
python demo.py -c work_space/person/hrnet_w32_17_192_256_custom_coco_20231115_092948_1789/w32_adam_192_192.yaml -m work_space/person/hrnet_w32_17_192_256_custom_coco_20231115_092948_1789/model/best_model_195_0.7585.pth --video_file data/video-test.mp4 --out_dir output
  •  Test camera
python demo.py -c work_space/person/hrnet_w32_17_192_256_custom_coco_20231115_092948_1789/w32_adam_192_192.yaml -m work_space/person/hrnet_w32_17_192_256_custom_coco_20231115_092948_1789/model/best_model_195_0.7585.pth --video_file 0 --out_dir output

The project also supports human keypoint detection in the MPII dataset format.

  • Test images (MPII-format human keypoint detection)
python demo.py -c work_space/person/hrnet_w32_16_192_256_mpii_20231127_113836_6644/w32_adam_192_192.yaml -m work_space/person/hrnet_w32_16_192_256_mpii_20231127_113836_6644/model/best_model_148_89.4041.pth --image_dir data/test_images --out_dir output --target mpii_person

Results (both single-person and multi-person human keypoint detection are supported):
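
If you want to overlay the detected keypoints on an image yourself (for example when integrating the results into other code), a minimal OpenCV drawing routine looks like the following. The skeleton pairs below follow the standard 17-point COCO convention and are given here as an assumption; the project's own visualization may use the SKELETON list from the config instead:
# Minimal keypoint/skeleton drawing with OpenCV (sketch; COCO 17-keypoint skeleton assumed).
import cv2

COCO_SKELETON = [(15, 13), (13, 11), (16, 14), (14, 12), (11, 12), (5, 11), (6, 12),
                 (5, 6), (5, 7), (6, 8), (7, 9), (8, 10), (1, 2), (0, 1), (0, 2),
                 (1, 3), (2, 4), (3, 5), (4, 6)]

def draw_pose(image, keypoints, threshold=0.3):
    """keypoints: list of 17 (x, y, score) tuples in image coordinates."""
    for x, y, s in keypoints:
        if s >= threshold:
            cv2.circle(image, (int(x), int(y)), 3, (0, 255, 0), -1)   # draw joints
    for a, b in COCO_SKELETON:
        xa, ya, sa = keypoints[a]
        xb, yb, sb = keypoints[b]
        if sa >= threshold and sb >= threshold:
            cv2.line(image, (int(xa), int(ya)), (int(xb), int(yb)), (255, 0, 0), 2)  # draw limbs
    return image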


7. Human keypoint detection (inference code) download

Download address of the human keypoint detection inference code: Pytorch implements human keypoint detection (human pose estimation) inference code

Resource content includes:

  1. YOLOv5 person detection inference code (training code not included)
  2. Human keypoint detection (human pose estimation) inference code demo.py (training code not included)
  3. High-precision HRNet version of human keypoint detection (human pose estimation) (training code not included)
  4. Lightweight LiteHRNet and Mobilenet-v2 versions of human keypoint detection (human pose estimation) (training code not included)
  5. Trained models: HRNet-w32, LiteHRNet and Mobilenet-v2 model weight files; after configuring the environment you can run demo.py directly
  6. The inference code demo.py supports image, video and camera testing

If you need the accompanying training dataset and training code, please see the next section.


8. Human keypoint detection (training code) download

Download address of the human keypoint detection training code: Pytorch implements human keypoint detection (human pose estimation) training code

Resource content includes:

  1. YOLOv5 person detection inference code
  2. The complete project code, including the human keypoint detection (human pose estimation) training code train.py and the inference/test code demo.py
  3. Training and test code for the high-precision HRNet version of human keypoint detection (human pose estimation)
  4. Training and test code for the lightweight LiteHRNet and Mobilenet-v2 versions of human keypoint detection (human pose estimation)
  5. The project code supports training and testing human keypoint detection models on the MPII dataset and the COCO dataset
  6. Following the instructions in this post, you can start training with a simple configuration: train.py
  7. Trained models: HRNet-w32, LiteHRNet and Mobilenet-v2 model weight files; after configuring the environment you can run demo.py directly
  8. The inference code demo.py supports image, video and camera testing

9. Human keypoint detection C++/Android version

Android human keypoint detection APP demo (download): https://download.csdn.net/download/guyuealian/88610359

  
