Hand key point detection 5: C++ implements hand key point detection (hand posture estimation) including source code and can be detected in real time


Table of contents


1. Project introduction

2. Hand key point detection (hand posture estimation) method

(1) Top-Down (top-down) method

(2) Bottom-Up (bottom-up) method

3. Hand key point detection model

(1) Training of hand key point detection model

(2) Convert Pytorch model to ONNX model

(3) Convert ONNX model to TNN model

4. Hand key point detection C/C++ deployment

(1) Project structure

(2) Configure the development environment (OpenCV+OpenCL+base-utils+TNN)

(3) Deploy TNN model

(4) CMake configuration

(5) Main source code

(6) Source code compilation and running

(7) Demo test effect 

5. Project source code download


1. Project introduction

This article is one of the project series "Hand key point detection (hand posture estimation)": "C++ implements hand key point detection (hand posture estimation) with source code for real-time detection". The project implements hand key point detection (hand posture estimation) based on the Pytorch deep learning framework: hand detection uses a YOLOv5 model, hand key point detection is an improvement on the open-source HRNet, and a complete training and testing pipeline for hand key point detection is provided. To facilitate model engineering and Android deployment, the project supports training and testing of the high-accuracy HRNet model as well as the lightweight LiteHRNet and Mobilenet models, and provides Python, C++, and Android versions.

This article mainly shares how to deploy the hand detection and hand key point detection models trained in Python to the C/C++ platform, building a simple C/C++ demo for hand key point detection that runs in real time. The table below lists the parameter counts, computation costs, and detection accuracy (AP) of HRNet and of the lightweight LiteHRNet and Mobilenet models.

Model          Input size   Params    FLOPs       AP
HRNet-w32      192×192      28.48M    5734.05M    0.8570
LiteHRNet18    192×192      1.10M     182.15M     0.8023
Mobilenet-v2   192×192      2.63M     529.25M     0.7574

First, a preview of the C/C++ version of hand detection and hand key point detection (hand posture estimation):

Android hand key point detection (hand posture estimation) APP Demo experience:

https://download.csdn.net/download/guyuealian/88418582

[Respect originality, please indicate the source when reprinting] https://blog.csdn.net/guyuealian/article/details/133277748


For more project "Hand key point detection (hand posture estimation)" series of articles, please refer to:

  


2. Hand key point detection (hand posture estimation) method

There are currently two mainstream approaches to hand key point detection (hand posture estimation): the Top-Down (top-down) method and the Bottom-Up (bottom-up) method.

(1) Top-Down (top-down) method

This approach separates hand detection from hand key point estimation: first detect the hands in the image to locate their positions, then crop each hand region and estimate the key points of each hand individually. These methods are usually slower, but their pose estimation accuracy is higher. Mainstream models include CPN, Hourglass, CPM, Alpha Pose, and HRNet.
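
As a concrete illustration, here is a minimal sketch of this two-stage flow, written with the YOLOv5 and PoseDetector classes from this project's main.cpp (shown in section 4); the thresholds are the same illustrative values used there:

#include <opencv2/opencv.hpp>
#include "pose_detector.h"
#include "yolov5.h"
#include "Types.h"

using namespace dl;
using namespace vision;

// Top-Down pipeline sketch: first locate the hand boxes, then estimate
// the keypoints inside each detected box. detector and pose are
// constructed exactly as in main.cpp (section 4).
void topdown_detect(cv::Mat &bgr, YOLOv5 *detector, PoseDetector *pose) {
    FrameInfo resultInfo;
    detector->detect(bgr, &resultInfo, 0.5f, 0.3f); // step 1: hand detection (score/IoU thresholds)
    pose->detect(bgr, &resultInfo, 0.3f);           // step 2: keypoints for each hand box
}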

(2) Bottom-Up (bottom-up) method

This approach first estimates the key points of all hands in the image, then groups them into hand instances via a grouping method. These methods are usually faster at inference but less accurate. A typical example is OpenPose, the winner of the COCO 2016 human key point detection challenge.

In general, the Top-Down approach achieves higher accuracy, while the Bottom-Up approach is faster. Current research has focused more on Top-Down methods, and their accuracy is higher than that of Bottom-Up methods.

This project is an improvement on the open-source HRNet; for the original project, see GitHub:

HRNet: https://github.com/leoxiaobin/deep-high-resolution-net.pytorch


3. Hand key point detection model

(1) Training of hand key point detection model

This blog post mainly covers the C++ deployment of the models; it does not include the Python version of hand key point detection or the training code. For the training method and data set, please refer to my other blog post: "Hand key point detection 3: Pytorch implements hand key point detection (hand posture estimation) including training code and data set" (CSDN blog).

(2) Convert Pytorch model to ONNX model

Currently there are many ways to deploy a CNN model; you can use deployment tools such as TNN, MNN, NCNN, or TensorRT. I use TNN for C/C++ deployment. The deployment process consists of four steps: train the model -> convert the model to an ONNX model -> convert the ONNX model to a TNN model -> deploy the TNN model in C/C++.

After training the Pytorch model, we need to convert the model to an ONNX model for subsequent model deployment.

  • The original project provides a conversion script; you only need to set model_file to your model path.
  • convert_torch_to_onnx.py implements the Pytorch-to-ONNX conversion:
python libs/convert_tools/convert_torch_to_onnx.py
"""
This code is used to convert the pytorch model into an onnx format model.
"""
import os
import torch.onnx
from pose.inference import PoseEstimation
from basetrainer.utils.converter import pytorch2onnx
 
 
def load_model(config_file, model_file, device="cuda:0"):
    pose = PoseEstimation(config_file, model_file, device=device)
    model = pose.model
    config = pose.config
    return model, config
 
 
def convert2onnx(config_file, model_file, device="cuda:0", onnx_type="kp"):
    """
    Convert a trained Pytorch model to an ONNX model.
    :param config_file: path to the model's YAML config file
    :param model_file: path to the trained *.pth model file
    :param device: device used to load the model, e.g. "cuda:0"
    :param onnx_type: model type tag (default "kp" for keypoint models)
    :return:
    """
    model, config = load_model(config_file, model_file, device=device)
    model = model.to(device)
    model.eval()
    model_name = os.path.basename(model_file)[:-len(".pth")]
    onnx_file = os.path.join(os.path.dirname(model_file), model_name + ".onnx")
    # dummy_input = torch.randn(1, 3, 240, 320).to("cuda")
    input_size = tuple(config.MODEL.IMAGE_SIZE)  # w,h
    input_shape = (1, 3, input_size[1], input_size[0])
    pytorch2onnx.convert2onnx(model,
                              input_shape=input_shape,
                              input_names=['input'],
                              output_names=['output'],
                              onnx_file=onnx_file,
                              opset_version=11)
 
 
if __name__ == "__main__":
    model_file = "../../work_space/hand/mobilenet_v2_21_192_192_custom_coco_20230928_065444_0934/model/best_model_153_0.7574.pth"
    config_file = "../../work_space/hand/mobilenet_v2_21_192_192_custom_coco_20230928_065444_0934/mobilenetv2_hand_192_192.yaml"
    convert2onnx(config_file, model_file)

(3) Convert ONNX model to TNN model

As mentioned above, I use TNN for the C/C++ deployment, so the exported ONNX model must next be converted to a TNN model.

TNN conversion tool: the official TNN repository provides a converter (convert2tnn, under tools/convert2tnn); see https://github.com/Tencent/TNN for its documentation. A typical invocation, assuming the official tool layout (flags may differ across TNN versions), looks like: python3 converter.py onnx2tnn model.sim.onnx -optimize

4. Hand key point detection C/C++ deployment

The project uses CLion as the IDE. The main dependencies are OpenCV, base-utils, TNN, and OpenCL (optional): OpenCV must be installed, OpenCL is used for model acceleration, and base-utils and TNN are already configured in the project and need no installation.

The project has only been tested on Ubuntu 18.04; on Windows, please configure the development environment yourself.

(1) Project structure

(2) Configure the development environment (OpenCV+OpenCL+base-utils+TNN)


  • Installing OpenCV: image processing

Image processing (reading images, image cropping, etc.) uses the OpenCV library.

Installation tutorial: "Ubuntu18.04 installation of opencv and opencv_contrib"

The project uses opencv-4.3.0; the opencv_contrib library is not used for now and does not need to be installed. A quick version-check sketch follows this dependency list.

  • Installing OpenCL: model acceleration

Installation tutorial: "Ubuntu16.04 Installing OpenCV&OpenCL"

OpenCL is used for GPU acceleration of the model. Without OpenCL acceleration, pure C++ CPU inference is extremely slow.

  • base-utils: C++ library

GitHub: https://github.com/PanJinquan/base-utils (no installation required, the project has been configured)

base_utils is a C++ utility library for everyday development, integrating common C/C++ and OpenCV algorithms.

  • TNN: model inference

GitHub: https://github.com/Tencent/TNN (No installation required, the project has been configured)

TNN is a high-performance, lightweight neural network inference framework open-sourced by Tencent Youtu Lab. It is cross-platform and supports model compression and code tailoring. Building on the earlier Rapidnet and ncnn frameworks, TNN further strengthens support and performance optimization for mobile devices, draws on the high performance and extensibility of mainstream open-source frameworks, and extends support to server-side X86 and NVIDIA GPUs. On mobile, TNN has been deployed in apps such as mobile QQ, Weishi, and Pitu; on the server side, TNN serves as the basic acceleration framework for Tencent Cloud AI and has accelerated the deployment of many businesses.
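
As a quick sanity check that OpenCV is installed and linkable, the minimal program below just prints the linked OpenCV version (compile it with your own flags, e.g. via pkg-config --cflags --libs opencv4):

#include <opencv2/opencv.hpp>
#include <cstdio>

// Print the OpenCV version this program was linked against.
int main() {
    std::printf("OpenCV version: %s\n", CV_VERSION);
    return 0;
}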

(3) Deploy TNN model

The project implements the C/C++ version of hand key point detection: hand detection uses YOLOv5, and hand key point detection uses the HRNet model. Model inference uses the TNN deployment framework (supporting multi-threaded CPU and GPU-accelerated inference), image processing uses the OpenCV library, and model acceleration uses OpenCL, so real-time processing is achievable on ordinary devices.

If you want to deploy your own trained model in this demo, convert the trained Pytorch model to ONNX, then convert it to a TNN model, and replace the original model files with your own TNN model, as sketched below.
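
For example, assuming your converted files are named my_hand_192_192.sim.tnnmodel and my_hand_192_192.sim.tnnproto (hypothetical names), only the model path constants in main.cpp need to change:

// Hypothetical example: point the demo at your own converted TNN model files.
// Keep the parameter struct consistent with your model's input size (w, h).
const char *pose_model_file = (char *) "../data/tnn/pose/my_hand_192_192.sim.tnnmodel";
const char *pose_proto_file = (char *) "../data/tnn/pose/my_hand_192_192.sim.tnnproto";
PoseParam pose_model_param = HAND_PARAM; // model parameters, as in main.cpp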

(4) CMake configuration

This is the project's CMakeLists.txt; the main configuration covers the four dependency libraries OpenCV, OpenCL, base-utils, and TNN. On Windows, configure and compile them yourself.

cmake_minimum_required(VERSION 3.5)
project(Detector)

add_compile_options(-fPIC) # fix Bug: can not be used when making a shared object
set(CMAKE_CXX_FLAGS "-Wall -std=c++11 -pthread")
#set(CMAKE_CXX_FLAGS_RELEASE "-O2 -DNDEBUG")
#set(CMAKE_CXX_FLAGS_DEBUG "-g")

if (NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
    # -DCMAKE_BUILD_TYPE=Debug
    # -DCMAKE_BUILD_TYPE=Release
    message(STATUS "No build type selected, default to Release")
    set(CMAKE_BUILD_TYPE "Release" CACHE STRING "Build type (default Release)" FORCE)
endif ()

# opencv set
find_package(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS} ./src/)
#MESSAGE(STATUS "OpenCV_INCLUDE_DIRS = ${OpenCV_INCLUDE_DIRS}")

# base_utils
set(BASE_ROOT 3rdparty/base-utils) # root directory of base-utils
add_subdirectory(${BASE_ROOT}/base_utils/ base_build) # add the subdirectory to the build
include_directories(${BASE_ROOT}/base_utils/include)
include_directories(${BASE_ROOT}/base_utils/src)
MESSAGE(STATUS "BASE_ROOT = ${BASE_ROOT}")


# TNN set
# Build the TNN library for the target platform and select its backends
# (OpenCL/ARM/X86, shared or static) via the options below.
# build for platform
# set(TNN_BUILD_SHARED OFF CACHE BOOL "" FORCE)
if (CMAKE_SYSTEM_NAME MATCHES "Android")
    set(TNN_OPENCL_ENABLE ON CACHE BOOL "" FORCE)
    set(TNN_ARM_ENABLE ON CACHE BOOL "" FORCE)
    set(TNN_BUILD_SHARED OFF CACHE BOOL "" FORCE)
    set(TNN_OPENMP_ENABLE ON CACHE BOOL "" FORCE)  # Multi-Thread
    #set(TNN_HUAWEI_NPU_ENABLE OFF CACHE BOOL "" FORCE)
    add_definitions(-DTNN_OPENCL_ENABLE)           # for OpenCL GPU
    add_definitions(-DTNN_ARM_ENABLE)              # for Android CPU
    add_definitions(-DDEBUG_ANDROID_ON)            # for Android Log
    add_definitions(-DPLATFORM_ANDROID)
elseif (CMAKE_SYSTEM_NAME MATCHES "Linux")
    set(TNN_OPENCL_ENABLE ON CACHE BOOL "" FORCE)
    set(TNN_CPU_ENABLE ON CACHE BOOL "" FORCE)
    set(TNN_X86_ENABLE OFF CACHE BOOL "" FORCE)
    set(TNN_QUANTIZATION_ENABLE OFF CACHE BOOL "" FORCE)
    set(TNN_OPENMP_ENABLE ON CACHE BOOL "" FORCE)  # Multi-Thread
    add_definitions(-DTNN_OPENCL_ENABLE)           # for OpenCL GPU
    add_definitions(-DDEBUG_ON)                    # for WIN/Linux Log
    add_definitions(-DDEBUG_LOG_ON)                # for WIN/Linux Log
    add_definitions(-DDEBUG_IMSHOW_OFF)            # for OpenCV show
    add_definitions(-DPLATFORM_LINUX)
elseif (CMAKE_SYSTEM_NAME MATCHES "Windows")
    set(TNN_OPENCL_ENABLE ON CACHE BOOL "" FORCE)
    set(TNN_CPU_ENABLE ON CACHE BOOL "" FORCE)
    set(TNN_X86_ENABLE ON CACHE BOOL "" FORCE)
    set(TNN_QUANTIZATION_ENABLE OFF CACHE BOOL "" FORCE)
    set(TNN_OPENMP_ENABLE ON CACHE BOOL "" FORCE)  # Multi-Thread
    add_definitions(-DTNN_OPENCL_ENABLE)           # for OpenCL GPU
    add_definitions(-DDEBUG_ON)                    # for WIN/Linux Log
    add_definitions(-DDEBUG_LOG_ON)                # for WIN/Linux Log
    add_definitions(-DDEBUG_IMSHOW_OFF)            # for OpenCV show
    add_definitions(-DPLATFORM_WINDOWS)
endif ()
set(TNN_ROOT 3rdparty/TNN)
include_directories(${TNN_ROOT}/include)
include_directories(${TNN_ROOT}/third_party/opencl/include)
add_subdirectory(${TNN_ROOT}) # add the external TNN project directory
set(TNN -Wl,--whole-archive TNN -Wl,--no-whole-archive) # set the TNN library
MESSAGE(STATUS "TNN_ROOT = ${TNN_ROOT}")

# Detector
include_directories(src)
set(SRC_LIST
        src/Interpreter.cpp
        src/pose_detector.cpp
        src/object_detection.cpp
        src/pose_filter.cpp
        src/yolov5.cpp
        )
add_library(dlcv SHARED ${SRC_LIST})
target_link_libraries(dlcv ${OpenCV_LIBS} base_utils)
MESSAGE(STATUS "DIR_SRCS = ${SRC_LIST}")
add_executable(Detector src/main.cpp)
target_link_libraries(Detector dlcv ${TNN} -lpthread)


(5) Main source code

The main() function of the demo program implements hand key point detection and supports image, video, and camera testing:

  •     test_image_file();  // test image files
  •     test_video_file();  // test a video file
  •     test_camera();      // test the camera
//
// Created by [email protected] on 2020/6/3.
//

#include "pose_detector.h"
#include "object_detection.h"
#include "yolov5.h"
#include "Types.h"
#include <iostream>
#include <string>
#include <vector>
#include "file_utils.h"
#include "image_utils.h"

using namespace dl;
using namespace vision;
using namespace std;

const int num_thread = 1; // number of CPU threads to use
DeviceType device = GPU;  // select the run device: CPU or GPU

// object detection: SSD or YOLOv5
const float scoreThresh = 0.5;
const float iouThresh = 0.3;
//const char *det_model_file = (char *) "../data/tnn/ssd/rfb1.0_person_320_320_sim.opt.tnnmodel";
//const char *det_proto_file = (char *) "../data/tnn/ssd/rfb1.0_person_320_320_sim.opt.tnnproto";
//ObjectDetectionParam model_param = PERSON_MODEL;// model parameters
//ObjectDetection *detector = new ObjectDetection(det_model_file, det_proto_file, model_param, num_thread, device);

const char *det_model_file = (char *) "../data/tnn/yolov5/yolov5s05_320.sim.tnnmodel";
const char *det_proto_file = (char *) "../data/tnn/yolov5/yolov5s05_320.sim.tnnproto";
YOLOv5Param dets_model_param = YOLOv5s05_320;// model parameters
YOLOv5 *detector = new YOLOv5(det_model_file,
                              det_proto_file,
                              dets_model_param,
                              num_thread,
                              device);
// keypoint detection
const float poseThresh = 0.3;
const char *pose_model_file = (char *) "../data/tnn/pose/litehrnet18_192_192.sim.tnnmodel";
const char *pose_proto_file = (char *) "../data/tnn/pose/litehrnet18_192_192.sim.tnnproto";
PoseParam pose_model_param = HAND_PARAM;// model parameters
PoseDetector *pose = new PoseDetector(pose_model_file, pose_proto_file, pose_model_param, num_thread, device);

void test_image_file() {
    // directory of test images
    string image_dir = "../data/test_image";
    std::vector<string> image_list = get_files_list(image_dir);
    for (string image_path:image_list) {
        cv::Mat bgr = cv::imread(image_path);
        if (bgr.empty()) continue;
        FrameInfo resultInfo;
        // run object detection
        detector->detect(bgr, &resultInfo, scoreThresh, iouThresh);
        // run keypoint detection
        pose->detect(bgr, &resultInfo, poseThresh);
        // visualize the results
        pose->visualizeResult(bgr, resultInfo, pose_model_param.skeleton, false, 0);
    }

    printf("FINISHED.\n");
}


/***
 * test a video file
 * @return
 */
int test_video_file() {
    // path of the test video file
    string video_file = "../data/video/video-test.mp4";
    cv::VideoCapture cap;
    bool ret = get_video_capture(video_file, cap);
    cv::Mat frame;
    while (ret) {
        cap >> frame;
        if (frame.empty()) break;
        FrameInfo resultInfo;
        // run object detection
        detector->detect(frame, &resultInfo, scoreThresh, iouThresh);
        // run keypoint detection
        pose->detect(frame, &resultInfo, poseThresh);
        // visualize the results
        pose->visualizeResult(frame, resultInfo, pose_model_param.skeleton, false, 5);
    }
    cap.release();

    printf("FINISHED.\n");
    return 0;

}


/***
 * test the camera
 * @return
 */
int test_camera() {
    int camera = 0; // camera ID (change this to your own camera's ID)
    cv::VideoCapture cap;
    bool ret = get_video_capture(camera, cap);
    cv::Mat frame;
    while (ret) {
        cap >> frame;
        if (frame.empty()) break;
        FrameInfo resultInfo;
        // run object detection
        detector->detect(frame, &resultInfo, scoreThresh, iouThresh);
        // run keypoint detection
        pose->detect(frame, &resultInfo, poseThresh);
        // visualize the results
        pose->visualizeResult(frame, resultInfo, pose_model_param.skeleton, false, 5);
    }
    cap.release();
    printf("FINISHED.\n");
    return 0;

}

/***
 * test keypoint tracking
 * @return
 */
int test_pose_track() {
    // path of the test video file
    string video_file = "../data/video/video-test.mp4";
    cv::VideoCapture cap;
    bool ret = get_video_capture(video_file, cap);
    cv::Mat frame;
    // specify the keypoint targets to track (filter); currently only
    // single-target tracking is supported; multi-target tracking is not
    // supported and will raise an exception
    vector<int> filter_id = {0};
    // initialize tracking
    pose->initTrack(filter_id, 20, 0.35);
    while (ret) {
        cap >> frame;
        if (frame.empty()) break;
        FrameInfo resultInfo;
        // run object detection
        detector->detect(frame, &resultInfo, scoreThresh, iouThresh);
        // run keypoint detection and tracking
        pose->track(frame, &resultInfo, poseThresh);
        // visualize the results
        pose->visualizeResult(frame, resultInfo, pose_model_param.skeleton, false, 5);
    }
    cap.release();

    printf("FINISHED.\n");
    return 0;

}


int main() {
    test_image_file();   // test image files
    test_video_file();   // test a video file
    test_camera();       // test the camera
    // release the global detector and pose objects after all tests finish
    // (deleting them inside each test_*() would crash the following test)
    delete detector;
    detector = nullptr;
    delete pose;
    pose = nullptr;
    return 0;
}

(6) Source code compilation and running

Build with the script below, or simply run: bash build.sh

#!/usr/bin/env bash
if [ ! -d "build/" ];then
  mkdir "build"
else
  echo "exist build"
fi
cd build
cmake ..
make -j4
sleep 1
./Detector

  • To test CPU inference performance, modify src/main.cpp:

DeviceType device = CPU;

  • To test GPU inference performance, modify src/main.cpp (OpenCL must be configured):

DeviceType device = GPU;

PS: Pure C++ CPU inference takes several seconds per frame, but with OpenCL acceleration enabled, GPU inference takes only about a dozen milliseconds, a huge performance improvement.
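
To measure the per-frame latency yourself, you can wrap the two detect calls in any of the test loops with a std::chrono timer. The fragment below (a sketch, not part of the original source) drops into the while loop of test_camera() in main.cpp:

#include <chrono>  // add at the top of main.cpp

// Time one detection + keypoint pass to compare CPU vs GPU (OpenCL) latency.
auto t0 = std::chrono::steady_clock::now();
detector->detect(frame, &resultInfo, scoreThresh, iouThresh);
pose->detect(frame, &resultInfo, poseThresh);
auto t1 = std::chrono::steady_clock::now();
double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
printf("inference time: %.2f ms\n", ms);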

(7) Demo test effect 

The C++ version produces almost exactly the same results as the Python version. Below is a demonstration of the hand key point detection effect:

 


5. Project source code download

The C/C++ hand key point detection project source code can be downloaded here: "C++ implements hand key point detection (hand posture estimation) source code download"

The complete project source code includes:

  1. C/C++ source code supporting YOLOv5 hand detection
  2. C/C++ source code providing the high-accuracy HRNet hand key point detection model
  3. C/C++ source code providing the lightweight LiteHRNet and Mobilenet-v2 hand key point detection models
  4. C/C++ source code supporting both CPU and GPU; with the GPU (OpenCL) enabled, detection runs in real time (pure CPU inference is very slow; model acceleration requires OpenCL, and GPU inference takes about 15ms)
  5. A C/C++ demo supporting image, video, and camera testing
  6. base-utils and TNN already configured in the project; OpenCV and OpenCL must be compiled and installed yourself

 Android hand key point detection APP Demo experience:

https://download.csdn.net/download/guyuealian/88418582

If you need the training code for hand key point detection, please refer to: "Hand key point detection 3: Pytorch implements hand key point detection (hand posture estimation) including training code and data set".
