Introduction to the LightWeightOpenPose framework and a record of the pitfalls I ran into


Author:qyan.li

Date:2022.4.17

Topic: Introduction to the LightWeightOpenPose framework and a record of the pitfalls in using it

1. Introduction

After nearly two weeks of hard work, the course design is finally finished. I stumbled the whole way and hit countless pitfalls, and I promised myself at the time that I would write a blog post to record them once it was done.

The course design is about gesture recognition, so I needed a framework to detect human body key points and draw the skeleton diagram. At first I wanted to use the OpenPose framework, but I ran into too many problems along the way: after three days of fumbling I still could not get it to run, so I had no choice but to switch to a lighter or easier-to-call framework. The teammates next door chose MediaPipe, and I chose LightWeightOpenPose.

The ideas and solutions in this blog post are for reference only (following my method, the code will definitely run, but it is not necessarily the easiest way). This post also contains no theoretical explanation, since I do not understand much of the theory myself. The later part includes some small changes to the code that you can refer to.

2. Introduction to the LightWeightOpenPose framework

I don't have a particularly deep understanding of LightWeightOpenPose; I chose it mostly because it was the first thing I came across. The following introduction comes from an article on Zhihu, for your reference:

LightWeightOpenPose is a lightweight version built on OpenPose. Compared with the two-stage OpenPose, it has only 15% of the parameters, yet its performance is almost the same (accuracy drops by 1%). Most importantly, the model can reach 26 fps on an Intel CPU.

References: Lightweight OpenPose, Lightweight OpenPose - Zhihu (zhihu.com)

Small tips:

  1. On my computer the framework came nowhere near 26 fps. From what I observed, it processed only a few frames of the input video per second.
  2. Compared with MediaPipe next door, there is still a sizeable gap: at least from what I observed, both its speed and its results are worse than MediaPipe's.
  3. The advantage of LightWeightOpenPose over MediaPipe is that it can detect the key points of multiple people and can also mark each person with a bounding box, which matters a lot in some situations.

3. Running the LightWeightOpenPose framework:

GitHub link: LightWeightOpenPose framework

Key points:

Suppose that, like me, you do not want to understand the principles of the algorithm or the details of training and inference. Then the following steps are all you need to run the code:

  • Configure the environment the framework requires: see GitHub

  • After pulling the framework code, download the pre-trained model file checkpoint_iter_370000.pth

    Download address: https://download.01.org/opencv/openvino_training_extensions/models/human_pose_estimation/checkpoint_iter_370000.pth

  • Run this command from the command line (cmd): python demo.py --checkpoint-path checkpoint_iter_370000.pth --images xx.jpg xx.jpg

    Small tips:

    1. Before running the command above, make sure you have changed into the directory that contains the files
    2. --images xx.jpg xx.jpg is the form for recognizing pictures; to recognize a video, change it to --video xx.mp4
    3. To recognize multiple pictures, append them to the command, separated by spaces

Seeing this, some readers may wonder why it is done this way. In fact, all of this is written clearly on the GitHub page; I repeat it here only for readers who find that page hard to follow or are not interested in reading it.

The parameter settings can be viewed in the demo.py file:

### demo.py (argument parsing excerpt)
import argparse

parser = argparse.ArgumentParser(
    description='''Lightweight human pose estimation python demo.
                   This is just for quick results preview.
                   Please, consider c++ demo for the best performance.''')
parser.add_argument('--checkpoint-path', type=str, required=True, help='path to the checkpoint')
parser.add_argument('--height-size', type=int, default=256, help='network input layer height size')
parser.add_argument('--video', type=str, default='', help='path to video file or camera id')
parser.add_argument('--images', nargs='+', default='', help='path to input image(s)')
parser.add_argument('--cpu', action='store_true', help='run network inference on cpu')
parser.add_argument('--track', type=int, default=1, help='track pose id in video')
parser.add_argument('--smooth', type=int, default=1, help='smooth pose keypoints')
args = parser.parse_args()

In principle, all of the parameters above can be set. Here I only mention the two that must be set to run the code: --checkpoint-path, and either --images or --video
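If you would rather launch the demo from a Python script than type the command by hand, the two required arguments can be assembled into an argument list. The following is only a sketch: build_demo_command is a hypothetical helper name of my own (it is not part of the framework), the rule that exactly one of images or video must be given is my own simplification, and it assumes demo.py and the checkpoint are in the current directory.

```python
def build_demo_command(checkpoint_path, images=None, video=None):
    """Assemble the argument list for demo.py (hypothetical helper, not part of the repo)."""
    # Simplifying assumption: require exactly one input source
    if bool(images) == bool(video):
        raise ValueError("pass either images or video, not both and not neither")
    cmd = ["python", "demo.py", "--checkpoint-path", checkpoint_path]
    if images:
        # --images accepts one or more paths (nargs='+')
        cmd += ["--images", *images]
    else:
        cmd += ["--video", video]
    return cmd


cmd = build_demo_command("checkpoint_iter_370000.pth", images=["a.jpg", "b.jpg"])
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids having to worry about quoting file names that contain spaces.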

4. Pitfalls with the LightWeightOpenPose framework

  • GPU issue:

    If you are a graduate student with lab resources, or an undergraduate with your own GPU, you can skip this one. But if you have nothing and still want to get this framework running:

    Please delete the following from demo.py (judging by its help text, the --cpu flag in the parameter list above appears to serve the same purpose):

    if not cpu:
    	net = net.cuda()
    

    If instead you are a student using Colab's free GPU, delete cv2.imshow('Lightweight Human Pose Estimation Python Demo', img) from demo.py and add your own code that saves the processed image with imwrite. For the specific reason, please refer to my other blog post: Free GPU and Colab pitfalls tutorial (Next Door Li Xuechang's Blog, CSDN Blog)

  • Image or video saving issues:

    The original code has no code for saving the processed image, so you need to add it yourself. It is very simple, just one line:

    cv2.imwrite('./saveTest.PNG', img)  # img is the image that was passed to imshow
    

    For saving video:

    I have not thought of a good way to save video yet, but a later blog post should cover how to combine several image frames into a video

  • Problem of detecting a large number of images:

    This problem comes from the way the code is executed. As mentioned above, annotating images with the LightWeightOpenPose framework is done by running python demo.py --checkpoint-path checkpoint_iter_370000.pth --images xx.jpg xx.jpg

    When there are only a few pictures, say two or three, you can type them in by hand. But what if there are a great many? In our project the processed data set has more than 10,000 images, and typing them by hand is clearly unrealistic, so we need some way to generate the command automatically.

    We cannot do this job by hand because the workload is too large, but this is exactly the kind of work programs are good at, and Python in particular. It is one of its strengths and a big part of why it became so popular. The code is as follows:

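    ### Get the list of file names
    import os
    FileNameLst = os.listdir("Your File Path")

    ### Generate the command
    # Test.txt is the file that finally holds the cmd command.
    # 'w' is used instead of 'a' so that re-running does not append a second copy.
    with open('./Test.txt', 'w', encoding='utf-8') as f:
        f.write('python demo.py --checkpoint-path checkpoint_iter_370000.pth --images ')
        # FileNameLst is the list of the 10,000+ image file names obtained with os.listdir
        for item in FileNameLst:
            # images is the folder that holds the pictures
            f.write("./images/" + str(item) + " ")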
    

    This is a dividing line. Depending on the size of your data set, you may not run into the problems below

  • Command line character limit problem:

    With the command generated by the program above, we happily go to run it on the command line, but another problem arises:

    When you paste the generated command into the command line, you find that it is always truncated. At first I thought the copy and paste was incomplete, but it happened the same way every time. After looking it up I found the cause: the command line has a character limit. You can look up the exact limit yourself.

    A problem that shows up has to be solved, so what should we do? Since the command is too long, shorten it: process the pictures in batches instead of all at once. So how many pictures can be recognized in one go, and how is that batch size determined?

    With my file names and naming scheme, running the truncated cmd command above processed 352 pictures each time, which means 350 pictures per run is OK. My 10,000+ images were therefore divided into 33 batches of 350, plus a final smaller batch for the remainder.
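The batching idea can also be driven by the character limit itself rather than by a fixed count. Below is a minimal sketch, assuming a limit of 8191 characters (the figure usually quoted for cmd.exe; please verify it for your own shell) and made-up file names. split_into_commands is a hypothetical helper, not something from the framework.

```python
CMD_PREFIX = "python demo.py --checkpoint-path checkpoint_iter_370000.pth --images "
MAX_CMD_LENGTH = 8191  # assumed cmd.exe limit; check it for your own shell


def split_into_commands(file_names, prefix=CMD_PREFIX, limit=MAX_CMD_LENGTH):
    """Greedily pack file names into commands that each stay within `limit` characters."""
    commands = []
    current = prefix
    for name in file_names:
        piece = "./images/" + name + " "
        if len(prefix) + len(piece) > limit:
            raise ValueError("a single file name does not fit: " + name)
        # Start a new command when adding this file would exceed the limit
        if len(current) + len(piece) > limit:
            commands.append(current.rstrip())
            current = prefix
        current += piece
    # Keep the last, possibly shorter, command
    if current != prefix:
        commands.append(current.rstrip())
    return commands


# Made-up names in the style of a numbered data set, only for the demonstration
names = ["%09d.jpg" % i for i in range(1000)]
commands = split_into_commands(names)
print(len(commands), "commands, longest has", max(len(c) for c in commands), "characters")
```

Compared with a fixed batch of 350, this adapts automatically when file names are longer or shorter, at the cost of batches that are not all the same size.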

  • Tedious cmd command execution:

    By now every obstacle to running the program has been cleared, so let's talk about making execution less tedious. The batch commands above have been generated, but copying and pasting each one into the command line by hand is too inefficient. Is there a way to simplify running the cmd commands? After some searching I found one: generate Bat files.

    There is plenty of material explaining Bat files online; a Bat file looks like this:

    @echo off
    C:
    rem target path
    cd C:/Users\腻味\Desktop\PoseData\mpii_human_pose_v1
    rem cmd command
    start python demo.py --checkpoint-path xx.pth --images xx.jpg
    exit
    

    OK, that gives us another way to solve the problem, and the rest is Python's home turf:

    ### Generate the Bat files: full batches of 350 plus one final smaller batch

    # FileNameLst is the list of the 10,000+ image file names
    print(len(FileNameLst))  # 11715 / 350 = 33.47

    BATCH_SIZE = 350
    # Raw string so the backslashes in the Windows path are not treated as escapes
    WORK_DIR = r'C:/Users\腻味\Desktop\PoseData\mpii_human_pose_v1'

    for start in range(0, len(FileNameLst), BATCH_SIZE):
        # Slicing stops at the end of the list, so the last batch is handled as well
        batch = FileNameLst[start:start + BATCH_SIZE]
        # With 11715 images the files are Test0.bat ... Test33.bat; 'w' avoids appending on a re-run
        with open('./Test' + str(start // BATCH_SIZE) + '.bat', 'w', encoding='utf-8') as f:
            f.write('@echo off' + '\n')
            f.write('C:' + '\n')
            f.write('cd ' + WORK_DIR + '\n')
            f.write('start ')
            f.write('python demo.py --checkpoint-path checkpoint_iter_370000.pth --images ')
            for name in batch:
                f.write("./images/" + str(name) + " ")
            f.write('\n')
            f.write('exit')
    
  • Problem of running the 33 Bat files in parallel:

    When you search for Bat files, you usually see the keyword "batch execution". Seeing this, I thought: clicking the Bat files one by one is a chore, so why not generate one extra Bat file that calls all 33 small Bat files? Then at the end I only need to click the big Bat file and everything is done.

    Of course I turned the idea into reality, again with the help of a program. But in the final run, by the time all the command line windows had exited, only 2000 pictures had been generated, so this attempt at simplification failed.

    My own guess is that the computer cannot handle that many programs at the same time. Still, this gives us an idea: when running the 33 files, we can run several at once (I ran 4 at a time) instead of strictly one after another, which improves efficiency.

    Of course, Bat files should support sequential or parallel execution, or running a few at a time and starting the rest when those finish, but I have not explored this in depth. Those who are interested can look into it themselves.
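One way to run "several at a time" is to let Python do the scheduling. The sketch below is an alternative to the big Bat file and not the method used in this post: run_all is a hypothetical helper that keeps at most max_workers commands running at once with a thread pool. It is meant to launch python demo.py ... directly, because the start line in the Bat files above returns immediately and so would not be throttled.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def _run_and_wait(command):
    """Run one command (a list of arguments), wait for it, and return its exit code."""
    return subprocess.run(command).returncode


def run_all(commands, max_workers=4, runner=_run_and_wait):
    """Run every command with at most `max_workers` active at once.

    Returns the results in the same order as `commands`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, commands))


# Example usage (hypothetical batches, each one a list of image paths):
# commands = [["python", "demo.py", "--checkpoint-path", "checkpoint_iter_370000.pth",
#              "--images", *batch] for batch in batches]
# exit_codes = run_all(commands, max_workers=4)
```

Threads are enough here because each worker only waits for an external process; the heavy computation happens in the child processes.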

    Summary: That is all I can manage to write for now. Problems always get solved in the end; when you hit a dead end, try changing your approach!


I can't write any more! I will continue when I have time and add some small changes to the LightWeightOpenPose framework code that output pictures in different forms. 2022/04/17 Sunday


Time: 2022/04/17 Sunday. There is not much left, so no more delays! Finishing it today!


5. Fine-tuning the LightWeightOpenPose output style:

First, why fine-tune the output style at all? By default, the LightWeightOpenPose framework draws the human skeleton on the original image and outputs that. We wanted to see whether we could make some changes to the output image:

  1. Remove the background and output a pure skeleton image
  2. Thicken the skeleton lines on the original image to improve the classification results of the downstream model

Since the code for key point detection and skeleton drawing is concentrated in the run_demo function of demo.py, the changes below are mainly made in run_demo

  • LightWeightOpenPose outputs pure skeleton image

    # pose is roughly the set of key points and lines; img serves as the background picture
    for pose in current_poses:
        pose.draw(img)  # draw the pose lines on the original image
    

    The code above draws the skeleton on the original image

    1. pose mainly stores the key point information of the human body and how the points are connected
    2. The drawing is done by the custom draw method of the pose class, and img represents the original picture

    Putting this together gives a way to output a pure skeleton image: manually create a solid-color image of the same size as the original and pass it to the draw function as its argument

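    ## LightWeightOpenPose outputs a pure skeleton image
    import numpy as np

    height, width = img.shape[:2]
    # The shape must be passed as a tuple; uint8 matches ordinary 8-bit images
    tempImage = np.zeros((height, width, 3), dtype=np.uint8)
    for pose in current_poses:
        pose.draw(tempImage)  # draw the pose lines on the newly created blank image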
    
  • Line adjustment of LightWeightOpenPose output graph

    If you look closely at the run_demo function, you will find a call to addWeighted. As its name suggests, this function adjusts the relative weight of two things. You can try modifying the parameters in the following line:

    img = cv2.addWeighted(orig_img, 0.1, img, 0.9, 0)
    

    After the modification, you will find that changing the parameters changes how thick the lines on the image look. After looking it up, I learned that addWeighted blends two images according to given weights, which explains why adjusting the parameters changes the apparent line thickness
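To see why the weights matter, here is the blending arithmetic worked out for a single channel of a single pixel. This is a plain-Python sketch of the weighted sum that addWeighted computes, with made-up pixel values; blend_pixel is a hypothetical helper, not an OpenCV function. As far as I can tell, the weights change how strongly the drawn lines stand out against the background rather than their width in pixels.

```python
def blend_pixel(orig_value, drawn_value, orig_weight, drawn_weight, gamma=0):
    """Weighted sum of two 8-bit channel values, rounded and clipped to 0..255."""
    value = orig_value * orig_weight + drawn_value * drawn_weight + gamma
    return max(0, min(255, int(round(value))))


# Made-up values: the background channel is 100 and the skeleton line was drawn at 200
for orig_weight, drawn_weight in [(0.6, 0.4), (0.1, 0.9)]:
    on_line = blend_pixel(100, 200, orig_weight, drawn_weight)   # pixel covered by a line
    off_line = blend_pixel(100, 100, orig_weight, drawn_weight)  # pixel the drawing did not touch
    print(orig_weight, drawn_weight, "on line:", on_line, "off line:", off_line)
```

When the two weights sum to 1, pixels that the drawing did not touch keep their value, and only the pixels on the skeleton lines move toward the line color as the second weight grows.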

  • Summary:

    What do the adjustments above achieve? My original intention was to improve the results of my downstream classifier. We build our own classifier at a later stage and hoped that these adjustments would improve the data set and therefore the classification. In theory they should help, but due to time constraints I did not try it.


Origin blog.csdn.net/DALEONE/article/details/124231940