Human pose recognition based on Python and deep learning technology

Human pose recognition is an important application in the field of computer vision: by identifying the key points and joint positions of the human body, it can accurately determine a person's pose and movement. This technology can be applied in many fields, such as sports training, medical rehabilitation, and security monitoring, bringing great convenience and benefit to people's lives and work.


In this article, we will introduce a human body pose recognition method based on Python and deep learning technology. We read an image or a video stream captured by a camera, use OpenCV and a TensorFlow model to detect human poses in it, and finally output an image marking the position of each key point and joint of the body, as well as the connections between the joints.

Preparation

Before starting to implement human pose recognition, we need to prepare some necessary tools and materials. First, install the Python environment and the related libraries, such as OpenCV, TensorFlow, and NumPy. Second, download a pre-trained TensorFlow model for detecting human poses in images. Finally, prepare a picture or a camera video stream as input data.
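As a quick sanity check that the environment is ready, you can print the installed library versions (a minimal sketch; it only assumes OpenCV and NumPy are importable):

import cv2
import numpy as np

# Print the installed versions; the cv2.dnn module used below
# ships with standard OpenCV builds (opencv-python)
print("OpenCV:", cv2.__version__)
print("NumPy:", np.__version__)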

Load the model and define parameters

Before loading the model, we need to define some constants and parameters, such as the number of body parts, joint connections, etc. The specific code is as follows:

import cv2
import numpy as np

# Body part indices
body_parts = {
    0: "Nose",
    1: "Neck",
    2: "Right Shoulder",
    3: "Right Elbow",
    4: "Right Wrist",
    5: "Left Shoulder",
    6: "Left Elbow",
    7: "Left Wrist",
    8: "Right Hip",
    9: "Right Knee",
    10: "Right Ankle",
    11: "Left Hip",
    12: "Left Knee",
    13: "Left Ankle",
    14: "Right Eye",
    15: "Left Eye",
    16: "Right Ear",
    17: "Left Ear"
}

# Joint connections (pairs of body part indices)
pose_parts = [
    [0, 1], [1, 2], [2, 3], [3, 4], [1, 5], [5, 6],
    [6, 7], [1, 8], [8, 9], [9, 10], [1, 11], [11, 12], [12, 13],
    [0, 14], [14, 16], [0, 15], [15, 17]
]

# Load the pre-trained TensorFlow model
net = cv2.dnn.readNetFromTensorflow("graph_opt.pb")

Read pictures or video streams

Next, we need to read the picture or the video stream captured by the camera as input data. The specific code is as follows:

# Read an image (or a video stream from the camera)
image = cv2.imread("test.jpg")
# cap = cv2.VideoCapture(0)

If you want to read a video stream instead, uncomment the second line and set the parameter to the corresponding camera index.
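For example, a minimal sketch that grabs a single frame from the camera to use as the input image (camera index 0 is an assumption; the correct index depends on your hardware):

# Grab one frame from the default camera and use it as the input image
cap = cv2.VideoCapture(0)
ok, image = cap.read()
cap.release()
if not ok:
    raise RuntimeError("Could not read a frame from the camera")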

Process image data

Before feeding the image data into the model, we need to do some processing on the image to convert it into the input format required by the neural network. The specific code is as follows:

# Process the image data into a network input blob
blob = cv2.dnn.blobFromImage(image, 1.0, (368, 368), (0, 0, 0), swapRB=False, crop=False)
net.setInput(blob)
output = net.forward()

Among them, the cv2.dnn.blobFromImage function converts the image into the input format required by the neural network: it scales the image to the specified size, subtracts the mean value, and performs normalization and related operations. The net.setInput function sets the input data of the neural network, and the net.forward function runs the forward pass, producing the confidence maps from which the key point coordinates are extracted.
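As a quick check, you can inspect the raw output. Assuming an OpenPose-style COCO model such as graph_opt.pb, a 368×368 input produces 46×46 confidence maps, with the first 18 channels corresponding to the body parts defined above:

# Inspect the network output: (batch, channels, height, width)
print(output.shape)                 # e.g. (1, 57, 46, 46) for the COCO model
nose_heatmap = output[0, 0, :, :]   # confidence map for body part 0 ("Nose")
print(nose_heatmap.max())           # peak confidence for the nose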

Draw key points and joint lines

After the forward pass, we extract the key point coordinates from the confidence maps and draw them for observation and analysis. The specific code is as follows:

# Extract the key points
points = []
h, w = image.shape[:2]
for i in range(len(body_parts)):
    # Confidence map (heatmap) for body part i
    heatmap = output[0, i, :, :]
    # Location and value of the highest confidence
    _, prob, _, point = cv2.minMaxLoc(heatmap)
    # Keep the key point only if its confidence exceeds the threshold
    if prob > 0.5:
        # Scale the heatmap coordinates back to the image size
        x = int(w * point[0] / output.shape[3])
        y = int(h * point[1] / output.shape[2])
        points.append((x, y))
    else:
        points.append(None)
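With the key points collected, the joint connections defined in pose_parts can be drawn on the image; a minimal sketch:

# Draw each connection whose two end points were both detected
for a, b in pose_parts:
    if points[a] and points[b]:
        cv2.line(image, points[a], points[b], (0, 200, 0), 3)
        cv2.circle(image, points[a], 3, (0, 0, 200), cv2.FILLED)
        cv2.circle(image, points[b], 3, (0, 0, 200), cv2.FILLED)

cv2.imshow("Pose", image)
cv2.waitKey(0)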

Full code

import cv2

# Body part indices
body_parts = {
    "Nose": 0, "Neck": 1,
    "RShoulder": 2, "RElbow": 3, "RWrist": 4,
    "LShoulder": 5, "LElbow": 6, "LWrist": 7,
    "RHip": 8, "RKnee": 9, "RAnkle": 10,
    "LHip": 11, "LKnee": 12, "LAnkle": 13,
    "REye": 14, "LEye": 15,
    "REar": 16, "LEar": 17
}

# Joint connections
pose_parts = [
    ["Neck", "RShoulder"], ["Neck", "LShoulder"],
    ["RShoulder", "RElbow"], ["RElbow", "RWrist"],
    ["LShoulder", "LElbow"], ["LElbow", "LWrist"],
    ["Neck", "RHip"], ["RHip", "RKnee"], ["RKnee", "RAnkle"],
    ["Neck", "LHip"], ["LHip", "LKnee"], ["LKnee", "LAnkle"],
    ["Neck", "Nose"],
    ["Nose", "REye"], ["REye", "REar"],
    ["Nose", "LEye"], ["LEye", "LEar"]
]

# Read an image (or uncomment the next line for a camera stream)
cap = cv2.VideoCapture("a2.jpg")
# cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)

# Load the model
net = cv2.dnn.readNetFromTensorflow("pose.pb")

while cv2.waitKey(1) < 0:
    ok, frame = cap.read()
    if not ok:
        cv2.waitKey()
        break
    width = frame.shape[1]
    height = frame.shape[0]
    net.setInput(cv2.dnn.blobFromImage(
        frame, 1.0, (368, 368), (127, 127, 127), swapRB=True, crop=False
    ))
    out = net.forward()
    # Keep only the first 19 channels (18 body parts + background)
    out = out[:, :19, :, :]
    points = []
    for i in range(len(body_parts)):
        heatmap = out[0, i, :, :]
        _, conf, _, point = cv2.minMaxLoc(heatmap)
        x = (width * point[0]) / out.shape[3]
        y = (height * point[1]) / out.shape[2]
        points.append((int(x), int(y)) if conf > 0.2 else None)
    for p in pose_parts:
        part_from = p[0]
        part_to = p[1]
        id_from = body_parts[part_from]
        id_to = body_parts[part_to]
        if points[id_from] and points[id_to]:
            # Draw the connecting line
            cv2.line(frame, points[id_from], points[id_to], (0, 200, 0), 3)
            # Draw the key points
            cv2.ellipse(frame, points[id_from], (3, 3), 0, 0, 360, (0, 0, 200), cv2.FILLED)
            cv2.ellipse(frame, points[id_to], (3, 3), 0, 0, 360, (0, 0, 200), cv2.FILLED)
    cv2.imshow("Pose", frame)
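The listing stops at the display loop; when running it yourself you will typically also want the standard OpenCV cleanup afterwards (not in the original code, but good practice):

# Release the capture and close any OpenCV windows after the loop
cap.release()
cv2.destroyAllWindows()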

Effect

(Result image: the input picture with the detected key points and joint connections drawn on it.)

Origin blog.csdn.net/qq_46556714/article/details/130952146