Implement gesture recognition with Python in 10 minutes

Environment preparation

① Get the Baidu SDK

Search for Baidu Cloud in your browser. If you have not registered yet, register first, then log in and open the management console. On the left, click Product Services → Artificial Intelligence → Human Body Analysis. Click Create Application, enter an application name such as "Baidu_OCR", select a purpose such as "Learning Office", add a short application description, and click "Create Now". A list of applications will appear, including the AppID, API Key, Secret Key and other information, which will be used later.
After opening the application, enable the required APIs (here, gesture recognition and speech synthesis).

② Required libraries

The whole program is implemented in Python. The libraries involved are cv2, threading, time, playsound, and baidu-aip; threading and time are part of the standard library, while the rest are third-party. If you are missing any of them, press Win+R, type cmd to open a command-line terminal, and run pip install <library name>.
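For reference, the install commands are as follows (the cv2 module is provided by the opencv-python package):

	pip install opencv-python
	pip install baidu-aip
	pip install playsound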

Procedure

① Turn on the camera

Here we use cv2 to turn on the camera.

	capture = cv2.VideoCapture(0)  # 0 is the default camera
	def camera():
	
	    while True:
	        # read a frame:
	        # ret is True or False, indicating whether a frame was grabbed
	        # frame is the captured frame itself
	        ret, frame = capture.read()
	        # cv2.imshow(window name, image to display)
	        # show the frame
	        cv2.imshow('frame', frame)
	        if cv2.waitKey(1) == ord('q'):
	            break
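Note that the loop never frees the camera or closes the preview window when it exits. A minimal cleanup sketch (not in the original code), placed after the loop and using standard OpenCV calls:

	        # after the loop exits, free the camera and close the window
	        capture.release()
	        cv2.destroyAllWindows()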

② Gesture recognition

Gesture recognition is implemented by calling Baidu's API. First we grab a frame from the camera; after encoding it into an image format, it is passed to Baidu's gesture function as a parameter.

	def gesture_recognition():
	
	    while True:
	        try:
	            # ret is True or False, indicating whether a frame was grabbed;
	            # frame is the captured frame itself
	            ret, frame = capture.read()
	
	            # encode the frame as JPEG and take the raw bytes
	            image = cv2.imencode('.jpg', frame)[1].tobytes()
	
	            gesture = gesture_client.gesture(image)   # AipBodyAnalysis method
	            # get the gesture name
	            words = gesture['result'][0]['classname']
	            # voice announcement
	            voice(hand[words])
	            print(hand[words])
	
	        except Exception:
	            voice('识别失败')  # announce "recognition failed"
	        if cv2.waitKey(1) == ord('q'):
	            break
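For orientation, the dict returned by gesture() has roughly the following shape (an illustrative example based on the fields Baidu documents for this endpoint; actual values will differ):

	{
	    'log_id': 1234567890,
	    'result_num': 1,
	    'result': [{
	        'classname': 'Ok',         # gesture label; the key looked up in hand
	        'probability': 0.98,       # confidence score
	        'top': 100, 'left': 120,   # bounding box of the detected hand
	        'width': 180, 'height': 200
	    }]
	}

This is why the code reads gesture['result'][0]['classname']: result is a list of detected hands, and classname is the label of the first one.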

③ Voice broadcast

After the gesture is recognized, we merely print the result in the window, which is not very impressive. Can we announce the result by voice instead? Yes: the playsound library makes this easy. There is one problem with playsound, though: it does not release the audio file, meaning a given file can only be played once; trying to play it again raises an "access denied" error.
Solution: one common workaround is sketched after the code below.

	def voice(words):
	    # speech synthesis function
	    result = client.synthesis(words, 'zh', 1, {
	        'vol': 5,  # volume
	    })
	    # on success the SDK returns the mp3 bytes; on failure it returns an error dict
	    if not isinstance(result, dict):
	        # write to file
	        with open('./res.mp3', 'wb') as f:
	            f.write(result)
	        # play the audio
	        playsound('./res.mp3')
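Here is a minimal workaround sketch, assuming the usual cause (the player keeps res.mp3 locked on Windows): write each clip to a uniquely named temporary file and delete it after playback, so playsound never has to reopen the same file. This is my own sketch, not necessarily the fix behind the original link:

	import os
	import tempfile
	
	def voice(words):
	    # speech synthesis function
	    result = client.synthesis(words, 'zh', 1, {'vol': 5})
	    if not isinstance(result, dict):
	        # a fresh temp file per clip avoids the "access denied" replay error
	        fd, path = tempfile.mkstemp(suffix='.mp3')
	        with os.fdopen(fd, 'wb') as f:
	            f.write(result)
	        playsound(path)  # blocks until playback finishes
	        os.remove(path)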

Results

Source code

	import cv2
	from aip import AipBodyAnalysis
	from aip import AipSpeech
	from threading import Thread
	from playsound import playsound
	
	""" your APP_ID, API Key, Secret Key """
	APP_ID = '********'
	API_KEY = '********'
	SECRET_KEY = '********'
	
	# map gesture classnames to the (Chinese) phrases to announce
	hand = {'One': '数字1', 'Five': '数字5', 'Fist': '拳头', 'Ok': 'OK',
	        'Prayer': '祈祷', 'Congratulation': '作揖', 'Honour': '作别',
	        'Heart_single': '比心心', 'Thumb_up': '点赞', 'Thumb_down': 'Diss',
	        'ILY': '我爱你', 'Palm_up': '掌心向上', 'Heart_1': '双手比心1',
	        'Heart_2': '双手比心2', 'Heart_3': '双手比心3', 'Two': '数字2',
	        'Three': '数字3', 'Four': '数字4', 'Six': '数字6', 'Seven': '数字7',
	        'Eight': '数字8', 'Nine': '数字9', 'Rock': 'Rock', 'Insult': '竖中指',
	        'Face': '脸'}
	
	# speech synthesis client
	client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
	
	# gesture recognition client
	gesture_client = AipBodyAnalysis(APP_ID, API_KEY, SECRET_KEY)
	
	capture = cv2.VideoCapture(0)  # 0 is the default camera
	
	def camera():
	    while True:
	        # ret is True or False, indicating whether a frame was grabbed;
	        # frame is the captured frame itself
	        ret, frame = capture.read()
	        # cv2.imshow(window name, image to display)
	        cv2.imshow('frame', frame)
	        if cv2.waitKey(1) == ord('q'):
	            break
	
	Thread(target=camera).start()  # run the preview in a thread so recognition does not freeze it
	
	def gesture_recognition():
	    while True:
	        try:
	            ret, frame = capture.read()
	
	            # encode the frame as JPEG and take the raw bytes
	            image = cv2.imencode('.jpg', frame)[1].tobytes()
	
	            gesture = gesture_client.gesture(image)   # AipBodyAnalysis method
	            words = gesture['result'][0]['classname']
	
	            voice(hand[words])
	            print(hand[words])
	
	        except Exception:
	            voice('识别失败')  # announce "recognition failed"
	        if cv2.waitKey(1) == ord('q'):
	            break
	
	def voice(words):
	    # speech synthesis function
	    result = client.synthesis(words, 'zh', 1, {
	        'vol': 5,  # volume
	    })
	    # on success the SDK returns the mp3 bytes; on failure it returns an error dict
	    if not isinstance(result, dict):
	        with open('./res.mp3', 'wb') as f:
	            f.write(result)
	        playsound('./res.mp3')
	
	gesture_recognition()

Effect video
