The key parts: extraction of gesture contours, gesture tracking, and gesture recognition


    The main purpose of gesture extraction is to separate the hand from a relatively complex background. Simple grayscale binarization does not give good results. The colored-glove approach solves this problem and can work well, but it introduces unnecessary inconvenience. After weighing the options, our design extracts the gesture contour from skin color; in experiments the effect of this method is clear, and in most cases it gives satisfactory results.

    Gesture tracking takes the form of prediction followed by relocation. In earlier experiments some people used the optical-flow method for tracking, but because it is easily affected by the environment the results were not convincing. Others used the CamShift tracking algorithm combined with Kalman filtering to track arbitrary objects, but that approach has serious drawbacks for fast-moving objects or objects whose motion is not linear, and it does not adapt well. For this reason we use a more advanced tracking algorithm: a particle filter based on Kalman filtering, which works very well for randomly moving objects and can track several objects at the same time, meeting the needs of gesture recognition.

    For static gesture recognition we locate the fingertips first and simply use the number of fingertips as the static feature. This method has high accuracy and essentially no errors. Another option is to extract feature vectors and classify the gestures with a support vector machine; in experiments this method recognizes many gestures, more than 20 kinds if the input is standard, but it places higher demands on the environment. Since the static gestures our system requires are not complicated, we adopt the former method.

    For dynamic gesture recognition, the current requirements do not call for overly complicated functionality. It is enough to track the gesture centroid: the dynamic feature vector is obtained by taking the difference between the positions of the valid contour centroid in two adjacent frames.

The two most critical parts of the project:

             a. Accurate and complete extraction and separation of gesture contours

             b. Efficient and accurate tracking of localized gesture contours

1. Relevant basic knowledge

The data representation of pictures in the computer:

1. Pixel format

Image files come in many formats, such as jpg, bmp, png, gif

Image storage format: matrix   

The file structure of an image: file header, information header, color palette, image data (pixels)

2. Vector format: the image can be enlarged without loss of fidelity


3. There are two main representations of color spaces: 

  a. RGB

  b. HSV (hue, saturation, value)

2. Image preprocessing

Before extracting the skin regions from the image, some preprocessing is necessary in order to obtain better results.

The preprocessing removes salt-and-pepper noise and makes the color of image regions more continuous.

It also makes the edges in the image clearer, which not only improves recognition accuracy but also keeps the algorithm stable.

         There are two main filtering methods:

a. Median filtering

b. Gaussian filtering

Gaussian filtering is essentially a signal filter whose purpose is to smooth the signal. When digital images are used in later processing stages, noise is the biggest problem: because errors accumulate as they propagate through the processing chain, many image-processing textbooks introduce the Gaussian filter very early as a way to obtain images with a higher signal-to-noise ratio (SNR), i.e. images that better reflect the real signal. Related to this is the Laplacian of Gaussian: to obtain better image edges, the image is first smoothed with a Gaussian filter to remove noise, then the second derivative is computed, and the zero crossings of the second derivative are used to locate the edges. Gaussian filtering is a product in the frequency domain and a convolution in the spatial domain.
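The text gives no code; as a rough preprocessing sketch (Python with OpenCV is assumed here, since the algorithms discussed are standard OpenCV material), the two filters can be applied to a frame before segmentation:

```python
import cv2

frame = cv2.imread("frame.jpg")                     # one captured camera frame (placeholder path)
denoised = cv2.medianBlur(frame, 5)                 # 5x5 median filter: removes salt-and-pepper noise
smoothed = cv2.GaussianBlur(denoised, (5, 5), 1.5)  # Gaussian smoothing before skin/edge extraction
```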

3. Extract gesture contours according to skin color, and perform morphological processing on the image (erosion and dilation, opening and closing operations)

 

Human skin color space HSV range:

           H: (0, 40) ∪ (150, 180)     S: (30, 170)     V: (30, 256)
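A minimal segmentation sketch, assuming Python/OpenCV (where hue is stored as 0-179 and S, V as 0-255); the two hue intervals above are combined, then an opening and a closing clean up the mask:

```python
import cv2
import numpy as np

def extract_skin_mask(frame_bgr):
    """Skin segmentation in HSV using the ranges above, followed by opening and closing."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Two hue intervals: (0, 40) and (150, 180); S in (30, 170); V in (30, 256)
    lower1, upper1 = np.array([0, 30, 30]), np.array([40, 170, 255])
    lower2, upper2 = np.array([150, 30, 30]), np.array([180, 170, 255])
    mask = cv2.inRange(hsv, lower1, upper1) | cv2.inRange(hsv, lower2, upper2)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # erosion then dilation: removes specks
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # dilation then erosion: fills small holes
    return mask
```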

                    

                  [Overall flow chart: image acquisition and preprocessing]

4. Extract the contour and compute its various features (intrinsic features); a code sketch computing several of these follows the list:

a. The centroid of the contour

b. The shortest and longest diameters of the contour

c. The circumcircle of the contour (center and radius)

d. The perimeter and area of the contour

e. The position of the contour's bounding rectangle in the image

f. The aspect ratio of the contour's bounding rectangle, and the ratio of the centroid's distances along the length direction (needed when removing false contours);

g. A nine-dimensional feature vector of the contour, sampled at equal intervals outward from the centroid

h. The set of convex-hull (outer envelope) points of the contour

i. Contour point set

j. Corner detection on the contour (the set of corner points)

k. Moments of various orders of contour

l. Extraction of effective feature vectors of contours

m. Positioning of fingertips
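As a sketch of how several of the listed features can be computed (OpenCV assumed; the function and variable names are illustrative, and the original gives no code):

```python
import cv2

def contour_features(mask):
    """Find the largest skin contour in a binary mask and compute some of the features above."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4 signature
    if not contours:
        return None
    cnt = max(contours, key=cv2.contourArea)                # keep the largest contour as the hand candidate

    m = cv2.moments(cnt)                                    # k. moments of various orders
    if m["m00"] == 0:
        return None
    centroid = (m["m10"] / m["m00"], m["m01"] / m["m00"])   # a. centroid
    circle = cv2.minEnclosingCircle(cnt)                    # c. enclosing circle (center, radius)
    perimeter = cv2.arcLength(cnt, True)                    # d. perimeter
    area = cv2.contourArea(cnt)                             # d. area
    x, y, w, h = cv2.boundingRect(cnt)                      # e. bounding rectangle
    aspect_ratio = w / float(h)                             # f. aspect ratio of the bounding rectangle
    hull = cv2.convexHull(cnt)                              # h. convex hull point set
    hu = cv2.HuMoments(m).flatten()                         # invariant moments derived from k.

    return {"centroid": centroid, "circle": circle, "perimeter": perimeter,
            "area": area, "bbox": (x, y, w, h), "aspect_ratio": aspect_ratio,
            "hull": hull, "hu_moments": hu}
```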

 

[Overall flow chart: contour extraction and feature extraction]

 

5. Localization and tracking

 The principle of particle filter tracking (to be continued)
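Since the principle itself is left for later, below is only a minimal, illustrative bootstrap particle filter for tracking a 2D centroid with a random-walk motion model; the class name, particle count, and noise parameters are assumptions for the sketch, not values from the original design:

```python
import numpy as np

class ParticleTracker2D:
    """Minimal bootstrap particle filter for one 2D centroid (illustrative sketch only)."""

    def __init__(self, init_xy, n_particles=200, motion_std=15.0, meas_std=20.0):
        self.particles = np.tile(np.asarray(init_xy, dtype=float), (n_particles, 1))
        self.weights = np.full(n_particles, 1.0 / n_particles)
        self.motion_std = motion_std    # spread of the random-walk prediction step
        self.meas_std = meas_std        # tolerance around the measured centroid

    def predict(self):
        # Prediction: diffuse every particle according to the random-walk motion model.
        self.particles += np.random.normal(0.0, self.motion_std, self.particles.shape)

    def update(self, measured_xy):
        # Correction: weight particles by closeness to the measured contour centroid.
        d2 = np.sum((self.particles - np.asarray(measured_xy, dtype=float)) ** 2, axis=1)
        self.weights = np.exp(-d2 / (2.0 * self.meas_std ** 2))
        self.weights /= self.weights.sum() + 1e-12
        # Resampling: redraw particles in proportion to their weights.
        idx = np.random.choice(len(self.particles), len(self.particles), p=self.weights)
        self.particles = self.particles[idx]
        self.weights.fill(1.0 / len(self.particles))

    def estimate(self):
        # The tracked position is the mean of the particle cloud.
        return self.particles.mean(axis=0)
```

Each frame the tracker calls predict(), measures the skin-contour centroid, calls update(), and reads estimate(); running several independent filters allows several hands to be tracked at once, as the overview mentions.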

 

 

 

[Overall flow chart: gesture tracking and recognition, locating the tracked target]

6. Static Gesture Recognition

 A. Classification with a support vector machine based on the feature vectors

The principle of SVM (to be continued)

 B. By the number of fingertips (currently this method is more accurate)
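The text does not specify how the fingertips are located; one common heuristic is to count the deep convexity defects between the hand contour and its convex hull, sketched below (the angle and depth thresholds are illustrative assumptions):

```python
import cv2
import numpy as np

def count_fingers(cnt):
    """Estimate the number of raised fingers from a hand contour (convexity-defect heuristic)."""
    hull_idx = cv2.convexHull(cnt, returnPoints=False)
    if hull_idx is None or len(hull_idx) < 4:
        return 0
    defects = cv2.convexityDefects(cnt, hull_idx)
    if defects is None:
        return 0
    gaps = 0
    for s, e, f, depth in defects[:, 0]:
        start, end, far = cnt[s][0], cnt[e][0], cnt[f][0]
        a = np.linalg.norm(end - start)
        b = np.linalg.norm(far - start)
        c = np.linalg.norm(far - end)
        cos_angle = np.clip((b * b + c * c - a * a) / (2 * b * c + 1e-9), -1.0, 1.0)
        # A deep, narrow defect (angle below 90 degrees) usually lies between two fingers.
        if np.arccos(cos_angle) < np.pi / 2 and depth > 10000:   # depth is in 1/256-pixel units
            gaps += 1
    return gaps + 1 if gaps > 0 else 0
```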

7. Dynamic gesture recognition

  At present there is no need to implement overly complicated functionality (it is enough to obtain the centroid of the current contour); the dynamic feature vector is obtained from the change in the contour's centroid position between adjacent frames.
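A tiny sketch of that feature (hypothetical helper; the text only states that the feature is the centroid displacement between adjacent frames):

```python
def dynamic_feature(prev_centroid, curr_centroid):
    """Dynamic feature: displacement of the contour centroid between two adjacent frames."""
    dx = curr_centroid[0] - prev_centroid[0]
    dy = curr_centroid[1] - prev_centroid[1]
    return dx, dy
```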

If the functionality needs to be extended, there are two options to choose from:

A. Accumulate and save the positions of the valid contour centroid to form a trajectory and compare it with a template,

or use the same method as text recognition: obtain a trajectory feature vector and classify it with a support vector machine.

B. From the dynamic feature vector, convert the displacement into an angle to perform dynamic feature encoding, and collect the code sequence.

Recognition is then done either with the HMM method, or by using the dynamic feature codes over an interval as a feature vector and classifying with a support vector machine.
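A sketch of option B's encoding step, assuming the displacement angle is quantized into eight direction codes (the number of codes is an assumption; the original does not fix it):

```python
import math

def direction_code(dx, dy, n_codes=8):
    """Quantize a centroid displacement into one of n_codes direction codes (0 .. n_codes-1)."""
    angle = math.atan2(dy, dx) % (2 * math.pi)            # displacement angle in [0, 2*pi)
    return int(angle / (2 * math.pi / n_codes)) % n_codes

# A sequence of displacements becomes a code sequence that can be fed to an HMM,
# or histogrammed into a fixed-length vector for an SVM.
codes = [direction_code(dx, dy) for dx, dy in [(5, 0), (4, 4), (0, 6)]]   # -> [0, 1, 2]
```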

Demo of static gesture recognition and gesture tracking: http://v.youku.com/v_show/id_XNzIwNjU2MzEy.html?firsttime=0

