Facial paralysis detection technology based on image recognition


1 Introduction

1.1 Basis and significance of topic selection

Correctly assessing the degree of a patient's facial paralysis and taking medical measures as early as possible is very important, because facial neuritis greatly disturbs patients' lives and psychology. In many cases facial paralysis is not sudden but worsens over a period: some patients only occasionally feel abnormal facial discomfort at the onset and do not take it seriously, so the window in which the condition is easiest to cure is missed. The system described in this article helps patients detect the disease and begin treatment as early as possible. However, since there is no unified, recognized facial paralysis grading standard at home or abroad, the grades assigned by different doctors differ and depend on subjective factors; within a short treatment cycle it is difficult for a doctor to accurately judge the treatment effect from the patient's facial features alone, which can easily lead to misdiagnosis and delayed treatment. Building this system not only helps doctors formulate reasonable treatment plans; patients can also use it at home to obtain an intuitive analysis of their condition, participate in understanding it throughout the process instead of blindly following the doctor, and avoid unnecessary conflicts caused by differences in subjective judgment between patient and doctor. It also reduces the doctor's psychological burden: data computed by machine provide a theoretical basis for the doctor's judgment, increase trust between doctor and patient, reduce misunderstanding, and make communication about the condition efficient, forming a virtuous circle.

At present, many hospitals in China use systems built on face recognition technology for clinical diagnosis. It is already widely applied to abnormal body shape and organ hypertrophy, genetic diseases, and hypercortisolism, and scholars are now studying its application to fetal alcohol syndrome (FAS), myalgic encephalomyelitis, and other diseases. The technology assists doctors in objectively judging a disease's condition and development trend; it is simple and efficient to use and a model of the combination of computing with other industries.

1.2 Research status and development trend at home and abroad                            

Image recognition proceeds through computer image acquisition and processing for a specific scenario, target analysis and matching, feature extraction, and training of a classification model. Through the efforts of scientists at home and abroad it has advanced rapidly, and people have begun applying it to medicine, agriculture, security, transportation, and vehicles. Against this background, many advanced medical methods are inseparable from the support of image recognition technology: for routine examinations such as chest X-rays and electrocardiograms, preliminary computer-provided diagnostic information can be obtained through self-service retrieval before consulting a doctor, and the technology extends to minimally invasive surgery, brain CT, cardiac pathology analysis, tuberculosis image recognition, and retinal imaging for diabetic patients [i].

In recent years, medical scholars have produced many ideas and studies in pursuit of a facial paralysis grading system that can gain general consensus. Foreign papers on grading facial paralysis have proposed more than 20 methods; among them the most widely used are the Toronto Facial Grading System (TFGS), the Facial Nerve Function Index (FNFI), and the House-Brackmann (HB) system. In China, the China Association of Traditional Chinese and Western Medicine and the Chinese Medical Association drafted facial paralysis evaluation standards in 2005 and 2006 respectively, but these have not been widely adopted. Later, Huashan Hospital developed its own facial paralysis grading system (HFGS); testing showed good efficiency and reliability. It covers all facial nerve branches, divides each branch into several sub-items with detailed test procedures, and was initially promoted within Huashan Hospital [ii].

Combining the above analysis: many facial paralysis grading systems exist at home and abroad, but each has its own advantages and disadvantages, so building an efficient, widely accepted, and accurate grading system still requires in-depth research.

1.3 Research content and technical route of this subject

Use a camera to collect the user's face photos in real time; recognize and dynamically label several key positions such as the nose tip, eye corners, mouth corners, and philtrum; from the feature points' coordinates, compute the symmetry of the left and right feature points relative to the midline and output a symmetry evaluation result. Complete the algorithm and the software and hardware system, and produce test reports for no fewer than 50 people.
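As an illustration of the symmetry computation described above, the following NumPy sketch (with hypothetical helper names; the system's actual landmark detector is not shown) mirrors each left-side feature point across the facial midline and measures how far it lands from its right-side counterpart:

```python
import numpy as np

def symmetry_score(left_pts, right_pts, midline_x):
    """Mean mirror distance between left- and right-side landmarks.

    left_pts / right_pts: matched (x, y) landmark pairs (e.g. eye
    corners, mouth corners); midline_x: x-coordinate of the facial
    midline (e.g. taken from the nose tip). Returns 0 for a perfectly
    symmetric face; larger values mean stronger asymmetry.
    """
    left = np.asarray(left_pts, dtype=float)
    right = np.asarray(right_pts, dtype=float)
    mirrored = left.copy()
    mirrored[:, 0] = 2.0 * midline_x - left[:, 0]  # reflect across the midline
    return float(np.mean(np.linalg.norm(mirrored - right, axis=1)))

# Mouth corners placed symmetrically around x = 100 score 0:
print(symmetry_score([(80, 200)], [(120, 200)], midline_x=100))  # 0.0
```

In the real system the landmark coordinates would come from the face detection stage of chapter 5; here they are supplied by hand.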


2 Grades of facial paralysis

2.1 Understanding Facial Paralysis

Facial nerve palsy (facial paralysis) is a common disorder in which facial nerve injury causes facial nerve dysfunction, resulting in impaired facial expression. The typical symptom is a crooked mouth and eyes; in severe cases the patient loses functions such as closing the eyes, raising the eyebrows, and smiling, which seriously affects normal life [1]. There are many possible causes, but the medical community has not yet identified the specific cause of the disease; genetic, viral, immunological, rheumatological, blood-transport, and vascular-compression theories are among those recognized by researchers [2]. By site of onset, facial paralysis is divided into central and peripheral types [3]. The former usually does not impair facial expression, though the corner of the mouth may be pulled to one side; the latter causes muscle paralysis that seriously affects life, so that only the healthy side can exert force when smiling. Long-term illness harms the patient physically and psychologically, affects the body's blood circulation, hinders normal social interaction, and leads to slurred, unclear speech when the patient cannot communicate normally with others.

2.2 Existing methods for judging the grade of facial paralysis

The HB (House-Brackmann) facial paralysis grading system was approved by the Facial Nerve Disorders Committee in 1984. By observing the degree of movement of the facial features and muscles when the patient closes the eyes, grins, puffs the cheeks, wrinkles the nose, and raises the eyebrows, it divides facial paralysis into six levels; but the boundaries between levels are blurred and usability is low, so it still needs improvement. The Yanagihara facial paralysis grading system (YFGS) [4] scores ten facial movement states, each worth up to four points for a total of forty, including rest, frowning, smiling, closing the corners of the mouth, and lightly closing the eyes. The temperature-data visualization method [5] relies on the fact that when facial muscles lose all or part of their function, blood flow becomes abnormal and, under infrared imaging, the temperature distribution of the left and right sides of the face becomes asymmetric; drawing a histogram of muscle temperature along a given direction makes the severity of the paralysis visible. The Huashan facial grading system (HFGS) of Huashan Hospital tests 6 facial nerve branches, 5 of which consist of 2-3 small inspection items, and the researchers provided a scoring basis for each item. For example, the scoring standard for eye closure is: 5 points if the distance between the upper and lower eyelids is greater than or equal to 3 mm, 4 points for 2-3 mm, and 3 points for 0-2 mm.
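The HFGS eye-closure rule quoted above maps directly onto a small scoring function. This is a literal encoding of the quoted thresholds for illustration, not the hospital's software:

```python
def hfgs_eye_closure_score(gap_mm: float) -> int:
    """HFGS eye-closure sub-score as quoted in the text: the residual
    distance between upper and lower eyelids (in mm) maps to a score."""
    if gap_mm >= 3.0:
        return 5
    if gap_mm >= 2.0:
        return 4
    return 3  # 0-2 mm

print(hfgs_eye_closure_score(3.5))  # 5
```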

3 Edge detection

3.1 Basic principle of edge detection

In production and everyday applications, the edges to be extracted are usually the places where image changes carry the most information. The goal of edge detection is to find the set of pixels where the intensity exhibits a step change or a roof change.

Because an edge is where the gray level of the image changes most drastically, the most straightforward approach is to compute derivatives there. For the step change shown in Figure 3-1, the edge is where the first derivative reaches its maximum, equivalently where the second derivative crosses zero. For the roof change shown in Figure 3-2, the edge is where the first derivative is zero and the second derivative reaches its extremum.

      

Figure 3-1 Step change Figure 3-2 Roof change             

General steps for edge detection:

1. Filtering. Edge detection algorithms are based mainly on the first and second derivatives of image intensity, which are very sensitive to noise, so a filter must be applied first to improve the detector's performance. The most commonly used method is Gaussian filtering: a discrete Gaussian function generates a set of normalized Gaussian kernels, and each point of the image's grayscale matrix is replaced by a Gaussian-weighted sum of its neighborhood.

2. Enhancement. Edges are made more prominent by computing, for every point, the intensity of pixel change in its neighborhood; the algorithm emphasizes points whose gray value changes drastically nearby. In practice this is determined by computing the gradient magnitude.

3. Detection. After the enhancement step there are usually many points with a large gradient in the vicinity of an edge, yet not all of them are real edge points, so some method is needed to exclude them; in practice a high/low double-threshold strategy is commonly used for screening.
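The three steps above can be sketched in a few lines of NumPy (an illustrative sketch, not the thesis's implementation; the kernel and the threshold values are arbitrary example choices):

```python
import numpy as np

def detect_edges(img, low, high):
    """Filter -> enhance -> detect. Returns a label map:
    2 = strong edge, 1 = weak edge, 0 = no edge."""
    img = np.asarray(img, dtype=float)
    # 1. Filtering: smooth with a normalized 3x3 Gaussian kernel.
    k = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float) / 16.0
    pad = np.pad(img, 1, mode="edge")
    h, w = img.shape
    smooth = sum(k[i, j] * pad[i:i + h, j:j + w]
                 for i in range(3) for j in range(3))
    # 2. Enhancement: gradient magnitude from finite differences.
    gy, gx = np.gradient(smooth)
    mag = np.hypot(gx, gy)
    # 3. Detection: screen with high and low thresholds.
    return np.where(mag >= high, 2, np.where(mag >= low, 1, 0))

# A vertical step between a dark and a bright half is labelled strong
# at the step itself and weak just beside it:
img = np.zeros((8, 8)); img[:, 4:] = 100.0
labels = detect_edges(img, low=5.0, high=30.0)
```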

3.2 Common edge detection operators

3.2.1 First-order differential operator: Sobel

The Sobel operator is a discrete differential operator used to approximate the gradient of the image's gray levels. It combines Gaussian smoothing and differentiation and is also called a first-order differential operator or derivative operator; convolving in the horizontal and vertical directions yields the gradient images in the X and Y directions. To understand the Sobel algorithm it helps to first pin down what an edge is: the place where pixel values jump. Edges are among the salient features of an image and play an important role in feature extraction, object detection, and pattern recognition. To find them, a differential calculation is performed over the region of interest: the larger Δf(x) is, the greater the pixel change in the X direction and the stronger the edge signal. The Sobel algorithm therefore slides a three-by-three template across the image in the x and y directions, takes the nine pixels under the template, and performs a convolution to measure the maximum change in that direction. There are several options for computing the magnitude; usually the horizontal and vertical directions are used, or other angles depending on the situation.

The Sobel detector works well on certain images (for example those with many gray gradients and noise), but it localizes edges inaccurately: the detected edge is often more than one pixel wide (as shown in Figure 3-3). When high edge-extraction accuracy is not required, the Sobel operator is a common choice.

Figure 3-3 sobel edge detection
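The 3×3 templates and the convolution described above can be written out explicitly (a NumPy sketch for illustration). Note how both columns adjacent to a step respond, which is the "edge wider than one pixel" behaviour just mentioned:

```python
import numpy as np

# The two 3x3 Sobel templates: GX responds to change along x
# (vertical edges), GY to change along y (horizontal edges).
GX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
GY = GX.T

def sobel_magnitude(img):
    """Slide both templates over the image (valid region only) and
    return the gradient magnitude sqrt(Gx^2 + Gy^2)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2)); gy = np.zeros_like(gx)
    for i in range(3):
        for j in range(3):
            win = img[i:i + h - 2, j:j + w - 2]
            gx += GX[i, j] * win
            gy += GY[i, j] * win
    return np.hypot(gx, gy)

# On a vertical step 0 -> 10, BOTH columns next to the step respond,
# i.e. the detected edge is two pixels wide:
img = np.zeros((6, 6)); img[:, 3:] = 10.0
mag = sobel_magnitude(img)
```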

3.2.2 Second-order differential operator: Laplacian

The Laplacian operator is easily disturbed by noise, so it is generally not used to extract edges directly; it is mostly used to distinguish the bright and dark regions of an image. It extracts information from second derivatives. Compared with other operators its advantage is that it has no directionality: edge information from all orientations is weighted equally, so no orientation is lost. Losing any orientation information would interrupt the extracted edge, but this method compensates edge detection in all directions, making the image clearer and the edges less prone to interruption.
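A minimal sketch of the 4-neighbour Laplacian template shows both its lack of directionality and its sensitivity to isolated noise points (an illustration, not the system's code):

```python
import numpy as np

# The 4-neighbour Laplacian template: the second derivatives in x and y
# are summed, so every edge orientation is weighted the same.
LAP = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

def laplacian(img):
    """Apply the Laplacian template over the valid region of the image."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += LAP[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

# An isolated bright pixel (a noise point) gives the strongest response
# of all, which is why the operator is noise-sensitive:
img = np.zeros((5, 5)); img[2, 2] = 10.0
resp = laplacian(img)
```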

3.2.3 Non-differential edge detection operator: Canny algorithm

The Canny algorithm takes a different approach: by capturing comprehensive, effective information from images it reduces the amount of data to be processed and speeds up computation, and it is now widely used in image processing across industries. Its inventor found through experiments that the requirements for edge capture are basically the same no matter the occasion, and summarized general criteria for edge detection [i]:

1. The algorithm should extract edges as completely as possible: no real edges may be missed and no false edges detected.

2. The detected edge should be precisely positioned at the center of the true edge.

3. A given edge in the image should be marked only once, and, where possible, image noise should not produce spurious edges.

The Canny algorithm is one of the most frequently used edge detectors today. Derived from rigorous experiments by its predecessors, it achieves very effective and accurate edge capture; because it satisfies the above three criteria and its computation is simple and fast, it remains one of the most popular algorithms.

Canny edge detection algorithm can be divided into the following steps:

1. Blur the image and remove noise with a Gaussian filter.

Since we want to minimize the interference of noise with edge extraction, we must reduce or eliminate noise to lower the error probability. The image is therefore convolved with a Gaussian filter, blurring it and reducing the detector's misjudgments caused by noise. The filter kernel of a Gaussian filter is generally odd × odd in size (3×3 at minimum). If B is a 3×3 region of the image and e is the point to be smoothed, then after processing the pixel value of e becomes the Gaussian-weighted sum of the nine pixels of B:

e = Σi Σj H(i, j) · B(i, j),   i, j ∈ {1, 2, 3}    (1)

where H is the normalized 3×3 Gaussian kernel.

Formula (1) leaves open how to choose a suitable convolution kernel; its size directly affects the Canny algorithm's detection result. The larger the kernel, the less noise interferes, but the accuracy of edge localization also drops. Experience shows that a 5×5 kernel usually gives good results. Figure 3-5 shows the effect of 3×3 and 7×7 filter kernels respectively.

 

Figure 3-5 Gaussian filter kernel 3*3 and 7*7
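Generating the normalized Gaussian kernel from the discrete Gaussian function, as described in step 1, might look like this (sigma is a free parameter not specified in the text):

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Sample the discrete 2-D Gaussian on a size x size grid and
    normalize it so the weights sum to 1 (a normalized Gaussian kernel)."""
    assert size % 2 == 1, "kernel side length must be odd"
    ax = np.arange(size) - size // 2          # e.g. [-2, -1, 0, 1, 2]
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

k5 = gaussian_kernel(5, 1.0)  # the 5x5 size the text recommends
```

The kernel is symmetric and peaks at the centre, so smoothing weights fall off with distance from the pixel being processed.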

2. Edges can occur in any direction of the image, so the Canny operator uses four filters to detect edges in the horizontal, vertical, and diagonal directions. The first-order derivatives in the horizontal and vertical directions are computed and used to obtain the gradient magnitude and direction at each point. In this work the Sobel operator was chosen to derive the gradient and orientation.

3. Apply non-maximum suppression to weaken spurious responses from edge detection. Non-maximum suppression is an edge-thinning technique: relying only on the image gradient and extracting edges wherever the gradient changes gives a result that is neither clear nor accurate. Criterion 3 requires exactly one response per edge, which the second step alone cannot achieve; this step sets the gradient value of all points other than the local maximum to 0.

This step is applied to every point in the image; concretely:

(1) Take as references the two neighboring pixels of the current point along its gradient direction (the positive and negative gradient directions).

(2) If the gradient magnitude of the current point is larger than both reference points, classify it as an edge point; otherwise suppress it. In many cases a more accurate method is used: linearly interpolate the pixel values between the two reference points to obtain the true comparison baseline.

(3) Apply the double-threshold test, as shown in Figures 3-6 and 3-7, to extract reliable (strong) and potential (weak) edges.

        

        

   Figure 3-6 Canny algorithm low threshold Figure 3-7 Canny high threshold effect diagram       

4. After suppressing the pixels that are not local maxima, the remaining points present the image's true edges more faithfully, but some edge pixels caused by noise and color variation remain. To deal with these scattered responses, upper and lower thresholds are chosen: if an edge pixel's gradient value exceeds the upper threshold it is recorded as a strong edge; if it lies between the two thresholds it is recorded as a weak edge; if it is below the lower threshold it is suppressed. The thresholds depend on the specific image being examined; they are not adaptive and must be chosen case by case, which is why many improvements to the Canny algorithm have been proposed [ii].

5. After the steps above suppress isolated weak edges, the pixels identified as strong edges can immediately be taken as clear edges, since they form a subset of the image's real edges. Whether a weak edge should be kept remains to be verified: it may be a real edge or may be caused by noise or color change, and in general such spurious edges are not connected to real ones. So we examine each weak-edge pixel and its eight adjacent pixels: as long as one of the eight is a strong edge, the weak pixel is classified as an accurate edge.

3.3 Chapter Summary

From the above analysis and the screenshots of the experimental results: the Sobel algorithm's edge localization is not very accurate and leaves extra pixels along edges, but it works well on noisy, complex images; the Laplacian operator performs poorly on noisy images but separates bright and dark regions very well; and although the Canny operator is already a good detector, it still has shortcomings [iii], for example:

(1) After acquisition the image must be preprocessed, and Gaussian filtering requires choosing the kernel size; an improper choice may mean that neither noise filtering nor edge extraction achieves a good result.

(2) The Gaussian filter used by the Canny algorithm cannot remove salt-and-pepper or impulse noise, so many detected edges are actually noise.

(3) The fourth step of the Canny algorithm uses a double threshold to determine precise edges. In different scenarios the captured images are affected by lighting, brightness, and contrast, so the thresholds must be changed constantly; the algorithm is not adaptive, the processing is cumbersome, and without tuning for the practical application it yields pseudo-edges [iv].

4 Improved edge detection algorithm

4.1 Adaptive adjustment

Experiments show that pictures collected on different occasions differ in brightness and contrast, which affects edge detection (compare the strong and weak lighting in Figures 4.1 and 4.2): when the light is good more lines are detected, and when it is dark relatively few lines are extracted. In other words, in poor light the picture's details are badly lost, leading to errors in edge extraction and feature-point determination.

4.1.1 Adjust Brightness and Contrast

Usually an image-manipulation operator is a function: it takes one or more images as arguments and produces a new, modified image. Many kinds of changes exist; two are introduced here: point operations, used to adjust contrast and brightness, and neighborhood operations, used for filtering and other convolutions. The former commonly performs an arithmetic operation on each pixel of the picture, using two parameters that change the picture's contrast and brightness, called gain and offset respectively. The results are shown in Figure 4-1.

Figure 4-1 Changes before and after adjusting the brightness and contrast
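The gain/offset point operation can be sketched as follows (clipping to the 0-255 range is assumed, since the text does not state how overflow is handled):

```python
import numpy as np

def adjust(img, gain=1.0, offset=0.0):
    """Point operation g(x, y) = gain * f(x, y) + offset.
    gain scales contrast, offset shifts brightness; the result is
    clipped back into the valid 0-255 range."""
    out = gain * img.astype(float) + offset
    return np.clip(out, 0, 255).astype(np.uint8)

row = np.array([[0, 100, 200]], dtype=np.uint8)
print(adjust(row, gain=2.0, offset=0))  # [[  0 200 255]] -- 400 is clipped
```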

4.1.2 Spatial Filters

An ordinary filter consists of a neighborhood and an operation on the pixels within it: the computation acts on the pixels inside some rectangular window, applies a particular transformation to them, and outputs the transformed value as the new pixel value, producing a new picture. The window is usually called the kernel or template. According to the transformation, filters fall into two types, linear and non-linear; according to their effect on different photos they are usually divided into smoothing and sharpening: the former blurs the picture and reduces noise interference, while the latter makes edges clearer and accentuates the positions where the gray level changes.

The most common non-linear filter is the median filter, as shown in Figure 4-3. It takes all the pixels inside a rectangular window around a point and uses the median value as the point's new pixel value. The aim is to reduce the gray-level difference between a point and its surroundings and thereby remove isolated noise, so the method effectively resists salt-and-pepper noise. Besides removing noise, it also preserves edge content to a higher degree than an averaging filter, which tends to distort the picture.

It still has disadvantages. An improperly chosen window size easily produces bad results: with a small window the output image's details are preserved, but unnecessary noise is retained as well; with a larger window noise is removed effectively, but the output picture is distorted to varying degrees. Moreover, from the filter's principle, if the noise points in the window outnumber the image points there, the result will not be satisfactory; enlarging the window then mitigates the heavy noise slightly but again distorts the picture. As a rule of thumb from previous experiments, as long as the number of noise points is below 0.2 times the number of pixels in the whole picture, this filter works well; if the picture quality is poor and very noisy, the ordinary operation is useless.

Therefore, combining the above analysis, when filtering noise the kernel size can be adjusted on the fly according to the actual situation of the current operation region, with the median used as the replacement value. Such a self-adjusting filter can remove salt-and-pepper and other impulse noise, and because the window size adapts, the extracted edges are not left overly blurred.
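The ordinary (fixed-window) median filter described above can be sketched as follows, for comparison with the adaptive version of section 4.1.3 (an illustration, not the system's code):

```python
import numpy as np

def median_filter(img, ksize=3):
    """Fixed-window median filter: each pixel is replaced by the
    median of the ksize x ksize neighbourhood around it."""
    img = np.asarray(img, dtype=float)
    r = ksize // 2
    pad = np.pad(img, r, mode="edge")    # replicate edges at the border
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = np.median(pad[y:y + ksize, x:x + ksize])
    return out

# A single "salt" pixel in a flat region disappears completely:
img = np.full((5, 5), 10.0); img[2, 2] = 255.0
flat = median_filter(img)
```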

4.1.3 Description of Adaptive Median Filtering Algorithm

The self-adjusting (adaptive median) filter can remove impulse noise even when it occurs frequently, and it is better at preserving the fine details of the sample; these are its distinctive advantages. What it shares with an ordinary filter is a rectangular operating region, the kernel; the difference is that its kernel changes size according to the content of the region it slides over, just as different kernel sizes produced different effects when the Gaussian filter was introduced in section 3.2.3.

A detailed description of this algorithm uses the following notation:

f(x, y): the grayscale of the pixel whose abscissa is x and ordinate is y

Z_min: the minimum grayscale within the operating window

Z_med: the median grayscale within the operating window

Z_max: the maximum grayscale within the operating window

S_max: the maximum allowed area of the operating window

S_xy: the operating window for noise filtering, with (x, y) the coordinates of its center

First, the goal of the first step is to judge whether the median Z_med extracted within the selected window is a noise point. If Z_min < Z_med < Z_max, then Z_med is not a noise point. Otherwise (for example when Z_med equals Z_min or Z_max), Z_med is treated as noise, the window area must be enlarged, and a non-noise median is sought in the new window. If no non-noise median has been found by the time the window reaches S_max, the point is output with grayscale Z_med.

The next step identifies whether the point at coordinates (x, y) is itself a noise point. The test is the same as above with f(x, y) in place of Z_med: if Z_min < f(x, y) < Z_max, the point is kept; otherwise it is identified as noise and replaced by Z_med. The final result is shown in Figure 4-5.

Figure 4-2 plus salt and pepper noise Figure 4-3 median filter                    

    Figure 4-4 Gaussian filter Figure 4-5 Adaptive median filter               
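The two-stage procedure of section 4.1.3 can be written out as follows (a sketch using the standard adaptive-median formulation; `s_max` plays the role of S_max):

```python
import numpy as np

def adaptive_median(img, s_max=7):
    """Adaptive median filter: grow the window S_xy until its median
    Z_med is not itself noise (stage A), then keep f(x, y) if it is
    not noise, otherwise output Z_med (stage B)."""
    img = np.asarray(img, dtype=float)
    r_max = s_max // 2
    pad = np.pad(img, r_max, mode="edge")
    out = img.copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            for r in range(1, r_max + 1):        # window sizes 3, 5, ..., s_max
                win = pad[y + r_max - r:y + r_max + r + 1,
                          x + r_max - r:x + r_max + r + 1]
                z_min, z_med, z_max = win.min(), np.median(win), win.max()
                if z_min < z_med < z_max:        # stage A: median is usable
                    if not (z_min < img[y, x] < z_max):  # stage B: pixel is noise
                        out[y, x] = z_med
                    break
            else:                                 # no usable window up to S_max
                out[y, x] = z_med
    return out

# Both a salt pixel and a pepper pixel in a flat region are restored:
img = np.full((7, 7), 100.0); img[3, 3] = 255.0; img[1, 5] = 0.0
clean = adaptive_median(img)
```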

4.1.4 Histogram equalization

Histogram equalization, also known as histogram flattening, non-linearly stretches the image's pixel values so that the number of occurrences of each gray value becomes roughly the same; compared with the original, the pixel values that occurred most frequently are spread into a series of levels with uniform counts, as shown in Figure 4-8.

An image histogram accumulates, for each gray level (0-255), the number of pixels taking that level. The histogram (Figure 4-6) reflects the distribution of the image's gray levels; Figure 4-7 shows the direct relation between gray value and frequency, with the x-axis representing the gray level and the y-axis the number of pixels at that level. The whole plot describes the image's gray-level distribution, from which its characteristics can be read: for example, if most pixels are concentrated in the low gray levels, the image appears dark.

 

Figure 4-8 shows the result of equalizing the histogram of the sample: the original has obvious peaks and troughs, while after modification it becomes a smooth curve without prominent spikes. The principle is that after the change each gray value occurs roughly the same number of times, so the gray differences between some points are enlarged, producing the contrast-enhancement effect seen in Figure 4-9.
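Histogram equalization via the cumulative distribution, as described above, can be sketched as follows (a NumPy illustration, not the system's code):

```python
import numpy as np

def equalize(img):
    """Histogram equalization: each gray level g is remapped to
    round(255 * cdf(g)), which spreads frequent levels apart and
    flattens the histogram."""
    img = np.asarray(img, dtype=np.uint8)
    hist = np.bincount(img.ravel(), minlength=256)   # per-level counts
    cdf = np.cumsum(hist).astype(float)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # normalize to [0, 1]
    lut = np.round(255.0 * cdf).astype(np.uint8)     # lookup table
    return lut[img]

# A low-contrast image using only levels 100..103 gets stretched out:
img = np.repeat(np.arange(100, 104, dtype=np.uint8), 16).reshape(8, 8)
eq = equalize(img)
```

After equalization the four crowded levels are spread far apart across the 0-255 range, which is exactly the contrast-enhancement effect described above.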

4.1.5 Image Fusion

Comparing the Sobel and Canny results above, both have a smoothing component [i], but each has its own character. Regarding noise, Canny denoises well; Sobel is sensitive to noise, although this can be suppressed. Regarding edge extraction, Canny's connectivity looks better: its edges are precise and complete, but very thin and without strong/weak contrast. The edges extracted by Sobel have clear directionality and strength contrast, but are visually less accurate. Given these characteristics, the two images can be fused so that their respective advantages compensate for each other's disadvantages.
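One simple way to fuse the two edge maps is a weighted blend after rescaling (the weighting scheme here is an illustrative assumption; the text does not specify the exact fusion rule):

```python
import numpy as np

def fuse(sobel_mag, canny_edges, alpha=0.5):
    """Blend the two edge maps: Canny contributes thin, precise,
    well-connected edges; Sobel contributes strength contrast and
    direction. Both maps are rescaled to 0-255, then mixed as
    alpha * sobel + (1 - alpha) * canny."""
    def rescale(a):
        a = np.asarray(a, dtype=float)
        return 255.0 * (a - a.min()) / (a.max() - a.min() + 1e-12)
    blended = alpha * rescale(sobel_mag) + (1 - alpha) * rescale(canny_edges)
    return np.clip(blended, 0, 255).astype(np.uint8)

s = np.array([[0.0, 10.0], [20.0, 40.0]])      # a toy Sobel magnitude map
c = np.array([[0.0, 255.0], [255.0, 0.0]])     # a toy Canny edge map
f = fuse(s, c)
```

With alpha = 0.5 a pixel strong in only one map still appears at half strength, so the Canny connectivity and the Sobel strength contrast both survive in the fused image.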

 

 

4.2 Chapter Summary

Combining the methods described above, the system shows the visual effects intuitively. Figures 4-10 and 4-11 confirm that image detail is lost when the light is dim, while in bright light the image carries too much detail and cluttered information, which degrades the final face recognition result. Figures 4-12 and 4-13 show the effect of changing the threshold parameter of the Canny algorithm. To avoid repeating the working principle of the Canny algorithm, this chapter does not describe the method again; but combined with Figures 4-10 and 4-11, and Figures 3-5 and 3-6, it can be seen that adjusting the threshold significantly changes how much edge detail is described: the smaller the threshold, the more detail is retained. Therefore, raising the threshold in bright light removes unnecessary detail, and lowering the threshold in dim light preserves as much detail as possible, achieving the adaptive goal. In Figure 4-13 the edge loss is clearly severe, and the threshold should be adjusted to 20. Next, Figure 4-14 shows the result of adjusting brightness and contrast, and Figure 4-15 the result of the histogram operation. After histogram equalization, the relatively strong lines are noticeably thickened, while some of the lighter lines are filtered out. From the working principle of the algorithm, which equalizes the distribution of pixel values by modifying them directly, it follows that pixel values with few occurrences are weakened or removed.
This method effectively strengthens strong edges at the cost of losing some detail; this drawback, however, can be compensated by adjusting the Canny threshold. As for directly changing the brightness of the picture, Figure 4-14 shows that blindly adjusting the brightness and contrast of the image itself actually loses detail, so the influence of brightness on detail preservation does not refer to the brightness of the picture itself, but rather to external lighting and other factors.
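The adaptive-threshold idea summarised above (raise the Canny threshold in bright light, lower it in dim light) can be sketched as follows. The linear mapping from mean brightness to threshold, the bounds, and the "high = 2 x low" heuristic are illustrative assumptions, not values taken from the original system.

```python
import numpy as np

def adaptive_canny_thresholds(gray, low_floor=20, high_cap=200):
    """Pick Canny thresholds from the mean brightness of a grayscale image.

    Bright images get a higher low threshold (discard cluttered detail);
    dark images get a lower one (preserve as much detail as possible).
    """
    mean = float(np.mean(gray))  # 0 (dark) .. 255 (bright)
    # Linear map from brightness to the low threshold (illustrative).
    low = low_floor + (mean / 255.0) * (high_cap - low_floor) * 0.4
    # A common heuristic: the high threshold is about twice the low one.
    high = 2.0 * low
    return int(low), int(high)
```

The returned pair would then be passed to a Canny call such as OpenCV's `cv2.Canny(gray, low, high)`.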

 

5 Face Detection

After the above steps, we have obtained a face edge image that is as accurate as possible. Before acquiring the key points, we next need to extract the face region from the image. Commonly used methods include cascade classifiers based on Haar features, cascade classifiers based on LBP features, and the dlib library.

5.1 Cascade Classifiers Based on LBP or Haar Features

LBP, the Local Binary Pattern, takes the centre pixel value as a reference: each surrounding pixel is coded 1 if it is greater than or equal to the centre and 0 otherwise, and outputting these bits in binary form gives the local binary feature of the central pixel. From local binary features, image characteristics such as points, lines, edges, corners, and flat areas can be obtained. LBP has grayscale invariance and rotation invariance.
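The coding rule described above can be sketched for a single 3x3 neighbourhood as follows; the clockwise bit order starting at the top-left is one common convention, and implementations differ on this detail.

```python
import numpy as np

def lbp_code(patch):
    """Basic 3x3 LBP code for the centre pixel of `patch`.

    Each of the 8 neighbours is compared with the centre:
    >= centre -> bit 1, < centre -> bit 0. The 8 bits are read
    clockwise from the top-left corner to form one byte (0..255).
    """
    assert patch.shape == (3, 3)
    center = patch[1, 1]
    # Clockwise neighbour order starting at the top-left corner.
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << (7 - bit)
    return code
```

Applying this at every pixel (and histogramming the codes over cells) yields the texture description the classifier is trained on.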

Haar features have high inter-class variability (different features differ greatly from one another) and low intra-class variability (detection results for the same feature are basically consistent); they describe local intensity differences, are not invariant to illumination or rotation, and are computationally efficient. To detect a face with Haar features, k features can be selected on the face and then compared across n regions. Haar features involve floating-point computation while LBP uses integer computation, so the former is several times slower; however, on the same sample data a model trained with Haar features is more accurate than one trained with LBP, and LBP needs more sample data to achieve the same effect.
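The computational efficiency of Haar features comes from the integral image (summed-area table), with which any rectangle sum, and therefore any rectangle-difference feature, costs only four lookups. Below is a minimal sketch assuming the simplest two-rectangle feature (left half minus right half); the function names are illustrative, not from the original system.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row/column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, y, x, h, w):
    """Sum of pixels in the h x w rectangle with top-left corner (y, x)."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_two_rect_horizontal(ii, y, x, h, w):
    """Two-rectangle Haar-like feature: left half minus right half.

    Each half costs four integral-image lookups regardless of its size,
    which is why Haar cascades can evaluate many features so quickly.
    """
    half = w // 2
    left = rect_sum(ii, y, x, h, half)
    right = rect_sum(ii, y, x + half, h, half)
    return left - right
```

A bright-left/dark-right region yields a positive response, a dark-left/bright-right region a negative one.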

 

The principle of the cascade classifier [i]: the classifier is trained on a large number of positive and negative image samples, which makes training slow but detection fast. Each feature is a weak classifier, and multiple weak classifiers combine into a strong classifier. A weak classifier can be built as a decision tree (as shown in Figure 5-1); the extracted features are compared one by one with the classifier's features to judge whether they belong to a face. In fact, the strong classifier first lets each weak classifier vote, then assigns each classifier a weight according to its likelihood of misjudging, sums the votes by weight, and uses this weighted result as the reference for the final judgment.
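The weighted voting just described can be sketched as follows. The 0/1 votes and the AdaBoost-style decision rule (compare the weighted sum with half the total weight) are standard conventions used here for illustration, not details taken from the text.

```python
def strong_classify(weak_outputs, weights, threshold=0.5):
    """AdaBoost-style combination of weak classifiers.

    weak_outputs: list of 0/1 votes ("is this a face?") from each
                  weak classifier
    weights:      per-classifier weights; classifiers that misjudge
                  less often receive larger weights
    Returns 1 (face) if the weighted vote reaches `threshold` times
    the total weight, else 0 (not a face).
    """
    score = sum(w * o for w, o in zip(weights, weak_outputs))
    return 1 if score >= threshold * sum(weights) else 0
```

A single reliable classifier can thus outvote several unreliable ones, which is exactly the point of weighting by misjudgment rate.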

5.2 DLIB face detection 

Although the Haar and LBP models bundled with OpenCV are often used for face detection with good results, this system uses dlib, a widely used C++ library that contains many machine-learning tool interfaces and performs well in both accuracy and speed. When detecting a face, OpenCV may fail once the acquired face image is not facing the camera directly, but this rarely happens with dlib, because dlib adopts a cascaded-regression strategy. The first step is to collect a large number of specially processed photos of people with the same size and background as training samples; the next step is to train a data model on them. When we input a photo, dlib's detector calls the officially trained data model for matching and returns the extracted face rectangles; calling the library interface on the returned value marks 68 points, and the position information of the 68 points is stored in a parameter, from which the user can directly read the coordinates of each point. The 68 feature points can be obtained through shape (Figure 5.2).

 

6 Calculation of the Facial Paralysis Grade

6.1 Acquiring image data 

This system collects photos with the computer's camera, using Visual C++ 2015 as the development tool, and calls the camera through VideoCapture to capture images of patients in real time. The images must be collected under strict conditions, such as:

(1) The photographs collected must be of the patient's full face and be as symmetrical as possible.

(2) Different expressions must be collected under the same light source, and the light should be natural and evenly distributed.

At present, the grading of facial paralysis depends mainly on the doctor's subjective judgment, and there is no generally accepted grading method. From the clinical characteristics of facial paralysis, however, we know that a patient cannot close the eyes completely, cannot grin or smile normally, cannot keep the mouth closed when puffing out the cheeks, and can raise only one eyebrow; even at rest, the face is visibly skewed. Based on these features, combined with the feature points extracted in the previous step, we can perform some calculations to classify the grade of facial paralysis.

6.2 Vertical distance method (14 points)

According to the external characteristics of patients with facial paralysis described above, we obtain the coordinates of facial feature points by marking 68 points on the face, and first select seven pairs of feature points. Take the y coordinates of the highest points of the left and right eyebrows: a normal person raises both eyebrows simultaneously and by the same amount, so the y-coordinate differences are the same, whereas a patient's left and right eyebrow differences are certainly inconsistent, and the greater the difference, the more serious the illness. For patients who cannot grin with both corners of the mouth at the same time, select the left and right outer corners of the mouth (left: 48, right: 56) and take the difference against the y coordinate of the inner eye corner, which does not change with any expression. Based on the patient's inability to close the eye on the affected side completely, select the upper and lower eyelid feature points of the left and right eyes (left: 38, 42; right: 44, 45) and take the difference, which must be larger on the affected side. When the patient pouts and wrinkles the nose, muscle weakness on the affected side makes the distance between the outer side of the nose and the inner eye corner larger, which can also be compared. Finally, some severely ill patients show an obviously skewed mouth corner even at rest; pictures collected from the Internet show that most severe patients have serious deviation of the outer mouth corners, so the y-coordinate differences between the left and right outer mouth corners and the unchanging inner eye corners can be recorded and compared.
In the end, five sets of differences are collected, and the weight of each set in the grade calculation is chosen according to the doctor's clinical experience; this system uses the mean.
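The vertical-distance comparison can be sketched as below. The landmark dictionary, the pair list, and the plain mean follow the text's description; the specific indices passed in and the equal weighting are placeholders, since, as noted above, the actual pairs and weights would be chosen from a doctor's clinical experience.

```python
def asymmetry_score(landmarks, pairs, ref_left, ref_right):
    """Vertical-distance asymmetry from paired facial landmarks.

    landmarks: dict mapping point index -> (x, y) coordinates
    pairs:     list of (left_idx, right_idx) feature-point pairs
    ref_left / ref_right: indices of stable reference points on each
        side (e.g. the inner eye corners, which barely move with
        expression)
    For each pair, the vertical distance to the same-side reference
    point is taken, and the absolute left/right difference is recorded;
    the mean of these differences is the score (0 = perfectly
    symmetric, larger = more asymmetric).
    """
    diffs = []
    for left_idx, right_idx in pairs:
        left_d = abs(landmarks[left_idx][1] - landmarks[ref_left][1])
        right_d = abs(landmarks[right_idx][1] - landmarks[ref_right][1])
        diffs.append(abs(left_d - right_d))
    return sum(diffs) / len(diffs)
```

Replacing the mean with clinically chosen per-pair weights would only change the final averaging line.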

7 Conclusion

With the accelerating pace of life, people's daily routines are far less healthy than in the past, and facial paralysis has become a relatively common disease. Its onset has nothing to do with age: anyone from children to the elderly may fall ill. Today's young and middle-aged people, however, are under great pressure at work, deal with complicated affairs, and remain overworked for long periods through bad habits such as staying up late, which increases the chance of suffering from facial paralysis. Sweating profusely after exercise and then facing a draught, or sitting in front of an air conditioner or fan after heavy drinking, are the biggest triggers for young people in summer. It is therefore necessary to develop a system that can conveniently and quickly detect the grade of facial paralysis. With deepening research and feedback from users, the system will be continuously improved.

This project adopts face recognition technology based on OpenCV to build a facial paralysis recognition system around the degree of facial asymmetry of patients with facial paralysis. The scheme for applying face recognition technology to medical diagnosis is as follows: photograph the subject → set marker points automatically or semi-automatically → extract facial feature data → compare and classify against known patterns [7]. A list of facial feature points is obtained through face_landmarks and the OpenCV function library; each face can be treated as a dictionary containing nose_bridge, right_eyebrow, right_eye, chin, left_eyebrow, bottom_lip, nose_tip, top_lip, and left_eye, where each part holds several feature points (x, y), 68 in total. At least five feature points are extracted from these, and the symmetry-axis method, or the symmetry-axis-plus-distance-difference method [1], is used to compare the degree of symmetry of the patient's facial features and analyze the degree of facial paralysis.

This research has been preliminarily verified and applied to the auxiliary diagnosis of facial paralysis and similar conditions. By extracting the location data of specific landmarks on the patient's face and analyzing the face data, it is possible to reveal development trends that are difficult to identify with the naked eye and to judge the condition in time. Applying face recognition technology to medical diagnosis can reduce delayed diagnosis and ease the shortage of medical resources; in the future, it is expected to be used for early disease screening, improve the effectiveness of clinical diagnosis, and provide clinicians with a new diagnostic perspective.

 


Origin blog.csdn.net/whirlwind526/article/details/130447319