OpenCV in Action (30) - When OpenCV Meets Machine Learning

0. Preface

With the development of artificial intelligence, many machine learning algorithms are now used to solve machine vision problems. Machine learning is a broad field of study that encompasses many important concepts. In this section, we describe some of the main machine learning techniques and show how to apply them in computer vision systems with OpenCV.

1. Introduction to Machine Learning

At its core, machine learning is about building computer systems that learn by themselves how to handle input data. Such systems are not explicitly programmed for the task; instead, they are trained automatically from data samples. Once a system has been trained successfully, it is expected to produce correct results for new, unseen data.
Machine learning can be used to solve many types of problems, but in this section we focus on classification. Typically, to build a classifier that can recognize instances of a particular class, a large number of labeled samples must be used for training. In binary classification problems, the sample dataset consists of positive samples, which are instances of the class to be learned, and negative samples, which are instances that do not belong to the class of interest. From the sample data, the system learns a decision function that predicts the class of an input instance.
In computer vision, data samples can be images or video clips. It is therefore first necessary to describe the content of each image in a uniform way. A simple representation is to scale the image to a fixed size, concatenate the scaled pixels row by row to form a vector, and then use that vector as a training sample for the machine learning algorithm. In this section, we study a different image representation and build a classic face recognition model.
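
The simple representation described above can be sketched in plain C++ without OpenCV. This is only an illustrative sketch: the GrayImage struct and the toFeatureVector function are hypothetical names that do not appear in the original article, nearest-pixel sampling stands in for a proper resize (in practice cv::resize would normally be used), and scaling the values to [0, 1] is an extra assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: a grayscale image stored row by row, one byte per pixel.
struct GrayImage {
    int rows;
    int cols;
    std::vector<std::uint8_t> pixels; // size must equal rows * cols
};

// Scale the image to targetRows x targetCols with nearest-pixel sampling and
// return the scaled pixels concatenated row by row, with values in [0, 1].
std::vector<float> toFeatureVector(const GrayImage& image, int targetRows, int targetCols) {
    assert(image.rows > 0 && image.cols > 0);
    assert(targetRows > 0 && targetCols > 0);
    assert(image.pixels.size() == static_cast<std::size_t>(image.rows) * image.cols);

    std::vector<float> features;
    features.reserve(static_cast<std::size_t>(targetRows) * targetCols);
    for (int r = 0; r < targetRows; ++r) {
        const int srcRow = r * image.rows / targetRows;     // source row for this output row
        for (int c = 0; c < targetCols; ++c) {
            const int srcCol = c * image.cols / targetCols; // source column for this output column
            features.push_back(image.pixels[srcRow * image.cols + srcCol] / 255.0f);
        }
    }
    return features;
}
```

Every image passed through this function with the same target size yields a vector of the same length, which is what a classifier needs in order to compare samples.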

2. Nearest Neighbor Face Recognition Based on Local Binary Patterns

We first introduce nearest neighbor classification and local binary pattern (LBP) features, a popular image representation that encodes the texture patterns and contours of an image in a distinctive way.
We will use these two techniques to solve the face recognition problem. Face recognition is a very challenging problem that has been a popular research topic over the past 20 years. In this section, we introduce the face recognition solution implemented in OpenCV.
The OpenCV library provides a number of face recognition methods, implemented as subclasses of the generic cv::face::FaceRecognizer class. In this section, we look at the cv::face::LBPHFaceRecognizer class, which is built on a simple but often effective classifier, the nearest neighbor classifier. Furthermore, the image representation it uses is constructed from LBP features, a popular way of characterizing the patterns in an image.

(1) To create an instance of the cv::face::LBPHFaceRecognizer class, call its static create method:

    cv::Ptr<cv::face::FaceRecognizer> recognizer =
        cv::face::LBPHFaceRecognizer::create(1, // radius of the LBP pattern
                                        8,      // number of neighboring pixels to consider
                                        8, 8,   // grid size
                                        200.);  // distance threshold for the nearest neighbor

(2) The first two arguments of cv::face::LBPHFaceRecognizer::create describe the LBP features to use. Next, the reference face images are provided to the recognizer as two vectors: one contains the face images, and the other contains the associated labels, where each label is an integer that identifies a specific person. The recognizer is trained by feeding it different images of each person to be recognized; the more representative the input images are, the better the chance of identifying the correct person. In this example, we simply provide two images of each of the two reference people and then call the train method:

    // vectors of reference images and their labels
    std::vector<cv::Mat> referenceImages;
    std::vector<int> labels;
    // open the reference images
    referenceImages.push_back(cv::imread("p0_1.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(0); // person 0
    referenceImages.push_back(cv::imread("p0_2.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(0); // person 0
    referenceImages.push_back(cv::imread("p1_1.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(1); // person 1
    referenceImages.push_back(cv::imread("p1_2.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(1); // person 1
    // train the classifier by computing the LBP histograms (LBPH)
    recognizer->train(referenceImages, labels);

(3) The reference images are shown in the figure below; the first row shows the person labeled 0, and the second row shows the person labeled 1:

[Figure: reference face images]

(4) The quality of the reference images is also important. In addition, the images can be normalized, that is, the main facial features can be placed at standardized positions: for example, the tip of the nose in the middle of the image and the two eyes horizontally aligned at a specific image row. Facial feature detection methods can be used to normalize face images automatically in this way. Given an input image, the model then predicts the label corresponding to the face:

    // predict the label of the image
    recognizer->predict(inputImage,     // face image
                        predictedLabel, // predicted label of the image
                        confidence);    // confidence of the prediction

The input image is shown below:

[Figure: input image]

The recognizer returns not only the predicted label but also a corresponding confidence score. In the cv::face::LBPHFaceRecognizer class, the confidence measures the distance between the recognized face and the closest face in the reference model, so the lower the value, the more confident the recognizer is in its prediction.

3. Image representation and face recognition

To understand the face recognition method presented in this section, we now explain its two main components: the image representation and the classification method.
The cv::face::LBPHFaceRecognizer algorithm makes use of LBP features, a way of describing the patterns present in an image. It is a local representation that converts each pixel into a binary code by encoding the pattern of image intensities found in its neighborhood. To do so, the following rule is applied: the local pixel is compared with each of its selected neighbors; if its value is greater than the neighbor's value, a 0 is assigned to the corresponding bit position, otherwise a 1 is assigned. Most commonly, each pixel is compared with its 8 immediate neighbors, which results in an 8-bit pattern. For example, suppose we have the following local pattern:
$$\left[ \begin{array}{ccc} 87&98&17\\ 21&26&89\\ 19&24&90 \end{array}\right]$$
Applying the above rule produces the following binary values:
$$\left[ \begin{array}{ccc} 1&1&0\\ 0&&1\\ 0&0&1 \end{array}\right]$$
Taking the upper-left pixel as the starting position and moving clockwise, the center pixel is replaced by the binary sequence 11011000. A complete 8-bit LBP image can then be generated by looping over all pixels of the image and producing the corresponding LBP byte for each of them:

// Compute the local binary pattern of a grayscale image
void lbp(const cv::Mat &image, cv::Mat &result) {
    assert(image.channels() == 1);      // the input image must be grayscale
    result.create(image.size(), CV_8U); // allocate the output if necessary
    // loop over all rows (except the first and the last)
    for (int j = 1; j < image.rows - 1; j++) {
        const uchar* previous = image.ptr<const uchar>(j - 1);  // previous row
        const uchar* current = image.ptr<const uchar>(j);       // current row
        const uchar* next = image.ptr<const uchar>(j + 1);      // next row
        // output row, starting at column 1 so that it stays aligned with i
        uchar* output = result.ptr<uchar>(j) + 1;
        for (int i = 1; i < image.cols - 1; i++) {
            // compose the local binary pattern, one bit per neighbor
            *output = previous[i - 1] > current[i] ? 1 : 0;
            *output |= previous[i] > current[i] ? 2 : 0;
            *output |= previous[i + 1] > current[i] ? 4 : 0;
            *output |= current[i - 1] > current[i] ? 8 : 0;
            *output |= current[i + 1] > current[i] ? 16 : 0;
            *output |= next[i - 1] > current[i] ? 32 : 0;
            *output |= next[i] > current[i] ? 64 : 0;
            *output |= next[i + 1] > current[i] ? 128 : 0;
            output++;   // next pixel
        }
    }
    // set the unprocessed border pixels to zero
    result.row(0).setTo(cv::Scalar(0));
    result.row(result.rows - 1).setTo(cv::Scalar(0));
    result.col(0).setTo(cv::Scalar(0));
    result.col(result.cols - 1).setTo(cv::Scalar(0));
}

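The body of the loop in lbp() compares each pixel with its eight neighbors and sets one bit per neighbor. Note that this function visits the neighbors row by row rather than clockwise, and a neighbor equal to the center yields 0, so the byte it produces for the patch above differs from the clockwise sequence; any fixed ordering works as long as it is applied consistently. Starting from the original image:

[Figure: original image]

the function produces an LBP image, which can be displayed as a grayscale image:

[Figure: LBP image]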
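
As a check on the 3×3 example earlier in this section, the sketch below computes both the clockwise bit string described in the text and the byte that a row-by-row ordering like the one in lbp() would give for the same patch. The helper names lbpClockwiseBits and lbpRasterCode are hypothetical and introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Clockwise bit string for a 3x3 patch, starting at the top-left neighbor.
// Rule from the text: a center greater than the neighbor gives 0, otherwise 1.
std::string lbpClockwiseBits(const std::uint8_t patch[3][3]) {
    // (row, column) of the eight neighbors in clockwise order from the top-left
    static const int order[8][2] = {{0, 0}, {0, 1}, {0, 2}, {1, 2},
                                    {2, 2}, {2, 1}, {2, 0}, {1, 0}};
    const std::uint8_t center = patch[1][1];
    std::string bits;
    for (const auto& rc : order) {
        bits += (center > patch[rc[0]][rc[1]]) ? '0' : '1';
    }
    assert(bits.size() == 8);
    return bits;
}

// Byte obtained when the neighbors are visited row by row and the first
// neighbor goes into the least significant bit, as lbp() does.
std::uint8_t lbpRasterCode(const std::uint8_t patch[3][3]) {
    const std::uint8_t center = patch[1][1];
    std::uint8_t code = 0;
    int bit = 0;
    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 3; ++c) {
            if (r == 1 && c == 1) continue; // skip the center pixel
            if (patch[r][c] > center) code |= static_cast<std::uint8_t>(1u << bit);
            ++bit;
        }
    }
    return code;
}
```

For the patch from the example, lbpClockwiseBits returns "11011000" and lbpRasterCode returns 147; the two differ only because the neighbors are visited in a different order.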

In the cv::face::LBPHFaceRecognizer class, the first two parameters of the create method specify the neighborhood to consider: its size (the radius in pixels) and its dimension (the number of pixels sampled along the circle, possibly with interpolation). After the LBP image has been generated, it is divided into a grid. The size of the grid is specified by the third and fourth parameters of the create method.
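
To make the role of the radius and the number of neighbors more concrete, here is a minimal sketch of an LBP code computed on a circular neighborhood with bilinear interpolation. It is written in plain C++ and is not OpenCV's implementation: the function names sampleBilinear and circularLbp, the starting angle, the direction of rotation, and the bit order are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Bilinear interpolation of a row-major grayscale image at a real-valued (x, y).
double sampleBilinear(const std::vector<std::uint8_t>& img, int rows, int cols,
                      double x, double y) {
    const int x0 = static_cast<int>(std::floor(x));
    const int y0 = static_cast<int>(std::floor(y));
    assert(x0 >= 0 && y0 >= 0 && x0 < cols && y0 < rows);
    const int x1 = (x0 + 1 < cols) ? x0 + 1 : x0; // clamp at the right border
    const int y1 = (y0 + 1 < rows) ? y0 + 1 : y0; // clamp at the bottom border
    const double fx = x - x0;
    const double fy = y - y0;
    const double top = (1.0 - fx) * img[y0 * cols + x0] + fx * img[y0 * cols + x1];
    const double bottom = (1.0 - fx) * img[y1 * cols + x0] + fx * img[y1 * cols + x1];
    return (1.0 - fy) * top + fy * bottom;
}

// LBP code of pixel (cx, cy): `neighbors` samples are taken on a circle of the
// given radius, and bit k is set when sample k is brighter than the center.
std::uint32_t circularLbp(const std::vector<std::uint8_t>& img, int rows, int cols,
                          int cx, int cy, int radius, int neighbors) {
    assert(neighbors > 0 && neighbors <= 32);
    assert(radius > 0);
    assert(cx - radius >= 0 && cx + radius < cols);
    assert(cy - radius >= 0 && cy + radius < rows);
    const double pi = std::acos(-1.0);
    const double center = img[cy * cols + cx];
    std::uint32_t code = 0;
    for (int k = 0; k < neighbors; ++k) {
        const double angle = 2.0 * pi * k / neighbors;
        const double x = cx + radius * std::cos(angle);
        const double y = cy - radius * std::sin(angle);
        if (sampleBilinear(img, rows, cols, x, y) > center) {
            code |= (1u << k);
        }
    }
    return code;
}
```

With a radius of 1 and 8 neighbors, the four diagonal samples fall between pixel centers, which is where the interpolation comes into play.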
For each block of the resulting grid, a histogram of the LBP values is built. A global image representation is then obtained by concatenating the bin counts of all these histograms into one long vector. With an 8×8 grid, the 64 histograms of 256 bins each form a 16384-dimensional vector.
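
The grid histogram step can be sketched as follows. The function name lbpGridHistogram is hypothetical, the LBP image is assumed to be stored row by row in a std::vector, and the histograms are kept as raw counts; any normalization or other detail of what OpenCV does internally is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split an LBP image into gridX x gridY cells, build a 256-bin histogram for
// each cell, and concatenate all histograms into one long descriptor.
std::vector<int> lbpGridHistogram(const std::vector<std::uint8_t>& lbpImage,
                                  int rows, int cols, int gridX, int gridY) {
    assert(gridX > 0 && gridY > 0 && rows >= gridY && cols >= gridX);
    assert(lbpImage.size() == static_cast<std::size_t>(rows) * cols);
    const int bins = 256; // one bin per possible 8-bit LBP value
    std::vector<int> descriptor(static_cast<std::size_t>(gridX) * gridY * bins, 0);
    for (int r = 0; r < rows; ++r) {
        const int cellRow = r * gridY / rows;     // grid row containing this pixel
        for (int c = 0; c < cols; ++c) {
            const int cellCol = c * gridX / cols; // grid column containing this pixel
            const std::size_t cell = static_cast<std::size_t>(cellRow) * gridX + cellCol;
            ++descriptor[cell * bins + lbpImage[r * cols + c]];
        }
    }
    return descriptor;
}
```

Called with an 8×8 grid, the function returns 8 × 8 × 256 = 16384 values, which matches the dimension given above.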
The train method of the cv::face::LBPHFaceRecognizer class generates such a long vector for each reference image provided. Each face image can then be regarded as a point in a high-dimensional space. When a new image is passed to the recognizer through the predict method, the reference point closest to that image is found; the label associated with that point is the predicted label, and the confidence value is the computed distance. One more case has to be considered: if the nearest neighbor of an input point is too far away, the point may not belong to any of the reference classes. The last parameter of the create method of the cv::face::LBPHFaceRecognizer class (the threshold) specifies the distance beyond which a point is treated as an outlier; in that case predict returns the label -1.
This method works well when the different classes form distinct point clouds in the representation space. Another advantage is that it handles multiple classes implicitly, since it simply takes the predicted class from the nearest neighbor. Its drawback is its cost: finding the nearest neighbor in a large space that may contain a large number of reference points can take a long time, and storing all these points is also expensive in memory.
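
A nearest neighbor classifier with a rejection threshold can be sketched in a few lines. This is an illustration of the idea only, not the code used by cv::face::LBPHFaceRecognizer: the class name NearestNeighborClassifier and the Prediction struct are hypothetical, and the Euclidean distance is an assumption chosen for simplicity (a recognizer may compare histograms with a different distance).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Prediction {
    int label;       // label of the nearest reference, or -1 if it is too far away
    double distance; // distance to the nearest reference (lower means more confident)
};

class NearestNeighborClassifier {
public:
    explicit NearestNeighborClassifier(
        double threshold = std::numeric_limits<double>::max())
        : threshold_(threshold) {}

    // "Training" only stores the reference vectors and their labels.
    void train(const std::vector<std::vector<double>>& samples,
               const std::vector<int>& labels) {
        assert(samples.size() == labels.size());
        samples_ = samples;
        labels_ = labels;
    }

    // Linear scan over all stored samples; rejects the query if even the
    // closest sample is farther away than the threshold.
    Prediction predict(const std::vector<double>& query) const {
        Prediction best{-1, std::numeric_limits<double>::max()};
        for (std::size_t i = 0; i < samples_.size(); ++i) {
            const double d = euclidean(samples_[i], query);
            if (d < best.distance) {
                best.label = labels_[i];
                best.distance = d;
            }
        }
        if (best.distance > threshold_) best.label = -1;
        return best;
    }

private:
    static double euclidean(const std::vector<double>& a, const std::vector<double>& b) {
        assert(a.size() == b.size());
        double sum = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return std::sqrt(sum);
    }

    double threshold_;
    std::vector<std::vector<double>> samples_;
    std::vector<int> labels_;
};
```

The linear scan in predict makes the cost discussed above visible: every query is compared with every stored sample, and all samples have to stay in memory.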

4. Complete code

The complete code of recognizeFace.cpp is as follows:

#include <cassert>
#include <iostream>
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/face.hpp>

// Compute the local binary pattern of a grayscale image
void lbp(const cv::Mat &image, cv::Mat &result) {
    assert(image.channels() == 1);      // the input image must be grayscale
    result.create(image.size(), CV_8U); // allocate the output if necessary
    // loop over all rows (except the first and the last)
    for (int j = 1; j < image.rows - 1; j++) {
        const uchar* previous = image.ptr<const uchar>(j - 1);  // previous row
        const uchar* current = image.ptr<const uchar>(j);       // current row
        const uchar* next = image.ptr<const uchar>(j + 1);      // next row
        // output row, starting at column 1 so that it stays aligned with i
        uchar* output = result.ptr<uchar>(j) + 1;
        for (int i = 1; i < image.cols - 1; i++) {
            // compose the local binary pattern, one bit per neighbor
            *output = previous[i - 1] > current[i] ? 1 : 0;
            *output |= previous[i] > current[i] ? 2 : 0;
            *output |= previous[i + 1] > current[i] ? 4 : 0;
            *output |= current[i - 1] > current[i] ? 8 : 0;
            *output |= current[i + 1] > current[i] ? 16 : 0;
            *output |= next[i - 1] > current[i] ? 32 : 0;
            *output |= next[i] > current[i] ? 64 : 0;
            *output |= next[i + 1] > current[i] ? 128 : 0;
            output++;   // next pixel
        }
    }
    // set the unprocessed border pixels to zero
    result.row(0).setTo(cv::Scalar(0));
    result.row(result.rows - 1).setTo(cv::Scalar(0));
    result.col(0).setTo(cv::Scalar(0));
    result.col(result.cols - 1).setTo(cv::Scalar(0));
}

int main() {
    cv::Mat image = cv::imread("test_img.png", cv::IMREAD_GRAYSCALE);
    if (image.empty()) {
        std::cerr << "Could not read test_img.png" << std::endl;
        return 1;
    }
    cv::imshow("Original image", image);
    cv::Mat lbpImage;
    lbp(image, lbpImage);
    cv::imshow("LBP image", lbpImage);
    cv::Ptr<cv::face::FaceRecognizer> recognizer =
        cv::face::LBPHFaceRecognizer::create(1, // radius of the LBP pattern
                                        8,      // number of neighboring pixels to consider
                                        8, 8,   // grid size
                                        200.);  // distance threshold for the nearest neighbor
    // vectors of reference images and their labels
    std::vector<cv::Mat> referenceImages;
    std::vector<int> labels;
    // open the reference images
    referenceImages.push_back(cv::imread("p0_1.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(0); // person 0
    referenceImages.push_back(cv::imread("p0_2.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(0); // person 0
    referenceImages.push_back(cv::imread("p1_1.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(1); // person 1
    referenceImages.push_back(cv::imread("p1_2.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(1); // person 1
    // show the 4 reference images as a 2x2 mosaic (they must all have the same size)
    cv::Mat faceImages(2 * referenceImages[0].rows, 2 * referenceImages[0].cols, CV_8U);
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 2; j++) {
            const cv::Mat &ref = referenceImages[i * 2 + j];
            ref.copyTo(faceImages(cv::Rect(j * ref.cols, i * ref.rows, ref.cols, ref.rows)));
        }
    }
    cv::resize(faceImages, faceImages, cv::Size(), 0.5, 0.5);
    cv::imshow("Reference faces", faceImages);
    // train the classifier by computing the LBP histograms (LBPH)
    recognizer->train(referenceImages, labels);
    int predictedLabel = -1;
    double confidence = 0.0;
    // extract the face region from the test image
    cv::Mat inputImage;
    cv::resize(image(cv::Rect(300, 75, 150, 150)), inputImage, cv::Size(256, 256));
    cv::imshow("Input image", inputImage);
    // predict the label of the image
    recognizer->predict(inputImage,     // face image
                        predictedLabel, // predicted label of the image
                        confidence);    // confidence of the prediction
    std::cout << "Image label= " << predictedLabel << " (" << confidence << ")" << std::endl;
    cv::waitKey();
    return 0;
}

Summary

Machine learning is a subset of artificial intelligence. It gives computers and other systems with computing capability the ability to make predictions or decisions automatically. Machine learning applications such as virtual assistants, license plate recognition systems, and intelligent recommendation systems bring great convenience to our daily lives. In this section, we introduced how to use machine learning algorithms in computer vision applications with OpenCV, taking face recognition as an example.

Series links

OpenCV in Action (1) - OpenCV and Image Processing Fundamentals
OpenCV in Action (2) - OpenCV Core Data Structures
OpenCV in Action (3) - Image Regions of Interest
OpenCV in Action (4) - Pixel Operations
OpenCV in Action (5) - Image Operations in Detail
OpenCV in Action (6) - The Strategy Design Pattern in OpenCV
OpenCV in Action (7) - OpenCV Color Space Conversion
OpenCV in Action (8) - Histograms in Detail
OpenCV in Action (9) - Detecting Image Content with Histogram Backprojection
OpenCV in Action (10) - Integral Images in Detail
OpenCV in Action (11) - Morphological Transformations in Detail
OpenCV in Action (12) - Image Filtering in Detail
OpenCV in Action (13) - High-Pass Filters and Their Applications
OpenCV in Action (14) - Image Line Extraction
OpenCV in Action (15) - Contour Detection in Detail
OpenCV in Action (16) - Corner Detection in Detail
OpenCV in Action (17) - FAST Feature Point Detection
OpenCV in Action (18) - Feature Matching
OpenCV in Action (19) - Feature Descriptors
OpenCV in Action (20) - Image Projection Relationships
OpenCV in Action (21) - Matching Images with Random Sample Consensus
OpenCV in Action (22) - Homography and Its Applications
OpenCV in Action (23) - Camera Calibration
OpenCV in Action (24) - Camera Pose Estimation
OpenCV in Action (25) - 3D Scene Reconstruction
OpenCV in Action (26) - Video Sequence Processing
OpenCV in Action (27) - Tracking Feature Points in Video
OpenCV in Action (28) - Optical Flow Estimation
OpenCV in Action (29) - Video Object Tracking


Origin blog.csdn.net/LOVEmy134611/article/details/132594464