OpenCV actual combat (30) - the collision of OpenCV and machine learning
0. Preface
With the development of artificial intelligence, many machine learning algorithms are now used to solve machine vision problems. Machine learning is a broad field of study that encompasses many important concepts. In this section we describe some of the main machine learning techniques and how to use OpenCV to apply them in computer vision systems.
1. Introduction to Machine Learning
At its core, machine learning is the development of computer systems that can learn by themselves what to do with data input. Machine learning systems do not require explicit programming, but automatically train and learn based on data samples. Once the system is successfully trained, the trained system can output correct results for new unseen data.
Machine learning can be used to solve many types of problems, but in this section we focus on classification problems. Typically, in order to build a classifier that can recognize instances of a particular class, a large number of labeled samples must be used to train the classifier. In binary classification problems, the sample dataset consists of positive samples representing instances of the class to be learned and negative samples that do not belong to instances of the class of interest. From the sample data, the system learns a decision function that predicts the correct class of the input instance.
In computer vision, data samples can be images or video clips. Therefore, it is first necessary to describe the content of each image in a unified way. A simple representation is to scale the image to a fixed size, concatenate the scaled pixels row by row to form a vector, and then use that vector as a training sample for the machine learning algorithm. In this section we will study different image representation methods and build a classic face recognition model.
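The simple representation described above can be sketched in a few lines. This is a minimal illustration that uses only the standard library: the GrayImage type and the imageToFeatureVector function are hypothetical names introduced here, and nearest-neighbor sampling is assumed for the scaling step (in an OpenCV program, cv::resize would normally do this job).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A grayscale image stored as rows of 8-bit pixels (hypothetical helper type).
using GrayImage = std::vector<std::vector<std::uint8_t>>;

// Scale `image` to size x size (nearest-neighbor sampling, an assumption of
// this sketch) and concatenate the scaled pixels row by row into one vector.
std::vector<float> imageToFeatureVector(const GrayImage& image, std::size_t size) {
    assert(!image.empty() && !image[0].empty() && size > 0);
    const std::size_t rows = image.size();
    const std::size_t cols = image[0].size();
    std::vector<float> features;
    features.reserve(size * size);
    for (std::size_t y = 0; y < size; ++y) {
        const std::size_t srcY = y * rows / size;      // source row for output row y
        for (std::size_t x = 0; x < size; ++x) {
            const std::size_t srcX = x * cols / size;  // source column for output column x
            features.push_back(static_cast<float>(image[srcY][srcX]));
        }
    }
    return features;
}
```

Every image, whatever its original size, is mapped to a vector of size * size values, so all samples live in the same space and can be compared with each other.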
2. Nearest Neighbor Face Recognition Based on Local Binary Patterns
We first introduce nearest neighbor classification and local binary pattern (Local Binary Pattern, LBP) features, a popular image representation method that encodes texture patterns and contours of images in a unique way.
We will use the above two techniques to solve the face recognition problem. Face recognition is a very challenging problem that has been a popular research topic over the past 20 years; in this section, we introduce the face recognition solution implemented in OpenCV.
The OpenCV library provides a number of face recognition methods implemented as subclasses of the generic cv::face::FaceRecognizer class. In this section, we'll learn about the cv::face::LBPHFaceRecognizer class, which is a classification method based on a simple but often effective classifier, the nearest neighbor classifier. Furthermore, the image representation it uses is constructed from LBP features, which is a popular way of characterizing patterns in images.
(1) To create an instance of the cv::face::LBPHFaceRecognizer class, call its static create method:
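cv::Ptr<cv::face::FaceRecognizer> recognizer =
    cv::face::LBPHFaceRecognizer::create(1,      // radius of the LBP pattern
                                         8,      // number of neighboring pixels to consider
                                         8, 8,   // grid size (cells in x and y)
                                         200.);  // distance threshold to the nearest neighbor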
(2) The first two parameters of the create method of the cv::face::LBPHFaceRecognizer class describe the LBP features to be used. Next, we provide the reference face images to the recognizer. This input consists of two vectors: one contains the face images, and the other contains the associated labels, each an integer value that identifies a specific person. The recognizer is trained by feeding it different images of the people it should recognize; the more representative the input images are, the better the chance of identifying the correct person. In this example, we provide just two images of each of two reference people. Training is done by calling the train method:
// vectors of reference images and their labels
std::vector<cv::Mat> referenceImages;
std::vector<int> labels;
// open the reference images
referenceImages.push_back(cv::imread("p0_1.png", cv::IMREAD_GRAYSCALE));
labels.push_back(0); // person 0
referenceImages.push_back(cv::imread("p0_2.png", cv::IMREAD_GRAYSCALE));
labels.push_back(0); // person 0
referenceImages.push_back(cv::imread("p1_1.png", cv::IMREAD_GRAYSCALE));
labels.push_back(1); // person 1
referenceImages.push_back(cv::imread("p1_2.png", cv::IMREAD_GRAYSCALE));
labels.push_back(1); // person 1
// train the classifier by computing the LBPH of each reference image
recognizer->train(referenceImages, labels);
(3) The pictures used are shown in the figure below; the first row shows the person numbered 0, and the second row shows the person numbered 1:
(4) The quality of the reference images is also important. Additionally, we can normalize them, i.e. put the main facial features at standardized positions. For example, the tip of the nose is placed in the middle of the image, and the two eyes are horizontally aligned at a specific image row; facial feature detection methods can be used to normalize face images automatically in this way. Given an input image, the model predicts the label corresponding to the face image:
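// predict the label of the image
int predictedLabel = -1;
double confidence = 0.0;
recognizer->predict(inputImage,      // face image
                    predictedLabel,  // predicted label of the image
                    confidence);     // confidence of the prediction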
The input image is as shown below:
The recognizer returns not only the predicted label, but also the corresponding confidence score. In the cv::face::LBPHFaceRecognizer class, the confidence measures the distance between the recognized face and the closest face in the trained model; the lower the value, the more confident the recognizer is in its prediction.
3. Image representation and face recognition
In order to understand the face recognition method presented in this section, next, we will explain its two main components: image representation and classification method.
The cv::face::LBPHFaceRecognizer algorithm makes use of the LBP feature, which is a way of describing the image patterns present in an image. It is a local representation that converts each pixel into a binary representation by encoding the pattern of image intensities found in its neighborhood. To achieve this, the following rule is applied: compare the local pixel with each of its selected neighbors; if its value is greater than the neighbor's value, set the corresponding bit position to 0, otherwise set it to 1. Most commonly, each pixel is compared with its 8 immediate neighbors, resulting in an 8-bit pattern. For example, suppose we have the following local pattern:
$$\left[ \begin{array}{ccc} 87&98&17\\ 21&26&89\\ 19&24&90 \end{array}\right]$$
Applying the above rules produces the following binary value:
$$\left[ \begin{array}{ccc} 1&1&0\\ 0&&1\\ 0&0&1 \end{array}\right]$$
Taking the upper left pixel as the starting position and moving clockwise, the center pixel is replaced by the binary sequence 11011000. A complete 8-bit LBP image can be generated by looping through all pixels of the image and computing the corresponding LBP byte for each pixel:
// compute the local binary patterns of a gray-level image
void lbp(const cv::Mat &image, cv::Mat &result) {
    assert(image.channels() == 1);       // input image must be gray-level
    result.create(image.size(), CV_8U);  // allocate if necessary
    for (int j = 1; j < image.rows - 1; j++) {
        // for all rows (except the first and the last)
        const uchar* previous = image.ptr<const uchar>(j - 1);  // previous row
        const uchar* current = image.ptr<const uchar>(j);       // current row
        const uchar* next = image.ptr<const uchar>(j + 1);      // next row
        uchar* output = result.ptr<uchar>(j) + 1;  // output row, starting at column 1
        for (int i = 1; i < image.cols - 1; i++) {
            // compose the local binary pattern
            *output = previous[i - 1] > current[i] ? 1 : 0;
            *output |= previous[i] > current[i] ? 2 : 0;
            *output |= previous[i + 1] > current[i] ? 4 : 0;
            *output |= current[i - 1] > current[i] ? 8 : 0;
            *output |= current[i + 1] > current[i] ? 16 : 0;
            *output |= next[i - 1] > current[i] ? 32 : 0;
            *output |= next[i] > current[i] ? 64 : 0;
            *output |= next[i + 1] > current[i] ? 128 : 0;
            output++;  // next pixel
        }
    }
    // set the unprocessed border pixels to 0
    result.row(0).setTo(cv::Scalar(0));
    result.row(result.rows - 1).setTo(cv::Scalar(0));
    result.col(0).setTo(cv::Scalar(0));
    result.col(result.cols - 1).setTo(cv::Scalar(0));
}
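The body of the loop compares each pixel to its eight neighbors and assigns the corresponding bit values. Note that the function numbers the neighbors row by row rather than clockwise; the ordering is arbitrary as long as it is applied consistently to every image. You will end up with an LBP image, which can be displayed as a grayscale image.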
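To make the bit assignment concrete, the following sketch computes the LBP code of the 3×3 example pattern under both orderings: clockwise from the upper left pixel (as in the worked example) and row by row (as in the lbp function). Patch, lbpCodeClockwise and lbpCodeRowMajor are hypothetical names introduced for this illustration, and only the standard library is used.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// A 3x3 neighborhood of gray levels (hypothetical helper type).
using Patch = std::array<std::array<int, 3>, 3>;

// Clockwise ordering starting at the upper left neighbor; the first neighbor
// becomes the most significant bit. A bit is 1 when the neighbor is brighter
// than the center pixel.
std::uint8_t lbpCodeClockwise(const Patch& p) {
    static const int order[8][2] = {{0, 0}, {0, 1}, {0, 2}, {1, 2},
                                    {2, 2}, {2, 1}, {2, 0}, {1, 0}};
    const int center = p[1][1];
    unsigned code = 0;
    for (const auto& rc : order)
        code = (code << 1) | (p[rc[0]][rc[1]] > center ? 1u : 0u);
    return static_cast<std::uint8_t>(code);
}

// Row-by-row ordering with the upper left neighbor as the least significant
// bit, mirroring the weights 1, 2, 4, ..., 128 used in the lbp function.
std::uint8_t lbpCodeRowMajor(const Patch& p) {
    const int center = p[1][1];
    unsigned code = 0;
    int bit = 0;
    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 3; ++c) {
            if (r == 1 && c == 1) continue;  // skip the center pixel
            if (p[r][c] > center) code |= 1u << bit;
            ++bit;
        }
    }
    return static_cast<std::uint8_t>(code);
}
```

For the example pattern, the clockwise ordering gives 11011000 (216 in decimal), while the row-by-row ordering gives 147. Both encode the same comparisons; only the position of each neighbor within the byte differs.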
In the cv::face::LBPHFaceRecognizer class, the first two parameters of the create method specify the neighborhood to consider by size (i.e. radius in pixels) and dimension (i.e. number of pixels along the circle; interpolation may be applied). After the LBP image is generated, it is divided into a grid. The size of the grid is specified by the third and fourth parameters of the create method.
For each block in the resulting grid, a histogram of the LBP values is built. A global image representation is obtained by concatenating the bin counts of all these histograms into one long vector. Using an 8×8 grid, the computed set of 256-bin histograms forms a 16384-dimensional vector.
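The grid histogram construction can be sketched as follows. This is an illustration under stated assumptions, not OpenCV's implementation: LbpImage and gridHistogram are hypothetical names, the histograms are left unnormalized, and pixels that do not fit into a whole number of cells are simply ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// An LBP image stored as rows of 8-bit codes (hypothetical helper type).
using LbpImage = std::vector<std::vector<std::uint8_t>>;

// Divide the LBP image into gridX x gridY cells, count the 256 possible codes
// in each cell, and concatenate all cell histograms into one long vector.
std::vector<float> gridHistogram(const LbpImage& lbp, std::size_t gridX, std::size_t gridY) {
    assert(!lbp.empty() && !lbp[0].empty() && gridX > 0 && gridY > 0);
    const std::size_t cellH = lbp.size() / gridY;     // cell height in pixels
    const std::size_t cellW = lbp[0].size() / gridX;  // cell width in pixels
    assert(cellH > 0 && cellW > 0);
    std::vector<float> descriptor(gridX * gridY * 256, 0.0f);
    for (std::size_t y = 0; y < cellH * gridY; ++y) {
        for (std::size_t x = 0; x < cellW * gridX; ++x) {
            const std::size_t cell = (y / cellH) * gridX + (x / cellW);
            descriptor[cell * 256 + lbp[y][x]] += 1.0f;  // one more pixel with this code
        }
    }
    return descriptor;
}
```

With an 8×8 grid the returned vector has 8 × 8 × 256 = 16384 entries, whatever the image size, which is what makes descriptors of different face images directly comparable.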
The train method of the cv::face::LBPHFaceRecognizer class generates such a long vector for each reference image provided. Each face image can then be regarded as a point in a high-dimensional space. When a new image is passed to the recognizer using the predict method, the reference point closest to that image is found. The label associated with that point is the predicted label, and the confidence value is the computed distance. There is one more case to consider: if the nearest neighbor of an input point is too far away from it, this may mean that the point does not actually belong to any reference class. We can use the last parameter of the create method of the cv::face::LBPHFaceRecognizer class (the threshold) to specify the distance beyond which a point is considered an outlier.
This method is effective when the different classes form distinct point clouds in the representation space. Another advantage is that it handles multiple classes implicitly, since it simply takes the predicted class from the nearest neighbor. Its disadvantage is the high computational cost: finding the nearest neighbor in a huge space that may contain a large number of sample points can take a lot of time, and the space cost of storing all these sample points is also high.
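The nearest neighbor decision with an outlier threshold can be sketched like this. It is a minimal standard-library illustration: Prediction, squaredDistance and predictNearest are hypothetical names, and the squared Euclidean distance is an assumption made for simplicity (the distance used inside OpenCV's recognizer may differ).

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct Prediction {
    int label;        // -1 when no reference point is within the threshold
    double distance;  // distance to the nearest reference point
};

// Squared Euclidean distance between two descriptors (assumed metric).
double squaredDistance(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum += d * d;
    }
    return sum;
}

// Return the label of the closest reference point, or -1 if even the closest
// one is farther away than `threshold`.
Prediction predictNearest(const std::vector<std::vector<float>>& references,
                          const std::vector<int>& labels,
                          const std::vector<float>& query,
                          double threshold) {
    assert(references.size() == labels.size());
    Prediction best{-1, std::numeric_limits<double>::max()};
    for (std::size_t i = 0; i < references.size(); ++i) {
        const double d = squaredDistance(references[i], query);
        if (d < best.distance) {
            best.label = labels[i];
            best.distance = d;
        }
    }
    if (best.distance > threshold) best.label = -1;  // treat as an outlier
    return best;
}
```

The loop visits every stored reference point for every query, which illustrates both costs mentioned above: prediction time and memory grow with the number of reference samples.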
4. Complete code
The complete code of recognizeFace.cpp is as follows:
#include <cassert>
#include <iostream>
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/face.hpp>
// compute the local binary patterns of a gray-level image
void lbp(const cv::Mat &image, cv::Mat &result) {
    assert(image.channels() == 1);       // input image must be gray-level
    result.create(image.size(), CV_8U);  // allocate if necessary
    for (int j = 1; j < image.rows - 1; j++) {
        // for all rows (except the first and the last)
        const uchar* previous = image.ptr<const uchar>(j - 1);  // previous row
        const uchar* current = image.ptr<const uchar>(j);       // current row
        const uchar* next = image.ptr<const uchar>(j + 1);      // next row
        uchar* output = result.ptr<uchar>(j) + 1;  // output row, starting at column 1
        for (int i = 1; i < image.cols - 1; i++) {
            // compose the local binary pattern
            *output = previous[i - 1] > current[i] ? 1 : 0;
            *output |= previous[i] > current[i] ? 2 : 0;
            *output |= previous[i + 1] > current[i] ? 4 : 0;
            *output |= current[i - 1] > current[i] ? 8 : 0;
            *output |= current[i + 1] > current[i] ? 16 : 0;
            *output |= next[i - 1] > current[i] ? 32 : 0;
            *output |= next[i] > current[i] ? 64 : 0;
            *output |= next[i + 1] > current[i] ? 128 : 0;
            output++;  // next pixel
        }
    }
    // set the unprocessed border pixels to 0
    result.row(0).setTo(cv::Scalar(0));
    result.row(result.rows - 1).setTo(cv::Scalar(0));
    result.col(0).setTo(cv::Scalar(0));
    result.col(result.cols - 1).setTo(cv::Scalar(0));
}
int main() {
    cv::Mat image = cv::imread("test_img.png", cv::IMREAD_GRAYSCALE);
    if (image.empty()) {
        std::cerr << "Cannot read test_img.png" << std::endl;
        return 1;
    }
    cv::imshow("Original image", image);
    cv::Mat lbpImage;
    lbp(image, lbpImage);
    cv::imshow("LBP image", lbpImage);
    cv::Ptr<cv::face::FaceRecognizer> recognizer =
        cv::face::LBPHFaceRecognizer::create(1,      // radius of the LBP pattern
                                             8,      // number of neighboring pixels to consider
                                             8, 8,   // grid size (cells in x and y)
                                             200.);  // distance threshold to the nearest neighbor
    // vectors of reference images and their labels
    std::vector<cv::Mat> referenceImages;
    std::vector<int> labels;
    // open the reference images
    referenceImages.push_back(cv::imread("p0_1.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(0); // person 0
    referenceImages.push_back(cv::imread("p0_2.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(0); // person 0
    referenceImages.push_back(cv::imread("p1_1.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(1); // person 1
    referenceImages.push_back(cv::imread("p1_2.png", cv::IMREAD_GRAYSCALE));
    labels.push_back(1); // person 1
    // the 4 positive samples, shown as a 2x2 mosaic (assumes equal image sizes)
    cv::Mat faceImages(2 * referenceImages[0].rows, 2 * referenceImages[0].cols, CV_8U);
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++) {
            const cv::Mat& ref = referenceImages[i * 2 + j];
            ref.copyTo(faceImages(cv::Rect(j * ref.cols, i * ref.rows, ref.cols, ref.rows)));
        }
    cv::resize(faceImages, faceImages, cv::Size(), 0.5, 0.5);
    cv::imshow("Reference faces", faceImages);
    // train the classifier by computing the LBPH of each reference image
    recognizer->train(referenceImages, labels);
    int predictedLabel = -1;
    double confidence = 0.0;
    // extract the face image
    cv::Mat inputImage;
    cv::resize(image(cv::Rect(300, 75, 150, 150)), inputImage, cv::Size(256, 256));
    cv::imshow("Input image", inputImage);
    // predict the label of the image
    recognizer->predict(inputImage,      // face image
                        predictedLabel,  // predicted label of the image
                        confidence);     // confidence of the prediction
    std::cout << "Image label= " << predictedLabel << " (" << confidence << ")" << std::endl;
    cv::waitKey();
    return 0;
}
Summary
Machine learning is a subset of artificial intelligence. It provides computers and other computing-capable systems with the ability to automatically predict or make decisions. Machine learning applications such as virtual assistants, license plate recognition systems, and intelligent recommendation systems bring great convenience to our daily lives. In this section, we introduced how to use OpenCV to apply machine learning algorithms in computer vision applications, and took face recognition as an example to experience the power of artificial intelligence.
Series links
OpenCV actual combat (1) - OpenCV and image processing foundation
OpenCV actual combat (2) - OpenCV core data structure
OpenCV actual combat (3) - image area of interest
OpenCV actual combat (4) - pixel operation
OpenCV actual combat (5) - detailed explanation of image operations
OpenCV actual combat (6) - OpenCV strategy design mode
OpenCV actual combat (7) - OpenCV color space conversion
OpenCV actual combat (8) - histogram detailed
OpenCV actual combat (9) - image content detection based on backprojection histogram
OpenCV actual combat (10) - detailed explanation of integral image
OpenCV actual combat (11) - detailed explanation of morphological transformation
OpenCV actual combat (12) - detailed explanation of image filtering
OpenCV actual combat (13) - high-pass filter and its application
OpenCV actual combat (14) - image line extraction
OpenCV actual combat (15) - detailed explanation of contour detection
OpenCV actual combat (16) - detailed explanation of corner detection
OpenCV actual combat (17) - FAST feature point detection
OpenCV actual combat (18) - feature matching
OpenCV actual combat (19) - feature descriptors
OpenCV actual combat (20) - image projection relationship
OpenCV actual combat (21) - image matching based on random sample consensus
OpenCV actual combat (22) - homography and its application
OpenCV actual combat (23) - camera calibration
OpenCV actual combat (24) - camera pose estimation
OpenCV actual combat (25) - 3D scene reconstruction
OpenCV actual combat (26) - video sequence processing
OpenCV actual combat (27) - tracking feature points in the video
OpenCV actual combat (28) - optical flow estimation
OpenCV actual combat (29) - video object tracking