Monocular Camera Ranging

A small monocular-ranging project needs just a monocular camera, with the goal of identifying a specific target and measuring the distance from the camera to that target. I went online and found a bunch of tutorials, so here I'll summarize them for everyone.

First, the basic requirements:

  • OpenCV, naturally. Not much to say here; a working knowledge is enough.
  • A camera. I use a driver-free fisheye camera; the camera built into your computer also works, though it's less convenient.
  • MATLAB, for camera calibration.

All of the above is preamble; let's get down to business.

The methods found online come in two kinds. Here I'll introduce the one a friend recommended to me, which treats ranging as a PnP problem, and I'll also briefly cover two other methods that are simpler and cruder — feasible in principle but much less effective in practice.

Camera distortion correction

When ranging with a monocular camera, we need things called the camera's intrinsic parameters, which are obtained through camera calibration. Let's start from the camera model:

The OpenCV camera model is something every student in this field comes into contact with sooner or later!

The pinhole camera model is the simplest and most common imaging model; the figure below illustrates it:

[Figure: the pinhole camera model]

Here f is a camera parameter we know — the focal length — and the intersection of the optical axis with the imaging plane is called the principal point. X is the length of the arrow and Z is the distance from the camera to the arrow. For this simple, ideal pinhole "camera", we can easily write down the conversion between the yellow arrow's coordinates in the real world and on the imaging plane:

x = f · X / Z

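For example (numbers mine, for illustration only): with f = 500 pixels, an object of height X = 20 cm at Z = 100 cm images to x = 500 · 20 / 100 = 100 pixels.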
In a real camera, however, the imaging plane is a sensor chip behind a lens rather than a pinhole, and the principal point does not land exactly at the center of the imaging plane (that is, the optical axis of the lens and the center of the sensor chip are not perfectly in line), because in actual production we cannot assemble a camera's imaging components with micron-level precision. So we introduce two new parameters, cx and cy, to correct for this hardware offset:

x_screen = fx · (X / Z) + cx
y_screen = fy · (Y / Z) + cy

In the formulas above we introduce two different focal lengths, fx and fy, because a single pixel on the imager is rectangular rather than square. fx is the product of the physical focal length F of the lens and sx, the number of pixels per unit length on the imager:

fx = F · sx
fy = F · sy

From the equations above we know the camera's four intrinsic parameters: fx, fy, cx and cy. In computation, however, we usually apply a transformation using homogeneous coordinates, which yields the following equation:

q = M · Q

where:

M = | fx  0   cx |
    | 0   fy  cy |
    | 0   0   1  |

q = (x, y, w)^T,   Q = (X, Y, Z)^T

Through this formula we can match a point in space with a point in the image (dividing q by w recovers the pixel coordinates). The matrix M in the equation is the camera intrinsic matrix we so often hear about.
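As a quick sanity check, here is a minimal sketch (mine, not from the original post) that projects a 3D point to pixel coordinates with this intrinsic matrix, using the calibration values that appear later in this post:

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // intrinsic matrix M (values from the calibration used later in this post)
    cv::Mat M = (cv::Mat_<double>(3, 3) << 273.4985, 0, 321.2298,
                                           0, 273.3338, 239.7912,
                                           0, 0, 1);
    // a point 30 cm to the right of the optical axis, 100 cm in front of the camera
    cv::Mat Q = (cv::Mat_<double>(3, 1) << 30.0, 0.0, 100.0);
    cv::Mat q = M * Q;                              // homogeneous image point, w = Z
    double x = q.at<double>(0) / q.at<double>(2);   // divide by w to get pixels
    double y = q.at<double>(1) / q.at<double>(2);
    std::cout << "pixel: (" << x << ", " << y << ")" << std::endl;  // about (403, 240)
    return 0;
}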

Camera distortion parameters

Besides the intrinsic parameters, a camera also has distortion parameters. Distortion falls into radial distortion (caused by the shape of the lens) and tangential distortion (caused by the assembly of the camera as a whole).

Radial distortion is caused by the shape of the lens itself. A good lens, after careful processing, shows little distortion, but on ordinary webcams it is particularly prominent. We can view the distortion as the first few terms of a Taylor expansion around r = 0. Typically the first two coefficients, k1 and k2, are used; for fisheye lenses a third, k3, is added. The radial position of a point on the imager can then be adjusted with the following equations, giving us two or three unknowns:

x_corrected = x · (1 + k1·r² + k2·r⁴ + k3·r⁶)
y_corrected = y · (1 + k1·r² + k2·r⁴ + k3·r⁶)

Tangential distortion arises from manufacturing defects that leave the lens not exactly parallel to the imaging plane. It is characterized by two parameters, p1 and p2:

x_corrected = x + [2·p1·x·y + p2·(r² + 2x²)]
y_corrected = y + [p1·(r² + 2y²) + 2·p2·x·y]

At this point we have five parameters in total: k1, k2, k3, p1 and p2. These five are what we need in order to eliminate distortion; together they are called the distortion vector.
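To make the model concrete, here is a small sketch (my own illustration, not from the original post) that applies the radial and tangential terms to a normalized image point:

#include <cmath>
#include <cstdio>

// Apply the 5-parameter distortion model to a normalized image point (x, y).
// k1, k2, k3 are radial coefficients; p1, p2 are tangential coefficients.
void distortPoint(double x, double y,
                  double k1, double k2, double k3, double p1, double p2,
                  double& xd, double& yd)
{
    double r2 = x * x + y * y;   // r^2
    double radial = 1 + k1 * r2 + k2 * r2 * r2 + k3 * r2 * r2 * r2;
    xd = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x);
    yd = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y;
}

int main()
{
    double xd, yd;
    // coefficients from this post's calibration (k3, p1, p2 assumed zero)
    distortPoint(0.5, 0.3, -0.3551, 0.1386, 0, 0, 0, xd, yd);
    printf("distorted: (%f, %f)\n", xd, yd);
    return 0;
}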

Camera Calibration

Counting the distortion vector together with the intrinsics above, the camera has at least eight parameters in total, and eliminating camera distortion relies on camera calibration to solve for these eight unknowns.

Having covered the camera model, we have to talk about camera calibration. Calibration exists to solve for the eight parameters above. And what can we do once we have them? We can eliminate distortion in software: with those eight parameters and the formulas listed above, every pixel's offset can be mapped back to where it belongs.

Calibration requires a thing called a calibration board. There are many kinds, but the common one is a checkerboard. The board must be made with very high accuracy — the cells perfectly square — so a commercial calibration board is very expensive, and even the downloadable board images on CSDN cost a fair amount of points. But you can draw one yourself in Word, and it's simple: make a 5-by-7 table, widen it to a full page, set the width and height of every cell equal, and fill alternating cells with black. I drew such a picture but never printed it out, so I suggest you make your own.

[Figure: checkerboard calibration pattern]
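If you'd rather generate the pattern than draw it in Word, a small sketch like this (mine, with an assumed square size) renders a printable checkerboard with OpenCV:

#include <opencv2/opencv.hpp>

int main()
{
    const int rows = 5, cols = 7;   // a 5-by-7 grid, as suggested above
    const int square = 100;         // square size in pixels (assumed; scale when printing)
    cv::Mat board(rows * square, cols * square, CV_8UC1, cv::Scalar(255));
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            if ((r + c) % 2 == 0)   // fill alternating squares with black
                cv::rectangle(board,
                              cv::Rect(c * square, r * square, square, square),
                              cv::Scalar(0), cv::FILLED);
    cv::imwrite("checkerboard.png", board);
    return 0;
}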

The calibration itself is done in MATLAB. I won't describe the process here since there are plenty of tutorials on CSDN; once calibration finishes, MATLAB returns the camera's intrinsic parameters and distortion coefficients. The book "Learning OpenCV 3" explains the principles very well, and beyond copying the book I can't say anything new — and for today it doesn't matter if the principles aren't fully clear.
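If you prefer to stay in OpenCV rather than MATLAB, the same eight parameters can be solved with findChessboardCorners() and calibrateCamera(). The sketch below is mine, assuming a board with 6x4 inner corners and images named calib0.jpg, calib1.jpg, and so on:

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>
#include <string>

int main()
{
    cv::Size patternSize(6, 4);   // inner corners of a 7x5 checkerboard (assumed)
    std::vector<std::vector<cv::Point3f>> objectPoints;  // world coordinates per view
    std::vector<std::vector<cv::Point2f>> imagePoints;   // detected corners per view
    std::vector<cv::Point3f> corners3d;                  // board corners, 1 square = 1 unit
    for (int r = 0; r < patternSize.height; r++)
        for (int c = 0; c < patternSize.width; c++)
            corners3d.push_back(cv::Point3f((float)c, (float)r, 0.0f));
    cv::Size imageSize;
    for (int i = 0; i < 15; i++)   // 15 shots of the board (assumed file names)
    {
        cv::Mat img = cv::imread("calib" + std::to_string(i) + ".jpg", cv::IMREAD_GRAYSCALE);
        if (img.empty()) continue;
        imageSize = img.size();
        std::vector<cv::Point2f> corners;
        if (cv::findChessboardCorners(img, patternSize, corners))
        {
            cv::cornerSubPix(img, corners, cv::Size(11, 11), cv::Size(-1, -1),
                cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.01));
            imagePoints.push_back(corners);
            objectPoints.push_back(corners3d);
        }
    }
    cv::Mat cameraMatrix, distCoeffs;
    std::vector<cv::Mat> rvecs, tvecs;   // per-view board poses (not needed afterwards)
    cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                        cameraMatrix, distCoeffs, rvecs, tvecs);
    std::cout << "intrinsics:\n" << cameraMatrix << "\ndistortion:\n" << distCoeffs << std::endl;
    return 0;
}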

With the intrinsic and distortion parameters in hand, we can remove the camera's distortion:


#include <opencv2/opencv.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>
#include <stdio.h>
using namespace std;
using namespace cv;

const int imageWidth = 640;    // image size, i.e. the camera's resolution
const int imageHeight = 480;
Size imageSize = Size(imageWidth, imageHeight);
Mat mapx, mapy;
// camera intrinsic matrix
Mat cameraMatrix = (Mat_<double>(3, 3) << 273.4985, 0, 321.2298,
                                          0, 273.3338, 239.7912,
                                          0, 0, 1);
// distortion coefficients
Mat distCoeff = (Mat_<double>(1, 4) << -0.3551, 0.1386, 0, 0);
Mat R = Mat::eye(3, 3, CV_32F);
VideoCapture cap1;             // the camera

void img_init(void)            // initialize the camera
{
    cap1.open(0);
    cap1.set(CAP_PROP_FOURCC, 'GPJM');
    cap1.set(CAP_PROP_FRAME_WIDTH, imageWidth);
    cap1.set(CAP_PROP_FRAME_HEIGHT, imageHeight);
}

int main()
{
    initUndistortRectifyMap(cameraMatrix, distCoeff, R, cameraMatrix,
                            imageSize, CV_32FC1, mapx, mapy);
    Mat frame;
    img_init();
    while (1)
    {
        cap1 >> frame;
        imshow("original fisheye image", frame);
        remap(frame, frame, mapx, mapy, INTER_LINEAR);
        imshow("after distortion removal", frame);
        waitKey(30);
    }
    return 0;
}

In the source above, two functions do the real work; both are provided by OpenCV for eliminating distortion.

The correction maps are computed with cv::initUndistortRectifyMap(), whose prototype is as follows:

void cv::initUndistortRectifyMap(
    InputArray cameraMatrix,     // the 3x3 intrinsic matrix
    InputArray distCoeffs,       // the 1x4 distortion coefficient vector
    InputArray R,                // a rotation matrix, or noArray(); applied before correction
                                 // to compensate the camera's rotation relative to the global coordinate system
    InputArray newCameraMatrix,  // generally not used for monocular imaging
    Size size,                   // size of the output maps, the same as the image to be corrected
    int m1type,                  // type of map1: either CV_32FC1 or CV_16SC2
    OutputArray map1,
    OutputArray map2);

We simply call this function once at the start of the program to compute the correction maps, then apply the correction to every video frame with cv::remap().

[Figure: camera image before and after distortion removal]

PnP ranging

Well, at this point we know a little something about cameras. So what is the PnP problem? In some cases we already know the camera's intrinsic parameters and only need to compute the position of the object being observed. This situation differs markedly from general camera calibration, but the two share common ground. The operation is called Perspective-N-Point, or the PnP problem.

bool cv::solvePnP(
    cv::InputArray objectPoints,    // coordinates of at least four points in the world coordinate system
    cv::InputArray imagePoints,     // pixel coordinates of the same points in the image
    cv::InputArray cameraMatrix,    // the 3x3 camera intrinsic matrix
    cv::InputArray distCoeffs,      // the distortion vector, 1x4 or 1x5
    cv::OutputArray rvec,           // output rotation vector
    cv::OutputArray tvec,           // output translation vector
    bool useExtrinsicGuess = false,
    int flags = cv::SOLVEPNP_ITERATIVE);

First, let's explain what this function outputs.

The rotation output rvec is a 3x1 vector; it expresses the camera's rotation about the three XYZ axes of the world coordinate system.

The translation output tvec is a 3-dimensional vector; it expresses the camera's offset along the XYZ axes relative to the object, and it is the quantity we are after: knowing the object's position relative to the camera gives us the measured distance, which is the whole point of ranging.
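For instance (my note, not from the original post), once solvePnP has filled in tvec, the straight-line distance to the object is just its Euclidean norm:

#include <opencv2/opencv.hpp>

// straight-line distance to the object, from solvePnP's translation vector;
// same units as the object points passed to solvePnP (cm later in this post)
double rangeFromTvec(const cv::Mat& tVec)
{
    return cv::norm(tVec);   // sqrt(tx^2 + ty^2 + tz^2)
}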

What about the inputs? The camera intrinsic matrix and the distortion vector need no further introduction.

The first argument is the 3D coordinates, in the world coordinate system, of any four points on the object. Why four? It's easy to see: we are solving for a rotation plus offsets along the XYZ axes — four unknown quantities in all — so at least four correspondences are needed to solve the system.

For a more detailed explanation, see this CSDN post:

https://blog.csdn.net/cocoaqin/article/details/77841261

The second argument is the pixel coordinates, found in the image, of the same four object points given in the first argument.

Now it should be clear: through the rotation vector and the translation vector we obtain the rotation and translation of the camera coordinate system relative to the world coordinate system.

But one problem remains: how do we establish which point is which? Say the object is a square board of side 2L. I can choose the board's center as the origin of the world coordinate system, so the four corners get coordinates (L, L), (L, -L), (-L, L) and (-L, -L). But how do we determine which of the four points in the image corresponds to which corner of the board? We need to recognize the board. And what if the board is not alone in the scene? How do we separate it from everything else? That calls for heavier machinery — semantic segmentation, classifiers and so on — which I won't go into here.

Suppose we don't take the board's four corners but instead pick any four points by corner detection. That would settle the correspondence between the world coordinate system and pixel coordinates, but it raises a new problem: how do we guarantee the four detected corners lie on the object rather than in the background? Again, we'd have to recognize the board...

Which is why we bring in QR codes: we can recognize a QR code directly and range against it. For this we need a library called ZBar, which can recognize both QR codes and barcodes; search online for the specifics. So do we have to learn a new library before we've even finished learning OpenCV? Not really: the two example programs below already cover everything we need:

Example 1:
#include <zbar.h>
#include <opencv2/opencv.hpp>
#include <iostream>

int main(int argc, char* argv[])
{
    zbar::ImageScanner scanner;
    scanner.set_config(zbar::ZBAR_NONE, zbar::ZBAR_CFG_ENABLE, 1);
    cv::VideoCapture capture;
    capture.open(0);                       // open the camera
    cv::Mat image;
    cv::Mat imageGray;
    std::vector<cv::Point2f> obj_location;
    bool flag = true;
    if (!capture.isOpened())
    {
        std::cout << "cannot open cam!" << std::endl;
    }
    else
    {
        while (flag)
        {
            capture >> image;
            cv::cvtColor(image, imageGray, cv::COLOR_BGR2GRAY);
            int width = imageGray.cols;
            int height = imageGray.rows;
            uchar* raw = (uchar*)imageGray.data;
            zbar::Image imageZbar(width, height, "Y800", raw, width * height);
            scanner.scan(imageZbar);       // scan for barcodes
            zbar::Image::SymbolIterator symbol = imageZbar.symbol_begin();
            if (imageZbar.symbol_begin() != imageZbar.symbol_end())  // a QR code was found
            {
                flag = false;
                // parse the QR code
                for (int i = 0; i < symbol->get_location_size(); i++)
                {
                    obj_location.push_back(cv::Point(symbol->get_location_x(i), symbol->get_location_y(i)));
                }
                for (int i = 0; i < obj_location.size(); i++)
                {
                    // outline the code
                    cv::line(image, obj_location[i], obj_location[(i + 1) % obj_location.size()],
                             cv::Scalar(255, 0, 0), 3);
                }
                for (; symbol != imageZbar.symbol_end(); ++symbol)
                {
                    std::cout << "Code Type: " << std::endl << symbol->get_type_name() << std::endl;  // code type
                    std::cout << "Decode Result: " << std::endl << symbol->get_data() << std::endl;   // decoded data
                }
                imageZbar.set_data(NULL, 0);
            }
            cv::imshow("Result", image);
            cv::waitKey(50);
        }
        cv::waitKey();
    }
    return 0;
}

This program opens the camera, recognizes any QR code it sees, and prints the code's content and type:

[Figure: decoded QR code type and content printed to the console]

So how do we configure the ZBar library to work with VS2017 and the OpenCV library? See my CSDN post "Win10 + VS2017 + opencv410 + ZBar perfect Configuration Library":

https://blog.csdn.net/qq_43667130/article/details/104128684

Example 2:

#include <opencv2/opencv.hpp>
#include <zbar.h>
using namespace cv;
using namespace std;
using namespace zbar;

typedef struct
{
    string type;
    string data;
    vector<Point> location;
} decodedObject;

// Find and decode barcodes and QR codes
void decode(Mat& im, vector<decodedObject>& decodedObjects)
{
    // Create zbar scanner
    ImageScanner scanner;
    // Configure scanner
    scanner.set_config(ZBAR_NONE, ZBAR_CFG_ENABLE, 1);
    // Convert image to grayscale
    Mat imGray;
    cvtColor(im, imGray, COLOR_BGR2GRAY);
    // Wrap image data in a zbar image
    Image image(im.cols, im.rows, "Y800", (uchar*)imGray.data, im.cols * im.rows);
    // Scan the image for barcodes and QR codes
    int n = scanner.scan(image);
    // Print results
    for (Image::SymbolIterator symbol = image.symbol_begin(); symbol != image.symbol_end(); ++symbol)
    {
        decodedObject obj;
        obj.type = symbol->get_type_name();
        obj.data = symbol->get_data();
        // Print type and data
        cout << "Type : " << obj.type << endl;
        cout << "Data : " << obj.data << endl << endl;
        // Obtain location
        for (int i = 0; i < symbol->get_location_size(); i++)
        {
            obj.location.push_back(Point(symbol->get_location_x(i), symbol->get_location_y(i)));
        }
        decodedObjects.push_back(obj);
    }
}

// Display barcode and QR code location
void display(Mat& im, vector<decodedObject>& decodedObjects)
{
    // Loop over all decoded objects
    for (int i = 0; i < decodedObjects.size(); i++)
    {
        vector<Point> points = decodedObjects[i].location;
        vector<Point> hull;
        // If the points do not form a quad, find convex hull
        if (points.size() > 4)
            convexHull(points, hull);
        else
            hull = points;
        // Number of points in the convex hull
        int n = hull.size();
        for (int j = 0; j < n; j++)
        {
            line(im, hull[j], hull[(j + 1) % n], Scalar(255, 0, 0), 3);
        }
    }
    // Display results
    imshow("Results", im);
    waitKey(0);
}

int main(int argc, char* argv[])
{
    // Read image
    Mat im = imread("zbar-test.jpg");
    // Variable for decoded objects
    vector<decodedObject> decodedObjects;
    // Find and decode barcodes and QR codes
    decode(im, decodedObjects);
    // Display location
    display(im, decodedObjects);
    return EXIT_SUCCESS;
}

This example builds on Example 1: in addition to decoding, it recognizes the position of the QR code.

Code

So how do we write the actual ranging code? We take the two examples above, add the distortion-correction code, and add a solvePnP call:

vector<Point3f> obj = vector<Point3f>{
    cv::Point3f(-HALF_LENGTH, -HALF_LENGTH, 0),  // tl
    cv::Point3f(HALF_LENGTH, -HALF_LENGTH, 0),   // tr
    cv::Point3f(HALF_LENGTH, HALF_LENGTH, 0),    // br
    cv::Point3f(-HALF_LENGTH, HALF_LENGTH, 0)    // bl
};  // world coordinates of the QR code's four corners
cv::Mat rVec = cv::Mat::zeros(3, 1, CV_64FC1);   // init rvec
cv::Mat tVec = cv::Mat::zeros(3, 1, CV_64FC1);   // init tvec
solvePnP(obj, pnts, cameraMatrix, distCoeff, rVec, tVec, false, SOLVEPNP_ITERATIVE);

Putting the three parts together gives the complete monocular ranging code:

#include "pch.h"
#include <iostream>
#include <opencv2/opencv.hpp>
#include <zbar.h>
using namespace cv;
using namespace std;

#define HALF_LENGTH 15   // half the width of the QR code, in cm

const int imageWidth = 640;    // image size, i.e. the camera's resolution
const int imageHeight = 480;
Size imageSize = Size(imageWidth, imageHeight);
Mat mapx, mapy;
// camera intrinsic matrix
Mat cameraMatrix = (Mat_<double>(3, 3) << 273.4985, 0, 321.2298,
                                          0, 273.3338, 239.7912,
                                          0, 0, 1);
// distortion coefficients
Mat distCoeff = (Mat_<double>(1, 4) << -0.3551, 0.1386, 0, 0);
Mat R = Mat::eye(3, 3, CV_32F);
VideoCapture cap1;

typedef struct   // a decoded QR code object
{
    string type;
    string data;
    vector<Point> location;
} decodedObject;

void img_init(void);
void decode(Mat& im, vector<decodedObject>& decodedObjects);
void display(Mat& im, vector<decodedObject>& decodedObjects);

int main(int argc, char* argv[])
{
    initUndistortRectifyMap(cameraMatrix, distCoeff, R, cameraMatrix, imageSize, CV_32FC1, mapx, mapy);
    img_init();
    namedWindow("yuantu", WINDOW_AUTOSIZE);
    Mat im;
    while (waitKey(1) != 'q')
    {
        cap1 >> im;
        if (im.empty()) break;
        remap(im, im, mapx, mapy, INTER_LINEAR);   // distortion correction
        imshow("yuantu", im);
        // decoded object variable
        vector<decodedObject> decodedObjects;
        // find and decode barcodes and QR codes
        decode(im, decodedObjects);
        // display the position and solve PnP
        display(im, decodedObjects);
        // vector<Point> points_xy = decodedObjects[0].location; // assuming the QR code sits on the object, take its four corners
        imshow("QR code", im);
        waitKey(30);
    }
    return EXIT_SUCCESS;
}

void img_init(void)   // initialize the camera
{
    cap1.open(0);
    cap1.set(CAP_PROP_FOURCC, 'GPJM');
    cap1.set(CAP_PROP_FRAME_WIDTH, imageWidth);
    cap1.set(CAP_PROP_FRAME_HEIGHT, imageHeight);
}

// find and decode barcodes and QR codes in the input image
void decode(Mat& im, vector<decodedObject>& decodedObjects)
{
    // create a zbar scanner
    zbar::ImageScanner scanner;
    // configure the scanner
    scanner.set_config(zbar::ZBAR_NONE, zbar::ZBAR_CFG_ENABLE, 1);
    // convert the image to grayscale
    Mat imGray;
    cvtColor(im, imGray, COLOR_BGR2GRAY);
    // wrap the image data in a zbar image
    // see: https://blog.csdn.net/bbdxf/article/details/79356259
    zbar::Image image(im.cols, im.rows, "Y800", (uchar*)imGray.data, im.cols * im.rows);
    // scan the image for barcodes and QR codes
    int n = scanner.scan(image);
    for (zbar::Image::SymbolIterator symbol = image.symbol_begin(); symbol != image.symbol_end(); ++symbol)
    {
        decodedObject obj;
        obj.type = symbol->get_type_name();
        obj.data = symbol->get_data();
        // print type and data if desired
        //cout << "Type : " << obj.type << endl;
        //cout << "Data : " << obj.data << endl << endl;
        // obtain the corner locations
        for (int i = 0; i < symbol->get_location_size(); i++)
        {
            obj.location.push_back(Point(symbol->get_location_x(i), symbol->get_location_y(i)));
        }
        decodedObjects.push_back(obj);
    }
}

// display the position of each code and solve PnP
void display(Mat& im, vector<decodedObject>& decodedObjects)
{
    // loop over all decoded objects
    for (int i = 0; i < decodedObjects.size(); i++)
    {
        vector<Point> points = decodedObjects[i].location;
        vector<Point> hull;
        // if the points do not form a quad, find the convex hull
        if (points.size() > 4)
            convexHull(points, hull);
        else
            hull = points;
        vector<Point2f> pnts;
        int n = hull.size();   // number of points in the hull
        for (int j = 0; j < n; j++)
        {
            line(im, hull[j], hull[(j + 1) % n], Scalar(255, 0, 0), 3);
            pnts.push_back(Point2f(hull[j].x, hull[j].y));
        }
        // world coordinates of the QR code's four corners
        vector<Point3f> obj = vector<Point3f>{
            cv::Point3f(-HALF_LENGTH, -HALF_LENGTH, 0),  // tl
            cv::Point3f(HALF_LENGTH, -HALF_LENGTH, 0),   // tr
            cv::Point3f(HALF_LENGTH, HALF_LENGTH, 0),    // br
            cv::Point3f(-HALF_LENGTH, HALF_LENGTH, 0)    // bl
        };
        cv::Mat rVec = cv::Mat::zeros(3, 1, CV_64FC1);   // init rvec
        cv::Mat tVec = cv::Mat::zeros(3, 1, CV_64FC1);   // init tvec
        solvePnP(obj, pnts, cameraMatrix, distCoeff, rVec, tVec, false, SOLVEPNP_ITERATIVE);
        cout << "tvec:" << endl << tVec << endl;
    }
}

The figure below shows the results:

[Figure: ranging results — the translation vector printed to the console]

The three numbers are the distances along X, Y and Z, in cm; the precision can reach 0.1 cm.

 

Similar-triangles ranging

Remember the little pinhole camera model from the beginning of the article?

[Figure: the pinhole camera model]

Similar-triangles ranging is based on that ideal, simple model: with the object's size and the focal length f known, measure the length of the object's image on the sensor, and the distance Z follows from the formula below.

Z = f · X / x

where X is the object's real length and x is the measured length of its image.
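In code the whole method is essentially one line (a sketch of mine, with made-up numbers):

#include <cstdio>

// Z = f * X / x : distance from a known object size and its size in the image.
// f in pixels, X in cm, x in pixels, so Z comes out in cm.
double triangleRange(double f, double X, double x)
{
    return f * X / x;
}

int main()
{
    // e.g. with f = 273.5 px (this post's calibration), a 30 cm object
    // spanning 82 px in the image is about 100 cm away
    printf("Z = %.1f cm\n", triangleRange(273.5, 30.0, 82.0));
    return 0;
}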

Pixel-area ranging

I learned this method back when I played with OpenMV; it is the monocular ranging algorithm that OpenMV packages. First photograph the target object at a fixed distance (say 10 cm) and measure the pixel area the object occupies in the photo, obtaining a scale factor K; afterwards, with the object at any position, the distance can be estimated from its pixel area.
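A minimal sketch of the idea (mine, with assumed numbers): the apparent linear size of the object scales as 1/Z, so its pixel area scales as 1/Z², and K = d · sqrt(A) stays constant:

#include <cmath>
#include <cstdio>

int main()
{
    // calibration shot: the object at a known distance, with its pixel area measured
    double refDist = 10.0;     // cm, the fixed calibration distance
    double refArea = 5200.0;   // px^2, measured in the calibration photo (assumed)
    double K = refDist * std::sqrt(refArea);   // scale factor
    // later: the same object seen with a new pixel area
    double area = 1300.0;      // px^2 (assumed)
    double dist = K / std::sqrt(area);
    printf("estimated distance: %.1f cm\n", dist);   // about 20 cm
    return 0;
}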

That said, these last two methods are certainly nowhere near as robust as the PnP approach.

 
