Converting Between Camera, World, and Pixel Coordinate Systems, and Converting Camera Focal Length to OpenGL FOV

Purpose: understand how the camera works, and the rendering and depth maps captured by the camera.

Recently I have been studying the relationship between camera parameters and OpenGL-rendered images. Going from the camera coordinate system to the pixel coordinate system involves the projection matrix.

Theoretical Camera Transformation

Some camera theory first. Normally, we use camera parameters to build a projection, and this generally goes through the following two steps:
camera coordinate system -> image coordinate system -> pixel coordinate system
Suppose we have a vertex $p(x_c, y_c, z_c)$ in the camera coordinate system that is transformed into a point $p(x_i, y_i)$ on the image; its transformation matrix is $M_{proj}$.
The steps are introduced below:

  1. Camera coordinate system to image coordinate system, $M_{c2i}$
    At the same location, different cameras produce different pictures, so this step clearly depends on the camera's parameters. These parameters are usually listed in the camera manual (documentation for most models can also be found online), and sometimes in a configuration file (for example, the camera parameters of the image camera in a scanner). To understand what these parameters mean, we need to understand the principle of camera imaging: the pinhole model.
    [Figure: pinhole camera imaging geometry]
    The image coordinate system is $(O'x, O'y)$ and the camera coordinate system is $(O_c x_c, O_c y_c, O_c z_c)$. To make this easier to visualize, note that the image plane $(O'x, O'y)$ is parallel to the $(O_c x_c, O_c y_c)$ plane, so by geometric symmetry the image plane can be moved to the mirrored position in front of the camera center, as shown in the figure below, where $\angle O_c A B$ is a right angle. (The figure is not perfectly accurate.)

[Figure: mirrored image plane with the similar triangles used below]
Because of the first pair of similar triangles:
$$\triangle ABO_c \sim \triangle oCO_c$$
we get the following relation:
$$\frac{O_c o}{O_c Z_c} = \frac{oC}{AB} = \frac{O_c C}{O_c B} \quad (1)$$
Because $O_c$ is the origin, these ratios can be rewritten (changing symbols) as:
$$\frac{O_c o}{O_c Z_c} = \frac{f}{Z_c} \quad (2)$$
$$\frac{oC}{AB} = \frac{x}{X_c} \quad (3)$$

Because of the second pair of similar triangles:
$$\triangle pCO_c \sim \triangle PBO_c$$
the following relation is obtained:
$$\frac{O_c C}{O_c B} = \frac{Cp}{BP} \quad (4)$$
Again because $O_c$ is the origin, this can be rewritten as:
$$\frac{Cp}{BP} = \frac{y}{Y_c} \quad (5)$$
Combining equations $(1)(2)(3)(4)(5)$ gives:
$$\frac{f}{Z_c} = \frac{x}{X_c} = \frac{y}{Y_c}$$
which can be rearranged into:
$$x = \frac{f \cdot X_c}{Z_c} \quad (6)$$
$$y = \frac{f \cdot Y_c}{Z_c} \quad (7)$$
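As a quick sanity check, take some assumed example values: a 35 mm lens and a point 1 m in front of the camera:
$$f = 35\,\mathrm{mm},\quad (X_c, Y_c, Z_c) = (100, 200, 1000)\,\mathrm{mm} \;\Rightarrow\; x = \frac{35 \cdot 100}{1000} = 3.5\,\mathrm{mm},\quad y = \frac{35 \cdot 200}{1000} = 7\,\mathrm{mm}$$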
Equations $(6)$ and $(7)$ can be written in matrix form as:
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix}$$
where the result is in homogeneous coordinates: dividing $x$ and $y$ by $z = Z_c$ recovers equations $(6)$ and $(7)$.
The transformation matrix $M_{c2i}$ is therefore:
$$M_{c2i} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}$$
which gives the shorthand:
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = M_{c2i} \begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix}$$
The $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ obtained at this point is in physical units, not pixels. To convert it into pixels we need the physical length of each pixel, which is a sensor parameter of the camera. The conversion from image coordinates to pixel coordinates is described below.
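A minimal numpy sketch of this step (the focal length and point are assumed example values, matching the worked example above):

```python
import numpy as np

f = 35.0  # focal length in mm (assumed example value)
M_c2i = np.array([[f, 0, 0, 0],
                  [0, f, 0, 0],
                  [0, 0, 1, 0]], dtype=float)

P_c = np.array([100.0, 200.0, 1000.0, 1.0])  # homogeneous camera-space point, in mm
x, y, z = M_c2i @ P_c
x, y = x / z, y / z  # perspective divide: equations (6) and (7)
print(x, y)          # 3.5 7.0, in mm on the image plane
```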

  2. Image coordinate system to pixel coordinate system, $M_{i2p}$
    Converting the image coordinate system into the pixel coordinate system involves two issues:
    1) Different origins: in the image coordinate system the origin is generally at the center, while the origin of the conventional pixel coordinate system is the upper-left corner of the image, so a translation is needed.
    2) Unit conversion: we need to know the physical size of each pixel (usually in millimeters). For this we need $d_x, d_y$, which indicate how many mm one column and one row represent: the width of one column pixel is $d_x$ mm ($1\ pixel = d_x\ mm$), and the width of one row pixel is $d_y$ mm ($1\ pixel = d_y\ mm$); in general $d_x = d_y$. Therefore the $x$ coordinate expressed in pixels is $\frac{x}{d_x}$, and correspondingly the $y$ coordinate expressed in pixels is $\frac{y}{d_y}$.
    The two coordinate systems are shown in the figure below:
    [Figure: image coordinate system vs. pixel coordinate system]
    Translating the coordinate system gives the following formulas:
$$u = \frac{x}{d_x} + u_0 \quad (8)$$
$$v = \frac{y}{d_y} + v_0 \quad (9)$$
    Equations $(8)$ and $(9)$ written in matrix form:
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{d_x} & 0 & u_0 \\ 0 & \frac{1}{d_y} & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
    so the matrix is:
$$M_{i2p} = \begin{bmatrix} \frac{1}{d_x} & 0 & u_0 \\ 0 & \frac{1}{d_y} & v_0 \\ 0 & 0 & 1 \end{bmatrix}$$
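A numpy sketch of this step (the pixel pitch and principal point are assumed example values):

```python
import numpy as np

dx = dy = 0.01          # assumed pixel size: 0.01 mm per pixel
u0, v0 = 320.0, 240.0   # assumed principal point (center of a 640x480 image)
M_i2p = np.array([[1/dx, 0,    u0],
                  [0,    1/dy, v0],
                  [0,    0,    1]])

x, y = 3.5, 7.0                        # image-plane point in mm, from the previous sketch
u, v, _ = M_i2p @ np.array([x, y, 1.0])
print(u, v)                            # 670.0 940.0, in pixels
```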

Combining the two transformations above gives:
$$M_{proj} = \begin{bmatrix} \frac{1}{d_x} & 0 & u_0 \\ 0 & \frac{1}{d_y} & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}$$
and finally the projection matrix:
$$M_{proj} = \begin{bmatrix} \frac{f}{d_x} & 0 & u_0 & 0 \\ 0 & \frac{f}{d_y} & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 & 0 \\ 0 & f_y & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}$$
From this formula we see that $f_x = \frac{f}{d_x}$ expresses the focal length in pixels (how many pixels the focal length spans along $x$), and similarly $f_y = \frac{f}{d_y}$ expresses it in pixels along $y$. In general $d_x = d_y$, so $f_x = f_y$ and there is a single focal length: the $f$ in the figure above.

In summary (taking $d_x = d_y$, so $f_x = f_y$):
$$M_{proj} = \begin{bmatrix} f_x & 0 & u_0 & 0 \\ 0 & f_x & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}$$
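Putting the two steps together in numpy (same assumed example values as in the sketches above):

```python
import numpy as np

f, dx, dy, u0, v0 = 35.0, 0.01, 0.01, 320.0, 240.0  # assumed example intrinsics

M_c2i = np.array([[f, 0, 0, 0],
                  [0, f, 0, 0],
                  [0, 0, 1, 0]], dtype=float)
M_i2p = np.array([[1/dx, 0,    u0],
                  [0,    1/dy, v0],
                  [0,    0,    1]])
M_proj = M_i2p @ M_c2i   # 3x3 @ 3x4 -> the 3x4 projection matrix above

P_c = np.array([100.0, 200.0, 1000.0, 1.0])  # homogeneous camera-space point
u, v, w = M_proj @ P_c
print(u / w, v / w)      # 670.0 940.0, same pixel as the two-step version
```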

OpenGL Rendering Theory

Now for the OpenGL theory. Normally, building a projection in OpenGL requires fov, aspect ratio, near, and far. These 4 parameters are enough to describe the volume of space to be rendered, and they are easy to understand: they cut out a region of space that is normalized for convenient rendering calculations. For details on camera setup, see the learnopengl tutorial, which covers these 4 parameters. There are two ways to set up the projection; their principles are introduced below:
[Figure: view frustum defined by fov, aspect ratio, near, and far]
If you do not want to express the camera in terms of fov, aspect ratio, near, and far, you can build the projection matrix directly, e.g. via glFrustum. But OpenGL tutorials usually use gluPerspective, whose parameters are fov, aspect ratio, near, and far. You can either set the projection matrix directly in the shader, imitating the matrix gluPerspective generates, or convert the camera parameters into those four values and let gluPerspective generate the projection matrix for you. Either way works.
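As a sketch of the two routes, here are numpy versions of the matrices documented in the OpenGL reference pages for glFrustum and gluPerspective (the near/far/fov values below are assumed examples); for a symmetric frustum the two coincide:

```python
import numpy as np

def frustum(l, r, b, t, n, f):
    # Matrix documented for glFrustum(l, r, b, t, n, f)
    return np.array([
        [2*n/(r-l), 0,         (r+l)/(r-l),  0],
        [0,         2*n/(t-b), (t+b)/(t-b),  0],
        [0,         0,         -(f+n)/(f-n), -2*f*n/(f-n)],
        [0,         0,         -1,           0]])

def perspective(fovy_deg, aspect, n, f):
    # Matrix documented for gluPerspective(fovy, aspect, n, f)
    g = 1.0 / np.tan(np.radians(fovy_deg) / 2.0)
    return np.array([
        [g/aspect, 0, 0,            0],
        [0,        g, 0,            0],
        [0,        0, (f+n)/(n-f),  2*f*n/(n-f)],
        [0,        0, -1,           0]])

# Example: a symmetric frustum with fovy = 60 degrees and a 4:3 aspect ratio.
n, f = 0.1, 100.0
t = n * np.tan(np.radians(60.0) / 2.0)  # half-height of the near plane
r = t * (4.0 / 3.0)                     # half-width, from the aspect ratio
assert np.allclose(frustum(-r, r, -t, t, n, f),
                   perspective(60.0, 4.0 / 3.0, n, f))
```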
1) Converting the camera parameters into the four parameters fov, aspect ratio, near, and far.
First understand these four parameters:
1) Aspect ratio: the x/y ratio of the final displayed image. FOV (field of view): the opening angle of the visible field, see the figure below.
2) near and far: the distances from the origin to the near and far clipping planes, the two parallel planes in the figure.
[Figure: fov, top, and the near/far planes]
From the figure above we can read off the relationship:
$$\tan\left(\frac{FOV}{2}\right) = \frac{top}{near}$$
If we need to render a picture consistent with a given image img, we set the aspect ratio and FOV from that image. By definition, aspect is the x/y ratio of the displayed image, so:
$$aspect = \frac{img.cols}{img.rows}$$
Substituting into the relationship above: top corresponds to half of the extent along the y axis in the pixel coordinate system, i.e. half the image height in pixels:
$$top = \frac{pixelHeight}{2.0}$$
and near corresponds to the focal length expressed in pixels, so that the units are consistent. This yields:
$$FOV = 2.0 \cdot \mathrm{atan}\left(\frac{pixelHeight}{2.0 \cdot focalLength}\right) \cdot \frac{180.0}{\pi}$$
where the factor $\frac{180.0}{\pi}$ converts radians to degrees (gluPerspective expects degrees).
The near and far distances themselves can be chosen freely, as close and as far as needed to enclose the scene.
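A numpy sketch of this conversion (the image size and focal length are assumed example values):

```python
import numpy as np

W, H = 640, 480       # img.cols, img.rows (assumed example image)
focal_length = 800.0  # focal length in pixels, i.e. f_x = f_y (assumed)

aspect = W / H
fov = 2.0 * np.arctan((H / 2.0) / focal_length) * 180.0 / np.pi  # degrees
print(aspect, fov)    # 1.333... 33.40...
```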
The resulting projection matrix (the one gluPerspective builds, with $g = \cot(\frac{FOV}{2})$) is:
$$M = \begin{bmatrix} \frac{g}{aspect} & 0 & 0 & 0 \\ 0 & g & 0 & 0 \\ 0 & 0 & \frac{far + near}{near - far} & \frac{2 \cdot far \cdot near}{near - far} \\ 0 & 0 & -1 & 0 \end{bmatrix}$$
With these parameters set, the rendered image is consistent with the image captured by the camera.
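As a closing sanity check, here is a hedged sketch verifying that the pinhole projection from the first section and the OpenGL projection with the FOV computed above land a point on the same pixel. It assumes the principal point is at the image center, $f_x = f_y$, and the usual convention flip (the pinhole camera looks down $+z$ with $y$ down; OpenGL looks down $-z$ with $y$ up):

```python
import numpy as np

W, H = 640, 480       # assumed image size
f_pix = 800.0         # assumed focal length in pixels
fovy = 2.0 * np.degrees(np.arctan((H / 2.0) / f_pix))
aspect = W / H

def perspective(fovy_deg, aspect, n, f):
    # Matrix documented for gluPerspective
    g = 1.0 / np.tan(np.radians(fovy_deg) / 2.0)
    return np.array([[g/aspect, 0, 0,            0],
                     [0,        g, 0,            0],
                     [0,        0, (f+n)/(n-f),  2*f*n/(n-f)],
                     [0,        0, -1,           0]])

# Pinhole projection of a camera-space point (z forward, pixels from top-left):
X, Y, Z = 0.3, -0.2, 2.0
u_cv = f_pix * X / Z + W / 2.0
v_cv = f_pix * Y / Z + H / 2.0

# OpenGL: the same point expressed as (X, -Y, -Z), projected and mapped to pixels.
clip = perspective(fovy, aspect, 0.1, 100.0) @ np.array([X, -Y, -Z, 1.0])
ndc = clip[:3] / clip[3]
u_gl = (ndc[0] + 1) / 2 * W
v_gl = H - (ndc[1] + 1) / 2 * H   # flip: OpenGL rows count from the bottom

assert np.allclose([u_cv, v_cv], [u_gl, v_gl])  # both give pixel (440, 160)
```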

The full derivation process will be added in a follow-up.


Source: blog.csdn.net/weixin_43851636/article/details/125082129