Unity Shader graphics: the principle and derivation of the perspective and orthographic projection matrices, with the most intuitive understanding of what the matrices mean

Preface

While reading "Introduction to UnityShader", I found that the book does not give the derivation of the projection matrices.
Projection matrices play a key role in graphics and are used in game development, virtual reality and augmented reality, robotics, and machine vision.
I want to explain, in the most intuitive way, how the perspective and orthographic projection matrices are obtained.
To follow this article, you need a clear and thorough understanding of the three basic transformation matrices: scaling, rotation, and translation.

Orthographic projection

Let's start with the simpler orthographic projection. If you want to learn the perspective matrix, it is recommended to learn the orthographic matrix first.
Size: half the height of the view volume in the vertical direction, $Size = \frac{NearHeight}{2}$
Aspect: the camera's aspect ratio, $Aspect = \frac{NearWidth}{NearHeight}$
The goal is to find the transformation matrix from the known Size, Near, Far, and Aspect.
First, scale the cuboid so that x, y, and z all lie in [-1, 1].
For the y-axis, take point 2 in the figure, a corner of the near clipping plane: its y-coordinate is $\frac{NearHeight}{2} = Size$, and it must map to 1, so $k_2 = \frac{1}{Size}$.
Intuitively, the y-axis extent goes from Size to 1, so the y-axis scaling factor is $\frac{1}{Size}$.
Aspect (abbreviated a) $= \frac{NearWidth}{NearHeight}$, so $w = ah$ and, for point 2, $x = ay$.

For the x-axis, note that the ratio $a = \frac{x}{y}$ (so $x = ay$) holds before the transformation.
$x' = k_1 x$
$y' = k_2 y$
$x' = y' = 1$ (after the transformation, the x' and y' coordinates of point 2 are both 1)
$k_1 x = k_2 y$
$k_1 = k_2 \cdot \frac{y}{x}$
$a = \frac{x}{y}$
$k_1 = \frac{k_2}{a}$
$k_2 = \frac{1}{Size}$
$k_1 = \frac{1}{Aspect \cdot Size}$
Since $w = ah$, why is it $k_1 = \frac{k_2}{a}$ rather than $k_1 = a k_2$? Because a is the ratio before the transformation, whereas after the transformation $x' = y'$.
Assume a landscape screen, where the width is greater than the height, that is, $a > 1$ and $x > y$. To get $x' = y'$, x must be scaled by less than y, so $k_1$ is $k_2$ multiplied by $\frac{1}{a}$.
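The x and y scaling factors can be checked numerically. The following is a minimal sketch, assuming illustrative values for Size and Aspect that do not come from the article; the function name `ortho_xy_scale` is hypothetical.

```python
def ortho_xy_scale(size, aspect):
    """Return (k1, k2): the x and y scaling factors of the orthographic case."""
    k2 = 1.0 / size      # y-axis: Size must map to 1
    k1 = k2 / aspect     # x-axis: Aspect * Size must map to 1
    return k1, k2


# Illustrative values (assumptions, not taken from the article).
size, aspect = 5.0, 16.0 / 9.0

# Point 2 before the transformation: y = Size and x = a * y.
x, y = aspect * size, size

k1, k2 = ortho_xy_scale(size, aspect)
x_new, y_new = k1 * x, k2 * y

# Both coordinates should land on 1 (up to floating-point rounding).
print(x_new, y_new)

# Using k1 = a * k2 instead would overshoot on a landscape screen.
x_wrong = (aspect * k2) * x
print(x_wrong)
```

With $a > 1$, the mistaken factor $a k_2$ pushes x' to $a^2$ instead of 1, which is why the division by a is needed.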


For the z-axis, compare the length of the cuboid before and after the transformation: $z = Far - Near$ and $z' = 2$. The z-axis is also flipped, so the factor is multiplied by -1.
$Far - Near \Rightarrow 2$
$z' = k_3 z$
$k_3 = -\frac{2}{Far-Near}$
Matrix so far (the z translation, marked ?, is still unknown):
$$\begin{bmatrix} \frac{1}{Aspect \cdot Size} & 0 & 0 & 0 \\ 0 & \frac{1}{Size} & 0 & 0 \\ 0 & 0 & -\frac{2}{Far-Near} & ? \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Next, consider the translation along the z-axis to complete the matrix.
The translation consists of two parts, both measured in the scaled space shown in the figure.
One part is the distance from the camera to the near clipping plane, and the other is half the length of the cube.
Because of the scaling, the distance from the camera to the near clipping plane is no longer Near but $Near \cdot k_3 = -\frac{2Near}{Far-Near}$.
Half the length of the cube is 1. Since the z-axis is flipped, it is subtracted:
$-\frac{2Near}{Far-Near} - 1 = -\frac{2Near}{Far-Near} - \frac{Far-Near}{Far-Near} = \frac{-Near-Far}{Far-Near} = -\frac{Far+Near}{Far-Near}$ (consistent with the form given in the book)
At this point, the orthographic projection matrix has been derived:
$$\begin{bmatrix} \frac{1}{Aspect \cdot Size} & 0 & 0 & 0 \\ 0 & \frac{1}{Size} & 0 & 0 \\ 0 & 0 & -\frac{2}{Far-Near} & -\frac{Far+Near}{Far-Near} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
If it is unclear why the matrix is written this way, review the three basic matrices: scaling, rotation, and translation.
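As a sanity check, the orthographic matrix can be built and applied to two opposite corners of the view volume. This is a sketch under stated assumptions: the x row uses $k_1 = \frac{1}{Aspect \cdot Size}$ and the y row uses $k_2 = \frac{1}{Size}$, following the scaling factors derived above; the camera is assumed to look down the negative z-axis in view space (matching the z flip in the derivation); the numeric values and the helper names `ortho_matrix` and `transform` are illustrative.

```python
def ortho_matrix(size, aspect, near, far):
    """4x4 orthographic projection matrix as a list of rows."""
    return [
        [1.0 / (aspect * size), 0.0, 0.0, 0.0],
        [0.0, 1.0 / size, 0.0, 0.0],
        [0.0, 0.0, -2.0 / (far - near), -(far + near) / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(m, p):
    """Multiply a 4x4 matrix by a homogeneous column vector p."""
    return [sum(m[r][c] * p[c] for c in range(4)) for r in range(4)]


# Illustrative camera settings (assumptions, not taken from the article).
size, aspect, near, far = 5.0, 16.0 / 9.0, 0.3, 1000.0
m = ortho_matrix(size, aspect, near, far)

# Assumption: view space looks down -z, so the near plane sits at z = -near.
near_corner = [aspect * size, size, -near, 1.0]
far_corner = [-aspect * size, -size, -far, 1.0]

near_out = transform(m, near_corner)   # expected close to (1, 1, -1, 1)
far_out = transform(m, far_corner)     # expected close to (-1, -1, 1, 1)
print(near_out)
print(far_out)
```

The w component stays 1 in both cases, which reflects that the orthographic projection needs no division step.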

Perspective projection

  1. Basic concepts:
    NearHeight: the height of the near clipping plane
    FarHeight: the height of the far clipping plane
    Near: the distance from the camera to the near clipping plane
    Far: the distance from the camera to the far clipping plane
    FOV: field of view


Why can't the view frustum be transformed directly into the standardized [-1, 1] cube, as in the orthographic case?
Why must it first be transformed into another frustum? Because a matrix is a linear transformation: it can turn a cuboid into a cube,
but it cannot turn a frustum into a cube. The side edges of the frustum are not parallel, and a linear transformation cannot make them parallel,
so converting a frustum into a cube requires a non-linear step that a 4×4 matrix alone cannot provide.
The solution is to store the depth of each point (its z before the transformation) in w, and then divide the transformed x, y, and z by w, which brings them into [-1, 1]. This division is non-linear, and the scaling value is not fixed:
for each plane parallel to xOy, the scaling value is that plane's distance from the camera before the transformation.
This is why, after the matrix is applied, x and y reach Near on the near clipping plane and Far on the far clipping plane.
The matrix part of this transformation can be divided into the following steps:

  • Scale the view frustum along the x-, y-, and z-axes
  • Translate the view frustum in the z direction
  • Flip the z-axis
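The division by w that follows these steps can be illustrated on its own. The sketch below assumes two hand-written clip-space points on the top edge of the frustum, one on the near plane and one on the far plane, where y' and w both equal the plane's distance from the camera; the values of Near and Far and the function name `perspective_divide` are illustrative.

```python
def perspective_divide(clip):
    """Divide x, y, z by w to go from clip space to the [-1, 1] cube."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


# Illustrative clipping distances (assumptions, not taken from the article).
near, far = 0.3, 1000.0

# After the matrix, a point on the top edge has y' equal to its plane's
# distance from the camera, and w stores that same distance.
top_on_near = (0.0, near, -near, near)
top_on_far = (0.0, far, far, far)

print(perspective_divide(top_on_near))   # (0.0, 1.0, -1.0)
print(perspective_divide(top_on_far))    # (0.0, 1.0, 1.0)

# The effective scale 1 / w differs per plane, so no single fixed
# scaling factor could do this job.
scale_near, scale_far = 1.0 / near, 1.0 / far
```

Both points end up at y = 1 even though they were scaled by very different amounts, which is the non-linear behaviour described above.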

$\tan\frac{FOV}{2} = \frac{NearHeight/2}{Near}$ ①
That is, $\tan\frac{FOV}{2}$ equals half the height of the near clipping plane divided by Near.
Aspect: the camera's aspect ratio, $Aspect = \frac{NearWidth}{NearHeight}$
First consider scaling.
For the y-axis, take point 2 in the figure: its y-coordinate goes from $\frac{NearHeight}{2}$ to Near,
so the original y is multiplied by $\frac{Near}{NearHeight/2}$ to obtain y':
$y \cdot \frac{Near}{NearHeight/2} = y'$
$y' = ky$, $k = \frac{Near}{NearHeight/2}$ ②
From ①: $\cot\frac{FOV}{2} = \frac{Near}{NearHeight/2}$ ③
From ② and ③: $k = \cot\frac{FOV}{2}$
How can $\cot\frac{FOV}{2}$ be understood intuitively?
It is simply the scaling factor of point 2's y-coordinate before and after the transformation,
which is exactly Near divided by half the height of the near clipping plane, that is, $\cot\frac{FOV}{2}$.
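The identity $k = \cot\frac{FOV}{2}$ can be confirmed with a quick calculation. This is a minimal sketch assuming an illustrative FOV of 60 degrees and Near of 0.3; the helper `near_height` is a hypothetical name that simply rearranges equation ①.

```python
import math


def near_height(fov_deg, near):
    """Height of the near clipping plane, from tan(FOV/2) = (NearHeight/2) / Near."""
    return 2.0 * near * math.tan(math.radians(fov_deg) / 2.0)


# Illustrative values (assumptions, not taken from the article).
fov_deg, near = 60.0, 0.3

h = near_height(fov_deg, near)

# Scaling factor that moves y = NearHeight / 2 to y' = Near.
k = near / (h / 2.0)

# The same factor written as cot(FOV / 2).
cot_half_fov = 1.0 / math.tan(math.radians(fov_deg) / 2.0)

print(k, cot_half_fov)
```

Changing Near leaves k unchanged, since Near cancels out; the factor depends only on the field of view.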

Because $NearWidth = Aspect \cdot NearHeight$, that is, $w = ah$,
$NearHeight = \frac{NearWidth}{Aspect}$, that is, $h = \frac{w}{a}$.
So the x-axis scaling equals the y-axis scaling divided by Aspect.
For the x-axis, $\frac{NearWidth}{2}$ goes to Near, so the original x is multiplied by $\frac{Near}{NearWidth/2}$.
As in the orthographic case, the result is $k_1 = \frac{k_2}{a}$ rather than $k_1 = a k_2$, because a is the ratio before the transformation, whereas after the transformation $x' = y'$.
On a landscape screen, where $a > 1$ and $x > y$, getting $x' = y'$ requires x to be scaled by less than y, so $k_1$ is $k_2$ multiplied by $\frac{1}{a}$.

There are two ways to derive the x-axis scaling.
Method 1:
Note that the ratio $a = \frac{x}{y}$ (so $x = ay$) holds before the transformation.
$x' = k_1 x$
$y' = k_2 y$
$x' = y' = Near$ (after the transformation, the x' and y' coordinates of point 2 are both Near)
$k_1 x = k_2 y$
$k_1 = k_2 \cdot \frac{y}{x}$
$a = \frac{x}{y}$
$k_1 = \frac{k_2}{a}$
$k_2 = \cot\frac{FOV}{2}$
$k_1 = \frac{\cot\frac{FOV}{2}}{Aspect}$

Method 2:
$x \cdot \frac{Near}{NearWidth/2} = x'$, so $k_1 = \frac{Near}{NearWidth/2}$
From ②: $\frac{Near}{h/2} = \frac{Near}{(w/a)/2} = \frac{Near}{w/(2a)}$
$\cot\frac{FOV}{2} = \frac{Near}{NearHeight/2}$
$\Rightarrow \cot\frac{FOV}{2} = \frac{Near}{w/(2a)}$
$\Rightarrow \frac{\cot\frac{FOV}{2}}{a} = \frac{Near}{w/2}$
$k_1 = \frac{Near}{w/2} = \frac{\cot\frac{FOV}{2}}{a}$
The scaling factors for x and y have now been obtained (the entries marked ? are still unknown):
$$\begin{bmatrix} \frac{\cot\frac{FOV}{2}}{Aspect} & 0 & 0 & 0 \\ 0 & \cot\frac{FOV}{2} & 0 & 0 \\ 0 & 0 & ? & ? \\ 0 & 0 & ? & 0 \end{bmatrix}$$
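Both derivations of the x-axis scaling should give the same number. The sketch below compares them, assuming illustrative values for FOV, Aspect, and Near that do not come from the article.

```python
import math

# Illustrative values (assumptions, not taken from the article).
fov_deg, aspect, near = 60.0, 16.0 / 9.0, 0.3

cot_half_fov = 1.0 / math.tan(math.radians(fov_deg) / 2.0)

# Near clipping plane dimensions: h from tan(FOV/2), then w = a * h.
h = 2.0 * near / cot_half_fov
w = aspect * h

# Method 1: k1 = k2 / a with k2 = cot(FOV / 2).
k1_method1 = cot_half_fov / aspect

# Method 2: k1 = Near / (NearWidth / 2).
k1_method2 = near / (w / 2.0)

# Either factor should move x = NearWidth / 2 to x' = Near.
x_new = k1_method1 * (w / 2.0)

print(k1_method1, k1_method2, x_new)
```

Method 1 works purely with ratios, while Method 2 goes through the actual width of the near clipping plane; they agree because $w = ah$.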

The following explains how to obtain the z-axis scaling.
In the z direction, the length of the view frustum is $Far - Near$ before the transformation and $Far + Near$ after it (the near plane ends up at $z' = -Near$ and the far plane at $z' = Far$, so that dividing by w gives -1 and 1),
so the magnitude of the scaling is $\frac{Far+Near}{Far-Near}$. The z-axis is also flipped, so $k_3 = -\frac{Far+Near}{Far-Near}$.
Next comes the z-axis translation, taking the flip into account. The translation is divided into two parts, both measured in the scaled space:
one part is the z of the transformed point 2, that is, $z'_2 = k_3 \cdot Near$, and the other part is Near.
$z' = k_3 z$, $k_3 = \frac{Far+Near}{Near-Far}$, so $Near \cdot \frac{Far+Near}{Near-Far} = \frac{Near \cdot Far + Near^2}{Near-Far}$
Then subtract Near (because of the flip of the z-axis):
$\frac{Near \cdot Far + Near^2}{Near-Far} - Near = \frac{Near \cdot Far + Near^2}{Near-Far} - \frac{Near^2 - Near \cdot Far}{Near-Far} = \frac{2 \cdot Near \cdot Far}{Near-Far}$
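The z scaling and translation can be checked against the two clipping planes. This is a sketch assuming the camera looks down the negative z-axis in view space (so the near plane is at z = -Near and the far plane at z = -Far) and assuming illustrative values for Near and Far; the function name `perspective_z_terms` is hypothetical.

```python
def perspective_z_terms(near, far):
    """Return (k3, t): the z scaling and z translation of the perspective matrix."""
    k3 = -(far + near) / (far - near)
    t = 2.0 * near * far / (near - far)   # same as -2*Near*Far / (Far - Near)
    return k3, t


# Illustrative clipping distances (assumptions, not taken from the article).
near, far = 0.3, 1000.0
k3, t = perspective_z_terms(near, far)

# Assumption: view space looks down -z.
z_near = k3 * (-near) + t   # should be close to -near
z_far = k3 * (-far) + t     # should be close to +far

# Transformed length of the frustum along z: Far + Near.
length_after = z_far - z_near

print(z_near, z_far, length_after)
```

Because w stores the distance from the camera, dividing z_near by Near and z_far by Far gives -1 and 1, the two faces of the standardized cube.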
At this point, the scaling and translation are complete.
To store the original depth in w, set the entry in row 4, column 3. Because the camera looks down the negative z-axis in view space (which is why the z-axis is flipped), the distance from the camera is $-z$, so this entry is -1.
The perspective projection matrix has now been derived:
$$\begin{bmatrix} \frac{\cot\frac{FOV}{2}}{Aspect} & 0 & 0 & 0 \\ 0 & \cot\frac{FOV}{2} & 0 & 0 \\ 0 & 0 & -\frac{Far+Near}{Far-Near} & -\frac{2 \cdot Near \cdot Far}{Far-Near} \\ 0 & 0 & -1 & 0 \end{bmatrix}$$
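To see the whole pipeline at work, the matrix can be applied to corners of the frustum and followed by the division by w. This is a sketch under stated assumptions: the camera looks down the negative z-axis in view space, so the sketch stores $-z$ in w (row 4, column 3 set to -1); the numeric values and the helper names `perspective_matrix` and `project` are illustrative.

```python
import math


def perspective_matrix(fov_deg, aspect, near, far):
    """4x4 perspective projection matrix as a list of rows."""
    cot = 1.0 / math.tan(math.radians(fov_deg) / 2.0)
    return [
        [cot / aspect, 0.0, 0.0, 0.0],
        [0.0, cot, 0.0, 0.0],
        [0.0, 0.0, -(far + near) / (far - near), -2.0 * near * far / (far - near)],
        [0.0, 0.0, -1.0, 0.0],   # assumption: w = -z (camera looks down -z)
    ]


def project(m, p):
    """Apply the matrix to a homogeneous point, then divide x, y, z by w."""
    clip = [sum(m[r][c] * p[c] for c in range(4)) for r in range(4)]
    w = clip[3]
    return [value / w for value in clip[:3]]


# Illustrative camera settings (assumptions, not taken from the article).
fov_deg, aspect, near, far = 60.0, 16.0 / 9.0, 0.3, 1000.0
m = perspective_matrix(fov_deg, aspect, near, far)

# Top-right corner of the near plane, and the matching corner of the far plane.
half_h = near * math.tan(math.radians(fov_deg) / 2.0)
near_corner = [aspect * half_h, half_h, -near, 1.0]
far_corner = [c * far / near for c in near_corner[:3]] + [1.0]

ndc_near = project(m, near_corner)   # expected close to (1, 1, -1)
ndc_far = project(m, far_corner)     # expected close to (1, 1, 1)
print(ndc_near)
print(ndc_far)
```

Both corners land on x = y = 1 after the division, with z at -1 for the near plane and 1 for the far plane, so the frustum fills the standardized cube.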


Origin blog.csdn.net/qq_58047420/article/details/134278171