OpenGL Shader Program Analysis--Perspective Projection

Reprinted from: https://blog.csdn.net/cordova/article/details/52599161

Background

It is finally time to improve how we display 3D graphics: we want to project objects in the 3D world onto a 2D plane while preserving a sense of depth. A classic example is a road stretching into the distance: in the 3D world its width is constant, but on the 2D screen it looks narrower and narrower until it converges to a point on the far horizon.

We now want to create a projection transformation that satisfies these requirements. We also want its output to land in a normalized box spanning -1 to 1 on each axis, which makes clipping simple: the clipper does not need to know the screen dimensions or the positions of the near and far planes, and can clip directly against the box.

The perspective transformation needs four parameters (a minimal sketch of the corresponding configuration struct follows this list):
1. Aspect ratio: the width-to-height ratio of the screen onto which we project;
2. Vertical field of view: the vertical angle through which the camera views the 3D world;
3. Near Z plane position: the near plane clips objects that are too close to the camera;
4. Far Z plane position: the far plane clips objects that are too far from the camera.
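
These four values are exactly what the demo program later stores in a PersProjInfo structure. As a rough sketch of its shape (the field names match how the demo fills it in, but the real declaration lives in the ogldev headers, so treat this layout as an assumption):

// Hypothetical sketch of the PersProjInfo configuration block; field names
// are taken from the demo code below, the actual declaration is in the
// ogldev headers.
struct PersProjInfo
{
    float FOV;    // vertical field of view, in degrees
    float Width;  // screen width, used for the aspect ratio
    float Height; // screen height, used for the aspect ratio
    float zNear;  // near Z plane position
    float zFar;   // far Z plane position
};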

The aspect ratio is a necessary parameter because we project everything into a normalized box of equal width and height, while the screen is usually wider than it is tall. Coordinates therefore have to be packed more densely along the box's X axis than along its Y axis: a wider slice of the scene is squeezed into the X range of the unit box, and stretching it back out across the wide screen lets us see the full image.

The vertical field of view lets us zoom in and out of the 3D world by adjusting it. Consider the example in the illustration below: the camera on the left uses a larger angle, so objects appear smaller on screen, while the camera on the right uses a smaller angle, so objects appear larger. This may look counterintuitive (the same object appearing larger and smaller), but it follows from the camera position: the left camera sits closer to the projection plane to achieve the larger angle, and the right camera sits farther away for the smaller angle. Keep in mind, however, that this distance has no real effect in the program, since the projected coordinates are mapped to the screen regardless of where the camera sits.

(Figure: two cameras with different vertical fields of view; the wider-angle camera sits closer to the projection plane, the narrower-angle camera farther away.)

Let's start with the distance from the camera to the projection plane. The projection plane is parallel to the XY plane. Obviously we cannot see all of it, because it is unbounded; we only see objects through a rectangular region (the projection window) whose proportions match those of our screen. The aspect ratio is:
ar = screen width / screen height

For simplicity we fix the height of the projection window at 2, which makes its width exactly twice the aspect ratio: 2*ar. If we place the camera at the origin and look at this region from behind the camera, we see the following layout:

(Figure: the projection window seen from behind the camera: a rectangle 2*ar wide and 2 high, centered on the origin.)

Anything outside this rectangle is clipped away. Points that land inside it already have a Y component in the range [-1, 1]; the X component currently spans the larger range [-ar, ar], and we will normalize it shortly.
Now we look at the YZ plane from the side:

(Figure: side view of the YZ plane, showing the camera at the origin and the projection plane at distance d.)

From the vertical field of view (the angle $\alpha$) we can derive the distance d from the camera to the projection plane, since the window's half-height is 1:

$$\tan\left(\frac{\alpha}{2}\right) = \frac{1}{d} \quad\Rightarrow\quad d = \frac{1}{\tan(\alpha/2)}$$
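
For example, with $\alpha = 90°$ the distance is $d = 1/\tan(45°) = 1$, while with $\alpha = 30°$ (the field of view used by the demo below) it is $d = 1/\tan(15°) \approx 3.73$: the narrower the field of view, the farther the projection plane sits from the camera, exactly as in the two-camera illustration above.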

The next step is to calculate the projected X and Y coordinates. Look at the figure below (still viewing the YZ plane).

(Figure: similar triangles formed in the YZ plane by the point (x, y, z), the camera at the origin, and the projection plane.)
Take a point in the 3D world with coordinates (x, y, z); we want the coordinates (xp, yp) of its projection onto the projection plane. In this view the X axis points out of the page, so we work with the Y component first. By similar triangles we get:

$$\frac{y_p}{d} = \frac{y}{z} \quad\Rightarrow\quad y_p = \frac{y \cdot d}{z} = \frac{y}{z \cdot \tan(\alpha/2)}$$

In the same way, the calculation on the X axis is:

$$x_p = \frac{x}{z \cdot \tan(\alpha/2)}$$

Since the projection window is 2*ar wide and 2 high, a point lands on the screen when its projected X lies between -ar and ar and its projected Y lies between -1 and 1. So the Y component is already normalized, but the X component is not. We can normalize xp by dividing it by the aspect ratio, so an original projected X of +ar becomes +1. For example, if the projected X is +0.5 and the aspect ratio is 1.333 (a 1024x768 screen), the normalized X becomes 0.375. In short, dividing by the aspect ratio squeezes more points into the X range of the unit box.
For X and Y coordinates we get the following equations:

$$x_p = \frac{x}{ar \cdot z \cdot \tan(\alpha/2)}, \qquad y_p = \frac{y}{z \cdot \tan(\alpha/2)}$$
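
As a quick sanity check (this snippet is my own, not part of the tutorial code), the two equations can be applied directly on the CPU; the values below reuse the demo's 30-degree field of view and 1024x768 screen:

#include <stdio.h>
#include <math.h>

int main()
{
    const float PI = 3.14159265f;
    const float ar = 1024.0f / 768.0f;                  // screen aspect ratio
    const float tanHalfFOV = tanf(30.0f / 2.0f * PI / 180.0f);

    const float x = 0.5f, y = 0.25f, z = 5.0f;          // a point in view space

    const float xp = x / (ar * z * tanHalfFOV);         // normalized projected X
    const float yp = y / (z * tanHalfFOV);              // projected Y, already in [-1, 1]

    printf("projected point: (%f, %f)\n", xp, yp);
    return 0;
}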

Before deriving the final result, let's consider what this projection matrix should roughly look like; that is, we want a matrix that represents the equations above. Here we hit a problem: in both equations we divide X or Y by Z, and Z is itself a component of the position vector. Its value changes from vertex to vertex, so it cannot be baked as a constant into a single matrix that transforms all vertices. To see this more clearly, consider the four components (a, b, c, d) of the matrix's top row. We need a set of values such that the following equation holds:

$$a \cdot x + b \cdot y + c \cdot z + d \cdot w = \frac{x}{ar \cdot \tan(\alpha/2) \cdot z}$$

The left side is the dot product of the first row with the vertex position vector, and the result becomes the X component of the transformed position. We can set 'b' and 'd' to 0, but no choice of 'a' and 'c' can produce the right-hand side, because of the division by z. OpenGL's solution is to split the transformation into two steps: first multiply by a projection matrix, then divide by the Z component separately. The projection matrix is supplied by the application and multiplied with the vertex in the shader; the separate divide-by-Z step is fixed-function on the GPU and happens in the rasterizer (somewhere between the vertex shader and the fragment shader). How does the GPU know which vertex-shader output needs to be divided by its Z value? Simple: it is the built-in gl_Position variable, so we don't need to worry about it. For now we only have to find a projection matrix that handles the X and Y components; after multiplying by it, the GPU automatically performs the Z division and we get the final result we want.
There is one more complication: if we multiply the vertex position by the matrix and then divide by Z, we lose the Z value itself, because every vertex's Z becomes 1. The original Z must be preserved because it is used for depth testing later. The trick is to copy the original Z into the W component of the result vector and divide XYZ by W instead of Z; W thus carries the original Z for the final depth test. This automatic division of gl_Position by its W component is called 'perspective division'. We can now build an intermediate matrix that implements the two equations above and stores the value of Z in the W component:

$$\begin{pmatrix} \frac{1}{ar \cdot \tan(\alpha/2)} & 0 & 0 & 0 \\ 0 & \frac{1}{\tan(\alpha/2)} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$
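
To see that this intermediate matrix really reproduces the two equations once the GPU divides by W, here is a small CPU-side sketch (my own illustration, not tutorial code; it reuses the demo's field of view and resolution):

#include <stdio.h>
#include <math.h>

struct Vec4 { float x, y, z, w; };

// multiply a column vector by a row-major 4x4 matrix
static Vec4 Mul(const float m[4][4], const Vec4& v)
{
    Vec4 r = { m[0][0]*v.x + m[0][1]*v.y + m[0][2]*v.z + m[0][3]*v.w,
               m[1][0]*v.x + m[1][1]*v.y + m[1][2]*v.z + m[1][3]*v.w,
               m[2][0]*v.x + m[2][1]*v.y + m[2][2]*v.z + m[2][3]*v.w,
               m[3][0]*v.x + m[3][1]*v.y + m[3][2]*v.z + m[3][3]*v.w };
    return r;
}

int main()
{
    const float PI = 3.14159265f;
    const float ar = 1024.0f / 768.0f;
    const float t  = tanf(30.0f / 2.0f * PI / 180.0f);

    // the intermediate matrix above: the X and Y rows scale, the Z row
    // is zero, and the W row copies the original z
    const float m[4][4] = { { 1.0f/(ar*t), 0.0f,   0.0f, 0.0f },
                            { 0.0f,        1.0f/t, 0.0f, 0.0f },
                            { 0.0f,        0.0f,   0.0f, 0.0f },
                            { 0.0f,        0.0f,   1.0f, 0.0f } };

    const Vec4 p = { 0.5f, 0.25f, 5.0f, 1.0f };
    const Vec4 v = Mul(m, p);
    // perspective division: what the GPU does to gl_Position
    printf("(%f, %f)\n", v.x / v.w, v.y / v.w); // matches x_p and y_p above
    return 0;
}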

As mentioned before, we also want to normalize the Z value so the clipper can perform depth clipping without knowing the NearZ and FarZ values. However, the matrix above turns Z into 0, and we know the system automatically performs perspective division after the transformation. So we must choose values for the third row of the matrix such that the division maps any Z inside the view range (i.e. NearZ <= Z <= FarZ) into [-1, 1]. This mapping consists of two parts: first scale the range [NearZ, FarZ] down to a range of width 2, then translate it so that it starts at -1, yielding [-1, 1]. Scaling and then translating Z can be expressed by the following general function:

$$f(z) = A \cdot z + B$$

But the subsequent perspective division turns the right-hand side into:

$$f(z) = A + \frac{B}{z}$$

Now we find the values of A and B that map the Z range onto [-1, 1]. In particular, we know the result must be -1 when Z equals NearZ and +1 when Z equals FarZ, so substituting these in and solving gives:

$$A + \frac{B}{NearZ} = -1, \qquad A + \frac{B}{FarZ} = +1$$

$$\Rightarrow\quad A = \frac{-NearZ - FarZ}{NearZ - FarZ}, \qquad B = \frac{2 \cdot FarZ \cdot NearZ}{NearZ - FarZ}$$

Now we need to choose the third row of the matrix as a vector (a, b, c, d) that satisfies the following equation:

$$a \cdot x + b \cdot y + c \cdot z + d \cdot w = A \cdot z + B$$

We can immediately set 'a' and 'b' to 0, because X and Y should have no effect on Z during the transformation. Then we take A as the value of 'c' and B as the value of 'd' (recall that W is known to be 1). Our final transformation matrix is:

$$\begin{pmatrix} \frac{1}{ar \cdot \tan(\alpha/2)} & 0 & 0 & 0 \\ 0 & \frac{1}{\tan(\alpha/2)} & 0 & 0 \\ 0 & 0 & \frac{-NearZ - FarZ}{NearZ - FarZ} & \frac{2 \cdot FarZ \cdot NearZ}{NearZ - FarZ} \\ 0 & 0 & 1 & 0 \end{pmatrix}$$

After the vertex position vector is multiplied by the projection matrix, the coordinates are in clip space; after perspective division they are in NDC space (Normalized Device Coordinates).
The overall rendering path of this tutorial series should now be quite clear. Previously, with no projection transformation, we simply output vertices from the vertex shader whose XYZ components were already in [-1, 1] to keep them on screen, and set W to 1 to neutralize the effect of perspective division, then mapped the coordinates to screen space and were done. With the projection matrix in place, perspective division becomes an integral part of projecting 3D onto the 2D plane.

Source Code Walkthrough

(1)


void Pipeline::InitPerspectiveProj(Matrix4f& m) const
{
    const float ar = m_persProj.Width / m_persProj.Height; // screen aspect ratio
    const float zNear = m_persProj.zNear;
    const float zFar = m_persProj.zFar;
    const float zRange = zNear - zFar;
    const float tanHalfFOV = tanf(ToRadian(m_persProj.FOV / 2.0f));

    // first row: scale X by 1 / (ar * tan(FOV/2))
    m.m[0][0] = 1.0f / (tanHalfFOV * ar);
    m.m[0][1] = 0.0f;
    m.m[0][2] = 0.0f;
    m.m[0][3] = 0.0f;

    // second row: scale Y by 1 / tan(FOV/2)
    m.m[1][0] = 0.0f;
    m.m[1][1] = 1.0f / tanHalfFOV;
    m.m[1][2] = 0.0f;
    m.m[1][3] = 0.0f;

    // third row: the A and B values that map [zNear, zFar] onto [-1, 1]
    m.m[2][0] = 0.0f;
    m.m[2][1] = 0.0f;
    m.m[2][2] = (-zNear - zFar) / zRange;
    m.m[2][3] = 2.0f * zFar * zNear / zRange;

    // fourth row: copy the original Z into W for the perspective division
    m.m[3][0] = 0.0f;
    m.m[3][1] = 0.0f;
    m.m[3][2] = 1.0f;
    m.m[3][3] = 0.0f;
}

A data structure called m_persProj is added to the pipeline class to hold the perspective transformation's configuration. The method above builds exactly the transformation matrix we derived.
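
As a quick check on the third and fourth rows (again my own sketch, not tutorial code, using the demo's zNear = 1 and zFar = 100), zNear should land at -1 and zFar at +1 after perspective division:

#include <stdio.h>

int main()
{
    const float zNear = 1.0f, zFar = 100.0f;
    const float zRange = zNear - zFar;

    const float A = (-zNear - zFar) / zRange;      // m.m[2][2]
    const float B = 2.0f * zFar * zNear / zRange;  // m.m[2][3]

    const float zs[2] = { zNear, zFar };
    for (int i = 0; i < 2; ++i) {
        const float clipZ = A * zs[i] + B; // third row dotted with (0, 0, z, 1)
        const float clipW = zs[i];         // fourth row copies z into w
        printf("z = %6.2f  ->  NDC z = %f\n", zs[i], clipZ / clipW);
    }
    return 0; // prints -1 for zNear and +1 for zFar
}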

(2)m_transformation = PersProjTrans * TranslationTrans * RotateTrans * ScaleTrans;

We place the projection matrix first (leftmost) in the product to complete the transformation. Note that because the position vector is multiplied on the right, the perspective projection is actually applied last: we scale first, then rotate, then translate, and finally project.

(3)p.SetPerspectiveProj(30.0f, WINDOW_WIDTH, WINDOW_HEIGHT, 1.0f, 1000.0f);

In the render function we set the projection parameters; adjust them to see different results. (The demo below passes them through the gPersProjInfo struct instead of individual arguments.)

Demo

#include <stdio.h>
#include <string.h>

#include <math.h>
#include <GL/glew.h>
#include <GL/freeglut.h>

#include "ogldev_util.h"
// pipeline class
#include "ogldev_pipeline.h"
// screen width and height macros
#define WINDOW_WIDTH 1024
#define WINDOW_HEIGHT 768

GLuint VBO;
GLuint IBO;
GLuint gWorldLocation;
// perspective projection configuration structure
PersProjInfo gPersProjInfo;

const char* pVSFileName = "shader.vs";
const char* pFSFileName = "shader.fs";


static void RenderSceneCB()
{
    glClear(GL_COLOR_BUFFER_BIT);

    static float Scale = 0.0f;

    Scale += 0.1f;

    Pipeline p;
    p.Rotate(0.0f, Scale, 0.0f);
    p.WorldPos(0.0f, 0.0f, 5.0f);
    // set the perspective projection parameters
    p.SetPerspectiveProj(gPersProjInfo);

    glUniformMatrix4fv(gWorldLocation, 1, GL_TRUE, (const GLfloat*)p.GetWPTrans());

    glEnableVertexAttribArray(0);
    glBindBuffer(GL_ARRAY_BUFFER, VBO);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, 0);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, IBO);

    glDrawElements(GL_TRIANGLES, 12, GL_UNSIGNED_INT, 0);

    glDisableVertexAttribArray(0);

    glutSwapBuffers();
}


static void InitializeGlutCallbacks()
{
    glutDisplayFunc(RenderSceneCB);
    glutIdleFunc(RenderSceneCB);
}

static void CreateVertexBuffer()
{
    Vector3f Vertices[4];
    Vertices[0] = Vector3f(-1.0f, -1.0f, 0.5773f);
    Vertices[1] = Vector3f(0.0f, -1.0f, -1.15475f);
    Vertices[2] = Vector3f(1.0f, -1.0f, 0.5773f);
    Vertices[3] = Vector3f(0.0f, 1.0f, 0.0f);

    glGenBuffers(1, &VBO);
    glBindBuffer(GL_ARRAY_BUFFER, VBO);
    glBufferData(GL_ARRAY_BUFFER, sizeof(Vertices), Vertices, GL_STATIC_DRAW);
}

static void CreateIndexBuffer()
{
    unsigned int Indices[] = { 0, 3, 1,
                               1, 3, 2,
                               2, 3, 0,
                               0, 1, 2 };

    glGenBuffers(1, &IBO);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, IBO);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(Indices), Indices, GL_STATIC_DRAW);
}

static void AddShader(GLuint ShaderProgram, const char* pShaderText, GLenum ShaderType)
{
    GLuint ShaderObj = glCreateShader(ShaderType);

    if (ShaderObj == 0) {
        fprintf(stderr, "Error creating shader type %d\n", ShaderType);
        exit(1);
    }

    const GLchar* p[1];
    p[0] = pShaderText;
    GLint Lengths[1];
    Lengths[0]= strlen(pShaderText);
    glShaderSource(ShaderObj, 1, p, Lengths);
    glCompileShader(ShaderObj);
    GLint success;
    glGetShaderiv(ShaderObj, GL_COMPILE_STATUS, &success);
    if (!success) {
        GLchar InfoLog[1024];
        glGetShaderInfoLog(ShaderObj, 1024, NULL, InfoLog);
        fprintf(stderr, "Error compiling shader type %d: '%s'\n", ShaderType, InfoLog);
        exit(1);
    }

    glAttachShader(ShaderProgram, ShaderObj);
}

static void CompileShaders()
{
    GLuint ShaderProgram = glCreateProgram();

    if (ShaderProgram == 0) {
        fprintf(stderr, "Error creating shader program\n");
        exit(1);
    }

    string vs, fs;

    if (!ReadFile(pVSFileName, vs)) {
        exit(1);
    };

    if (!ReadFile(pFSFileName, fs)) {
        exit(1);
    };

    AddShader(ShaderProgram, vs.c_str(), GL_VERTEX_SHADER);
    AddShader(ShaderProgram, fs.c_str(), GL_FRAGMENT_SHADER);

    GLint Success = 0;
    GLchar ErrorLog[1024] = { 0 };

    glLinkProgram(ShaderProgram);
    glGetProgramiv(ShaderProgram, GL_LINK_STATUS, &Success);
    if (Success == 0) {
        glGetProgramInfoLog(ShaderProgram, sizeof(ErrorLog), NULL, ErrorLog);
        fprintf(stderr, "Error linking shader program: '%s'\n", ErrorLog);
        exit(1);
    }

    glValidateProgram(ShaderProgram);
    glGetProgramiv(ShaderProgram, GL_VALIDATE_STATUS, &Success);
    if (!Success) {
        glGetProgramInfoLog(ShaderProgram, sizeof(ErrorLog), NULL, ErrorLog);
        fprintf(stderr, "Invalid shader program: '%s'\n", ErrorLog);
        exit(1);
    }

    glUseProgram(ShaderProgram);

    gWorldLocation = glGetUniformLocation(ShaderProgram, "gWorld");
    assert(gWorldLocation != 0xFFFFFFFF);
}

int main(int argc, char** argv)
{
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_DOUBLE|GLUT_RGBA);
    glutInitWindowSize(WINDOW_WIDTH, WINDOW_HEIGHT);
    glutInitWindowPosition(100, 100);
    glutCreateWindow("Tutorial 12");

    InitializeGlutCallbacks();

    // Must be done after glut is initialized!
    GLenum res = glewInit();
    if (res != GLEW_OK) {
      fprintf(stderr, "Error: '%s'\n", glewGetErrorString(res));
      return 1;
    }

    glClearColor(0.0f, 0.0f, 0.0f, 0.0f);

    CreateVertexBuffer();
    CreateIndexBuffer();

    CompileShaders();

    // initialize the perspective projection parameters
    gPersProjInfo.FOV = 30.0f;
    gPersProjInfo.Height = WINDOW_HEIGHT;
    gPersProjInfo.Width = WINDOW_WIDTH;
    gPersProjInfo.zNear = 1.0f;
    gPersProjInfo.zFar = 100.0f;

    glutMainLoop();

    return 0;
}

Running Result

The perspective projection gives the scene a convincing sense of depth:
(Screenshot: the rendered result.)

