[Learn the Metal Engine Step by Step, 1] "Draw the First Triangle"

Tutorial 1

Draw the first triangle

Tutorial source code download: https://github.com/jiangxh1992/MetalTutorialDemos

Full column on CSDN: https://blog.csdn.net/cordova/category_9734156.html

1. Key Concepts

  • Metal pipelines
  • Vertex buffer
  • Metal shader (vertex shader and fragment shader)
  • Vertex coordinates
  • Metal Shading Language (MSL)

2. About Metal

2.1 Metal Introduction

Metal, like DirectX, OpenGL, and Vulkan, is a GPU graphics API: it gives developers a graphics programming interface that talks to the hardware layer directly, invoking the GPU driver to execute rendering and compute commands. Apple introduced Metal to developers in 2014, tailored to its A-series GPUs to exploit the hardware's full performance, and this unique advantage let it replace the earlier OpenGL ES to become the only low-level graphics hardware interface on Apple platforms. Compared with Microsoft's DirectX and the cross-platform OpenGL, Metal is a new, improved graphics interface for its era, with clear advantages especially in gaming applications; its continually updated optimization and acceleration features power game engines, while its relatively low learning threshold makes it friendly to iOS developers.

2.2 Where Metal Sits in the Development Stack

Metal is a low-level application framework: it interacts directly with the hardware driver and provides support and services to the frameworks above it. The layers commonly built on top of Metal are graphics rendering libraries, including the Core Graphics library, the Core Animation library (QuartzCore), and Core Image. Above those sit the application-layer frameworks developers commonly use, for example UIKit and AppKit.


Take UI development as an example: iOS developers usually build their UI with UIKit, and UIKit's drawing relies on the underlying Core Graphics library, which in turn wraps Metal for its rendering. Once commands reach Metal, the Metal driver dispatches the GPU to do the work as requested, and the GPU's rendering result is finally drawn to the screen.

For games, developers can use Apple's official Metal-based engine libraries such as SceneKit and SpriteKit, or third-party game engines such as Unity, whose graphics back ends on Apple platforms now sit on Metal (previously OpenGL ES) and submit their draw commands to the GPU device through it. In other words, developers can use Metal indirectly through existing mature frameworks, or develop directly against the low-level graphics API that Metal officially provides.

3. The Programmable Rasterization Rendering Pipeline

The traditional rasterization pipeline is still the mainstream rendering pipeline, and it has evolved from a fixed-function pipeline to a programmable one. In a fixed-function pipeline the rendering process is hard-wired by the hardware vendor: it cannot be extended or changed, its functions are fixed, and developers can only adjust some parameters. Later, to increase flexibility, parts of the pipeline were opened up to developers, who can write shading-function code for the programmable stages to build rich graphical features and effects.

The programmable stages of the current pipeline include the vertex processing stage, the geometry processing stage, and the fragment processing stage.

The vertex processing stage runs the developer's vertex shader code on each vertex that passes through the pipeline. The vertex shader does not care about the topology of the primitives being rendered, and vertices cannot be discarded in this stage; each vertex passes through the vertex processor exactly once before the pipeline continues with the transformed vertices.

The next stage is the geometry processor stage. A geometry shader has access to a primitive's complete data, including all of its vertex data and the data of adjacent vertices. The key feature of this stage is that it can change the number of vertices, adding or removing them; a typical application is tessellation, where an algorithm sensibly inserts more vertices to make a model finer. The geometry stage is optional; developers normally write no code here and it can be skipped, in which case the vertex stage's output is simply passed on downstream unchanged.

After that comes the clipping stage and then the processing of the clipped fragments. Clipping is a fixed module of the pipeline that developers need not worry about: it automatically clips away primitive vertices outside the valid range, keeping the visible vertices and transforming them into screen space. The valid range is a normalized box space; vertices outside the box cannot appear on screen, whether they lie beyond the screen edges or too near or too far along the depth axis.

The rasterizer then renders the visible primitives to the screen. In the fragment shading stage the fragment shading function is executed for every pixel, and the developer computes each pixel's color in the fragment shader. (For the detailed process and principles of rasterization, see the article "Principles and implementation of rasterization in the graphics pipeline".)

In everyday development we mainly deal with the vertex shading stage and the fragment shading stage. The vertex shader chiefly performs the MVP transform on vertices, while the fragment shader chiefly does texture sampling, lighting calculations, and so on.

4. Metal Framework Structure

The API Metal provides is object-oriented and supports both Swift and Objective-C. Developers should understand and become familiar with some important object concepts in the framework, such as the command buffer, render command encoder, render pass descriptor, and render pipeline state, and how they are used.


  • MTLDevice: when developing graphics with Metal, the first step is to obtain the device context. The device is the device's GPU, and obtaining the device object is simple: just call MTLCreateSystemDefaultDevice().

  • MTLCommandQueue: the commandQueue is a command queue created from the device object; once created it exists for the entire application lifetime and is reused. The commandQueue organizes the commandBuffers that follow, submitting them in order for execution on the GPU. It is thread-safe and supports multiple commandBuffers being encoded and committed asynchronously, as part of concurrent GPU programming.

  • MTLCommandBuffer: a commandBuffer is a command buffer that holds encoded rendering commands and is committed to the GPU for execution. Before a commandBuffer is committed, it must be filled with commands encoded by the renderCommandEncoder object described next. A commandBuffer also supports multiple asynchronous renderCommandEncoders working concurrently.

  • MTLRenderCommandEncoder: a renderCommandEncoder encodes the commands for one render pass. Common operations include setting the render state (the renderPipelineState), setting the shaders' texture resources, and setting the data buffers for the vertex and fragment shaders.

  • MTLRenderPipelineState: a renderPipelineState object configures the render state of a render pass, including the shader functions. Creating an MTLRenderPipelineState, like creating the objects above, is relatively expensive, so these objects are usually created globally at the start and reused afterwards, avoiding frequent creation and destruction.

  • Finally, the Metal framework also contains Descriptor objects. For example, MTLRenderPipelineDescriptor describes and configures the parameters of an MTLRenderPipelineState and is used to create that object.
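Putting these objects together, the per-frame flow they imply can be sketched as follows (a schematic outline based on the descriptions above, not a complete program; the variable names are assumptions):

```objectivec
// Schematic per-frame flow of the objects described above (sketch only).
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
id<MTLRenderCommandEncoder> encoder =
    [commandBuffer renderCommandEncoderWithDescriptor:renderPassDescriptor];
[encoder setRenderPipelineState:pipelineState];   // reuse the pre-built pipeline state
// ... set vertex/fragment buffers and textures, issue draw calls ...
[encoder endEncoding];                            // close this render pass
[commandBuffer presentDrawable:drawable];         // show the result when rendering is done
[commandBuffer commit];                           // hand the buffer to the GPU
```

The demo analyzed below follows exactly this pattern.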

5. Demo Source Code Analysis: Drawing a Triangle with Metal

5.1 Setting Up the Metal Development Environment

Create an iOS Game project in Xcode with Metal as the game technology and you get Apple's official Metal game demo; once the demo runs, our Metal development environment is ready. Compared with graphics APIs such as OpenGL and DirectX, where all kinds of plugin libraries must be configured and the complicated setup is maddening, Metal's environment is highly integrated, which lowers the entry barrier a great deal and is very friendly to newcomers.

The official default demo draws a rotating cube, which is still too complex for this chapter: it already covers built-in model loading, texture mapping, uniform buffers, coordinate transforms, and other topics. This chapter's demo simplifies the original further and only draws a simple triangle, which we use to analyze and explain the most basic concepts of the Metal engine.

5.2 Source Code Analysis

5.2.1 GameViewController.m

_view = (MTKView *)self.view;

This code lives in an ordinary UIViewController. We use Metal to draw inside that controller's UIView, so we cast the UIView to MTKView from the MetalKit framework; MTKView is a subclass of UIView and is the view class that Metal can operate on.

_view.device = MTLCreateSystemDefaultDevice();

MTKView holds a reference to an MTLDevice. Here we obtain the device context through the MTLCreateSystemDefaultDevice() interface, for use in the subsequent rendering process.

// Initialize the renderer and make _view its render target
    _renderer = [[Renderer alloc] initWithMetalKitView:_view];
    // Forward the initial _view size to the renderer as a size-change event
    [_renderer mtkView:_view drawableSizeWillChange:_view.bounds.size];
    // Set the MTKView's delegate to _renderer, which handles the drawableSizeWillChange callback
    _view.delegate = _renderer;

Renderer is a custom renderer class in which our rendering code is concentrated. The MTKView is passed in at initialization because the rendering result is ultimately drawn onto it. Renderer implements the MTKView delegate callbacks, listening for viewport changes of the MTKView so it can adjust the rendered screen size; the MTKView is effectively the canvas.

5.2.2 Renderer.m

    view.depthStencilPixelFormat = MTLPixelFormatDepth32Float_Stencil8;
    view.colorPixelFormat = MTLPixelFormatBGRA8Unorm_sRGB;
    view.sampleCount = 1;

Here we first set the pixel formats of the view's default depth/stencil texture and color texture. The color texture is the framebuffer that caches one frame's rendering result, i.e. the color buffer data; the depth/stencil texture stores the depth buffer data and the stencil buffer data in different channels. (If the concepts of color buffer, depth buffer, and stencil buffer are unfamiliar, please look them up.)

sampleCount is the number of color samples per pixel. Normally each pixel is sampled only once, but in some situations, for example when implementing an anti-aliasing technique such as MSAA, the sample count may be set to 4 or more.

    id<MTLLibrary> defaultLibrary = [_device newDefaultLibrary];

    id <MTLFunction> vertexFunction = [defaultLibrary newFunctionWithName:@"vertexShader"];
    id <MTLFunction> fragmentFunction = [defaultLibrary newFunctionWithName:@"fragmentShader"];

MTLLibrary compiles and manages Metal shaders. It holds the Metal Shading Language source, and compiles the shader text either during the program's build or at run time. The shader code in a .metal file is really just text; after MTLLibrary compiles it, it becomes executable MTLFunction objects. The code above creates and compiles the vertex shader function and the fragment shader function. There are also kernel functions, i.e. compute shaders, used for general-purpose parallel computation on the GPU.

The device's newDefaultLibrary manages the .metal files in the Xcode project and recognizes the vertex, fragment, and kernel functions defined in them.
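Besides the precompiled default library, MSL source can also be compiled at run time with newLibraryWithSource:options:error:. A minimal sketch (the shader string and function name here are illustrative assumptions, not from this demo):

```objectivec
// Sketch: compiling MSL source text at run time (illustrative example).
NSString *src = @"#include <metal_stdlib>\n"
                @"using namespace metal;\n"
                @"kernel void fillOne(device float *out [[buffer(0)]],\n"
                @"                    uint i [[thread_position_in_grid]]) { out[i] = 1.0; }";
NSError *error = nil;
id<MTLLibrary> runtimeLibrary = [_device newLibraryWithSource:src options:nil error:&error];
id<MTLFunction> fillFunction = [runtimeLibrary newFunctionWithName:@"fillOne"];
```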

    _pipelineState = [_device newRenderPipelineStateWithDescriptor:pipelineStateDescriptor error:&error];

Here we create _pipelineState, the pipeline state object for drawing our triangle. Before creating it we need to define its descriptor, which configures some parameters of this render pass, most importantly the shader functions.
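The descriptor setup elided here might look like the following (a minimal sketch; the exact configuration is assumed, matching the view formats and shader functions set earlier):

```objectivec
// Sketch of the pipeline descriptor configuration this demo implies (assumed).
MTLRenderPipelineDescriptor *pipelineStateDescriptor = [[MTLRenderPipelineDescriptor alloc] init];
pipelineStateDescriptor.label = @"TrianglePipeline";
pipelineStateDescriptor.vertexFunction = vertexFunction;     // "vertexShader" from the library
pipelineStateDescriptor.fragmentFunction = fragmentFunction; // "fragmentShader" from the library
pipelineStateDescriptor.sampleCount = view.sampleCount;
pipelineStateDescriptor.colorAttachments[0].pixelFormat = view.colorPixelFormat;
pipelineStateDescriptor.depthAttachmentPixelFormat = view.depthStencilPixelFormat;
pipelineStateDescriptor.stencilAttachmentPixelFormat = view.depthStencilPixelFormat;

NSError *error = nil;
_pipelineState = [_device newRenderPipelineStateWithDescriptor:pipelineStateDescriptor
                                                         error:&error];
```

Note that the attachment pixel formats must match the formats configured on the view, otherwise pipeline creation or encoding will fail.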

    _depthState = [_device newDepthStencilStateWithDescriptor:depthStateDesc];

This defines a depth/stencil state object that configures the depth and stencil operations of the current render pass, for example whether to write to the depth buffer and which compare function the depth test uses.
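A typical descriptor for this state might be (a sketch; the exact settings are assumed):

```objectivec
// Sketch of a typical depth/stencil descriptor for this pass (assumed settings).
MTLDepthStencilDescriptor *depthStateDesc = [[MTLDepthStencilDescriptor alloc] init];
depthStateDesc.depthCompareFunction = MTLCompareFunctionLess; // keep fragments closer to the camera
depthStateDesc.depthWriteEnabled = YES;                       // write passing depths to the depth buffer
_depthState = [_device newDepthStencilStateWithDescriptor:depthStateDesc];
```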

    _commandQueue = [_device newCommandQueue];

Here the device context creates the globally unique command queue object. At this point we have created and initialized all the objects the rendering process requires.

    // Vertex buffer
    static const Vertex vert[] = {
        {{0,1.0}},
        {{1.0,-1.0}},
        {{-1.0,-1.0}}
    };
    vertexBuffer = [_device newBufferWithBytes:vert length:sizeof(vert) options:MTLResourceStorageModeShared];

With the rendering objects created, we now prepare the model data to render. Since we only need to draw a simple triangle, we directly create a vertex buffer and set the vertex coordinates. The normalized coordinate system has its origin at the center, with coordinates ranging within the unit box. The three vertex coordinates in the code correspond to the figure below:

[Figure: the three vertices plotted in the normalized coordinate box]

- (void)drawInMTKView:(nonnull MTKView *)view
{...}

drawInMTKView: is an MTKView delegate callback that runs before every frame; we write each frame's command code in this function.

    id <MTLCommandBuffer> commandBuffer = [_commandQueue commandBuffer];

Here we obtain a commandBuffer object from our commandQueue.

    MTLRenderPassDescriptor* renderPassDescriptor = view.currentRenderPassDescriptor;

MTLRenderPassDescriptor is a very important descriptor class: it sets the render targets of the current pass. Here we use the view's default configuration, which renders to a single default target. Some other rendering techniques, such as deferred rendering, use this descriptor to configure multiple render targets (MRT); we will not go into that here.

        id <MTLRenderCommandEncoder> renderEncoder =
        [commandBuffer renderCommandEncoderWithDescriptor:renderPassDescriptor];
        renderEncoder.label = @"MyRenderEncoder";

        [renderEncoder pushDebugGroup:@"DrawBox"];
        [renderEncoder setRenderPipelineState:_pipelineState];
        [renderEncoder setDepthStencilState:_depthState];
        [renderEncoder setVertexBuffer:vertexBuffer offset:0 atIndex:0];
        [renderEncoder drawPrimitives:MTLPrimitiveTypeTriangle vertexStart:0 vertexCount:3];
        [renderEncoder popDebugGroup];

        [renderEncoder endEncoding];

Here we use the view's default renderPassDescriptor to create the renderCommandEncoder that encodes our rendering commands. pushDebugGroup and popDebugGroup merely label a span of commands, which is convenient when inspecting a frame capture; they are not essential. With the renderCommandEncoder we set up the render pass: we set the pipeline state object and the depth/stencil state object, pass in our vertex buffer data, and finally issue one draw call to draw the triangle. [renderEncoder endEncoding] marks the end of the current render pass's commands.

        [commandBuffer presentDrawable:view.currentDrawable];

This line schedules the view's current drawable to be presented once the command buffer finishes executing, so the rendering result is drawn onto the view.

    [commandBuffer commit];

Finally we commit the commandBuffer to the commandQueue, where it waits to be executed by the GPU.

5.2.3 Shaders.metal

Shaders.metal is the shader file, where we write Metal Shading Language (MSL) code. Here we do no substantive work: the vertex shader simply passes the vertex coordinates on to the next pipeline stage, and the fragment shader returns a uniform default red, producing a red triangle.

typedef struct
{
    float4 position [[position]];
    float2 texCoord;
} ColorInOut;

vertex ColorInOut vertexShader(constant Vertex *vertexArr [[buffer(0)]],
                               uint vid [[vertex_id]])
{
    ColorInOut out;

    float4 position = vector_float4(vertexArr[vid].pos, 0 , 1.0);
    out.position = position;

    return out;
}

fragment float4 fragmentShader(ColorInOut in [[stage_in]])
{
    return float4(1.0, 0, 0, 0); // red (the alpha component is unused since blending is off)
}

The ColorInOut struct defines the data passed from the vertex shader to the next stage. [[position]] is MSL's attribute-binding syntax, written as an attribute keyword inside double square brackets; it marks the field as the vertex position data that the vertex shader passes on to the next stage.

The parameter passed into the vertex shader function is the vertex buffer we supplied, an array containing all the vertex data. Through the [[vertex_id]] semantic we obtain the id of the current vertex, i.e. its index in the vertex buffer.

Vertex is a data structure we defined ourselves:

typedef struct
{
    vector_float2 pos;
} Vertex;

This struct corresponds to the data in the vertex buffer. Here it contains only the coordinate data, but the struct can be extended later, because a vertex buffer may also contain normals, uv texture coordinates, tangents, and other data.
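For illustration, an extended version might look like this (a hypothetical sketch, not part of this demo's source):

```objectivec
// Hypothetical extension of the Vertex struct with attributes a fuller
// pipeline might carry (not in this demo).
typedef struct
{
    vector_float2 pos;      // position, extended to float4 in the shader
    vector_float3 normal;   // surface normal, for lighting
    vector_float2 texCoord; // uv texture coordinates, for sampling
    vector_float3 tangent;  // tangent, e.g. for normal mapping
} Vertex;
```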

In the vertex shader we process the vertex in a simple way: the float2 is extended to a float4, with the z coordinate set to 0 and the fourth component, w, set to 1.0. Our fragment shading function simply returns the red color value (1.0, 0, 0, 0).

6. Results

Below is the result in landscape orientation. Observing the portrait result as well helps in understanding how vertex coordinates in the unit box space are mapped to screen space.

Note that the code needs to run on a real device, because Metal does not support running in the simulator.

[Figure: the red triangle rendered on the device]


Origin blog.csdn.net/cordova/article/details/104546299