[Unity 3D] Graphics Performance Optimization: Draw Call Batching

This is a hand translation; please point out any mistakes.

Official Unity documentation: https://docs.unity3d.com/Manual/DrawCallBatching.html

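Draw call batching (translation)

To draw an object on the screen, the engine has to send a draw call to the graphics API (such as OpenGL or Direct3D). Draw calls are often expensive: the graphics API does significant work for each one, which costs CPU time. Most of that cost comes from state changes between draw calls (such as switching to a different Material), which force the graphics driver to perform expensive validation and translation.

Unity uses two techniques to address this:

  • Dynamic batching: for small enough Meshes, transforms their vertices on the CPU, groups similar vertices together, and draws them all in one go.
  • Static batching: combines static (non-moving) GameObjects into large Meshes and renders them in a faster way.

Built-in batching has several advantages over manually merging GameObjects; most notably, GameObjects can still be culled individually. It also has downsides: static batching incurs memory and storage overhead, and dynamic batching incurs some CPU overhead.

Note: dynamic batching is not compatible with graphics jobs (see Player Settings). In Standalone builds, dynamic batching is disabled if graphics jobs are enabled.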

Draw call batching

To draw a GameObject on the screen, the engine has to issue a draw call to the graphics API (such as OpenGL or Direct3D). Draw calls are often resource-intensive, with the graphics API doing significant work for every draw call, causing performance overhead on the CPU side. This is mostly caused by the state changes done between the draw calls (such as switching to a different Material), which causes resource-intensive validation and translation steps in the graphics driver.

Unity uses two techniques to address this:

  • Dynamic batching: for small enough Meshes, this transforms their vertices on the CPU, groups many similar vertices together, and draws them all in one go.
  • Static batching: combines static (not moving) GameObjects into big Meshes, and renders them in a faster way.

Built-in batching has several benefits compared to manually merging GameObjects together; most notably, GameObjects can still be culled individually. However, it also has some downsides; static batching incurs memory and storage overhead, and dynamic batching incurs some CPU overhead.

Note: Dynamic batching is not compatible with graphics jobs (see Player Settings). If graphics jobs are enabled, dynamic batching is disabled in Standalone builds.

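Material set-up for batching (translation)

Only GameObjects that use the same Material can be batched together. So for good batching, share the same Material across as many GameObjects as possible.

If two Materials are identical except for their Textures, you can combine those Textures into one large Texture. This process is usually called Texture atlasing. Once both Textures are in the same atlas, a single Material is enough.

If you need to access shared Material properties from a script, note that modifying Renderer.material creates a copy of the Material. Use Renderer.sharedMaterial instead to keep the Material shared.

Shadow casters can often be batched together while rendering, even if their Materials differ. In Unity, shadow casters with different Materials can still be batched as long as the values the shadow pass needs are the same in each Material. For example, many crates can use Materials with different Textures, but Textures are irrelevant to shadow caster rendering, so in that case the crates can be batched together.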

Material set-up for batching

Only GameObjects sharing the same Material can be batched together. Therefore, if you want to achieve good batching, you should aim to share Materials among as many different GameObjects as possible.

If you have two identical Materials which differ only in Texture, you can combine those Textures into a single big Texture. This process is often called Texture atlasing (see the Wikipedia page on Texture atlases for more information). Once Textures are in the same atlas, you can use a single Material instead.

If you need to access shared Material properties from the scripts, then it is important to note that modifying Renderer.material creates a copy of the Material. Instead, use Renderer.sharedMaterial to keep Materials shared.

Shadow casters can often be batched together while rendering, even if their Materials are different. Shadow casters in Unity can use dynamic batching even with different Materials, as long as the values in the Materials needed by the shadow pass are the same. For example, many crates could use Materials with different Textures on them, but for the shadow caster rendering the textures are not relevant, so in this case they can be batched together.
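
The Texture atlasing idea above comes down to simple UV arithmetic: each Mesh's original UVs in the 0..1 range are squeezed into the sub-rectangle its Texture occupies in the atlas. Below is a minimal Python sketch of that remapping. The tile layout (two textures side by side) and the names `atlas_uv`, `CRATE_TILE` and `BARREL_TILE` are made up for illustration; this is not a Unity API.

```python
def atlas_uv(uv, tile_offset, tile_scale):
    """Map a UV in the 0..1 range into a sub-rectangle (tile) of an atlas."""
    u, v = uv
    off_u, off_v = tile_offset
    scale_u, scale_v = tile_scale
    return (off_u + u * scale_u, off_v + v * scale_v)


# Hypothetical layout: two equally sized textures side by side in one atlas.
CRATE_TILE = ((0.0, 0.0), (0.5, 1.0))   # (offset, scale) for the left half
BARREL_TILE = ((0.5, 0.0), (0.5, 1.0))  # (offset, scale) for the right half

# UVs of a simple quad that originally covered its whole texture.
quad_uvs = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

crate_uvs = [atlas_uv(uv, *CRATE_TILE) for uv in quad_uvs]
barrel_uvs = [atlas_uv(uv, *BARREL_TILE) for uv in quad_uvs]

print(crate_uvs)
print(barrel_uvs)
```

After this remapping both quads sample from the same atlas Texture, so they no longer need separate Materials.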

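Dynamic batching (translation)

Unity can automatically batch moving GameObjects into the same draw call, provided they share the same Material and meet the other criteria. Dynamic batching happens automatically and needs no extra work from you.

  • Batching dynamic GameObjects has a per-vertex cost, so dynamic batching only applies to Meshes with fewer than 900 vertex attributes in total.
    • If your Shader uses Vertex Position, Normal and a single UV, you can batch up to 300 vertices; if it uses Vertex Position, Normal, UV0, UV1 and Tangent, only 180 vertices.
    • Note: the attribute count limit may change in the future.
  • GameObjects whose transforms contain mirroring are not batched (for example, GameObject A with a scale of +1 and GameObject B with a scale of -1 cannot be batched together).
  • Using different Material instances prevents GameObjects from being batched together, even if the Materials are essentially the same. Shadow caster rendering is the exception.
  • GameObjects with lightmaps have additional renderer parameters: the lightmap index and the offset/scale into the lightmap. In general, dynamic lightmapped GameObjects must point to exactly the same lightmap location to be batched.
  • Multi-pass Shaders break batching.
    • Almost all Unity Shaders support several Lights in forward rendering, which effectively adds extra passes for them. The draw calls for "additional per-pixel lights" are not batched.
    • The Legacy Deferred (light pre-pass) rendering path has dynamic batching disabled, because it has to draw GameObjects twice.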

Dynamic batching

Unity can automatically batch moving GameObjects into the same draw call if they share the same Material and fulfill other criteria. Dynamic batching is done automatically and does not require any additional effort on your side.

  • Batching dynamic GameObjects has certain overhead per vertex, so batching is applied only to Meshes containing fewer than 900 vertex attributes in total.
    • If your Shader is using Vertex Position, Normal and single UV, then you can batch up to 300 verts, while if your Shader is using Vertex Position, Normal, UV0, UV1 and Tangent, then only 180 verts.
    • Note: attribute count limit might be changed in future.
  • GameObjects are not batched if they contain mirroring on the transform (for example GameObject A with +1 scale and GameObject B with –1 scale cannot be batched together).
  • Using different Material instances causes GameObjects not to batch together, even if they are essentially the same. The exception is shadow caster rendering.
  • GameObjects with lightmaps have additional renderer parameters: lightmap index and offset/scale into the lightmap. Generally, dynamic lightmapped GameObjects should point to exactly the same lightmap location to be batched.
  • Multi-pass Shaders break batching.
    • Almost all Unity Shaders support several Lights in forward rendering, effectively doing additional passes for them. The draw calls for “additional per-pixel lights” are not batched.
    • The Legacy Deferred (light pre-pass) rendering path has dynamic batching disabled, because it has to draw GameObjects twice.
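
The 300 and 180 vertex figures above follow from dividing the 900 attribute budget by the number of attributes each vertex carries. The Python sketch below only reproduces that arithmetic; the function name and the attribute lists are illustrative, and the budget itself may differ between Unity versions, as the note above says.

```python
ATTRIBUTE_BUDGET = 900  # per-Mesh figure quoted in the text; may change in future


def max_batchable_vertices(attributes_per_vertex, budget=ATTRIBUTE_BUDGET):
    """Largest vertex count whose total attribute count fits the budget."""
    if attributes_per_vertex <= 0:
        raise ValueError("a vertex needs at least one attribute")
    return budget // attributes_per_vertex


simple_shader = ["position", "normal", "uv0"]
rich_shader = ["position", "normal", "uv0", "uv1", "tangent"]

print(max_batchable_vertices(len(simple_shader)))  # 300
print(max_batchable_vertices(len(rich_shader)))    # 180
```

Every extra attribute a Shader reads lowers the vertex count at which a Mesh stops qualifying for dynamic batching.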

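Static batching (translation)

Static batching lets the engine reduce draw calls for geometry of any size, provided it shares the same Material and does not move. It is often more efficient than dynamic batching (it does not transform vertices on the CPU), but it uses more memory.

To take advantage of static batching, you must explicitly mark GameObjects as static, and they must not move, rotate or scale during the game. To do so, tick the Static checkbox in the GameObject's Inspector:

Static batching needs additional memory to store the combined geometry. If several GameObjects shared the same geometry before static batching, a copy of that geometry is created for each GameObject, either in the Editor or at runtime. This is not always a good idea; sometimes you have to sacrifice rendering performance and skip static batching for some GameObjects to keep the memory footprint small. For example, marking every tree as static in a dense forest can have a serious memory impact.

Internally, static batching transforms the static GameObjects into world space and builds one large vertex and index buffer for them. Then, for visible GameObjects in the same batch, a series of simple draw calls is issued with almost no state changes in between. Technically, it does not save 3D API draw calls, but it saves the state changes between them (the expensive part). On most platforms, batches are limited to 64k vertices and 64k indices (48k indices on OpenGLES, 32k indices on macOS).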

Static batching

Static batching allows the engine to reduce draw calls for geometry of any size provided it shares the same material, and does not move. It is often more efficient than dynamic batching (it does not transform vertices on the CPU), but it uses more memory.

In order to take advantage of static batching, you need to explicitly specify that certain GameObjects are static and do not move, rotate or scale in the game. To do so, mark GameObjects as static using the Static checkbox in the Inspector:

Using static batching requires additional memory for storing the combined geometry. If several GameObjects shared the same geometry before static batching, then a copy of geometry is created for each GameObject, either in the Editor or at runtime. This might not always be a good idea; sometimes you have to sacrifice rendering performance by avoiding static batching for some GameObjects to keep a smaller memory footprint. For example, marking trees as static in a dense forest level can have serious memory impact.

Internally, static batching works by transforming the static GameObjects into world space and building a big vertex and index buffer for them. Then, for visible GameObjects in the same batch, a series of simple draw calls are done, with almost no state changes in between. Technically it does not save 3D API draw calls, but it saves on state changes between them (which is the resource-intensive part). Batches are limited to 64k vertices and 64k indices on most platforms (48k indices on OpenGLES, 32k indices on macOS).
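
To make the "one big vertex and index buffer" description more concrete, here is a simplified Python model of the combining step. It is only a sketch under stated assumptions: a plain translation stands in for the full object-to-world transform, and `combine_static` and its return format are invented for illustration, not taken from Unity.

```python
def combine_static(meshes):
    """Merge meshes into one vertex buffer and one index buffer.

    meshes: list of (local_vertices, local_indices, world_offset).
    Returns (vertices, indices, ranges), where ranges[i] is the
    (first_index, index_count) slice belonging to object i.
    """
    vertices, indices, ranges = [], [], []
    for local_verts, local_indices, (ox, oy, oz) in meshes:
        base = len(vertices)   # where this object's vertices start
        first = len(indices)   # where this object's indices start
        # Bake the object's placement into the vertices (world space).
        vertices.extend((x + ox, y + oy, z + oz) for x, y, z in local_verts)
        # Indices must be shifted to point at the relocated vertices.
        indices.extend(base + i for i in local_indices)
        ranges.append((first, len(local_indices)))
    return vertices, indices, ranges


# Two objects sharing the same triangle geometry at different positions.
tri_verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tri_indices = [0, 1, 2]
scene = [
    (tri_verts, tri_indices, (0.0, 0.0, 0.0)),
    (tri_verts, tri_indices, (5.0, 0.0, 0.0)),
]

vertices, indices, ranges = combine_static(scene)
print(len(vertices), indices, ranges)
```

The model shows both properties the text mentions: the shared triangle is stored twice in the combined buffer (the memory cost), and each object keeps its own index range, so visible objects can still be drawn one by one without changing state.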

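Tips (translation)

Currently, only Mesh Renderers, Trail Renderers, Line Renderers, Particle Systems and Sprite Renderers are batched. This means that skinned Meshes, Cloth and other types of rendering components are not batched.

Renderers only ever batch with other Renderers of the same type.

Semi-transparent Shaders usually require GameObjects to be rendered in back-to-front order for transparency to work. Unity first sorts GameObjects in that order and then tries to batch them, but because the order must be strictly respected, less batching is usually possible than with opaque GameObjects.

Manually combining GameObjects that sit close together can be a good alternative to draw call batching. For example, a static cupboard with many drawers can usually be combined into a single Mesh, either in a 3D modeling application or with Mesh.CombineMeshes.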

Tips

Currently, only Mesh Renderers, Trail Renderers, Line Renderers, Particle Systems and Sprite Renderers are batched. This means that skinned Meshes, Cloth, and other types of rendering components are not batched.

Renderers only ever batch with other Renderers of the same type.

Semi-transparent Shaders usually require GameObjects to be rendered in back-to-front order for transparency to work. Unity first orders GameObjects in this order, and then tries to batch them, but because the order must be strictly satisfied, this often means less batching can be achieved than with opaque GameObjects.
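
A toy Python model can show why a strict back-to-front order reduces batching. It assumes a batch is simply a run of consecutive objects with the same Material, and that opaque objects may be freely reordered by Material; Unity's real sorting is more involved, and the object data here is made up.

```python
from itertools import groupby


def count_batches(draw_order):
    """Count runs of consecutive objects that share a material."""
    return sum(1 for _ in groupby(draw_order, key=lambda obj: obj["material"]))


# Hypothetical objects; depth is the distance from the camera.
objects = [
    {"name": "a", "material": "glass", "depth": 9.0},
    {"name": "b", "material": "smoke", "depth": 7.0},
    {"name": "c", "material": "glass", "depth": 5.0},
    {"name": "d", "material": "smoke", "depth": 3.0},
]

# Transparent case: the order is dictated by depth, farthest first.
back_to_front = sorted(objects, key=lambda o: o["depth"], reverse=True)
# Opaque case (assumption of this model): free to group by material.
by_material = sorted(objects, key=lambda o: o["material"])

transparent_batches = count_batches(back_to_front)
opaque_batches = count_batches(by_material)
print(transparent_batches, opaque_batches)  # 4 2
```

With the depth order forced, the two Materials alternate and every object ends up in its own batch, while grouping by Material needs only two.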

Manually combining GameObjects that are close to each other can be a very good alternative to draw call batching. For example, a static cupboard with lots of drawers often makes sense to just combine into a single Mesh, either in a 3D modeling application or using Mesh.CombineMeshes.


  • 2017–10–26 Page amended with limited editorial review

  • Added note on dynamic batching being incompatible with graphics jobs in 2017.2


Reposted from blog.csdn.net/iFasWind/article/details/81200829