Optimizing graphics rendering in Unity games

Introduction

In this article we will learn what happens behind the scenes when Unity renders a frame, what kinds of performance problems can occur during rendering, and how to fix them.
Before we read this article, it is vital to understand that there is no one-size-fits-all approach to improving rendering performance. Rendering performance is affected by many factors within our game and is also highly dependent on the hardware and operating system that our game runs on. The most important thing to remember is that we solve performance problems by investigating, experimenting and rigorously profiling the results of our experiments.
This article contains information on the most common rendering performance problems, with suggestions on how to fix them and links to further reading. It’s possible that our game could have a problem, or a combination of problems, not covered here. Even so, this article will still help us to understand our problem and give us the knowledge and vocabulary to search effectively for a solution.

A brief introduction to rendering

Before we begin, let’s take a quick and somewhat simplified look at what happens when Unity renders a frame. Understanding the flow of events and the correct terms for things will help us to understand, research and work towards fixing our performance problems.
NB: Throughout this article, we will use the term “object” to mean an object that may be rendered in our game. Any GameObject with a Renderer component will be referred to as an object.
At the most basic level, rendering can be described as follows:

  • The central processing unit, known as the CPU, works out what must be drawn and how it must be drawn.
  • The CPU sends instructions to the graphics processing unit, known as the GPU.
  • The GPU draws things according to the CPU’s instructions.

Now let’s take a closer look at what happens. We’ll cover each of these steps in greater detail later in the article, but for now let’s just familiarise ourselves with the words used and understand the different roles that the CPU and GPU play in rendering.
The phrase often used to describe rendering is the rendering pipeline, and this is a useful image to bear in mind; efficient rendering is all about keeping information flowing.

For every frame that is rendered, the CPU does the following work:

  • The CPU checks every object in the scene to determine whether it should be rendered. An object is only rendered if it meets certain criteria; for example, some part of its bounding box must be within a camera’s view frustum. Objects that will not be rendered are said to be culled. For more information on the frustum and frustum culling please see this page.

  • The CPU gathers information about every object that will be rendered and sorts this data into commands known as draw calls. A draw call contains data about a single mesh and how that mesh should be rendered; for example, which textures should be used. Under certain circumstances, objects that share settings may be combined into the same draw call. Combining data for different objects into the same draw call is known as batching.

  • The CPU creates a packet of data called a batch for each draw call. Batches may sometimes contain data other than draw calls, but these situations are unlikely to contribute to common performance issues and we therefore won’t consider these in this article.
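The culling check in the first step above can be sketched as a toy model. This is Python rather than Unity code, and everything in it is an assumption made for illustration: the `Plane` and `Bounds` types, the hypothetical `simple_frustum` helper (a camera at the origin looking down +z with a 90-degree field of view), and the conservative "box fully outside one plane" test. It is not a description of how Unity implements culling internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    normal: tuple  # (x, y, z), pointing towards the inside of the frustum
    d: float       # a point p is on the inside when dot(normal, p) + d >= 0


@dataclass(frozen=True)
class Bounds:
    lo: tuple  # minimum corner of the axis-aligned bounding box
    hi: tuple  # maximum corner


def simple_frustum(near, far):
    """Hypothetical camera at the origin, looking down +z, 90-degree field of view."""
    return [
        Plane((0, 0, 1), -near),  # near:   z >= near
        Plane((0, 0, -1), far),   # far:    z <= far
        Plane((1, 0, 1), 0),      # left:   x >= -z
        Plane((-1, 0, 1), 0),     # right:  x <= z
        Plane((0, 1, 1), 0),      # bottom: y >= -z
        Plane((0, -1, 1), 0),     # top:    y <= z
    ]


def is_culled(bounds, planes):
    """Return True when the box lies entirely outside at least one plane."""
    for plane in planes:
        # Take the corner of the box that reaches furthest along the plane normal.
        corner = tuple(
            bounds.hi[i] if plane.normal[i] >= 0 else bounds.lo[i]
            for i in range(3)
        )
        distance = sum(plane.normal[i] * corner[i] for i in range(3)) + plane.d
        if distance < 0:
            return True  # even the most favourable corner is outside
    return False


frustum = simple_frustum(near=0.3, far=100.0)
in_front = Bounds((-1, -1, 9), (1, 1, 11))
behind_camera = Bounds((-1, -1, -11), (1, 1, -9))
visible = [box for box in (in_front, behind_camera) if not is_culled(box, frustum)]
```

A box that only partly overlaps the frustum is kept, which matches the rule that some part of the bounding box must be within the view frustum. The test is conservative: it can occasionally keep a box that is just outside a frustum corner, but it never discards a visible one.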

For every batch that contains a draw call, the CPU must now do the following:

  • The CPU may send a command to the GPU to change a number of variables known collectively as the render state. This command is known as a SetPass call. A SetPass call tells the GPU which settings to use to render the next mesh. A SetPass call is sent only if the next mesh to be rendered requires a change in render state from the previous mesh.

  • The CPU sends the draw call to the GPU. The draw call instructs the GPU to render the specified mesh using the settings defined in the most recent SetPass call.

  • Under certain circumstances, more than one pass may be required for the batch. A pass is a section of shader code and a new pass requires a change to the render state. For each pass in the batch, the CPU must send a new SetPass call and then must send the draw call again.

Meanwhile, the GPU does the following work:

  • The GPU handles tasks from the CPU in the order that they were sent.

  • If the current task is a SetPass call, the GPU updates the render state.

  • If the current task is a draw call, the GPU renders the mesh. This happens in stages, defined by separate sections of shader code. This part of rendering is complex and we won’t cover it in great detail, but it’s useful for us to understand that a section of code called the vertex shader tells the GPU how to process the mesh’s vertices and then a section of code called the fragment shader tells the GPU how to draw the individual pixels.

  • This process repeats until all tasks sent from the CPU have been processed by the GPU.
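The relationship between draw calls and SetPass calls described above can be illustrated with a small Python model. It is a deliberate simplification and not Unity code: `DrawCall`, `build_command_stream` and `count_set_pass` are hypothetical names, the render state is reduced to a single label, and multi-pass shaders are ignored. The only rule it encodes is the one stated above, that a SetPass command is issued only when the next mesh needs a different render state from the previous one.

```python
from collections import namedtuple

# A draw call reduced to the two things this model cares about.
DrawCall = namedtuple("DrawCall", ["mesh", "render_state"])


def build_command_stream(draw_calls):
    """Emit the commands the CPU would send, in order."""
    commands = []
    current_state = None
    for call in draw_calls:
        if call.render_state != current_state:
            # Render state changes, so a SetPass command is needed first.
            commands.append(("SetPass", call.render_state))
            current_state = call.render_state
        commands.append(("Draw", call.mesh))
    return commands


def count_set_pass(commands):
    return sum(1 for kind, _ in commands if kind == "SetPass")


calls = [
    DrawCall("rock_a", "stone"),
    DrawCall("tree_a", "bark"),
    DrawCall("rock_b", "stone"),
    DrawCall("tree_b", "bark"),
]
interleaved = count_set_pass(build_command_stream(calls))
grouped = count_set_pass(
    build_command_stream(sorted(calls, key=lambda c: c.render_state))
)
```

In this model the interleaved order needs a SetPass command before every draw, while grouping the same four draws by render state needs only two. The number of draw commands is the same in both cases; only the number of state changes differs.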

Now that we understand what’s happening when Unity renders a frame, let’s consider the sort of problems that can occur when rendering.

Types of rendering problems

The most important thing to understand about rendering is this: both the CPU and the GPU must finish all of their tasks in order to render the frame. If any one of these tasks takes too long to complete, it will cause a delay to the rendering of the frame.
Rendering problems have two fundamental causes. The first type of problem is caused by an inefficient pipeline. An inefficient pipeline occurs when one or more of the steps in the rendering pipeline takes too long to complete, interrupting the smooth flow of data. Inefficiencies within the pipeline are known as bottlenecks. The second type of problem is caused by simply trying to push too much data through the pipeline. Even the most efficient pipeline has a limit to how much data it can handle in a single frame.
When our game takes too long to render a frame because the CPU takes too long to perform its rendering tasks, our game is what is known as CPU bound. When our game takes too long to render a frame because the GPU takes too long to perform its rendering tasks, our game is what is known as GPU bound.
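The arithmetic behind "takes too long" can be made concrete with a short Python sketch. It rests on assumptions that are not from this article: a target frame rate chosen by us, CPU and GPU timings that we have already measured with a profiler (the numbers below are made up), and the simplification that whichever processor takes longer determines the frame time. The function names are hypothetical.

```python
def frame_budget_ms(target_fps):
    """Time available for one frame at the given frame rate, in milliseconds."""
    return 1000.0 / target_fps


def classify_frame(cpu_ms, gpu_ms, target_fps):
    """Label a frame using measured CPU and GPU rendering times."""
    budget = frame_budget_ms(target_fps)
    if max(cpu_ms, gpu_ms) <= budget:
        return "within budget"
    # Over budget: the slower processor is the one holding the frame back.
    return "CPU bound" if cpu_ms >= gpu_ms else "GPU bound"


# Made-up timings for a 60 frames-per-second target (about 16.7 ms per frame).
label = classify_frame(cpu_ms=22.0, gpu_ms=9.0, target_fps=60)
```

A label produced this way is only a starting point. The timings must come from profiling on the target hardware, and a real frame can be over budget for reasons unrelated to rendering.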

Understanding rendering problems

It is vital that we use profiling tools to understand the cause of performance problems before we make any changes. Different problems require different solutions. It is also very important that we measure the effects of every change we make; fixing performance problems is a balancing act, and improving one aspect of performance can negatively impact another.
We will use two tools to help us understand and fix our rendering performance problems: the Profiler window and the Frame Debugger. Both of these tools are built into Unity.

Finding the cause of performance problems

Before we try to improve the rendering performance of our game, we must be certain that our game is running slowly due to rendering problems. There is no point trying to optimize our rendering performance if the real cause of our problem is overly complex user scripts! If you’re not sure whether your performance problems relate to rendering, you should follow this tutorial.
Once we have established that our problems relate to rendering, we must also understand whether our game is CPU bound or GPU bound. These different problems require different solutions, so it’s vital that we understand the cause of the problem before trying to fix it. If you’re not yet sure whether your game is CPU bound or GPU bound, you should follow this tutorial.
If we are certain that our problems relate to rendering and we know whether our game is CPU bound or GPU bound, we are ready to read on.

If our game is CPU bound

Broadly speaking, the work that must be carried out by the CPU in order to render a frame is divided into three categories:

  • Determining what must be drawn

  • Preparing commands for the GPU

  • Sending commands to the GPU

These broad categories contain many individual tasks, and these tasks may be carried out across multiple threads. Threads allow separate tasks to happen simultaneously; while one thread performs one task, another thread can perform a completely separate task. This means that the work can be done more quickly. When rendering tasks are split across separate threads, this is known as multithreaded rendering.
There are three types of thread involved in Unity’s rendering process: the main thread, the render thread and worker threads. The main thread is where the majority of CPU tasks for our game take place, including some rendering tasks. The render thread is a specialised thread that sends commands to the GPU. Worker threads each perform a single task, such as culling or mesh skinning. Which tasks are performed by which thread depends on our game’s settings and the hardware on which our game runs. For example, the more CPU cores our target hardware has, the more worker threads can be spawned. For this reason, it is very important to profile our game on target hardware; our game may perform very differently on different devices.
Because multithreaded rendering is complex and hardware-dependent, we must understand which tasks are causing our game to be CPU bound before we try to improve performance. If our game is running slowly because culling operations are taking too long on one thread, then it won’t help us to reduce the amount of time it takes to send commands to the GPU on a different thread.
NB: Not all platforms support multithreaded rendering; at the time of writing, WebGL does not support this feature. On platforms that do not support multithreaded rendering, all CPU tasks are carried out on the same thread. If we are CPU bound on such a platform, optimizing any CPU work will improve CPU performance. If this is the case for our game, we should read all of the following sections and consider which optimizations may be most suitable for our game.

Graphics jobs

The Graphics jobs option in Player Settings determines whether Unity uses worker threads to carry out rendering tasks that would otherwise be done on the main thread and, in some cases, the render thread. On platforms where this feature is available, it can deliver a considerable performance boost. If we wish to use this feature, we should profile our game with and without Graphics jobs enabled and observe the effect that it has on performance.

Finding out which tasks are contributing to problems

We can determine which tasks are causing our game to be CPU bound by using the Profiler window. This tutorial shows how to determine where the problems lie.
Now that we understand which tasks are causing our game to be CPU bound, let’s look at a few common problems and their solutions.

Sending commands to the GPU

The time taken to send commands to the GPU is the most common reason for a game to be CPU bound. This task is performed on the render thread on most platforms, although on certain platforms (for example, PlayStation 4) this may be performed by worker threads.
The most costly operation that occurs when sending commands to the GPU is the SetPass call. If our game is CPU bound due to sending commands to the GPU, reducing the number of SetPass calls is likely to be the best way to improve performance.
We can see how many SetPass calls and batches are being sent in the Rendering profiler of Unity’s Profiler window. The number of SetPass calls that can be sent before performance suffers depends very much on the target hardware; a high-end PC can send many more SetPass calls than a mobile device before performance suffers.
The number of SetPass calls and its relationship to the number of batches depend on several factors, and we’ll cover these topics in more detail later in the article. However, it’s usually the case that:

  • Reducing the number of batches and/or making more objects share the same render state will, in most cases, reduce the number of SetPass calls.

  • Reducing the number of SetPass calls will, in most cases, improve CPU performance.

If reducing the number of batches doesn’t reduce the number of SetPass calls, it may still lead to performance improvements in its own right. This is because the CPU can more efficiently process a single batch than several batches, even if they contain the same amount of mesh data.
There are, broadly, three ways of reducing the number of batches and SetPass calls. We will look more in-depth at each one of these:

  • Reducing the number of objects to be rendered will likely reduce both batches and SetPass calls.

  • Reducing the number of times each object must be rendered will usually reduce the number of SetPass calls.

  • Combining the data from objects that must be rendered into fewer batches will reduce the number of batches.

Different techniques will be suitable for different games, so we should consider all of these options, decide which ones could work in our game and experiment.

Reducing the number of objects being rendered

Reducing the number of objects that must be rendered is the simplest way to reduce the number of batches and SetPass calls. There are several techniques we can use to reduce the number of objects being rendered.

  • Simply reducing the number of visible objects in our scene can be an effective solution. If, for example, we are rendering a large number of different characters in a crowd, we can experiment with simply having fewer of these characters in the scene. If the scene still looks good and performance improves, this will likely be a much quicker solution than more sophisticated techniques.

  • We can reduce our camera’s draw distance using the camera’s Far Clip Plane property. This property is the distance beyond which objects are no longer rendered by the camera. If we wish to disguise the fact that distant objects are no longer visible, we can try using fog to hide the lack of distant objects.

  • For a more fine-grained approach to hiding objects based on distance, we can use our camera’s Layer Cull Distances property to provide custom culling distances for objects that are on separate layers. This approach can be useful if we have lots of small foreground decorative details; we could hide these details at a much shorter distance than large terrain features.

  • We can use a technique called occlusion culling to disable the rendering of objects that are hidden by other objects. For example, if there is a large building in our scene we can use occlusion culling to disable the rendering of objects behind it. Unity’s occlusion culling is not suitable for all scenes, can lead to additional CPU overhead and can be complex to set up, but it can greatly improve performance in some scenes. This Unity blog post on occlusion culling best practices is a great guide to the subject. In addition to using Unity’s occlusion culling, we can also implement our own form of occlusion culling by manually deactivating objects that we know cannot be seen by the player. For example, if our scene contains objects that are used for a cutscene but aren’t visible before or afterwards, we should deactivate them. Using our knowledge of our own game is always more efficient than asking Unity to work things out dynamically.
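The two distance-based options in the list above, the Far Clip Plane and Layer Cull Distances, can be pictured with a toy Python model. It is not Unity code, and it assumes several things for illustration: layers are identified by made-up string names, distances are plain numbers measured from the camera, and a layer with no entry (or an entry of zero) falls back to the far clip distance. The function name is hypothetical.

```python
def visible_by_distance(distance, layer, far_clip, layer_cull_distances):
    """Decide whether an object is near enough to the camera to be drawn.

    distance: how far the object is from the camera
    layer: name of the layer the object is on
    far_clip: distance beyond which nothing is drawn
    layer_cull_distances: optional shorter limits, keyed by layer name
    """
    limit = layer_cull_distances.get(layer, 0.0)
    if limit <= 0.0:
        # No custom limit for this layer, so the far clip distance applies.
        limit = far_clip
    # A custom limit can only shorten the draw distance, never extend it.
    return distance <= min(limit, far_clip)


far_clip = 1000.0
cull_distances = {"SmallDetails": 50.0}  # hypothetical layer of small props

pebble_near = visible_by_distance(30.0, "SmallDetails", far_clip, cull_distances)
pebble_far = visible_by_distance(80.0, "SmallDetails", far_clip, cull_distances)
hill_far = visible_by_distance(80.0, "Terrain", far_clip, cull_distances)
```

With these made-up values, small decorative details disappear beyond 50 units while terrain on another layer is still drawn out to the far clip distance, which is the effect described in the Layer Cull Distances bullet.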

Reducing the number of times each object must be rendered

Realtime lighting, shadows and reflections add a great deal of realism to games but can be very expensive. Using these features can cause objects to be rendered multiple times, which can greatly impact performance.
The exact impact of these features depends on the rendering path that we choose for our game. Rendering path is the term for the order in which calculations are performed when drawing the scene, and the major difference between rendering paths is how they handle realtime lights, shadows and reflections. As a general rule, Deferred Rendering is likely to be a better choice if our game runs on higher-end hardware and uses a lot of realtime lights, shadows and reflections. Forward Rendering is likely to be more suitable if our game runs on lower-end hardware and does not use these features. However, this is a very complex issue and if we wish to make use of realtime lights, shadows and reflections it is best to research the subject and experiment. This page of the Unity Manual gives more information on the different rendering paths available in Unity and is a useful jumping-off point. This tutorial contains useful information on the subject of lighting in Unity.
Regardless of the rendering path chosen, the use of realtime lights, shadows and reflections can impact our game’s performance and it’s important to understand how to optimize them.

  • Dynamic lighting in Unity is a very complex subject and discussing it in depth is beyond the scope of this article, but this tutorial is an excellent introduction to the subject and this page of the Unity Manual has details on common lighting optimizations.

  • Dynamic lighting is expensive. When our scene contains objects that don’t move, such as scenery, we can use a technique called baking to precompute the lighting for the scene so that runtime lighting calculations are not required. This tutorial gives an introduction to the technique, and this section of the Unity Manual covers baked lighting in detail.

  • If we wish to use realtime shadows in our game, this is likely an area where we can improve performance. This page of the Unity Manual is a good guide to the shadow properties that can be tweaked in Quality Settings and how these will affect appearance and performance. For example, we can use the Shadow Distance property to ensure that only nearby objects cast shadows.

  • Reflection probes create realistic reflections but can be very costly in terms of batches. It’s best to keep our use of reflections to a minimum where performance is a concern, and to optimize them as much as possible where they are used. This page of the Unity Manual is a useful guide to optimizing reflection probes.

Combining objects into fewer batches

A batch can contain the data for multiple objects when certain conditions are met. To be eligible for batching, objects must:

  • Share the same instance of the same material

  • Have identical material settings (i.e., texture, shader and shader parameters)
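The first condition above, sharing the same instance of a material, is easy to confuse with merely having equal settings. The following Python toy model shows the difference. It is not Unity code: `Material`, `Renderer` and `group_into_batches` are hypothetical stand-ins, and the model ignores every other batching restriction.

```python
class Material:
    def __init__(self, shader, texture):
        self.shader = shader
        self.texture = texture


class Renderer:
    def __init__(self, name, material):
        self.name = name
        self.material = material


def group_into_batches(renderers):
    """Group renderer names by material instance (object identity, not equality)."""
    batches = {}
    for renderer in renderers:
        # id() distinguishes two separate objects even if their fields match.
        batches.setdefault(id(renderer.material), []).append(renderer.name)
    return list(batches.values())


stone = Material(shader="Standard", texture="stone.png")
stone_duplicate = Material(shader="Standard", texture="stone.png")  # same settings, new instance

renderers = [
    Renderer("boulder_a", stone),
    Renderer("boulder_b", stone),
    Renderer("boulder_c", stone),
    Renderer("boulder_d", stone_duplicate),
]
batches = group_into_batches(renderers)
```

In this model the three boulders that reference `stone` fall into one group, while the fourth lands in a group of its own even though its material looks identical. Assigning the one shared material to all four objects would let them be grouped together.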

Batching eligible objects can improve performance, although as with all optimization techniques we must profile carefully to ensure that the cost of batching does not exceed the performance gains.
There are a few different techniques for batching eligible objects:

  • Static batching is a technique that allows Unity to batch nearby eligible objects that do not move. A good example of something that could benefit from static batching is a pile of similar objects, such as boulders. This page of the Unity Manual contains instructions on setting up static batching in our game. Static batching can lead to higher memory usage so we should bear this cost in mind when profiling our game.

  • Dynamic batching is another technique that allows Unity to batch eligible objects, whether they move or not. There are a few restrictions on the objects that can be batched using this technique. These restrictions are listed, along with instructions, on this page of the Unity Manual. Dynamic batching has an impact on CPU usage that can cause it to cost more in CPU time than it saves. We should bear this cost in mind when experimenting with this technique and be cautious with its use.

  • Batching Unity’s UI elements is a little more complex, as it can be affected by the layout of our UI. This video from Unite Bangkok 2015 gives a good overview of the subject and this guide to optimizing Unity UI provides in-depth information on how to ensure that UI batching works as we intend it to.

  • GPU instancing is a technique that allows large numbers of identical objects to be very efficiently batched. There are limitations to its use and it is not supported by all hardware, but if our game has many identical objects onscreen at once we may be able to benefit from this technique. This page of the Unity Manual contains an introduction to GPU instancing in Unity, with details of how to use it, which platforms support it and the circumstances under which it may benefit our game.

  • Texture atlasing is a technique where multiple textures are combined into one larger texture. It is commonly used in 2D games and UI systems, but can also be used in 3D games. If we use this technique when creating art for our game, we can ensure that objects share textures and are therefore eligible for batching. Unity has a built-in texture atlasing tool called Sprite Packer for use with 2D games.

  • It is possible to manually combine meshes that share the same material and texture, either in the Unity Editor or via code at runtime. When combining meshes in this way, we must be aware that shadows, lighting and culling will still operate on a per-object level; this means that a performance increase from combining meshes could be counteracted by no longer being able to cull those objects when they would otherwise not have been rendered. If we wish to investigate this approach, we should examine the Mesh.CombineMeshes function. The CombineChildren script in Unity’s Standard Assets package is an example of this technique.

  • We must be very careful when accessing Renderer.material in scripts. This duplicates the material and returns a reference to the new copy. Doing so will break batching if the renderer was part of a batch because the renderer no longer has a reference to the same instance of the material. If we wish to access a batched object’s material in a script, we should use Renderer.sharedMaterial.
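The pitfall in the last point above can be modelled in a few lines of Python. This is a toy, not Unity's implementation: the `Material` and `Renderer` classes are hypothetical, the attribute names `material` and `shared_material` only mirror the roles of Renderer.material and Renderer.sharedMaterial, and the model assumes that the private copy is made on the first access only.

```python
import copy


class Material:
    def __init__(self, shader, color):
        self.shader = shader
        self.color = color


class Renderer:
    def __init__(self, shared_material):
        self.shared_material = shared_material
        self._owns_copy = False

    @property
    def material(self):
        # Assumed behaviour: the first access swaps in a private copy,
        # so this renderer stops referencing the shared instance.
        if not self._owns_copy:
            self.shared_material = copy.copy(self.shared_material)
            self._owns_copy = True
        return self.shared_material


def distinct_materials(renderers):
    """Count how many different material instances the renderers reference."""
    return len({id(r.shared_material) for r in renderers})


stone = Material(shader="Standard", color="grey")
renderers = [Renderer(stone) for _ in range(3)]

before = distinct_materials(renderers)
renderers[0].material.color = "red"  # looks harmless, but creates a copy
after = distinct_materials(renderers)
```

After the single access through `material`, the renderers reference two material instances instead of one, so in this model they can no longer be grouped as a single batch. Editing through `shared_material` instead changes every object that uses the material and leaves the instance count unchanged.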

Culling, sorting and batching

Culling, gathering data on objects that will be drawn, sorting this data into batches and generating GPU commands can all contribute to being CPU bound. These tasks will either be performed on the main thread or on individual worker threads, depending on our game’s settings and target hardware.

  • Culling is unlikely to be very costly on its own, but reducing unnecessary culling may help performance. There is a per-object-per-camera overhead for all active scene objects, even those which are on layers that are not being rendered. To reduce this, we should disable cameras and deactivate or disable renderers that are not currently in use.

  • Batching can greatly improve the speed of sending commands to the GPU, but it can sometimes add unwanted overhead elsewhere. If batching operations are contributing to our game being CPU bound, we may wish to limit the number of manual or automatic batching operations in our game.

Skinned meshes

SkinnedMeshRenderers are used when we animate a mesh by deforming it using a technique called bone animation. It’s most commonly used in animated characters. Tasks related to rendering skinned meshes will usually be performed on the main thread or on individual worker threads, depending on our game’s settings and target hardware.
Rendering skinned meshes can be a costly operation. If we can see in the Profiler window that rendering skinned meshes is contributing to our game being CPU bound, there are a few things we can try to improve performance:

  • We should consider whether we need to use SkinnedMeshRenderer components for every object that currently uses one. It may be that we have imported a model that uses a SkinnedMeshRenderer component but we are not actually animating it, for example. In a case like this, replacing the SkinnedMeshRenderer component with a MeshRenderer component will aid performance. When importing models into Unity, if we choose not to import animations in the model’s Import Settings, the model will have a MeshRenderer instead of a SkinnedMeshRenderer.

  • If we are animating our object only some of the time (for example, only on start up or only when it is within a certain distance of the camera), we could switch its mesh for a less detailed version or its SkinnedMeshRenderer component for a MeshRenderer component. The SkinnedMeshRenderer component has a BakeMesh function that can create a mesh in a matching pose, which is useful for swapping between different meshes or renderers without any visible change to the object.

  • This page of the Unity Manual contains advice on optimizing animated characters that use skinned meshes, and the Unity Manual page on the SkinnedMeshRenderer component includes tweaks that can improve performance. In addition to the suggestions on these pages, it is worth bearing in mind that the cost of mesh skinning increases with each vertex; using fewer vertices in our models will therefore reduce the amount of work that must be done.

  • On certain platforms, skinning can be handled by the GPU rather than the CPU. This option may be worth experimenting with if we have a lot of capacity on the GPU. We can enable GPU skinning for the current platform and quality target in Player Settings.
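
To make the per-vertex cost of skinning concrete, here is a back-of-envelope sketch in Python. It is not Unity code: the function name, the idea of counting "bone-weighted vertex transforms", and all of the numbers are assumptions chosen only to show how the work scales.

```python
def skinning_transforms_per_frame(characters, vertices_per_character, bones_per_vertex):
    """Rough count of bone-weighted vertex transforms needed each frame.

    This is a simplified model: it assumes every skinned vertex is
    transformed once per influencing bone, every frame.
    """
    return characters * vertices_per_character * bones_per_vertex


# Hypothetical scene: 20 animated characters.
before = skinning_transforms_per_frame(20, 10_000, 4)
# Same scene with half the vertices and half the bone influences per vertex.
after = skinning_transforms_per_frame(20, 5_000, 2)

print(before, after, after / before)
```

Under these assumptions the work falls to a quarter, which is why reducing vertex counts on skinned models is worth trying before anything more involved. Only profiling can confirm whether the saving appears in our game.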

Main thread operations unrelated to rendering

It’s important to understand that many CPU tasks unrelated to rendering take place on the main thread. This means that if we are CPU bound on the main thread, we may be able to improve performance by reducing the CPU time spent on tasks not related to rendering.

As an example, our game may be carrying out expensive rendering operations and expensive user script operations on the main thread at a certain point in our game, making us CPU bound. If we have optimized the rendering operations as much as we can without losing visual fidelity, it is possible that we may be able to reduce the CPU cost of our own scripts to improve performance.

If our game is GPU bound

The first thing to do if our game is GPU bound is to find out what is causing the GPU bottleneck. GPU performance is most often limited by fill rate, especially on mobile devices, but memory bandwidth and vertex processing can also be concerns. Let’s examine each of these problems and learn what causes it, how to diagnose it and how to fix it.

Fill rate

Fill rate refers to the number of pixels the GPU can render to the screen each second. If our game is limited by fill rate, this means that our game is trying to draw more pixels per frame than the GPU can handle.
It’s simple to check if fill rate is causing our game to be GPU bound:

  • Profile the game and note the GPU time.

  • Decrease the display resolution in Player Settings.

  • Profile the game again. If performance has improved, it is likely that fill rate is the problem.
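
The check above can be read more precisely by comparing how much the GPU time fell with how much the pixel count fell. The following Python sketch is a rough heuristic rather than a Unity feature; the function names and the two GPU timings are hypothetical values standing in for numbers we would read from the Profiler.

```python
def pixel_ratio(old_resolution, new_resolution):
    """Fraction of the original pixel count that remains after the change."""
    old_w, old_h = old_resolution
    new_w, new_h = new_resolution
    return (new_w * new_h) / (old_w * old_h)


def share_of_saving_realised(gpu_ms_before, gpu_ms_after, remaining_pixels):
    """Compare the drop in GPU time with the drop in pixel count.

    Values near 1.0 mean GPU time fell almost in step with the pixel count;
    values near 0.0 mean the resolution change made little difference.
    """
    time_saving = 1.0 - gpu_ms_after / gpu_ms_before
    pixel_saving = 1.0 - remaining_pixels
    return time_saving / pixel_saving


# Dropping from 1920x1080 to 1280x720 leaves about 44% of the pixels.
remaining = pixel_ratio((1920, 1080), (1280, 720))

# Hypothetical Profiler readings: 20 ms before, 12 ms after.
share = share_of_saving_realised(20.0, 12.0, remaining)

print(f"pixels remaining: {remaining:.3f}, share of saving realised: {share:.2f}")
```

A large share suggests that per-pixel work dominates the GPU time. A small share suggests that the bottleneck lies elsewhere, such as memory bandwidth or vertex processing.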

If fill rate is the cause of our problem, there are a few approaches that may help us to fix the problem.

  • Fragment shaders are the sections of shader code that tell the GPU how to draw a single pixel. This code is executed by the GPU for every pixel it must draw, so if the code is inefficient then performance problems can easily stack up. Complex fragment shaders are a very common cause of fill rate problems.

  • If our game is using built-in shaders, we should aim to use the simplest and most optimized shaders possible for the visual effect we want. As an example, the mobile shaders that ship with Unity are highly optimized; we should experiment with using them and see if this improves performance without affecting the look of our game. These shaders were designed for use on mobile platforms, but they are suitable for any project. It is perfectly fine to use “mobile” shaders on non-mobile platforms to increase performance if they give the visual fidelity required for the project.

  • If objects in our game use Unity’s Standard Shader, it is important to understand that Unity compiles this shader based on the current material settings. Only features that are currently being used are compiled. This means that removing features such as detail maps can result in much less complex fragment shader code which can greatly benefit performance. Again, if this is the case in our game, we should experiment with the settings and see if we are able to improve performance without affecting visual quality.

  • If our project uses bespoke shaders, we should aim to optimize them as much as possible. Optimizing shaders is a complex subject, but this page of the Unity Manual and the Shader optimization section of this page of the Unity Manual contain useful starting points for optimizing our shader code.

  • Overdraw is the term for when the same pixel is drawn multiple times. This happens when objects are drawn on top of other objects and contributes greatly to fill rate issues. To understand overdraw, we must understand the order in which Unity draws objects in the scene. An object’s shader determines its draw order, usually by specifying which render queue the object is in. Unity uses this information to draw objects in a strict order, as detailed on this page of the Unity Manual. Additionally, the objects in different render queues are sorted differently before they are drawn. For example, Unity sorts items front-to-back in the Geometry queue to minimize overdraw, but sorts objects back-to-front in the Transparent queue to achieve the required visual effect. This back-to-front sorting actually has the effect of maximizing overdraw for objects in the Transparent queue. Overdraw is a complex subject and there is no one size fits all approach to solving overdraw problems, but reducing the number of overlapping objects that Unity cannot automatically sort is key. The best place to start investigating this issue is in Unity’s Scene view; there is a Draw Mode that allows us to see overdraw in our scene and, from there, identify where we can work to reduce it. The most common culprits for excessive overdraw are transparent materials, unoptimized particles and overlapping UI elements, so we should experiment with optimizing or reducing these. This article on the Unity Learn site focuses primarily on Unity UI, but also contains good general guidance on overdraw.

  • The use of image effects can greatly contribute to fill rate issues, especially if we are using more than one image effect. If our game makes use of image effects and is struggling with fill rate issues, we may wish to experiment with different settings or more optimized versions of the image effects (such as Bloom (Optimized) in place of Bloom). If our game uses more than one image effect on the same camera, this will result in multiple shader passes. In this case, it may be beneficial to combine the shader code for our image effects into a single pass, such as in Unity’s PostProcessing Stack. If we have optimized our image effects and are still having fill rate issues, we may need to consider disabling image effects, particularly on lower-end devices.
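
The effect of draw order on overdraw can be illustrated with a toy model of a single pixel covered by several surfaces. This Python sketch is a simplification, not how Unity or any particular GPU is implemented: it assumes a fragment that fails the depth test is rejected before its fragment shader runs, and it models transparent surfaces simply as surfaces that do not write depth.

```python
def fragments_shaded(depths_in_draw_order, writes_depth=True):
    """Count fragment shader runs for one pixel covered by several surfaces.

    depths_in_draw_order: distance from the camera of each surface,
    listed in the order the surfaces are drawn.
    writes_depth: True for opaque surfaces, False to model transparent
    surfaces that are tested against the depth buffer but do not update it.
    """
    nearest = float("inf")  # empty depth buffer
    shaded = 0
    for depth in depths_in_draw_order:
        if depth >= nearest:
            continue  # hidden behind something already drawn, so not shaded
        shaded += 1
        if writes_depth:
            nearest = depth
    return shaded


depths = [4.0, 1.0, 3.0, 2.0]  # four overlapping surfaces on one pixel

opaque_front_to_back = fragments_shaded(sorted(depths))
opaque_back_to_front = fragments_shaded(sorted(depths, reverse=True))
transparent_back_to_front = fragments_shaded(sorted(depths, reverse=True), writes_depth=False)

print(opaque_front_to_back, opaque_back_to_front, transparent_back_to_front)
```

In this model, front-to-back drawing of opaque surfaces shades the pixel once, while back-to-front drawing shades it once per surface. That is the reason stacked transparent materials, particles and UI elements are such common causes of fill rate problems.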

Memory bandwidth

Memory bandwidth refers to the rate at which the GPU can read from and write to its dedicated memory. If our game is limited by memory bandwidth, this usually means that we are using textures that are too large for the GPU to handle quickly.
To check if memory bandwidth is a problem, we can do the following:

  • Profile the game and note the GPU time.

  • Reduce the Texture Quality for the current platform and quality target in Quality Settings.

  • Profile the game again and note the GPU time. If performance has improved, it is likely that memory bandwidth is the problem.

If memory bandwidth is our problem, we need to reduce the texture memory usage in our game. Again, the technique that works best for each game will be different, but there are a few ways in which we can optimize our textures.

  • Texture compression is a technique that can greatly reduce the size of textures both on disk and in memory. If memory bandwidth is a concern in our game, using texture compression to reduce the size of textures in memory can aid performance. There are lots of different texture compression formats and settings available within Unity, and each texture can have separate settings. As a general rule, some form of texture compression should be used whenever possible; however, a trial and error approach to find the best setting for each texture works best. This page in the Unity Manual contains useful information on different compression formats and settings.

  • Mipmaps are lower resolution versions of textures that Unity can use on distant objects. If our scene contains objects that are far from the camera, we may be able to use mipmaps to ease problems with memory bandwidth. The Mipmaps Draw Mode in Scene view allows us to see which objects in our scene could benefit from mipmaps, and this page of the Unity Manual contains more information on enabling mipmaps for textures.
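
Some quick arithmetic shows why both techniques matter. The Python sketch below estimates texture sizes from width, height and bits per pixel. It is an approximation under stated assumptions: 32 bits per pixel stands for an uncompressed RGBA texture, 4 bits per pixel stands for a typical block-compressed format, and the padding that real compressed formats add to very small mip levels is ignored.

```python
def mip_chain_pixels(width, height):
    """Total pixels in a full mip chain, halving each level down to 1x1."""
    total = 0
    while True:
        total += width * height
        if width == 1 and height == 1:
            break
        width = max(1, width // 2)
        height = max(1, height // 2)
    return total


def texture_bytes(width, height, bits_per_pixel, mipmaps=False):
    """Approximate size in memory of a texture, with or without mipmaps."""
    pixels = mip_chain_pixels(width, height) if mipmaps else width * height
    return pixels * bits_per_pixel // 8


def mip_level_bytes(width, height, bits_per_pixel, level):
    """Approximate size of a single mip level (level 0 is full size)."""
    w = max(1, width >> level)
    h = max(1, height >> level)
    return w * h * bits_per_pixel // 8


uncompressed = texture_bytes(2048, 2048, 32)               # 16 MiB
compressed = texture_bytes(2048, 2048, 4)                  # 2 MiB
with_mipmaps = texture_bytes(2048, 2048, 32, mipmaps=True)
distant_level = mip_level_bytes(2048, 2048, 32, level=3)   # a 256x256 level

print(uncompressed, compressed, with_mipmaps, distant_level)
```

Two things stand out. Compression at 4 bits per pixel cuts the texture to an eighth of its uncompressed size. Mipmaps add roughly a third to the memory footprint, but a distant object that samples level 3 reads from a 256 KiB image instead of a 16 MiB one, which is where the bandwidth saving comes from.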

Vertex processing

Vertex processing refers to the work that the GPU must do to render each vertex in a mesh. The cost of vertex processing is impacted by two things: the number of vertices that must be rendered, and the number of operations that must be performed on each vertex.
If our game is GPU bound and we have established that it isn’t limited by fill rate or memory bandwidth, then it is likely that vertex processing is the cause of the problem. If this is the case, experimenting with reducing the amount of vertex processing that the GPU must do is likely to result in performance gains.
There are a few approaches we could consider to help us reduce the number of vertices or the number of operations that we are performing on each vertex.

  • Firstly, we should aim to reduce any unnecessary mesh complexity. If we are using meshes that have a level of detail that cannot be seen in game, or inefficient meshes that have too many vertices due to errors in creating them, this is wasted work for the GPU. The simplest way to reduce the cost of vertex processing is to create meshes with a lower vertex count in our 3D art program.

  • We can experiment with a technique called normal mapping, which is where textures are used to create the illusion of greater geometric complexity on a mesh. Although there is some GPU overhead to this technique, it will in many cases result in a performance gain. This page of the Unity Manual has a useful guide to using normal mapping to simulate complex geometry in our meshes.

  • If a mesh in our game does not make use of normal mapping, we can often disable the use of vertex tangents for that mesh in the mesh’s import settings. This reduces the amount of data that is sent to the GPU for each vertex.

  • Level of detail, also known as LOD, is an optimisation technique where meshes that are far from the camera are reduced in complexity. This reduces the number of vertices that the GPU has to render without affecting the visual quality of the game. The LOD Group page of the Unity Manual contains more information on how to set up LOD in our game.

  • Vertex shaders are blocks of shader code that tell the GPU how to draw each vertex. If our game is limited by vertex processing, then reducing the complexity of our vertex shaders may help.

  • If our game uses built-in shaders, we should aim to use the simplest and most optimized shaders possible for the visual effect we want. As an example, the mobile shaders that ship with Unity are highly optimized; we should experiment with using them and see if this improves performance without affecting the look of our game.

  • If our project uses bespoke shaders, we should aim to optimize them as much as possible. Optimizing shaders is a complex subject, but this page of the Unity Manual and the Shader optimization section of this page of the Unity Manual contain useful starting points for optimizing our shader code.
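
Two of the suggestions above, removing vertex tangents and using lower-detail meshes, both reduce the amount of vertex data the GPU has to handle. The Python sketch below estimates that data volume. The attribute sizes are an assumed layout of 32-bit floats (a three-component position and normal, a two-component UV and a four-component tangent), and the vertex counts are hypothetical; the layout Unity actually uses for a given mesh may differ.

```python
# Assumed per-attribute sizes in bytes (32-bit floats); not read from Unity.
ATTRIBUTE_BYTES = {
    "position": 12,  # 3 floats
    "normal": 12,    # 3 floats
    "uv": 8,         # 2 floats
    "tangent": 16,   # 4 floats
}


def bytes_per_vertex(attributes):
    """Size of one vertex given the attributes it stores."""
    return sum(ATTRIBUTE_BYTES[name] for name in attributes)


def vertex_data_bytes(vertex_count, attributes):
    """Total vertex data for a mesh with the given attributes."""
    return vertex_count * bytes_per_vertex(attributes)


with_tangents = ["position", "normal", "uv", "tangent"]
without_tangents = ["position", "normal", "uv"]

full_detail = vertex_data_bytes(50_000, with_tangents)
no_tangents = vertex_data_bytes(50_000, without_tangents)
# Hypothetical lower LOD with a quarter of the vertices and no tangents.
distant_lod = vertex_data_bytes(12_500, without_tangents)

print(full_detail, no_tangents, distant_lod)
```

In this model, dropping tangents removes a third of the per-vertex data, and the lower LOD reduces the total to a sixth of the original. The model says nothing about vertex shader cost, so the real gain still has to be measured by profiling.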

Conclusion

We’ve learned how rendering works in Unity, what sort of problems can occur when rendering and how to improve rendering performance in our game. Using this knowledge and our profiling tools, we can fix performance problems related to rendering and structure our games so that they have a smooth and efficient rendering pipeline.
The links below provide further information on the topics covered in this article.

Resources

Unity Learn: Optimizing Unity UI

Unity Knowledge Base: Why is my static batching breaking or otherwise not working as expected?

Fabian Giesen: A trip through the graphics pipeline

Simon Schreibt: Render hell

Gamasutra: How to choose between Forward or Deferred rendering paths in Unity

Gamasutra: Batching independently moving GameObjects into a single mesh to reduce draw calls

FlameBait Games: Optimizing SkinnedMeshRenderers for Unity 5

Pencil Square Games: Reducing draw calls (also named SetPass calls) in Unity 5

Reposted from blog.csdn.net/fztfztfzt/article/details/122785911