Unity performance optimization: a summary

Foreword

  This article shares the optimization methods I have used in my own work and study, for your reference. It also serves as my own record and summary to consolidate what I have learned.
  Optimization generally starts from three areas we are all familiar with: CPU, memory, and GPU. These are the ones we deal with most often and most need to understand, so I discuss each of them below.

CPU optimization

1. The CPU itself

  First of all, what is a CPU made of? Roughly: a control unit, arithmetic units, registers, caches, and buses. There is no need to understand all of them in detail; we mainly care about the caches and registers.
  A register is a very small, very fast storage unit inside the CPU. Because registers sit inside the CPU itself, they are faster than memory but far smaller, and they generally hold small temporary values, such as a function's return value (an int, char, or bool, for example) or the result of an arithmetic operation.
  To optimize for the CPU, you first need to know where the CPU is being used. The CPU's responsibilities are too many and too broad to cover fully, so I can only list some of the cases I have run into.

2. CPU optimization record

GC optimization

  GC stands for garbage collection. GC mainly operates on memory, so why is it listed under CPU optimization? The reason is simple: GC is essentially the cleanup of objects in the managed heap, that is, mark, sweep, and compact. For the specific implementations, refer to the relevant algorithms, such as Boehm GC and SGen GC.
  These algorithms decide whether an object can be reclaimed by traversing the references between objects. The more objects there are in the managed heap, the more each collection costs. The collection itself runs on the CPU, so GC is closely tied to CPU performance.

  • If a class is relatively simple, consider replacing it with a struct, which is more efficient.
  • Improve the CPU cache hit rate by keeping data contiguous in memory wherever possible.
  • Avoid high-frequency allocation (new class instances, containers, arrays, and so on) and boxing in Update as much as possible; create such objects once and cache them for reuse.
  • Use anonymous functions and delegates with caution. Passing a named method as a parameter allocates a delegate on the heap. With an anonymous function, only one that is not a closure (it captures no variables) avoids additional garbage.
  • Use object pools wherever you can.
  • Avoid List's ToArray method: it allocates a new array and copies the elements into it, which produces more garbage.
  • yield return 0; generates garbage because the int value 0 is boxed. Use yield return null; instead.
  • Call GC.Collect yourself at suitable moments.
  • LINQ and regular expressions allocate behind the scenes (boxing included) and generate garbage, so use them sparingly.
  • A string is immutable: it cannot be changed after it is created. Changing its value actually creates a copy, which generates additional garbage. Try to avoid manipulating strings in Update, including assigning new values and concatenation. Do concatenation with StringBuilder whenever possible; internally it is implemented as a char array.
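
  Two of the points above can be illustrated in plain C#, outside Unity. The following is a minimal sketch, not code from any project: the class name GcFriendly and its methods are made up for illustration. It shows that a value yielded as int arrives boxed in IEnumerator.Current, and how one reused StringBuilder replaces repeated string concatenation.

```csharp
using System.Collections;
using System.Text;

// Hypothetical helper class; the names are illustrative only.
public static class GcFriendly
{
    // Reused across calls so that no new builder is allocated each time.
    // Not thread-safe: intended for single-threaded use such as Update.
    private static readonly StringBuilder Shared = new StringBuilder(64);

    // IEnumerator.Current is typed as object, so yielding the int 0 boxes it.
    public static IEnumerator YieldsBoxedInt()
    {
        yield return 0;
    }

    // Yielding null stores a null reference, so nothing is boxed.
    public static IEnumerator YieldsNull()
    {
        yield return null;
    }

    // Builds a label such as "HP: 75/100" with appends instead of
    // chained string concatenation.
    public static string FormatHealth(int current, int max)
    {
        Shared.Clear();
        Shared.Append("HP: ").Append(current).Append('/').Append(max);
        return Shared.ToString();
    }
}
```

  In a coroutine, both yields wait for the next frame; only the first one allocates a boxed object each time it runs.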

Code efficiency optimization

  • Caching: trade space for time and cache frequently used objects.
  • Preprocessing: do calculations ahead of time so that too much data is not processed at the same moment, which would spike CPU usage.
  • Frame limiting: some modules, such as the Buff module, do not need to refresh every frame. Tick them once per second or once every half second instead.
  • Time slicing: spread one batch of work across multiple frames. For example, when spawning monsters, spread a single spawn wave over several frames to reduce the cost of any one frame.
  • Multithreading: consider worker threads for the network module and the file read/write module to make better use of the CPU.
  • Event mechanism: logic that checks a condition in Tick can often be triggered by an event instead, so it does not need to be checked every frame.
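
  Frame limiting and time slicing can be sketched without any engine dependency. The two classes below are hypothetical (IntervalTicker and FrameSlicer are names chosen for this example); in Unity you would call their Update methods from a MonoBehaviour's Update and pass Time.deltaTime.

```csharp
using System;
using System.Collections.Generic;

// Frame limiting: runs a callback at a fixed interval instead of every frame.
public sealed class IntervalTicker
{
    private readonly float interval;
    private readonly Action onTick;
    private float elapsed;

    public IntervalTicker(float intervalSeconds, Action onTick)
    {
        if (intervalSeconds <= 0f)
            throw new ArgumentOutOfRangeException(nameof(intervalSeconds));
        this.interval = intervalSeconds;
        this.onTick = onTick ?? throw new ArgumentNullException(nameof(onTick));
    }

    // Call once per frame with that frame's delta time.
    public void Update(float deltaTime)
    {
        elapsed += deltaTime;
        while (elapsed >= interval)
        {
            elapsed -= interval;
            onTick();
        }
    }
}

// Time slicing: runs at most maxPerFrame queued jobs per frame.
public sealed class FrameSlicer
{
    private readonly Queue<Action> jobs = new Queue<Action>();
    private readonly int maxPerFrame;

    public FrameSlicer(int maxPerFrame)
    {
        this.maxPerFrame = Math.Max(1, maxPerFrame);
    }

    public int Pending => jobs.Count;

    public void Enqueue(Action job) => jobs.Enqueue(job);

    // Call once per frame.
    public void Update()
    {
        for (int i = 0; i < maxPerFrame && jobs.Count > 0; i++)
            jobs.Dequeue()();
    }
}
```

  For example, queuing ten spawn jobs on a FrameSlicer with a limit of three spreads them over four frames instead of paying for all ten in one.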

Reduce DrawCall

  A DrawCall is a rendering command, that is, a call to the graphics rendering API. Issuing it is mainly CPU work. Reducing DrawCalls essentially means batching more, that is, merging more vertex data and submitting it in one go.

  • Static batching trades memory for CPU time. Strictly speaking, it does not reduce the number of DCs; it combines them into batches, so the render state only needs to be set once while the DC count stays the same. The advantages are clear: fewer batches and less CPU work. Compare this with merging the models in the art production stage: Unity then cannot judge the visibility of each sub-model, which increases the CPU's workload. There are also disadvantages: when several GameObjects reference a shared mesh, static batching makes a copy of the mesh data for each of them. Every GameObject in the scene that references the same model has its vertex data copied, transformed into world space, and stored in the combined vertex buffer. This increases both the package size and the runtime memory usage.
  • Dynamic batching: note that the limit is 900 vertex attributes, not 900 vertices. Normals, UVs, and tangents all count toward it, and the total must not exceed 900. Batching also happens at run time, which inevitably adds CPU cost.
  • GPU Instancing.
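
  The dynamic batching limit is easier to reason about as arithmetic. The sketch below turns the 900-attribute budget into a per-mesh vertex budget. The class and method names are hypothetical, and the additional cap of 300 vertices is not from this article: it is what the Unity manual states for the built-in render pipeline, so treat both constants as assumptions and check the manual for your Unity version.

```csharp
using System;

// Hypothetical helper; the constants are assumptions taken from the Unity manual.
public static class DynamicBatchingBudget
{
    public const int MaxVertexAttributes = 900; // limit mentioned above
    public const int MaxVertices = 300;         // assumed extra cap, not from this article

    // attributesPerVertex: how many vertex attributes the shader uses,
    // for example position + normal + one UV set = 3.
    public static int MaxVerticesFor(int attributesPerVertex)
    {
        if (attributesPerVertex <= 0)
            throw new ArgumentOutOfRangeException(nameof(attributesPerVertex));
        return Math.Min(MaxVertices, MaxVertexAttributes / attributesPerVertex);
    }
}
```

  With position, normal, and one UV set the budget is 900 / 3 = 300 vertices; adding a second UV set and tangents (five attributes) lowers it to 900 / 5 = 180.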

Memory optimization records

  Our main concerns are Native memory and Mono memory. Simply put, the former is where Unity resources are stored, and the latter is where memory requested by script code is allocated.

Resource optimization

1. Texture

  • Compression format: if you only support devices with OpenGL ES 3.0 or later, use Crunched compression: RGB Crunched ETC for RGB textures and RGBA Crunched ETC2 for RGBA textures.
  • Keep textures as small as possible, and try to make each dimension a power of two.
  • Enable mipmaps only where they are needed. UI textures do not need them.
  • Do not enable Read/Write unless you need to manipulate the texture at run time.
  • Atlas issues, such as low atlas utilization: consider splitting the atlas into several smaller images.
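
  To get a feel for what texture size and mipmaps cost, a rough estimate helps. The sketch below is a hypothetical helper (TextureBudget is not a Unity API) that checks for power-of-two sizes and estimates the size of an uncompressed texture, with or without a full mip chain. Compressed formats such as ETC2 use fewer bits per pixel, so real numbers will be lower.

```csharp
using System;

// Hypothetical helper for back-of-the-envelope texture budgeting.
public static class TextureBudget
{
    public static bool IsPowerOfTwo(int n) => n > 0 && (n & (n - 1)) == 0;

    // Size in bytes of an uncompressed texture. With mipmaps, sums every
    // level down to 1x1, halving each dimension per level.
    public static long UncompressedBytes(int width, int height, int bytesPerPixel, bool mipmaps)
    {
        if (width <= 0 || height <= 0 || bytesPerPixel <= 0)
            throw new ArgumentOutOfRangeException();

        long total = 0;
        int w = width, h = height;
        while (true)
        {
            total += (long)w * h * bytesPerPixel;
            if (!mipmaps || (w == 1 && h == 1)) break;
            w = Math.Max(1, w / 2);
            h = Math.Max(1, h / 2);
        }
        return total;
    }
}
```

  A 1024 x 1024 texture at 4 bytes per pixel takes 4,194,304 bytes without mipmaps and 5,592,404 bytes with the full chain, roughly one third more, which is why mipmaps are worth turning off for UI.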

2. Mesh

  • Keep the vertex count low, and try not to exceed the upper limit for static batching.

  • Player Settings > Other Settings > Optimization > Vertex Compression: when enabled, it reduces the precision of vertex data from 32 bits to 16 bits. The same section also has Optimize Mesh Data, which strips vertex data that is not used, such as normals and tangents. It seems to have a bug at the moment, so test it before relying on it.

  • The model import settings also have Optimize Mesh, which reorders the vertices and indices to improve rendering efficiency at run time.

  • Read/Write Enabled: as with textures, leave it off unless necessary.

  • Weld Vertices: recommended. It merges vertices that share the same position, which reduces overhead.

  • Normals & Tangents: if the model does not use normals or tangents, you can choose not to import them.

  • How the model is split also matters: the granularity should be neither too coarse nor too fine. If a piece is too large, frustum culling can rarely remove it.

  • If the artists cannot change the model's vertices and faces, consider merging some meshes yourself with the Mesh.CombineMeshes method, which takes an array of CombineInstance. If the textures differ, you also have to merge the textures and reassign the UVs.
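
  What merging meshes does to the data can be shown with a small stand-in, independent of Unity. This is only a sketch under simplifying assumptions: SimpleMesh and MeshMerger are hypothetical types, positions are already in a common space, and UVs, normals, and per-instance transforms (which Unity's CombineInstance carries) are left out. The key step is offsetting each part's triangle indices by the number of vertices already merged.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a mesh: flattened xyz positions plus triangle indices.
public sealed class SimpleMesh
{
    public float[] Positions = Array.Empty<float>(); // x0, y0, z0, x1, y1, z1, ...
    public int[] Triangles = Array.Empty<int>();
    public int VertexCount => Positions.Length / 3;
}

public static class MeshMerger
{
    // Concatenates vertex data and shifts each part's indices so they
    // still point at that part's own vertices in the merged buffer.
    public static SimpleMesh Combine(IReadOnlyList<SimpleMesh> parts)
    {
        var positions = new List<float>();
        var triangles = new List<int>();

        foreach (SimpleMesh part in parts)
        {
            int baseVertex = positions.Count / 3;
            positions.AddRange(part.Positions);
            foreach (int index in part.Triangles)
                triangles.Add(index + baseVertex);
        }

        return new SimpleMesh
        {
            Positions = positions.ToArray(),
            Triangles = triangles.ToArray()
        };
    }
}
```

  Merging two single-triangle meshes this way gives one mesh with six vertices whose second triangle uses indices 3, 4, 5. Doing this once at load time is fine; it is not something to run every frame.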

3. Animation

  • In the skeleton import settings, check Optimize Game Objects. This removes bone nodes that only carry a Transform and improves CPU efficiency.
  • Control how often Animator.Initialize is triggered. For animated characters that are instantiated frequently, try handling them with a pool. When you need to hide an animated character, do not deactivate its GameObject directly; instead, disable the Animator component and move the GameObject offscreen, which reduces the number of calls to Animator.Initialize.
  • Animator CullingMode can be set to Cull Update Transforms (when the object is not visible to the camera, only the root node's displacement is calculated so that the object's position stays correct) or Cull Completely (when the object is not visible to the camera, the animation stops running entirely).
  • Reduce the precision of animation data. Keyframe data is stored as float by default. The general approach is to read the data of the Anim file itself, modify the values inside the curves, and reduce the precision to three decimal places.

  • You can also remove redundant curves, such as curves that never change from start to end (scale curves are a typical case), to reduce memory usage.
  • If many similar objects are rendered at once, consider GPU skinning plus GPU instancing: bake the animation data into a texture and let the GPU handle the vertex motion.
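
  The two curve clean-up steps above (lower precision, drop constant curves) come down to simple operations on keyframe values. The sketch below works on a plain float array and is only an assumption about how such a tool might look: CurveTools is a hypothetical name, and a real tool would read and write the curves through Unity's editor APIs.

```csharp
using System;

// Hypothetical helpers operating on the keyframe values of one curve.
public static class CurveTools
{
    // Rounds every keyframe value in place to the given number of decimals.
    public static void ReducePrecision(float[] values, int decimals = 3)
    {
        for (int i = 0; i < values.Length; i++)
            values[i] = (float)Math.Round(values[i], decimals);
    }

    // A curve whose values never move away from the first key (within a
    // tolerance) carries no animation and is a candidate for removal.
    public static bool IsConstant(float[] values, float tolerance = 1e-4f)
    {
        for (int i = 1; i < values.Length; i++)
        {
            if (Math.Abs(values[i] - values[0]) > tolerance)
                return false;
        }
        return true;
    }
}
```

  A scale curve with the values 1, 1, 1 is reported as constant, while a value such as 1.23456789 is stored as approximately 1.235 after rounding.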

4. Font

You can use a font subsetting tool to strip rarely used characters and reduce the font's size.

5. AssetBundle

This part is mainly about resource management at the macro level, that is, whether loaded AB packages and Assets are managed well.

  • Reference counting: keep a count of the references to each AB and its Assets, so you know when they are no longer in use.
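
  A reference counter for bundles can be very small. The sketch below is one possible shape and not the article's actual implementation: BundleRefCounter is a hypothetical class keyed by bundle name, and the caller is expected to unload the bundle itself when Release reports that the count reached zero.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reference counter keyed by bundle name.
public sealed class BundleRefCounter
{
    private readonly Dictionary<string, int> counts = new Dictionary<string, int>();

    // Registers one more user of the bundle and returns the new count.
    public int Retain(string bundleName)
    {
        counts.TryGetValue(bundleName, out int count);
        count++;
        counts[bundleName] = count;
        return count;
    }

    // Drops one user. Returns true when nobody uses the bundle any more,
    // which is the caller's signal to unload it.
    public bool Release(string bundleName)
    {
        if (!counts.TryGetValue(bundleName, out int count))
            throw new InvalidOperationException("Release without matching Retain: " + bundleName);

        count--;
        if (count == 0)
        {
            counts.Remove(bundleName);
            return true;
        }
        counts[bundleName] = count;
        return false;
    }

    public int CountOf(string bundleName) =>
        counts.TryGetValue(bundleName, out int count) ? count : 0;
}
```

  Every load path calls Retain and every release path calls Release, so a bundle shared by two UI panels is only unloaded when the second panel lets go of it.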

Mono memory optimization

  This part is similar to the GC optimization above, and the essence is the same: reducing heap allocations reduces, to some extent, how often GC is triggered.

  • Object pool: manage objects that are instantiated and destroyed at high frequency with an object pool, which cuts instantiation cost and reuses memory.
  • Consider calling GC.Collect yourself to collect garbage proactively.
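
  A minimal generic pool looks roughly like the sketch below. SimplePool is a hypothetical name chosen to avoid confusion with built-in pools; newer Unity versions ship their own in the UnityEngine.Pool namespace, which is usually the better choice when available. The create and reset delegates are supplied by the caller.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal object pool for reference types.
public sealed class SimplePool<T> where T : class
{
    private readonly Stack<T> free = new Stack<T>();
    private readonly Func<T> create;
    private readonly Action<T> reset;

    // create: builds a new instance when the pool is empty.
    // reset: optional clean-up applied when an instance is returned.
    public SimplePool(Func<T> create, Action<T> reset = null)
    {
        this.create = create ?? throw new ArgumentNullException(nameof(create));
        this.reset = reset;
    }

    public int FreeCount => free.Count;

    // Reuses a returned instance if one is available, otherwise creates one.
    public T Get() => free.Count > 0 ? free.Pop() : create();

    // Returns an instance to the pool for later reuse.
    public void Release(T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        reset?.Invoke(item);
        free.Push(item);
    }
}
```

  After a warm-up phase, Get and Release stop allocating, so objects that used to be created and destroyed every few frames no longer add to the managed heap.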

Rendering optimization

UGUI optimization

  UGUI optimizations are many and varied, but the principles behind them are similar: batch as many meshes together as possible and reduce frequent rebuilds.

  • Atlases: the atlas strategy is very important. Ideally all the textures of one large UI feature are packed into a single atlas, and common UI textures go into a shared atlas, so that a large UI uses only two atlases at a time.
  • Atlas utilization: if an atlas has a lot of empty space, consider trimming the transparent areas of some textures or cutting long textures in two.
  • Separate static and dynamic elements so that frequently moving UI elements do not keep triggering rebuilds for the entire Canvas.
  • For an Image component with an empty Sprite that only needs to receive click events, use EmptyRayCast instead so that no vertex data is submitted for drawing.
  • Watch the Z value of UI objects: if it is not 0, it can break batching.
  • Watch the materials: elements that do not use the UI Default material should overlap other UI as little as possible, because the overlap breaks batching.
  • Masks: prefer RectMask2D over Mask. Mask uses the stencil buffer and adds two DrawCalls, whereas RectMask2D only clips.
  • Keep the UI hierarchy as simple as possible. Rebatching has to traverse the hierarchy to calculate depth.

GPU optimization

  • The number of screens should not be too high.
  • Use LOD.
  • Lower the precision of shader variables where possible, for example use half instead of float.
  • Enable occlusion culling on the camera, and remember to mark the scene objects for it as well.
  • Bake lighting.
  • Be careful with transparent objects; opaque objects should not be placed in the transparent render queue.
  • Control texture resolution: in a 3D game with a top-down view, texture resolution can be reduced somewhat to save bandwidth.
  • Replace shadows with simple textures where possible.
  • Turn off fog with Fog { Mode Off }.

Origin blog.csdn.net/a525324105/article/details/129749078