Unity Performance Optimization

1. Optimization direction

1. Vertex optimization

    (1) Optimize geometry: reduce the number of triangles in the model and reuse vertices as much as possible.

    (2) Use LOD (level of detail) techniques.


    (3) Use occlusion culling technology
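As a rough illustration of item (2), LOD selection comes down to picking a coarser mesh variant as an object gets farther from the camera. The sketch below is engine-agnostic Python, not Unity code; the distance thresholds and the `select_lod` name are assumptions made for this example (Unity's LOD Group component switches on screen-relative height rather than raw distance).

```python
import bisect

# Hypothetical distance thresholds (in world units) at which the
# next, coarser LOD level takes over. Index 0 is the most detailed mesh.
LOD_DISTANCES = [10.0, 30.0, 80.0]


def select_lod(distance, thresholds=LOD_DISTANCES):
    """Return the LOD index for an object at `distance` from the camera.

    0 = full detail; len(thresholds) = beyond the last threshold,
    which a real engine would typically treat as "culled".
    """
    return bisect.bisect_right(thresholds, distance)


if __name__ == "__main__":
    for d in (5.0, 10.0, 45.0, 200.0):
        print(d, "->", select_lod(d))
```

Keeping the thresholds in a sorted list makes the lookup a binary search, so it stays cheap even when it runs for many objects every frame.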

2. Pixel optimization

(1) The focus of pixel optimization is reducing overdraw, which means the same pixel is drawn multiple times. The key is to control the drawing order.

Opaque objects should be drawn from front to back whenever possible. This reduces overdraw because of the depth test: once a nearer surface has been written, fragments of farther surfaces at the same pixel fail the test and are not drawn.

In Unity, objects whose shader uses the "Geometry" queue are always drawn from front to back, while objects in the other fixed queues (such as "Transparent" and "Overlay") are drawn from back to front. The skybox, for example, covers almost every pixel and is always behind everything else, so its queue can be set to "Geometry+1"; it is then drawn after the other opaque objects and does not cause overdraw.

(2) Hand GUI drawing and 3D scene drawing to different cameras, and keep the view of the camera that renders the 3D scene from overlapping the GUI as far as possible.

(3) Reduce real-time lighting: if a scene contains three per-pixel point lights and uses a per-pixel lighting shader, draw calls are likely to triple and overdraw increases as well. This is because, for each per-pixel light, the objects it illuminates are rendered again. Worse, neither dynamic batching nor static batching can batch these extra per-pixel passes, so they break batching.

(4) Use lightmaps

Bake the lighting information of the scene into a lighting texture in advance; at runtime the shader only needs to sample that texture.
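The front-to-back argument in item (1) can be checked with a small simulation of a single pixel. This is a sketch in plain Python, not Unity code; it assumes the GPU rejects fragments that fail the depth test before shading them, and the function name and depth values are made up for the example.

```python
def count_shaded_fragments(depths):
    """Count how many fragments pass a "less than" depth test at one pixel.

    `depths` lists the depth of each opaque surface covering the pixel,
    in the order the surfaces are drawn (smaller = closer to the camera).
    Every fragment that passes the test is shaded and written, so the
    return value is the number of pixel shader runs for this pixel.
    """
    depth_buffer = float("inf")   # cleared depth buffer
    shaded = 0
    for depth in depths:
        if depth < depth_buffer:  # depth test passes
            depth_buffer = depth
            shaded += 1
    return shaded


surfaces = [1.0, 2.5, 4.0, 9.0]   # four opaque surfaces over one pixel

front_to_back = count_shaded_fragments(sorted(surfaces))
back_to_front = count_shaded_fragments(sorted(surfaces, reverse=True))
```

With four overlapping surfaces, drawing front to back shades the pixel once, while drawing back to front shades it four times; the difference is the overdraw that sorting removes.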

3. CPU optimization

Reduce Draw Calls

        (1) Batching: objects that use the same material differ only in their vertex data, that is, in the meshes they use. We can merge this vertex data and send it to the GPU together as a single batch.

        Dynamic batching: Unity batches objects dynamically when they use the same material and meet some additional conditions. A mesh cannot exceed 900 vertex attributes, and the x, y, and z scale must be uniform. Objects that use lightmaps are not batched, multi-pass shaders break batching, and objects that receive real-time shadows are not batched either.

        Static batching: tick the Static flag and open the drop-down next to Static; it actually controls several settings, and the only one needed here is "Batching Static". The drawback is memory: if several objects shared one mesh before static batching (for example, two identical boxes), each object gets its own copy of the mesh in the combined data sent to the GPU, so one mesh becomes several.

 (2) Merge textures (atlas): combine as many small textures as possible into one large texture (an atlas).
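To make the effect of batching concrete, the sketch below groups a made-up scene by material and compares draw call counts. It is plain Python for illustration only: the scene data and the `batch_by_material` name are assumptions, and the extra conditions a real engine imposes (vertex limits, scale, lightmaps, shadows) are deliberately not modeled.

```python
from collections import defaultdict

# Hypothetical scene description: (object name, material name, vertex count).
scene = [
    ("crate_a", "wood", 24),
    ("crate_b", "wood", 24),
    ("barrel", "wood", 120),
    ("lamp", "metal", 200),
    ("fence", "metal", 64),
    ("flag", "cloth", 40),
]


def batch_by_material(objects):
    """Merge objects that share a material into one batch each."""
    batches = defaultdict(lambda: {"objects": [], "vertices": 0})
    for name, material, vertex_count in objects:
        batch = batches[material]
        batch["objects"].append(name)
        batch["vertices"] += vertex_count
    return dict(batches)


batches = batch_by_material(scene)
draw_calls_unbatched = len(scene)     # one draw call per object
draw_calls_batched = len(batches)     # one draw call per material
```

This also shows why texture atlases matter: objects can only land in the same batch if they share a material, and sharing a material usually requires sharing a texture.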

4. Bandwidth optimization

(1) Reduce the texture size and tune its import parameters in the texture's Advanced panel. The main optimization-related options are "Generate Mip Maps", "Max Size", and "Format".

"Generate Mip Maps" creates a chain of progressively smaller copies of the texture, forming a texture pyramid. At runtime the level to sample is chosen according to the distance from the object. The extra levels take more memory, roughly one third more than the base texture.

"Max Size" sets the maximum width and height of the texture. If the source texture exceeds this value, Unity shrinks it to fit.

"Format" sets the compression mode used by the texture. Usually the automatic mode is enough, and Unity picks a suitable compression format for each platform.
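The memory effect of these three options is simple arithmetic, sketched below in plain Python. The `texture_bytes` helper is hypothetical and simplified: it ignores the minimum block sizes of real block-compressed formats and only estimates size from width, height, and bits per pixel.

```python
def texture_bytes(width, height, bits_per_pixel, mipmaps=False):
    """Approximate GPU memory for a texture, in bytes.

    With `mipmaps=True` every level down to 1x1 is added; each level
    halves width and height (never going below 1 texel).
    """
    total = 0
    while True:
        total += width * height * bits_per_pixel // 8
        if not mipmaps or (width == 1 and height == 1):
            return total
        width, height = max(1, width // 2), max(1, height // 2)


base = texture_bytes(1024, 1024, 32)                  # uncompressed RGBA32
with_mips = texture_bytes(1024, 1024, 32, mipmaps=True)
half_size = texture_bytes(512, 512, 32)               # "Max Size" = 512
compressed = texture_bytes(1024, 1024, 4)             # a 4 bits/pixel format
```

Under these assumptions a 1024x1024 RGBA32 texture takes 4 MiB, mipmaps raise that by about one third, halving "Max Size" cuts it to one quarter, and a 4 bits-per-pixel compressed format (such as ETC1 or PVRTC 4bpp) cuts it to one eighth.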

2. Optimize details

1. LOD: run code only when necessary

        Off-screen characters are put to sleep and skip animation updates, skill effects, health bar updates, and so on; only the protagonist keeps running its full logic.

2. Frame limiting, load balancing

Limit the frame rate according to the player's device model.

3. Algorithm

These are optimizations of the code itself: for example, optimizing physics computations, trading space for time, using precomputed lookup tables to speed up calculations, and reducing frequent indexing, Find, GetComponent, and similar operations.
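As one example of trading space for time, the sketch below precomputes a sine table once and replaces later calls with an array lookup. It is plain Python for illustration; the table resolution of one entry per degree and the `fast_sin_deg` name are assumptions, and the result is only as accurate as the rounding to the nearest degree allows.

```python
import math

TABLE_SIZE = 360  # one entry per degree; an assumption chosen for the example

# Pay the cost once up front: precompute sin() for every whole degree.
SIN_TABLE = [math.sin(math.radians(deg)) for deg in range(TABLE_SIZE)]


def fast_sin_deg(degrees):
    """Look up sin() for an angle in degrees, rounded to the nearest degree."""
    return SIN_TABLE[round(degrees) % TABLE_SIZE]
```

The table costs 360 floats of memory; whether the lookup is actually faster than the library call depends on the platform, so it is worth profiling before adopting this in a hot path.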

4. Unity interface

Use GetComponent, AddComponent (which also generates GC), Find, and similar operations as little as possible.

 Even empty callbacks such as OnGUI, FixedUpdate, and Update carry overhead, because the engine still has to call from the C++ layer into the C# layer. Remove them when they are empty.

 Camera.main performs a search for the camera tagged MainCamera. Do not call it frequently when there are many cameras; cache the result instead.

When position and rotation are modified at the same time, use the SetPositionAndRotation() method to set both in one call.

5. Physics

Collision pairs can be reduced through layers. Prefer BoxCollider over MeshCollider. Do not enable raycasting on UI controls that do not need to receive clicks, and use the simplest ray checks that do the job.

6. IL2CPP & C++

Set the Unity scripting backend to IL2CPP; the code is converted to C++, and the runtime efficiency of the C++ build improves considerably.

7. Animation

If Animator.Update or MeshSkinning.Update shows a relatively large overhead in the Profiler, the animation probably needs to be optimized.

Optimizations:

(1) Enable Optimize Game Objects: this removes unused bone nodes. Note that any custom nodes you still need must be dragged to the list of nodes excluded from optimization.

(2) Compression:

Turning on Keyframe Reduction removes many unnecessary keyframes. The larger the value, the higher the compression ratio and the more severe the distortion.

(3) Bone weights:

This setting controls how many bones can affect each vertex. Where quality requirements are low, one bone may be enough. It can be set per model or changed globally at runtime.

(4) BakeMesh:

If you need to display a large number of models on screen at once, you can use SkinnedMeshRenderer.BakeMesh to bake the animation into static meshes so that they can be batched during rendering (skinned animation cannot be batched). Draw calls drop sharply and the skinning calculations are skipped. The drawbacks are higher memory use and extra CPU overhead for dynamic batching, which can leave overall performance worse.

(5) Avoid Animator where possible:

Animator's overhead is an order of magnitude higher than that of the legacy Animation component.

(6) Do not update while invisible: set the culling mode to Cull Completely.

Note, however, that some events also stop firing, which can cause problems if other logic depends on the animation.

(7) Bone LOD, GPU skinning (slower on some devices and in some situations), using Bone instead of CS, etc.

8. UI

UI is also a major cost, generally accounting for 30%-50% of the overhead. For UGUI it shows up in the Profiler as Canvas.BuildBatch and Canvas.SendWillRenderCanvases, which play a role similar to LateUpdate in NGUI. There are many articles on UI optimization, so the points are only listed briefly here.

 (1) Dynamic/static separation:

UI elements are batched together. NGUI rebuilds batches per Panel and UGUI rebuilds them per Canvas, so keep dynamic and static UI on separate Panels or Canvases; otherwise a change in the dynamic UI triggers a rebuild that includes the static UI as well.

 (2) Preloaded, resident, and load-on-demand UI:

Divide the UI by type. Large, frequently used interfaces stall when they are created, so they can be preloaded. When going from the main city to the battle scene, keeping the hero interface, guild interface, and similar screens resident in memory (as long as peak memory stays within budget) speeds up loading; in our measurements, loading speed more than doubled after this optimization. Other, infrequently used interfaces are split into small interfaces, loaded on demand, and unloaded when closed to save memory. Note that too many UI nodes also slow loading: we once had a 10-second load in which UI deserialization took about half the time (measured with textures preloaded). Reduce the number of UI nodes, and split up interfaces that are too large.

 (3) Atlas

Split UI atlases sensibly and distinguish shared atlases (kept resident) from non-shared ones. An atlas that is too large easily leads to redundant loading, excessive memory use, and swapping between system memory and video memory. Atlases that are too small easily cause memory fragmentation and hurt efficiency. The rules are complex.

(4) Memory pool

Use memory pools for frequently created UI elements to reduce creation time and memory fragmentation.

 (5) Activate/Deactivate

Switching UI frequently by activating and deactivating it is not recommended, because it triggers a UI rebuild. Instead you can move the UI off screen or change its Layer. Note, however, that UI moved off screen is still batched and rendered; if it will stay hidden for a long time, deactivating it is the better choice. It depends on the situation.

 (6) Use UISprite instead of UITexture: a UITexture is not batched.

 (7) Do not update UI that is invisible or has not moved, such as names and health bars.

 (8) For Layout Group and Canvas Group components: when any child node changes, its parent nodes are searched with GetComponent to find a Layout Group. These are two major pitfalls of Unity's UGUI.

 (9) Make sure Raycast Target is turned off on elements that do not need to receive input.

 (10) Resource preloading: as with the UI preloading described earlier, preload all resources if memory allows, in combination with memory pools. All monetization logic, characters, monsters, props, and UI in our game are preloaded, with a set of pool expansion and recycling strategies.

 (11) Shader preloading.

9. GC

Garbage collection (GC) is a very expensive operation and the main cause of most stutters. It cannot be fully controlled, so we need to minimize heap allocations in code to keep GC from triggering frequently. We can also trigger GC proactively during loading or at other moments that are not performance sensitive.

(1) Use StringBuilder instead of string concatenation to reduce GC overhead. To change the color of a Text component, modify its color property directly instead of using rich text.

 (2) Memory pools for class objects. Everything that is created and destroyed frequently should use one. There are two goals: to reduce loading, creation, and release time, and to reduce memory fragmentation and GC frequency.

(3) Unity APIs: watch AddComponent, OnGUI, UI rebuild frequency, delegates, and so on. (Some cases, such as foreach and coroutines, have already been optimized by Unity.)

(4) GC optimization of plug-ins: the source code of some plug-ins, such as Behavior Tree and FMODStudio, has been modified to reduce GC.
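The memory pool idea from item (2) can be sketched in a few lines. The example below is plain Python used only to show the pattern; `ObjectPool`, `Bullet`, and `reset_bullet` are hypothetical names, and a Unity version would pool GameObjects or plain C# objects instead.

```python
class ObjectPool:
    """Minimal object pool: reuse released instances instead of
    allocating a new one for every request."""

    def __init__(self, factory, reset):
        self._factory = factory   # creates a new instance when the pool is empty
        self._reset = reset       # restores an instance to a clean state
        self._free = []
        self.created = 0          # how many instances were ever allocated

    def acquire(self):
        if self._free:
            return self._free.pop()
        self.created += 1
        return self._factory()

    def release(self, obj):
        self._reset(obj)
        self._free.append(obj)


class Bullet:  # hypothetical pooled type
    def __init__(self):
        self.position = (0.0, 0.0)
        self.active = False


def reset_bullet(bullet):
    bullet.position = (0.0, 0.0)
    bullet.active = False


pool = ObjectPool(Bullet, reset_bullet)
for _ in range(1000):            # simulate 1000 shots, one bullet alive at a time
    bullet = pool.acquire()
    bullet.active = True
    pool.release(bullet)
```

After the loop only one `Bullet` has ever been allocated, which is the point: the allocation count follows the peak number of live objects rather than the number of requests.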

3. GPU optimization

(1) Draw calls: draw call (DC) optimization generally involves merging materials and meshes, so it is listed under GPU optimization. Rendering an object that has one mesh and one material costs one draw call. You can think of each draw call as switching to a different brush to paint one object on the canvas.

(2) Polygon count

The entire scene is kept under 10 draw calls; meshes are merged at export by the artists or by a plug-in. Static merging increases loading time and memory, and dynamic merging also increases memory and adds CPU overhead for the merge.

(3) LOD

GPU optimization can also be done through LOD: model LOD, bone LOD, particle LOD, material LOD, terrain LOD, and so on. For example, different quality configurations enable different effects, toggle post-processing, etc.

(4) Occlusion

Occlusion culling: as the name suggests, areas that are blocked and cannot be seen, such as objects behind walls, are not rendered. Occlusion culling can be computed on the CPU or the GPU.

UI occlusion: for example, a full-screen UI can hide the scene behind it, which also saves power.

Scene splitting: because our game uses a top-down view, very little can be occluded, so we simply split the scene into chunks.

(5) Translucency

Translucency is expensive and defeats optimizations in the rendering pipeline. Textures that use the alpha channel are also hard to compress. This is especially true of the PVR format on iOS, which loses a great deal of quality once alpha is compressed. In formats such as ETC, DDS, and PVR, one alpha channel takes as much compressed space as the other three channels combined.

Use it sparingly and reduce its area: use translucency as little as possible, and when you do use it, minimize the screen area it covers to reduce pixel fill.

(6) Particles

Reduce the screen area covered, avoid alpha, and merge materials and meshes. LOD: reduce the number of particle emitters and the effect quality according to device model or distance. Sprite sheets (sequence frames): in some top-down games, playing special effects as sprite-sheet animations can also improve efficiency considerably.

(7) Other

  • Render settings: shadows, fog, anti-aliasing, vertical sync, anisotropic filtering, multithreaded rendering, GPU skinning, number of bones affecting each vertex, soft particles, and more. Each project has different requirements.
  • Reduce the rendering resolution: lowering the framebuffer resolution reduces pixel shader overhead and memory, but the image becomes blurrier. Many mainstream games, such as Honor of Kings, lower their resolution on Android.
  • Intelligent dynamic adjustment: adjust the settings in real time according to the player's device and the game situation. On low-end devices or during combat, lower the settings and limit the frame rate; on high-end or plugged-in devices or in low-overhead scenes, raise the settings and the frame limit dynamically. A separate document covers this in more detail.
  • Post-processing: in Unity, you can gauge post-processing overhead by looking at Graphics.Blit in the profiler. Post-processing is generally a per-pixel calculation, and the resolution of mobile devices is often higher than that of PCs, so use it with extra care.
  • Do not use multi/sub-object materials.
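The "intelligent dynamic adjustment" bullet can be sketched as a small controller that watches average frame time and steps a quality level down or up. This is plain Python for illustration; the class name, the 30-frame window, and the 1.1 and 0.7 thresholds are all assumptions rather than values from the source, and mapping a level to actual render settings is left to the project.

```python
class AdaptiveQuality:
    """Step a quality level down when frames are too slow and back up
    when there is headroom. All thresholds are illustrative assumptions."""

    def __init__(self, target_fps=30, levels=3, window=30):
        self.budget_ms = 1000.0 / target_fps
        self.max_level = levels - 1
        self.level = self.max_level      # start at the highest quality
        self.window = window             # frames averaged per decision
        self._samples = []

    def record_frame(self, frame_ms):
        """Feed one frame time; returns the (possibly updated) level."""
        self._samples.append(frame_ms)
        if len(self._samples) < self.window:
            return self.level
        average = sum(self._samples) / len(self._samples)
        self._samples.clear()
        if average > self.budget_ms * 1.1 and self.level > 0:
            self.level -= 1              # over budget: lower quality
        elif average < self.budget_ms * 0.7 and self.level < self.max_level:
            self.level += 1              # plenty of headroom: raise quality
        return self.level
```

Averaging over a window and leaving a gap between the two thresholds keeps the level from flipping back and forth on every frame spike.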

4. Memory optimization

1. Compressed textures (ETC/PVR): textures take up the most memory of any resource. I compress essentially all 3D textures and some 2D UI textures. Prefer the ETC format on Android and the PVR format on iOS; for non-translucent textures the compressed size is 1/8 of the original. Like DDS on DirectX, these two formats can be sampled directly by the graphics hardware, which reduces memory as well as package size and improves loading speed. (Formats such as JPG have a high compression ratio but must be decompressed to 32-bit color before rendering, which costs extra memory, video memory, and decompression time.) The following points need attention:

(1) If the quality after compression is not acceptable, fall back to 16-bit color.

(2) Turn off mipmaps: if the UI or a top-down game does not need mipmaps, turn them off to save the roughly one third of extra memory they add.

(3) Textures compressed with ETC or PVR must have power-of-two dimensions. Considering rendering efficiency and video memory fragmentation, a maximum texture size of 1024 and a minimum of 64 are recommended, and no texture should exceed 2048.

(4) Use nine-slice (Jiugong) and symmetric textures to improve texture reuse.

(5) Shader: use shaders to pack texture channels, produce grayscale versions, and so on. For example, the alpha channel stores the alpha-test mask and highlights, one channel of the texture stores shadows, and another stores AO; alpha data can also be moved into another channel of the texture to make compression easier. This was used heavily when working on Xuanyuan Legend.

(6) Compress animation to reduce keyframes: introduced earlier.

(7) Unload promptly: when entering or leaving a scene, when opening a UI screen, or at other moments that are not performance sensitive, unload resources: call Resources.UnloadAsset to release resources that are no longer referenced, and System.GC.Collect to clean up managed memory, as described earlier. Memory for an AssetBundle is allocated when it is loaded and is only released when it is unloaded; this is also a fairly big pitfall :)

(8) Avoid unnecessary heap allocations at the code level: these can be found through static code analysis, which is described in detail in another article.

(9) Avoid frequently instantiating classes with new: use a memory pool.

(10) String concatenation: reduce string splicing; use StringBuilder and similar tools.

(11) Delegates: because of their internal linked list and boxing/unboxing operations, they generate a lot of GC when used frequently.

(12) Use lambda expressions with care: they are, for example, the reason Unity's particle system generated more GC before version 5.6.

(13) If temporary variables or lists are created frequently, define one shared list and reuse it for every calculation.

(14) Note that classes are allocated on the heap, while structs used as local variables are allocated on the stack. Sometimes a struct can be used instead of a class.

(15) Memory leaks: Unity resources stay alive for as long as they are referenced, so a memory leak is generally a resource that is still held somewhere and therefore cannot be released. The typical symptoms are a clear upward trend in memory and memory that keeps growing when scenes are switched repeatedly. To track this down, you can write your own tool that logs the resources of each scene. You can also use Xcode to analyze memory over time.

(16) Table data: tables are generally not unloaded. Use binary deserialization rather than raw strings, and do not cache multiple copies in memory. If a table is very large, consider releasing it after use.

(17) Redundant vertex data: meshes such as UI meshes are exported with attributes they do not use, such as color and normals. Static merging also increases memory.

(18) Anti-aliasing/RenderTexture: turning on anti-aliasing increases memory, and so does a higher resolution. In post-processing, reuse RenderTextures instead of creating multiple copies.

(19) Number of GameObjects: keep it under 10,000. Too many nodes also slow loading and updating and inflate memory.
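For the leak hunting described in item (15), a resource log is most useful when two snapshots of the same scene are compared. The sketch below is plain Python and covers only the comparison step; the resource names and the `growing_resources` helper are hypothetical, and how the log is produced inside the engine is left out.

```python
from collections import Counter


def growing_resources(before, after):
    """Return {resource name: extra instances} for everything that has
    more live instances in `after` than in `before`."""
    delta = Counter(after)
    delta.subtract(Counter(before))
    return {name: count for name, count in delta.items() if count > 0}


# Hypothetical logs: names of live resources dumped in the same scene,
# before and after a round trip to another scene and back.
first_visit = ["hero.tex", "ui_atlas.tex", "city.mesh"]
second_visit = ["hero.tex", "hero.tex", "ui_atlas.tex", "city.mesh", "boss.tex"]

leaks = growing_resources(first_visit, second_visit)
```

Anything that shows up in the result after a full round trip (here a duplicated `hero.tex` and a `boss.tex` left over from the other scene) is a candidate for a reference that was never released.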

4. Flash memory

Flash memory (storage) cost shows up in package size, resource loading speed, and so on. Optimizing it is mostly the same as optimizing memory, but there are exceptions. For example, using JPG reduces package size at the cost of more CPU work and more memory. Compression works the same way as for memory: textures, animations, data tables, etc.

(1) Code stripping can be enabled on mobile platforms to reduce the memory and storage consumed by code. For classes accessed through reflection, use a link.xml file to keep them from being stripped. Code stripping is an optimization provided by Unity: it analyzes which code paths can execute and removes unused functions.

(2) Some SDKs are very large; you can negotiate with the third parties to remove duplicate libraries.

(3) An IL2CPP build contains two versions of the code, ARMv7 and ARM64, so the code size is doubled.

(4) Dynamic download: similar to a micro client. However, it is not recommended on mobile phones, because it may consume the player's mobile data.

(5) Redundant resources: when building packages, use plug-ins or your own tools to analyze resources.

(6) Procedural generation of textures and meshes: some regular textures and models can be generated by computation, as with the well-known Substance tool. When optimizing terrain in an earlier PC game, only the terrain height was stored and the GPU reconstructed the rest of the terrain information, which reduced the size by about 1/3.

5. Network

1. Shrink and compress packets: reuse and merge fields at the binary level to reduce packet size, and compress protocol packets.

2. Packet combining: combine packets at a fixed frequency, merging several packets into one to reduce both the sending frequency and the header overhead.
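Packet combining can be sketched as length-prefixed framing: several small messages are packed into one payload and split again on the other side. The example is plain Python for illustration; the message contents, the 2-byte length prefix (which limits each message to 65,535 bytes), and the 40-byte header figure (roughly the minimum IPv4 plus TCP header size) are assumptions, not details from the source.

```python
import struct

HEADER_BYTES = 40  # assumed per-packet transport overhead; varies by protocol


def combine(messages):
    """Pack several messages into one payload, each prefixed with a
    2-byte big-endian length so the receiver can split them again."""
    return b"".join(struct.pack(">H", len(m)) + m for m in messages)


def split(payload):
    """Inverse of combine(): recover the original messages."""
    messages, offset = [], 0
    while offset < len(payload):
        (length,) = struct.unpack_from(">H", payload, offset)
        offset += 2
        messages.append(payload[offset:offset + length])
        offset += length
    return messages


pending = [b"move:1,2", b"attack:7", b"chat:hi"]
combined = combine(pending)

bytes_separate = sum(HEADER_BYTES + len(m) for m in pending)
bytes_combined = HEADER_BYTES + len(combined)
```

Under these assumptions the three messages cost 143 bytes when sent separately and 69 bytes when combined, because two of the three headers disappear. The trade-off is latency: a message waits until the next combined send.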

6. Power consumption

1. CPU, GPU, and network optimization.

2. Frame limiting. Mobile games generally limit the frame rate to 30 frames per second.

3. Reduce image quality and effects, reduce update frequency, use LOD, etc.

Origin blog.csdn.net/qq_35647121/article/details/128864129