Reposted from: https://dawnarc.com/2016/12/ue4%E4%BC%98%E5%8C%96%E5%BB%BA%E8%AE%AE%E4%B8%8E%E7%BB%8F%E9%AA%8C/
These notes record issues I have dealt with in my projects. They serve as a memo for myself and are shared so that others can avoid the same detours.
Miscellaneous notes
-
The timings reported by the GPU profiler are not very accurate in editor mode, because the editor's own cost is counted as well. If you profile, do it in Game mode where possible.
-
UE4 does not support a 640x480 resolution. Running the program at this resolution crashes it (seen in version 4.4; I do not know whether the problem still exists in the latest version).
-
If a character has many Components that need to be attached, attach each one only when it is actually used rather than attaching everything at load time. Otherwise, with many characters in the scene, there will be a serious performance problem.
For example: there are hundreds of characters in the scene, but not every character needs a camera and a spring arm, so do not create the camera and spring arm components in the constructor.
-
UE4 is not very sensitive to triangle count, even on mobile. On an iPad 4, with 50 million triangles, the frame rate can hold steady at 30 fps. On mobile, performance is mainly sensitive to texture size and material complexity.
Code compiler optimization
-
C++ is 100 to 1000 times faster than Blueprint.
[Test] Blueprint vs C++ performance vs Nativized BP
https://www.reddit.com/r/unrealengine/comments/6qtxy3/test_blueprint_vs_c_performance_vs_nativized_bp/
Thanks to Zhihu user 金木研 for the correction: the results above were measured in editor mode (without Blueprint nativization). If Blueprints are converted to native code when packaging, the performance gap between Blueprint and C++ is no more than a factor of two. Before roughly 4.18 (I forget the exact version), the Blueprint-to-native conversion was far from optimal: an addition, for example, was translated into a separate function call, which made the call stack very deep. Epic has kept improving the conversion since then, and the current version is well optimized.
-
When transforming vectors in C++, prefer FTransform::TransformXXX() and FTransform::InverseTransformXXX() over FQuat::RotateVector() and FQuat::UnrotateVector(). The former make heavier use of the vector instructions supported by current hardware and so get more out of it. The latter are written as cross-platform C++ that evaluates the formula directly; they also end up calling hardware instructions, but relatively few.
UE4 handles the choice for you: when you call FTransform::TransformXXX(), the hardware instructions are used if the current hardware supports them, and otherwise the call falls back to FQuat::RotateVector().
-
VS2019 improves C++ compilation speed and optimizes more deeply for the CPU's AVX/AVX2 vector instruction sets. Microsoft's Xbox ATG engineering team benchmarked it using UE4's Infiltrator demo:
- Compilation speed: for full builds, VS2019 (16.2) is 3.5 times as fast as VS2017 (15.9); for incremental builds, it is 1.6 times as fast.
- Code optimization: with in-game frame rate as the measure, VS 16.2 is 2% to 3% faster than 16.0, and 16.0 is up to 2.8% faster than 15.9. This means code compiled with VS2019 can run the game at a frame rate about 5% higher than code compiled with VS2017.
For more details, see the Microsoft C++ Team Blog.
Code and algorithm optimization (UE4 API-related only)
-
When emptying a TArray that will continue to be used, call Reset() instead of Empty(), because Reset() does not release the allocated memory.
-
When removing elements from a TArray, if the order of the elements does not matter, use RemoveAtSwap() instead of RemoveAt(). The former fills the memory hole (the invalid slots left by the removed elements) with elements from the end of the array, while the latter shifts every element after the hole forward. In time complexity, the former is O(RemovedCount) and the latter is O(ArrayNum).
-
TSoftObjectPtr uses the RTTI dynamic_cast internally. This operation is very expensive, so avoid using it at scale at run time, especially on the server. If you have to use it frequently at run time, my own approach is to build a map as a resource-loading cache: the key is the resource path, and the value is a custom struct that holds the object pointer and a resource type enumeration. When an entry is retrieved, cast it with static_cast according to the enumeration value.
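To make the RemoveAt() versus RemoveAtSwap() trade-off above concrete, here is a minimal standalone sketch. It uses std::vector rather than TArray so that it compiles without the engine; the function names RemoveAtOrdered and RemoveAtSwapLike are hypothetical and only model the behaviour described above, they are not engine APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Ordered removal: every element after Index shifts forward by one slot,
// so the cost grows with the number of elements behind the hole.
// Models the behaviour described for TArray::RemoveAt().
template <typename T>
void RemoveAtOrdered(std::vector<T>& Items, std::size_t Index)
{
    assert(Index < Items.size());
    Items.erase(Items.begin() + static_cast<std::ptrdiff_t>(Index));
}

// Swap removal: the last element is moved into the hole, so the cost is
// constant, but the relative order of the remaining elements changes.
// Models the behaviour described for TArray::RemoveAtSwap().
template <typename T>
void RemoveAtSwapLike(std::vector<T>& Items, std::size_t Index)
{
    assert(Index < Items.size());
    if (Index + 1 != Items.size())
    {
        Items[Index] = std::move(Items.back());
    }
    Items.pop_back();
}
```

Removing index 1 from {10, 20, 30, 40, 50} gives {10, 30, 40, 50} with the ordered version and {10, 50, 30, 40} with the swap version, which is why the swap form is only suitable when element order is irrelevant.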
Visibility culling
-
Enable Occlusion Culling (Project Settings -> Engine -> Rendering -> Occlusion Culling; it is on by default). If necessary, cull more aggressively by screen-size ratio to improve rendering efficiency (the cost is that the visual result may not be what you expect) by raising the following property values:
Min Screen Radius for Lights
Min Screen Radius for Early Z Pass
Min Screen Radius for Cascaded Shadow Maps
-
Use Cull Distance Volume for fine-grained distance culling. The Occlusion Culling options in Project Settings can only control culling through a single threshold, whereas Cull Distance Volume lets you set up multiple levels of culling parameters.
Lighting optimization
-
The three kinds of dynamic light, ordered by cost from low to high:
Directional Light < Point Light < Spot Light.
Once the number of lights in the scene reaches a certain magnitude, the performance gap between the three can be an order of magnitude. As for whether Point Light or Spot Light is actually the more expensive, I could not find a clear statement in the official UE4 documentation. The relative cost of the two may differ between usage scenarios. An early official Unity document describes the GPU cost of the two light types:
Point Light: They have an average cost on the graphics processor (though point light shadows are the most expensive).
Spot Light: They are the most expensive on the graphics processor.
Leaving aside other factors such as memory and considering GPU cost alone, Spot Light is more expensive than Point Light.
http://www.ceeger.com/Components/class-Light.html
Thanks to Zhihu user 刘相敬 for the following advice:
Ignoring shadows, attenuation for a point light is much simpler to compute than for a spotlight, because it depends only on distance. A spotlight has to compute distance attenuation plus inner and outer cone angle attenuation with cos/sin instructions, so its per-pixel cost is considerably higher. In actual use, however, a spotlight illuminates a much smaller area than a point light, so in the vast majority of scenes it ends up cheaper than a point light. This, of course, assumes the lights are culled, as in deferred or cluster-based rendering. Unity's Forward rendering does not cull lights: every pixel is shaded once for each point light and spotlight, so spotlights cost considerably more than point lights there (Forward Add rendering). Unity's default lights also have no inner and outer cone angles and rely on a texture to simulate the falloff, so a Unity spotlight needs one more texture sample than a point light.
-
When building lightmaps, if the scene has no Lightmass Importance Volume, indirect lighting samples are taken across the whole scene to produce the Indirect Lighting Cache. For games with large scenes this is quite wasteful: areas the player character cannot reach, such as distant vistas, do not need an Indirect Lighting Cache. Placing a Lightmass Importance Volume in the scene restricts the Indirect Lighting Cache to the specified area and saves a lot of lighting build time.
-
Avoid enabling Cast Volumetric Shadow on point lights and spotlights where possible; by default only directional lights have this option on. With it enabled, the cost is about three times that with it disabled. When the option is off, shadows are computed with Shadow Mapping; when it is on, with Shadow Volume. Shadows from the former are less precise than those from the latter, but cheaper to compute.
-
If volumetric fog is enabled, it is recommended to change the lights to Static, so that the volumetric fog data is precomputed during Build Lighting. This can significantly improve volumetric fog performance. Volumetric fog is very expensive.
-
If the scene has no Static Lights (only fully dynamic Movable Lights or Stationary Lights), disable Static Lighting to save the overhead associated with it (such as LightMap and ShadowMap calculations). To disable it: Project Settings -> Engine -> Rendering -> Lighting -> uncheck Allow Static Lighting.
When fully dynamic lighting is the performance bottleneck, disabling Static Lighting can improve performance. Test case: in one of my game scenes, lighting was one of the bottlenecks. With r.ScreenPercentage set to 400 as a stress test, the frame rate rose by 20 after disabling Static Lighting. Because the scene had no static lights, disabling it caused no loss in lighting quality.
-
Keep Support Global clip plane for Planar Reflections turned off. It is off by default, and turning it on is very expensive.
-
AO performance optimization. In very large scenes, lighting is usually one of the performance bottlenecks, especially with dynamic lights. Turning AO off at that point can raise the frame rate substantially (AO is on by default; in earlier versions it was off by default). When AO is on (Project Settings -> Engine -> Rendering -> Default Settings -> Ambient Occlusion), the engine uses SSAO (Screen Space Ambient Occlusion) by default. SSAO cannot be precomputed, so its GPU cost is high. Switching to DFAO (Distance Field Ambient Occlusion) can improve performance, because DFAO can be precomputed, at the cost of higher memory use.
How to enable DFAO:
Distance Field Ambient Occlusion
https://docs.unrealengine.com/en-us/Engine/Rendering/LightingAndShadows/DistanceFieldAmbientOcclusion
Two options related to DFAO optimization:
- Compress Mesh Distance Fields: compresses the Distance Fields volume textures to reduce memory usage, at the cost of hitches when Level Streaming is used.
- Eight Bit Mesh Distance Fields: changes the Distance Fields volume texture format from 16-bit to 8-bit, at the cost of coarser-looking AO.
-
If the scene contains a large number of point lights and spotlights and all of them are dynamic, using Distance Field to dynamically call SetVisibility and SetHiddenInGame on the LightComponents can improve performance by 30% to 60%. This conclusion comes from studying the source of a paid Marketplace plug-in, Dynamic Lighting Portal System (Performance Booster). The engine already applies Occlusion Culling to lights, so understanding why SetVisibility and SetHiddenInGame still give such a large boost would require a careful study of UE4's rendering code. My own tentative guess, taking deferred shading as an example: when each light is processed per pixel in screen space, the corresponding calculation is still carried out even if the light has no significant effect on the current pixel, and the rendering layer may not optimize for what the game logic above it knows. When the plug-in hides and disables a light, the per-pixel lighting pass skips that light entirely, which can improve performance considerably.
Shadow optimization
- If you use a non-static Directional Light (Stationary or Movable) and the scene contains a large number of units, be sure to set Dynamic Shadow Distance (the default is 0, meaning off).
Test: 500 Actors on screen, camera height 4000, a Directional Light of type Stationary. Set its Dynamic Shadow Distance StationaryLight property to a value greater than the straight-line distance from the camera to the Actors (note: this is the straight-line distance to each Actor, so set the value generously, for example 5000); otherwise the frame rate drops from 200 fps to 100 fps.
Why setting Dynamic Shadow Distance improves performance:
Dynamic Shadow Distance specifies the distance within which dynamic shadows are used. Beyond this distance the shadows fade into static shadows, and that fade to static shadows is what improves performance.
-
Controlling Cast Shadow in game logic
The light exposes a DistanceField Shadow Distance property to control shadow casting by distance from the camera, but it applies to everything indiscriminately. For example, suppose the performance bottleneck is the shadows cast by a large number of monsters, while the shadows of distant mountains, trees, and buildings have little effect on performance. Using DistanceField Shadow Distance in this case degrades the look of the scenery for no benefit. The recommended approach is to control it in program logic: if the object is a monster, enable its shadow only while the monster is within a certain distance of the camera.
The switch for whether an object casts shadows: void UPrimitiveComponent::SetCastShadow(bool NewCastShadow)
-
Dynamic lights are very expensive. If you want both dynamic lights and acceptable performance, you can lower the shadow quality level: the default is Level 3 (Epic), and it can be changed to Level 2 (High).
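The distance-based shadow switch for monsters described above can be sketched as follows. This is a standalone model under stated assumptions: Vec3, Monster, and UpdateMonsterShadows are hypothetical stand-ins so the logic compiles without the engine, and in a real project the flag change would be applied through UPrimitiveComponent::SetCastShadow on the monster's mesh component.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float X; float Y; float Z; };

// Hypothetical stand-in for a monster actor and its shadow flag.
struct Monster
{
    Vec3 Position;
    bool bCastShadow;
};

inline float DistSquared(const Vec3& A, const Vec3& B)
{
    const float DX = A.X - B.X;
    const float DY = A.Y - B.Y;
    const float DZ = A.Z - B.Z;
    return DX * DX + DY * DY + DZ * DZ;
}

// Enables shadows only for monsters within MaxShadowDistance of the camera.
// Returns how many monsters changed state, so the (comparatively costly)
// engine call is only needed for those.
std::size_t UpdateMonsterShadows(std::vector<Monster>& Monsters,
                                 const Vec3& CameraPos,
                                 float MaxShadowDistance)
{
    assert(MaxShadowDistance >= 0.0f);
    // Compare squared distances to avoid a square root per monster.
    const float MaxDistSq = MaxShadowDistance * MaxShadowDistance;
    std::size_t Changed = 0;
    for (Monster& M : Monsters)
    {
        const bool bShouldCast = DistSquared(M.Position, CameraPos) <= MaxDistSq;
        if (bShouldCast != M.bCastShadow)
        {
            // In UE4 this is where SetCastShadow(bShouldCast) would be called.
            M.bCastShadow = bShouldCast;
            ++Changed;
        }
    }
    return Changed;
}
```

Running this on a timer rather than every frame keeps the bookkeeping itself cheap, while static scenery keeps its shadows untouched.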
Material optimization
-
Material types by performance, from fastest to slowest: Opaque -> Masked -> Translucent.
-
If the scene has a large number of units, for example 500, those units must have material LODs, and translucent materials should be removed as far as possible (for example, drop the translucent effect outright in the last two LOD levels). Otherwise the performance cost grows exponentially.
-
If BasePass shows a high cost in the GPU Visualizer, the cause is in large part high material complexity.
-
Decal cost depends on the number of pixels covered. Never use decals casually for programmatic features; leave them to artists dressing the scene. For example, if the program needs to mark a range and the range is large, do not use a decal; draw lines or use an opaque texture instead. If the scene needs decals in large numbers, create and destroy them dynamically according to visibility. SetVisibility alone is not enough, because hidden decals still carry a large overhead (this may be a bug related to editing in the terrain editor, since the UE4 scene editor has many bugs; especially after an engine upgrade, terrain created in the old version can show all kinds of inexplicable bugs in the new one).
-
Plan the material types for a scene well in advance, and when building the scene choose only from the planned materials. Too many material types on screen increases the number of draw calls. In particular, when artists assemble a scene from assorted materials found online, the number of material types can easily rise sharply.
Performance Guidelines for Artists and Designers
https://docs.unrealengine.com/latest/INT/Engine/Performance/Guidelines/
Vegetation Optimization
-
When editing terrain, use Instanced Static Meshes. Instancing increases GPU cost but can significantly reduce CPU overhead. Note: in practice, instancing cannot be the main way to reduce the number of CPU draw calls, because not everything in a real game scene can be an instanced mesh, and even a screen full of vegetation does not necessarily use instanced meshes. To reduce the number of draw calls, reduce the number of material types and increase material reuse.
-
With a very large number of Instanced Meshes (for example one million), calling RemoveInstance or UpdateInstanceTransform several times in one frame makes the frame rate plummet.
Optimization approach: before operating on the Instanced Meshes, set UHierarchicalInstancedStaticMeshComponent::bAutoRebuildTreeOnInstanceChanges to false, then perform all the operations you need. After the operations, set bAutoRebuildTreeOnInstanceChanges back to true and call BuildTreeIfOutdated(true, false);. This significantly reduces the performance loss caused by operating on a million Instanced Meshes.
-
If vegetation material cost becomes a bottleneck, it is better to increase the triangle count than to use Translucent materials; use Masked only where appropriate. For a blade of grass, for example, model the whole shape out of triangles rather than using a flat card with a Translucent or Masked material.
-
Set an appropriate Cull Distance on Instanced Meshes.
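The batching idea behind bAutoRebuildTreeOnInstanceChanges can be illustrated with a toy model. This is only a sketch of the pattern, not the engine's implementation: InstanceSetModel, RebuildIfOutdated, and RemoveBatch are hypothetical names, and the "rebuild" is reduced to a counter so the difference between rebuilding per operation and rebuilding once per batch is visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an instanced mesh component with an acceleration tree.
class InstanceSetModel
{
public:
    // Mirrors the role of bAutoRebuildTreeOnInstanceChanges in the text above.
    bool bAutoRebuild = true;

    explicit InstanceSetModel(std::size_t Count) : Alive(Count, true) {}

    void RemoveInstance(std::size_t Index)
    {
        assert(Index < Alive.size());
        Alive[Index] = false;
        bDirty = true;
        if (bAutoRebuild)
        {
            Rebuild(); // expensive when it happens once per operation
        }
    }

    // Rebuilds only if something changed since the last rebuild.
    void RebuildIfOutdated()
    {
        if (bDirty)
        {
            Rebuild();
        }
    }

    std::size_t RebuildCount() const { return Rebuilds; }

private:
    void Rebuild()
    {
        ++Rebuilds; // a real tree rebuild would walk all instances here
        bDirty = false;
    }

    std::vector<bool> Alive;
    bool bDirty = false;
    std::size_t Rebuilds = 0;
};

// Applies many removals with automatic rebuilds suspended, then rebuilds once.
inline void RemoveBatch(InstanceSetModel& Set, const std::vector<std::size_t>& Indices)
{
    Set.bAutoRebuild = false;
    for (std::size_t Index : Indices)
    {
        Set.RemoveInstance(Index);
    }
    Set.bAutoRebuild = true;
    Set.RebuildIfOutdated();
}
```

In this model, removing 100 instances one by one triggers 100 rebuilds, while the batched path triggers a single rebuild at the end.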
Physics and collision optimization
-
Set Generate Overlap Events to false on BoxComponents. If you do not need Overlap events, set this property to false; the default is true. Once the number of BoxComponents reaches a certain order of magnitude, the cost with Generate Overlap Events on is twice the cost with it off.
-
If physics is not needed, set Simulate Physics to false.
-
If you do not need Hit events, set Simulation Generates Hit Events to false.
-
If the scene contains many object types (WorldStatic, WorldDynamic, Pawn, and so on) and many objects of each type, configure the Object Responses of the Collision channels carefully, and set every channel that can be Ignore to Ignore. If the object types in the scene are fairly uniform, then even with hundreds of objects of that type, setting Object Responses to Block or even Overlap has no effect on performance.
-
In a large-scale RTS game with a massive number of units in the scene (such as a large swarm of Zerglings in StarCraft 2), avoid UE4's built-in Collision wherever you can; otherwise the frame rate plummets.
My suggestion is to implement a simple custom collision yourself, for example a spherical one: compute the straight-line distance between units to decide whether they collide, and run the check less often, for example every 0.1 second. With a very large number of units, you also need to write an octree, similar to Distance Field, that caches the unit list, to reduce the number of passes over the list when computing distances between units.
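A minimal sketch of the custom spherical collision suggested above. The text proposes an octree; this example swaps in a simpler 2D uniform grid as the spatial cache, which is a substitution on my part, and Unit, UnitGrid, and UnitPair are hypothetical names. It assumes the cell size is at least the largest unit diameter, so any overlapping pair is guaranteed to sit in the same or an adjacent cell.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

struct Unit { float X; float Y; float Radius; };

using UnitPair = std::pair<std::size_t, std::size_t>;

// Uniform grid that buckets unit indices by the cell containing their centre.
class UnitGrid
{
public:
    explicit UnitGrid(float InCellSize) : CellSize(InCellSize)
    {
        assert(CellSize > 0.0f);
    }

    // Call once per check interval (for example every 0.1 s), not every frame.
    void Rebuild(const std::vector<Unit>& Units)
    {
        Cells.clear();
        for (std::size_t I = 0; I < Units.size(); ++I)
        {
            Cells[Key(CellOf(Units[I].X), CellOf(Units[I].Y))].push_back(I);
        }
    }

    // Returns overlapping pairs (first < second), sorted. Must be called with
    // the same Units that were passed to the last Rebuild().
    std::vector<UnitPair> FindOverlaps(const std::vector<Unit>& Units) const
    {
        std::vector<UnitPair> Pairs;
        for (std::size_t I = 0; I < Units.size(); ++I)
        {
            const int CX = CellOf(Units[I].X);
            const int CY = CellOf(Units[I].Y);
            // Only the 3x3 block of cells around the unit can hold overlaps.
            for (int DX = -1; DX <= 1; ++DX)
            {
                for (int DY = -1; DY <= 1; ++DY)
                {
                    const auto It = Cells.find(Key(CX + DX, CY + DY));
                    if (It == Cells.end())
                    {
                        continue;
                    }
                    for (std::size_t J : It->second)
                    {
                        if (J > I && Overlaps(Units[I], Units[J]))
                        {
                            Pairs.emplace_back(I, J);
                        }
                    }
                }
            }
        }
        std::sort(Pairs.begin(), Pairs.end());
        return Pairs;
    }

private:
    // Sphere test on squared distance, avoiding a square root.
    static bool Overlaps(const Unit& A, const Unit& B)
    {
        const float DX = A.X - B.X;
        const float DY = A.Y - B.Y;
        const float R = A.Radius + B.Radius;
        return DX * DX + DY * DY <= R * R;
    }

    int CellOf(float V) const
    {
        return static_cast<int>(std::floor(V / CellSize));
    }

    static std::uint64_t Key(int CX, int CY)
    {
        return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(CX)) << 32)
             | static_cast<std::uint32_t>(CY);
    }

    float CellSize;
    std::unordered_map<std::uint64_t, std::vector<std::size_t>> Cells;
};
```

With the grid in place, each unit is compared only against units in nine cells instead of the whole list, which is the point of caching the unit list in a spatial structure.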
Animation Optimization
-
Open the character Blueprint -> MeshComponent -> Optimization category in the Details panel -> check Enable Update Rate Optimizations.
-
Run Tick and RefreshBoneTransforms only for SkinnedMeshes that are being rendered:
USkinnedMeshComponent::VisibilityBasedAnimTickOption = EVisibilityBasedAnimTickOption::OnlyTickPoseWhenRendered;
The default is AlwaysTickPoseAndRefreshBones, which means Tick and RefreshBoneTransforms run regardless of whether the mesh is rendered (that is, within the visible region).
VisibilityBasedAnimTickOption was originally named SkinnedMeshUpdateFlag, was renamed MeshComponentUpdateFlag in a version before 4.21, and has been called VisibilityBasedAnimTickOption since 4.21.
If the animation Tick is turned off, the Tick event logic inside the Animation Blueprint stops working. If RefreshBoneTransforms is turned off, bone transform logic such as Transform (Modify) Bone stops working. AnimNotify is not affected by this option.
-
In Animation Blueprint logic, access member variables directly wherever possible. The engine enables an optimization by default: member variables of the Animation Blueprint are copied in native code at compile time, which avoids entering the Blueprint Virtual Machine at run time to execute Blueprint code, since the Blueprint VM is slow.
The parameter types optimized by default at compile time include:
Member variables;
Negated boolean member variables;
Members of a nested structure;
For details, see the official documentation:
Animation Optimization
https://docs.unrealengine.com/en-us/Engine/Animation/Optimization
Animation Fast Path Optimization
https://docs.unrealengine.com/en-us/Engine/Animation/Optimization/FastPath
UI optimization
-
Anything that can be done with HUD should not use UMG. Create Widget objects only when they need to be displayed and destroy them when they are not, because a large number of UMG objects is very expensive.
For example, if a scene has a thousand units and each unit creates a WidgetComponent, the GPU incurs a huge overhead even if those WidgetComponents display nothing.
-
Do not use UMG to replace the mouse cursor. For display logic that needs a fast response, a cursor made with UMG shows a clearly visible delay (which shows how expensive UMG is). Use Hardware Cursors instead of a UMG cursor.
Epic Games engineers share: how to optimize UI in UE4 on mobile platforms
http://youxiputao.com/articles/11743
Movement optimization
- With a large number (for example 500) of moving Pawn units, calling AddMovementInput in Tick halves the frame rate outright (for example from 90 fps down to a little over 40). For units that do not need to move, it is best to stop calling AddMovementInput() to improve performance.
Special effects Optimization
- Avoid using Volumes where possible; they significantly increase GPU overhead. You can use profilegpu to measure the cost of a Volume.
AI Optimization
- If a character does not need a Controller, do not spawn one for it. If a character stays idle for a long time, call UnPossessed() on it, and call PossessedBy() again once it can move.
Test: with 500 characters and AI Controller Class set to null, AIController, and PlayerController, the frame rates were 120 fps, 100 fps, and 75 fps respectively.
Dedicated Server Optimization
-
Strip animation data when cooking the server
Project Settings -> Engine -> Animation -> check Strip Animation Data on Dedicated Server.
If Notify events in your animations trigger data changes, checking this option causes problems. Make sure the Notifies attached to animations are purely presentational and do not take part in game logic.
-
Disable physics simulation for characters in Server mode
Set FBodyInstance->bSimulatePhysics to false. The default is false.
Set SkeletalMeshComponent::bEnablePhysicsOnDedicatedServer to false; the default is true. Note that this makes physics checks rely on the client, which carries a risk of cheating. Changing bEnablePhysicsOnDedicatedServer at run time has no effect.
-
Disable Collision in Server mode
Set UPrimitiveComponent->bGenerateOverlapEvents to false. For the CollisionComponent of a character Blueprint the default is true.
-
In Server mode, detach all purely decorative components from the character.
-
Do not change the Root Motion Mode of the AnimInstance to Root Motion from Everything. Keep the default value, Root Motion from Montages Only, wherever possible, to reduce the amount of animation synchronization work on the server.
-
4.20 introduced a new feature for Dedicated Server optimization: Replication Graph. As a first approximation it can be understood as LOD for network communication (although Replication Graph offers more than network-level LOD; see the official documentation for the other features). Without this feature, when the server calls a Multicast function, players several kilometers away receive the Multicast too, even though such distant players may not need the data updated immediately and could wait until they come within view, at which point you would call ForceNetUpdate() manually. With Replication Graph, the engine can manage this kind of manual optimization itself.
Reference material
Performance and Profiling
https://docs.unrealengine.com/en-us/Engine/Performance
Achieve good performance and high-quality visual effects in UE4 by optimizing
http://gad.qq.com/program/translateview/7160166
CPU Profiling
https://docs.unrealengine.com/en-us/Engine/Performance/CPU
GPU Profiling
https://docs.unrealengine.com/en-us/Engine/Performance/GPU
Unreal Engine 4 Optimization Tutorial, Part 1-4
https://software.intel.com/en-us/articles/unreal-engine-4-optimization-tutorial-part-1
https://software.intel.com/en-us/articles/unreal-engine-4-optimization-tutorial-part-2
https://software.intel.com/en-us/articles/unreal-engine-4-optimization-tutorial-part-3
https://software.intel.com/en-us/articles/unreal-engine-4-optimization-tutorial-part-4
Optimizing and Profiling Games with Unreal Engine 4
http://vincentloignon.com/blog/optimizing-and-profiling-games-with-unreal-engine-4/
Dynamic Lighting Portal System (Performance Booster)
https://www.unrealengine.com/marketplace/dynamic-lighting-portal-system-performance-booster
Performance Optimization: Shadows Triggering Zones
https://www.unrealengine.com/marketplace/performance-optimization-shadows-triggering-zones
Unreal Insights(New feature in v4.23)
https://www.youtube.com/watch?v=TygjPe9XHTw
Virtual Texturing(New feature in v4.23)
https://www.youtube.com/watch?v=fhoZ2qMAfa4