Summary of DOTS practical skills

[USparkle column] If you have a specialty, enjoy digging into technical problems, and are willing to share and learn from others, we look forward to having you join us. Let sparks of insight collide and intertwine, and let knowledge keep being passed on!

As technology advances, client-side development places more and more emphasis on high performance; after all, the era of highly free-form open-world games is approaching. Against this trend, Unity introduced DOTS, a high-performance technology stack that emphasizes high parallelism and cache friendliness. DOTS 1.0 is currently still a pre-release on the eve of the official version. Although it has many shortcomings, its usage and development approach are basically settled. Apart from supporting more Unity features, the official release is not expected to change the development approach much.

1. Data transfer between systems

In the ECS framework, logic can only be written in systems. In real development there are many scenarios where system A needs to notify system B to do something.

In traditional Unity development, code is largely unconstrained, and there are many ways for system A to notify system B. The simplest is a callback function; a more elegant option is a message system built on the observer pattern.

DOTS, however, is more restrictive. First, callback functions cannot be used, because Burst compilation does not support delegates. Second, the observer pattern does not fit the ECS framework, whose logic is entirely data-oriented: data is divided into groups and processed batch by batch. In other words, the observer pattern is a triggered call, whereas ECS logic is a polling call.

One idea is to define a NativeList member in the ISystem to receive message data passed in from outside, and then take the messages out one by one in OnUpdate for processing. This runs into the following problems.

Problem 1: if the NativeList is defined as a member of the ISystem, other ISystems cannot reach that ISystem instance from their own OnUpdate functions; they can only obtain the corresponding SystemHandle. Could the NativeList be defined as a static variable instead? That leads to Problems 2 and 3.

Of course, you can also call EntityManager.AddComponent(SystemHandle system, ComponentType componentType) to add a component to the system, and other ISystems can then access that system component to pass messages. However, this method can only carry the data of a single component, so it does not scale when the amount of data grows. It is also not one of the techniques this article is about: it is the official standard approach in Unity Entities 1.0 and is already well documented, so I won't go into details here.

Problem 2: the OnUpdate function of an ISystem can only access static container variables marked readonly. If the NativeList is not marked readonly and is accessed in OnUpdate, you get the following error.

Burst error BC1042: The managed class type `Unity.Collections.NativeList`1<XXX>*` is not supported. Loading from a non-readonly static field `XXXSystem.xxx` is not supported

Problem 3: if the static NativeList is marked readonly, it must be initialized where it is declared, and then you get the following error.

(0,0): Burst error BC1091: External and internal calls are not allowed inside static constructors: Unity.Collections.LowLevel.Unsafe.AtomicSafetyHandle.Create_Injected(ref Unity.Collections.LowLevel.Unsafe.AtomicSafetyHandle ret)

So the NativeList idea does not work. Below are some workable techniques explored in our project.

1.1 Packing data into entities for transmission
Let's change our thinking: first create an Entity, organize the information to be transmitted into an IComponentData, and attach it to the Entity, forming individual message entities. Other ISystems can then iterate over these message entities, which achieves data transfer between systems.

This approach applies not only to data transfer between one ISystem and another, but also between a MonoBehaviour and an ISystem, and between an ISystem and a SystemBase.

The following is a concrete example. It defines a MonoBehaviour bound to a UGUI button; its OnClick function is called when the button is clicked.

using Unity.Entities;
using UnityEngine;

public struct ClickComponent : IComponentData
{
    public int id;
}

public class UITest : MonoBehaviour
{
    public void OnClick()
    {
        World world = World.DefaultGameObjectInjectionWorld;
        EntityManager dstManager = world.EntityManager;

        // Create a new Entity every time the button is clicked
        Entity e = dstManager.CreateEntity();
        dstManager.AddComponentData(e, new ClickComponent()
        {
            id = 1
        });
    }
}

The body of OnClick packages the message to be delivered (id = 1) into an Entity and creates it in the default world.

Below is the code to receive the button click message in another ISystem.

[BurstCompile]
public partial struct TestJob : IJobEntity
{
    // Not used yet; it is needed in Section 1.2
    public Entity managerEntity;
    public EntityCommandBuffer ecb;

    void Execute(Entity entity, in ClickComponent c)
    {
        // TODO...
        UnityEngine.Debug.Log("Received a button click message");

        ecb.DestroyEntity(entity);
    }
}

[BurstCompile]
public partial struct OtherSystem : ISystem
{
    [BurstCompile]
    public void OnUpdate(ref SystemState state)
    {
        var ecbSingleton = SystemAPI.GetSingleton<BeginSimulationEntityCommandBufferSystem.Singleton>();
        EntityCommandBuffer ecb = ecbSingleton.CreateCommandBuffer(state.WorldUnmanaged);
        Entity managerEntity = SystemAPI.GetSingletonEntity<CharacterManager>();

        TestJob job = new TestJob()
        {
            managerEntity = managerEntity,
            ecb = ecb
        };
        job.Schedule();
    }
}

Inside the Execute function of the IJobEntity, every button click message is visited. Scheduling the job with job.Schedule() in OnUpdate completes the delivery of the message from the MonoBehaviour to the ISystem.

Once a message has been processed, it is deleted by calling ecb.DestroyEntity.

The following are the results of the operation.

At this point everything works. However, if you are a developer who debugs with the Entities Hierarchy window open all the time, this method makes the window flicker constantly, and it becomes impossible to inspect an Entity's properties in the editor. Is there another way?

1.2 Use DynamicBuffer to receive data
This approach is similar to the NativeList idea, except that a DynamicBuffer is used instead of a NativeList.

We continue from the example in Section 1.1. The goal here is to create a character when the UGUI button is clicked, that is, to write a character system that receives character-creation requests.

public struct CreateCharacterRequest : IBufferElementData
{
    public int objID;
}

public struct CharacterManager : IComponentData { }

[BurstCompile]
public partial struct CharacterSystem : ISystem
{
    [BurstCompile]
    public void OnCreate(ref SystemState state)
    {
        // Create a manager Entity that holds all requests
        Entity managerEntity = state.EntityManager.CreateEntity();
        // Add a tag component so the manager Entity can be looked up as a singleton
        state.EntityManager.AddComponentData(managerEntity, new CharacterManager());
        // Add a DynamicBuffer to receive creation requests
        state.EntityManager.AddBuffer<CreateCharacterRequest>(managerEntity);

        state.EntityManager.SetName(managerEntity, "CharacterManager");
    }

    [BurstCompile]
    public void OnUpdate(ref SystemState state)
    {
        DynamicBuffer<CreateCharacterRequest> buffer = SystemAPI.GetSingletonBuffer<CreateCharacterRequest>();

        for (int i = 0; i < buffer.Length; ++i)
        {
            CreateCharacterRequest request = buffer[i];

            // TODO...
            Debug.Log("Creating a character...");
        }
        buffer.Clear();
    }

    /// <summary>
    /// Request the creation of a character (main thread / single worker thread)
    /// </summary>
    /// <param name="request">Request data</param>
    /// <param name="manager">Obtained via SystemAPI.GetSingletonEntity&lt;CharacterManager&gt;()</param>
    /// <param name="ecb">ECB</param>
    public static void RequestCreateCharacter(in CreateCharacterRequest request, Entity manager, EntityCommandBuffer ecb)
    {
        ecb.AppendToBuffer(manager, request);
    }

    /// <summary>
    /// Request the creation of a character (parallel worker threads)
    /// </summary>
    /// <param name="request">Request data</param>
    /// <param name="manager">Obtained via SystemAPI.GetSingletonEntity&lt;CharacterManager&gt;()</param>
    /// <param name="ecb">ECB</param>
    public static void RequestCreateCharacter(in CreateCharacterRequest request, Entity manager, EntityCommandBuffer.ParallelWriter ecb)
    {
        ecb.AppendToBuffer(0, manager, request);
    }
}

The OnCreate function creates a manager entity that carries a DynamicBuffer, which stores the request data coming from other systems.

OnUpdate iterates over all the data in this DynamicBuffer with a for loop and processes each request; for now it simply prints a log line.

After the loop, the processed data is removed from the DynamicBuffer by calling buffer.Clear().

Two functions named RequestCreateCharacter are defined for other ISystems to call. The second parameter, Entity manager, is special: the calling ISystem must obtain it on the main thread, in its OnUpdate function, by calling SystemAPI.GetSingletonEntity<CharacterManager>(). The two functions differ in the third parameter: the first takes an EntityCommandBuffer and the second takes an EntityCommandBuffer.ParallelWriter. In other words, the first function is for the main thread and for jobs executed with Schedule, while the second is for jobs executed with ScheduleParallel.

As a refresher, here is the difference between Run, Schedule, and ScheduleParallel.

  1. Run: runs on the main thread.
  2. Schedule: runs on a worker thread; a given job executes entirely on a single worker thread.
  3. ScheduleParallel: runs on worker threads; the data of different chunks within the same job is distributed to different worker threads. There are many restrictions, for example you cannot write to a container allocated on the main thread.

Now let's see how to send a message to CharacterSystem to request the creation of a character. We go back to the code written in Section 1.1.

[BurstCompile]
public partial struct TestJob : IJobEntity
{
    public Entity managerEntity;
    public EntityCommandBuffer ecb;

    void Execute(Entity entity, in ClickComponent c)
    {
        CharacterSystem.RequestCreateCharacter(new CreateCharacterRequest()
        {
            objID = c.id
        }, managerEntity, ecb);

        ecb.DestroyEntity(entity);
    }
}

[BurstCompile]
public partial struct OtherSystem : ISystem
{
    [BurstCompile]
    public void OnUpdate(ref SystemState state)
    {
        var ecbSingleton = SystemAPI.GetSingleton<BeginSimulationEntityCommandBufferSystem.Singleton>();
        EntityCommandBuffer ecb = ecbSingleton.CreateCommandBuffer(state.WorldUnmanaged);
        Entity managerEntity = SystemAPI.GetSingletonEntity<CharacterManager>();

        TestJob job = new TestJob()
        {
            managerEntity = managerEntity,
            ecb = ecb
        };
        job.Schedule();
    }
}

In OnUpdate, the manager entity is obtained by calling SystemAPI.GetSingletonEntity<CharacterManager>() and handed to the job, which passes it on to RequestCreateCharacter in Execute.

Inside Execute, data of type CreateCharacterRequest is passed to CharacterSystem by calling the RequestCreateCharacter function.

The running results are as follows.

In this way, two methods have been shown for passing data between systems, and even between a MonoBehaviour and a system.
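For jobs scheduled with ScheduleParallel, the same request can be recorded through an EntityCommandBuffer.ParallelWriter. The following is a minimal sketch, not the project's actual code. ParallelTestJob and ParallelOtherSystem are hypothetical names, and instead of the hard-coded sort key 0 used in RequestCreateCharacter, the sketch assumes the sort key comes from the [ChunkIndexInQuery] attribute. It is also assumed to replace OtherSystem rather than run alongside it, since both would consume the same message entities.

```csharp
using Unity.Burst;
using Unity.Entities;

// Hypothetical parallel variant of TestJob; ClickComponent, CreateCharacterRequest
// and CharacterManager are the types defined earlier in this article.
[BurstCompile]
public partial struct ParallelTestJob : IJobEntity
{
    public Entity managerEntity;
    public EntityCommandBuffer.ParallelWriter ecb;

    // Assumption: the chunk index is used as the sort key, so that commands
    // recorded on different worker threads are played back in a stable order.
    void Execute([ChunkIndexInQuery] int sortKey, Entity entity, in ClickComponent c)
    {
        // Append the request to the manager's DynamicBuffer
        ecb.AppendToBuffer(sortKey, managerEntity, new CreateCharacterRequest { objID = c.id });
        // Destroy the message entity once it has been handled
        ecb.DestroyEntity(sortKey, entity);
    }
}

[BurstCompile]
public partial struct ParallelOtherSystem : ISystem
{
    [BurstCompile]
    public void OnUpdate(ref SystemState state)
    {
        var ecbSingleton = SystemAPI.GetSingleton<BeginSimulationEntityCommandBufferSystem.Singleton>();
        EntityCommandBuffer ecb = ecbSingleton.CreateCommandBuffer(state.WorldUnmanaged);

        var job = new ParallelTestJob
        {
            // The singleton entity must be fetched on the main thread
            managerEntity = SystemAPI.GetSingletonEntity<CharacterManager>(),
            ecb = ecb.AsParallelWriter()
        };

        // job.Run();           // main thread
        // job.Schedule();      // a single worker thread
        job.ScheduleParallel(); // chunks distributed across worker threads
    }
}
```

Passing a per-chunk sort key rather than a constant is a design choice: the commands still end up in the same buffer, but their playback order no longer depends on which worker thread happened to record them first.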

2. Simulating polymorphism

Data-oriented design is much harder than object-oriented design. After all, the original intent of object-oriented design is to reduce the mental burden of development by abstracting everything in the world into objects. The original intent of data-oriented design is not to reduce that burden but to make execution more efficient.

So, can the efficiency of data-oriented design be combined with the "foolproof" simplicity of object-oriented design?

The biggest obstacle is that all data in DOTS is stored in structs, and structs support neither inheritance nor polymorphism. In framework design this often constrains the designer severely.

You might think of using an interface, so that a manager class can handle different struct types through it. But DOTS cannot work with interface-typed data: DOTS requires value types, and an interface type does not say whether the underlying object is a value type or a reference type. In practice this idea turns out not to be feasible.

Take a state machine as an example. Its building blocks are individual states. In OOD thinking there is a state base class, and each concrete state class inherits from it. Such a simple design becomes very difficult in DOD.

Here is a technique explored during our project.

If you are familiar with C++, you know about union. C# has something similar: StructLayout and FieldOffset. With this kind of explicit layout, or union-style overlap, class inheritance can be approximated at minimal cost.
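To make the idea concrete before the full state machine, here is a minimal sketch in plain C# with no Unity dependency. Shape, Circle, and Rect are hypothetical types used only for illustration. A tag field selects which of the overlapping payloads is valid, and a switch on the tag plays the role of a virtual call.

```csharp
using System;
using System.Runtime.InteropServices;

public enum ShapeKind { Circle, Rect }

public struct Circle { public float radius; }
public struct Rect { public float width; public float height; }

// The tag sits at offset 0; both payloads start at offset 4 and share the same bytes.
[StructLayout(LayoutKind.Explicit)]
public struct Shape
{
    [FieldOffset(0)] public ShapeKind kind;
    [FieldOffset(4)] public Circle circle;
    [FieldOffset(4)] public Rect rect;

    // Dispatch on the tag instead of relying on inheritance.
    public float Area()
    {
        switch (kind)
        {
            case ShapeKind.Circle: return (float)Math.PI * circle.radius * circle.radius;
            case ShapeKind.Rect: return rect.width * rect.height;
            default: return 0f;
        }
    }
}

public static class UnionDemo
{
    public static void Main()
    {
        var s = new Shape { kind = ShapeKind.Rect, rect = new Rect { width = 2f, height = 3f } };
        Console.WriteLine(s.Area());        // 6
        // circle.radius overlaps rect.width, so it reads back the same bytes.
        Console.WriteLine(s.circle.radius); // 2
    }
}
```

Only the payload selected by the tag is meaningful; reading the other one simply reinterprets the same memory, which is why the tag must always be kept in sync with the data.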

Below is the definition of the state base class.

using System;
using System.Runtime.InteropServices;
using Unity.Entities;

// State enumeration
public enum StateType
{
    None,
    Idle,
    Chase,
    CastSkill,
    Dead,
}

/// <summary>
/// Every state implementation must implement IBaseState
/// </summary>
public interface IBaseState
{
    void OnEnter(Entity entity, ref StateComponent self, ref StateHelper helper);

    void OnUpdate(Entity entity, ref StateComponent self, ref StateHelper helper);

    void OnExit(Entity entity, ref StateComponent self, ref StateHelper helper);
}

/// <summary>
/// Component that stores the state implementations
/// </summary>
[Serializable]
[StructLayout(LayoutKind.Explicit)]
public struct StateComponent : IBufferElementData
{
    [FieldOffset(0)]
    public StateType stateType;

    [FieldOffset(4)]
    public int id;

    [FieldOffset(8)]
    public NoneState noneState;
    [FieldOffset(8)]
    public IdleState idleState;
    [FieldOffset(8)]
    public ChaseState chaseState;

    public void OnEnter(Entity entity, ref StateComponent self, ref StateHelper helper)
    {
        switch (stateType)
        {
            case StateType.None:
                noneState.OnEnter(entity, ref self, ref helper);
                break;
            case StateType.Idle:
                idleState.OnEnter(entity, ref self, ref helper);
                break;
            case StateType.Chase:
                chaseState.OnEnter(entity, ref self, ref helper);
                break;
        }
    }

    public void OnUpdate(Entity entity, ref StateComponent self, ref StateHelper helper)
    {
        switch (stateType)
        {
            case StateType.None:
                noneState.OnUpdate(entity, ref self, ref helper);
                break;
            case StateType.Idle:
                idleState.OnUpdate(entity, ref self, ref helper);
                break;
            case StateType.Chase:
                chaseState.OnUpdate(entity, ref self, ref helper);
                break;
        }
    }

    public void OnExit(Entity entity, ref StateComponent self, ref StateHelper helper)
    {
        switch (stateType)
        {
            case StateType.None:
                noneState.OnExit(entity, ref self, ref helper);
                break;
            case StateType.Idle:
                idleState.OnExit(entity, ref self, ref helper);
                break;
            case StateType.Chase:
                chaseState.OnExit(entity, ref self, ref helper);
                break;
        }
    }
}

The StateComponent structure is declared with the two attributes mentioned above, StructLayout and FieldOffset, which control its memory layout.

Its stateType member, of type StateType, indicates which implementation struct is actually stored in the StateComponent. For example, if stateType is StateType.Chase, then the chaseState field has been initialized and filled with values, while the noneState and idleState fields are uninitialized.

The implementations of OnEnter, OnUpdate, and OnExit show that the value of stateType decides which object inside the StateComponent is actually in effect. That is, when OnEnter, OnUpdate, or OnExit of StateComponent is called from outside, the call is forwarded to the OnEnter, OnUpdate, or OnExit function of the corresponding IBaseState implementation.

A DynamicBuffer can be used to manage the state on the character, as shown below.

Entity characterEntity = GetCharacter();
state.EntityManager.AddBuffer<StateComponent>(characterEntity);

In this way, object-oriented polymorphism is simulated.

Let's take a closer look at the implementation of the ChaseState structure in order to have a more complete understanding.

using Unity.Entities;
using Unity.Mathematics;

public struct ChaseState : IBaseState
{
    public Entity target;
    public float duration;

    private float endTime;

    public void OnEnter(Entity entity, ref StateComponent self, ref StateHelper helper)
    {
        endTime = helper.elapsedTime + duration;
    }

    public void OnExit(Entity entity, ref StateComponent self, ref StateHelper helper)
    {

    }

    public void OnUpdate(Entity entity, ref StateComponent self, ref StateHelper helper)
    {
        if (helper.elapsedTime >= endTime)
        {
            // Jump to the next state
            return;
        }
        if (!helper.localTransforms.TryGetComponent(target, out var targetTrans))
        {
            return;
        }
        if (!helper.localTransforms.TryGetComponent(entity, out var selfTrans))
        {
            return;
        }
        float3 dir = math.normalizesafe(targetTrans.Position - selfTrans.Position);
        selfTrans.Position = selfTrans.Position + dir * helper.deltaTime;
        helper.localTransforms[entity] = selfTrans;
    }
}

As the code shows, this is a state that chases a target, and the logic is very simple. OnEnter computes the time at which the chase ends; OnUpdate checks that time and, once the duration of the chase state has elapsed, jumps to the next state. StateHelper holds data gathered in the main-thread OnUpdate function, including various ComponentLookups.

With this simulated polymorphism, many object-oriented design patterns can be applied to data-oriented design. However, the approach has two major drawbacks to keep in mind:

  • Because a union-like structure overlaps the memory of the implementations, the memory footprint of the component (the StateComponent above) is determined by the largest of the implementation structs.
  • Adding a new implementation requires more code. It is advisable to write an editor tool that generates this code.
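The first drawback can be checked directly. The sketch below is plain C# with hypothetical types (SmallState, LargeState, UnionComponent) that mimic the layout of StateComponent; it prints the marshalled sizes so the effect of the largest member on every element is visible.

```csharp
using System;
using System.Runtime.InteropServices;

public struct SmallState { public int value; }
public struct LargeState { public long a, b, c, d; }

// Same layout idea as StateComponent: an 8-byte header, then overlapping payloads.
[StructLayout(LayoutKind.Explicit)]
public struct UnionComponent
{
    [FieldOffset(0)] public int stateType;
    [FieldOffset(4)] public int id;
    [FieldOffset(8)] public SmallState small;
    [FieldOffset(8)] public LargeState large;
}

public static class FootprintDemo
{
    public static void Main()
    {
        Console.WriteLine("SmallState:     " + Marshal.SizeOf<SmallState>() + " bytes");
        Console.WriteLine("LargeState:     " + Marshal.SizeOf<LargeState>() + " bytes");
        // Every UnionComponent pays for the header plus the largest payload,
        // even when it only holds a SmallState.
        Console.WriteLine("UnionComponent: " + Marshal.SizeOf<UnionComponent>() + " bytes");
    }
}
```

If one state is much larger than the others, it may be worth moving its bulky data into a separate component so that the union stays small.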

3. Performance debugging

This section is fairly basic and mainly introduces a few debugging tips that differ from before. Previously most logic was written on the main thread; DOTS uses jobs, so much of the main logic is distributed across worker threads.

3.1 How to use Profiler
Profiling in the editor and on a real device works much as it did before, so I won't go into details. What differs is multi-threaded analysis, which is briefly introduced below.

Let's write two simple jobs and observe their performance.

using Unity.Collections;
using Unity.Entities;
using UnityEngine;

public struct Test1Component : IBufferElementData
{
    public int id;
    public int count;
}

[UpdateAfter(typeof(Test2System))]
public partial class Test1System : SystemBase
{
    protected override void OnCreate()
    {
        Entity managerEntity = EntityManager.CreateEntity();
        EntityManager.AddBuffer<Test1Component>(managerEntity);
    }

    protected override void OnUpdate()
    {
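        // Create a container to bring data back from the worker thread
        NativeList<Test1Component> list = new NativeList<Test1Component>(Allocator.TempJob);

        // Iterate over all elements of the DynamicBuffer and copy them into list, so the main thread can log them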
        Dependency = Entities
            .WithName("Test1Job")
            .ForEach((in DynamicBuffer<Test1Component> buffer) =>
            {
                for (int i = 0; i < buffer.Length; ++i)
                {
                    list.Add(buffer[i]);
                }
            }).Schedule(Dependency);

        // Wait for the worker thread to finish
        CompleteDependency();

        // Print all elements on the main thread
        for (int i = 0; i < list.Length; ++i)
        {
            Test1Component c = list[i];
            Debug.Log("element:" + c.id);
        }
        // Dispose the container
        list.Dispose();
    }
}

The code defines a system named Test1System, and the [UpdateAfter] attribute orders it after the system named Test2System.

At the start of OnUpdate, a NativeList is created and passed to the worker thread, which collects all the elements of the DynamicBuffer into it.

Review: the allocator passed to the container's constructor should be Allocator.TempJob rather than Allocator.Temp, because the container is accessed inside a job. Before the frame ends, remember to wait for the job to finish and Dispose the container. (You could also use WithDisposeOnCompletion on the lambda to release it as soon as the lambda finishes, but here the main thread still needs the container, so the release is deferred.)

The Entities.ForEach call is a simple job that copies all the elements of the DynamicBuffer into the NativeList.

To make the job easier to identify in the Profiler, it is best to give it a name with the WithName function.

The call to CompleteDependency waits for this frame's worker-thread job to finish before the rest of OnUpdate continues.

Everything after that runs on the main thread, just as in traditional development, and needs no further explanation.

public partial class Test2System : SystemBase
{
    public static int index;

    protected override void OnUpdate()
    {
        int id = ++index;

        Entities
            .WithName("Test2Job")
            .ForEach((ref DynamicBuffer<Test1Component> buffer) =>
            {
                // Add an element to the DynamicBuffer
                buffer.Add(new Test1Component()
                {
                    id = id
                });

                // The code below exists purely to add performance cost
                for (int i = 0; i < buffer.Length; ++i)
                {
                    var c = buffer[i];
                    c.count = buffer.Length;
                    for (int j = 0; j < 10000; ++j)
                    {
                        c.count = buffer.Length + j;
                    }
                }
            }).Schedule();
    }
}

The code above defines a system named Test2System that implements a job named "Test2Job". The job does nothing useful; it exists mainly to add performance cost.

Now let's run it and look at the cost of these jobs.

If you look only at the main thread's cost, as past debugging experience suggests, you will find that Test1System accounts for 91.5%, which is clearly unreasonable given the code. At this point you need to open the Timeline view.

From the screenshot, you can see the following information:

  • Test2Job executes before Test1Job, because Test1System carries the attribute [UpdateAfter(typeof(Test2System))].
  • The main cost of Test1Job is JobHandle.Complete, that is, waiting for this frame's worker-thread job to finish.
  • The real cost shows up on the Timeline of worker thread "Worker 0", and comes mainly from the tens of thousands of loop iterations in Test2Job above.

Click the Test2Job block on the Timeline and a tooltip pops up. Click "Show > Hierarchy" in the tooltip to see the detailed cost of this job.

Note that to use breakpoints or the Profiler.BeginSample / Profiler.EndSample functions, the job must call WithoutBurst to opt out of Burst compilation.

Therefore, when analyzing with the Profiler in DOTS, do not look only at the main thread's cost. Look at the cost on the worker threads and at how long the main thread spends waiting as well; there are more factors to weigh in the analysis.

3.2 Consumption of Lookup
First, what does "Lookup" in the section title refer to? In the ECS framework, to access a component or DynamicBuffer through an Entity, Unity provides a set of structs, ComponentLookup and BufferLookup (called ComponentDataFromEntity and BufferFromEntity in older versions of DOTS). In this article, these structs for random access to components are collectively called Lookup.

In traditional Unity development, accessing a MonoBehaviour on a GameObject is simple: just call gameObject.GetComponent. In the ECS framework it is more troublesome.

Suppose we want to implement the following: create 30 small balls and 200 squares; each ball finds the two squares closest to it and bounces back and forth between them. Let's write a first version and look at the Profiler.

protected override void OnUpdate()
{
    float deltaTime = SystemAPI.Time.DeltaTime;
    ComponentLookup<LocalTransform> transforms = SystemAPI.GetComponentLookup<LocalTransform>();

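    // Query the entities of all squares
    NativeArray<Entity> entities = entitiesQuery.ToEntityArray(Allocator.TempJob);

    Entities
        .WithoutBurst()
        .WithName("Test2Job")
        .WithDisposeOnCompletion(entities) // release the TempJob array once the job has finished
        .ForEach((/* ball entity */ Entity entity, int entityInQueryIndex, /* ball movement component */ ref BulletMove move) =>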
        {
            float minDist = float.MaxValue;
            LocalTransform targetTransform = default(LocalTransform);

            // Iterate over all squares to find the one closest to the ball and approach it
            for (int i = 0; i < entities.Length; ++i)
            {
                Entity targetEntity = entities[i];
                // Skip the square that was approached last time
                if (move.lastEntity == targetEntity)
                {
                    continue;
                }
                // Get the ball's position through the Lookup
                if (!transforms.TryGetComponent(entity, out var selfT))
                {
                    continue;
                }
                // Get the square's position through the Lookup
                if (!transforms.TryGetComponent(targetEntity, out var targetT))
                {
                    continue;
                }
                float distance = math.distance(targetT.Position, selfT.Position);
                // Found a square closer to the ball
                if (minDist > distance)
                {
                    minDist = distance;
                    move.targetEntity = targetEntity;
                    targetTransform = targetT;
                }
            }

            if (!transforms.TryGetComponent(entity, out var t))
            {
                return;
            }

            // Move toward the square closest to the ball
            float3 dir = targetTransform.Position - t.Position;
            float3 n = math.normalizesafe(dir);
            t.Position = t.Position + n * deltaTime;

            // Once near the closest square, record it; from the next frame on, the ball no longer approaches this square
            if (math.length(dir) <= 0.5f)
            {
                move.lastEntity = move.targetEntity;
            }

            transforms[entity] = t;
        }).Schedule();
}

The code is simple enough that I won't explain its logic in detail. Note that the for loop calls the TryGetComponent function of ComponentLookup twice per iteration. There are 200 squares in this example, so TryGetComponent is called about 400 times per ball, or roughly 12,000 times per frame for 30 balls. The cost is as follows.

As the Profiler capture shows, these 400 calls per ball cost 2.89ms, which is a large overhead. Let's optimize by reducing the TryGetComponent calls in the loop from 400 per ball to 200.

protected override void OnUpdate()
{
    float deltaTime = SystemAPI.Time.DeltaTime;
    ComponentLookup<LocalTransform> transforms = SystemAPI.GetComponentLookup<LocalTransform>();

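    // Query the entities of all squares
    NativeArray<Entity> entities = entitiesQuery.ToEntityArray(Allocator.TempJob);

    Entities
        .WithoutBurst()
        .WithName("Test2Job")
        .WithDisposeOnCompletion(entities) // release the TempJob array once the job has finished
        .ForEach((/* ball entity */ Entity entity, int entityInQueryIndex, /* ball movement component */ ref BulletMove move) =>
        {
            // Get the ball's position through the Lookup once, before the loop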
            if (!transforms.TryGetComponent(entity, out var t))
            {
                return;
            }

            float minDist = float.MaxValue;
            LocalTransform targetTransform = default(LocalTransform);

            // Iterate over all squares to find the one closest to the ball and approach it
            for (int i = 0; i < entities.Length; ++i)
            {
                Entity targetEntity = entities[i];
                // Skip the square that was approached last time
                if (move.lastEntity == targetEntity)
                {
                    continue;
                }
                // Get the square's position through the Lookup
                if (!transforms.TryGetComponent(targetEntity, out var targetT))
                {
                    continue;
                }
                float distance = math.distance(targetT.Position, t.Position);
                // Found a square closer to the ball
                if (minDist > distance)
                {
                    minDist = distance;
                    move.targetEntity = targetEntity;
                    targetTransform = targetT;
                }
            }

            // Move toward the square closest to the ball
            float3 dir = targetTransform.Position - t.Position;
            float3 n = math.normalizesafe(dir);
            t.Position = t.Position + n * deltaTime;

            // Once near the closest square, record it; from the next frame on, the ball no longer approaches this square
            if (math.length(dir) <= 0.5f)
            {
                move.lastEntity = move.targetEntity;
            }

            transforms[entity] = t;
        }).Schedule();
}

At the very beginning of the job logic, the position of the ball is obtained with a single TryGetComponent call, which avoids fetching it again on every iteration of the for loop. With this optimization the number of TryGetComponent calls is halved, and the cost is as follows.

The cost of the job has dropped from 2.89ms to 1.31ms, an obvious performance improvement!

Building on this optimization, let's think further: if the position of each square is recorded before the job is scheduled and stored in an IComponentData struct named Target, is it still necessary to call TryGetComponent in the job at all? Below is the code based on this idea.

protected override void OnUpdate()
{
    float deltaTime = SystemAPI.Time.DeltaTime;

    // Query the Target components of all blocks
    NativeArray<Target> entities = entitiesQuery.ToComponentDataArray<Target>(Allocator.TempJob);

    Entities
        .WithoutBurst()
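        .WithName("Test2Job")
        // Release the TempJob array once the job has finished, otherwise it leaks
        .WithDisposeOnCompletion(entities)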
        .ForEach((/*ball entity*/Entity entity, int entityInQueryIndex, /*ball movement component*/ref BulletMove move, /*ball transform component*/ref LocalTransform t) =>
        {
            float minDist = float.MaxValue;
            int targetID = 0;
            float3 targetPos = float3.zero;

            // Iterate over all blocks to find the one closest to the ball and approach it
            for (int i = 0; i < entities.Length; ++i)
            {
                Target target = entities[i];
                // Skip the block that was approached last time
                if (move.lastID == target.id)
                {
                    continue;
                }
                float distance = math.distance(target.position, t.Position);
                // Keep track of the block closest to the ball
                if (minDist > distance)
                {
                    minDist = distance;
                    targetID = target.id;
                    targetPos = target.position;
                }
            }

            // Move toward the block closest to the ball
            float3 dir = targetPos - t.Position;
            float3 n = math.normalizesafe(dir);
            t.Position = t.Position + n * deltaTime;

            // Once near the closest block, record it so it is no longer approached from the next frame on
            if (math.length(dir) <= 0.5f)
            {
                move.lastID = targetID;
            }
        }).Schedule();
}

As the code shows, the calls to TryGetComponent have been removed entirely. Let's look at the cost of this implementation.

Not surprisingly, the code efficiency has improved again!

If you look closely, you will notice that the cost of all three versions is displayed in blue on the Timeline. That is because Burst compilation was turned off for ease of debugging. So what happens when we turn Burst compilation on for the last version of the code?

With Burst compilation turned on, the Job turns green on the Timeline and its cost drops again, from 0.718ms to 0.045ms.

The above experiments prove that:

  • Lookup is inefficient when a large number of entities are processed, which is why DOTS does not recommend using it heavily. Judging from the Entities source code, TryGetComponent is just a simple pointer offset with no complicated logic, yet it has a large impact on performance. The reason is that it is random access, which breaks cache friendliness. The optimization technique is to organize the data in other jobs first and then pass it to the current job.
  • The experiment also demonstrates the power of Burst: enable Burst compilation wherever you can.
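The "organize the data first, then pass it on" technique from the first point can be sketched in plain C# as two steps. This is only an illustration under stated assumptions: `TargetData`, `Gather`, and `FindNearest` are hypothetical names, and a `Dictionary` stands in for scattered component storage; in DOTS the gather step would be its own job or system.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Plain-C# mirror of the article's Target data: an id plus a cached position.
public struct TargetData
{
    public int Id;
    public Vector3 Position;
}

public static class GatherDemo
{
    // Step 1 (the "other job"): copy scattered positions into one contiguous array.
    public static TargetData[] Gather(Dictionary<int, Vector3> positionsById)
    {
        var result = new TargetData[positionsById.Count];
        int i = 0;
        foreach (KeyValuePair<int, Vector3> pair in positionsById)
        {
            result[i++] = new TargetData { Id = pair.Key, Position = pair.Value };
        }
        return result;
    }

    // Step 2 (the hot loop): a linear scan with no per-element lookup.
    // Returns the id of the nearest target, skipping lastId, or -1 if none qualifies.
    public static int FindNearest(TargetData[] targets, Vector3 from, int lastId)
    {
        int bestId = -1;
        float minDistSq = float.MaxValue;
        for (int i = 0; i < targets.Length; ++i)
        {
            if (targets[i].Id == lastId) continue;
            float distSq = Vector3.DistanceSquared(targets[i].Position, from);
            if (distSq < minDistSq)
            {
                minDistSq = distSq;
                bestId = targets[i].Id;
            }
        }
        return bestId;
    }
}
```

The gather step runs once per frame, while the scan runs once per ball, so the random access is paid once rather than inside every ball's inner loop.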

Nothing is absolute, though. In places that are not performance hot spots, you can still use Lookup's TryGetComponent function to simplify the logic. After all, organizing dedicated data also costs memory and development effort.

4. Summary

These are the DOTS techniques I have summarized so far. I hope these tips help newcomers solve some difficult problems, or at least offer a way of thinking about them. That said, I am still at the exploratory stage, so if anything here is mistaken, corrections and discussion are welcome.


This is the 1445th article of Yuhu Technology. Thanks to the author zd304 for the contribution. You are welcome to share it, but please do not reprint it without the author's authorization. If you have any unique insights or discoveries, please contact us to discuss them together.

Author's homepage: https://www.zhihu.com/people/zhang-dong-13-77

Origin blog.csdn.net/UWA4D/article/details/132320709