Unity + Lua performance optimization

The main points to watch for when optimizing Unity + Lua performance

1. Start with the deadly gameobj.transform.position = pos
 
Code such as gameobj.transform.position = pos is perfectly ordinary in Unity, but in ulua, heavy use of this pattern is a very bad idea. Why?
 
Because that single line of code triggers a great many operations. To make this concrete, here are the key Lua API calls and ulua steps the line goes through (based on the ulua + cstolua export; gameobj is of type GameObject and pos is a Vector3):
 
Step one:
GameObjectWrap.get_transform: Lua wants to get the transform from gameobj, corresponding to gameobj.transform
LuaDLL.luanet_rawnetobj: turn the gameobj held in Lua into an id that C# can recognize
ObjectTranslator.TryGetValue: use this id to fetch the C# GameObject from ObjectTranslator
gameobject.transform: after all that preparation, C# finally executes the actual gameobject.transform access
ObjectTranslator.AddObject: assign an id to the transform; this id represents the transform in Lua, and the transform is stored in ObjectTranslator for later lookup
LuaDLL.luanet_newudata: allocate a userdata in Lua and store the id in it; this userdata represents the transform about to be returned to Lua
LuaDLL.lua_setmetatable: attach a metatable to the userdata so that it can be used as transform.position
LuaDLL.lua_pushvalue: return the transform; the calls that follow are cleanup
LuaDLL.lua_rawseti
LuaDLL.lua_remove
 
Step two:
TransformWrap.set_position: Lua wants to assign pos to transform.position
LuaDLL.luanet_rawnetobj: turn the transform held in Lua into an id that C# can recognize
ObjectTranslator.TryGetValue: use this id to fetch the C# Transform from ObjectTranslator
LuaDLL.tolua_getfloat3: read the three float values of the Vector3 from Lua and return them to C#
lua_getfield + lua_tonumber, 3 times: get the x, y, z values and take them off the stack
lua_pop
transform.position = new Vector3(x, y, z): after all that preparation, C# finally executes the assignment transform.position = pos
All of that for a single line of code! In C++, a.b.c = x would, after optimization, amount to nothing more than taking an address and writing to memory. Here, however, the repeated value fetches, stack pushes, and C#-to-Lua type conversions all cost CPU time at every step, and that is before counting the memory allocations involved and the GC that follows!
 
Below we explain step by step that some of this work is unnecessary and can be skipped. In the end the whole thing can be optimized down to:
lua_isnumber + lua_tonumber, 4 times, and nothing else
 
2. Referencing C# objects in Lua is expensive
 
As the example above shows, merely getting a transform from gameobj already carries a high cost. A C# object cannot be handed to C as a plain pointer for direct manipulation (this can in fact be done by pinning with GCHandle, but its performance has not been tested, and pinned objects are not managed by the GC). The mainstream Lua + Unity solutions therefore represent a C# object by an id and map the id to the object through a dictionary. Because the dictionary holds a reference, it also guarantees that the C# object is not garbage collected while Lua still references it.
 
As a result, every time an object is passed as a parameter, the Lua-side id has to be converted back to the C# object, which means a dictionary lookup. Every call to a member method of an object must also find the object first, which means another dictionary lookup.
 
If the object has been used in Lua before and has not been garbage collected, a dictionary lookup is all it takes. But if it turns out to be a new object that Lua has never used, the long series of preparation steps from the example above is required.
 
If the returned object is only used briefly in Lua, things are even worse! The freshly allocated userdata and dictionary entry may be deleted as soon as the Lua reference is garbage collected, and the next time you use the object all of the preparation has to be done again. This leads to repeated allocation and GC, and poor performance.
 
The gameobj.transform in the example is a huge trap: .transform is only returned temporarily and nothing keeps a reference to it afterwards, so Lua frees it quickly. Every later .transform access may therefore mean another allocation and GC.
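The id-plus-dictionary scheme described in this section can be sketched roughly as follows. This is a minimal illustration, not ulua's actual ObjectTranslator: the class name ObjectRegistry and its members are hypothetical, and the real implementation does more work (userdata, metatables, caching).

```csharp
using System.Collections.Generic;

// Hypothetical, simplified stand-in for an id-based object table.
// This is NOT ulua's ObjectTranslator, only a sketch of the idea.
public class ObjectRegistry
{
    private readonly Dictionary<int, object> objects = new Dictionary<int, object>();
    private int nextId = 1;

    // Called when a C# object is first handed to Lua: allocate an id and
    // keep a reference so the object stays alive while Lua uses it.
    public int AddObject(object obj)
    {
        int id = nextId++;
        objects[id] = obj;
        return id;
    }

    // Called on every access from Lua: one dictionary lookup per use.
    public bool TryGetValue(int id, out object obj)
    {
        return objects.TryGetValue(id, out obj);
    }

    // Called when the Lua-side userdata is garbage collected.
    public bool Remove(int id)
    {
        return objects.Remove(id);
    }

    public int Count
    {
        get { return objects.Count; }
    }
}
```

In this sketch, a temporary such as .transform would go through AddObject, then Remove once Lua collects it, then AddObject again on the next access, which is the allocate-and-collect cycle described above.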
 
3. Passing Unity-specific value types (Vector3 / Quaternion, etc.) between Lua and C# is even more expensive
 
Since calling into C# objects from Lua is slow, as described above, performance would basically collapse if every vector3.x had to go through C#. Mainstream solutions therefore implement Vector3 and similar types in pure Lua code: a Vector3 is just an {x, y, z} table, which makes it fast to use in Lua.
 
After doing so, however, the C# and Lua representations of Vector3 are two entirely different things, so passing one as a parameter involves converting between the Lua type and the C# type. For example, when C# passes a Vector3 to Lua, the whole process is:
  1. C# reads the three values x, y, z from the Vector3
  2. The three floats are pushed onto the Lua stack
  3. A table is constructed and its x, y, z fields are assigned
  4. The table is pushed as the return value
A simple parameter pass thus requires three pushes, one table allocation, and three table insertions. You can imagine the performance.
 
So how can this be optimized? Our tests show that passing three floats directly to the function is faster than passing a Vector3.
For example, change void SetPos(GameObject obj, Vector3 pos) to void SetPos(GameObject obj, float x, float y, float z). The concrete effect can be seen in the test data later on; the improvement is quite obvious.
 
4. When passing parameters and return values between Lua and C#, try to avoid the following types:

Severe: Unity value types such as Vector3 / Quaternion, and arrays
Less severe: bool, string, and objects of all kinds
Recommended: int, float, double
 
Although we speak of passing parameters between Lua and C#, there is actually a layer of C sitting between the two (Lua itself is implemented in C, after all). Lua, C, and C# differ in how they represent many data types and in their memory allocation strategies, so data passed among the three often has to be converted (the term is parameter marshalling), and the cost of that conversion varies widely by type.
 
Start with bool and string in the less severe category, which involve the cost of C and C# interop. According to Microsoft's official documentation, C# divides data types into blittable types and non-blittable types. bool and string are non-blittable, meaning that their memory representations differ between C and C#, so a type conversion is needed when passing them from C to C#, which lowers performance. string additionally involves memory allocation (the string is copied onto the managed heap and converted between UTF-8 and UTF-16).
 
See https://msdn.microsoft.com/zh-cn/library/ms998551.aspx for more detailed guidelines on optimizing C and C# interop performance.
As for the severe category, the cost basically comes from the bottleneck that solutions such as ulua hit when mapping Lua objects to C# objects.
The cost of value types such as Vector3 has already been covered above.
Arrays are even worse. An array in Lua can only be represented as a table, which is a completely different thing from a C# array with no direct correspondence, so converting a C# array to a Lua table means copying it element by element. If the elements are objects, strings, and the like, each one has to be converted individually as well.
 
5. Keep the number of parameters small for frequently called functions
 
Whether it is Lua's pushint / checkint or the passing of parameters from C to C#, parameter conversion is the main cost, and it is done one parameter at a time. The performance of a Lua-to-C# call therefore depends heavily on the number of parameters as well as on their types. As a rule of thumb, a frequently called function should take no more than four parameters. Functions with a dozen or more parameters cause a very noticeable performance drop when called frequently: on a phone, a few hundred such calls in one frame can already cost time on the order of 10 ms.
 
6. Prefer exporting static functions, and reduce the use of exported member methods
 
As mentioned earlier, for an object to access a member method or member variable, the binding has to look up the Lua userdata and the C# object reference, or look up the metatable, all of which is time-consuming. Exporting a static function directly reduces this cost.
 
Take obj.transform.position = pos as an example.
Our recommended approach is to write it as a static export function, such as:
class LuaUtil{
  static void SetPos(GameObject obj, float x, float y, float z){obj.transform.position = new Vector3(x, y, z); }
}
 
Then call LuaUtil.SetPos(obj, pos.x, pos.y, pos.z) from Lua. This performs much better, because it avoids returning transform over and over, and it also avoids the Lua GC triggered by those frequent temporary transform returns.
 
7. Note that Lua holding a reference to a C# object prevents the object from being released, which is a common cause of memory leaks
 
As mentioned earlier, when a C# object is returned to Lua, a dictionary associates the Lua userdata with the C# object. As long as the userdata in Lua has not been collected, the dictionary keeps holding a reference to the C# object, so the object cannot be collected either.
 
The most common cases are GameObjects and components: if Lua holds references to them, you will find that they still linger in the Mono heap even after you have called Destroy.
 
However, because this dictionary is the only link between Lua and C#, the problem is not hard to detect: traversing the dictionary makes it easy to spot. In ulua the dictionary lives in the ObjectTranslator class; in slua it is in the ObjectCache class.
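As a rough illustration of traversing such a dictionary, the sketch below counts the referenced objects by type. The LeakReport class and the table parameter are hypothetical: how you actually obtain the dictionary from ObjectTranslator or ObjectCache depends on the binding version, and the sketch assumes no value in the table is null.

```csharp
using System.Collections.Generic;

public static class LeakReport
{
    // Hypothetical helper: counts how many objects of each type are still
    // referenced by an id -> object table. The table parameter stands in
    // for the dictionary kept by the binding layer.
    public static Dictionary<string, int> CountByType(Dictionary<int, object> table)
    {
        var counts = new Dictionary<string, int>();
        foreach (KeyValuePair<int, object> entry in table)
        {
            string typeName = entry.Value.GetType().Name;
            int current;
            // current is left at 0 when the type has not been seen yet.
            counts.TryGetValue(typeName, out current);
            counts[typeName] = current + 1;
        }
        return counts;
    }
}
```

A count that keeps growing for one type across scene changes is a hint that Lua is still holding references. In Unity you could additionally check whether an entry compares equal to null, since a destroyed UnityEngine.Object does so while its managed wrapper is still alive.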
 
8. Consider using only self-managed ids in Lua, instead of referencing C# objects directly
 
One way to avoid the various performance problems caused by Lua referencing C# objects is to assign your own ids to index the objects. The relevant exported C# functions then no longer take the object as an argument, but an int instead.
 
This brings several advantages:
  1. Function calls perform better;
  2. Object lifetimes are managed explicitly instead of letting ulua manage the references automatically. With automatic management, a mistaken reference in Lua prevents the object from being released and leaks memory;
  3. When a C# object is returned to Lua and Lua keeps no reference to it, it is quickly garbage collected and ObjectTranslator's reference to it is deleted. Managing the reference yourself avoids this frequent GC and allocation.
For example, the LuaUtil.SetPos(GameObject obj, float x, float y, float z) above can be optimized further into LuaUtil.SetPos(int objID, float x, float y, float z). We then record the mapping between objID and GameObject in our own code. If possible, use an array rather than a dictionary for this record, since lookups are faster. This saves even more time on Lua-to-C# calls and makes object management more efficient.
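A minimal sketch of such a self-managed id table might look like the following. Everything here is an assumption for illustration: Entity stands in for GameObject so the snippet does not depend on Unity, and EntityRegistry, Register, and Unregister are hypothetical names. A List is used as the array-like store, with freed ids reused.

```csharp
using System.Collections.Generic;

// Stand-in for a game object with a position; a real project would use
// UnityEngine.GameObject here.
public class Entity
{
    public float X, Y, Z;
}

// Hypothetical id-based registry, a sketch of the "manage your own ids" idea.
public static class EntityRegistry
{
    private static readonly List<Entity> entities = new List<Entity>();
    private static readonly Stack<int> freeIds = new Stack<int>();

    // Registers an entity and returns the int id that Lua would hold.
    public static int Register(Entity e)
    {
        if (freeIds.Count > 0)
        {
            int reused = freeIds.Pop();
            entities[reused] = e;
            return reused;
        }
        entities.Add(e);
        return entities.Count - 1;
    }

    // Explicit release: the slot is cleared and the id becomes reusable.
    public static void Unregister(int id)
    {
        entities[id] = null;
        freeIds.Push(id);
    }

    // Exported function: only an int and three floats cross the boundary.
    public static void SetPos(int id, float x, float y, float z)
    {
        Entity e = entities[id];
        e.X = x;
        e.Y = y;
        e.Z = z;
    }

    public static Entity Get(int id)
    {
        return entities[id];
    }
}
```

The trade-off is that lifetime bugs become your responsibility: in this sketch, calling SetPos with an id that has been unregistered throws, and an id that has been reused silently refers to a different object.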
 
9. Use the out keyword sensibly to return complex return values
 
Returning values of various types from C# to Lua is similar to passing parameters, and incurs the same kinds of cost.
 
For example, Vector3 GetPos(GameObject obj) can be written as void GetPos(GameObject obj, out float x, out float y, out float z)
 
On the surface the number of parameters increases, but according to the generated export code (taking ulua as the reference), the work changes from LuaDLL.tolua_getfloat3 (which contains get_field + tonumber 3 times) to isnumber + tonumber 3 times.
 
Since get_field is essentially a table lookup, it is certainly slower than isnumber, which only accesses the stack, so this approach performs better.
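The out-parameter form can be sketched like this. Body is a hypothetical stand-in for an object with a position, used so the snippet does not depend on Unity, and PosUtil is an illustrative name.

```csharp
// Stand-in for an object with a position; a real project would read
// UnityEngine.Transform.position instead.
public class Body
{
    public float X, Y, Z;
}

public static class PosUtil
{
    // Returns the position through three out parameters instead of a
    // Vector3, so only plain floats cross the Lua/C# boundary.
    public static void GetPos(Body body, out float x, out float y, out float z)
    {
        x = body.X;
        y = body.Y;
        z = body.Z;
    }
}
```

Assuming the binding maps out parameters to multiple return values, the Lua side would read something like local x, y, z = PosUtil.GetPos(body); check the generated wrapper code of your binding to confirm this.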
 
Measurements
 
All right, after so much talk, it would be too abstract without some data. To see the cost of the language binding itself more realistically, we did not use gameobj.transform.position from the example directly, because part of the time there is spent inside Unity.
 
We rewrote a simplified version of GameObject2 and Transform2.
class Transform2{
  public Vector3 position = new Vector3();
}
class GameObject2{
   public Transform2 transform = new Transform2();
}
 
Then we set the position of the transform using several different calling styles:
Mode 1: gameobject.transform.position = Vector3.New(1,2,3)
Mode 2: gameobject:SetPos(Vector3.New(1,2,3))
Mode 3: gameobject:SetPos2(1,2,3)
Mode 4: GOUtil.SetPos(gameobject, Vector3.New(1,2,3))
Mode 5: GOUtil.SetPos2(gameobjectid, Vector3.New(1,2,3))
Mode 6: GOUtil.SetPos3(gameobjectid, 1,2,3)
 
Each mode was run 1,000,000 times, with the following results (the test environment is the Windows build, the CPU is an i7-4770, and luajit's JIT mode is turned off; results on phones will differ because of the luajit architecture, il2cpp, and other factors, which we will discuss in the next article):
 
Mode 1: 903ms
Mode 2: 539ms
Mode 3: 343ms
Mode 4: 559ms
Mode 5: 470ms
Mode 6: 304ms
 
As you can see, every optimization step brings a clear improvement, and removing the .transform access and the Vector3 conversion helps the most. We only changed the way functions are exported, which is cheap to do, and already saved 66% of the time.
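The 66% figure follows from comparing mode 6 with mode 1. A small helper makes the arithmetic explicit; the Timing class is purely illustrative and only restates the numbers above.

```csharp
public static class Timing
{
    // Time saved relative to the baseline, as a percentage of the baseline.
    public static double PercentSaved(double baselineMs, double optimizedMs)
    {
        return (baselineMs - optimizedMs) / baselineMs * 100.0;
    }
}
```

Timing.PercentSaved(903, 304) gives (903 - 304) / 903, or about 66.3%. Mode 3 against mode 1 gives (903 - 343) / 903, about 62.0%, so most of the gain already comes from dropping the .transform return and the Vector3 conversion.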
 
Can we go even further? Yes! Building on mode 6, we can get it down to only 200ms!
 
We will keep that a secret for now and explain it in the next article on luajit integration. In general, we consider reaching the level of mode 6 to be sufficient.
 
This is only the simplest case. Many commonly used exports (for example GetComponentsInChildren, a real performance pit, or functions that pass a dozen or more parameters) need to be optimized according to how you use them. With the analysis of the performance principles behind Lua integration given here, it should be easy to work out what to do.
 
The next article will be the second part of Lua + Unity performance optimization: the performance pits of luajit integration.

Compared with this first part, where you can roughly tell the performance cost just by reading the export code, the problems of luajit integration are much more complex and obscure.
 
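C# code for the test cases:
using System.Collections.Generic; // List<T>
using UnityEngine;                // Vector3

public class Transform2
{
    public Vector3 position = new Vector3();
}
public class GameObject2
{
    public Transform2 transform = new Transform2();
    // Mode 2: member method taking a Vector3
    public void SetPos(Vector3 pos)
    {
        transform.position = pos;
    }
    // Mode 3: member method taking three floats
    public void SetPos2(float x, float y, float z)
    {
        // position is a public field here (not a property as on
        // UnityEngine.Transform), so its components can be written in place
        transform.position.x = x;
        transform.position.y = y;
        transform.position.z = z;
    }
}
public class GOUtil
{
    // Objects indexed by id; filled lazily on the first GetByID call
    private static List<GameObject2> mObjs = new List<GameObject2>();
    public static GameObject2 GetByID(int id)
    {
        if(mObjs.Count == 0)
        {
            for (int i = 0; i < 1000; i++ )
            {
                mObjs.Add(new GameObject2());
            }
        }
        return mObjs[id];
    }
    // Mode 4: static function taking an object reference and a Vector3
    public static void SetPos(GameObject2 go, Vector3 pos)
    {
        go.transform.position = pos;
    }
    // Mode 5: static function taking an id and a Vector3.
    // Assumes GetByID has been called once so that mObjs is populated.
    public static void SetPos2(int id, Vector3 pos)
    {
        mObjs[id].transform.position = pos;
    }
    // Mode 6: static function taking an id and three floats.
    // Assumes GetByID has been called once so that mObjs is populated.
    public static void SetPos3(int id, float x, float y ,float z)
    {
        var t = mObjs[id].transform;
        t.position.x = x;
        t.position.y = y;
        t.position.z = z;
    }
}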
From: https://blog.csdn.net/haihsl123456789/article/details/54017522/

Origin www.cnblogs.com/scoregao/p/10984190.html