Lua is widely used as a scripting language in game development, so optimizing Lua code matters a great deal. This article describes the optimizations applied in the project I am working on, and draws on several articles and books.
Lua is known for its performance, so does Lua code still need to be optimized? The answer is yes. But optimization should always be accompanied by measurement: measure first to find where the program actually spends its time, then measure again to check whether the change really improved the code.
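As a minimal sketch of this measure-first approach, a small timing helper built on os.clock might look like the following. The name time_it and the default of five repeats are illustrative choices, not something taken from the project described here.

```lua
-- Run fn several times and report the best CPU time seen.
-- Taking the best of several runs reduces noise from other processes.
local function time_it(label, fn, repeats)
  repeats = repeats or 5
  local best = math.huge
  for _ = 1, repeats do
    local start = os.clock()
    fn()
    local elapsed = os.clock() - start
    if elapsed < best then best = elapsed end
  end
  print(string.format("%-20s %.4f s (best of %d)", label, best, repeats))
  return best
end

-- Example: time a simple summation loop.
time_it("sum to 1e6", function()
  local s = 0
  for i = 1, 1000000 do s = s + i end
  return s
end)
```

The same helper can be wrapped around a piece of code before and after a change to check whether the change actually helped.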
Basic facts
Lua converts (precompiles) source code into an internal format before running any code. This format is a sequence of instructions for a virtual machine, similar to the machine code of a real CPU. The internal format is then interpreted by C code that is essentially a while loop with a big switch inside, with one case for each instruction.
As you may have learned, since version 5.0 Lua uses a register-based virtual machine. The "registers" of this virtual machine do not correspond to actual registers in the CPU, because such a correspondence would not be portable and the number of available CPU registers is very limited. Instead, Lua uses a stack (implemented as an array plus some indices) to hold its registers. Every active function has an activation record, which is a slice of the stack where the function stores its registers. Therefore, each function has its own registers. Each function can use up to 250 registers, because each instruction has only 8 bits available to refer to a register.
Given the large number of registers, the Lua precompiler is able to store all local variables in registers. The result is that access to local variables in Lua is very fast. For example, if a and b are local variables, a Lua statement like a = a + b would generate an instruction: ADD 0 0 1 (assuming a and b are in registers 0 and 1, respectively). As a comparison, if both a and b are global variables, the code for addition looks like this:
GETGLOBAL 0 0 ; a
GETGLOBAL 1 1 ; b
ADD 0 0 1
SETGLOBAL 0 0 ; a
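To make the "while loop with a big switch" idea more concrete, here is a toy interpreter written in Lua itself. It is only an illustration: the real virtual machine is written in C and encodes its instructions differently, and the function name run, the table-based instruction format and the optional registers argument are assumptions made for this sketch.

```lua
-- Toy register machine: each instruction is a table {opcode, A, B, C}.
-- Registers and constants are indexed from 0, as in the listings above.
local function run(code, constants, globals, registers)
  local R = registers or {}
  local pc = 1
  while pc <= #code do                      -- the interpreter loop
    local ins = code[pc]
    local op, a, b, c = ins[1], ins[2], ins[3], ins[4]
    if op == "GETGLOBAL" then               -- R[A] = globals[K[B]]
      R[a] = globals[constants[b]]
    elseif op == "SETGLOBAL" then           -- globals[K[B]] = R[A]
      globals[constants[b]] = R[a]
    elseif op == "ADD" then                 -- R[A] = R[B] + R[C]
      R[a] = R[b] + R[c]
    else
      error("unknown opcode: " .. tostring(op))
    end
    pc = pc + 1
  end
  return R
end

-- Local variables: a and b already live in registers 0 and 1.
local local_regs = run({ {"ADD", 0, 0, 1} }, {}, {}, { [0] = 2, [1] = 3 })
print(local_regs[0])  --> 5

-- Global variables: the same addition needs four instructions.
local globals = { a = 2, b = 3 }
local constants = { [0] = "a", [1] = "b" }
run({
  {"GETGLOBAL", 0, 0},
  {"GETGLOBAL", 1, 1},
  {"ADD", 0, 0, 1},
  {"SETGLOBAL", 0, 0},
}, constants, globals)
print(globals.a)  --> 5
```

The global version goes through the dispatch loop four times and does three table accesses, while the local version goes through it once.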
1. Using local variables
Example 1:
for i = 1, 1000000 do
    local x = math.sin(i)
end
This version is about 30% slower than the following one:
local sin = math.sin
for i = 1, 1000000 do
    local x = sin(i)
end
Note that accessing outer locals (that is, locals of the enclosing function) is not as fast as accessing a function's own locals, but it is still faster than accessing globals.
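The three kinds of access can be compared with a small benchmark like the one below. This is a sketch, not a measurement from the project: the function names are made up for illustration, and the actual timings depend heavily on the Lua version and on whether a JIT is used, so no numbers are claimed here.

```lua
local sin_outer = math.sin        -- an outer local for the functions below

local function use_global(n)
  local x
  for i = 1, n do x = math.sin(i) end   -- global lookup plus table lookup
  return x
end

local function use_outer_local(n)
  local x
  for i = 1, n do x = sin_outer(i) end  -- local of the enclosing scope
  return x
end

local function use_local(n)
  local sin = math.sin                  -- cached once in a local
  local x
  for i = 1, n do x = sin(i) end
  return x
end

-- Return the CPU time taken by fn(n).
local function measure(fn, n)
  local start = os.clock()
  fn(n)
  return os.clock() - start
end

local n = 1000000
print(string.format("global:      %.3f s", measure(use_global, n)))
print(string.format("outer local: %.3f s", measure(use_outer_local, n)))
print(string.format("local:       %.3f s", measure(use_local, n)))
```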
2. Avoid compiling code in Lua programs as much as possible
Functions such as loadstring (load in Lua 5.2 and later) compile source code at run time, which is expensive. A typical use is turning a string that describes a table into an actual table. For example:
local lim = 100000
local a = {}
for i = 1, lim do
    a[i] = loadstring(string.format("return %d", i))
end
print(a[10]())  --> 10
This code takes about 1.4 s to run.
Using closures can speed things up:
function fk (k)
    return function () return k end
end
local lim = 100000
local a = {}
for i = 1, lim do
    a[i] = fk(i)
end
print(a[10]())  --> 10
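To check the difference on your own machine, the two approaches can be timed side by side. The sketch below assumes nothing beyond the standard library; the helper names build_compiled, build_closures and timed are made up for illustration, and compile simply picks whichever of loadstring or load is available so that the code runs on both older and newer Lua versions.

```lua
-- loadstring exists in Lua 5.1 and LuaJIT; Lua 5.2+ uses load for strings.
local compile = loadstring or load

-- Build lim functions by compiling a fresh chunk for each one.
local function build_compiled(lim)
  local a = {}
  for i = 1, lim do
    a[i] = compile(string.format("return %d", i))
  end
  return a
end

local function fk(k)
  return function() return k end
end

-- Build lim functions as closures over a single precompiled function.
local function build_closures(lim)
  local a = {}
  for i = 1, lim do
    a[i] = fk(i)
  end
  return a
end

-- Return the CPU time taken by build(lim) together with its result.
local function timed(build, lim)
  local start = os.clock()
  local a = build(lim)
  return os.clock() - start, a
end

local lim = 100000
local t_compiled, a1 = timed(build_compiled, lim)
local t_closures, a2 = timed(build_closures, lim)
assert(a1[10]() == a2[10]())   -- both tables behave the same
print(string.format("compiled: %.3f s", t_compiled))
print(string.format("closures: %.3f s", t_closures))
```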
3. Table optimization
Lua's tables are what makes Lua special, so to optimize a program that uses tables (actually any Lua program), it's best to know a little about how Lua implements tables.
The implementation of tables in Lua involves some clever algorithms. Every table in Lua has two parts: an array part and a hash part. The array part stores the entries with integer keys in the range 1 to n, for some particular n (how this n is computed is discussed below). All other entries (including integer keys outside that range) go to the hash part.
As the name suggests, the hash part uses a hash algorithm to store and look up its keys. It uses a so-called open address table, which means that all entries are stored in the hash array itself. The hash function gives the primary index of a key; if there is a collision (that is, if two keys hash to the same position), the keys are linked into a list, with each element occupying one array entry.
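The idea of keeping collision chains inside the array itself can be sketched with a toy hash table. This is a simplified model written for illustration, not Lua's actual C implementation: it handles only non-negative integer keys, it never deletes, it does not relocate colliding entries the way the real implementation does, and all names (new_hash, hash_set, hash_get and so on) are hypothetical.

```lua
-- Fixed-size table: every entry lives in nodes[1..size].
local function new_hash(size)
  local nodes = {}
  for i = 1, size do nodes[i] = { used = false } end
  return { nodes = nodes, size = size, last_free = size }
end

-- Primary index of a key (toy hash function for integer keys).
local function main_position(h, key)
  return key % h.size + 1
end

-- Find an unused slot, scanning downward from last_free.
local function find_free(h)
  while h.last_free >= 1 do
    if not h.nodes[h.last_free].used then return h.last_free end
    h.last_free = h.last_free - 1
  end
  return nil
end

-- Returns false when the array is full (where Lua would rehash).
local function hash_set(h, key, value)
  local node = h.nodes[main_position(h, key)]
  if not node.used then
    node.used, node.key, node.value, node.next = true, key, value, nil
    return true
  end
  while true do                       -- walk the collision chain
    if node.key == key then node.value = value; return true end
    if not node.next then break end
    node = h.nodes[node.next]
  end
  local free = find_free(h)
  if not free then return false end
  local fresh = h.nodes[free]
  fresh.used, fresh.key, fresh.value, fresh.next = true, key, value, nil
  node.next = free                    -- link the new entry into the chain
  return true
end

local function hash_get(h, key)
  local node = h.nodes[main_position(h, key)]
  while node and node.used do
    if node.key == key then return node.value end
    node = node.next and h.nodes[node.next]
  end
  return nil
end

local h = new_hash(4)
hash_set(h, 1, "one")    -- main position 2, which is free
hash_set(h, 5, "five")   -- also main position 2: stored in slot 4 and chained
print(hash_get(h, 5))    --> five
```

Note how the second key ends up in a different slot of the same array, reachable only by following the link from its primary index.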
When Lua needs to insert a new key into a table and the hash array is full, Lua performs a rehash. The first step of the rehash is to decide the sizes of the new array part and the new hash part. To do so, Lua traverses all entries, counting and classifying them, and chooses as the size of the array part the largest power of 2 such that more than half of the slots in the array part are filled. The hash size is then the smallest power of 2 that can hold all remaining entries (that is, those that do not fit in the array part).
When Lua creates an empty table, both parts have size 0, so no arrays are allocated for them. Let's see what happens when we run the following code:
local a = {}
for i = 1, 3 do
    a[i] = true
end
The first line creates an empty table a. On the first loop iteration, the assignment a[1] = true triggers a rehash; Lua then sets the size of the array part to 1 and keeps the hash part empty. On the second iteration, the assignment a[2] = true triggers another rehash, so the array part now has size 2. Finally, the third iteration triggers yet another rehash, growing the array part to size 4.
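The rule for choosing the array size can be modeled in a few lines of Lua. This is a simplified sketch of the rule as stated above, not the actual C code: it assumes the list of keys contains no duplicates, and the function name array_part_size is made up for illustration.

```lua
-- Given a list of keys, return the largest power of 2, n, such that
-- more than half of the slots 1..n would be occupied (0 if none).
local function array_part_size(keys)
  -- Keep only positive integer keys; other keys always go to the hash part.
  local ints = {}
  for _, k in ipairs(keys) do
    if type(k) == "number" and k >= 1 and k % 1 == 0 then
      ints[#ints + 1] = k
    end
  end
  local best = 0
  local n = 1
  -- n can only be more than half full while n/2 < number of integer keys.
  while n / 2 < #ints do
    local used = 0
    for _, k in ipairs(ints) do
      if k <= n then used = used + 1 end
    end
    if used > n / 2 then best = n end
    n = n * 2
  end
  return best
end

print(array_part_size({1}))        --> 1
print(array_part_size({1, 2}))     --> 2
print(array_part_size({1, 2, 3}))  --> 4
```

The three results match the array sizes reached on the three loop iterations described above. A sparse key such as 100 in {1, 100} does not enlarge the array part and would be stored in the hash part instead.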
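Something similar happens if the code is changed to this:
a = {}
a.x = 1; a.y = 2; a.z = 3  -- here only the hash part of the table grows
So when the contents are known in advance, use a table constructor such as {true, true, true} or {x = 1, y = 2, z = 3}: Lua then knows the required size up front and creates the table with it, avoiding these initial rehashes.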
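A small benchmark along the following lines can show the effect of constructors when many small tables are created. It is a sketch under the assumption that table creation dominates the loop; the function names and the iteration count are arbitrary, and the timings will vary between Lua versions, so none are quoted here.

```lua
-- Create n three-element tables, filling each one field at a time.
local function fill_step_by_step(n)
  local last
  for _ = 1, n do
    local t = {}
    t[1] = true; t[2] = true; t[3] = true   -- may rehash as the table grows
    last = t
  end
  return last
end

-- Create n three-element tables with a constructor.
local function fill_with_constructor(n)
  local last
  for _ = 1, n do
    last = {true, true, true}               -- size known when the table is created
  end
  return last
end

-- Return the CPU time taken by fn(n).
local function timed(fn, n)
  local start = os.clock()
  fn(n)
  return os.clock() - start
end

local n = 200000
print(string.format("step by step: %.3f s", timed(fill_step_by_step, n)))
print(string.format("constructor:  %.3f s", timed(fill_with_constructor, n)))

-- The same idea applies to the hash part.
local point = {x = 1, y = 2, z = 3}
print(point.x + point.y + point.z)  --> 6
```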
To be continued.