NEO looks at NEOVM from source code analysis

0x00 Preface

This article lays the groundwork for the next one, "NEO looks at UTXO transfer transactions from source code analysis", by exploring some of the technical foundations of how transactions are constructed and executed. The material is rather dry and hard to swallow in one go, so I will work from the top down, starting from the contract code and going deeper step by step. There are bound to be gaps and omissions in this article, and I welcome corrections from more experienced readers.

0x01 Lock contract (Lock)

Of the three official contract examples, this lock contract is the only one that does not use Storage, so for now I assume it is the simplest. If that assumption lands me in a pit, I have no regrets; the other contracts will have to be analyzed sooner or later anyway /(ㄒoㄒ)/~~. The code and explanation of the lock contract can be found in the official documentation (Chinese version) and on GitHub.

public class Lock : SmartContract
{
    public static bool Main(byte[] signature)
    {
        // Header of the latest block
        Header header = Blockchain.GetHeader(Blockchain.GetHeight());
        if (header.Timestamp < 1520554200) // 2018-3-9 8:10:00 (UTC+8)
            return false;
        return true;
    }
}
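As a quick sanity check on the constant, the sketch below converts 1520554200 to wall-clock time and mirrors the contract's comparison in Python. It assumes the time in the comment is UTC+8 (Beijing time), which the article does not state explicitly, and `can_withdraw` is a hypothetical helper, not part of any NEO API.

```python
from datetime import datetime, timedelta, timezone

LOCK_TIMESTAMP = 1520554200          # constant from the Lock contract above
UTC8 = timezone(timedelta(hours=8))  # assumption: the comment's time is UTC+8


def can_withdraw(header_timestamp: int, lock_timestamp: int = LOCK_TIMESTAMP) -> bool:
    """Mirror of Lock.Main: refuse while the latest block is older than the lock time."""
    if header_timestamp < lock_timestamp:
        return False
    return True


if __name__ == "__main__":
    # Show the lock time as a readable date in the assumed time zone.
    print(datetime.fromtimestamp(LOCK_TIMESTAMP, tz=UTC8).isoformat())
    # One second before the lock time fails, the lock time itself passes.
    print(can_withdraw(LOCK_TIMESTAMP - 1), can_withdraw(LOCK_TIMESTAMP))
```

Note that the comparison is strict, so a block stamped exactly at the lock time already passes.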

I changed the timestamp from the original sample and removed the signature verification. I won't go over how to create a new contract project; the official website covers all of that. With this contract, a transfer out of the contract address succeeds only if the latest block's timestamp has reached my preset time; otherwise the transfer fails. That is the theory, and the official explanation is about as brief. What I do next is the hard part: tracing how this contract script is generated and executed. The code discussed below comes mainly from three projects.

0x02 Compile

I have to say the NEO development team has done a good job here. Although the compilation process itself is complicated, using it is really simple: right-click the project and select Build:

[Figure: contract compilation result]

The build output shows a lot of information: what was executed at each step, what was generated, and what the result was. Most importantly, it contains keywords to search for. Some people in the community have asked me how I read source code; this is how. The log shows that compilation first produces a DLL, which is of course .NET's work, and then invokes something called Neo.Compiler.MSIL. That is where I start looking.

0x03 Parsing

Following the keyword from the build log above, I located the Program.cs file of the neo-compiler project, which contains the compiler's entry point, Main. Don't ask me how the build invokes it; I really could not find out. Main takes one argument, the path to the DLL file:

Source code location: neo/Compiler/Program.cs/Main(string[] args)

log.Log("Neo.Compiler.MSIL console app v" + Assembly.GetEntryAssembly().GetName().Version);
if (args.Length == 0)
{
    log.Log("need one param for DLL filename.");
    return;
}
string filename = args[0];
string onlyname = System.IO.Path.GetFileNameWithoutExtension(filename);
string filepdb = onlyname + ".pdb";

To be honest, my understanding of C# does not go down to the bytecode level; my experience is limited to the few months I spent building practice games at Tencent (the "goose factory"). So I can only do my best to follow the path from DLL to AVM. The main conversion function is Convert in ModuleConverter. It takes an ILModule object as its parameter, and that ILModule is responsible for parsing the DLL file to obtain the IL instructions. Since I could not find a way to debug the compiler dynamically, I decompiled Lock.dll directly and analyzed the compiler statically against the IL instructions. The decompiler I used is ILSpy, which is available on GitHub. Here is the decompiled IL code:

.class public auto ansi beforefieldinit Lock
	extends [Neo.SmartContract.Framework]Neo.SmartContract.Framework.SmartContract
{
	// Methods
	.method public hidebysig static 
		bool Main (
			uint8[] signature
		) cil managed 
	{
		// Method begins at RVA 0x2050
		// Method begins at file offset 0x0250
		// Code size 62 (0x3e)
		.maxstack 4
		.locals init (
			[0] class [Neo.SmartContract.Framework]Neo.SmartContract.Framework.Services.Neo.Header,
			[1] bool,
			[2] bool
		)

		// 0x025C: 00
		IL_0000: nop
		// 0x025D: 28 10 00 00 0A
		IL_0001: call uint32 [Neo.SmartContract.Framework]Neo.SmartContract.Framework.Services.Neo.Blockchain::GetHeight()
		// 0x0262: 28 11 00 00 0A
		IL_0006: call class [Neo.SmartContract.Framework]Neo.SmartContract.Framework.Services.Neo.Header [Neo.SmartContract.Framework]Neo.SmartContract.Framework.Services.Neo.Blockchain::GetHeader(uint32)
		// 0x0267: 0A
		IL_000b: stloc.0
		// 0x0268: 06
		IL_000c: ldloc.0
		// 0x0269: 6F 12 00 00 0A
		IL_000d: callvirt instance uint32 [Neo.SmartContract.Framework]Neo.SmartContract.Framework.Services.Neo.Header::get_Timestamp()
		// 0x026E: 20 20 2F A1 5A
		IL_0012: ldc.i4 1520512800
		// 0x0273: FE 05
		IL_0017: clt.un
		// 0x0275: 0B
		IL_0019: stloc.1
		// 0x0276: 07
		IL_001a: ldloc.1
		// 0x0277: 2C 04
		IL_001b: brfalse.s IL_0021

		// 0x0279: 16
		IL_001d: ldc.i4.0
		// 0x027A: 0C
		IL_001e: stloc.2
		// 0x027B: 2B 1B
		IL_001f: br.s IL_003c

		// 0x027D: 02
		IL_0021: ldarg.0
		// 0x027E: 1F 21
		IL_0022: ldc.i4.s 33
		// 0x0280: 8D 16 00 00 01
		IL_0024: newarr [mscorlib]System.Byte
		// 0x0285: 25
		IL_0029: dup
		// 0x0286: D0 01 00 00 04
		IL_002a: ldtoken field valuetype '<PrivateImplementationDetails>'/'__StaticArrayInitTypeSize=33' '<PrivateImplementationDetails>'::'09B200FB2B3E1BDC14112F99F08AA4576CF64321'
		// 0x028B: 28 13 00 00 0A
		IL_002f: call void [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [mscorlib]System.Array, valuetype [mscorlib]System.RuntimeFieldHandle)
		// 0x0290: 28 14 00 00 0A
		IL_0034: call bool [Neo.SmartContract.Framework]Neo.SmartContract.Framework.SmartContract::VerifySignature(uint8[], uint8[])
		// 0x0295: 0C
		IL_0039: stloc.2
		// 0x0296: 2B 00
		IL_003a: br.s IL_003c

		// 0x0298: 08
		IL_003c: ldloc.2
		// 0x0299: 2A
		IL_003d: ret
	} // end of method Lock::Main

	.method public hidebysig specialname rtspecialname 
		instance void .ctor () cil managed 
	{
		// Method begins at RVA 0x209a
		// Method begins at file offset 0x029a
		// Code size 8 (0x8)
		.maxstack 8

		// 0x029B: 02
		IL_0000: ldarg.0
		// 0x029C: 28 15 00 00 0A
		IL_0001: call instance void [Neo.SmartContract.Framework]Neo.SmartContract.Framework.SmartContract::.ctor()
		// 0x02A1: 00
		IL_0006: nop
		// 0x02A2: 2A
		IL_0007: ret
	} // end of method Lock::.ctor

} // end of class Lock

The IL decompiled by ILSpy clearly shows key information such as function names, parameters, types, and system calls, and the compiler's analysis of the C# bytecode is built on exactly these. (Note that this listing still carries the official sample's timestamp, 1520512800, and its VerifySignature call, so it appears to come from the unmodified sample rather than from my edited contract; the AVM script analyzed later contains my timestamp.) The compiler uses Mono.Cecil to read the IL instructions from the DLL; that tool's code is also on GitHub. NEO-VM defines a complete instruction set of its own, and the compiler translates the IL instructions into AVM instructions, instruction by instruction. The result of that translation is the AVM script. Translation starts by extracting the methods from the IL. Part of the extraction deals with recognizing auto-generated code and system calls; it is tedious and does not help much in understanding the conversion, so I will skip it. The core processing code for each method is as follows:

Source code location: neon/MSIL/Converter.cs/Convert(ILModule _in)

// collect the method's parameters
foreach (var src in m.Value.paramtypes)
{
    nm.paramtypes.Add(new NeoParam(src.name, src.type));
}
// is this a NEO system call? if so, skip it
byte[] outcall; string name;
if (IsAppCall(m.Value.method, out outcall))
    continue;
if (IsNonCall(m.Value.method))
    continue;
if (IsOpCall(m.Value.method, out name))
    continue;
if (IsSysCall(m.Value.method, out name))
    continue;
// convert the method body to opcodes
this.ConvertMethod(m.Value, nm);

After each method is parsed, ConvertMethod is called to convert the IL instructions inside the method into the corresponding AVM instructions. The per-instruction conversion is done by ConvertCode, which defines the complete IL-to-AVM mapping; I will not analyze it case by case here. I will simply treat the conversion as done; the details may be covered in a later post.

With the analysis this far behind me, I got stuck when it came to creating the contract. This touches on the difference between application contracts and authentication contracts, which my next post will introduce. I had been in a fog about this for a long time and now ran straight into it, which was painful. If this part is unclear, you can wait for that post; I will press on for now. The lock contract itself does not need to be deployed on the blockchain. It is an authentication contract, just like an account contract. As I analyzed in detail in the earlier article "Viewing nep2 and nep6 from Source Code Analysis", a NEO account is itself a contract: an authentication contract that does not need to be deployed on the blockchain and is executed on each transaction. The Lock contract works the same way.
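To give a feel for what the conversion produces, here is a small Python sketch of how IL local-variable instructions seem to expand into AVM sequences. The opcode sequences are read off the disassembly of the Lock script shown later in this article; the function names are hypothetical, and the slot numbering (arguments first, then locals) is only my reading of that one listing, not the compiler's actual code.

```python
from typing import List


def prologue(slot_count: int) -> List[str]:
    """Allocate one array for arguments and locals and park it on the alt stack."""
    return [f"PUSH{slot_count}", "NEWARRAY", "TOALTSTACK"]


def store_slot(slot: int) -> List[str]:
    """Pop the top of the stack into array slot `slot`."""
    return ["FROMALTSTACK", "DUP", "TOALTSTACK", f"PUSH{slot}", "PUSH2", "ROLL", "SETITEM"]


def load_slot(slot: int) -> List[str]:
    """Push the value of array slot `slot` onto the stack."""
    return ["FROMALTSTACK", "DUP", "TOALTSTACK", f"PUSH{slot}", "PICKITEM"]


def expand(il_op: str, arg_count: int = 1) -> List[str]:
    """Expand `stloc.N` / `ldloc.N`; locals are assumed to sit after the arguments."""
    name, _, index = il_op.partition(".")
    if name == "stloc":
        return store_slot(int(index) + arg_count)
    if name == "ldloc":
        return load_slot(int(index) + arg_count)
    raise ValueError(f"not covered by this sketch: {il_op}")


if __name__ == "__main__":
    # Lock.Main has 1 argument and 3 locals, hence 4 slots.
    print(prologue(4))
    print(expand("stloc.0"))
    print(expand("ldloc.1"))
```

So a single IL instruction such as stloc.0 becomes seven AVM instructions, which is why the AVM listing is much longer than the IL.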

0x04 Transfer

Because the lock contract is an authentication contract and does not need to be deployed on the blockchain, we only need to set it up locally, which is easy to do with neo-gui. To keep the test easy to follow, I kept just one local account, holding 3.8 GAS: AV5XmH49Gzz8puT5iMdv5ycmhqWGH5VNq7. Let's call this account Xu Zheng. The newly created contract address is Aaigh8uGWwsmPTWKkxfXx8ZRJNYk6RvnBQ; this account is Wang Baoqiang. I also have another account, ASCjW4xpfr8kyVHY1J2PgvcgFbPYa1qX7F, which we'll call Huang Bo; it is used to transfer money to Xu Zheng's account to confirm that Xu Zheng's account works normally. The background of the story: Wang Baoqiang borrows 3.8 GAS from Xu Zheng to pay for his trip home and agrees to pay it back at 8:10:00 on 2018-3-9. The story unfolds as follows:

  • Act 1: Wang Baoqiang is heading home for Chinese New Year with no money for the road, so he asks Xu Zheng for 3.8 GAS. Xu Zheng lends it to him, and they agree it will be returned after 8:10; after all, the money can only be repaid once he gets home. Transaction 1 id: 0x7f5be9b212c81958428a416f5afad3ca26d3d032e85330b6837f9fea559e1785
  • Act 2: Xu Zheng falls out with Wang Baoqiang on the road and demands his 3.8 GAS back on the spot. What can poor Baoqiang do? Xu Zheng's request for the 3.8 GAS is transaction 2, id: 0xfa17a8d74a8ebf75f839286de21e011209177930551f7b52a09161250a39df66
  • Act 3: Poor Baoqiang is stubborn: what is mine is mine, and what is not mine I don't want. We agreed on after 8:10, so it will be paid back after 8:10. Even a cornered rabbit bites, and Baoqiang refuses to give in: the child is mine, and the GAS is mine too. Xu Zheng's request gets nowhere, and transaction 2 fails.

[Figure: Xu Zheng's transfer failed]

  • Act 4: Finally, at 8:23, past the 8:10 deadline, Xu Zheng successfully takes back the 3.8 GAS he lent Baoqiang. The retrieval is transaction 3, id: 0xfa17a8d74a8ebf75f839286de21e011209177930551f7b52a09161250a39df66

[Figure: successful transfer]

  • Final scene: Having been through ups and downs and shared hardship on the road, the two bury the hatchet; their friendship deepens and they never quarrel again.

In the story above, the lock contract stipulates that withdrawals are possible only after 8:10, so any transfer of assets out of it before then fails. All transactions in the story are real, and their details can be looked up on the testnet. Next, let's analyze why transaction 2 fails to execute.

0x05 Contract execution

When a transaction that transfers assets out of the lock contract is broadcast, the consensus nodes verify it in a new round of consensus (for the consensus part, see my post "NEO Sees Consensus Protocol from Source Code Analysis"). If verification succeeds, the transaction is cached until it is written into a new block; if verification fails, the transaction is discarded:

Source code location: neo/Core/Helper/VerifyScripts(this IVerifiable verifiable)

using (StateReader service = new StateReader())
{
    // verification and i come from the surrounding code, which is omitted from this excerpt
    ApplicationEngine engine = new ApplicationEngine(TriggerType.Verification, verifiable, Blockchain.Default, service, Fixed8.Zero);
    engine.LoadScript(verification, false);
    engine.LoadScript(verifiable.Scripts[i].InvocationScript, true);
    if (!engine.Execute()) return false;
    // the scripts must leave exactly one item on the stack, and it must be true
    if (engine.EvaluationStack.Count != 1 || !engine.EvaluationStack.Pop().GetBoolean()) return false;
}
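The acceptance rule in the last two lines of that excerpt can be restated as a small Python sketch. `witness_passes` is a hypothetical name, and Python truthiness stands in for the stack item's GetBoolean(), so this is an illustration of the rule rather than NEO's implementation.

```python
from typing import List


def witness_passes(executed_ok: bool, evaluation_stack: List[object]) -> bool:
    """Accept only if execution succeeded and exactly one truthy item remains."""
    if not executed_ok:
        return False  # the engine ended in FAULT
    if len(evaluation_stack) != 1:
        return False  # leftover or missing results both count as failure
    return bool(evaluation_stack.pop())


if __name__ == "__main__":
    print(witness_passes(True, [True]))   # the lock time has passed
    print(witness_passes(True, [False]))  # the Lock contract returned false
```

For the Lock contract this means that returning false before 8:10 is enough to make the whole transaction fail verification.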

ApplicationEngine is the class that executes scripts on top of neo-vm. As you can see, the engine's trigger type is set to Verification and the transaction's scripts are loaded into it. Let's follow the Execute method next.

Source code location: neo/SmartContract/ApplicationEngine/Execute()

while (!State.HasFlag(VMState.HALT) && !State.HasFlag(VMState.FAULT)) {
    if (CurrentContext.InstructionPointer < CurrentContext.Script.Length) {
        // read the next instruction
        OpCode nextOpcode = CurrentContext.NextInstruction;
        // charge GAS per instruction
        gas_consumed = checked(gas_consumed + GetPrice(nextOpcode) * ratio);
        if (!testMode && gas_consumed > gas_amount) {
            State |= VMState.FAULT;
            return false;
        }

        if (!CheckItemSize(nextOpcode) ||
            !CheckStackSize(nextOpcode) ||
            !CheckArraySize(nextOpcode) ||
            !CheckInvocationStack(nextOpcode) ||
            !CheckBigIntegers(nextOpcode) ||
            !CheckDynamicInvoke(nextOpcode)) {
            State |= VMState.FAULT;
            return false;
        }
    }
    // execute one instruction
    StepInto();
}
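The shape of this loop (fetch, charge, check the budget, execute) can be mimicked with a toy stack machine in Python. Everything here is illustrative: the three opcodes, the flat price of 1 per instruction, and the names `execute` and `PRICE` are assumptions for the sketch, and the real engine's test mode, price ratio, and size checks are left out.

```python
from enum import Flag
from typing import List, Optional, Tuple


class VMState(Flag):
    NONE = 0
    HALT = 1
    FAULT = 2


# Hypothetical flat prices; the real GetPrice depends on the opcode.
PRICE = {"PUSH": 1, "LT": 1, "RET": 1}

Instruction = Tuple[str, Optional[int]]


def execute(program: List[Instruction], gas_amount: int) -> Tuple[bool, list]:
    """Run until HALT or FAULT, charging gas before each instruction."""
    state = VMState.NONE
    stack: list = []
    gas_consumed = 0
    ip = 0
    while not (state & (VMState.HALT | VMState.FAULT)):
        if ip >= len(program):
            state |= VMState.HALT  # ran off the end of the script
            break
        opcode, operand = program[ip]
        gas_consumed += PRICE.get(opcode, 1)
        if gas_consumed > gas_amount:
            state |= VMState.FAULT  # out of gas
            return False, stack
        # "StepInto": execute exactly one instruction
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "LT":
            b = stack.pop()
            a = stack.pop()
            stack.append(a < b)
        elif opcode == "RET":
            state |= VMState.HALT
        else:
            state |= VMState.FAULT  # unknown opcode
        ip += 1
    return not (state & VMState.FAULT), stack


if __name__ == "__main__":
    program = [("PUSH", 1), ("PUSH", 2), ("LT", None), ("RET", None)]
    print(execute(program, gas_amount=10))
    print(execute(program, gas_amount=2))
```

With a budget of 2 the run faults on the third instruction, which is the same early exit the gas check in Execute performs.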

It is easy to see that the engine executes an AVM script much as a CPU does, fetching and executing one instruction at a time. Rather than following StepInto through every instruction, it is more useful to step out of the source code and look at the AVM instructions themselves. My contract script is:

54c56b6c766b00527ac4616168184e656f2e426c6f636b636861696e2e4765744865696768746168184e656f2e426c6f636b636861696e2e4765744865616465726c766b51527ac46c766b51c36168174e656f2e4865616465722e47657454696d657374616d7004d8d0a15a9f6c766b52527ac46c766b52c3640e00006c766b53527ac4620e00516c766b53527ac46203006c766b53c3616c7566

Converted to ASM with the NEL light wallet tool, the code is as follows:

0:PUSH4
1:NEWARRAY
2:TOALTSTACK
3:FROMALTSTACK
4:DUP
5:TOALTSTACK
6:PUSH0(false)
7:PUSH2
8:ROLL
9:SETITEM
a:NOP
b:NOP
c:SYSCALL[781011114666108111991079910497105110467110111672101105103104116]
26:NOP
27:SYSCALL[78101111466610811199107991049710511046711011167210197100101114]
41:FROMALTSTACK
42:DUP
43:TOALTSTACK
44:PUSH1(true)
45:PUSH2
46:ROLL
47:SETITEM
48:FROMALTSTACK
49:DUP
4a:TOALTSTACK
4b:PUSH1(true)
4c:PICKITEM
4d:NOP
4e:SYSCALL[7810111146721019710010111446711011168410510910111511697109112]
67:PUSHBYTES4[0xd8d0a15a]
6c:LT
6d:FROMALTSTACK
6e:DUP
6f:TOALTSTACK
70:PUSH2
71:PUSH2
72:ROLL
73:SETITEM
74:FROMALTSTACK
75:DUP
76:TOALTSTACK
77:PUSH2
78:PICKITEM
79:JMPIFNOT[14]
7c:PUSH0(false)
7d:FROMALTSTACK
7e:DUP
7f:TOALTSTACK
80:PUSH3
81:PUSH2
82:ROLL
83:SETITEM
84:JMP[14]
87:PUSH1(true)
88:FROMALTSTACK
89:DUP
8a:TOALTSTACK
8b:PUSH3
8c:PUSH2
8d:ROLL
8e:SETITEM
8f:JMP[3]
92:FROMALTSTACK
93:DUP
94:TOALTSTACK
95:PUSH3
96:PICKITEM
97:NOP
98:FROMALTSTACK
99:DROP
9a:RET

The avm2asm tool is at http://sdk.nel.group, and its source code is open on GitHub. Doesn't this disassembled code look a lot like ordinary assembly? The difference is that its instructions are not in the three-operand form of assembly. The official documentation explains this too: the virtual machine keeps its operands on a separate operand stack, so data is manipulated purely by push and pop operations and no addresses need to be specified. If I claimed I could walk through the whole contract execution one AVM instruction at a time, you would not believe me, and neither would I. If anyone wants to work through it, the meaning of each instruction can be looked up in neo-vm/OpCode.cs. Personally, since I have no intention of writing AVM scripts by hand, knowing that this is roughly how the process works is enough.
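For readers who do want to check the listing, the following Python sketch disassembles the contract script from this section. It is not the NEL tool: the opcode table only covers the opcodes that occur in this one script (values obtained by aligning the hex dump with the listing above), and `disassemble` is a hypothetical helper.

```python
from typing import List, Tuple

# Only the opcodes used by this script, not the full neo-vm table.
OPCODES = {
    0x00: "PUSH0", 0x51: "PUSH1", 0x52: "PUSH2", 0x53: "PUSH3", 0x54: "PUSH4",
    0x61: "NOP", 0x62: "JMP", 0x64: "JMPIFNOT", 0x66: "RET", 0x68: "SYSCALL",
    0x6B: "TOALTSTACK", 0x6C: "FROMALTSTACK", 0x75: "DROP", 0x76: "DUP",
    0x7A: "ROLL", 0x9F: "LT", 0xC3: "PICKITEM", 0xC4: "SETITEM", 0xC5: "NEWARRAY",
}

# The contract script from this article, split into pieces for readability.
SCRIPT_HEX = "".join([
    "54c56b6c766b00527ac46161",
    "68184e656f2e426c6f636b636861696e2e476574486569676874",
    "61",
    "68184e656f2e426c6f636b636861696e2e476574486561646572",
    "6c766b51527ac46c766b51c361",
    "68174e656f2e4865616465722e47657454696d657374616d70",
    "04d8d0a15a9f",
    "6c766b52527ac46c766b52c3640e0000",
    "6c766b53527ac4620e0051",
    "6c766b53527ac4620300",
    "6c766b53c3616c7566",
])


def disassemble(script: bytes) -> List[Tuple[int, str]]:
    """Return (offset, text) pairs for every instruction in the script."""
    out = []
    i = 0
    while i < len(script):
        start, op = i, script[i]
        i += 1
        if 0x01 <= op <= 0x4B:  # PUSHBYTESn: the next n bytes are data
            data = script[i:i + op]
            i += op
            text = f"PUSHBYTES{op}[0x{data.hex()}]"
        elif op == 0x68:  # SYSCALL: one length byte, then the name
            length = script[i]
            name = script[i + 1:i + 1 + length].decode("ascii")
            i += 1 + length
            text = f"SYSCALL[{name}]"
        elif op in (0x62, 0x64):  # jumps carry a 2-byte little-endian offset
            offset = int.from_bytes(script[i:i + 2], "little", signed=True)
            i += 2
            text = f"{OPCODES[op]}[{offset}]"
        else:
            text = OPCODES[op]
        out.append((start, text))
    return out


if __name__ == "__main__":
    for offset, text in disassemble(bytes.fromhex(SCRIPT_HEX)):
        print(f"{offset:x}:{text}")
```

The four bytes pushed at offset 0x67 are the lock timestamp in little-endian order: `int.from_bytes(bytes.fromhex("d8d0a15a"), "little")` gives 1520554200, the constant from the C# source.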

0x06 System call

The AVM code in the previous section contains three SYSCALL instructions, each carrying a byte array. Comparing with the IL code, it is clear that these three byte arrays hold the paths of the system calls. But how do they get there?

  • The first syscall's address is 781011114666108111991079910497105110467110111672101105103104116, which is 4e656f2e426c6f636b636861696e2e476574486569676874 in hexadecimal and decodes to Neo.Blockchain.GetHeight.
  • The second syscall's address is 78101111466610811199107991049710511046711011167210197100101114, which is 4e656f2e426c6f636b636861696e2e476574486561646572 in hexadecimal and decodes to Neo.Blockchain.GetHeader.
  • The third syscall's address is 7810111146721019710010111446711011168410510910111511697109112, which is 4e656f2e4865616465722e47657454696d657374616d70 in hexadecimal and decodes to Neo.Header.GetTimestamp.
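The decoding in the list above is easy to reproduce. The Python sketch below turns the three hexadecimal strings back into text; `decimal_dump` shows how the long decimal strings appear to arise, namely by printing each byte in decimal with no separators (an assumption about the tool's output format). Both function names are hypothetical.

```python
SYSCALL_HEX = [
    "4e656f2e426c6f636b636861696e2e476574486569676874",
    "4e656f2e426c6f636b636861696e2e476574486561646572",
    "4e656f2e4865616465722e47657454696d657374616d70",
]


def decode_name(hex_string: str) -> str:
    """Interpret the hex bytes as ASCII text."""
    return bytes.fromhex(hex_string).decode("ascii")


def decimal_dump(name: str) -> str:
    """Print every byte in decimal, concatenated without separators."""
    return "".join(str(b) for b in name.encode("ascii"))


if __name__ == "__main__":
    for hex_string in SYSCALL_HEX:
        name = decode_name(hex_string)
        print(name, decimal_dump(name))
```

Because the decimal form has no separators, it cannot be split back into bytes unambiguously, so the hexadecimal form is the one to work with.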

So the address of a system call is simply the path of the method we called in C#. The code that builds it is as follows:

Source code location: neo/Compiler/MSIL/ModuleConverter/_ConverterCall(OpCode src,NeoMethod to)

var bytes = Encoding.UTF8.GetBytes(callname);
if (bytes.Length > 252) throw new Exception("string is to long");
byte[] outbytes = new byte[bytes.Length + 1];
outbytes[0] = (byte)bytes.Length;
Array.Copy(bytes, 0, outbytes, 1, bytes.Length);
// bytes.Prepend does not compile under .NET Framework 4.6
_Convert1by1(VM.OpCode.SYSCALL, null, to, outbytes);
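The layout built here is a one-byte length followed by the name bytes, placed after the SYSCALL opcode. A minimal Python sketch of that layout follows, assuming the opcode value 0x68 seen in the contract script; `encode_syscall` and `read_syscall_name` are hypothetical helpers, not compiler or VM functions.

```python
SYSCALL = 0x68        # opcode byte as it appears in the contract script
MAX_NAME_BYTES = 252  # limit enforced by the compiler code above


def encode_syscall(name: str) -> bytes:
    """Emit SYSCALL + one length byte + the UTF-8 bytes of the name."""
    data = name.encode("utf-8")
    if len(data) > MAX_NAME_BYTES:
        raise ValueError("syscall name is too long")
    return bytes([SYSCALL, len(data)]) + data


def read_syscall_name(script: bytes, offset: int) -> str:
    """Read the name back from a SYSCALL instruction starting at `offset`."""
    length = script[offset + 1]
    return script[offset + 2:offset + 2 + length].decode("ascii")


if __name__ == "__main__":
    encoded = encode_syscall("Neo.Blockchain.GetHeight")
    print(encoded.hex())
    print(read_syscall_name(encoded, 0))
```

The printed hex matches the bytes at offset 0x0c of the contract script, which is a useful cross-check on the layout.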

As the code shows, the address of a SYSCALL instruction can be at most 252 bytes long. The code that executes the SYSCALL instruction is in the ExecutionEngine class of neo-vm:

Source code location: neo/vm/ExecutionEngine/ExecuteOp

case OpCode.SYSCALL:
    if (!service.Invoke(Encoding.ASCII.GetString(context.OpReader.ReadVarBytes(252)), this))
        State |= VMState.FAULT;
    break;

This calls the Invoke method and passes it the path of the system call. Let's follow it into Invoke:

Source code location: neo/vm/InteropService

internal bool Invoke(string method, ExecutionEngine engine)
{
    if (!dictionary.ContainsKey(method)) return false;
    return dictionary[method](engine);
}

As you can see, the address is used as a key to look up the corresponding method in a dictionary and execute it. The dictionary's contents are defined in the smart contract's StateReader class, which inherits from InteropService and registers the entries in its constructor:

Source code location: neo/SmartContract/StateReader

public StateReader()
{
    Register("Neo.Runtime.GetTrigger", Runtime_GetTrigger);
    Register("Neo.Runtime.CheckWitness", Runtime_CheckWitness);
    // many more Register calls omitted
    Register("Neo.Iterator.Next", Iterator_Next);
    Register("Neo.Iterator.Key", Iterator_Key);
    Register("Neo.Iterator.Value", Iterator_Value);
}

As for the return values of these system call methods, they are passed back through the ExecutionEngine object that each system call receives: the method pushes its result onto the engine's evaluation stack.
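Put together, the dispatch mechanism is a dictionary from names to handlers, where each handler reports success as a boolean and hands its result back through the engine. A toy Python version is sketched below; the `Toy*` classes are hypothetical stand-ins, and the fixed height value replaces a real blockchain lookup.

```python
from typing import Callable, Dict, List


class ToyEngine:
    """Stand-in for ExecutionEngine: only the evaluation stack is modeled."""

    def __init__(self) -> None:
        self.evaluation_stack: List[object] = []


class ToyInteropService:
    """Stand-in for InteropService: a name-to-handler dictionary."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[ToyEngine], bool]] = {}

    def register(self, method: str, handler: Callable[[ToyEngine], bool]) -> None:
        self._handlers[method] = handler

    def invoke(self, method: str, engine: ToyEngine) -> bool:
        handler = self._handlers.get(method)
        if handler is None:
            return False  # unknown name: the caller turns this into FAULT
        return handler(engine)


class ToyStateReader(ToyInteropService):
    """Registers its handlers in the constructor, like StateReader."""

    def __init__(self, height: int) -> None:
        super().__init__()
        self._height = height  # assumption: fixed value instead of chain state
        self.register("Neo.Blockchain.GetHeight", self._blockchain_get_height)

    def _blockchain_get_height(self, engine: ToyEngine) -> bool:
        engine.evaluation_stack.append(self._height)  # the "return value"
        return True


if __name__ == "__main__":
    engine = ToyEngine()
    reader = ToyStateReader(height=1234)
    print(reader.invoke("Neo.Blockchain.GetHeight", engine), engine.evaluation_stack)
    print(reader.invoke("Neo.Does.Not.Exist", engine))
```

Keeping the boolean for success and the stack for data is what lets the SYSCALL case in ExecuteOp fault on an unknown name without inspecting any result.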

Well, that is the general flow and principle of the NEO VM. Because this project touches on so many things, the article cannot go into detail on everything; please bear with me.
