Dissecting the EVM Bytecode of Solidity Contract Creation

1. Introduction

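In Ethereum, when a contract is created, the init code is sent as part of the transaction. Executing it returns the contract's actual bytecode, the runtime code. See Section 7 of the Ethereum Yellow Paper for details.

A transaction whose recipient is empty (the "to" field is null, which is not the same as sending to the zero address) is a contract creation transaction:

  • A contract creation transaction may carry a value, i.e. it can transfer funds to the new contract as part of creating it (in that case the Solidity constructor must be marked payable).
  • The init code in the transaction is executed, and what it returns is stored as the bytecode (runtime code) of the new contract. [The return uses the RETURN opcode, which takes its output from the VM's memory: the offset is the top stack item and the length is the second item from the top.]

If the contract has a constructor, it runs before the runtime code is copied and returned. For details, see the compilation flow in the Solidity source: https://github.com/ethereum/solidity/blob/ccdc11ea5b7b11cfcc3f01f7b7aaba79a116fc0e/libsolidity/codegen/ContractCompiler.cpp#L174

The EVM is a stack machine: operations pop their arguments from the stack and push their result back onto it. Each stack item is a 32-byte word in big-endian notation.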
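
As a rough illustration of the pop/operate/push cycle, here is a minimal Python sketch. The helper name add and the use of a Python list as the stack are assumptions made for this example only; they are not part of any EVM implementation.

```python
# Toy model of one EVM stack operation. The end of the list is the stack top.
WORD_MASK = (1 << 256) - 1  # stack items are 256-bit (32-byte) words


def add(stack):
    """Pop two words, push their sum modulo 2**256 (like the ADD opcode)."""
    a = stack.pop()
    b = stack.pop()
    stack.append((a + b) & WORD_MASK)


stack = []
stack.append(2)  # PUSH1 2
stack.append(4)  # PUSH1 4
add(stack)       # ADD

# A stack item written out as a 32-byte big-endian word.
result_word = stack[-1].to_bytes(32, "big")

# Arithmetic wraps around at 2**256.
overflow = [WORD_MASK, 1]
add(overflow)
```

After these steps stack holds the single word 6, and overflow holds 0 because the sum wraps around.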

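1.1 Why is init code needed?

A smart contract's constructor runs exactly once. Since it never runs again, there is no need to store it on chain, so the constructor logic lives in the init code rather than in the runtime code.

In addition, a user can send native currency to the contract when creating it, so the contract must check whether it is able to receive it. This check runs at deployment: if the creator sends native currency and the contract cannot accept it, the contract creation is reverted.

The init code is the complete script that is executed at deployment.

The init code prepares the contract and returns the runtime code, while the runtime code is the bytecode executed by every later transaction that calls the contract.

The init code:

  • May modify the state of the contract address in any way it needs (e.g. initialize some state variables).
  • Places the runtime code somewhere in memory.
  • Pushes the length of the runtime code onto the stack.
  • Pushes the memory offset of the runtime code (i.e. its start address in memory) onto the stack.
  • Executes the RETURN statement.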

2. runtime code VS init code

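Deploying an Ethereum smart contract means sending a data payload in a transaction with no recipient ("to" is null). The data consists of two parts:

  • 1) runtime code: the code the EVM executes when the contract is called.
  • 2) init code: sets up (constructs) the contract; the runtime code it returns is stored on chain. [The purpose of the init code at contract creation is to return the runtime code to the EVM so that the runtime code is stored on chain.]
{
  "to": null,
  "value": 0,
  "data": "<init_code><runtime_code>"
}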

The runtime code is what a block explorer usually shows for a contract. The EVM executes it every time the contract is called. The following contract adds the numbers 2 and 4 and returns the result:

60 02 // PUSH1 2 - Push 2 on the stack
60 04 // PUSH1 4 - Push 4 on the stack
01 // ADD - Pop both values and push their sum (6)

60 00 // PUSH1 0 - Push 0 on the stack (destination in memory)
52 // MSTORE - Store the 32-byte result at memory offset 0

60 20 // PUSH1 32 - Push 32 on the stack (length of data to return)
60 00 // PUSH1 0 - Push 0 on the stack (location in memory)
F3 // RETURN

Note that this contract is 13 bytes long.
Next, init code is needed to deploy it on chain:

  • 1) Copy the contract's runtime code into memory: [Note that in the transaction data the runtime code comes after the init code, so the position to copy from must be specified.]
    60 0D // PUSH1 13 (The length of our runtime code)
    60 0C // PUSH1 12 (The position of the runtime code in the transaction data)
    60 00 // PUSH1 00 (The destination in memory)
    39 // CODECOPY

  • 2) Return the runtime code that now sits in memory:
    60 0D // PUSH1 13 (The length of our runtime code)
    60 00 // PUSH1 00 (The memory location holding our runtime code)
    F3 // RETURN

The complete data of the contract creation transaction is:

0x600D600C600039600D6000F3600260040160005260206000F3
  |------ init code -----||----- runtime code -----|
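
To check the byte arithmetic, the whole flow can be replayed with a toy interpreter. The sketch below is a minimal Python model that covers only the five opcodes used here (PUSH1, ADD, MSTORE, CODECOPY, RETURN). The function name run is hypothetical, gas and error handling are ignored, and the store step is taken to be opcode 0x52 (MSTORE).

```python
def run(code: bytes) -> bytes:
    """Execute a tiny subset of EVM opcodes and return the RETURN data."""
    stack = []            # end of the list is the top of the stack
    memory = bytearray()  # byte-addressed, grows on demand
    pc = 0

    def grow(end: int) -> None:
        if len(memory) < end:
            memory.extend(bytes(end - len(memory)))

    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == 0x60:    # PUSH1: push the next code byte
            stack.append(code[pc])
            pc += 1
        elif op == 0x01:  # ADD
            stack.append((stack.pop() + stack.pop()) % 2**256)
        elif op == 0x52:  # MSTORE: top = offset, next = value
            offset, value = stack.pop(), stack.pop()
            grow(offset + 32)
            memory[offset:offset + 32] = value.to_bytes(32, "big")
        elif op == 0x39:  # CODECOPY: destOffset, offset, size
            dest, src, size = stack.pop(), stack.pop(), stack.pop()
            grow(dest + size)
            # Reads past the end of the code are zero-padded.
            memory[dest:dest + size] = code[src:src + size].ljust(size, b"\x00")
        elif op == 0xF3:  # RETURN: top = offset, next = size
            offset, size = stack.pop(), stack.pop()
            grow(offset + size)
            return bytes(memory[offset:offset + size])
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return b""


init_code = bytes.fromhex("600D600C600039600D6000F3")
runtime_code = bytes.fromhex("600260040160005260206000F3")
creation_data = init_code + runtime_code

# Deployment: running the creation data returns the code to store on chain.
deployed = run(creation_data)

# A later call: running the stored code returns the 32-byte word 6.
result = int.from_bytes(run(deployed), "big")
```

In this model, deployed is exactly the 13-byte runtime code and result is 6, which matches the offsets 0x0C and 0x0D used in the init code.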

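Continuing with the same contract, suppose its constructor now takes the parameters 2 and 4. The data then has the following format: [constructor arguments are appended after the runtime code; they are abbreviated to 4 bytes each here, whereas Solidity ABI-encodes each such argument as a 32-byte word]

{
  "to": null,
  "value": 0,
  "data": "<init_code><runtime_code>0000000200000004"
                                  // param1^|param2^
}

Then:

  • 1) In the init code: these arguments are copied into memory with CODECOPY.
  • 2) They are persisted into the contract's state with SSTORE.
  • 3) In the runtime code: SLOAD loads the numbers (2 and 4) from the contract's state onto the stack, and the addition is performed.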
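
The way the arguments sit at the tail of the creation data can be sketched in Python. The helper split_constructor_args and its parameters are hypothetical names for this example; arg_size defaults to 32 because Solidity ABI-encodes each uint256 argument as a 32-byte word, and it is set to 4 only to mirror the abbreviated payload above.

```python
def split_constructor_args(creation_data: bytes, arg_count: int, arg_size: int = 32):
    """Split creation data into (code, [arguments]) by cutting off the tail."""
    args_len = arg_count * arg_size
    split_at = len(creation_data) - args_len  # where init code would CODECOPY from
    code, tail = creation_data[:split_at], creation_data[split_at:]
    args = [
        int.from_bytes(tail[i:i + arg_size], "big")
        for i in range(0, args_len, arg_size)
    ]
    return code, args


# Stand-in for <init_code><runtime_code>; the exact bytes do not matter here.
code_part = bytes.fromhex("600D600C600039600D6000F3600260040160005260206000F3")

# Abbreviated 4-byte arguments, as in the payload shown above.
short_data = code_part + bytes.fromhex("0000000200000004")
code, args = split_constructor_args(short_data, arg_count=2, arg_size=4)

# The same two arguments as 32-byte words.
abi_data = code_part + (2).to_bytes(32, "big") + (4).to_bytes(32, "big")
abi_code, abi_args = split_constructor_args(abi_data, arg_count=2)
```

Both calls recover the arguments 2 and 4 and leave the code part unchanged.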

3. bytecode VS deployedBytecode

Take the Storage.sol contract from Remix as an example:

// SPDX-License-Identifier: GPL-3.0

pragma solidity >=0.7.0 <0.9.0;

/**
 * @title Storage
 * @dev Store & retrieve value in a variable
 * @custom:dev-run-script ./scripts/deploy_with_ethers.ts
 */
contract Storage {

    uint256 number;

    /**
     * @dev Store value in variable
     * @param num value to store
     */
    function store(uint256 num) public {
        number = num;
    }

    /**
     * @dev Return value 
     * @return value of 'number'
     */
    function retrieve() public view returns (uint256){
        return number;
    }
}

After compilation, artifacts/build-info/Storage.json contains:

"bytecode": {
	......
	"object": "608060405234801561001057600080fd5b50610150806100206000396000f3fe608060405234801561001057600080fd5b50600436106100365760003560e01c80632e64cec11461003b5780636057361d14610059575b600080fd5b610043610075565b60405161005091906100d9565b60405180910390f35b610073600480360381019061006e919061009d565b61007e565b005b60008054905090565b8060008190555050565b60008135905061009781610103565b92915050565b6000602082840312156100b3576100b26100fe565b5b60006100c184828501610088565b91505092915050565b6100d3816100f4565b82525050565b60006020820190506100ee60008301846100ca565b92915050565b6000819050919050565b600080fd5b61010c816100f4565b811461011757600080fd5b5056fea26469706673582212209a159a4f3847890f10bfb87871a61eba91c5dbf5ee3cf6398207e292eee22a1664736f6c63430008070033",
	"opcodes": "PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH2 0x10 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH2 0x150 DUP1 PUSH2 0x20 PUSH1 0x0 CODECOPY PUSH1 0x0 RETURN INVALID PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH2 0x10 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH1 0x4 CALLDATASIZE LT PUSH2 0x36 JUMPI PUSH1 0x0 CALLDATALOAD PUSH1 0xE0 SHR DUP1 PUSH4 0x2E64CEC1 EQ PUSH2 0x3B JUMPI DUP1 PUSH4 0x6057361D EQ PUSH2 0x59 JUMPI JUMPDEST PUSH1 0x0 DUP1 REVERT JUMPDEST PUSH2 0x43 PUSH2 0x75 JUMP JUMPDEST PUSH1 0x40 MLOAD PUSH2 0x50 SWAP2 SWAP1 PUSH2 0xD9 JUMP JUMPDEST PUSH1 0x40 MLOAD DUP1 SWAP2 SUB SWAP1 RETURN JUMPDEST PUSH2 0x73 PUSH1 0x4 DUP1 CALLDATASIZE SUB DUP2 ADD SWAP1 PUSH2 0x6E SWAP2 SWAP1 PUSH2 0x9D JUMP JUMPDEST PUSH2 0x7E JUMP JUMPDEST STOP JUMPDEST PUSH1 0x0 DUP1 SLOAD SWAP1 POP SWAP1 JUMP JUMPDEST DUP1 PUSH1 0x0 DUP2 SWAP1 SSTORE POP POP JUMP JUMPDEST PUSH1 0x0 DUP2 CALLDATALOAD SWAP1 POP PUSH2 0x97 DUP2 PUSH2 0x103 JUMP JUMPDEST SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH1 0x0 PUSH1 0x20 DUP3 DUP5 SUB SLT ISZERO PUSH2 0xB3 JUMPI PUSH2 0xB2 PUSH2 0xFE JUMP JUMPDEST JUMPDEST PUSH1 0x0 PUSH2 0xC1 DUP5 DUP3 DUP6 ADD PUSH2 0x88 JUMP JUMPDEST SWAP2 POP POP SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH2 0xD3 DUP2 PUSH2 0xF4 JUMP JUMPDEST DUP3 MSTORE POP POP JUMP JUMPDEST PUSH1 0x0 PUSH1 0x20 DUP3 ADD SWAP1 POP PUSH2 0xEE PUSH1 0x0 DUP4 ADD DUP5 PUSH2 0xCA JUMP JUMPDEST SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH1 0x0 DUP2 SWAP1 POP SWAP2 SWAP1 POP JUMP JUMPDEST PUSH1 0x0 DUP1 REVERT JUMPDEST PUSH2 0x10C DUP2 PUSH2 0xF4 JUMP JUMPDEST DUP2 EQ PUSH2 0x117 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP JUMP INVALID LOG2 PUSH5 0x6970667358 0x22 SLT KECCAK256 SWAP11 ISZERO SWAP11 0x4F CODESIZE SELFBALANCE DUP10 0xF LT 0xBF 0xB8 PUSH25 0x71A61EBA91C5DBF5EE3CF6398207E292EEE22A1664736F6C63 NUMBER STOP ADDMOD SMOD STOP CALLER ",
	......
},
"deployedBytecode": {
	......
	"object": "608060405234801561001057600080fd5b50600436106100365760003560e01c80632e64cec11461003b5780636057361d14610059575b600080fd5b610043610075565b60405161005091906100d9565b60405180910390f35b610073600480360381019061006e919061009d565b61007e565b005b60008054905090565b8060008190555050565b60008135905061009781610103565b92915050565b6000602082840312156100b3576100b26100fe565b5b60006100c184828501610088565b91505092915050565b6100d3816100f4565b82525050565b60006020820190506100ee60008301846100ca565b92915050565b6000819050919050565b600080fd5b61010c816100f4565b811461011757600080fd5b5056fea26469706673582212209a159a4f3847890f10bfb87871a61eba91c5dbf5ee3cf6398207e292eee22a1664736f6c63430008070033",
	"opcodes": "PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH2 0x10 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH1 0x4 CALLDATASIZE LT PUSH2 0x36 JUMPI PUSH1 0x0 CALLDATALOAD PUSH1 0xE0 SHR DUP1 PUSH4 0x2E64CEC1 EQ PUSH2 0x3B JUMPI DUP1 PUSH4 0x6057361D EQ PUSH2 0x59 JUMPI JUMPDEST PUSH1 0x0 DUP1 REVERT JUMPDEST PUSH2 0x43 PUSH2 0x75 JUMP JUMPDEST PUSH1 0x40 MLOAD PUSH2 0x50 SWAP2 SWAP1 PUSH2 0xD9 JUMP JUMPDEST PUSH1 0x40 MLOAD DUP1 SWAP2 SUB SWAP1 RETURN JUMPDEST PUSH2 0x73 PUSH1 0x4 DUP1 CALLDATASIZE SUB DUP2 ADD SWAP1 PUSH2 0x6E SWAP2 SWAP1 PUSH2 0x9D JUMP JUMPDEST PUSH2 0x7E JUMP JUMPDEST STOP JUMPDEST PUSH1 0x0 DUP1 SLOAD SWAP1 POP SWAP1 JUMP JUMPDEST DUP1 PUSH1 0x0 DUP2 SWAP1 SSTORE POP POP JUMP JUMPDEST PUSH1 0x0 DUP2 CALLDATALOAD SWAP1 POP PUSH2 0x97 DUP2 PUSH2 0x103 JUMP JUMPDEST SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH1 0x0 PUSH1 0x20 DUP3 DUP5 SUB SLT ISZERO PUSH2 0xB3 JUMPI PUSH2 0xB2 PUSH2 0xFE JUMP JUMPDEST JUMPDEST PUSH1 0x0 PUSH2 0xC1 DUP5 DUP3 DUP6 ADD PUSH2 0x88 JUMP JUMPDEST SWAP2 POP POP SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH2 0xD3 DUP2 PUSH2 0xF4 JUMP JUMPDEST DUP3 MSTORE POP POP JUMP JUMPDEST PUSH1 0x0 PUSH1 0x20 DUP3 ADD SWAP1 POP PUSH2 0xEE PUSH1 0x0 DUP4 ADD DUP5 PUSH2 0xCA JUMP JUMPDEST SWAP3 SWAP2 POP POP JUMP JUMPDEST PUSH1 0x0 DUP2 SWAP1 POP SWAP2 SWAP1 POP JUMP JUMPDEST PUSH1 0x0 DUP1 REVERT JUMPDEST PUSH2 0x10C DUP2 PUSH2 0xF4 JUMP JUMPDEST DUP2 EQ PUSH2 0x117 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP JUMP INVALID LOG2 PUSH5 0x6970667358 0x22 SLT KECCAK256 SWAP11 ISZERO SWAP11 0x4F CODESIZE SELFBALANCE DUP10 0xF LT 0xBF 0xB8 PUSH25 0x71A61EBA91C5DBF5EE3CF6398207E292EEE22A1664736F6C63 NUMBER STOP ADDMOD SMOD STOP CALLER ",
	......
},

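Here, deployedBytecode corresponds to the runtime code of Section 2, and bytecode corresponds to the init code + runtime code of Section 2.
In this example the init code is "608060405234801561001057600080fd5b50610150806100206000396000f3fe", and the corresponding opcodes are: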

PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH2 0x10 JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH2 0x150 DUP1 PUSH2 0x20 PUSH1 0x0 CODECOPY PUSH1 0x0 RETURN INVALID 
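
The mapping from bytes to opcodes can be reproduced with a small disassembler. This Python sketch is only an illustration: the OPCODES table lists just the mnemonics that occur in this init code, and the function name disassemble is hypothetical.

```python
# Mnemonics for the single-byte opcodes that appear in this init code.
OPCODES = {
    0x15: "ISZERO", 0x34: "CALLVALUE", 0x39: "CODECOPY", 0x50: "POP",
    0x52: "MSTORE", 0x57: "JUMPI", 0x5B: "JUMPDEST", 0x80: "DUP1",
    0xF3: "RETURN", 0xFD: "REVERT", 0xFE: "INVALID",
}


def disassemble(code: bytes):
    """Return a list of (byte_offset, mnemonic) pairs."""
    listing = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1..PUSH32 carry 1..32 operand bytes
            n = op - 0x5F
            operand = int.from_bytes(code[pc + 1:pc + 1 + n], "big")
            listing.append((pc, f"PUSH{n} 0x{operand:X}"))
            pc += 1 + n
        else:
            listing.append((pc, OPCODES.get(op, f"UNKNOWN_0x{op:02X}")))
            pc += 1
    return listing


INIT_CODE = bytes.fromhex(
    "608060405234801561001057600080fd5b50610150806100206000396000f3fe"
)
listing = disassemble(INIT_CODE)
text = " ".join(mnemonic for _, mnemonic in listing)

for offset, mnemonic in listing:
    print(f"0x{offset:02X}  {mnemonic}")
```

The printed offsets are byte positions in the code, which is what jump targets refer to. In this listing the JUMPDEST sits at byte offset 0x10, and the init code is 32 (0x20) bytes long.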

According to Layout in Memory [7], Solidity reserves four 32-byte slots:

  • 1) 0x00-0x3f (64 bytes): scratch space for hashing methods. The scratch space can be used between statements (i.e. within inline assembly).
  • 2) 0x40-0x5f (32 bytes): the currently allocated memory size (also known as the free memory pointer).
  • 3) 0x60-0x7f (32 bytes): the zero slot, which is used as the initial value for dynamic memory arrays and should never be written to. The free memory pointer therefore initially points to 0x80.

Hence the first bytes of the init code are 6080604052, which correspond to: [initializing the value at memory address 0x40 to 0x80]

 PUSH1 0x80 
 PUSH1 0x40 
 MSTORE

That is, the value 0x80 is stored at memory address 0x40.
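
The effect of this MSTORE on memory can be modelled in a few lines of Python. The helpers mstore and mload are simplified stand-ins written for this example (a growable bytearray, no gas accounting) and only approximate the real opcodes.

```python
FREE_MEMORY_POINTER_SLOT = 0x40  # reserved slot 0x40-0x5f
ZERO_SLOT = 0x60                 # reserved slot 0x60-0x7f


def mstore(memory: bytearray, offset: int, value: int) -> None:
    """Write value as a 32-byte big-endian word at offset, growing memory."""
    end = offset + 32
    if len(memory) < end:
        memory.extend(bytes(end - len(memory)))
    memory[offset:end] = value.to_bytes(32, "big")


def mload(memory: bytearray, offset: int) -> int:
    """Read a 32-byte word; bytes beyond the current size read as zero."""
    word = bytes(memory[offset:offset + 32]).ljust(32, b"\x00")
    return int.from_bytes(word, "big")


memory = bytearray()
# PUSH1 0x80, PUSH1 0x40, MSTORE: offset 0x40 is on top, value 0x80 below it.
mstore(memory, FREE_MEMORY_POINTER_SLOT, 0x80)

free_ptr = mload(memory, FREE_MEMORY_POINTER_SLOT)
zero_slot = mload(memory, ZERO_SLOT)
```

Because the word is big-endian, the byte 0x80 ends up at address 0x5f, the last byte of the slot, and the zero slot still reads as 0.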

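Next: [CALLVALUE reads the value sent with the transaction and pushes it onto the stack.]

CALLVALUE 
DUP1

The stack now contains (top of the stack on the left): | value | value |

ISZERO takes the top stack item and replaces it with 1 if it is 0, or with 0 if it is non-zero. The stack now contains: | value == 0 | value |

After PUSH2 0x10, the stack contains: | 0x10 | value == 0 | value |

JUMPI is a conditional jump: if the second item from the top is non-zero, execution jumps to the offset given by the top item; otherwise it continues with the next instruction. In this example, if the value of the creation transaction is 0, execution jumps to byte offset 0x10 of the code (counting from 0), which is a JUMPDEST, and then continues with the opcodes that follow. Otherwise it falls through to PUSH1 0x0 DUP1 REVERT and the creation is reverted. JUMPI pops both of its arguments, so the stack now contains: | value |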
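
The value check can be traced step by step in Python. This is a sketch under simplifying assumptions: the function name constructor_value_guard is hypothetical, the stack is a Python list whose end is the top (the reverse of the left-to-right notation used in the text), and the fall-through path is reported simply as a revert.

```python
def constructor_value_guard(callvalue: int):
    """Trace CALLVALUE DUP1 ISZERO PUSH2 0x10 JUMPI for a given value.

    Returns (outcome, jump_target, remaining_stack).
    """
    stack = []
    stack.append(callvalue)                     # CALLVALUE
    stack.append(stack[-1])                     # DUP1
    stack.append(1 if stack.pop() == 0 else 0)  # ISZERO
    stack.append(0x10)                          # PUSH2 0x10
    target = stack.pop()                        # JUMPI pops the destination...
    condition = stack.pop()                     # ...and then the condition
    if condition != 0:
        return "jump", target, stack            # continue at the JUMPDEST
    return "revert", None, stack                # PUSH1 0x0 DUP1 REVERT follows


no_value = constructor_value_guard(0)
with_value = constructor_value_guard(1)
```

With a value of 0 the trace jumps to 0x10 and leaves the value on the stack. With any non-zero value it falls through to the REVERT, which is how a non-payable constructor rejects funds.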

After POP, the stack is empty: | |

After PUSH2 0x150 DUP1 PUSH2 0x20 PUSH1 0x0, the stack contains: | 0x0 | 0x20 | 0x150 | 0x150 |

CODECOPY interprets the top three stack items as | destOffset | offset | size |:

  • destOffset: the byte offset in memory where the result will be copied. [The memory offset the runtime code is copied to, 0x0 in this example.]
  • offset: the byte offset in the code to copy from. [In this example the init code is 32 bytes long, so the runtime code starts at offset 0x20.]
  • size: the number of bytes to copy. [In this example the runtime code is 336 bytes long, so size is 0x150.]

After CODECOPY, the stack contains: | 0x150 |

After PUSH1 0x0, the stack contains: | 0x0 | 0x150 |

RETURN halts execution and returns output data, interpreting the top two stack items as | offset | size |: [in this example, what is returned is the runtime code that was just copied into memory]

  • offset: the byte offset in memory of the data that becomes the return data of this context.
  • size: the number of bytes to return (i.e. the size of the return data).
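
These last two steps can be replayed in Python to confirm that the offsets line up. The sketch starts from the stack state reached just before CODECOPY and uses a 336-byte placeholder in place of the real runtime code, so RUNTIME_PLACEHOLDER is an assumption of this example and not the actual Storage bytecode.

```python
INIT_CODE = bytes.fromhex(
    "608060405234801561001057600080fd5b50610150806100206000396000f3fe"
)
# Placeholder standing in for the 0x150-byte runtime code (not the real bytes).
RUNTIME_PLACEHOLDER = bytes(range(256)) + bytes(0x150 - 256)
code = INIT_CODE + RUNTIME_PLACEHOLDER

memory = bytearray(0x60)            # memory already holds the 0x40 slot
stack = [0x150, 0x150, 0x20, 0x0]   # end of the list is the top of the stack

# CODECOPY: pops destOffset, offset, size (in that order, from the top).
dest, offset, size = stack.pop(), stack.pop(), stack.pop()
if len(memory) < dest + size:
    memory.extend(bytes(dest + size - len(memory)))
memory[dest:dest + size] = code[offset:offset + size]

stack.append(0x0)                   # PUSH1 0x0

# RETURN: pops offset, size and hands that memory range back to the EVM.
ret_offset, ret_size = stack.pop(), stack.pop()
returned = bytes(memory[ret_offset:ret_offset + ret_size])
```

In this model returned equals the placeholder byte for byte, and the stack is empty once RETURN has consumed its two arguments.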

References

[1] Ethereum Contract Creation - Explained from bytecode
[2] OpenZeppelin blog series on deconstructing a Solidity contract: Deconstructing a Solidity Contract — Part II: Creation vs. Runtime
[3] What is the difference between bytecode, init code, deployed bytecode, creation bytecode, and runtime bytecode?
[4] Understanding Bytecode on Ethereum
[5] A deep-dive into Solidity – contract creation and the init code
[6] The difference between bytecode and deployed bytecode
[7] Layout in Memory
[8] A deep-dive into Solidity – function selectors, encoding and state variables

Reposted from blog.csdn.net/mutourend/article/details/127365619