I. The state tree
Ethereum uses an account-based ledger, so it must maintain a mapping from each account address to that account's state.
We need a data structure suited to this requirement. Consider the options:
- If the state is stored in a hash table, looking up and updating account states is very efficient. But the state data would then exist only in the block body, so light nodes (which keep only block headers) could not verify it with a Merkle proof. We therefore consider building a Merkle tree;
- If the account data is simply organized into an unsorted Merkle tree, every block would have to publish all accounts so that every node computes the same root hash, and at that order of magnitude this is not feasible. If each block publishes only the accounts whose state changed, nodes may arrive at different root hashes and cannot reach consensus;
- If a sorted Merkle tree is used, every node computes the same root hash, but adding an account means rebuilding the Merkle tree, which is too costly. In addition, a Merkle tree does not support fast lookup and update of state data. We therefore need a different data structure: the Merkle Patricia trie.
1. trie
A trie, also called a prefix tree or dictionary tree, makes information retrieval convenient. Given the words General, Genesis, Go, God, and Good, the trie is built as follows:
A trie has the following characteristics:
- The branching factor of each node depends on the range of key elements. In the example above there are 26 English letters plus an end flag, so at most 27 branches. In Ethereum, account addresses are written in hexadecimal (0~F), so together with the end flag there are at most 17 branches;
- Lookup efficiency depends on the length of the key: the longer the key, the more nodes are visited. An Ethereum account address is a 40-digit hexadecimal number, so every lookup has a fixed length of 40;
- Collisions cannot occur, whereas a hash table has the problem of collisions;
- For a given set of inputs, the resulting trie is always the same, regardless of insertion order;
- Updating data is cheap: only the affected branch needs to be visited.
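The properties above can be illustrated with a minimal trie sketch in Go. It is not Ethereum's implementation: the names `trieNode`, `Insert`, and `Contains` are hypothetical, keys are lowercase words, and a `terminal` field plays the role of the end flag.

```go
package main

import "fmt"

// trieNode is one node of a plain (uncompressed) trie.
// All names in this sketch are illustrative assumptions.
type trieNode struct {
	children map[rune]*trieNode // one branch per distinct next character
	terminal bool               // end flag: some key ends at this node
}

func newTrieNode() *trieNode {
	return &trieNode{children: make(map[rune]*trieNode)}
}

// Insert walks the key one character at a time, creating nodes as needed.
// Only the branch belonging to this key is touched.
func (n *trieNode) Insert(key string) {
	cur := n
	for _, ch := range key {
		next, ok := cur.children[ch]
		if !ok {
			next = newTrieNode()
			cur.children[ch] = next
		}
		cur = next
	}
	cur.terminal = true
}

// Contains reports whether key was inserted, and how many nodes below the
// root were visited. The number of visits is bounded by the key length.
func (n *trieNode) Contains(key string) (found bool, steps int) {
	cur := n
	for _, ch := range key {
		next, ok := cur.children[ch]
		if !ok {
			return false, steps
		}
		cur = next
		steps++
	}
	return cur.terminal, steps
}

func main() {
	root := newTrieNode()
	for _, w := range []string{"general", "genesis", "go", "god", "good"} {
		root.Insert(w)
	}
	found, steps := root.Contains("god")
	fmt.Println(found, steps) // prints: true 3
	found, steps = root.Contains("gen")
	fmt.Println(found, steps) // prints: false 3 ("gen" is only a prefix)
}
```

Because the position of a key depends only on its characters, inserting the same words in any order produces the same structure.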
2. Patricia tree(trie)
A Patricia tree, also called a path-compressed prefix tree, saves storage space and also reduces the number of nodes visited per lookup, which improves search efficiency. Compressing the trie from the example above into a Patricia tree gives the result shown below:
The sparser the distribution of keys, the more pronounced the compression effect of a Patricia tree, as shown below:
In Ethereum, account addresses are 160 bits long to prevent collisions, so they are very sparsely distributed, which makes the Patricia tree a suitable data structure.
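Path compression can be sketched as follows. This is a simplified illustration under assumed names (`radixNode`, `radixEdge`, `commonPrefixLen`), not the structure Ethereum actually uses: each edge carries a whole substring, and an edge is split only where two keys diverge.

```go
package main

import "fmt"

// radixEdge carries a whole substring instead of a single character.
type radixEdge struct {
	label string
	child *radixNode
}

// radixNode is a node of a path-compressed trie (hypothetical names).
type radixNode struct {
	children map[byte]*radixEdge // keyed by the first byte of the edge label
	terminal bool
}

func newRadixNode() *radixNode {
	return &radixNode{children: map[byte]*radixEdge{}}
}

func commonPrefixLen(a, b string) int {
	i := 0
	for i < len(a) && i < len(b) && a[i] == b[i] {
		i++
	}
	return i
}

// Insert follows matching edges and splits an edge where the key diverges.
func (n *radixNode) Insert(key string) {
	cur := n
	for len(key) > 0 {
		e, ok := cur.children[key[0]]
		if !ok {
			// No edge starts with this byte: store the rest of the key on one edge.
			leaf := newRadixNode()
			leaf.terminal = true
			cur.children[key[0]] = &radixEdge{label: key, child: leaf}
			return
		}
		k := commonPrefixLen(key, e.label)
		if k < len(e.label) {
			// Split: keep the shared prefix on this edge, push the rest down.
			mid := newRadixNode()
			mid.children[e.label[k]] = &radixEdge{label: e.label[k:], child: e.child}
			e.label = e.label[:k]
			e.child = mid
		}
		cur = e.child
		key = key[k:]
	}
	cur.terminal = true
}

// Contains reports whether key is stored and how many edges were followed.
func (n *radixNode) Contains(key string) (found bool, hops int) {
	cur := n
	for len(key) > 0 {
		e, ok := cur.children[key[0]]
		if !ok || len(key) < len(e.label) || key[:len(e.label)] != e.label {
			return false, hops
		}
		key = key[len(e.label):]
		cur = e.child
		hops++
	}
	return cur.terminal, hops
}

func main() {
	root := newRadixNode()
	for _, w := range []string{"general", "genesis", "go", "god", "good"} {
		root.Insert(w)
	}
	found, hops := root.Contains("genesis")
	fmt.Println(found, hops) // prints: true 3 (edges "g", "ene", "sis")
}
```

In an uncompressed trie the same lookup visits seven nodes, one per character; here it follows three edges.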
3. Merkle Patricia tree (trie)
Replacing every ordinary pointer in a Patricia tree with a hash pointer turns it into a Merkle Patricia tree. Its root hash can then be computed and stored in the block header.
- The root hash guarantees that the tree has not been tampered with, and therefore that the state of every account has not been tampered with;
- A Merkle proof can prove the state of any single account, such as its balance;
- A Merkle proof can also prove that an account does not exist.
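The role of hash pointers can be shown with a toy example: each node's hash covers its own value and the hashes of its children, so any change to a leaf propagates up to the root. The sketch below is built on assumptions. It uses SHA-256 from the Go standard library (Ethereum itself uses Keccak-256), an ad hoc encoding, and the hypothetical names `hashNode`, `leaf`, and `buildState`. It does not construct Merkle proofs.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

// hashNode is a toy tree node; a parent commits to its children by hash.
type hashNode struct {
	value    []byte
	children map[byte]*hashNode
}

// Hash digests the node's value followed by (key, child hash) pairs in
// ascending key order, so the result does not depend on map iteration order.
func (n *hashNode) Hash() [32]byte {
	h := sha256.New()
	var size [8]byte
	binary.BigEndian.PutUint64(size[:], uint64(len(n.value)))
	h.Write(size[:]) // length prefix keeps the toy encoding unambiguous
	h.Write(n.value)

	keys := make([]int, 0, len(n.children))
	for k := range n.children {
		keys = append(keys, int(k))
	}
	sort.Ints(keys)
	for _, k := range keys {
		childHash := n.children[byte(k)].Hash()
		h.Write([]byte{byte(k)})
		h.Write(childHash[:])
	}

	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

func leaf(balance string) *hashNode {
	return &hashNode{value: []byte(balance)}
}

// buildState builds a two-account toy state under made-up keys 0x0a and 0x0b.
func buildState(balanceA, balanceB string) *hashNode {
	return &hashNode{children: map[byte]*hashNode{
		0x0a: leaf(balanceA),
		0x0b: leaf(balanceB),
	}}
}

func main() {
	before := buildState("100", "250").Hash()
	same := buildState("100", "250").Hash()
	after := buildState("100", "251").Hash()
	fmt.Println("identical state, identical root:", before == same)
	fmt.Println("changed balance, changed root:", before != after)
}
```

Both lines print `true`: the root hash alone is enough to detect that any account in the tree was modified.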
4. Modified Merkle Patricia tree(trie)
Ethereum uses a modified version of the Merkle Patricia tree, which is not essentially different from the plain Merkle Patricia tree. For example, Figure 7 shows four addresses whose stored values are account balances. The tree contains three kinds of nodes, and each stored node has an associated hash value:
- Extension Node: stores a compressed path, i.e. the nibbles (hexadecimal digits) shared by the keys below it;
- Branch Node: a point where keys diverge, which cannot be compressed;
- Leaf Node: stores the account state data.
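To make the three node kinds concrete, here is a rough sketch of how they could be modeled, together with the conversion of a 20-byte address into the 40 nibbles used as the key path. The type names, field names, and the `toNibbles` helper are assumptions for illustration and do not reproduce go-ethereum's actual types; the address is an arbitrary made-up value.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// node is a placeholder interface for the three node kinds (illustrative only).
type node interface{}

// extensionNode stores a run of nibbles shared by every key below it.
type extensionNode struct {
	sharedNibbles []byte
	next          node
}

// branchNode has one slot per hexadecimal digit plus an optional value,
// matching the "at most 17 branches" counted earlier.
type branchNode struct {
	children [16]node
	value    []byte
}

// leafNode stores the remaining nibbles of a key and the account state.
type leafNode struct {
	keyEnd []byte
	value  []byte
}

// toNibbles splits every byte into two 4-bit values, high nibble first.
func toNibbles(key []byte) []byte {
	out := make([]byte, 0, len(key)*2)
	for _, b := range key {
		out = append(out, b>>4, b&0x0f)
	}
	return out
}

func main() {
	// An arbitrary 20-byte (160-bit) address used only as sample input.
	addr, err := hex.DecodeString("00112233445566778899aabbccddeeff00112233")
	if err != nil {
		panic(err)
	}
	path := toNibbles(addr)
	fmt.Println(len(addr), len(path)) // prints: 20 40
	fmt.Println(path[:6])             // prints: [0 0 1 1 2 2]
}
```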
In addition, each time a new block is published, the state of some accounts changes. The new block builds new branches only for the accounts that changed; everywhere else it points to the nodes of the previous block's tree, so consecutive blocks share most of their branches. As shown below:
The benefit of preserving historical state: when a temporary fork occurs, the blocks on the branch that did not win must be rolled back. Because the execution of a smart contract cannot easily be reversed from its result, keeping a record of the starting state makes the rollback much more convenient.
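The sharing of branches between blocks is a form of path copying. A minimal sketch, assuming an uncompressed 16-way trie and the hypothetical names `pnode`, `put`, and `get`: an update copies only the nodes along the changed key's path, and the old root stays valid for rollback.

```go
package main

import "fmt"

// pnode is a node of a persistent (immutable) 16-way trie; names are illustrative.
type pnode struct {
	children [16]*pnode
	value    string
	hasValue bool
}

// put returns a new root that contains key -> value. Only nodes on the path
// of key are copied; all other subtrees are shared with the old version.
func put(n *pnode, key []byte, value string) *pnode {
	var cp pnode
	if n != nil {
		cp = *n // shallow copy: child pointers still refer to the old nodes
	}
	if len(key) == 0 {
		cp.value, cp.hasValue = value, true
		return &cp
	}
	cp.children[key[0]] = put(cp.children[key[0]], key[1:], value)
	return &cp
}

// get looks up key starting from the given root.
func get(n *pnode, key []byte) (string, bool) {
	for _, k := range key {
		if n == nil {
			return "", false
		}
		n = n.children[k]
	}
	if n == nil || !n.hasValue {
		return "", false
	}
	return n.value, true
}

func main() {
	// State after "block 1": two accounts under made-up nibble keys.
	v1 := put(put(nil, []byte{1, 2}, "10"), []byte{3, 4}, "20")
	// State after "block 2": only the first account changes.
	v2 := put(v1, []byte{1, 2}, "15")

	old, _ := get(v1, []byte{1, 2})
	cur, _ := get(v2, []byte{1, 2})
	fmt.Println(old, cur)                           // prints: 10 15
	fmt.Println(v1.children[3] == v2.children[3])   // prints: true (shared branch)
}
```

Rolling back to the earlier state amounts to using the old root `v1` again; nothing has to be recomputed in reverse.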
5. Account state value storage
Account state data is serialized with RLP (Recursive Length Prefix) before it is stored. Compared with protobuf, RLP serialization is much simpler: it supports only nested arrays of bytes, which makes it relatively easy to implement.
6. Block Code Analysis
The block header is defined by the Header structure shown below. Its fields have the following meanings:
- ParentHash: the hash of the parent block;
- UncleHash: the hash of the uncle blocks;
- Coinbase: the account address of the miner;
- Root: the root hash of the state tree;
- TxHash: the root hash of the transaction tree;
- ReceiptHash: the root hash of the receipt tree;
- Bloom: a Bloom filter, used to query efficiently for results that meet certain conditions;
- Difficulty: the mining difficulty, adjusted as needed;
- GasLimit and GasUsed: related to gas;
- Time: the approximate time at which the block was generated;
- MixDigest: related to the mining process, a hash obtained from the nonce through a series of calculations;
- Nonce: the random number that is the answer to the mining puzzle.
// Header represents a block header in the Ethereum blockchain.
type Header struct {
	ParentHash  common.Hash    `json:"parentHash"       gencodec:"required"`
	UncleHash   common.Hash    `json:"sha3Uncles"       gencodec:"required"`
	Coinbase    common.Address `json:"miner"            gencodec:"required"`
	Root        common.Hash    `json:"stateRoot"        gencodec:"required"`
	TxHash      common.Hash    `json:"transactionsRoot" gencodec:"required"`
	ReceiptHash common.Hash    `json:"receiptsRoot"     gencodec:"required"`
	Bloom       Bloom          `json:"logsBloom"        gencodec:"required"`
	Difficulty  *big.Int       `json:"difficulty"       gencodec:"required"`
	Number      *big.Int       `json:"number"           gencodec:"required"`
	GasLimit    uint64         `json:"gasLimit"         gencodec:"required"`
	GasUsed     uint64         `json:"gasUsed"          gencodec:"required"`
	Time        uint64         `json:"timestamp"        gencodec:"required"`
	Extra       []byte         `json:"extraData"        gencodec:"required"`
	MixDigest   common.Hash    `json:"mixHash"`
	Nonce       BlockNonce     `json:"nonce"`
}
The Block structure is shown below:
header is a pointer to the block header (Header), uncles is a list of pointers to the headers of the uncle blocks, and transactions is the list of transactions in the block.
// Block represents an entire block in the Ethereum blockchain.
type Block struct {
	header       *Header
	uncles       []*Header
	transactions Transactions

	// caches
	hash atomic.Value
	size atomic.Value

	// Td is used by package core to store the total difficulty
	// of the chain up to and including the block.
	td *big.Int

	// These fields are used by package eth to track
	// inter-peer block relay.
	ReceivedAt   time.Time
	ReceivedFrom interface{}
}
Block information published to the network:
extblock is the form in which a block is actually published to the network. It contains the block header, the list of transactions, and the uncle block headers.
// "external" block encoding. used for eth protocol, etc.
type extblock struct {
	Header *Header
	Txs    []*Transaction
	Uncles []*Header
}