Ethereum 16: data structures and the state tree

------ >> second read-through divider.

Ethereum data structures

An Ethereum account address is 160 bits (20 bytes) long and is usually written as 40 hexadecimal digits.
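The size arithmetic can be checked with a minimal sketch. The random bytes are only a stand-in for a real address, and the variable names are illustrative.

```python
import os

ADDRESS_BITS = 160

# 160 bits / 8 bits per byte = 20 bytes (random stand-in, not a real address).
address = os.urandom(ADDRESS_BITS // 8)

# Each byte is written as two hexadecimal digits: 20 * 2 = 40.
hex_form = address.hex()

print(len(address), len(hex_form))
```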

What if the account state were kept in a simple hash table?

Lookups and updates take constant time, but a plain hash table cannot produce a Merkle proof. For example, before signing a contract one party may have to prove its account balance. How could that proof be provided?

  • One approach is to organize the elements of the hash table into a Merkle tree, compute the root hash, store it in the block header, and publish it.

    • Problem: each newly published block changes the contents of the hash table, so the Merkle tree would have to be recomputed over all accounts. With so many accounts, that cost is too high. In practice only a small fraction of accounts change in any block and most users' state stays the same, so rebuilding the whole tree every time is wasteful.

    • Bitcoin also builds a Merkle tree for every block, so why does it not have this problem? In Bitcoin the Merkle tree is assembled from the transactions in the block, and once built it never changes. How many transactions does a block hold? With a 1 MB block and roughly 250 bytes per transaction, at most about 4000; in practice many blocks hold only a few hundred. So each published block builds a new Merkle tree over hundreds to a few thousand transactions. If Ethereum took the same approach, it would have to traverse every account to rebuild the tree. Besides providing Merkle proofs, the Merkle tree has another important role: keeping the state consistent across full nodes. That is also why Bitcoin writes the root hash into the block header.
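The rebuild cost described above can be sketched as follows. This is a toy illustration under stated assumptions: SHA-256 stands in for the real hash functions, the accounts and the `address:balance` leaf format are hypothetical, and `hash_calls` is only a counter added for the demonstration.

```python
import hashlib
from typing import Sequence

hash_calls = 0  # counts hash evaluations, for illustration only


def h(data: bytes) -> bytes:
    """SHA-256 as a stand-in hash function (an assumption of this sketch)."""
    global hash_calls
    hash_calls += 1
    return hashlib.sha256(data).digest()


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Hash the leaves, then combine hashes pairwise up to a single root."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:            # odd count: duplicate the last hash
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def leaf(address: str, balance: int) -> bytes:
    """Hypothetical leaf encoding for an account."""
    return f"{address}:{balance}".encode()


accounts = {"a1": 10, "b2": 20, "c3": 30, "d4": 40}  # toy accounts
root_before = merkle_root([leaf(a, b) for a, b in accounts.items()])

accounts["c3"] = 31   # a single balance changes in the next block
hash_calls = 0
root_after = merkle_root([leaf(a, b) for a, b in accounts.items()])
calls_for_rebuild = hash_calls  # every leaf is hashed again, not just "c3"
```

Even though only one account changed, the rebuild hashes all four leaves plus every internal node. With millions of accounts the same full pass would be repeated for every block.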

A question of my own: why would Ethereum need to traverse all accounts to build the Merkle tree? Could it not, like Bitcoin, build the tree only over the accounts involved in the block?

  • Another drawback of the method above is that it offers no efficient lookup or update. Recall how Bitcoin's Merkle tree is built: the bottom layer holds the transactions, and hashes are combined pairwise all the way up to the root. That structure gives Ethereum no fast way to find or update an account. Also, if accounts are placed directly in a Merkle tree, should the tree be sorted?

    • 1 If it is not sorted, each full node may order the accounts differently, so the nodes cannot agree on one Merkle tree.

    • 2 Bitcoin's tree is not sorted either, so why does Bitcoin not have this problem? Each Bitcoin full node may also assemble a different Merkle tree, but in Bitcoin the node that wins the right to record the block has the final say: the published block fixes the order. For Ethereum the equivalent would be publishing the account states in the block. The volume of data is far too large, so this is not feasible.

  • So an unsorted Merkle tree is not good enough, because the shape of the tree is not unique. A sorted Merkle tree has a problem of its own: how is a new account added? Inserting it in sorted position amounts to rebuilding the Merkle tree.

  • Use a trie (the name comes from "retrieval"), that is, a prefix tree.

    • Features of the trie

      • 1

      • 2

      • 3 Different accounts have different addresses, so keys never collide.

      • 4 In a Merkle tree, inserting the same accounts in a different order produces a different structure. In a trie the resulting structure is the same whatever the insertion order, so that problem does not occur.

      • 5 Each time a block is published, only the accounts that changed need to be updated. Locality of updates is very important: to update an account in a trie, only the branch leading to it has to be visited, so updates have good locality.

      • 6 Disadvantages of the trie

        • Storage is wasted on long chains in which each node has only a single child. This motivates the Patricia tree (Patricia trie), a compressed prefix tree that merges such chains and makes the tree shorter.


        • MPT (Merkle Patricia tree): as in the blockchain itself, the ordinary pointers are replaced with hash pointers.

          Can this structure prove that a key does not exist? Send the branch where the key would have to be as a Merkle proof, and that shows the key is absent. (I do not fully understand this point yet.)

        • Ethereum does not use the plain MPT. It uses a Modified MPT, which has the following structure:

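        • The example has four accounts (to keep the demonstration simple, the account state shows only the balance, and the account addresses are fairly short).

          • Node types

            • extension node: used for path compression

            • branch node: a node where paths diverge

            • leaf node: a node that holds an account's state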

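  • Each time a new block is published, the values of some nodes in the state tree change. These changes are not made in place. Instead, new branches are created and the original state is kept. The example shows two adjacent blocks: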


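  • The state root is the root hash of the state tree, and the tree itself is shown beneath it. The state root on the right belongs to the newly published block. Although some nodes differ between the two trees, most nodes are shared: most pointers in the right-hand tree refer to nodes of the left-hand tree, and only the branches leading to changed nodes are stored anew.

    • The account that changed in the figure is a contract account, which has code and storage. A contract account's storage is also kept as an MPT.

      • The storage of variables is itself a key-value mapping, from each variable to its stored value, so it also uses an MPT.

      • In Ethereum, one large MPT contains many small MPTs: the storage of each contract account is a small MPT.

    • In the figure, what changed is the value of some variables in storage; one variable's value went from 29 to 45.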

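  • Ethereum reduced the block time to a dozen or so seconds, so forks are the norm. Ethereum also has smart contracts, so if earlier states were not kept, it would be hard to work backwards to them the way Bitcoin can (if A transfers 10 bitcoins to B, the current state tells you what A's and B's balances were before the transaction). In Ethereum this is not possible: earlier states cannot be derived from the execution of a smart contract. To support rollback, the earlier states must therefore be preserved. Below are some of the fields defined in the Ethereum block header:

    • parenthash: hash of the previous block's header

    • unclehash: hash of the uncle blocks. Note: an uncle is not necessarily on the same level as the parent block; it may be several generations older than the parent

    • coinbase: address of the miner who mined this block

    • Root hashes of the three trees

      • root: root hash of the state tree

      • txhash: root hash of the transaction tree, similar to the root hash in the Bitcoin system

      • receipthash: root hash of the receipt tree

    • difficulty: the mining difficulty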

    • number:

    • Smart contracts consume gas, which works like a transaction fee on smart contract execution:

      • gaslimit:

      • gasused:

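    • time: approximate time at which the block was produced

    • mixdigest: plays a role in mining

    • nonce: similar to the nonce in Bitcoin. Mining in Ethereum also searches for a nonce, and the nonce written in the header is the one finally found that meets the difficulty requirement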
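The `root` field above commits to the whole state tree. Returning to the trie ideas from earlier in this list (hash pointers, insertion-order independence, update locality, and keeping old state instead of changing it in place), they can be sketched together as follows. This is a plain, uncompressed trie, not Ethereum's Modified MPT: there are no extension nodes, SHA-256 stands in for the real hash, and `Node`, `put`, `get`, `build`, and the short addresses are all hypothetical names chosen for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    children: Tuple = ()              # sorted (hex_digit, Node) pairs
    value: Optional[bytes] = None     # account state, if a key ends here

    def digest(self) -> bytes:
        """Hash of this node, covering its value and its children's hashes."""
        hasher = hashlib.sha256()
        hasher.update(b"\x00" if self.value is None else b"\x01" + self.value)
        for digit, child in self.children:
            hasher.update(digit.encode() + child.digest())
        return hasher.digest()


def put(node: Optional[Node], key: str, value: bytes) -> Node:
    """Return a NEW root with key set; untouched subtrees are shared."""
    node = node or Node()
    if not key:
        return Node(node.children, value)
    head, rest = key[0], key[1:]
    kids = dict(node.children)
    kids[head] = put(kids.get(head), rest, value)   # only this branch is rebuilt
    return Node(tuple(sorted(kids.items(), key=lambda kv: kv[0])), node.value)


def get(node: Optional[Node], key: str) -> Optional[bytes]:
    """Follow the key digit by digit; None means the key is absent."""
    for digit in key:
        if node is None:
            return None
        node = dict(node.children).get(digit)
    return None if node is None else node.value


def build(entries) -> Optional[Node]:
    root = None
    for address, balance in entries:
        root = put(root, address, balance)
    return root


entries = [("a1f", b"10"), ("a2c", b"29"), ("b07", b"30")]  # toy accounts
old_root = build(entries)                  # state after block N
new_root = put(old_root, "a2c", b"45")     # state after block N+1

# The old state is still readable, so a rollback only needs the old root.
print(get(old_root, "a2c"), get(new_root, "a2c"))
# The untouched "b" subtree is the same object in both versions.
print(dict(old_root.children)["b"] is dict(new_root.children)["b"])
# Insertion order does not affect the root hash.
print(build(reversed(entries)).digest() == old_root.digest())
```

Only the nodes along the path to the changed key are recreated, which is the locality property; everything else is shared between the two roots, as in the two-block example described above.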

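The block structure is as follows:

  • header is a pointer to the block header

  • uncles are pointers to the headers of the uncle blocks; a block can have more than one uncle

  • Only the first three fields of the block are what is actually broadcast on the network in Ethereum.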

     

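The state tree stores key-value pairs. How is an account's state actually stored in it? It first goes through serialization, using RLP (Recursive Length Prefix), a serialization method whose defining trait is simplicity: it is deliberately minimalist.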

Protocol buffer: a very widely used serialization library

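Every structure in Ethereum is ultimately serialized into a nested array of bytes, so implementing RLP is much easier than implementing Protocol buffer: RLP does none of the hard parts and pushes them all to the application layer.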
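To show how little an RLP encoder has to do, here is a minimal encoding-only sketch for nested lists of byte strings. The function names are illustrative, and the two-field account at the end is a hypothetical example rather than Ethereum's real account layout; integers are assumed to be converted to big-endian bytes by the caller, which is one of the tasks RLP leaves to the application layer.

```python
def _length_prefix(length: int, offset: int) -> bytes:
    """Short payloads get a 1-byte prefix; long ones add the length itself."""
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Encode a byte string or a (possibly nested) list of byte strings."""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item                      # a single small byte encodes as itself
        return _length_prefix(len(item), 0x80) + item
    if isinstance(item, list):
        payload = b"".join(rlp_encode(element) for element in item)
        return _length_prefix(len(payload), 0xC0) + payload
    raise TypeError("RLP only handles bytes and lists of bytes")


def int_to_bytes(number: int) -> bytes:
    """Application-layer helper: big-endian, no leading zeros, 0 -> b''."""
    return number.to_bytes((number.bit_length() + 7) // 8, "big")


# Hypothetical two-field account: [nonce, balance].
account = [int_to_bytes(1), int_to_bytes(1024)]
encoded_account = rlp_encode(account)

print(rlp_encode(b"dog").hex())
print(rlp_encode([b"cat", b"dog"]).hex())
print(encoded_account.hex())
```

The encoder knows nothing about integers, strings, or field names; it only sees bytes and lists, which is why it stays so small.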



Origin blog.csdn.net/qq_36160277/article/details/104363542