In-Depth Understanding of Blockchain, Part 6: The Bitcoin Blockchain

Introduction to Blockchain

The blockchain is an ordered, back-linked list of blocks of transactions. It can be stored as a flat file (a file of records with no structured relationships between them) or in a simple database. The Bitcoin Core client stores blockchain metadata in Google's LevelDB database. Blocks are linked "back," each one referring to the previous block in the chain. The blockchain is often visualized as a vertical stack, with the first block at the bottom as the foundation and each subsequent block layered on top of the others. This picture of blocks stacked in sequence gives rise to terms such as "height," the distance of a block from the first block, and "top" or "tip," the most recently added block.

Each block in the blockchain is identified by a hash, generated by applying the SHA256 cryptographic hash algorithm to the block header. Each block also references the previous block, known as the parent block, through the "parent block hash" field in its header. In other words, each block header contains the hash of its parent. The sequence of hashes linking each block to its parent creates a chain that goes all the way back to the first block, known as the genesis block.
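The linking described above can be sketched with a toy header. This is a minimal illustration, not Bitcoin's real header format: the `ToyHeader` class, its `payload` field, and the `walk_to_genesis` helper are hypothetical names introduced here, and a single SHA256 pass stands in for the real header hash.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyHeader:
    """Hypothetical, simplified header: a parent hash plus a payload."""
    parent_hash: bytes   # 32-byte hash of the parent header
    payload: bytes       # stand-in for the remaining header fields

    def block_hash(self) -> bytes:
        # The block's identifier is a hash of its header contents.
        return hashlib.sha256(self.parent_hash + self.payload).digest()


GENESIS_PARENT = bytes(32)  # the first block has no parent; use 32 zero bytes

# Build a three-block chain: each header stores the hash of the one before it.
genesis = ToyHeader(GENESIS_PARENT, b"block 0")
block1 = ToyHeader(genesis.block_hash(), b"block 1")
block2 = ToyHeader(block1.block_hash(), b"block 2")

# Index headers by hash so a parent can be looked up from its child.
index = {h.block_hash(): h for h in (genesis, block1, block2)}


def walk_to_genesis(tip: ToyHeader) -> list:
    """Follow parent hashes from the tip back to the first block."""
    path = [tip]
    while path[-1].parent_hash != GENESIS_PARENT:
        path.append(index[path[-1].parent_hash])
    return path


print([h.payload.decode() for h in walk_to_genesis(block2)])
# prints ['block 2', 'block 1', 'block 0']
```

Starting from any block and following the parent hashes always ends at the same first block, which is the property the text relies on.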

Although a block has just one parent, it can temporarily have multiple children. Each child refers to the same block as its parent and carries the same (parent) hash in its "parent block hash" field. Multiple children arise during a blockchain "fork," a temporary situation that occurs only when different blocks are discovered by different miners at about the same time (see "8.10.1 Blockchain Fork"). Eventually only one child block becomes part of the blockchain, and the fork is resolved. Even though a block may have more than one child, each block has only one parent, because a block has a single "parent block hash" field pointing to its one parent.

Because the "parent block hash" field is part of the block header, it affects the current block's hash. If the parent's identity changes, the child's identity changes too. When the parent is modified in any way, the parent's hash changes. The changed parent hash forces a change in the "parent block hash" field of the child, which in turn changes the child's hash. The changed child hash forces a change in the "parent block hash" field of the grandchild, which changes the grandchild's hash, and so on. Once a block has many generations following it, this cascade effect ensures that it cannot be changed without forcing a recalculation of all subsequent blocks. Because such a recalculation would require an enormous amount of computation, the existence of a long chain of blocks makes the blockchain's history immutable, which is a key feature of bitcoin's security.
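A small experiment makes the cascade concrete. The sketch below uses a hypothetical `toy_hash` over made-up block contents (single SHA256, no proof of work), so it shows only how a change propagates, not how costly it is to redo.

```python
import hashlib


def toy_hash(parent_hash: bytes, data: bytes) -> bytes:
    """Hypothetical header hash: SHA256 over the parent hash plus block data."""
    return hashlib.sha256(parent_hash + data).digest()


def chain_hashes(blocks: list) -> list:
    """Hash each block in order, feeding every hash into the next block."""
    hashes = []
    parent = bytes(32)  # placeholder parent for the first block
    for data in blocks:
        parent = toy_hash(parent, data)
        hashes.append(parent)
    return hashes


original = [b"genesis", b"alice pays bob", b"bob pays carol", b"carol pays dave"]
tampered = list(original)
tampered[1] = b"alice pays eve"   # rewrite history in the second block

before = chain_hashes(original)
after = chain_hashes(tampered)

# Blocks before the change keep their hash; the changed block and every
# descendant end up with a different one.
changed = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
print(changed)  # prints [1, 2, 3]
```

In the real system each of those changed hashes would also have to satisfy the proof-of-work target again, which is where the computational cost comes from.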

One way to think about the blockchain is as layers in a geological formation or a glacier core sample. The surface layers may change with the seasons, or even be blown away before they have time to settle. But the deeper you go, the more stable the layers become. A few hundred feet down, you are looking at rock formations that have remained undisturbed for millions of years. In the blockchain, the most recent few blocks may be revised if there is a chain recalculation due to a fork. The top six blocks are like a few inches of topsoil. Beyond those six blocks, the deeper a block sits in the blockchain, the less likely it is to change. After 100 blocks, the blockchain is stable enough that coinbase transactions (the transactions containing newly mined bitcoins) can be spent. A few thousand blocks back (about a month), the blockchain is settled history that will, for all practical purposes, never change.

7.2 Block Structure

A block is a container data structure that aggregates transactions for inclusion in the public ledger, the blockchain. It consists of a header containing metadata, followed by a long list of transactions that make up the block body. The block header is 80 bytes, whereas the average transaction is at least 250 bytes and the average block contains at least 500 transactions. A complete block, with all its transactions, is therefore on the order of 1,000 times larger than the block header. Table 7-1 describes the structure of a block.

Table 7-1 Block Structure

Size                 Field                 Description
4 bytes              Block Size            The size of the block, in bytes, following this field
80 bytes             Block Header          Several fields that form the block header
1-9 bytes (VarInt)   Transaction Counter   The number of transactions that follow
Variable             Transactions          The transactions recorded in this block

7.3 Block Header

The block header consists of three sets of block metadata. First, there is a reference to the hash of the parent block, which connects this block to the previous block in the blockchain. The second set of metadata, namely the difficulty, timestamp, and nonce, relates to the mining competition, as detailed in Chapter 8. The third piece of metadata is the Merkle root, the root hash of a data structure used to efficiently summarize all the transactions in the block. Table 7-2 describes the structure of the block header.

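Table 7-2 Block Header Structure

Size       Field               Description
4 bytes    Version             A version number, used to track software/protocol upgrades
32 bytes   Parent Block Hash   A reference to the hash of the parent block in the blockchain
32 bytes   Merkle Root         The hash of the root of the Merkle tree of this block's transactions
4 bytes    Timestamp           The approximate creation time of this block (Unix timestamp, in seconds)
4 bytes    Difficulty Target   The proof-of-work algorithm difficulty target for this block
4 bytes    Nonce               A counter used by the proof-of-work algorithm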

The nonce, difficulty target, and timestamp are used in the mining process and are discussed in more detail in Chapter 8.
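The field sizes in Table 7-2 can be checked with a short sketch. The `HEADER_FORMAT` string and `pack_header` helper are names introduced here for illustration. The little-endian integer encoding is an assumption about the serialized form that the table itself does not state, and the field values are arbitrary placeholders.

```python
import struct

# Layout of the six header fields from Table 7-2, assuming little-endian
# integers: version (4), parent block hash (32), Merkle root (32),
# timestamp (4), difficulty target (4), nonce (4).
HEADER_FORMAT = "<L32s32sLLL"


def pack_header(version, parent_hash, merkle_root, timestamp, bits, nonce) -> bytes:
    """Serialize the six header fields into their fixed-width binary form."""
    return struct.pack(
        HEADER_FORMAT, version, parent_hash, merkle_root, timestamp, bits, nonce
    )


# Arbitrary placeholder values, chosen only to exercise the layout.
raw = pack_header(2, bytes(32), bytes(32), 1388185038, 0x1D00FFFF, 4215469401)

print(struct.calcsize(HEADER_FORMAT), len(raw))   # prints 80 80
print(struct.unpack(HEADER_FORMAT, raw)[0])       # prints 2
```

The fixed 80-byte size is what allows lightweight clients to store headers cheaply, a point the later sections on Merkle trees build on.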

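7.4 Block Identifiers: Block Header Hash and Block Height

The primary identifier of a block is its cryptographic hash, a digital fingerprint made by hashing the block header twice with the SHA256 algorithm. The resulting 32-byte hash is called the block hash, but it is more accurately the block header hash, because only the block header is used to compute it. For example, 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f is the block hash of the first bitcoin block. The block hash identifies a block uniquely and unambiguously, and any node can derive it independently by simply hashing the block header.

Note that the block hash is not actually included in the block's data structure, neither when the block is transmitted over the network nor when it is stored on a node's persistent storage as part of the blockchain. Instead, each node computes the block hash when it receives the block from the network. The block hash may be stored in a separate database table as part of the block's metadata, to facilitate indexing and faster retrieval of blocks from disk.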

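A second way to identify a block is by its position in the blockchain, called the block height. The first block is at block height 0 and is the same block referenced earlier by the hash 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f. A block can therefore be identified in two ways: by its block hash or by its block height. Each subsequent block stored on top of that first block is one position "higher" in the blockchain, like boxes stacked one on top of another. The block height on January 1, 2014 was approximately 278,000, meaning that 278,000 blocks had been stacked on top of the first block created in January 2009.

Unlike the block hash, the block height is not a unique identifier. Although a single block always has a specific, fixed block height, the reverse is not true: a block height does not always identify a single block. Two or more blocks may have the same block height, competing for the same position in the blockchain. This scenario is discussed in detail in the section "8.10.1 Blockchain Fork." The block height is also not part of the block's data structure; it is not stored within the block. Each node dynamically identifies a block's position (height) in the blockchain when it receives the block from the bitcoin network. The block height may also be stored as metadata in an indexed database table for faster retrieval.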

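A block's block hash always identifies a single block uniquely. A block also always has a specific block height. However, a specific block height does not always identify a single block. Rather, two or more blocks might compete for a single position in the blockchain.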
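The asymmetry between the two identifiers can be sketched with two lookup tables of the kind a node might keep alongside the blocks. The table names, the `index_block` helper, and the short string labels standing in for real block hashes are all hypothetical.

```python
from collections import defaultdict
from typing import Optional

# Hypothetical node-side indexes; neither value is stored inside the block.
height_by_hash = {}                    # block hash -> height (always one answer)
hashes_by_height = defaultdict(list)   # height -> hashes (may be several)


def index_block(block_hash: str, parent_hash: Optional[str]) -> int:
    """Derive a block's height from its parent and record it in both indexes."""
    height = 0 if parent_hash is None else height_by_hash[parent_hash] + 1
    height_by_hash[block_hash] = height
    hashes_by_height[height].append(block_hash)
    return height


# Made-up short labels stand in for real 32-byte block hashes.
index_block("genesis", None)
index_block("a1", "genesis")
index_block("b2", "a1")    # found by one miner
index_block("b2'", "a1")   # found by another miner at about the same time

print(height_by_hash["b2"], height_by_hash["b2'"])   # each hash has one height
print(hashes_by_height[2])                           # one height, two candidates
```

Looking up by hash always returns a single height, while looking up by height can return several competing blocks until the fork is resolved.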

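7.5 The Genesis Block

The first block in the blockchain, created in 2009, is called the genesis block. It is the common ancestor of all the blocks in the blockchain: start at any block and follow the chain backward, and you will eventually arrive at the genesis block.

Because the genesis block is compiled into the bitcoin client software, every node starts with a blockchain of at least one block, which ensures that the genesis block cannot be altered. Every node "knows" the genesis block's hash and structure, the time it was created, and the single transaction within it. Every node therefore treats this block as the first block of the blockchain, a secure root from which to build a trusted blockchain.

You can see the genesis block encoded in the Bitcoin Core client in chainparams.cpp.

The hash of the genesis block is: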

000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f

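You can search for this block hash on any block explorer website, such as blockchain.info, and you will find a page describing the contents of this block, with a URL containing the hash: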

https://blockchain.info/block/000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f

https://blockexplorer.com/block/000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f

Using the Bitcoin Core client on the command line:

$ bitcoind getblock 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f
{
 "hash":"000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f",
 "confirmations":308321,
 "size":285,
 "height":0,
 "version":1,
 "merkleroot":"4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
 "tx":["4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b"],
 "time":1231006505,
 "nonce":2083236893,
 "bits":"1d00ffff",
 "difficulty":1.00000000,
 "nextblockhash":"00000000839a8e6886ab5951d76f411475428afc90947ee320161bbf18eb6048"
}
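The fields in the output above are enough to recompute the block hash independently. The sketch below rests on three details the article does not spell out, so treat them as assumptions: integers are serialized little-endian, displayed hashes are byte-reversed relative to their in-header form, and the genesis block's parent hash field is 32 zero bytes.

```python
import hashlib
import struct

# Field values copied from the getblock output above.
version = 1
parent_hash = bytes(32)   # assumption: the genesis block's parent field is all zeros
merkle_root = bytes.fromhex(
    "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b"
)[::-1]                   # displayed hashes are byte-reversed inside the header
timestamp = 1231006505
bits = 0x1D00FFFF         # the "bits" field, 1d00ffff
nonce = 2083236893

# Serialize the 80-byte header: 4 + 32 + 32 + 4 + 4 + 4 bytes.
header = struct.pack(
    "<L32s32sLLL", version, parent_hash, merkle_root, timestamp, bits, nonce
)

# Hash the header twice with SHA256, then reverse the bytes for display.
digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
block_hash = digest[::-1].hex()

print(len(header))   # prints 80
print(block_hash)
# prints 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f
```

Nothing outside the header enters the calculation, which is why the identifier is more accurately called the block header hash.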

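The genesis block contains a hidden message. The input of its coinbase transaction contains the text "The Times 03/Jan/2009 Chancellor on brink of second bailout for banks." This was the headline of a front-page article in The Times that day. Quoting it serves as evidence of when the block was created, and it can also be read as a tongue-in-cheek reminder of the importance of an independent monetary system, suggesting that an unprecedented worldwide monetary revolution would follow as bitcoin developed. The message was embedded in the genesis block by Satoshi Nakamoto, bitcoin's creator.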

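7.6 Linking Blocks in the Blockchain

Bitcoin full nodes maintain a local copy of the blockchain, starting at the genesis block. The local copy is constantly updated as new blocks are produced and used to extend the chain. When a node receives incoming blocks from the network, it validates them and then links them to the existing blockchain. To establish a link, the node examines the incoming block header and looks for the "parent block hash."

Let's assume, for example, that a node has 277,314 blocks in its local copy of the blockchain. The last block the node knows about is block 277,314, with a block header hash of 00000000000000027e7ba6fe7bad39faf3b5a83daed765f05f7d1b71a1632249.

The bitcoin node then receives a new block from the network, described as follows: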

{
 "size":43560,
 "version":2,

 "previousblockhash":"00000000000000027e7ba6fe7bad39faf3b5a83daed765f05f7d1b71a1632249",
 "merkleroot":"5e049f4030e0ab2debb92378f53c0a6e09548aea083f3ab25e1d94ea1155e29d",
 "time":1388185038,
 "difficulty":1180923195.25802612,
 "nonce":4215469401,
 "tx":["257e7497fb8bc68421eb2c7b699dbab234831600e7352f0d9e6522c7cf3f6c77",
  #[...many more transactions omitted...]
  "05cfd38f6ae6aa83674cc99e4d75a1458c165b7ab84725eda41d018a09176634"
 ]
}

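Looking at this new block, the node finds the "parent block hash" field (previousblockhash in the listing above), which contains the hash of the block's parent. It is a hash the node already knows: that of block 277,314. The new block is therefore a child of the last block on the chain and extends the existing blockchain. The node adds the new block to the end of the chain, lengthening the blockchain to a new height of 277,315. Figure 7-1 shows a chain of three blocks linked through the "parent block hash" field.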
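The linking check in this example can be sketched in a few lines. The `try_extend` function is a hypothetical name, and the sketch deliberately skips validation and fork handling, both of which a real node performs.

```python
from typing import Optional

# The node's current view of its chain, using the values from the text.
chain_tip_hash = "00000000000000027e7ba6fe7bad39faf3b5a83daed765f05f7d1b71a1632249"
chain_tip_height = 277314


def try_extend(previous_block_hash: str) -> Optional[int]:
    """Return the new chain height if the block extends the tip, else None."""
    if previous_block_hash != chain_tip_hash:
        # Not a child of the current tip; a real node would check for a
        # fork or an unknown parent here instead of giving up.
        return None
    return chain_tip_height + 1


# Only the field needed for linking is reproduced from the incoming block.
incoming = {
    "previousblockhash":
        "00000000000000027e7ba6fe7bad39faf3b5a83daed765f05f7d1b71a1632249",
}

print(try_extend(incoming["previousblockhash"]))   # prints 277315
```

A block whose parent hash does not match the tip returns None here, which corresponds to the fork case covered in "8.10.1 Blockchain Fork."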

Figure 7-1 Blocks linked in a chain, each referencing the block header hash of its parent

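7.7 Merkle Trees

Each block in the blockchain contains all the transactions of that block, summarized using a Merkle tree.

A Merkle tree, also known as a binary hash tree, is a data structure used for efficiently summarizing and verifying the integrity of large sets of data. Merkle trees are binary trees containing cryptographic hashes. The term "tree" is used in computer science to describe a branching data structure, but these trees are usually displayed upside down, with the "root" at the top and the "leaves" at the bottom of the diagram, as you will see in the examples that follow.

In bitcoin, Merkle trees are used to summarize all the transactions in a block, producing an overall digital fingerprint of the entire set of transactions and providing an efficient way to verify whether a transaction is included in a block. A Merkle tree is constructed by recursively hashing pairs of nodes and inserting the resulting hash into the tree until only one hash remains, which is the Merkle root. The cryptographic hash algorithm used in bitcoin's Merkle trees is SHA256 applied twice, also known as double-SHA256.

When N data elements are hashed and summarized in a Merkle tree, you can check whether any one data element is included in the tree with at most 2*log2(N) calculations, which makes this a very efficient data structure.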

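A Merkle tree is constructed bottom-up. In the following example, we start with four transactions, A, B, C, and D, which form the leaves of the Merkle tree, as shown in Figure 7-2. The transactions themselves are not stored in the Merkle tree; instead, their data is hashed and the resulting hash is stored in the corresponding leaf node. The leaf nodes are H~A~, H~B~, H~C~, and H~D~:

H~A~ = SHA256(SHA256(Transaction A))

Consecutive pairs of leaf nodes are then summarized in a parent node by concatenating the two hashes and hashing them together. For example, to construct the parent node H~AB~, the two 32-byte hashes of the child nodes are concatenated into a 64-byte string. That string is then hashed twice to produce the parent node's hash: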

H~AB~=SHA256(SHA256(H~A~ + H~B~))

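The process continues until only one node remains at the top, the Merkle root. That 32-byte hash is stored in the block header and summarizes all the data in all four transactions.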
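The four-transaction construction can be written out step by step. In this sketch the `dsha256` helper is a name introduced here, and short placeholder byte strings stand in for real serialized transactions.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA256, the hash applied at every node of the tree."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


# Placeholder byte strings stand in for four serialized transactions.
tx_a, tx_b, tx_c, tx_d = b"tx A", b"tx B", b"tx C", b"tx D"

# Leaves: hash each transaction.
h_a, h_b, h_c, h_d = (dsha256(tx) for tx in (tx_a, tx_b, tx_c, tx_d))

# Parents: concatenate two 32-byte child hashes (64 bytes) and hash again.
h_ab = dsha256(h_a + h_b)
h_cd = dsha256(h_c + h_d)

# Root: the single remaining node summarizes all four transactions.
merkle_root = dsha256(h_ab + h_cd)

print(len(h_a + h_b), len(merkle_root))   # prints 64 32
```

Changing any one of the four placeholder transactions changes its leaf hash, its parent, and therefore the root.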

Figure 7-2 Calculating the nodes in a Merkle tree

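Because the Merkle tree is a binary tree, it needs an even number of leaf nodes. If there is an odd number of transactions to summarize, the last transaction hash is duplicated to create an even number of leaf nodes; a tree with an even number of leaves in this sense is also called a balanced tree. This is shown in Figure 7-3, where transaction C is duplicated.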

Figure 7-3 Duplicating one data element to achieve an even number of data elements
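The duplication rule generalizes to any number of transactions. The sketch below is one way to write it, with `merkle_root` and `dsha256` as names introduced here. It applies the duplication at every level where the count is odd, not only at the leaves, which goes slightly beyond what the text describes.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA256 of the input bytes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaf_hashes: list) -> bytes:
    """Fold a non-empty list of leaf hashes into a single root hash."""
    if not leaf_hashes:
        raise ValueError("at least one leaf hash is required")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])   # odd count: duplicate the last hash
        # Hash each adjacent pair to build the next level up.
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Placeholder transactions A, B, and C, as in the three-leaf case of Figure 7-3.
h_a, h_b, h_c = (dsha256(tx) for tx in (b"tx A", b"tx B", b"tx C"))

# With three leaves, C is paired with a copy of itself.
print(merkle_root([h_a, h_b, h_c]) == merkle_root([h_a, h_b, h_c, h_c]))
# prints True
```

A single transaction is its own root under this rule, since the loop never runs when only one hash is present.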

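The method used to build a Merkle tree from four transactions works for any number of transactions. In bitcoin it is common to have several hundred to more than a thousand transactions in a single block, all summarized in exactly the same way, producing just 32 bytes of data as the single Merkle root. Figure 7-4 shows a tree built from 16 transactions. Note that although the root looks bigger than the leaf nodes in the diagram, it is exactly the same size, 32 bytes. Whether a block contains one transaction or a hundred thousand, the Merkle root always summarizes them in 32 bytes.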

Figure 7-4 A Merkle tree summarizing many data elements

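To prove that a specific transaction is included in a block, a node only needs to produce log2(N) 32-byte hashes, which form an authentication path, or Merkle path, connecting that transaction to the root of the tree. This matters more and more as the number of transactions grows, because the base-2 logarithm of the number of transactions grows far more slowly than the number itself. It allows bitcoin nodes to efficiently produce a path of 10 or 12 hashes (320-384 bytes) that proves the presence of a single transaction among more than a thousand in a very large block.

In Figure 7-5, a node can prove that a transaction K is included in the block by producing a Merkle path that is only four 32-byte hashes long (128 bytes in total). The path consists of the four hashes (shown in blue in Figure 7-5) H~L~, H~IJ~, H~MNOP~, and H~ABCDEFGH~. With those four hashes provided as an authentication path, any node can prove that H~K~ (shown in green in the diagram) is included in the Merkle root by computing four additional pair-wise hashes: H~KL~, H~IJKL~, H~IJKLMNOP~, and the Merkle tree root (outlined by a dotted line in the diagram).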

Figure 7-5 A Merkle path used to prove inclusion of a data element
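The proof for transaction K can be reproduced with a short sketch. The `merkle_path` and `verify_path` functions and the left/right flag attached to each sibling are illustrative choices made here, not a description of any wire format, and the sixteen single-letter placeholders stand in for real transactions.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA256 of the input bytes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_path(leaves: list, index: int):
    """Return (root, path); path lists (sibling_hash, sibling_is_left), leaf to root."""
    level = list(leaves)
    path = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])          # duplicate the last hash if odd
        sibling = index ^ 1                  # the other node in this pair
        path.append((level[sibling], sibling < index))
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2                          # position of the parent one level up
    return level[0], path


def verify_path(leaf: bytes, path: list, root: bytes) -> bool:
    """Recompute the root from a leaf hash and its authentication path."""
    node = leaf
    for sibling, sibling_is_left in path:
        node = dsha256(sibling + node) if sibling_is_left else dsha256(node + sibling)
    return node == root


# Sixteen placeholder transactions labelled A..P, as in Figure 7-5.
labels = "ABCDEFGHIJKLMNOP"
leaves = [dsha256(label.encode()) for label in labels]

k = labels.index("K")
root, path = merkle_path(leaves, k)

print(len(path), sum(len(h) for h, _ in path))   # prints 4 128
print(verify_path(leaves[k], path, root))        # prints True
```

The verifier needs only the leaf hash, the four sibling hashes, and the root from the block header; none of the other fifteen transactions is required.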


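The efficiency of Merkle trees becomes very clear as the number of transactions increases. Table 7-3 shows the amount of data that must be exchanged as a Merkle path to prove that a transaction is part of a block.

Table 7-3 Merkle Tree Efficiency

Number of transactions   Approx. block size   Path size (hashes)   Path size (bytes)
16 transactions          4 KB                 4 hashes             128 bytes
512 transactions         128 KB               9 hashes             288 bytes
2048 transactions        512 KB               11 hashes            352 bytes
65,535 transactions      16 MB                16 hashes            512 bytes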

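As the table shows, while the block size grows rapidly, from 4 KB with 16 transactions to 16 MB with 65,535 transactions, the Merkle path required to prove the inclusion of a transaction grows very slowly, from 128 bytes to only 512 bytes. With Merkle trees, a node can download just the block headers (80 bytes per block) and still verify that a transaction exists by retrieving a small Merkle path from a full node, without storing or transmitting the vast majority of the blockchain, which might be several gigabytes in size. Nodes that do not maintain a full blockchain are called simplified payment verification (SPV) nodes; they use Merkle paths to verify transactions without downloading full blocks.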
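The path sizes in Table 7-3 follow directly from the tree depth. As a quick check, the hypothetical helper `path_length` below computes the ceiling of log2(N) with integer arithmetic and multiplies by 32 bytes per hash.

```python
def path_length(tx_count: int) -> int:
    """Hashes in a Merkle path: ceil(log2(tx_count)) for tx_count >= 1."""
    return (tx_count - 1).bit_length()


for tx_count in (16, 512, 2048, 65535):
    hashes = path_length(tx_count)
    print(tx_count, hashes, hashes * 32)
# prints:
# 16 4 128
# 512 9 288
# 2048 11 352
# 65535 16 512
```

Each doubling of the transaction count adds one 32-byte hash to the path, which is why the last column grows so slowly.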

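7.8 Merkle Trees and Simplified Payment Verification (SPV)

Merkle trees are used extensively by SPV nodes. SPV nodes do not hold all transactions and do not download full blocks; they keep only block headers. To verify that a transaction is included in a block without downloading all the transactions in that block, they use an authentication path, or Merkle path.

For example, consider an SPV node that is interested in incoming payments to a bitcoin address in its wallet. The SPV node establishes a bloom filter on its connections to peers, limiting the transactions it receives to those containing the address of interest. When a peer sees a transaction that matches the bloom filter, it sends that block using a Merkleblock message. The Merkleblock message contains the block header and a Merkle path that links the transaction of interest to the Merkle root. The SPV node can use this path to connect the transaction to its block and verify that the transaction is included in that block. The SPV node also uses the block header to link the block to the rest of the blockchain. Together, these two links, between the transaction and the block and between the block and the blockchain, prove that the transaction is recorded in the blockchain. In short, the SPV node receives less than a kilobyte of data for the block header and Merkle path, more than a thousand times less than a full block (currently about 1 MB).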
