Merkle tree generation and SPV verification information strategy

A recent project needed to use a blockchain to quickly verify certain information. Because the volume of information produced each day is very large, storing it directly on the blockchain is not feasible. Instead, each incoming record is first hashed to form a leaf node of a Merkle tree, and the node hashes are then combined pairwise to build the tree and obtain the Merkle root. Finally, the Merkle root is hashed together with the previous block hash, the timestamp, and other fields to produce the current block's hash.
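As a rough illustration of that last step, the sketch below hashes a previous block hash, a timestamp, and a Merkle root into one block hash. The class name, the method names, and the plain string concatenation are assumptions made for this example; the project's real block layout is not shown in this post.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: the field set and the concatenation order are assumptions.
class BlockHashSketch {

    // Returns the SHA-256 digest of the input as a lowercase hex string.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Combines the header fields into a single block hash.
    static String blockHash(String previousHash, long timestamp, String merkleRoot) {
        return sha256Hex(previousHash + timestamp + merkleRoot);
    }

    public static void main(String[] args) {
        String merkleRoot = sha256Hex("placeholder for a real Merkle root");
        System.out.println(blockHash("0", 1700000000000L, merkleRoot));
    }
}
```

Only the Merkle root goes into the block, so the block stays small no matter how many records were hashed into the tree.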
(Figure: original information data)
(Figure: SPV verification information)
When the data is attacked and tampered with:
(Figure: tampered information)

Merkle tree construction algorithm:
Define the basic structure of a Merkle tree node: left and right children, a data field, a flag marking whether the node is a leaf, and the primary key of the corresponding information record.

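public class MerKleTreeNode {
    private MerKleTreeNode lChild;  // left child
    private MerKleTreeNode rChild;  // right child
    private String data;            // hash stored in this node
    private Integer isLeaf;         // 1 for a leaf node, 0 otherwise
    private Integer infoId;         // primary key of the corresponding information record

    // All-args constructor and getter used by the snippets that follow
    public MerKleTreeNode(MerKleTreeNode lChild, MerKleTreeNode rChild, String data, Integer isLeaf, Integer infoId) {
        this.lChild = lChild;
        this.rChild = rChild;
        this.data = data;
        this.isLeaf = isLeaf;
        this.infoId = infoId;
    }

    public String getData() { return data; }
}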

First, hash each record to construct the leaf nodes:

for (Map<String, String> map : list) {
    // Leaf node: no children, isLeaf = 1
    MerKleTreeNode node = new MerKleTreeNode(null, null, BlockHashAlgoUtils.encodeDataBySHA_256(map), 1, -1);
    res.add(node);
}

Because a Merkle tree built by pairwise hashing can be made a perfect binary tree, the properties of perfect binary trees (the node numbering scheme, the height formula, the number of nodes per layer, and so on) allow fast access to any MerKleTreeNode. To build a perfect binary tree, first derive a number X from the number of leaf nodes. X must be a power of 2 (1, 2, 4, 8, ...) that is no smaller than the number of leaf nodes, and it should be the smallest such value, because a larger X wastes space. If the number of leaves is not already a power of 2, the missing positions are filled by duplicating existing leaves.

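if ((size & (size - 1)) != 0) {
    // size is not a power of 2: round it up to the next power of 2
    int n = size;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    n = n >= MAXIMUM ? MAXIMUM : n + 1;
    // Compute the gap between the padded size and the actual size
    int distance = n - size;
    for (int i = size - distance; i < size; i++) {
        // Duplicate the trailing data to fill every leaf position of the Merkle tree
        data.add(data.get(i));
    }
    size = n;
}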

A worked example with a 16-bit value:
init:    0001101011011001
>>> 1:   0001101011011001 | 0000110101101100 = 0001111111111101
>>> 2:   0001111111111101 | 0000011111111111 = 0001111111111111
(the remaining unsigned right shifts by 4, 8, and 16 change nothing further)
====>>>  0001111111111111
res = 0001111111111111 + 1 = 0010000000000000 = 2^13
The repeated unsigned right shifts and ORs copy the highest set bit into every lower position, so all bits below it become 1. Adding 1 then carries into the next higher bit and clears everything below it, which yields exactly the X we need.
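The rounding step can be tried in isolation with a small sketch like the one below. The class and method names are hypothetical, and the value chosen for MAXIMUM (2^30) is an assumption, since the post does not show how that constant is defined.

```java
// Sketch of the bit-smearing trick; class and method names are hypothetical.
class PowerOfTwoSketch {

    // Assumed cap; the post uses a MAXIMUM constant without showing its value.
    static final int MAXIMUM = 1 << 30;

    // Returns size unchanged if it is already a power of two,
    // otherwise the next power of two above it (capped at MAXIMUM).
    static int paddedLeafCount(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        if ((size & (size - 1)) == 0) {
            return size;
        }
        int n = size;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return n >= MAXIMUM ? MAXIMUM : n + 1;
    }

    public static void main(String[] args) {
        // 6873 is 0001101011011001 in binary, the value from the worked example.
        System.out.println(paddedLeafCount(6873)); // prints 8192
    }
}
```

The `size & (size - 1)` check matters: without it, an input that is already a power of two would be doubled, wasting half of the leaf layer.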

The next step is to compute the tree height, the node indexes, and the number of nodes in each layer.
Take the binary representation of size: its length is the height of the Merkle tree.

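int height = Integer.toBinaryString(size).length();
// Total number of nodes in a perfect binary tree of this height: 2^height - 1
int total = (int) (Math.pow(2, height) - 1);
// Index of the leftmost node in the leaf layer (total must be computed first)
int idxStart = (total + 1) >> 1;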
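To make these formulas concrete, here is a minimal sketch that evaluates them for a tree with 8 leaves. The helper names are hypothetical, and bit shifts are used in place of Math.pow, which gives the same results for these sizes.

```java
// Sketch of the perfect-binary-tree formulas; helper names are hypothetical.
class TreeShapeSketch {

    // Number of layers for a leaf count that is a power of two.
    static int height(int leafCount) {
        return Integer.toBinaryString(leafCount).length();
    }

    // Total number of nodes in a perfect binary tree: 2^height - 1.
    static int totalNodes(int height) {
        return (1 << height) - 1;
    }

    // Index of the leftmost node of a layer; layer 1 is the root, whose index is 1.
    static int firstIndexOfLayer(int layer) {
        return 1 << (layer - 1);
    }

    public static void main(String[] args) {
        int h = height(8);
        System.out.println(h);                    // prints 4
        System.out.println(totalNodes(h));        // prints 15
        System.out.println(firstIndexOfLayer(h)); // prints 8
    }
}
```

With 8 leaves the leaf layer occupies indexes 8 to 15, which matches `(total + 1) >> 1 = 8` for the first leaf.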

Finally, construct the Merkle tree.
idxStart holds the index of the leftmost node of the layer currently being built; nodes are numbered from 1, starting at the root.

while (height - 1 > 0) {
    /**
     * Total number of nodes: 2^n - 1
     * Index of the leftmost node of each layer
     */
    total = (int) (Math.pow(2, height - 1) - 1);
    idxStart = (total + 1) >> 1;
    /**
     * Build layer by layer: temp (a new list) holds the new layer
     * and is assigned back to data (list) on every iteration
     */
    ArrayList<MerKleTreeNode> temp = new ArrayList<>();
    for (int j = 0; j < size; j += 2) {
        MerKleTreeNode lChild = data.get(j);
        MerKleTreeNode rChild = data.get(j + 1);
        String hash = BlockHashAlgoUtils.encodeDataBySHA_256(lChild.getData() + rChild.getData());
        MerKleTreeNode node = new MerKleTreeNode(lChild, rChild, hash, 0, -1);
        temp.add(node);
        // Print the index of the non-leaf node
        System.out.println(idxStart++);
    }
    // temp is recreated on the next iteration; assign it to data
    data = temp;
    height--;
    // size halves on every iteration
    size = size >> 1;
}
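The whole construction (hashing the leaves, padding, and building upward) can be sketched end to end as follows. This is not the author's implementation: it stores the hashes in a heap-ordered array instead of MerKleTreeNode objects, uses Integer.highestOneBit for the rounding, and replaces BlockHashAlgoUtils with a hypothetical sha256Hex helper.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: array-based variant of the layer-by-layer construction.
class MerkleBuildSketch {

    // Hypothetical stand-in for BlockHashAlgoUtils.encodeDataBySHA_256.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(input.getBytes(StandardCharsets.UTF_8))) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the tree in heap order: tree[1] is the root and the children
    // of tree[i] are tree[2 * i] and tree[2 * i + 1]; tree[0] is unused.
    static String[] build(List<String> records) {
        if (records.isEmpty()) {
            throw new IllegalArgumentException("no records");
        }
        List<String> leaves = new ArrayList<>();
        for (String record : records) {
            leaves.add(sha256Hex(record));
        }
        // Round the leaf count up to a power of two.
        int size = leaves.size();
        int n = Integer.highestOneBit(size);
        if (n < size) {
            n <<= 1;
        }
        // Pad by duplicating the trailing leaves, as in the padding step above.
        for (int i = size - (n - size); i < size; i++) {
            leaves.add(leaves.get(i));
        }
        String[] tree = new String[2 * n];
        for (int i = 0; i < n; i++) {
            tree[n + i] = leaves.get(i);
        }
        // Fill the parents from the bottom layer up to the root.
        for (int i = n - 1; i >= 1; i--) {
            tree[i] = sha256Hex(tree[2 * i] + tree[2 * i + 1]);
        }
        return tree;
    }

    public static void main(String[] args) {
        String[] tree = build(Arrays.asList("a", "b", "c"));
        System.out.println("Merkle root: " + tree[1]);
    }
}
```

The array position of each hash is the same 1-based index that the post later stores in the database, so the two layouts describe the same tree.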

The next step is SPV verification. Because array storage is inconvenient to display, I first store the nodes in a database and use an index field to hold each node's array subscript.
(Figure: data table)
To perform an SPV query, the most important step is to obtain the SPV verification path, that is, the sibling hash at each level that must be supplied as a fixed input (the nodes circled in black in the original figure). Thanks to the numbering scheme of a perfect binary tree, the indexes of all nodes on the verification path can be computed quickly from the index of the node that holds the information.

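    while (idx > 1) {
        /**
         * If idx is even, it is a left node: take its right sibling
         * Otherwise take its left sibling
         * Then move up to the parent
         */
        if ((idx & 1) == 0) {
            list.add(idx + 1);
        } else {
            list.add(idx - 1);
        }
        idx = idx >> 1;
    }
    /**
     * Index 1 (the root) must be added as well
     */
    list.add(idx);
    // Query the database for all nodes whose indexes are in list
    // ........ (database query)
}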
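The index arithmetic can be checked without a database. The sketch below wraps the same loop in a hypothetical proofPath method and runs it for leaf index 10 in a 15-node tree (leaves at indexes 8 to 15).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the verification-path index computation; names are hypothetical.
class ProofPathSketch {

    // Returns the sibling index at every level from the leaf upward,
    // followed by the root index 1.
    static List<Integer> proofPath(int leafIndex) {
        if (leafIndex < 1) {
            throw new IllegalArgumentException("indexes start at 1");
        }
        List<Integer> path = new ArrayList<>();
        int idx = leafIndex;
        while (idx > 1) {
            // Even index: left child, so the sibling is idx + 1; odd: idx - 1.
            path.add((idx & 1) == 0 ? idx + 1 : idx - 1);
            idx >>= 1; // move up to the parent
        }
        path.add(idx);
        return path;
    }

    public static void main(String[] args) {
        System.out.println(proofPath(10)); // prints [11, 4, 3, 1]
    }
}
```

For leaf 10 the path is its sibling 11, then 4 (sibling of parent 5), then 3 (sibling of grandparent 2), and finally the root.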

The next step is to recompute the hash branch to obtain the Merkle root again and compare it with the Merkle root already stored on the blockchain, which determines whether the information has been modified.

if (CollectionUtils.isEmpty(checkProofs)) {
    return res;
}
/**
 * Take any element from the collection; they should all belong to the same block and chain
 */
Integer blockIndex = checkProofs.get(0).getBlockIndex();
MerkleNode merkleRoot = getMerkleRoot(blockIndex);
if (Objects.isNull(merkleRoot)) {
    return res;
}
/**
 * Compare the Merkle root with the block stored on the blockchain
 */
Blockchain blockchain = blockchainMapper.selectByPrimaryKey(blockIndex);
if (!blockchain.getBlockMerkle().equals(merkleRoot.getHash())) {
    return res;
}
/**
 * Compare against the verification path in the Merkle tree
 * The mapper returns the nodes in descending order
 * Odd or even index decides whether the sibling is a left or right child
 * Stop at the last layer below the root; no further computation is needed
 */
Info info = infoMapper.selectByPrimaryKey(infoId);
String hash = BlockHashAlgoUtils.encodeDataBySHA_256(info.toString());
for (int i = checkProofs.size() - 1; i > 0; i--) {
    MerkleNode node2 = checkProofs.get(i);
    if ((node2.getMerkleNodeIndex() & 1) == 0) {
        // The sibling is a left child: its hash goes first
        hash = BlockHashAlgoUtils.encodeDataBySHA_256(node2.getHash() + hash);
    } else {
        // The sibling is a right child: its hash goes second
        hash = BlockHashAlgoUtils.encodeDataBySHA_256(hash + node2.getHash());
    }
}
/**
 * Compare the newly computed hash with the Merkle root once more
 */
if (!blockchain.getBlockMerkle().equals(hash)) {
    return res;
}
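A self-contained version of this check, with no database or mappers, might look like the sketch below. It is only an approximation of the flow above: ProofNode, build, proofFor, and verify are hypothetical names, the leaf count is assumed to be a power of two, and the proof is ordered from the leaf layer upward rather than read in descending order from a mapper.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Sketch of SPV-style verification; all names here are hypothetical.
class SpvVerifySketch {

    // One entry of the verification path: a sibling's index and hash.
    static final class ProofNode {
        final int index;
        final String hash;
        ProofNode(int index, String hash) { this.index = index; this.hash = hash; }
    }

    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(input.getBytes(StandardCharsets.UTF_8))) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Heap-ordered tree; records.length is assumed to be a power of two.
    static String[] build(String[] records) {
        int n = records.length;
        String[] tree = new String[2 * n];
        for (int i = 0; i < n; i++) tree[n + i] = sha256Hex(records[i]);
        for (int i = n - 1; i >= 1; i--) tree[i] = sha256Hex(tree[2 * i] + tree[2 * i + 1]);
        return tree;
    }

    // Collects the sibling at each level, from the leaf layer upward.
    static List<ProofNode> proofFor(String[] tree, int leafIndex) {
        List<ProofNode> proof = new ArrayList<>();
        for (int idx = leafIndex; idx > 1; idx >>= 1) {
            int sibling = idx ^ 1; // flips the lowest bit: 2k <-> 2k + 1
            proof.add(new ProofNode(sibling, tree[sibling]));
        }
        return proof;
    }

    // Recomputes the root from the record and the proof, then compares it.
    static boolean verify(String record, List<ProofNode> proof, String trustedRoot) {
        String hash = sha256Hex(record);
        for (ProofNode p : proof) {
            // An even sibling index means the sibling is the left child.
            hash = (p.index & 1) == 0 ? sha256Hex(p.hash + hash) : sha256Hex(hash + p.hash);
        }
        return hash.equals(trustedRoot);
    }

    public static void main(String[] args) {
        String[] tree = build(new String[] {"a", "b", "c", "d"});
        List<ProofNode> proof = proofFor(tree, 6); // index 6 holds the hash of "c"
        System.out.println(verify("c", proof, tree[1]));        // prints true
        System.out.println(verify("tampered", proof, tree[1])); // prints false
    }
}
```

The verifier only ever sees the record, the sibling hashes, and the trusted root, which is the point of SPV: none of the other records are needed.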

In general, the advantage of Merkle + SPV is that verifying whether one piece of information among a large amount has been modified does not require knowing every node; only the nodes on one small branch need to be checked. The number of nodes required is only about log2(n + 1), where n is the number of nodes in the tree. When n is large, the space saved is considerable, and users do not have to bear the cost of storing everything: they only need to keep the block header data of the blockchain.
The rest follows from the properties of the blockchain. Since the Merkle root is stored on the blockchain, the reliability of the corresponding information is essentially guaranteed (hash collisions remain possible, depending on the hash algorithm used).
