Efficient data distribution in distributed clusters: consistent hashing and the distributed hash table (DHT)

I. Introduction

With the growth of the Internet, users generate more and more data, and enterprises face the problem of storing it. The mainstream distributed big-data file systems on the market split data into slices and scatter those slices across all the nodes in the cluster. This article walks through DHT (Distributed Hash Table) and how a distributed hash table implements this discrete, distributed placement of data.


II. Technical background

In the early days of the Internet, data was usually stored on a single server. Data grew relatively slowly at first, and the storage capacity of a single machine could be scaled up to meet demand. As the Internet developed, the number of users and the data they generated and accessed grew exponentially; a single machine could no longer hold everything users needed, so multiple servers had to work together to store larger amounts of data.

III. Traditional Hash

Traditional hashing scatters data with the formula hash(key) mod S, where S is the number of nodes in the cluster. When the keys are reasonably dispersed, the data hashes fairly evenly across the cluster nodes. But because S is the node count, any expansion or contraction of the cluster changes S and therefore changes where historical data maps. To keep the hit rate up, a large amount of data has to be migrated, and performance suffers badly.
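As a hedged illustration (not code from the original article), the following Java sketch shows modulo placement and how many keys would have to move when the cluster grows from 4 to 5 nodes; the key names and counts are made up for the demo.

```java
// Minimal sketch of modulo-based placement and its remapping cost.
public class ModuloPlacement {

    // Map a key to one of s nodes using hash(key) mod s.
    static int nodeFor(String key, int s) {
        return Math.floorMod(key.hashCode(), s);
    }

    public static void main(String[] args) {
        int before = 4, after = 5;           // cluster grows from 4 to 5 nodes
        int moved = 0, total = 10_000;
        for (int i = 0; i < total; i++) {
            String key = "object-" + i;
            if (nodeFor(key, before) != nodeFor(key, after)) {
                moved++;                     // this key would have to migrate
            }
        }
        System.out.printf("%d of %d keys change node (%.1f%%)%n",
                moved, total, 100.0 * moved / total);
    }
}
```

Typically the vast majority of keys land on a different node after S changes, which is exactly the migration problem described above.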

IV. A simple DHT

Consistent hashing constructs a ring of length 2^32 (the size of the IPv4 address space) and hashes the nodes onto this ring. Different hash algorithms produce different values; this article uses the FNV hash algorithm to compute positions on the ring.
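The article does not reproduce its FNV code; the 32-bit FNV-1a variant below is one common formulation and serves only as a sketch of how a ring position could be computed.

```java
// 32-bit FNV-1a hash; the constants are the standard FNV offset basis
// and FNV prime for 32 bits.
public class FnvHash {
    private static final int FNV_OFFSET_BASIS = 0x811C9DC5;
    private static final int FNV_PRIME = 16777619;

    public static int hash(String data) {
        int hash = FNV_OFFSET_BASIS;
        for (int i = 0; i < data.length(); i++) {
            hash ^= data.charAt(i);   // fold in the next character
            hash *= FNV_PRIME;
        }
        // The signed int can be mapped onto the [0, 2^32) ring with
        // Integer.toUnsignedLong(hash) when comparing ring positions.
        return hash;
    }
}
```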
Storage nodes are first placed on the DHT ring by hashing them. The arc between a node and its predecessor is that node's data partition: any data whose hash value falls into this interval is stored on that node.

Data is then hashed onto the same ring with the same hash algorithm. After a piece of data lands on the ring, we walk clockwise and pick the first node encountered as its storage node. For example, data ObjectA falls to node NodeA and data ObjectB falls to node NodeB.
The source code for initializing the DHT ring is structured as follows.
First, a Map holding the metadata of the cluster's physical nodes is defined to record the nodes that have joined the DHT ring. Then the ring itself, vNodes, is defined to store each node's position on the ring. With these two structures we have a simple DHT ring; the addPhysicalNode method simulates a cluster node joining by computing the node's hash value and storing it in vNodes.
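The original code is only shown as screenshots, so the following is a minimal Java reconstruction of the structure described above. The names addPhysicalNode and vNodes come from the text; the choice of HashMap for the metadata and a TreeMap for the ring, and the metadata values themselves, are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// A minimal DHT ring: a map of physical-node metadata plus a sorted
// "vNodes" ring mapping ring positions (hash values) to node names.
public class SimpleDht {
    // Metadata of the physical nodes that have joined the ring.
    private final Map<String, String> physicalNodes = new HashMap<>();
    // The DHT ring: position on the ring -> node name.
    private final TreeMap<Long, String> vNodes = new TreeMap<>();

    // Simulate a cluster node joining: hash the node name and record
    // its position in vNodes.
    public void addPhysicalNode(String nodeName) {
        physicalNodes.put(nodeName, "metadata of " + nodeName);
        vNodes.put(hash(nodeName), nodeName);
    }

    // 32-bit FNV-1a, mapped onto the [0, 2^32) ring.
    static long hash(String data) {
        int h = 0x811C9DC5;
        for (int i = 0; i < data.length(); i++) {
            h ^= data.charAt(i);
            h *= 16777619;
        }
        return Integer.toUnsignedLong(h);
    }
}
```

Joining a node is then a single addPhysicalNode call per node.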
Next, we initialize 4 storage nodes and add them to the ring.
The countNodeValue method then inserts 100 pieces of data. When writing each piece, we locate the nearest node clockwise on the DHT ring according to the data's hash value and write the data to that node.
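The countNodeValue code is likewise only a screenshot in the original; a plausible sketch of the clockwise lookup and the counting loop, written as methods of the SimpleDht class above, is shown here. The data key format "data-N" is an assumption.

```java
// Clockwise lookup on the ring: the first node at or after the data's
// hash owns the data; wrap around to the first node if none is found.
public String nodeFor(String dataKey) {
    long position = hash(dataKey);
    Map.Entry<Long, String> entry = vNodes.ceilingEntry(position);
    if (entry == null) {
        entry = vNodes.firstEntry();   // wrap around the ring
    }
    return entry.getValue();
}

// Insert `count` pieces of data and tally how many land on each node.
public Map<String, Integer> countNodeValue(int count) {
    Map<String, Integer> counts = new HashMap<>();
    for (int i = 0; i < count; i++) {
        String node = nodeFor("data-" + i);
        counts.merge(node, 1, Integer::sum);
    }
    return counts;
}
```

Calling countNodeValue(100) then yields the per-node distribution discussed below.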
After inserting 100 pieces of data, the distribution across the 4 nodes is far from uniform: effectively only one node receives data (this is also partly related to the particular data written).
After inserting 1 million pieces of data, every node receives some data, but the skew is still severe. Node3 would end up handling about 99% of the requests, a classic case of one node doing all the work while the other three look on.

What causes this? Looking at the hash value of each node on the ring, it is easy to see that the gaps between the nodes are very uneven. When inserted data searches clockwise for a node, most of it finds Node3, so most of the data is written to Node3. Uneven node intervals let some nodes cover far more of the ring than others, which leads to unbalanced data.

Having covered the basic principle of a simple DHT ring, consider a problem: the simple ring does scatter the data, yet data skew still occurs; in that respect the traditional hash method actually distributes data better.

As mentioned earlier, with the traditional hash method a node failure triggers a large data migration across the whole cluster, which hurts cluster performance. Can DHT solve this problem?

Using the 1 million pieces of data distributed earlier, we simulate the failure of Node4. Only the data on Node4 is migrated, and it all moves to Node1; Node2 and Node3 see no migration at all. A node failure therefore no longer forces every node in the cluster to take part in data migration.
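A hedged sketch of removing a failed node from the TreeMap-based ring above (the actual copying of the data to the next node clockwise is not shown):

```java
// Remove a failed physical node from the ring. With consistent hashing,
// only the keys that used to map to this node move; they are picked up
// by the next node clockwise, and the rest of the cluster is untouched.
public void removePhysicalNode(String nodeName) {
    physicalNodes.remove(nodeName);
    vNodes.values().removeIf(name -> name.equals(nodeName));
}
```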

V. DHT improvements

1. Virtual node

How, then, can we solve the problem of data skew?

(1) Add more cluster nodes

Adding cluster nodes is the simplest and most direct approach: hashing more nodes onto the DHT ring spreads the nodes more evenly around the ring and makes the intervals between them more balanced. We measured the data distribution for clusters of 10 nodes and of 20 nodes.
It turns out that adding nodes does not fundamentally solve data skew, and it increases the equipment and maintenance cost of the cluster. It also introduces a serious problem: if Node20 fails, all of its data migrates to the next node, which again skews the cluster. The node holding more data also handles more I/O requests, easily becoming a data hotspot and a performance bottleneck that drags down the whole cluster.

(2) Introduce virtual nodes

To solve data skew, the concept of virtual nodes is introduced. A virtual node is a logical replica of a real node: for example, node NodeA is hashed onto the ring three times, forming the virtual nodes NodeA1, NodeA2, and NodeA3. When NodeA fails, the data that pointed at those positions is redirected to NodeB and NodeC.
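A hedged sketch of how virtual nodes could be layered on top of the earlier SimpleDht ring; the "#i" suffix naming is an assumption, not taken from the original code.

```java
// Add a physical node together with `replicas` virtual copies.
// Each virtual node gets its own position on the ring but maps back
// to the same physical node, which evens out the intervals.
public void addPhysicalNode(String nodeName, int replicas) {
    physicalNodes.put(nodeName, "metadata of " + nodeName);
    for (int i = 0; i < replicas; i++) {
        // e.g. "NodeA#0", "NodeA#1", ... hashed to distinct positions
        vNodes.put(hash(nodeName + "#" + i), nodeName);
    }
}
```

The clockwise lookup is unchanged; it simply resolves a virtual position to the owning physical node.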
With 100 virtual nodes per physical node, the data is already spread across every node. Given enough virtual nodes, the distribution eventually approaches balance.

The data distribution with 10,000 virtual nodes and with 1 million virtual nodes confirms this trend.
When Node3 fails, the data on Node3 is evenly distributed to other nodes without data skew.

2. Load boundary factor

Is this perfect? We initialize a DHT ring with 4 nodes, set the number of virtual nodes to 100, insert 100 pieces of data, and print the ring's metadata. Even with virtual nodes configured, the nodes are still not perfectly balanced on the ring: Node2 is overloaded while Node1 sits idle. Consider an extreme scenario: if our data happens to hash into a single interval A, and that interval belongs only to NodeA, data skew appears again. To address this we introduce a load boundary factor. With 4 nodes on the ring and 100 pieces of data to insert, each node should take at most 100/4 + 1 = 26 pieces on average; when a node reaches this limit during data mapping, the data is mapped to the next node instead. The code implementation is sketched below.
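The original implementation is only shown as screenshots; the following is a minimal sketch of the idea, written as a method of the SimpleDht class above. The method name, the counts map, and the walk over virtual positions are assumptions; only the 100/4 + 1 = 26 limit comes from the text.

```java
// Clockwise lookup with a load boundary: a node may own at most
// limit = totalData / nodeCount + 1 pieces of data; once it is full,
// the data moves on to the next node clockwise.
public String nodeForWithBoundary(String dataKey, Map<String, Integer> counts,
                                  int totalData) {
    int limit = totalData / physicalNodes.size() + 1;   // e.g. 100/4 + 1 = 26
    Long position = vNodes.ceilingKey(hash(dataKey));
    if (position == null) {
        position = vNodes.firstKey();                    // wrap around
    }
    // Walk clockwise until a node below the boundary is found.
    for (int i = 0; i < vNodes.size(); i++) {
        String node = vNodes.get(position);
        if (counts.getOrDefault(node, 0) < limit) {
            counts.merge(node, 1, Integer::sum);
            return node;
        }
        Long next = vNodes.higherKey(position);
        position = (next != null) ? next : vNodes.firstKey();
    }
    return vNodes.get(position);   // every node at the limit; give up gracefully
}
```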
With the load boundary factor switched on, the data is well balanced.

VI. Thoughts triggered by DHT

The above is only a simplified DHT, and the data model has been simplified as well. Every read and write has to consult the DHT ring, so how do we improve DHT's read and write performance? How do we make it highly reliable? When a node fails, how is its data migrated to a new node? How is data backed up, and how do we make sure replicas do not end up concentrated on one node? All of these still need to be considered. This article only introduces the basic ideas of DHT; the many additional challenges of a production environment are not expanded on here.

As we have seen, DHT offers an approach to load balancing: it uses the properties of the hash algorithm to spread data or business requests across the nodes of a cluster, improving the fault tolerance of the system.


Origin: blog.csdn.net/qq_32727095/article/details/113858454