Analysis of the Principle of Consistent Hashing

Introduction

In business applications, we usually persist data to a database. To reduce the load on the database and speed up reads, we often put a cache in front of it instead of always reading from the database directly. Reading data generally proceeds as follows:

Figure 1: The process of reading data into the cache

For a distributed cache, data for different objects is stored on different machines. In order to achieve load balancing of these cache machines, Equation 1 can be used to locate the storage machine of the object cache:

m = hash(o) mod n - Equation 1

Here, o is the name of the object, n is the number of machines, m is the number of the selected machine, and hash is a hash function. The load balancer in Figure 2 uses Equation 1 to dispatch client requests for different objects to different machines. For example, if Equation 1 yields m = 3 for object o, then all read and store requests for object o are sent to machine 3.

Figure 2: How to use Hash modulo to achieve load balancing
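
As a rough illustration of Equation 1, the following Python sketch maps object names to machine numbers. The helper names `stable_hash` and `pick_machine`, the object names, and the choice of SHA-256 as the hash function are assumptions made for this example; the article does not prescribe a particular hash function.

```python
import hashlib


def stable_hash(name: str) -> int:
    """Deterministic 32-bit hash: the first 4 bytes of SHA-256 (an assumed choice)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def pick_machine(obj_name: str, n: int) -> int:
    """Equation 1: m = hash(o) mod n, so m is always in 0..n-1."""
    return stable_hash(obj_name) % n


# Route a few hypothetical object names across n = 4 machines.
for obj in ["o1", "o2", "o3"]:
    print(obj, "-> machine", pick_machine(obj, 4))
```

Python's built-in `hash()` is deliberately avoided here because string hashes are randomized per process, which would send the same object to different machines after a restart.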

Equation 1 works well most of the time. However, things get trickier when the cluster needs to be scaled up or a machine goes down.
When the cluster is scaled up by one cache machine, the formula used by the load balancer becomes:

m = hash(o) mod (n + 1) - Equation 2

When a machine goes down and the number of machines drops by one, the formula used by the load balancer becomes:

m = hash(o) mod (n - 1) - Equation 3

Let's take scaling up as an example to show what goes wrong with the simple modulo method. Suppose the number of machines changes from 3 to 4. If Equation 1 gives m = 2 for object o1, then hash(o1) has the form 3t + 2, and Equation 2 may give 0, 1, 2, or 3 (an integer of the form 3t + 2 taken modulo 4 can be 0, 1, 2, or 3; readers can verify this themselves). So there is roughly a 75% (3/4) chance of a cache miss. As the cluster grows, this ratio gets worse rather than better: going from n to n + 1 machines, about n/(n + 1) of the lookups miss. When 1 machine is added to 99 machines, the miss probability is 99% (99/100). Such a result is clearly unacceptable, because it causes a sudden spike in database load and, in severe cases, may even bring the database down.
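
The 3/4 and 99/100 figures can be checked by brute force. The sketch below assumes hash values are spread uniformly over the integers and counts, over one full period n(n + 1), how many values map to a different machine after growing from n to n + 1 machines. The function name `modulo_miss_rate` is made up for this example.

```python
from fractions import Fraction


def modulo_miss_rate(n: int) -> Fraction:
    """Fraction of hash values whose machine changes when n grows to n + 1."""
    period = n * (n + 1)  # both "mod n" and "mod (n + 1)" repeat with this period
    moved = sum(1 for h in range(period) if h % n != h % (n + 1))
    return Fraction(moved, period)


print(modulo_miss_rate(3))   # 3/4
print(modulo_miss_rate(99))  # 99/100
```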

Consistent hashing is designed to solve exactly this problem: it keeps the impact on the cache hit rate as small as possible when machines are added or removed. The rest of this article walks through the algorithm in detail.

Consistent Hash Ring

Consistent hashing is implemented with a data structure called a consistent hash ring. The ring starts at 0 and ends at 2^32 - 1, with the start and end joined. The integers in between are laid out counterclockwise around the ring, so the ring covers the integer range [0, 2^32 - 1], as shown in Figure 3:

Figure 3: Consistent Hash Ring
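
To make the ring concrete, here is a minimal sketch of a function that maps any string to a point in [0, 2^32 - 1]. The name `ring_position` and the use of the first 4 bytes of SHA-256 are assumptions for illustration; any hash that spreads keys evenly over the range would do.

```python
import hashlib

RING_SIZE = 2 ** 32  # ring positions are 0 .. 2^32 - 1


def ring_position(key: str) -> int:
    """Map a key to a point on the ring using the first 4 bytes of SHA-256."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")  # 4 bytes always fit in [0, 2^32 - 1]


for key in ["o1", "o2", "c1"]:
    pos = ring_position(key)
    print(f"{key}: {pos} ({pos / RING_SIZE:.1%} of the way around the ring)")
```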

Place an object into the Hash ring

Suppose we have 4 objects, o1, o2, o3, and o4. We use the hash function to compute a hash value for each of them (in the range 0 to 2^32 - 1):

hash(o1) = m1 
hash(o2) = m2 
hash(o3) = m3 
hash(o4) = m4

Put the four values m1, m2, m3, and m4 on the hash ring, which gives Figure 4:

Figure 4: Consistent Hash Ring with Objects placed

Place the machine in the Hash ring

Using the same hash function, we also place the machines on the hash ring. Suppose we have three cache machines, c1, c2, and c3. We use the hash function to compute a hash value for each of them:

hash(c1) = t1 
hash(c2) = t2 
hash(c3) = t3

Put the three values t1, t2, and t3 on the hash ring, which gives Figure 5:

Figure 5: Consistent hash ring with machines placed
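
One simple way to represent the ring in code is a list of (position, machine) pairs sorted by position. The sketch below assumes a hypothetical `ring_position` helper based on SHA-256 and hashes the machine names directly; in a real deployment the hashed string might be an IP address or host name instead.

```python
import hashlib


def ring_position(key: str) -> int:
    """Map a key to a point in [0, 2^32 - 1] (assumed hash: first 4 bytes of SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


machines = ["c1", "c2", "c3"]
# Each entry is (position, machine name); sorting orders them around the ring.
ring = sorted((ring_position(name), name) for name in machines)

for position, name in ring:
    print(f"{name} sits at {position}")
```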

Select the machine for the object

After placing both the objects and the machines on the same hash ring, search the ring clockwise from the object's hash value. The first machine encountered is the machine the object belongs to.
For example, for object o2, the first machine found clockwise is c1, so machine c1 caches object o2. Likewise, machine c2 caches o3 and o4, and machine c3 caches object o1.

Figure 6: Selecting machines for objects on a consistent hash ring
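
The lookup can be sketched with a sorted list and binary search. This example assumes that walking the ring in the search direction corresponds to increasing hash values, wrapping from 2^32 - 1 back to 0; the opposite convention works symmetrically. The helper names `ring_position`, `build_ring`, and `find_machine` are hypothetical.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a key to a point in [0, 2^32 - 1] (assumed hash: first 4 bytes of SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(machines):
    """Return the machines as a list of (position, name) sorted by position."""
    return sorted((ring_position(name), name) for name in machines)


def find_machine(ring, obj_name):
    """Return the first machine at or after the object's position, wrapping at the end."""
    positions = [position for position, _ in ring]
    index = bisect.bisect_left(positions, ring_position(obj_name))
    return ring[index % len(ring)][1]  # index == len(ring) wraps to the first machine


ring = build_ring(["c1", "c2", "c3"])
for obj in ["o1", "o2", "o3", "o4"]:
    print(obj, "->", find_machine(ring, obj))
```

Because the hash values here come from SHA-256 rather than from the figures, the printed assignments will not necessarily match the o1 to c3, o2 to c1 layout drawn in Figure 6.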

Handling machine additions and removals

For an online service, adding or removing machines is common.
For example, suppose machine c4 is added and lands on the hash ring between machines c3 and c2. Then only the objects between machines c3 and c4 need to be reassigned to the new machine. In our example, only object o4 is reassigned to c4; the other objects stay on their original machines, as shown in Figure 7:

Figure 7: The structure of the consistent hash ring after adding machines

As mentioned above, with the simple modulo method most cached entries become invalid when a new machine is added. Consistent hashing improves this greatly. As noted earlier, with modulo hashing, going from 3 machines to 4 leaves a cache hit rate of only 25% (a miss rate of 75%). With consistent hashing, the hit rate is ideally 75%, and it improves further as the cluster grows: when 99 machines grow to 100, the hit rate reaches 99%, which greatly reduces the extra database load caused by adding a cache machine.
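
A small simulation can check that adding a machine only moves objects onto the new machine. The sketch below uses assumed helpers (`ring_position`, `build_ring`, `find_machine`) and 10,000 made-up object names; it compares each object's machine before and after c4 joins.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a key to a point in [0, 2^32 - 1] (assumed hash: first 4 bytes of SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(machines):
    return sorted((ring_position(name), name) for name in machines)


def find_machine(ring, obj_name):
    positions = [position for position, _ in ring]
    index = bisect.bisect_left(positions, ring_position(obj_name))
    return ring[index % len(ring)][1]


keys = [f"object-{i}" for i in range(10_000)]
old_ring = build_ring(["c1", "c2", "c3"])
new_ring = build_ring(["c1", "c2", "c3", "c4"])

before = {key: find_machine(old_ring, key) for key in keys}
after = {key: find_machine(new_ring, key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]

# Every key that moved must now live on the new machine c4.
print("all moved keys went to c4:", all(after[key] == "c4" for key in moved))
print(f"fraction of keys that moved: {len(moved) / len(keys):.1%}")
```

With a single point per machine, the fraction that moves depends on where c4 happens to land on the ring, so it can differ noticeably from the ideal 1/4.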

As another example, when machine c1 is taken offline (or goes down), only the objects originally assigned to c1 need to be reassigned, and they go to the next machine on the ring. In our example, only object o2 is reassigned to machine c3; the other objects stay on their original machines, as shown in Figure 8:

Figure 8: The structure of the consistent hash ring after removing a machine
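
Removal can be checked the same way. In this sketch (helper names and object names are assumptions), machine c1 is dropped from the ring and each object's machine is compared before and after.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a key to a point in [0, 2^32 - 1] (assumed hash: first 4 bytes of SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(machines):
    return sorted((ring_position(name), name) for name in machines)


def find_machine(ring, obj_name):
    positions = [position for position, _ in ring]
    index = bisect.bisect_left(positions, ring_position(obj_name))
    return ring[index % len(ring)][1]


keys = [f"object-{i}" for i in range(10_000)]
full_ring = build_ring(["c1", "c2", "c3"])
reduced_ring = build_ring(["c2", "c3"])  # c1 is offline

before = {key: find_machine(full_ring, key) for key in keys}
after = {key: find_machine(reduced_ring, key) for key in keys}

orphans = [key for key in keys if before[key] == "c1"]
untouched_ok = all(after[key] == before[key] for key in keys if before[key] != "c1")
new_homes = {after[key] for key in orphans}

print("objects not on c1 stayed put:", untouched_ok)
print("machines that inherited c1's objects:", new_homes)
```

Without virtual nodes, all of c1's objects land on a single neighbour, so that one machine absorbs the entire extra load.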

Virtual nodes

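The process described above is essentially the basic principle of consistent hashing, but one small problem remains. The newly added machine c4 only takes over part of the load of machine c2; the load on machines c1 and c3 is not reduced by adding c4. If all 4 machines have the same capacity, this is not the result we want.
To address this, we introduce virtual nodes to solve the load imbalance.
Each physical machine is represented by a group of virtual machines, and the virtual machines are placed on the hash ring. To find the machine for an object, first find the object's virtual machine, then map the virtual machine to its physical machine.
This sounds a bit complicated, but the process is actually simple.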

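Using the same example, suppose we start with cache machines c1, c2, and c3, each with 3 corresponding virtual nodes. The resulting consistent hash ring is shown in Figure 9: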

Figure 9: Consistent hash ring for machines c1, c2, c3

Suppose the virtual node for object o1 is c11, and virtual node c11 corresponds to cache machine c1. Object o1 is therefore assigned to machine c1.

Now add cache machine c4, whose virtual nodes are c41, c42, and c43. Adding these three virtual nodes to the hash ring gives the structure shown in Figure 10:

Figure 10: Consistent hash ring for machines c1, c2, c3, c4

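The new cache machine c4 corresponds to the group of virtual nodes c41, c42, and c43. After they join the hash ring, the affected virtual nodes are c31, c22, and c11 (the first node found clockwise from each new virtual node), which belong to machines c3, c2, and c1 respectively. In other words, one newly added machine affects all 3 existing machines. Ideally, the new machine takes an equal share of the load from each existing machine, which is exactly the benefit virtual nodes bring. Moreover, after c4 is added, only 25% (1/4) of the object assignments are affected, so the hit rate is still 75%, the same as with consistent hashing without virtual nodes.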
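
The effect of virtual nodes can be sketched as follows. The naming scheme `"c1#0"`, `"c1#1"`, ... for virtual nodes and the choice of 100 virtual nodes per machine (instead of the 3 used in the figures) are assumptions for this example; more virtual nodes simply make the load split smoother.

```python
import bisect
import hashlib
from collections import Counter


def ring_position(key: str) -> int:
    """Map a key to a point in [0, 2^32 - 1] (assumed hash: first 4 bytes of SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(machines, replicas):
    """Place `replicas` virtual nodes per machine; each entry is (position, machine)."""
    return sorted(
        (ring_position(f"{machine}#{i}"), machine)
        for machine in machines
        for i in range(replicas)
    )


def assign(machines, keys, replicas):
    """Map every key to the physical machine that owns its virtual node."""
    ring = build_ring(machines, replicas)
    positions = [position for position, _ in ring]
    result = {}
    for key in keys:
        index = bisect.bisect_left(positions, ring_position(key))
        result[key] = ring[index % len(ring)][1]
    return result


keys = [f"object-{i}" for i in range(10_000)]
before = assign(["c1", "c2", "c3"], keys, replicas=100)
after = assign(["c1", "c2", "c3", "c4"], keys, replicas=100)

moved = [key for key in keys if before[key] != after[key]]
donors = Counter(before[key] for key in moved)  # which machines gave keys to c4
print("load after adding c4:", Counter(after.values()))
print("keys taken from each old machine:", donors)
```

The second line of output shows how many objects c4 took from each existing machine, which is the quantity the paragraph above describes as being shared among c1, c2, and c3.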

Summary

Consistent hashing solves the problem that, in a distributed environment, a simple modulo operation cannot maintain a high cache hit rate when the number of machines increases or decreases. With virtual nodes, consistent hashing can also spread the load evenly across machines, which makes the algorithm more practical. For these reasons, consistent hashing is widely used in distributed systems.


Source of this article: leehao.me https://blog.csdn.net/lihao21/article/details/54193868
