Good Java Programmer Training: Understand the Consistent Hashing Algorithm in Five Minutes

The consistent hashing algorithm was designed to solve hot-spot problems on the Internet, and it is now widely used in distributed systems.

Take load balancing as an example. The plain hash-modulo algorithm scales poorly: when servers are added or removed, the mapping between keys and servers breaks. Consistent hashing handles this problem much better.

Problems with Hash Modulo

Suppose we have a huge number of pictures to store on servers. With four servers, we can hash the picture name and take the result modulo 4 to decide which server stores the picture:

12557379-37b3b9f3fbeb97c9.png

If we add servers, the formula for locating a picture changes. For example, with one more server it becomes hash(a.jpg) % 5, and the result is no longer necessarily 2, so the picture's location changes. Removing a server causes the same problem. Worse, every server is affected.
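The effect of changing the server count can be checked with a small sketch. It assumes Java's built-in `String.hashCode()` as the hash function and hypothetical picture names of the form `pic-N.jpg`; neither comes from the original article, so the concrete server numbers will differ from the figure.

```java
import java.util.ArrayList;
import java.util.List;

class ModuloRemapDemo {
    // Picks a server index by hashing the file name and taking it modulo the server count.
    static int serverFor(String fileName, int serverCount) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(fileName.hashCode(), serverCount);
    }

    // Counts how many names land on a different server after the server count changes.
    static int countMoved(List<String> names, int before, int after) {
        int moved = 0;
        for (String name : names) {
            if (serverFor(name, before) != serverFor(name, after)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Hypothetical picture names, used only for illustration.
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            names.add("pic-" + i + ".jpg");
        }
        int moved = countMoved(names, 4, 5);
        System.out.println(moved + " of " + names.size() + " pictures change server when going from 4 to 5 servers");
    }
}
```

With an evenly distributed hash, a key keeps its server only when its hash gives the same remainder modulo 4 and modulo 5, which holds for roughly one key in five, so most pictures would have to move.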

Consistent Hashing Algorithm

The consistent hashing algorithm maps the hash value space onto a virtual ring. The range of values is usually 0 to 2^32 - 1, which means the hash value is taken modulo 2^32. The hash ring can be drawn as follows:

12557379-fa585c66fde6e434.png

If we have four servers, we can use each server's IP address or host name as the hash key and take the hash modulo 2^32, which gives each machine a fixed position on the ring, as shown below:

12557379-89a4500f7815877d.png

Suppose there are four pieces of data: Object A, Object B, Object C, and Object D. After hashing and taking the modulo, their positions on the ring are as shown below:

12557379-de01b3fee8a43e55.png

Starting from a data item's position, "walk" clockwise around the ring; the first server encountered is the server the item is assigned to. So Object A is assigned to Node A, Object B to Node B, Object C to Node C, and Object D to Node D.
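The clockwise walk can be sketched with a sorted map. This is a minimal illustration, not a production implementation: the class name `HashRing` is hypothetical, and 32-bit FNV-1a is an assumed stand-in hash (the article does not name one) whose output already lies in 0 to 2^32 - 1.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

class HashRing {
    // Ring positions (0 .. 2^32 - 1) mapped to the server placed at that position.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    // Assumed stand-in hash: 32-bit FNV-1a, kept in a long so the value stays non-negative.
    static long position(String key) {
        long h = 0x811C9DC5L;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h = (h * 0x01000193L) & 0xFFFFFFFFL; // multiply by the FNV prime, modulo 2^32
        }
        return h;
    }

    void addServer(String server) {
        ring.put(position(server), server);
    }

    void removeServer(String server) {
        ring.remove(position(server));
    }

    // Walks clockwise from the key's position to the first server,
    // wrapping around to the lowest position when the end of the ring is passed.
    String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers on the ring");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(position(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing();
        for (String server : new String[] {"Node A", "Node B", "Node C", "Node D"}) {
            ring.addServer(server);
        }
        for (String key : new String[] {"Object A", "Object B", "Object C", "Object D"}) {
            System.out.println(key + " -> " + ring.serverFor(key));
        }
    }
}
```

Because the positions depend on the hash function chosen, the assignments printed here need not match the ones drawn in the figure; only the lookup rule is the same.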

If Node C goes down, Object C is reassigned to Node D. When a server fails, only the next server in the clockwise direction is affected; in this example, only Node D is affected.

Similarly, suppose a server Node X is added and its computed position is as shown in the following figure:

12557379-f0c56d11962caf4a.png

Object C is then assigned to Node X. Only Node C, the next server clockwise from Node X, is affected; the other servers are not.
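Both cases, a node going down and a node being added, can be checked with a short sketch. It again assumes 32-bit FNV-1a as a stand-in hash and hypothetical keys of the form `pic-N.jpg`; the helper names (`ringOf`, `locate`, `moves`) are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class RingChangeDemo {
    // Assumed stand-in hash: 32-bit FNV-1a, giving a position in 0 .. 2^32 - 1.
    static long position(String key) {
        long h = 0x811C9DC5L;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h = (h * 0x01000193L) & 0xFFFFFFFFL;
        }
        return h;
    }

    // Builds a ring that maps each server's position to its name.
    static TreeMap<Long, String> ringOf(String... servers) {
        TreeMap<Long, String> ring = new TreeMap<>();
        for (String server : servers) {
            ring.put(position(server), server);
        }
        return ring;
    }

    // First server clockwise from the key's position, wrapping around the ring.
    static String locate(TreeMap<Long, String> ring, String key) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(position(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Collects the distinct "old -> new" server pairs for keys whose server differs between the rings.
    static Set<String> moves(TreeMap<Long, String> before, TreeMap<Long, String> after, List<String> keys) {
        Set<String> result = new TreeSet<>();
        for (String key : keys) {
            String from = locate(before, key);
            String to = locate(after, key);
            if (!from.equals(to)) {
                result.add(from + " -> " + to);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            keys.add("pic-" + i + ".jpg"); // hypothetical keys
        }
        TreeMap<Long, String> four = ringOf("Node A", "Node B", "Node C", "Node D");
        System.out.println("Node C removed: " + moves(four, ringOf("Node A", "Node B", "Node D"), keys));
        System.out.println("Node X added:   "
                + moves(four, ringOf("Node A", "Node B", "Node C", "Node D", "Node X"), keys));
    }
}
```

Each printed set contains at most one pair: removing a node hands its keys to a single clockwise neighbour, and a new node takes keys from a single clockwise neighbour. Which neighbour that is depends on the hash positions, so it need not be the same node as in the figures.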

Disadvantage of Consistent Hashing

When there are only a few server nodes, consistent hashing can suffer from data skew (most of the data ends up on a single server). The usual solution, without changing the number of server nodes, is to add virtual nodes: compute several hash values for each server and place a virtual node on the ring at each result. When locating data, the virtual node found on the ring is mapped back to the actual server.
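A minimal sketch of virtual nodes follows. The `server + "#" + i` naming scheme for virtual nodes, the class name `VirtualNodeRing`, the sample keys, and the FNV-1a hash are all assumptions made for illustration, not details from the article.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

class VirtualNodeRing {
    // Virtual node position -> name of the real server behind it.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int replicas;

    VirtualNodeRing(int replicas) {
        this.replicas = replicas;
    }

    // Assumed stand-in hash: 32-bit FNV-1a, giving a position in 0 .. 2^32 - 1.
    static long position(String key) {
        long h = 0x811C9DC5L;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h = (h * 0x01000193L) & 0xFFFFFFFFL;
        }
        return h;
    }

    // Places `replicas` virtual nodes for one real server (hypothetical "#i" suffix).
    void addServer(String server) {
        for (int i = 0; i < replicas; i++) {
            ring.put(position(server + "#" + i), server);
        }
    }

    // Removes only the virtual nodes that still point at this server.
    void removeServer(String server) {
        for (int i = 0; i < replicas; i++) {
            ring.remove(position(server + "#" + i), server);
        }
    }

    // Finds the first virtual node clockwise and returns the real server it stands for.
    String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers on the ring");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(position(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Counts how many of n hypothetical sample keys each real server receives.
    Map<String, Integer> distribution(int n) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            counts.merge(serverFor("pic-" + i + ".jpg"), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        for (int replicas : new int[] {1, 100}) {
            VirtualNodeRing ring = new VirtualNodeRing(replicas);
            ring.addServer("Node A");
            ring.addServer("Node B");
            System.out.println(replicas + " virtual node(s) per server: " + ring.distribution(10000));
        }
    }
}
```

Running `main` prints the key counts per server for 1 and for 100 virtual nodes per server, which makes it easy to compare how evenly the data spreads in each case.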

Summary

With consistent hashing, adding or removing a node relocates only a small portion of the data on the ring, so the algorithm has good fault tolerance and scalability.

Reproduced from: https://www.jianshu.com/p/45f020b6ced9

Origin blog.csdn.net/weixin_34405557/article/details/91276129