The Hash Modulo Method and the Consistent Hashing Algorithm

1. The hash modulo method

The hash modulo method is actually very simple. Anyone who has studied hash tables in a data structures course will probably know it (it is a commonly used algorithm).
This section looks at how hash modulo is used in a distributed system.

1.1 What is the hash modulo method?

To keep the illustration simple, microservices, redis and the like are left out. (The example below does not match the current development model, but it is enough to show what hash modulo can solve and what problems it causes.)
Take the following figure as an example.
Each Tomcat below is a standalone application, and the back-end cluster sits behind nginx load balancing. So that each client keeps reaching the server that holds its session, nginx balances the load with a modulo algorithm.

Load balancing algorithm: tomcat index = Hash(ip address) % number of tomcats, i.e. ("ip address".hashCode() & Integer.MAX_VALUE) % total number of tomcats
The hash algorithm turns the IP address into a hash value (an integer), and the modulo is then taken. There are three tomcats here, so the divisor is 3 and the remainder is 0, 1 or 2. The same IP address is therefore always assigned to the same tomcat on every visit.
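
The formula above can be sketched in Java as follows. This is only a minimal illustration: the class name `ModuloBalancer` and the method `pickServer` are hypothetical names, not part of nginx or Tomcat. Only the `hashCode() & Integer.MAX_VALUE` expression comes from the text.

```java
final class ModuloBalancer {
    private ModuloBalancer() {}

    /**
     * Maps a client IP to a server index in [0, serverCount).
     * Hypothetical helper for illustration, not an nginx API.
     */
    static int pickServer(String clientIp, int serverCount) {
        if (serverCount <= 0) {
            throw new IllegalArgumentException("serverCount must be positive");
        }
        // String.hashCode() can be negative; clearing the sign bit keeps
        // the value non-negative so the remainder is a valid index.
        int hash = clientIp.hashCode() & Integer.MAX_VALUE;
        return hash % serverCount;
    }
}
```

With three tomcats, `ModuloBalancer.pickServer("192.168.0.7", 3)` returns the same index on every call, which is what keeps a client on the server that holds its session.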

Advantages:
1) The algorithm is simple and efficient (arithmetic is what a CPU does best).
2) The hash value of an IP address is always the same, which solves the problem of sessions failing because each request is load-balanced to a different tomcat.

Disadvantages:
1) When the number of tomcats increases or decreases, a large number of sessions fail.
Let's work through it. At the start, three tomcats hold the sessions. If the three cannot withstand the access pressure, a tomcat has to be added and the count becomes four. The formula then becomes Hash(ip address) % 4, so every request is effectively rehashed and most requests land on a different tomcat than before, which causes a large number of session failures. A request keeps its tomcat only if its hash gives the same remainder modulo 3 and modulo 4. For evenly spread hash values that happens about 1/4 = 25% of the time, so roughly 75% of sessions fail. Likewise, if one tomcat goes down and three become two, a request keeps its tomcat only if the remainders modulo 3 and modulo 2 agree, which happens about 1/3 of the time, so roughly 67% of sessions fail.
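
The effect of changing the divisor can be checked by brute force. The sketch below assumes hash values are spread evenly, so it simply walks over consecutive integers. `RemapCounter` and `countMoved` are hypothetical names used for illustration.

```java
final class RemapCounter {
    private RemapCounter() {}

    /**
     * Counts how many hash values in [0, sampleSize) map to a different
     * server index after the server count changes from oldCount to newCount.
     */
    static int countMoved(int oldCount, int newCount, int sampleSize) {
        int moved = 0;
        for (int hash = 0; hash < sampleSize; hash++) {
            // The request stays put only if both remainders agree.
            if (hash % oldCount != hash % newCount) {
                moved++;
            }
        }
        return moved;
    }
}
```

`RemapCounter.countMoved(3, 4, 12)` returns 9, i.e. 9 of every 12 hash values (75%) change servers when a fourth tomcat is added. `RemapCounter.countMoved(3, 2, 6)` returns 4, i.e. 4 of every 6 (about 67%) change servers when one of three tomcats is removed.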

To avoid this kind of mass session failure, the consistent hashing algorithm was introduced.
[Figure]

2. The consistent hashing algorithm

2.1 What is consistent hashing?

Consistent hashing was proposed in 1997 at the Massachusetts Institute of Technology.
It works like this: treat the whole range of hash values as a ring. The largest hash value is 2^32 - 1 (about 4.3 billion) and the smallest is 0. The ring wraps around, so (2^32 - 1) + 1 = 0 and 0 - 1 = 2^32 - 1 (much like unsigned int arithmetic in C). The hash values therefore form one big circle.
[Figure: the hash ring, running from 0 to 2^32 - 1]
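
The wrap-around arithmetic of the ring can be written out as a small sketch. Java has no unsigned 32-bit type, so this version keeps positions in a `long` and reduces them modulo 2^32. `RingArithmetic` and `move` are hypothetical names for illustration.

```java
final class RingArithmetic {
    private RingArithmetic() {}

    /** Largest position on the ring, 2^32 - 1. */
    static final long MAX_POSITION = 0xFFFFFFFFL;

    /**
     * Moves `steps` positions clockwise from `position`
     * (negative steps move counter-clockwise), wrapping modulo 2^32.
     */
    static long move(long position, long steps) {
        // Math.floorMod keeps the result non-negative even for negative steps.
        return Math.floorMod(position + steps, MAX_POSITION + 1);
    }
}
```

`RingArithmetic.move(4294967295L, 1)` gives 0, and `RingArithmetic.move(0, -1)` gives 4294967295, matching the two wrap-around cases described above.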
Here the positions of the three tomcats are computed with the modulo method, i.e. Hash(tomcat's ip) % 2^32,
which gives the following distribution.
[Figure: tomcat1, tomcat2 and tomcat3 placed on the hash ring]
When a request arrives, its IP address is hashed in the same way, i.e. Hash(ip address of the request) % 2^32.
For example, suppose there are three requests: request 1, request 2 and request 3.
Each request moves clockwise around the ring until it finds a tomcat server.
Request 3 then finds tomcat1, request 1 finds tomcat2, and request 2 finds tomcat3.
[Figure: requests 1, 2 and 3 on the ring, each finding a tomcat clockwise]
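
The clockwise lookup can be sketched with a sorted map. This is a minimal illustration under stated assumptions, not a production implementation: the class `HashRing` and its methods are hypothetical names, and CRC32 is used as the hash only because it is in the JDK and yields a value in [0, 2^32 - 1]. The text does not say which hash function to use.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

final class HashRing {
    // Ring position -> server name; sorted so "next clockwise" is a ceiling lookup.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    /** Hashes a key to a position in [0, 2^32 - 1]. CRC32 is this sketch's assumption. */
    static long position(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Places a server on the ring at Hash(server ip). */
    void addServer(String serverIp) {
        addServerAt(serverIp, position(serverIp));
    }

    /** Places a server at an explicit position (handy for small examples). */
    void addServerAt(String server, long pos) {
        ring.put(pos, server);
    }

    /** Finds the server responsible for a client IP. */
    String locate(String clientIp) {
        return locateAt(position(clientIp));
    }

    /** Walks clockwise from pos to the first server, wrapping past 2^32 - 1 back to 0. */
    String locateAt(long pos) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers on the ring");
        }
        Map.Entry<Long, String> next = ring.ceilingEntry(pos);
        return (next != null ? next : ring.firstEntry()).getValue();
    }
}
```

For instance, with servers placed at the made-up positions 1000 (tomcat1), 2000 (tomcat2) and 3000 (tomcat3), `locateAt(1500)` returns tomcat2, and `locateAt(3500)` wraps around the ring and returns tomcat1.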
So what problem does this solve?
Suppose there are nine requests. The hash algorithm assigns each one to a position between 0 and about 4.3 billion, and each request visits the nearest tomcat server in the clockwise direction, as shown in the figure.
[Figure: nine requests on the ring, each routed clockwise to the nearest tomcat]
When tomcat1 goes down, all requests that used to go to tomcat1 turn to tomcat2 instead. The sessions of the requests that were on tomcat1 fail, but the requests that were on tomcat2 and tomcat3 still reach the same server as before, so their sessions do not fail.
Compare this with the earlier hash modulo method, where adding a tomcat or losing one to downtime caused a large number of session failures. Isn't consistent hashing much better? The more tomcats are added or removed, the more obvious the difference between hash modulo and consistent hashing becomes.

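So what happens with consistent hashing when a tomcat is added?
As the figure shows, after tomcat4 is added, some of the requests that originally went to tomcat1 go to tomcat4 instead. Only a small share of requests is affected, so the impact is not very large, and fault tolerance improves.
[Figure: tomcat4 added to the ring, taking over part of tomcat1's requests]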
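
The effect of removing or adding a server can be counted on a toy ring. The sketch below assumes a tiny ring with 3000 positions instead of 2^32 and made-up server positions (999, 1999, 2999), purely to keep the numbers easy to follow. `RingChangeDemo` and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

final class RingChangeDemo {
    private RingChangeDemo() {}

    /** Clockwise lookup: first server at or after pos, wrapping to the first entry. */
    static String owner(TreeMap<Long, String> ring, long pos) {
        Map.Entry<Long, String> next = ring.ceilingEntry(pos);
        return (next != null ? next : ring.firstEntry()).getValue();
    }

    /** Counts positions 0, step, 2*step, ... below limit whose server differs between two rings. */
    static int countMoved(TreeMap<Long, String> before,
                          TreeMap<Long, String> after,
                          long limit, long step) {
        int moved = 0;
        for (long pos = 0; pos < limit; pos += step) {
            if (!owner(before, pos).equals(owner(after, pos))) {
                moved++;
            }
        }
        return moved;
    }

    /** Toy ring: each tomcat serves 1000 of the 3000 positions (made-up values). */
    static TreeMap<Long, String> exampleRing() {
        TreeMap<Long, String> ring = new TreeMap<>();
        ring.put(999L, "tomcat1");
        ring.put(1999L, "tomcat2");
        ring.put(2999L, "tomcat3");
        return ring;
    }
}
```

Removing tomcat1 (key 999) from a copy of the ring moves exactly the 1000 positions that tomcat1 served, all of them to tomcat2, while the positions served by tomcat2 and tomcat3 stay where they were. Adding a tomcat4 at position 499 moves only positions 0 to 499, that is 500 of 3000, from tomcat1 to tomcat4.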

3. The balance problem of consistent hashing

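After reading about consistent hashing, you may have a question. Does hashing really guarantee an even spread? There is always a small chance that most requests are assigned to tomcat1 while tomcat2 and tomcat3 receive only a few. Put differently, with too few tomcats the hashed positions can be unevenly distributed, which leaves some servers idle and others very busy, as in the figure below: tomcat2 takes a large number of requests while tomcat3 and tomcat1 sit almost idle.
[Figure: an unbalanced ring where tomcat2 receives most of the requests]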
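To fix this imbalance, the concept of virtual nodes is introduced.
First, tomcat1, tomcat2 and tomcat3 are all real nodes.
From them we can create the virtual nodes tomcat1-1, tomcat1-2, tomcat2-1, tomcat2-2, tomcat3-1 and tomcat3-2.
Every request that reaches tomcat1-1 or tomcat1-2 is forwarded to tomcat1. Likewise, requests reaching 2-1 and 2-2 go to tomcat2: a request that reaches a virtual node is forwarded to its corresponding real node, as shown below.
[Figure: virtual nodes on the ring, each mapped to its real tomcat]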
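Once virtual nodes are in place, there is a lot more we can do with them, such as assigning weights (tomcat1 runs on better hardware, so it gets more virtual nodes; tomcat2 and tomcat3 run on weaker hardware, so they get fewer).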
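
Virtual nodes with weights can be sketched as follows. The virtual node names follow the tomcat1-1, tomcat1-2 pattern from the text. Everything else is an assumption of this sketch: the class `VirtualNodeRing`, its methods, the use of CRC32 as the hash, and the idea that the weight equals the number of virtual nodes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

final class VirtualNodeRing {
    // Ring position of a virtual node -> the real server behind it.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    /** Hashes a key to a position in [0, 2^32 - 1]. CRC32 is this sketch's assumption. */
    static long position(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /**
     * Adds `weight` virtual nodes named server-1 .. server-weight.
     * A stronger machine gets a larger weight and so more ring positions.
     */
    void addServer(String server, int weight) {
        for (int i = 1; i <= weight; i++) {
            ring.put(position(server + "-" + i), server);
        }
    }

    /** Finds the virtual node clockwise from the client and returns its real server. */
    String locate(String clientIp) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers on the ring");
        }
        Map.Entry<Long, String> next = ring.ceilingEntry(position(clientIp));
        return (next != null ? next : ring.firstEntry()).getValue();
    }

    /** Number of virtual nodes currently on the ring. */
    int virtualNodeCount() {
        return ring.size();
    }
}
```

For example, calling `addServer("tomcat1", 4)`, `addServer("tomcat2", 2)` and `addServer("tomcat3", 2)` puts eight virtual nodes on the ring, half of them pointing at tomcat1. `locate` always returns a real node name, never a virtual one.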

Here you may have another question: whenever nodes are added or removed, some requests are always redirected to other servers, so some sessions always fail. What can be done about that?
Think of it this way. If you have played DNF, you may have seen some players get disconnected when a server was under too much pressure. They just log back in and everything is fine (the server they were on went down, so they were routed to another one). Even when the number of players is very large, an outage only drops a small share of them, which is not a big problem. Small problems like this are hard to avoid entirely.

4. If this article still leaves things unclear, the following article is recommended

A plain-language reference article on the consistent hashing algorithm:
https://blog.csdn.net/bntX2jSQfEHy7/article/details/79549368

Origin blog.csdn.net/TanJiaLiang_/article/details/104316537