Redis Cluster implementation principles

1. RedisCluster architecture

To support clustering, we must first solve the problem of data sharding, that is, deciding which node stores each key (the problem that consistent hashing addresses). Common solutions include the following:

  1. Client-side sharding
    The client picks a server with something like a hash modulo. This is simple to use when the client fully knows and controls the number of servers.
  2. Middle-layer sharding
    A middle layer is added between the client and the servers to act as manager and scheduler. The client sends its requests to the middle layer, which forwards them and returns the responses. The most important role of the middle layer is the dynamic management of multiple servers.
  3. Server-side sharding
    A decentralized management model. The client sends a request directly to any node. If that node does not hold the required data, it replies MOVED and tells the client where the data is stored. The redirect is in effect carried out by the client and the server working together.
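
As a rough illustration of client-side sharding, the sketch below maps a key to a server with a hash modulo. The function names and server addresses are hypothetical, and zlib.crc32 is used only because it is a stable hash in the Python standard library, not because any particular client uses it.

```python
import zlib


def pick_server(key: str, servers: list[str]) -> str:
    """Map a key to one server by hashing it and taking the remainder."""
    # zlib.crc32 gives the same value on every run, unlike Python's
    # built-in hash() for strings, which is randomized per process.
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]


def remapped_keys(keys: list[str], before: list[str], after: list[str]) -> list[str]:
    """Return the keys whose server changes when the server list changes."""
    return [k for k in keys if pick_server(k, before) != pick_server(k, after)]


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    three = ["10.0.0.1:6379", "10.0.0.2:6379", "10.0.0.3:6379"]
    four = three + ["10.0.0.4:6379"]
    keys = [f"user:{i}" for i in range(1000)]
    moved = remapped_keys(keys, three, four)
    print(f"{len(moved)} of {len(keys)} keys change server after adding a node")
```

Running it shows why this approach only suits a fixed, fully known set of servers: changing the server count changes the modulus, so many keys land on a different server.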

Since version 3.0, Redis has supported Redis Cluster as its distributed solution. The cluster uses a multi-master, multi-slave structure and implements server-side sharding: multiple Master nodes store the data and the state of the whole cluster, and adding or removing Master nodes increases or decreases the data capacity, so Redis supports horizontal scaling well.

The structural characteristics of Redis Cluster are as follows:

  1. The cluster supports multiple Master nodes, and each Master node can have multiple Slave nodes. All nodes are connected to one another (through a PING-PONG mechanism) and communicate internally over a binary protocol.
  2. A node is determined to be offline when more than half of the Master nodes in the cluster detect that it has failed.
  3. A client connects directly to any available Master node in the cluster.
  4. The cluster has a built-in automatic data sharding mechanism that maps every key to one of 16384 slots. Each Master node records which slots are assigned to itself and which are assigned to other nodes. This makes adding and removing nodes simple: adding a Master moves some slots over from other nodes, and removing a Master moves its slots to other nodes. The cost of moving slots is very low.
  5. A client can send a command to any Master node in the cluster. Before executing the command, the node uses CRC16(key) % 16384 to locate the key's slot. If that slot does not belong to the current Master node, the node returns the address of the node responsible for the slot, and the client automatically resends the original request to that node.
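
The slot lookup and redirect described above can be sketched as follows. This is a minimal illustration, not Redis's implementation: it assumes the CRC16 variant is CRC-16/XMODEM (polynomial 0x1021, initial value 0), ignores hash tags, and uses a made-up three-master slot layout with invented addresses. The names `owner_of` and `handle` are hypothetical.

```python
SLOT_COUNT = 16384

# Hypothetical three-master layout; the addresses are invented.
SLOT_RANGES = [
    (0, 5460, "10.0.0.1:6379"),
    (5461, 10922, "10.0.0.2:6379"),
    (10923, 16383, "10.0.0.3:6379"),
]


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC-16/XMODEM: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """CRC16(key) % 16384, without hash-tag handling."""
    return crc16_xmodem(key.encode("utf-8")) % SLOT_COUNT


def owner_of(slot: int) -> str:
    """Look up which (hypothetical) master owns a slot."""
    for first, last, addr in SLOT_RANGES:
        if first <= slot <= last:
            return addr
    raise ValueError(f"slot {slot} is out of range")


def handle(node_addr: str, key: str) -> str:
    """Reply of a toy node: 'OK' if it owns the slot, else a MOVED redirect."""
    slot = key_slot(key)
    owner = owner_of(slot)
    if owner == node_addr:
        return "OK"
    return f"MOVED {slot} {owner}"


if __name__ == "__main__":
    print(key_slot("foo"))
    print(handle("10.0.0.1:6379", "foo"))
```

With this layout the key foo hashes to slot 12182, so a request sent to the first node is answered with a MOVED reply pointing at the third node, and the client would retry there.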

Why Redis Cluster is designed with 16384 slots

According to the author's answer, the main considerations are the following:

  1. The size of the heartbeat packet
    Redis nodes regularly send ping messages as heartbeat packets, and the largest item in the message header is myslots[CLUSTER_SLOTS/8]. If the number of slots were too large, such as 65536, this field would take 65536 ÷ 8 = 8192 bytes (8 KB), which makes the ping header too large and wastes bandwidth. With 16384 slots it takes 2 KB.
  2. The number of cluster nodes is limited
    The more nodes the cluster has, the more data each heartbeat packet carries. Beyond a certain number of nodes this also causes network congestion and lengthens the time the nodes need to reach eventual consistency, so clusters of more than 1000 nodes are not recommended. At that scale, 16384 slots are sufficient.
  3. Transmission efficiency
    In a Redis master node's configuration information, the slots it is responsible for are stored as a bitmap. The bitmap is compressed during network transmission, but when the bitmap fill rate slots/N (where N is the number of nodes) is high, the compression ratio is poor. With a limited number of nodes, keeping the number of slots as small as possible improves the compression of the bitmap and reduces network traffic.
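
The arithmetic behind these points can be checked with a few lines. This is only a worked calculation under the assumption of one bit per slot in the myslots bitmap; the helper names are hypothetical.

```python
def slot_bitmap_bytes(slot_count: int) -> int:
    """Size of a one-bit-per-slot bitmap, in bytes."""
    return slot_count // 8


def slots_per_master(slot_count: int, master_count: int) -> float:
    """Average number of slots each master owns."""
    return slot_count / master_count


if __name__ == "__main__":
    for slots in (16384, 65536):
        size = slot_bitmap_bytes(slots)
        print(f"{slots} slots -> {size} bytes ({size // 1024} KB) per heartbeat header")
    # Even at the suggested upper bound of 1000 nodes, each master
    # still owns a handful of slots when there are 16384 in total.
    print(f"{slots_per_master(16384, 1000):.1f} slots per master at 1000 masters")
```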

2. Data consistency of nodes in the cluster

2.1 RedisCluster communication mode between nodes

Each node in a Redis Cluster uses two ports. One is the port that serves clients, such as 6379. The other is that port number plus 10000, such as 16379. The second port is used exclusively for inter-node communication and is known as the cluster bus.

Redis Cluster does not store cluster metadata (node information, failure information, node additions and removals, slot information, and so on) centrally on a single node. Instead, all nodes communicate continuously through the Gossip protocol to keep the metadata on every node consistent. Specifically, every node sends PING messages to several other nodes at regular intervals, and a node that receives a PING replies with a PONG, so that the nodes exchange metadata with each other.

  • Advantages of Gossip
    Metadata updates are distributed rather than concentrated on one node. Updates reach all nodes one after another, with some delay, which reduces the load on any single node.
  • Disadvantages of Gossip
    Metadata updates are delayed, so some changes in the cluster may lag for a while before every node has synchronized them.

2.2 Gossip protocol

The Gossip protocol, also known as the rumor protocol, is the means by which the nodes of a Redis cluster synchronize their state. It works very simply. Take adding a new node (Meet) as an example: at first only the inviting node and the invited node know about it, but as Ping messages spread layer by layer the other nodes learn of it too, and in this way the Gossip protocol achieves eventual consistency.

The Gossip algorithm is also called anti-entropy, meaning that it seeks consistency out of chaos. This also describes how Gossip behaves: in a bounded network, each node may know all other nodes or only a few neighbors. As long as the nodes can reach one another through the network and each node communicates with other nodes at random, the state of all nodes will eventually agree after a period of communication.
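
A toy simulation can make the spreading behaviour concrete. The sketch below is a generic push-style gossip model, not Redis's cluster bus: in each round every node that already knows a piece of information tells `fanout` randomly chosen peers. The function name, the `fanout` parameter, and the round cap are all assumptions made for illustration.

```python
import random


def gossip_rounds(node_count: int, fanout: int = 1, seed: int = 0,
                  max_rounds: int = 10_000) -> tuple[int, int]:
    """Simulate push gossip starting from node 0.

    Returns (rounds run, number of informed nodes at the end).
    """
    rng = random.Random(seed)
    informed = {0}  # only the first node knows the news initially
    rounds = 0
    while len(informed) < node_count and rounds < max_rounds:
        rounds += 1
        newly_informed = set()
        for node in informed:
            peers = [n for n in range(node_count) if n != node]
            # Each informed node tells a few random peers this round.
            for peer in rng.sample(peers, min(fanout, len(peers))):
                newly_informed.add(peer)
        informed |= newly_informed
    return rounds, len(informed)


if __name__ == "__main__":
    rounds, informed = gossip_rounds(node_count=100, fanout=2, seed=42)
    print(f"{informed} of 100 nodes informed after {rounds} rounds")
```

No node ever needs a full view of the cluster or a coordinator. Random pairwise exchanges are enough for the information to reach every node after a number of rounds, which is the eventual consistency the text describes.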

The cluster messages of RedisCluster mainly have the following types:

  1. Meet
    Through the cluster meet ip port command, a node in the existing cluster sends an invitation to a new node to join the cluster.
  2. Ping
    Each node frequently sends Ping to other nodes. Each time, it samples 5 other nodes at random and pings the one it has gone longest without hearing from. If the communication delay with some node has reached cluster-node-timeout/2, it sends that node a Ping immediately, to avoid data staying inconsistent for a long time. cluster-node-timeout can therefore be used to tune the Ping frequency: a larger value reduces the number of Pings. A Ping message contains the node's own state and the cluster metadata it maintains, along with information about a few other nodes it knows, for data exchange.
  3. Pong
    A node replies with a Pong message after receiving a Ping. The message also contains the node's own information and information about other nodes it knows.
  4. Fail
    After a node determines that another node is in the Fail state, it immediately broadcasts a message to all nodes in the cluster saying that the node is down. The other nodes mark that node as offline when they receive the message.
  5. Publish
    When a node receives a PUBLISH command, it executes the command and broadcasts a Publish message to the cluster. All nodes that receive the message execute the same PUBLISH command.
  6. FAILOVER_AUTH_REQUEST
    When a Master node enters the Fail state, its Slave sends this message to all nodes in the cluster to request votes for failing over its Master. Only Master nodes can vote.
  7. FAILOVER_AUTH_ACK
    When a Master receives a FAILOVER_AUTH_REQUEST message, if the Slave meets the voting conditions and the Master has not yet voted in the current epoch, it returns a FAILOVER_AUTH_ACK message to the Slave as its vote.
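
One way to read the Ping rule above is sketched below. It assumes a node samples five peers at random and pings the one with the oldest Pong, and additionally pings any peer not heard from for more than half of cluster-node-timeout. The function name, the `last_pong` dictionary, and the use of seconds are hypothetical choices for this illustration, not Redis internals.

```python
import random


def choose_ping_targets(last_pong: dict[str, float], now: float,
                        node_timeout: float, rng: random.Random,
                        sample_size: int = 5) -> set[str]:
    """Pick which peers to Ping in this tick.

    last_pong maps a peer id to the time (in seconds) its last Pong arrived.
    """
    targets: set[str] = set()
    peers = list(last_pong)
    if peers:
        # Sample a few random peers and ping the one silent the longest.
        sample = rng.sample(peers, min(sample_size, len(peers)))
        targets.add(min(sample, key=lambda peer: last_pong[peer]))
    # Independently, ping every peer silent for more than half the timeout.
    for peer, pong_time in last_pong.items():
        if now - pong_time > node_timeout / 2:
            targets.add(peer)
    return targets


if __name__ == "__main__":
    pongs = {"node-a": 99.0, "node-b": 98.0, "node-c": 80.0}
    print(choose_ping_targets(pongs, now=100.0, node_timeout=15.0,
                              rng=random.Random(0)))
```

The second rule is what ties Ping frequency to cluster-node-timeout: a larger timeout raises the staleness threshold, so fewer peers qualify for an immediate Ping.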

3. Principles of High Availability

The principle of RedisCluster high availability is very similar to the sentinel mechanism. The main principles are as follows:

  1. Master node downtime judgment
    If a node pings another node and receives no pong within cluster-node-timeout, it marks that node as PFAIL, that is, subjectively down.
    A node that considers another node PFAIL spreads this information to other nodes in its ping messages. If a majority of Master nodes (N/2 + 1) consider the target node PFAIL, the node is marked FAIL, that is, objectively down.
  2. Switching a Slave node to Master
    For a Master node that is down, one of its Slave nodes must be selected and promoted to Master. The process has roughly 3 steps:
    1. Filter the available Slave nodes
      Check how long each Slave node has been disconnected from the Master. If the time exceeds cluster-node-timeout * cluster-slave-validity-factor, the Slave is not eligible to become Master.
    2. Determine candidate priority
      The Slave nodes that pass the filter are sorted by replication offset (a larger offset means more up-to-date data). In each election, a Slave's election time is set according to its priority: the higher the priority, the earlier the Slave starts its election.
    3. Elect the Master node
      The Slave increments the cluster currentEpoch it has recorded by 1 and broadcasts a failover request (FAILOVER_AUTH_REQUEST). Of the nodes that receive it, only Master nodes judge the legitimacy of the requester and reply with FAILOVER_AUTH_ACK to vote for the Slave, and each Master sends only one ack per epoch. The Slave collects the FAILOVER_AUTH_ACK replies. If a majority of Master nodes (N/2 + 1) vote for it, it wins the election, performs the master/slave switch to become the new Master, and broadcasts to notify the other nodes in the cluster.
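
The decision rules in the steps above can be condensed into a few small functions. This is a simplified sketch of the logic as described in this article, not of Redis's source code: the function names and the dictionary fields (`id`, `disconnected_for`, `offset`) are hypothetical, times are in seconds, and details such as epochs and election delays are left out.

```python
def is_pfail(now: float, last_pong: float, node_timeout: float) -> bool:
    """Subjectively down: no pong within cluster-node-timeout."""
    return now - last_pong > node_timeout


def has_majority(count: int, master_count: int) -> bool:
    """True when count reaches N/2 + 1 of the masters (integer division)."""
    return count >= master_count // 2 + 1


def eligible_slaves(slaves: list[dict], node_timeout: float,
                    validity_factor: float) -> list[dict]:
    """Drop slaves disconnected from the master for too long."""
    limit = node_timeout * validity_factor
    return [s for s in slaves if s["disconnected_for"] <= limit]


def rank_by_offset(slaves: list[dict]) -> list[dict]:
    """Highest replication offset first, so it starts its election earliest."""
    return sorted(slaves, key=lambda s: s["offset"], reverse=True)


if __name__ == "__main__":
    slaves = [
        {"id": "s1", "disconnected_for": 20, "offset": 1000},
        {"id": "s2", "disconnected_for": 200, "offset": 5000},
        {"id": "s3", "disconnected_for": 30, "offset": 1200},
    ]
    candidates = rank_by_offset(eligible_slaves(slaves, 15, 10))
    print([s["id"] for s in candidates])
    # With 5 masters, 3 PFAIL reports mark a node FAIL,
    # and 3 FAILOVER_AUTH_ACK votes win the election.
    print(has_majority(3, 5))
```

In this example s2 has the most data but has been disconnected longer than 15 * 10 = 150 seconds, so it is filtered out before ranking, and s3 becomes the first candidate.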

Origin blog.csdn.net/weixin_45505313/article/details/107509714