#### Redis Cluster: architecture, principles, and communication protocol ####

Reposted from: https://juejin.cn/post/6844904002098823181

For personal backup only; please read the original post.

 

Table of Contents

Redis Cluster principle and protocol

Node status information structure

The concept of Gossip protocol

Use of Gossip protocol

Fault detection based on Gossip protocol


 

The official Redis Cluster shards data by introducing the concept of slots. The slots are distributed across multiple master nodes, and each master node can be configured with N slave nodes, which gives the multi-instance, sharded architecture of the official cluster.
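As a minimal sketch of slot-based sharding: Redis Cluster maps each key to one of 16384 slots using CRC16 (the XMODEM variant) of the key, and hashes only the part inside `{...}` when a key contains a non-empty hash tag. Python's `binascii.crc_hqx` with an initial value of 0 computes the same CRC16 variant. The helper `assign_slots` is hypothetical and only illustrates splitting the slot space evenly among masters; it is not how any particular tool assigns slots.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in Redis Cluster


def key_slot(key: bytes) -> int:
    """Map a key to a hash slot: CRC16 (XMODEM) of the key, modulo 16384."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end > start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % SLOT_COUNT


def assign_slots(masters):
    """Hypothetical helper: split the slot space into contiguous ranges."""
    per_node, extra = divmod(SLOT_COUNT, len(masters))
    ranges, start = {}, 0
    for i, name in enumerate(masters):
        size = per_node + (1 if i < extra else 0)
        ranges[name] = range(start, start + size)
        start += size
    return ranges
```

Keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, land in the same slot and therefore on the same master.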

Redis Cluster is a distributed cluster that shares data across multiple Redis nodes. On the server side, the nodes communicate with each other through a dedicated protocol, which serves as the communication protocol for the cluster's management layer. This protocol is called the Gossip protocol.

The purpose of a consistency protocol in a distributed system is to solve the problem of propagating state among the nodes of the cluster, which is the basis for managing the cluster.

The figure shows the official cluster architecture diagram based on the Gossip protocol:

Redis Cluster principle and protocol

Node status information structure

Each node in the cluster maintains its own view of the current state of the entire cluster, including:

  1. Current cluster status
  2. The slots that each node in the cluster is responsible for, and their migration status
  3. The master-slave status of each node in the cluster
  4. The liveness of each node in the cluster and the votes reporting it as unreachable

In other words, this information is what the nodes in the cluster gossip about. It is comprehensive, covering not only a node's own state but also what it knows about the others. As the nodes pass it along to one another, every node's picture eventually becomes complete and accurate. Unlike the Byzantine generals problem, the nodes are assumed not to lie, so the information is highly credible.
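The per-node view described above could be modeled roughly as follows. This is only an illustrative sketch: the class and field names (`ClusterView`, `NodeInfo`, `Health`, `owner_of`) are hypothetical and do not mirror the actual Redis structures.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Set


class Health(Enum):
    ONLINE = "online"
    PFAIL = "pfail"   # suspected offline
    FAIL = "fail"     # confirmed offline


@dataclass
class NodeInfo:
    node_id: str
    role: str = "master"                     # "master" or "slave"
    master_id: Optional[str] = None          # set when role == "slave"
    slots: Set[int] = field(default_factory=set)
    migrating: Dict[int, str] = field(default_factory=dict)  # slot -> target node id
    health: Health = Health.ONLINE
    fail_reports: Set[str] = field(default_factory=set)      # ids of nodes reporting PFAIL


@dataclass
class ClusterView:
    """One node's own picture of the whole cluster."""
    myself: str
    cluster_ok: bool = True
    nodes: Dict[str, NodeInfo] = field(default_factory=dict)

    def owner_of(self, slot: int) -> Optional[str]:
        # Linear scan is enough for a sketch.
        for info in self.nodes.values():
            if slot in info.slots:
                return info.node_id
        return None
```

Every node holds its own `ClusterView`, so two nodes can temporarily disagree until gossip brings their views together.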

With the Gossip protocol, when the cluster state changes (a new node joins, a slot migrates, a node goes down, or a slave is promoted to master), we want the change to be discovered as soon as possible, propagated to every node in the cluster, and agreed upon. The heartbeats exchanged between nodes (PING, PONG, MEET) and the data they carry are the most important means of propagating cluster state.

The concept of Gossip protocol

The gossip protocol, also known as the epidemic protocol, is a protocol for exchanging information between nodes or processes, modeled on the way an epidemic spreads.
It is widely used in distributed systems. For example, the gossip protocol can be used to ensure that all nodes in a network hold the same data.
The gossip protocol was originally created in 1987 by Alan Demers, a researcher at Xerox's Palo Alto Research Center. www.iteblog.com/archives/25...

The Gossip protocol is a relatively mature protocol in P2P networks. Its biggest advantage is that even as the number of cluster nodes grows, the load on each node barely increases and stays almost constant. This is what allows, for example, a cluster managed by Consul to scale horizontally to thousands of nodes.

The Gossip algorithm is also called Anti-Entropy. Entropy is a concept from physics that represents disorder, and anti-entropy seeks consistency within disorder. This captures the character of Gossip well: in a bounded network, each node communicates with other nodes at random, and after a period of seemingly chaotic communication the state of all nodes eventually converges. A node may know every other node, or only a few neighbors. As long as the nodes are connected through the network, they will eventually reach the same state. This is, of course, also how an epidemic spreads. www.backendcloud.cn/2017/11/12/...
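The convergence behaviour can be tried out with a small simulation. This is a generic push-gossip sketch, not Redis's actual algorithm: in each round every informed node tells `fanout` randomly chosen peers, and the function counts the rounds until everyone has heard the rumor. The function name and parameters are assumptions made for illustration.

```python
import random


def gossip_rounds(node_count: int, fanout: int = 1, seed: int = 0) -> int:
    """Count rounds until a rumor started at node 0 reaches every node."""
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < node_count:
        newly_informed = set()
        for node in informed:
            for _ in range(fanout):
                # Pick a uniformly random peer other than the node itself.
                peer = rng.randrange(node_count - 1)
                if peer >= node:
                    peer += 1
                newly_informed.add(peer)
        informed |= newly_informed
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, "nodes ->", gossip_rounds(n), "rounds")
```

Each node sends only `fanout` messages per round regardless of cluster size, yet the number of rounds grows slowly with the number of nodes, since the informed set can roughly double every round.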

The descriptions above are rather academic. In fact, the Gossip protocol is familiar to any casual onlooker: it works just like gossip among people, which spreads widely and quickly, as everyone has experienced. Many algorithms in computing are drawn from everyday life and then refined beyond it.

Use of Gossip protocol

Redis clusters are decentralized. Nodes synchronize state with each other through the gossip protocol. Cluster messages come in the following types:

  • Meet: triggered by the "cluster meet ip port" command; a node of the existing cluster sends an invitation to the new node to join the cluster.
  • Ping: every second a node sends ping messages to other nodes in the cluster. Each message carries the address, slots, status information, and last communication time of a sample of the other nodes the sender knows.
  • Pong: after receiving a ping message, a node replies with a pong message, which likewise carries information about a sample of the nodes it knows.
  • Fail: after a node determines that another node has failed, it broadcasts that node's failure to all nodes in the cluster. On receiving the message, the other nodes mark the node as offline.
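The four message types and the way membership information rides along with them can be sketched as below. This is a simplified model under stated assumptions: the names `MsgType`, `Message`, `Node`, `build`, `handle`, and `sample_size` are hypothetical, the gossip payload is reduced to node addresses, and a receiver trusts every entry it is told about.

```python
import random
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class MsgType(Enum):
    MEET = "meet"
    PING = "ping"
    PONG = "pong"
    FAIL = "fail"


@dataclass
class Message:
    type: MsgType
    sender: str                                            # sender node id
    sender_addr: str
    gossip: Dict[str, str] = field(default_factory=dict)   # node id -> address
    failed: Optional[str] = None                           # only used by FAIL


class Node:
    def __init__(self, node_id: str, addr: str, sample_size: int = 2):
        self.node_id = node_id
        self.addr = addr
        self.known = {node_id: addr}   # this node's membership view
        self.failed = set()
        self.sample_size = sample_size

    def build(self, msg_type: MsgType, rng: random.Random) -> Message:
        """Build a MEET/PING/PONG carrying a random sample of known nodes."""
        ids = sorted(self.known)
        chosen = rng.sample(ids, min(self.sample_size, len(ids)))
        return Message(msg_type, self.node_id, self.addr,
                       {i: self.known[i] for i in chosen})

    def handle(self, msg: Message, rng: random.Random) -> Optional[Message]:
        """Merge what the message says; reply with PONG to MEET and PING."""
        if msg.type is MsgType.FAIL:
            self.failed.add(msg.failed)
            return None
        self.known[msg.sender] = msg.sender_addr
        self.known.update(msg.gossip)
        if msg.type in (MsgType.MEET, MsgType.PING):
            return self.build(MsgType.PONG, rng)
        return None
```

For example, when node `a` sends `b.handle(a.build(MsgType.MEET, rng), rng)`, node `b` learns about `a` and whatever `a` sampled, and the returned PONG lets `a` learn about `b` in turn.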

Because of its decentralized design and communication mechanism, Redis Cluster opts for eventual consistency and basic availability.

For example, when a new node joins (meet), only the inviting node and the invited node know about it at first; the remaining nodes learn of it as ping messages spread hop by hop. Except for Fail, which is broadcast to the whole cluster immediately, every other change (a new node, a node coming back online, a slave being elected master, a slot change, and so on) has to wait for this propagation. In other words, the Gossip protocol is an eventually consistent protocol.

The gossip protocol depends on accurate server clocks, because inaccurate timestamps affect how a node judges the validity of a message. In addition, network overhead grows with the number of nodes and puts pressure on the servers, and a larger cluster takes longer to reach eventual consistency. For these reasons the officially recommended maximum is about 1000 nodes. The figure shows the communication interaction when a new node server is added:

In summary, the official Redis cluster is a decentralized, P2P-like network. P2P was very popular in earlier years; eMule and BT are both P2P networks. In a Redis cluster, the Gossip protocol acts as the decentralized communication protocol and, following the established communication rules, lets the whole cluster govern itself without a central management node.

Fault detection based on Gossip protocol

Each node in the cluster will periodically send PING messages to other nodes in the cluster to exchange status information of each node and detect the status of each node: online status, suspected offline status PFAIL, and offline status FAIL.

Recording reports locally: when master node A learns from a message that master node B considers master node D to be suspected offline (PFAIL), node A looks up the clusterNode structure for node D in its clusterState.nodes dictionary and adds node B's offline report to that structure's fail_reports linked list. Node A later passes on the suspected offline status of node D to other nodes through the Gossip protocol.

Collective ruling: if more than half of the master nodes in the cluster report master node D as suspected offline, master node D is marked as FAIL. A FAIL message for master node D is then broadcast to the cluster, and every node that receives it immediately marks master node D as offline in its nodes dictionary.

Final ruling: to mark a node as FAIL, both of the following conditions must be met:

  • More than half of the master nodes mark the node as PFAIL.
  • The current node also marks the node as PFAIL.

In other words, when the current node suspects that another node is down, it writes this in its own notebook and waits to tell its friends so that they can check for themselves. Only when more than half of the friends agree that the node is down, and the current node thinks so too, is the node really considered down. The process is fairly rigorous.
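The two conditions can be expressed as a small quorum check. The sketch below is an illustration, not the Redis source: `FailureDetector` and its method names are hypothetical, a majority is taken as `len(masters) // 2 + 1`, the current node's own suspicion counts as one vote when it is a master, and the expiry of old reports is left out.

```python
class FailureDetector:
    """One node's bookkeeping for PFAIL reports and the FAIL decision."""

    def __init__(self, myself, masters):
        self.myself = myself
        self.masters = set(masters)   # ids of all master nodes
        self.pfail = set()            # nodes this node itself suspects
        self.fail = set()             # nodes confirmed offline
        self.fail_reports = {}        # target id -> set of reporter ids

    def mark_pfail(self, target):
        """Record this node's own suspicion (for example after a ping timeout)."""
        self.pfail.add(target)

    def add_report(self, reporter, target):
        """Record that another master reported `target` as PFAIL via gossip."""
        if reporter in self.masters and reporter not in (target, self.myself):
            self.fail_reports.setdefault(target, set()).add(reporter)

    def try_mark_fail(self, target):
        """Promote PFAIL to FAIL only if both conditions hold."""
        if target not in self.pfail:          # condition 2: own suspicion
            return False
        votes = len(self.fail_reports.get(target, ()))
        if self.myself in self.masters:
            votes += 1                        # this node's own vote
        if votes >= len(self.masters) // 2 + 1:   # condition 1: majority
            self.pfail.discard(target)
            self.fail.add(target)
            return True
        return False
```

With five masters A to E and the detector running on A, a report from B alone is not enough for D; D is marked FAIL only once A itself suspects D and a second master such as C has reported it, giving three votes out of five.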

Origin blog.csdn.net/chushoufengli/article/details/115243527