[Redis study notes (13)] Redis cluster detailed explanation

This article is published by the official account [Developing Pigeon]! Welcome to follow! ! !



One. Cluster

(1) Overview

        Redis cluster is a distributed database solution provided by Redis. The cluster uses sharding to share data and provides replication and failover functions.

(2) Nodes

1. Connect

        A Redis cluster usually consists of multiple nodes, and these nodes need to be connected to form a cluster containing multiple nodes. The work of connecting each node is completed by the CLUSTER MEET command, the format is as follows:

cluster meet <ip> <port>

        Sending the cluster meet command to a node makes that node shake hands with the node specified by ip and port. When the handshake succeeds, the receiving node adds the node specified by ip and port to the cluster it currently belongs to.

2. Start the node

        A node is a Redis server running in cluster mode. At startup, the server checks the cluster-enabled configuration option to decide whether to enable cluster mode. A node keeps using all the server components used in stand-alone mode, and additionally uses new structures to store the data that is only needed in cluster mode.

3. Cluster data structure

        The clusterNode structure saves the current state of a node, such as the node's creation time, name, configuration epoch, etc. Each node creates a corresponding clusterNode structure for itself and other nodes in the cluster. At the same time, each node also saves a clusterState structure, which records the current state of the cluster from the perspective of the current node, such as whether the cluster is offline, the number of nodes, and so on.
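
        The two structures can be pictured with a small Python sketch. This is only an illustration: the real structures are C structs in the Redis source, and the simplified fields and the node_count helper below are assumptions modeled on the description above.

```python
import time
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ClusterNode:
    """State of one node, as recorded by whichever node holds this object."""
    name: str                  # node ID (short hypothetical names are used here)
    ip: str
    port: int
    ctime: float = field(default_factory=time.time)  # creation time
    config_epoch: int = 0      # configuration epoch


@dataclass
class ClusterState:
    """The cluster as seen from the point of view of one node (myself)."""
    myself: ClusterNode
    state: str = "fail"        # "ok" when the cluster is online, "fail" when offline
    nodes: Dict[str, ClusterNode] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # A node always keeps a ClusterNode entry for itself.
        self.nodes[self.myself.name] = self.myself

    @property
    def node_count(self) -> int:
        """Number of nodes this node currently knows about."""
        return len(self.nodes)
```

        Every node holds its own ClusterState, so two nodes may briefly disagree about the cluster until messages bring their views together.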


4. Connection process

        The connection is essentially a three-way handshake. When a client sends node A a cluster meet command naming node B, A adds B to the cluster that A belongs to, as follows.

        On receiving the command, node A creates a clusterNode structure for B and adds it to its clusterState.nodes dictionary. A then sends a MEET message to B, using the ip and port given in the command;

        B receives the MEET message, creates a clusterNode structure for A, adds it to its own clusterState.nodes dictionary, and then returns a PONG message to A;

        Node A receives the PONG message and thereby knows that B has received the MEET message it sent; A then sends a PING message to B;

        B receives the PING message and thereby knows that A has received the PONG message it returned, and the handshake is complete.

        After that, A propagates B's information to the other nodes in the cluster through the Gossip protocol, so that they shake hands with B as well. After a period of time, B is known to all nodes in the cluster.
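
        The MEET / PONG / PING exchange described in this section can be traced with a small in-memory simulation. It is a sketch under assumptions: messages are modeled as direct method calls instead of network packets, and the Node class and the "handshake" / "connected" labels are hypothetical names chosen for this example.

```python
from typing import Dict, List


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        # Stands in for clusterState.nodes: peer name -> link status.
        self.nodes: Dict[str, str] = {name: "self"}
        self.log: List[str] = []

    def cluster_meet(self, other: "Node") -> None:
        """Handle CLUSTER MEET: record the peer, then send it a MEET message."""
        self.nodes[other.name] = "handshake"
        self.log.append(f"send MEET to {other.name}")
        other.on_meet(self)

    def on_meet(self, sender: "Node") -> None:
        """Receiver of MEET records the sender and answers with PONG."""
        self.nodes[sender.name] = "handshake"
        self.log.append(f"send PONG to {sender.name}")
        sender.on_pong(self)

    def on_pong(self, sender: "Node") -> None:
        """PONG proves the peer received our MEET; confirm with a PING."""
        self.nodes[sender.name] = "connected"
        self.log.append(f"send PING to {sender.name}")
        sender.on_ping(self)

    def on_ping(self, sender: "Node") -> None:
        """PING proves the peer received our PONG; the handshake is done."""
        self.nodes[sender.name] = "connected"
        self.log.append("handshake complete")
```

        Calling a.cluster_meet(b) on two fresh nodes leaves each node with an entry for the other in its nodes dictionary, which mirrors how both sides end up with a clusterNode structure for the peer.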


(3) Slot assignment

1. Overview

        A Redis cluster stores the key-value pairs of its database by sharding. The entire database of the cluster is divided into 16384 slots, and every key belongs to exactly one of them. Each node in the cluster can handle anywhere from 0 to 16384 slots. When all 16384 slots are handled by some node, the cluster is online (ok); if any slot is not handled by any node, the cluster is offline (fail).


2. Slot assignment

        By sending the cluster addslots command to a node, one or more slots can be assigned to that node:

cluster addslots <slot> [slot ...]

3. Record the slot assignment information of the node

        The slots attribute and the numslots attribute of each node's clusterNode structure record which slots the node is responsible for and how many slots that is. The slots attribute is a binary bit array with a length of 2048 bytes, that is, 16384 bits. If the bit at index i is 1, the node is responsible for slot i; if it is 0, the node is not. Checking whether a node handles a given slot, or marking a slot as handled by the node, therefore takes only O(1) time.
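
        A short Python sketch of such a bit array follows. The SlotBitmap class and its method names are hypothetical; only the layout (2048 bytes, one bit per slot, plus a counter) comes from the description above.

```python
CLUSTER_SLOTS = 16384


class SlotBitmap:
    """One bit per slot, packed into 2048 bytes, plus a slot counter."""

    def __init__(self) -> None:
        self.slots = bytearray(CLUSTER_SLOTS // 8)  # 16384 bits = 2048 bytes
        self.numslots = 0

    def has(self, slot: int) -> bool:
        # Byte index is slot // 8, bit index inside the byte is slot % 8.
        return bool(self.slots[slot >> 3] & (1 << (slot & 7)))

    def add(self, slot: int) -> None:
        if not self.has(slot):
            self.slots[slot >> 3] |= 1 << (slot & 7)
            self.numslots += 1
```

        Both has and add touch a single byte, which is why the lookup cost does not depend on how many slots the node owns.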


4. Propagate the slot assignment information of the node

        Besides recording the slots it is responsible for in its own clusterNode structure, a node also sends its slots array to the other nodes in the cluster through messages. When another node receives such a message, it looks up the sender's clusterNode structure in its own clusterState.nodes dictionary and saves or updates the slots array stored in that structure.


5. Record the assignment information of all slots in the cluster

        The slots array in the clusterState structure records the assignment of all 16384 slots in the cluster. Each array item is a pointer to a clusterNode structure: if item i is NULL, slot i has not been assigned to any node; otherwise slot i is assigned to the node the pointer refers to. To find out whether slot i is assigned, and to which node, the program only needs to read index i of the array, which takes O(1) time.
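
        The array of pointers can be mimicked in Python with a list of object references, where None plays the role of NULL. The classes and the assign / owner / is_online helpers below are hypothetical names used only for this sketch.

```python
from typing import List, Optional

CLUSTER_SLOTS = 16384


class ClusterNode:
    def __init__(self, name: str) -> None:
        self.name = name


class ClusterState:
    def __init__(self) -> None:
        # slots[i] is None while slot i is unassigned.
        self.slots: List[Optional[ClusterNode]] = [None] * CLUSTER_SLOTS

    def assign(self, node: ClusterNode, first: int, last: int) -> None:
        """Assign the inclusive slot range [first, last] to node."""
        for slot in range(first, last + 1):
            self.slots[slot] = node

    def owner(self, slot: int) -> Optional[ClusterNode]:
        """O(1) lookup of the node responsible for a slot."""
        return self.slots[slot]

    def is_online(self) -> bool:
        """The cluster is online only when every slot has an owner."""
        return all(node is not None for node in self.slots)
```

        The same array answers both questions raised in this section: which node owns a slot, and whether the cluster as a whole is online.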


(4) Executing commands in the cluster

1. Send a command

        Once the cluster is online, clients can send data commands to the nodes in the cluster. When a client sends a node a command that involves a database key, the receiving node calculates which slot the key belongs to and checks whether that slot is assigned to itself. If it is, the node executes the command directly. Otherwise, the node returns a MOVED error that points the client to the correct node, and the client sends the same command again to that node.
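
        The decision a node makes can be sketched in a few lines of Python. This is an illustration only: the function name, the "EXECUTE" marker and the dictionary that maps slots to node addresses are assumptions, and the reply text follows the MOVED <slot> <ip>:<port> format used in this article.

```python
from typing import Dict


def handle_key_command(my_addr: str, slot: int, slot_owner: Dict[int, str]) -> str:
    """Decide what a node does with a command whose key maps to slot."""
    owner = slot_owner[slot]
    if owner == my_addr:
        # The slot is assigned to this node: run the command locally.
        return "EXECUTE"
    # The slot belongs to another node: point the client at it.
    return f"MOVED {slot} {owner}"
```

        For example, with slot 12182 owned by 127.0.0.1:7002, a command for that slot sent to 127.0.0.1:7000 is answered with a redirect rather than executed.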


2. Key slot allocation algorithm

        The node uses the following algorithm to calculate which slot a given key belongs to:

CRC16(key) & 16383

        CRC16(key) computes the CRC-16 checksum of the key, and & 16383 reduces that checksum to an integer between 0 and 16383, which is the key's slot number. Use the cluster keyslot <key> command to see which slot a key belongs to.
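
        The calculation can be reproduced in plain Python. The sketch below assumes the CRC-16 variant that Redis Cluster uses (XMODEM: polynomial 0x1021, initial value 0) and computes it bit by bit instead of with the lookup table a real implementation would use. The function names crc16 and key_slot are made up for this example, and hash tags are not handled.

```python
def crc16(data: bytes) -> int:
    """Bitwise CRC-16/XMODEM: polynomial 0x1021, initial value 0x0000."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Slot number of a key: CRC16(key) & 16383."""
    return crc16(key.encode("utf-8")) & 16383
```

        Because 16383 is 2^14 - 1, the & operation keeps the low 14 bits of the checksum, which is the same as taking it modulo 16384.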


3. MOVED error format

MOVED <slot> <ip>:<port>

        slot is the slot that the key belongs to, and ip and port are the IP address and port number of the node responsible for that slot. Redirecting simply means switching to a different socket to send the command: a cluster client usually keeps socket connections to several nodes in the cluster.

        Note that redis-cli started in cluster mode (with the -c option) does not print a MOVED error when it receives one; it redirects automatically. A redis-cli started in stand-alone mode does not know what a MOVED error means and simply prints it.
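
        On the client side, handling the error starts with splitting it into its three parts. The helper below is a hypothetical sketch, not code from any Redis client library; it only assumes the MOVED <slot> <ip>:<port> format shown above.

```python
from typing import Tuple


def parse_moved(error: str) -> Tuple[int, str, int]:
    """Split 'MOVED <slot> <ip>:<port>' into (slot, ip, port)."""
    kind, slot, addr = error.split()
    if kind != "MOVED":
        raise ValueError(f"not a MOVED error: {error!r}")
    # rpartition splits on the last ':' so the port is always the final part.
    ip, _, port = addr.rpartition(":")
    return int(slot), ip, int(port)
```

        A client would then look up, or open, the socket for (ip, port) and resend the command on it.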


4. Implementation of Node Database

        The node can only use database 0, and the stand-alone Redis server does not have this restriction.

(5) Resharding

1. Overview

        Resharding a Redis cluster reassigns any number of slots that are already assigned to one node (the source node) to another node (the target node); the key-value pairs that belong to those slots are moved from the source node to the target node as well. Resharding can be performed online: the cluster does not need to go offline, and both nodes keep processing command requests.


2. Principle

        Resharding of a Redis cluster is carried out by redis-trib, the cluster management software that ships with Redis. Redis provides all the commands needed for resharding, and redis-trib performs the operation by sending these commands to the source node and the target node. For a single slot, the steps are as follows:

        (1) Send cluster setslot <slot> importing <source_id> to the target node, so that it prepares to import the key-value pairs belonging to the slot from the source node;

        (2) Send cluster setslot <slot> migrating <target_id> to the source node, so that it prepares to migrate the key-value pairs belonging to the slot to the target node;

        (3) Send cluster getkeysinslot <slot> <count> to the source node to obtain the names of up to count keys belonging to the slot;

        (4) For each key name obtained in step 3, send a migrate command to the source node to move that key to the target node;

        (5) Repeat (3) and (4) until all the key-value pairs belonging to the slot have been migrated from the source node to the target node;

        (6) Send cluster setslot <slot> node <target_id> to any node in the cluster to assign the slot to the target node; this assignment is then propagated to the entire cluster through messages.
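
        The six steps above can be walked through with an in-memory simulation. This is a sketch under assumptions, not how redis-trib is written: the Node class, its methods and the reshard_slot function are hypothetical stand-ins for the commands sent over the network, and the cluster-wide assignment is reduced to one dictionary.

```python
from typing import Dict, List


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[int, Dict[str, str]] = {}  # slot -> {key: value}
        self.importing: Dict[int, str] = {}        # slot -> source node name
        self.migrating: Dict[int, str] = {}        # slot -> target node name

    def get_keys_in_slot(self, slot: int, count: int) -> List[str]:
        """Return up to count key names stored in the slot (step 3)."""
        return list(self.data.get(slot, {}))[:count]

    def migrate(self, target: "Node", slot: int, key: str) -> None:
        """Move one key from this node to the target node (step 4)."""
        value = self.data[slot].pop(key)
        target.data.setdefault(slot, {})[key] = value


def reshard_slot(source: Node, target: Node, slot: int,
                 owners: Dict[int, str], count: int = 2) -> None:
    target.importing[slot] = source.name            # step 1
    source.migrating[slot] = target.name            # step 2
    while True:                                     # step 5: repeat 3 and 4
        keys = source.get_keys_in_slot(slot, count)  # step 3
        if not keys:
            break
        for key in keys:                            # step 4
            source.migrate(target, slot, key)
    owners[slot] = target.name                      # step 6
    del target.importing[slot]
    del source.migrating[slot]
```

        Moving the keys in batches of count is what lets both nodes keep serving requests while a slot is in transit.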


(6) ASK error

1. Overview

        During resharding, while the source node is migrating a slot to the target node, some of the key-value pairs belonging to the slot may still be stored on the source node while the rest are already on the target node. If the client sends the source node a command about a key in that slot, the source node first looks up the key in its own database. If the key is found, the command is executed directly. If it is not found, the key may already have been migrated, so the source node returns an ASK error that directs the client to the target node that is importing the slot. A client that receives the ASK error turns to the target node, first sends an ASKING command, and then sends the original command again.


2. ASKING command

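        The only thing the ASKING command does is turn on the REDIS_ASKING flag of the client that sent it. The client then sends its command about slot i to the target node with this flag set. Normally a node that is not responsible for slot i answers with a MOVED error; but if the node is importing slot i and the client carries the REDIS_ASKING flag, the node executes the command once as an exception. The flag is one-shot: it is cleared after that command, so a later command about slot i sent without a new ASKING is refused with a MOVED error.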
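
        The one-shot behaviour of the flag can be sketched as follows. The Client class, the handle_on_node function and the "EXECUTE" / "MOVED" return values are assumptions made for illustration; a real node would of course return a full MOVED error with an address.

```python
from typing import Set


class Client:
    def __init__(self) -> None:
        self.asking = False  # stands in for the REDIS_ASKING flag

    def send_asking(self) -> None:
        """ASKING does nothing except turn the flag on."""
        self.asking = True


def handle_on_node(slot: int, owned: Set[int], importing: Set[int],
                   client: Client) -> str:
    """Decide how a node treats a command about slot from client."""
    # Read the flag and clear it: it is valid for one command only.
    asking, client.asking = client.asking, False
    if slot in owned:
        return "EXECUTE"
    if slot in importing and asking:
        return "EXECUTE"  # exception made for a slot being imported
    return "MOVED"
```

        Sending the same command twice after a single ASKING therefore succeeds the first time and is refused the second time.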


3. The difference between ASK error and MOVED error

        A MOVED error means that responsibility for the slot has been transferred from one node to another: after receiving a MOVED error about slot i, the client sends all its later command requests for slot i to the node named in the error. An ASK error is only a temporary measure used by the two nodes while the slot is being migrated: after receiving an ASK error about slot i, the client sends only its next command request for slot i to the node named in the error, and its future command requests for slot i are not affected.
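
        From the client's point of view, the difference comes down to whether the local slot table is updated. The sketch below assumes that the ASK error uses the same layout as MOVED (ASK <slot> <ip>:<port>); the SlotCache class and its method are hypothetical.

```python
from typing import Dict, Tuple


class SlotCache:
    """Client-side table that remembers which node serves each slot."""

    def __init__(self, slot_to_addr: Dict[int, str]) -> None:
        self.slot_to_addr = dict(slot_to_addr)

    def on_redirect(self, error: str) -> Tuple[str, bool]:
        """Return (address to retry on, whether to send ASKING first)."""
        kind, slot, addr = error.split()
        if kind == "MOVED":
            # Permanent change: remember the new owner for later commands.
            self.slot_to_addr[int(slot)] = addr
            return addr, False
        if kind == "ASK":
            # Temporary detour: retry once, leave the table untouched.
            return addr, True
        raise ValueError(f"unexpected redirect: {error!r}")
```

        After an ASK, the next command for the same slot still goes to the node recorded in the table, which is the source node until the migration finishes.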


(7) Replication and failover

1. Overview

        The nodes in a Redis cluster are divided into master nodes and slave nodes. A master node is responsible for processing slots, while a slave node replicates one particular master node and, when that master goes offline, takes its place and continues processing command requests.

2. Set up the slave node

        Send a command to a node:

cluster replicate <id>

        The node that receives the command becomes a slave node of the node specified by id and starts replicating that master node, which is equivalent to sending the slaveof command to the receiving node. The fact that a node has become a slave and started replicating a certain master is sent to the other nodes in the cluster through messages.

3. Fault detection

        Each node in the cluster periodically sends PING messages to other nodes to check whether they are online. If the node receiving a PING does not return a PONG within the specified time, the sender marks it as suspected offline (probable fail, PFAIL). Nodes exchange the status information of every node in the cluster by sending messages to each other. When master node A learns from a message that master node B considers master node C suspected offline, A adds B's offline report to the clusterNode structure it keeps for C. If more than half of the master nodes responsible for processing slots report a master node as suspected offline, that master node is marked as offline (FAIL), and a FAIL message about it is broadcast to the cluster.
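
        The "more than half" rule can be sketched as simple bookkeeping on one node. The FailureReports class and its method names are hypothetical; the sketch ignores timing (in particular, it never expires old reports) and only shows how reports from distinct masters are counted.

```python
from typing import Dict, Set


class FailureReports:
    """Offline reports collected by one node about the other nodes."""

    def __init__(self, masters_with_slots: Set[str]) -> None:
        self.masters = set(masters_with_slots)
        self.fail_reports: Dict[str, Set[str]] = {}  # suspect -> reporters

    def add_report(self, reporter: str, suspect: str) -> None:
        # Only masters that handle slots have a say; a set keeps each
        # reporter from being counted twice.
        if reporter in self.masters and reporter != suspect:
            self.fail_reports.setdefault(suspect, set()).add(reporter)

    def is_fail(self, suspect: str) -> bool:
        """True once more than half of the slot-handling masters agree."""
        reports = self.fail_reports.get(suspect, set())
        return len(reports) > len(self.masters) // 2
```

        With five masters handling slots, two reports leave the suspect in the suspected-offline state, and the third report is the one that marks it as offline.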

4. Failover

        When a slave node finds that the master node it is replicating has entered the offline state, it starts a failover for the offline master. The steps are as follows:

        (1) Among all the slave nodes replicating the offline master, one slave node is selected;

        (2) The selected slave node executes the slaveof no one command and becomes the new master node;

        (3) The new master node revokes all slot assignments of the offline master node and assigns those slots to itself;

        (4) The new master node broadcasts a PONG message to the cluster, so that the other nodes learn that it has become a master and taken over the slots;

        (5) The new master node starts accepting command requests for the slots it is now responsible for, and the failover is complete.


5. Election of a new master node

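        The new master node is elected with a leader election method based on the Raft algorithm, much like the election of the leader Sentinel. Each master node responsible for processing slots has one vote per configuration epoch, and the slave node that collects votes from more than half of those masters becomes the new master.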
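
        The vote counting for a single configuration epoch can be sketched as follows. This is a simplified illustration with assumed rules: each slot-handling master votes at most once, for the first slave that asks, and a slave wins with at least N/2 + 1 votes out of N voting masters. The Election class is hypothetical and leaves out epochs, timeouts and retries.

```python
from collections import Counter
from typing import Dict, List, Optional


class Election:
    """Vote bookkeeping for a single configuration epoch."""

    def __init__(self, voting_masters: List[str]) -> None:
        self.voters = set(voting_masters)
        self.voted_for: Dict[str, str] = {}  # master -> slave it voted for

    def request_vote(self, master: str, slave: str) -> bool:
        """A master grants its vote to the first slave that asks."""
        if master not in self.voters or master in self.voted_for:
            return False
        self.voted_for[master] = slave
        return True

    def winner(self) -> Optional[str]:
        """Return the slave holding a majority of votes, if any."""
        needed = len(self.voters) // 2 + 1
        for slave, votes in Counter(self.voted_for.values()).items():
            if votes >= needed:
                return slave
        return None
```

        Since every master can vote only once and a majority is required, at most one slave can win in a given epoch; if nobody wins, a new epoch starts and the election is repeated.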


(8) Messages

1. Overview

        The nodes in the cluster communicate by sending and receiving messages. There are five main types of messages: MEET, PING, PONG, FAIL, and PUBLISH.


2. MEET message

        When the sender receives the cluster meet command sent by the client, the sender sends a meet message to the recipient, requesting the recipient to join the cluster.


3. PING message

        By default, every second each node in the cluster randomly picks 5 nodes from its list of known nodes and sends a PING message to the one among those 5 that it has not pinged for the longest time, to check whether that node is online.
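
        The selection rule can be sketched in Python. The function name and the dictionary that maps node names to the time of the last PING sent are assumptions for this example; a random.Random instance is passed in so that the choice can be made reproducible.

```python
import random
from typing import Dict


def pick_ping_target(last_ping_sent: Dict[str, float],
                     rng: random.Random, sample_size: int = 5) -> str:
    """Sample up to sample_size known nodes and pick the least recently pinged."""
    names = sorted(last_ping_sent)
    candidates = rng.sample(names, min(sample_size, len(names)))
    # The smallest timestamp belongs to the node pinged longest ago.
    return min(candidates, key=lambda name: last_ping_sent[name])
```

        Sampling first and then taking the oldest entry keeps the per-second cost constant, at the price of not always reaching the globally oldest node.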


4. PONG message

        To confirm to the sender that its MEET or PING message has arrived, the recipient replies with a PONG message. In addition, a node can broadcast its own PONG message to the cluster to make the other nodes immediately refresh what they know about it, for example after a successful failover.


5. FAIL message

        When master node A judges that another master node B has entered the FAIL state, A broadcasts a FAIL message about B to the cluster, and every node that receives the message immediately marks B as offline.


6. PUBLISH message

        When a node receives a PUBLISH command, it will execute this command and broadcast a PUBLISH message to the cluster. All nodes that receive the message will execute the same PUBLISH command.


7. Gossip protocol

        Each node in the Redis cluster exchanges information about the status of different nodes through the Gossip protocol. The Gossip protocol is implemented by three messages: MEET, PING, and PONG.


Origin blog.csdn.net/Mrwxxxx/article/details/114377508