Learn more about Redis Cluster - build a Redis cluster based on Docker and Docker Compose, handle faults, and expand capacity

Table of contents

1. Build a Redis cluster based on Docker and Docker Compose

1.1. Preface

1.2. Write shell script

1.3. Execute the shell script and create the cluster configuration file

1.4. Write the docker-compose.yml file

1.5. Start the container

1.6. Build a cluster

1.7. Using clusters

1.8. What should I do if a node in the cluster fails?

2. Cluster failure and expansion processing

2.1. Cluster fault handling

a) Fault determination

b) Failover

2.2. Cluster downtime

2.3. Cluster expansion

a) Analysis

b) Add the new master node 110 to the cluster

c) Reassign slots

Question: Can the client access the redis cluster during the process of moving slots/keys?


1. Build a Redis cluster based on Docker and Docker Compose


1.1. Preface

At the moment I only have one cloud server, so building a truly distributed system is inconvenient. In real work, clusters are usually built across multiple hosts.

So here I will build a Redis cluster based on Docker and docker-compose (container orchestration).

Ps: Before setting up, be sure to stop any previously started Redis containers.

1.2. Write shell script

On Linux, files with the .sh suffix are called shell scripts. With such a file we can batch-execute the commands we would normally type on Linux, and we can also use conditions, loops, functions, and other mechanisms.

Here we create 11 Redis nodes. Their configuration files are nearly identical, so we use a script to generate them in batches (you could also write them one by one by hand without a script).

for port in $(seq 1 9); \
do \
mkdir -p redis${port}/
touch redis${port}/redis.conf
cat << EOF > redis${port}/redis.conf
port 6379
bind 0.0.0.0
protected-mode no
appendonly yes
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
cluster-announce-ip 172.30.0.10${port}
cluster-announce-port 6379
cluster-announce-bus-port 16379
EOF
done

# Note: the value of cluster-announce-ip is different here; that is also why this loop is written separately from the one above

for port in $(seq 10 11); \
do \
mkdir -p redis${port}/
touch redis${port}/redis.conf
cat << EOF > redis${port}/redis.conf
port 6379
bind 0.0.0.0
protected-mode no
appendonly yes
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
cluster-announce-ip 172.30.0.1${port}
cluster-announce-port 6379
cluster-announce-bus-port 16379
EOF
done

for port in $(seq 1 9): a loop. seq is a Linux command that generates the numbers [1, 9] and assigns them to the variable port in turn.

do, done: in the shell, { } denotes variable expansion, not a code block, so for uses do and done to mark the beginning and end of the loop body (a style common in older programming languages).

\: the line-continuation character, which joins the next line with the current one. By default the shell treats each line as one command, but the continuation character lets you break a long command across lines.

The first loop body (the second works the same way): mkdir creates nine folders named redis1, redis2, ... redis9; touch creates a redis.conf file under each folder; and cat with a heredoc (the content between the EOF markers) writes the configuration into each redis.conf.

String concatenation: in the shell, strings are concatenated simply by writing them next to each other; no + is needed.
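These mechanics (seq, the do/done block, and concatenation by juxtaposition) can be seen in a minimal sketch:

```shell
# Minimal sketch: seq generates 1..3, and "redis" is concatenated
# with ${port} simply by writing them next to each other.
names=""
for port in $(seq 1 3); do
  names="${names}redis${port} "
done
echo "$names"
```

Running it prints `redis1 redis2 redis3`, the same naming pattern the generation script uses for its directories.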

cluster-enabled yes: enable cluster mode.

cluster-config-file: the node configuration file that the node generates automatically after it starts, holding some Redis Cluster state.

cluster-node-timeout 5000: the heartbeat timeout, set to 5000 ms.

cluster-announce-ip: the IP address of the host the current Redis node runs on (here each "host" is simulated by a Docker container, so this should be the container's IP).

cluster-announce-port: the port the Redis node binds inside the container (the business port). Different containers can use the same internal port; port mapping later maps distinct external ports onto it.

cluster-announce-bus-port: a server can bind multiple ports. This one is the management (bus) port, as opposed to the business port above, which carries business data. Management tasks are coordinated over it; for example, if the master node of a shard goes down and a slave must be promoted, that coordination happens through this port.

1.3. Execute the shell script and create the cluster configuration file

Run the shell script with the following command.

On CentOS:

sh generate.sh

On Ubuntu:

bash generate.sh

After execution, you will get 11 directories, each directory has a configuration file, and the IP addresses in the configuration files are different.
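As a sanity check, the same generation logic can be replayed in a throwaway directory and verified; a sketch (the single-line config here is trimmed down for illustration):

```shell
# Re-run the generation logic in a temporary directory and verify the
# layout: 11 directories, each with its own cluster-announce-ip.
tmp=$(mktemp -d)
cd "$tmp"
for port in $(seq 1 9); do
  mkdir -p redis${port}
  echo "cluster-announce-ip 172.30.0.10${port}" > redis${port}/redis.conf
done
for port in $(seq 10 11); do
  mkdir -p redis${port}
  echo "cluster-announce-ip 172.30.0.1${port}" > redis${port}/redis.conf
done
count=$(ls -d redis*/ | wc -l)
first=$(cat redis1/redis.conf)
last=$(cat redis11/redis.conf)
echo "$count directories; $first ... $last"
```

If everything worked, the script reports 11 directories, with announce IPs running from 172.30.0.101 to 172.30.0.111.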

1.4. Write the docker-compose.yml file

In the compose file, first declare the networks section manually; the subnet declared there is then used to assign a static IP to each node when the Redis cluster containers are created.

Ps: Static (fixed) IPs are configured here to make later observation easier.

version: '3.3'
networks:
  mynet:
    ipam:
      config:
        - subnet: 172.30.0.0/24

Then create each node in the redis cluster

services:
  redis1:
    image: 'redis:5.0.9'
    container_name: redis1
    restart: always
    volumes:
      - ./redis1/:/etc/redis/
    ports:
      - 6371:6379
      - 16371:16379
    command:
      redis-server /etc/redis/redis.conf
    networks:
      mynet:
        ipv4_address: 172.30.0.101

  redis2:
    image: 'redis:5.0.9'
    container_name: redis2
    restart: always
    volumes:
      - ./redis2/:/etc/redis/
    ports:
      - 6372:6379
      - 16372:16379
    command:
      redis-server /etc/redis/redis.conf
    networks:
      mynet:
        ipv4_address: 172.30.0.102

  redis3:
    image: 'redis:5.0.9'
    container_name: redis3
    restart: always
    volumes:
      - ./redis3/:/etc/redis/
    ports:
      - 6373:6379
      - 16373:16379
    command:
      redis-server /etc/redis/redis.conf
    networks:
      mynet:
        ipv4_address: 172.30.0.103

  redis4:
    image: 'redis:5.0.9'
    container_name: redis4
    restart: always
    volumes:
      - ./redis4/:/etc/redis/
    ports:
      - 6374:6379
      - 16374:16379
    command:
      redis-server /etc/redis/redis.conf
    networks:
      mynet:
        ipv4_address: 172.30.0.104

  redis5:
    image: 'redis:5.0.9'
    container_name: redis5
    restart: always
    volumes:
      - ./redis5/:/etc/redis/
    ports:
      - 6375:6379
      - 16375:16379
    command:
      redis-server /etc/redis/redis.conf
    networks:
      mynet:
        ipv4_address: 172.30.0.105

  redis6:
    image: 'redis:5.0.9'
    container_name: redis6
    restart: always
    volumes:
      - ./redis6/:/etc/redis/
    ports:
      - 6376:6379
      - 16376:16379
    command:
      redis-server /etc/redis/redis.conf
    networks:
      mynet:
        ipv4_address: 172.30.0.106

  redis7:
    image: 'redis:5.0.9'
    container_name: redis7
    restart: always
    volumes:
      - ./redis7/:/etc/redis/
    ports:
      - 6377:6379
      - 16377:16379
    command:
      redis-server /etc/redis/redis.conf
    networks:
      mynet:
        ipv4_address: 172.30.0.107

  redis8:
    image: 'redis:5.0.9'
    container_name: redis8
    restart: always
    volumes:
      - ./redis8/:/etc/redis/
    ports:
      - 6378:6379
      - 16378:16379
    command:
      redis-server /etc/redis/redis.conf
    networks:
      mynet:
        ipv4_address: 172.30.0.108
 
  redis9:
    image: 'redis:5.0.9'
    container_name: redis9
    restart: always
    volumes:
      - ./redis9/:/etc/redis/
    ports:
      - 6379:6379
      - 16379:16379
    command:
      redis-server /etc/redis/redis.conf
    networks:
      mynet:
        ipv4_address: 172.30.0.109

  redis10:
    image: 'redis:5.0.9'
    container_name: redis10
    restart: always
    volumes:
      - ./redis10/:/etc/redis/
    ports:
      - 6380:6379
      - 16380:16379
    command:
      redis-server /etc/redis/redis.conf
    networks:
      mynet:
        ipv4_address: 172.30.0.110
 
  redis11:
    image: 'redis:5.0.9'
    container_name: redis11
    restart: always
    volumes:
      - ./redis11/:/etc/redis/
    ports:
      - 6381:6379
      - 16381:16379
    command:
      redis-server /etc/redis/redis.conf
    networks:
      mynet:
        ipv4_address: 172.30.0.111

- subnet: 172.30.0.0/24: here 172.30.0 is the network number, and the addresses are private (internal) IPs. The subnet must not conflict with any network segment that already exists on your host (existing segments differ from machine to machine).

ipv4_address: 172.30.0.101: the static IP. The network-number part must match the subnet above; the host-number part can be anything from 1 to 254, as long as it is unique. Here, however, each address must also match the cluster-announce-ip written into the corresponding configuration file earlier.
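The correspondence between the three places that must agree can be printed programmatically; a sketch of the node / static IP / host-port mapping used throughout this article:

```shell
# Print the mapping that redis.conf (cluster-announce-ip) and
# docker-compose.yml (ipv4_address, ports) must agree on.
for i in $(seq 1 11); do
  ip="172.30.0.$((100 + i))"          # redis1 -> .101, ..., redis11 -> .111
  host_port=$((6370 + i))             # redis1 -> 6371, ..., redis11 -> 6381
  echo "redis${i} -> ${ip}:6379 (mapped to host port ${host_port})"
done
```

The last line printed, redis11 -> 172.30.0.111:6379 (mapped to host port 6381), matches the final service in the compose file above.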

1.5. Start the container

Start all of the containers defined in the yml file with docker-compose up -d.

1.6. Build a cluster

Here the first nine hosts are built into a cluster of 3 masters and 6 slaves; the last two hosts are left unused for now.

Just build it with the following command

redis-cli --cluster create 172.30.0.101:6379 172.30.0.102:6379 172.30.0.103:6379 172.30.0.104:6379 172.30.0.105:6379 172.30.0.106:6379 172.30.0.107:6379 172.30.0.108:6379 172.30.0.109:6379  --cluster-replicas 2

--cluster create: create a cluster, listing the IP and port of each node (make sure the IPs in this command match your actual environment).

--cluster-replicas 2: every master node gets two slave backups. With this setting Redis knows that every 3 nodes form one group (one shard), so 9 nodes make 3 shards in total.
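The arithmetic behind "9 nodes, 3 shards" is simple; a quick sketch:

```shell
# With --cluster-replicas 2, each shard is 1 master + 2 slaves.
nodes=9
replicas=2
group=$((1 + replicas))               # nodes per shard
shards=$((nodes / group))             # 9 / 3 = 3 shards
slots_per_shard=$((16384 / shards))   # each shard holds roughly this many of the 16384 slots
echo "$shards shards, about $slots_per_shard slots each"
```

This prints 3 shards with about 5461 slots each (the actual split is 5461/5462/5461, since 16384 is not divisible by 3).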

After entering yes, as follows

1.7. Using clusters

There are now nine nodes, 101 through 109, forming one cluster. Connecting a client to any of them is essentially equivalent (each shard stores part of the full data set, and with the -c option a client connected to any node can reach all of it).

Once the cluster is up, you can connect a client with -h and -p, or connect directly to one of the mapped external ports with just -p. Below, all connections go to 172.30.0.103:6379:

View the current cluster information through cluster nodes.

Store data in the cluster

The error in the screenshot above occurs because the key k1 hashes to slot 12706, and that slot belongs to shard No. 3 in the cluster information we just viewed.

The error message tells the client to redirect its request to node 103.
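The slot number comes from CRC16(key) mod 16384. As an illustration, here is a pure-bash sketch of the XMODEM-variant CRC16 that Redis Cluster uses, reproducing the 12706 above (real clients use a library for this):

```shell
# CRC16, XMODEM variant (poly 0x1021, init 0), as used by Redis Cluster.
crc16() {
  local s="$1" crc=0 i b byte
  for ((i = 0; i < ${#s}; i++)); do
    printf -v byte '%d' "'${s:i:1}"          # character -> ASCII code
    crc=$(( (crc ^ (byte << 8)) & 0xFFFF ))
    for ((b = 0; b < 8; b++)); do
      if (( crc & 0x8000 )); then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
    done
  done
  echo "$crc"
}

slot=$(( $(crc16 "k1") % 16384 ))
echo "k1 -> slot $slot"
```

Running it prints `k1 -> slot 12706`, matching the slot in the error message.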

Wouldn't this be troublesome?

In fact, we can add the -c option when starting redis-cli. The client will then automatically follow the redirect to the shard that owns the slot computed from the current key and complete the operation there.

Note in the screenshot that after the redirection, the node the client is connected to changes as well.

Likewise, a write attempted on a slave node is automatically redirected to the corresponding master node.

Ps: Most of the Redis commands covered earlier still work, with a few exceptions such as mset and mget, which operate on multiple keys; those keys may be scattered across different shards.

1.8. What should I do if a node in the cluster fails?

What if the downed node is a slave? No problem.

What if it is a master? Then writes to that shard can no longer be performed! In this case the cluster does something similar to what a sentinel does: it automatically selects one of that master's slaves and promotes it to master.

Here I take the master node redis1 down:

docker stop redis1

Before redis1 goes down, the cluster information looks like this:

After redis1 goes down, the cluster information looks like this:

As you can see, the cluster mechanism can handle failover on its own.

2. Cluster failure and expansion processing


2.1. Cluster fault handling

a) Fault determination

1. Every second, each node sends ping packets to a few randomly chosen nodes (carrying cluster configuration such as its ID, which shard it belongs to, whether it is a master or a slave, and which slots it holds); the receiving node replies with a pong packet. Pings are not sent to every node at once: this avoids a flood of heartbeat packets, which would consume serious network bandwidth when the cluster has many nodes.

2. When node A pings node B and B does not respond as expected, A resets its TCP connection to B. If reconnecting also fails, A marks B as PFAIL (comparable to subjective offline).

3. After A judges B to be PFAIL, it uses Redis's built-in Gossip protocol to confirm B's status with other nodes.

4. If A finds that many other nodes also consider B PFAIL, and the count exceeds half of the nodes in the cluster, A marks B as FAIL (comparable to objective offline) and propagates this to the other nodes so they mark B as FAIL too.
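The PFAIL-to-FAIL promotion in step 4 is a simple majority check; a sketch with hypothetical numbers:

```shell
# Hypothetical numbers: a 9-node cluster in which 5 nodes report B as PFAIL.
cluster_nodes=9
pfail_reports=5
if [ "$pfail_reports" -gt $((cluster_nodes / 2)) ]; then
  verdict="FAIL"     # objective offline: propagated to all nodes
else
  verdict="PFAIL"    # still only subjective offline
fi
echo "B is marked $verdict"
```

With 5 of 9 nodes reporting PFAIL, the majority threshold (more than 4) is crossed and B is marked FAIL.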

b) Failover

First there will be a judgment:

  • If B is a slave node, there is no need for failover.
  • If B is the master node, then B's slave nodes (such as C and D) will trigger failover.

Specifically:

1. A slave node first checks whether it is qualified to run in the election. If it has been out of contact with the master for too long (no data synchronization for too long, so its data lags too far behind), it loses its qualification.

2. Qualified nodes such as C and D first sleep for a while. Sleep time = 500 ms base + a random [0, 500) ms + rank * 1000 ms. The larger a node's replication offset (meaning its data is closer to the master's), the better its rank and the shorter its sleep; in other words, the sleep time is dominated by the rank.

3. When C's sleep ends first, C canvasses votes from all nodes in the cluster, but only master nodes are eligible to vote. (Whoever sleeps for the shortest time is most likely to become the new master.)

4. Master nodes vote for C (each master has exactly 1 vote). Once C's votes exceed half of the number of masters, C is promoted to master (C runs slaveof no one on itself, and has D run slaveof C).

5. C then broadcasts the news that it has become master to the other cluster nodes, and everyone updates their saved cluster topology.

6. Finally, if the master that went down earlier recovers, it rejoins the cluster as a slave node.
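Steps 2 and 4 above can be sketched with concrete numbers (the rank and vote counts below are hypothetical, and rand_ms stands in for the random [0, 500) ms component; the formula is as described in this article):

```shell
# Step 2: election delay. A better rank (data closer to the master) means a shorter sleep.
rank=1               # hypothetical rank among eligible replicas (0 is best)
rand_ms=137          # stand-in for the random [0, 500) ms component
delay_ms=$((500 + rand_ms + rank * 1000))
echo "replica with rank $rank sleeps ${delay_ms} ms"

# Step 4: promotion requires more than half of the masters' votes.
masters=3
votes_for_C=2
if [ "$votes_for_C" -gt $((masters / 2)) ]; then
  echo "C is promoted to master"
fi
```

With rank 1 the delay works out to 1637 ms, and 2 of 3 master votes clears the majority needed for promotion.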

2.2. Cluster downtime

The cluster goes down in the following three situations:

  • For some shard, all master and slave nodes are down; that shard can no longer serve data.
  • For some shard, the master node is down and it has no slaves left, so data service is unavailable.
  • More than half of the master nodes are down. This signals something seriously wrong with the cluster; stop and investigate!

Ps: If any node in the cluster goes down, whatever its role, we programmers should deal with it as soon as possible (at the very latest, before work starts the next day).

2.3. Cluster expansion

a) Analysis

The steps above combined the nine hosts 101-109 into a cluster of 3 masters and 6 slaves.

Next, to demonstrate expansion, 110 and 111 are also added to the cluster.

110 becomes a master and 111 its slave, splitting the data from 3 shards into 4.

b) Add the new master node 110 to the cluster

redis-cli --cluster add-node 172.30.0.110:6379 172.30.0.101:6379

add-node: the first ip:port is the new node; the second ip:port is any node already in the cluster (any one will do, as long as it belongs to the cluster you want to join). It tells the command which cluster the new node should be added to.

Afterwards, cluster nodes shows that redis10 has joined the cluster as a master, but holds no slots yet.

c) Reassign slots

Take slots away from the previous three masters and assign them to the new master:

redis-cli --cluster reshard 172.30.0.101:6379

After entering the command, the current state of every machine in the cluster is printed, and then you are asked how many slots to move.

There are now 4 shards and 16384 slots in total; 16384 / 4 = 4096, so enter 4096 here (4096 slots will be assigned to redis10).
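The 4096 comes from evening out the 16384 slots over the new shard count; a quick check:

```shell
total_slots=16384
old_shards=3
new_shards=4
per_new=$((total_slots / new_shards))   # 4096: what redis10 should receive
per_old=$((total_slots / old_shards))   # ~5461: what each old shard held before
echo "move $per_new slots to the new shard (each old shard held about $per_old)"
```

Each of the three old masters gives up roughly a quarter of its slots so that all four shards end up balanced.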

It then asks which node will receive the slots; paste in the id of the redis10 host.

Next, choose which nodes the slots come from:

  1. all: move slots from every master that currently holds any.
  2. Or specify one or several source nodes manually (finish with done).

After you enter all, nothing is moved yet; a migration plan is printed first.

After you enter yes, the migration actually begins. Not only are the slots re-divided; the data stored in those slots is also moved to the new host. (This is a relatively heavyweight operation.)

d) Add slave nodes to the new master node

redis-cli --cluster add-node 172.30.0.111:6379 172.30.0.101:6379 --cluster-slave

 

After the execution is completed, the slave node has been added.

 

Question: Can the client access the redis cluster during the process of moving slots/keys?

From the hash-slot partitioning algorithm we studied earlier, we know that most keys do not need to move. Keys that are not being moved can be accessed normally the whole time; keys that are in the middle of being moved may return errors.

Suppose a client accesses k1, the cluster's sharding algorithm says k1 belongs to the first shard, and the client is redirected to that shard's node. If k1 has already been moved away by the time the redirect lands, the access naturally fails.

If you need to expand a production cluster, take it slowly: for example, pick a quiet time in the dead of night when few clients are accessing the cluster, and expand then to minimize the impact.

Obviously, if you want higher availability and less impact on users during expansion, you can build a new set of machines, create a new cluster, import the data, and swap the new cluster in for the old one (but this approach costs the most).

Ps: Shrinking the cluster is the reverse: remove some nodes and reduce the number of shards.

In practice, clusters are usually expanded and rarely shrunk.


Origin blog.csdn.net/CYK_byte/article/details/132940548