Redis 4.0: single-instance cluster to three-master three-slave cluster experiment

Environment:

OS: CentOS release 7.4.1708
IP: 192.168.77.101/102/103
MEM: 16G
DISK: 50G

Brief description

Refer to "Redis 4.0 Three-Master Three-Slave Cluster Installation and Configuration" to configure and start six Redis instances across the three hosts.
The fifth step of that article, configuring the cluster, is carried out with the operations in this post.
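
For reference, the cluster-related settings of each instance look roughly like the following (a minimal sketch built from standard redis.conf directives; file names, ports and paths follow the layout used later in this post and may differ from the referenced article):

# /usr/local/redis/redis_cluster/redis_7000.conf (illustrative; one file per instance, port adjusted accordingly)
port 7000
daemonize yes
dir /usr/local/redis/run/data                       # working directory for RDB/AOF/cluster files
logfile /usr/local/redis/run/log/redis_7000.log     # instance runtime log
appendonly yes                                      # enable AOF persistence
cluster-enabled yes                                 # run the instance in cluster mode
cluster-config-file nodes-7000.conf                 # cluster state file, maintained by Redis itself
cluster-node-timeout 5000                           # milliseconds before a node is considered failing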

Cluster creation

1. Create a single-instance cluster on node one:

cp -av /usr/local/redis/src/redis-trib.rb /usr/local/bin
redis-trib.rb create --replicas 0 192.168.77.101:7000
# Creating a single-instance cluster reports an error, because a Redis cluster requires at least 3 master instances
# The error can be ignored; use the fix command to repair the cluster


redis-trib.rb fix 192.168.77.101:7000
# Use the fix command to repair the cluster; it asks for an interactive "yes" to confirm


redis-cli -c -h 192.168.77.101 -p 7000 cluster nodes
redis-trib.rb check 192.168.77.101:7000
# Check the cluster status: the cluster currently has a single master instance that holds all of the hash slots
# The single-instance cluster is now complete

2. Add two master nodes:

redis-trib.rb add-node 192.168.77.102:7002 192.168.77.101:7000
redis-trib.rb add-node 192.168.77.103:7004 192.168.77.101:7000
# Use the add-node command to add 102:7002 and 103:7004 to the cluster
redis-trib.rb check 192.168.77.101:7000
# Check the cluster status: there are now three master instances, but all slots are still on the first master
# The two newly added masters hold no slots yet, so at this point they could still be removed directly
redis-trib.rb reshard 192.168.77.101:7000
# Use the reshard command to redistribute slots; it interactively asks for the number of slots to move, the receiving instance, and the source instances
# Run the command twice so that each of the two newly added instances receives a share of the slots

The reshard command prompts in turn for the number of slots to reallocate, the ID of the instance that will receive the slots, the IDs of the source instances the slots come from, and finally a confirmation of the move.

In a real production environment, moving data slots may fail with a timeout, for example when moving 1000 slots to another master instance.
This happens when one or more slots hold a large amount of data and the move operation exceeds its timeout; use the fix command to repair the cluster.
A timeout failure does not harm the cluster, and the slots that were already moved successfully remain in a normal state.
After repairing, you can reshard again with a smaller number of slots to avoid another timeout failure.

redis-trib.rb fix 192.168.77.101:7000
redis-trib.rb check 192.168.77.101:7000
# Check the cluster node status: all three masters now hold slots, so none of them can be removed directly anymore
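
For larger moves, or to avoid the interactive prompts entirely, redis-trib.rb reshard also accepts its parameters on the command line (a sketch; <source-id> and <target-id> are placeholder node IDs, and the exact options can be checked with redis-trib.rb help):

redis-trib.rb reshard --from <source-id> --to <target-id> \
    --slots 500 --timeout 60000 --yes 192.168.77.101:7000
# --slots   : move fewer slots per run to reduce the chance of a timeout
# --timeout : migration timeout in milliseconds; raise it for very large slots
# --yes     : skip the interactive confirmation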

3. Add three slave instances:

redis-trib.rb add-node --slave 192.168.77.101:7001 192.168.77.101:7000
redis-trib.rb add-node --slave 192.168.77.102:7003 192.168.77.101:7000
redis-trib.rb add-node --slave 192.168.77.103:7005 192.168.77.101:7000
# Add three slave instances; each slave is automatically assigned to one of the master instances
redis-cli -c -h 192.168.77.101 -p 7000 cluster nodes|sort -k2
redis-trib.rb check 192.168.77.101:7000
# Check the cluster node status
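
If you want to attach a slave to a specific master instead of letting redis-trib.rb choose one automatically, add-node also takes a master ID (a sketch; <master-id> is a placeholder copied from the first column of cluster nodes):

redis-trib.rb add-node --slave --master-id <master-id> 192.168.77.101:7001 192.168.77.101:7000
# Attaches 101:7001 as a slave of the master whose node ID is <master-id>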

4. Generate a cluster status report:
The output of the previous commands is quite verbose; the output of the cluster nodes command can be reshaped into a compact status report.

redis-cli -c -h 192.168.77.101 -p 7000 cluster nodes|\
awk 'BEGIN{print"Node Role OwnID MasterID"}
     {if($4=="-") {print $2,$3,$1,$1} 
      else        {print $2,$3,$1,$4}}'|\
sort -rk4|column -t
# Check the cluster status and the master/slave relationships
# When the own ID equals the attached master ID, the instance is a master
# Otherwise the instance is a slave
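
Since this report is referred to repeatedly in the following steps, it can be wrapped in a small shell function (a convenience sketch only; it simply reuses the command above):

cluster_report() {
    redis-cli -c -h 192.168.77.101 -p 7000 cluster nodes |\
    awk 'BEGIN{print"Node Role OwnID MasterID"}
         {if($4=="-") {print $2,$3,$1,$1}
          else        {print $2,$3,$1,$4}}' |\
    sort -rk4 | column -t
}
# Run cluster_report after each of the following operations to check the cluster state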

5. Manually adjust the mapping between master and slave instances:
When a host runs two or more instances, failovers triggered by instance outages can leave the cluster in an awkward layout:
several master instances running on one host, a host that only holds slaves,
or a host whose two instances are exactly a master and its own slave.
In those cases the master/slave mapping should be adjusted manually, so that a single host failure does not make part of the cluster unavailable.

redis-cli -c -h 192.168.77.101 -p 7001 cluster replicate 45c239f2a6637cb8c5209c7ccd5754acfbaeb6b4
redis-cli -c -h 192.168.77.102 -p 7003 cluster replicate 8af251aab2566469e8742abc80737acc9f7b60a2
redis-cli -c -h 192.168.77.103 -p 7005 cluster replicate 48c93c36580fc7314307ed4eeae6363b15b1be50
# Use the status report command from step 4 to check the cluster status and verify the change
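
The IDs passed to cluster replicate are the node IDs of the intended masters; rather than copying them by hand, they can be extracted from the first column of cluster nodes (a sketch; the pattern assumes the desired master is 192.168.77.102:7002):

redis-cli -c -h 192.168.77.101 -p 7000 cluster nodes |\
    awk '/192.168.77.102:7002/ && /master/ {print $1}'
# Prints the node ID of the master listening on 192.168.77.102:7002,
# which can then be used as the argument to "cluster replicate"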

6. Manually shut down a master instance and verify the cluster failover:

redis-cli -c -h 192.168.77.102 -p 7002 debug segfault
# Shut down the 7002 instance on 102
# Use the status report command from step 4 to check the cluster status and verify the failover
# The role column for the 7002 instance on 102 now shows master,fail
# The slave that was attached to it has been promoted to master; the failover succeeded
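
In addition to the status report, the overall cluster health can be confirmed with standard redis-cli commands from any surviving instance (a quick check, not from the original post):

redis-cli -c -h 192.168.77.101 -p 7000 cluster info | grep cluster_state
# cluster_state should still report "ok" after the failover
redis-cli -c -h 192.168.77.101 -p 7000 cluster nodes | grep fail
# Lists the failed 102:7002 entry, flagged as master,fail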

On node two, restart the stopped instance:

/usr/local/bin/redis-server /usr/local/redis/redis_cluster/redis_7002.conf
# Use the status report command from step 4 to check the cluster status
# After restarting, the previously stopped instance rejoins the cluster as a slave

Cluster deletion

1. Delete the slave instances:

redis-trib.rb del-node 192.168.77.101:7000 45c239f2a6637cb8c5209c7ccd5754acfbaeb6b4
redis-trib.rb del-node 192.168.77.101:7000 d65247f83fbe5f3c9bc0b25c8ffbd3aace203bea
redis-trib.rb del-node 192.168.77.101:7000 a1c5ede35007ba6c1bf6d9fad907dbe09f7c6f62
# Instances in the slave role can be removed directly
# After being removed from the cluster, an instance shuts itself down

2. Delete the master instances:

redis-trib.rb reshard 192.168.77.101:7000
# The slots on the masters about to be removed must first be moved to other masters
# A master that still holds slots cannot be removed
redis-trib.rb check 192.168.77.101:7000
# Verify that the instances to be removed no longer hold any slots
redis-trib.rb del-node 192.168.77.101:7000 f2b7cb07b0f3a16c7c650df205ef269933d5badc
redis-trib.rb del-node 192.168.77.101:7000 8af251aab2566469e8742abc80737acc9f7b60a2
# Use the status report command from step 4 to check the cluster status: only one master instance remains
redis-cli -c -h 192.168.77.101 -p 7000 debug segfault
# Shut down the last master instance

3. Delete residual files:
After the instances are deleted, residual files remain; they must be removed before a cluster can be created again.
These files are generated while the instances run, so they must be deleted on all three hosts.

cd /usr/local/redis/run/data
rm -rf nodes*.conf
# Delete the cluster configuration files
rm -rf dump*.rdb
# Delete the RDB snapshots
rm -rf appendonly*.aof
# Delete the AOF files
cd ..
rm -rf log/redis*.log
# Delete the instance runtime logs
tree
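
Because the cleanup has to be done on all three hosts, the same removal can be scripted over SSH (a sketch that assumes password-less root SSH between the hosts; the paths match those used above):

for h in 192.168.77.101 192.168.77.102 192.168.77.103; do
    ssh root@$h 'cd /usr/local/redis/run && \
        rm -f data/nodes*.conf data/dump*.rdb data/appendonly*.aof log/redis*.log'
done
# Removes cluster state files, RDB snapshots, AOF files and instance logs on every host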

Simple summary

This experiment shows that a Redis cluster does not really restrict the number of masters and slaves, and adding or removing instances is quite flexible.
You can add masters as memory usage grows and then migrate data slots onto them.
The memory limit of a newly added master is not constrained either; you can add masters and slaves with a larger memory limit according to the actual production load.
A slave should generally have the same memory limit as its master, because the slave is automatically promoted to master once its master goes down.
For cluster safety, every master should have at least one slave, and you should ensure that the slave does not go down before its master does.

