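When Redis runs as a single instance, a crash of that one machine stops the Redis service completely and disrupts every service that depends on it. This article uses Redis Sentinel to manage a master/slave deployment with automatic failover.
Here are two paragraphs from the official documentation: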
Redis Sentinel provides high availability for Redis. In practical terms this means that using Sentinel you can create a Redis deployment that resists without human intervention to certain kind of failures.
Redis Sentinel also provides other collateral tasks such as monitoring, notifications and acts as a configuration provider for clients.
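Environment setup:
I do not have many machines available, so the test environment builds on the earlier master/slave setup.
A Sentinel deployment needs at least three machines, so this test uses three Linux machines.
IP addresses:
10.253.100.34 (Redis master)
10.253.100.35 (Redis slave)
10.253.100.36 (Redis slave)
Start the master and the slaves, then check the Replication info on the master: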
startRedis
cd /usr/local/redis/bin
[root@CPS-redis-1 bin]# ./redis-cli -h 10.253.100.34 -p 6380 info Replication
# Replication
role:master
connected_slaves:2
slave0:ip=10.253.100.35,port=6380,state=online,offset=624792,lag=0
slave1:ip=10.253.100.36,port=6380,state=online,offset=624792,lag=0
master_replid:a76880302e47ecd3e5217157addd98ff3e0f622f
master_replid2:65f480b247f30c202a7ffd49cabd5e28e351655b
master_repl_offset:624792
second_repl_offset:532809
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:528843
repl_backlog_histlen:95950
Check the Replication info on a slave:
[root@cps-redis-2 bin]# ./redis-cli -h 10.253.100.35 -p 6380 info Replication
# Replication
role:slave
master_host:10.253.100.34
master_port:6380
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:635447
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:a76880302e47ecd3e5217157addd98ff3e0f622f
master_replid2:65f480b247f30c202a7ffd49cabd5e28e351655b
master_repl_offset:635447
second_repl_offset:532809
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:635447
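To compare nodes at a glance, the INFO output above can be reduced to its key fields. The following is a minimal sketch, not part of Redis: the function name replication_summary is made up for this example, and it only reads INFO text from standard input.

```shell
# Hypothetical helper (not part of Redis): summarize `info Replication` output.
# Reads the INFO text on stdin and prints the role plus either the number of
# connected slaves (on a master) or the master address and link state (on a slave).
replication_summary() {
  tr -d '\r' | awk -F: '
    { kv[$1] = $2 }                       # INFO lines are key:value pairs
    END {
      if (kv["role"] == "master")
        printf "master slaves=%s\n", kv["connected_slaves"]
      else
        printf "%s of %s:%s link=%s\n", kv["role"], kv["master_host"], kv["master_port"], kv["master_link_status"]
    }'
}
```

Possible usage: ./redis-cli -h 10.253.100.35 -p 6380 info Replication | replication_summary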
Configure the Redis Sentinel monitoring service
1. Add a Redis Sentinel configuration file. On 10.253.100.34, copy the sample file and edit it:
cp /usr/local/src/redis-stable/sentinel.conf /usr/local/redis/bin/
cd /usr/local/redis/bin
vi /usr/local/redis/bin/sentinel.conf
(1) Line 15: # bind 127.0.0.1 192.168.1.1 → bind 10.253.100.34 (do not write bind 127.0.0.1 10.253.96.35; that still binds to 127.0.0.1 and failover will not work)
(2) Line 26: daemonize no → daemonize yes
(3) Line 84: uncomment sentinel monitor mymaster 10.253.100.34 6380 2
On 10.253.100.35 and 10.253.100.36 the line is identical: sentinel monitor mymaster 10.253.100.34 6380 2
(4) Line 121: uncomment sentinel parallel-syncs mymaster 1
(5) Line 113: uncomment sentinel down-after-milliseconds mymaster 60000
(6) Line 146: uncomment sentinel failover-timeout mymaster 180000
down-after-milliseconds
This option specifies how long a master must be unreachable before this Sentinel subjectively considers it down. The unit is milliseconds; the default is 30 seconds.
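Line numbers in sentinel.conf can differ between Redis versions, so edits (1) to (6) can also be applied by matching the directive text. The following is a sketch under stated assumptions: the function name apply_sentinel_edits is hypothetical, and it relies on GNU sed (sed -i) as found on typical Linux systems.

```shell
# Hypothetical helper: apply edits (1)-(6) to a copy of sentinel.conf by
# matching the directive text instead of relying on line numbers.
# $1 = path to sentinel.conf, $2 = IP address this Sentinel should bind to.
# Assumes GNU sed (in-place editing with -i).
apply_sentinel_edits() {
  local conf="$1" bind_ip="$2"
  sed -i \
    -e "s|^# *bind 127\.0\.0\.1 192\.168\.1\.1.*|bind ${bind_ip}|" \
    -e "s|^daemonize no.*|daemonize yes|" \
    -e "s|^#* *sentinel monitor mymaster .*|sentinel monitor mymaster 10.253.100.34 6380 2|" \
    -e "s|^#* *sentinel down-after-milliseconds mymaster .*|sentinel down-after-milliseconds mymaster 60000|" \
    -e "s|^#* *sentinel parallel-syncs mymaster .*|sentinel parallel-syncs mymaster 1|" \
    -e "s|^#* *sentinel failover-timeout mymaster .*|sentinel failover-timeout mymaster 180000|" \
    "$conf"
}
```

Possible usage on the first machine: apply_sentinel_edits /usr/local/redis/bin/sentinel.conf 10.253.100.34. On the other two machines only the bind address changes; the monitor line keeps pointing at the master, 10.253.100.34.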
2. Start Redis Sentinel
Start the Redis Sentinel monitor and watch the Sentinel log output.
cd /usr/local/redis/bin
[root@slave2 bin]# ./redis-sentinel sentinel.conf --sentinel
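Run the following command to check the Redis master/slave information.
cd /usr/local/redis/bin
./redis-cli -h 10.253.100.34 -p 6380 info Replication
If the output still shows role:master with both slaves connected, everything is working and your Redis Sentinel cluster is configured successfully!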
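3. Failover demonstration
Run stopRedis to stop the Redis service on the master (34).
Redis Sentinel detects that the master (34) has stopped and automatically promotes one of the slaves (35 or 36) to master.
Then run the following commands to check the master/slave information again.
On 10.253.100.34: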
cd /usr/local/redis/bin
[root@CPS-redis-1 bin]# ./redis-cli -h 10.253.100.34 -p 6380 info Replication
Could not connect to Redis at 10.253.100.34:6380: Connection refused
[root@CPS-redis-1 bin]# ./redis-cli -h 10.253.100.34 -p 6380 info Sentinel
Could not connect to Redis at 10.253.100.34:6380: Connection refused
On 10.253.100.35:
[root@cps-redis-2 bin]# ./redis-cli -h 10.253.100.35 -p 6380 info Replication
# Replication
role:master
connected_slaves:1
slave0:ip=10.253.100.36,port=6380,state=online,offset=1092447,lag=1
master_replid:a424e48cefde025f2fa584ef102035ef4b353f09
master_replid2:a76880302e47ecd3e5217157addd98ff3e0f622f
master_repl_offset:1092737
second_repl_offset:1088772
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:44162
repl_backlog_histlen:1048576
[root@cps-redis-2 bin]# ./redis-cli -h 10.253.100.35 -p 26379 info Sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.253.100.35:6380,slaves=2,sentinels=3
On 10.253.100.36:
[root@CPS-redis-3 bin]# ./redis-cli -h 10.253.100.36 -p 6380 info Replication
# Replication
role:slave
master_host:10.253.100.35
master_port:6380
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:1122948
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:a424e48cefde025f2fa584ef102035ef4b353f09
master_replid2:a76880302e47ecd3e5217157addd98ff3e0f622f
master_repl_offset:1122948
second_repl_offset:1088772
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:543038
repl_backlog_histlen:579911
[root@CPS-redis-3 bin]# ./redis-cli -h 10.253.100.36 -p 26379 info Sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.253.100.35:6380,slaves=2,sentinels=3
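Both Sentinel outputs above report 10.253.100.35:6380 as the master. A script or client can ask a Sentinel for just that address with the SENTINEL get-master-addr-by-name command. The following is a minimal sketch; the function name current_master and the REDIS_CLI variable are assumptions made for this example.

```shell
# Hypothetical helper: ask one Sentinel which address it currently considers
# the master of "mymaster". REDIS_CLI is an assumed variable so the path to
# redis-cli can be overridden; 26379 is the Sentinel port used in this article.
REDIS_CLI="${REDIS_CLI:-/usr/local/redis/bin/redis-cli}"

current_master() {
  local sentinel_host="$1"
  # The reply is two lines (IP, then port); join them as ip:port.
  "$REDIS_CLI" -h "$sentinel_host" -p 26379 sentinel get-master-addr-by-name mymaster \
    | tr -d '\r' | paste -sd: -
}
```

Possible usage: current_master 10.253.100.36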
4. Restart the original master Redis
[root@CPS-redis-1 bin]# startRedis
21103:C 19 Apr 2019 13:53:16.456 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
21103:C 19 Apr 2019 13:53:16.456 # Redis version=5.0.3, bits=64, commit=00000000, modified=0, pid=21103, just started
21103:C 19 Apr 2019 13:53:16.456 # Configuration loaded
On 10.253.100.34:
[root@CPS-redis-1 bin]# ./redis-cli -h 10.253.100.34 -p 26379 info Sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.253.100.35:6380,slaves=2,sentinels=3
[root@CPS-redis-1 bin]# ./redis-cli -h 10.253.100.34 -p 6380 info Replication
# Replication
role:slave
master_host:10.253.100.35
master_port:6380
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:1163692
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:a424e48cefde025f2fa584ef102035ef4b353f09
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:1163692
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1153595
repl_backlog_histlen:10098
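Redis Sentinel brings the former master back into the deployment, but it is no longer the master; it rejoins as a slave.
5. Restore the original master. This is a manual step; with automatic failover in production you do not need it. Run the following command against the current master (35) to make 34 the master again.
Make sure to run it against the current master!
On 35/36: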
cd /usr/local/redis/bin
./redis-cli -h 10.253.100.35 -p 6380 slaveof 10.253.100.34 6380
On 10.253.100.34:
[root@CPS-redis-1 bin]# ./redis-cli -h 10.253.100.34 -p 6380 info Replication
# Replication
role:master
connected_slaves:2
slave0:ip=10.253.100.35,port=6380,state=online,offset=1198145,lag=1
slave1:ip=10.253.100.36,port=6380,state=online,offset=1198145,lag=1
master_replid:b1d32669dc9c9f793f29a810e99887b7a5754838
master_replid2:a424e48cefde025f2fa584ef102035ef4b353f09
master_repl_offset:1198145
second_repl_offset:1194339
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1153595
repl_backlog_histlen:44551
Common commands:
./redis-cli -h 10.253.100.34 -p 26379 info Sentinel
./redis-cli -h 10.253.100.35 -p 26379 info Sentinel
./redis-cli -h 10.253.100.36 -p 26379 info Sentinel
./redis-cli -h 10.253.100.34 -p 6380 info Replication
./redis-cli -h 10.253.100.35 -p 6380 info Replication
./redis-cli -h 10.253.100.36 -p 6380 info Replication
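The three Replication checks above can be wrapped in a loop that prints one line per node. This is only a sketch: the function name print_roles and the REDIS_CLI variable are assumptions for this example, and the host list is the one used in this article.

```shell
# Hypothetical helper: print the replication role of every node in one pass.
# REDIS_CLI is an assumed variable so the path to redis-cli can be overridden.
REDIS_CLI="${REDIS_CLI:-/usr/local/redis/bin/redis-cli}"

print_roles() {
  local host role
  for host in 10.253.100.34 10.253.100.35 10.253.100.36; do
    # A stopped node yields no role line; ignore the failed connection
    # and report the node as unreachable instead.
    role=$("$REDIS_CLI" -h "$host" -p 6380 info Replication 2>/dev/null \
             | tr -d '\r' | awk -F: '$1 == "role" { print $2 }') || true
    printf '%s %s\n' "$host" "${role:-unreachable}"
  done
}
```

Calling print_roles during the failover demonstration would list 34 as unreachable while it is stopped.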
References:
http://redis.io/topics/sentinel