Redis: How Sentinel Mode Works

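When the master server goes down unexpectedly, the two existing replicas, slave1 and slave2, take over the service: for example, slave1 is promoted to the new master and slave2 becomes a replica of slave1.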

The main parameters in the Sentinel configuration file:

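sentinel monitor mymaster 127.0.0.1 6379 1    The trailing number is the quorum: how many Sentinels must see the master as down before it is treated as really (objectively) down.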

sentinel down-after-milliseconds mymaster 30000    How many milliseconds the master may be unreachable before this Sentinel considers it down (here 30 seconds).

sentinel parallel-syncs mymaster 1    How many replicas may be repointed to the new master at the same time after a failover.

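sentinel failover-timeout mymaster 180000    The failover timeout in milliseconds (here 3 minutes); a failover that takes longer than this is treated as failed.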
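
To make the layout of these four directives concrete, here is a minimal Python sketch that reads them into a dictionary. The helper name parse_sentinel_conf is hypothetical and not part of Redis; it only understands the directives shown above and ignores everything else.

```python
# Hypothetical helper (not part of Redis): parse the four directives from this post.
SENTINEL_CONF = """\
sentinel monitor mymaster 127.0.0.1 6379 1
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
"""

# Directives that take a single integer argument after the master name.
INT_OPTIONS = {"down-after-milliseconds", "parallel-syncs", "failover-timeout"}


def parse_sentinel_conf(text):
    """Return {master_name: {option: value}} for the directives used here."""
    masters = {}
    for raw in text.splitlines():
        parts = raw.split()
        # Skip blank lines, comments, and lines that are not sentinel directives.
        if len(parts) < 4 or parts[0] != "sentinel":
            continue
        option, name, args = parts[1], parts[2], parts[3:]
        if option == "monitor" and len(args) == 3:
            # sentinel monitor <name> <ip> <port> <quorum>
            entry = masters.setdefault(name, {})
            entry["host"] = args[0]
            entry["port"] = int(args[1])
            entry["quorum"] = int(args[2])
        elif option in INT_OPTIONS:
            masters.setdefault(name, {})[option] = int(args[0])
    return masters


conf = parse_sentinel_conf(SENTINEL_CONF)["mymaster"]
print(conf["quorum"])                          # 1
print(conf["down-after-milliseconds"] / 1000)  # 30.0 (seconds)
print(conf["failover-timeout"] / 1000)         # 180.0 (seconds)
```

With a quorum of 1 and a single Sentinel, as in this test setup, one Sentinel alone decides that the master is down.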

Start the Sentinel
[root@ZFRC-YW-YJF-TEST-370123 redis]# ./bin/redis-server ./sentinel.conf --sentinel
17400:X 28 Jun 17:17:32.853 # Not listening to IPv6: unsupproted
[Redis ASCII-art startup banner: Redis 3.2.13 (00000000/0) 64 bit, Running in sentinel mode, Port: 26379, PID: 17400, http://redis.io]

17400:X 28 Jun 17:17:32.854 # Sentinel ID is b81b851b02fec76bcfc7144b0a675fdedecf7188
17400:X 28 Jun 17:17:32.854 # +monitor master mymaster 127.0.0.1 6379 quorum 1
17400:X 28 Jun 17:17:32.854 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:17:32.855 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379

Test: shut down the master and check whether the Sentinel performs a failover
[root@ZFRC-YW-YJF-TEST-370123 ~]# cd /usr/local/redis/
[root@ZFRC-YW-YJF-TEST-370123 redis]# ./bin/redis-cli
127.0.0.1:6379> shutdown
not connected>

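The Sentinel log prints the steps of the election and failover. The key line is the one containing switch-master, which shows which machine became the new master.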

17400:X 28 Jun 17:19:03.363 # +sdown master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.363 # +odown master mymaster 127.0.0.1 6379 #quorum 1/1
17400:X 28 Jun 17:19:03.363 # +new-epoch 1
17400:X 28 Jun 17:19:03.363 # +try-failover master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.364 # +vote-for-leader b81b851b02fec76bcfc7144b0a675fdedecf7188 1
17400:X 28 Jun 17:19:03.364 # +elected-leader master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.364 # +failover-state-select-slave master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.464 # +selected-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.464 * +failover-state-send-slaveof-noone slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.564 * +failover-state-wait-promotion slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.917 # +promoted-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:03.917 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:04.006 * +slave-reconf-sent slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:04.982 * +slave-reconf-inprog slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:04.982 * +slave-reconf-done slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:05.064 # +failover-end master mymaster 127.0.0.1 6379
17400:X 28 Jun 17:19:05.064 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380
17400:X 28 Jun 17:19:05.064 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6380
17400:X 28 Jun 17:19:05.064 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
17400:X 28 Jun 17:19:35.080 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
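
When the log is long, the +switch-master line is the one worth pulling out. The following is a rough Python sketch that extracts the old and new master address from log lines in the format shown above. The function name find_switch_master is made up for this example, and the regular expression assumes the event layout seen in this log (name, old ip, old port, new ip, new port).

```python
import re

# Matches "+switch-master <name> <old-ip> <old-port> <new-ip> <new-port>".
SWITCH_RE = re.compile(r"\+switch-master (\S+) (\S+) (\d+) (\S+) (\d+)")


def find_switch_master(log_lines):
    """Return (name, old_addr, new_addr) of the last +switch-master event, or None."""
    result = None
    for line in log_lines:
        match = SWITCH_RE.search(line)
        if match:
            name, old_ip, old_port, new_ip, new_port = match.groups()
            result = (name, (old_ip, int(old_port)), (new_ip, int(new_port)))
    return result


log = [
    "17400:X 28 Jun 17:19:05.064 # +failover-end master mymaster 127.0.0.1 6379",
    "17400:X 28 Jun 17:19:05.064 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380",
]
print(find_switch_master(log))
# ('mymaster', ('127.0.0.1', 6379), ('127.0.0.1', 6380))
```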

Then log in to the former replica on 6380 and check whether it is now the master node:
127.0.0.1:6380> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6381,state=online,offset=22858,lag=0
master_repl_offset:22858
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:22857
127.0.0.1:6380>

127.0.0.1:6381> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6380
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:35773
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
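
The same check can be scripted. Below is a small Python sketch that parses the key:value lines of INFO replication output and reports the role. The helper names parse_info and describe are assumptions for this example, and the sample text is copied from the output above rather than fetched from a live server.

```python
def parse_info(text):
    """Turn 'key:value' lines of INFO output into a dict; skip blanks and '#' headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        info[key] = value
    return info


def describe(info):
    """Summarise the replication role in one line."""
    if info["role"] == "master":
        return "master with %s connected replica(s)" % info["connected_slaves"]
    return "replica of %s:%s (link %s)" % (
        info["master_host"], info["master_port"], info["master_link_status"])


# Excerpts of the output shown above for ports 6380 and 6381.
sample_6380 = """\
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6381,state=online,offset=22858,lag=0
"""
sample_6381 = """\
role:slave
master_host:127.0.0.1
master_port:6380
master_link_status:up
"""

print(describe(parse_info(sample_6380)))  # master with 1 connected replica(s)
print(describe(parse_info(sample_6381)))  # replica of 127.0.0.1:6380 (link up)
```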

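Sometimes, when the master goes down, the replicas are not promoted in order: for example, after the master fails, 6381 rather than 6380 becomes the new master. This is controlled by a parameter that lives in the Redis configuration file of each replica, not in the Sentinel configuration file.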
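slave-priority 100    The smaller the number, the higher the priority for promotion. A value of 0 means the replica is never promoted.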
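
As an illustration of how the priority affects the choice, here is a simplified Python model of the ordering: lowest non-zero priority first, then the replica with the larger replication offset, then the smaller run ID. This is only a sketch of the ordering, not Sentinel's actual code; it leaves out the health checks Sentinel applies first, and the Replica fields and sample values are hypothetical.

```python
from collections import namedtuple

# Hypothetical record: priority is slave-priority from that replica's redis.conf.
Replica = namedtuple("Replica", "addr priority repl_offset run_id")


def pick_replica(replicas):
    """Simplified ordering: lowest non-zero priority, highest offset, smallest run ID."""
    candidates = [r for r in replicas if r.priority != 0]  # 0 means never promote
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


# Illustrative values only: 6381 has the lower priority number, so it wins
# even though 6380 has replicated more data.
replicas = [
    Replica("127.0.0.1:6380", priority=100, repl_offset=22858, run_id="aaa"),
    Replica("127.0.0.1:6381", priority=50, repl_offset=22000, run_id="bbb"),
]
print(pick_replica(replicas).addr)  # 127.0.0.1:6381
```

If both replicas keep the default priority of 100, the model falls back to the replication offset, which is one reason the promoted replica may not be the one you expected.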


Reposted from blog.51cto.com/yangjunfeng/2415069