Redis sentinel mode (Sentinel) and scenario practice

In the previous article we set up one Master and two Slave nodes. With that model, when the Master goes down the two Slave nodes cannot accept write operations, which inevitably affects the normal operation of our program. Is there a way for a Slave node to take over as Master and keep accepting writes when the Master goes down? The answer is yes: the sentinel was born to solve exactly this problem!

What is sentinel mode?

Before Redis 2.8, when the Master node went down we had to manually reconfigure a Slave node to become the Master (or bring the original Master back up) to keep the service running. This is not only time-consuming and laborious, but writes are also interrupted during the recovery window, which can lead to data loss. Therefore, Redis introduced sentinel mode in version 2.8 to promote a new Master automatically.
(Figure: a sentinel process monitoring the Master and the Slave nodes)

As shown in the figure above, the sentinel runs as an independent process that monitors the Master and Slave nodes in real time. When the Master node fails unexpectedly, the sentinel elects one of the Slave nodes to become the new Master, so that the program can continue to read and write normally.
In fact, the sentinel is only responsible for two things:

  • Monitoring all Redis nodes to check whether they are running normally (it sends commands to every node and judges their state from the replies).
  • When it detects that the Master node is down, it promotes a Slave node to be the new Master and then, through the publish/subscribe mechanism, notifies the other nodes to update their configuration files, effectively telling them: I am the emperor now, you are all my subjects, listen to me.

Even with sentinel mode in place, what happens if the sentinel itself suddenly dies? The answer is to deploy more sentinels. These sentinels monitor not only every Redis node but also one another; this is the multi-sentinel mode.

(Figure: multiple sentinels monitoring the Redis nodes and each other)
Scenario: suppose the Master node suddenly crashes and sentinel 1 detects it. Sentinel 1 does not immediately start promoting a Slave node; it waits until enough other sentinels have also confirmed that the Master is really down, and only then do the sentinels vote. The Slave node that wins the vote replaces the Master. After the switch succeeds, each sentinel is told, via publish/subscribe, to point the master it monitors at the new Master.
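
If you want to watch this publish/subscribe traffic yourself, you can subscribe to the sentinel's event channels. A minimal sketch, assuming the sentinel is listening on its default port 26379:

redis-cli -p 26379 subscribe +sdown +odown +switch-master   #watch the sentinel's failure and failover events

#When a failover completes, the +switch-master channel carries a message like:
# redis6379 127.0.0.1 6379 127.0.0.1 6381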

How Redis implements master-slave replication is not repeated here. If you are not familiar with it, please refer to the previous article on configuring and testing Redis master-slave replication.

Single sentinel configuration:

1. Create a new file named sentinel.conf in the redis directory with the following content:

#sentinel monitor <name of your choice> <Master host IP> <Master port> <quorum>; the trailing 1 means a single sentinel deciding the Master is down is enough to trigger the failover vote
sentinel monitor redis6379 127.0.0.1 6379 1 
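
Besides sentinel monitor, a few other directives are commonly tuned in the same file. The values below are simply the defaults, shown here as a sketch using the master name redis6379 from above:

sentinel down-after-milliseconds redis6379 30000   #how long the Master must be unreachable before it is considered down (default 30000 ms)
sentinel failover-timeout redis6379 180000         #maximum time a failover may take before it is retried (default 180000 ms)
sentinel parallel-syncs redis6379 1                #how many Slaves resync with the new Master at the same time (default 1)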

2. Start the sentinel we just configured:

redis-sentinel sentinel.conf   #start the sentinel

(Figure: console output after starting the sentinel)
After the sentinel starts successfully, we can see information about the Master node and the Slave nodes in its console output.
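
You can also query the sentinel directly over its own port to confirm what it is monitoring. A minimal sketch, again assuming the default sentinel port 26379 and the master name redis6379 configured above:

redis-cli -p 26379 sentinel master redis6379                    #details of the monitored Master
redis-cli -p 26379 sentinel slaves redis6379                    #Slaves the sentinel has discovered
redis-cli -p 26379 sentinel get-master-addr-by-name redis6379   #address of the current Master, here 127.0.0.1 6379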

Scenario test:

When the Master suddenly goes down, will a Slave node be elected as the new Master?
Let's write a piece of data before the Master goes down:

#6379 is the Master node
127.0.0.1:6379> set name Silence-wen
OK
127.0.0.1:6379> get name
"Silence-wen"
127.0.0.1:6379> 

#6380 is a Slave node
127.0.0.1:6380> get name
"Silence-wen"
127.0.0.1:6380> 

#6381 is a Slave node
127.0.0.1:6381> get name
"Silence-wen"
127.0.0.1:6381> 

Replication info of the Master node (6379) before the Master process is shut down:

127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=16717,lag=1
slave1:ip=127.0.0.1,port=6381,state=online,offset=16717,lag=0
master_replid:600a20a2c7420d206c084442548b39ebed39f4de
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:16717
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:16717
127.0.0.1:6379> 

Replication info of the 6380 Slave node before the Master process is shut down:

127.0.0.1:6380> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6379
master_link_status:up
master_last_io_seconds_ago:5
master_sync_in_progress:0
slave_repl_offset:16773
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:600a20a2c7420d206c084442548b39ebed39f4de
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:16773
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:16773
127.0.0.1:6380> 

Replication info of the 6381 Slave node before the Master process is shut down:

127.0.0.1:6381> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:16815
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:600a20a2c7420d206c084442548b39ebed39f4de
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:16815
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:16815
127.0.0.1:6381> 

Shut down the Master process:

[root@Silence bin]# ps -ef|grep redis
root      6280  6261  0 16:33 pts/0    00:00:00 redis-cli
root      6343     1  0 16:41 ?        00:00:19 redis-server 127.0.0.1:6381
root      6351  6323  0 16:41 pts/3    00:00:00 redis-cli -p 6381
root      6422     1  0 17:05 ?        00:00:17 redis-server 127.0.0.1:6380
root      6427  6283  0 17:06 pts/1    00:00:00 redis-cli -p 6380
root      6469  6208  0 17:53 pts/2    00:00:14 redis-server 127.0.0.1:6379
root      6649  6382  0 20:22 pts/4    00:00:00 grep --color=auto redis
[root@Silence bin]# kill 6469
[root@Silence bin]# ps -ef|grep redis
root      6280  6261  0 16:33 pts/0    00:00:00 redis-cli
root      6343     1  0 16:41 ?        00:00:19 redis-server 127.0.0.1:6381
root      6351  6323  0 16:41 pts/3    00:00:00 redis-cli -p 6381
root      6422     1  0 17:05 ?        00:00:17 redis-server 127.0.0.1:6380
root      6427  6283  0 17:06 pts/1    00:00:00 redis-cli -p 6380
root      6651  6382  0 20:23 pts/4    00:00:00 grep --color=auto redis
[root@Silence bin]# 

The Master node process has now been killed. Let's see whether the sentinel elects a Slave as the new Master, as described above.

A few seconds after shutting down the Master node, look at the information printed by the sentinel process:

6698:X 13 Aug 2020 20:33:13.148 # +sdown master redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:13.148 # +odown master redis6379 127.0.0.1 6379 #quorum 1/1
6698:X 13 Aug 2020 20:33:13.148 # +new-epoch 1
6698:X 13 Aug 2020 20:33:13.148 # +try-failover master redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:13.151 # +vote-for-leader 50b2a5b7aa9a471788c1df7a09775533f5392f14 1
6698:X 13 Aug 2020 20:33:13.151 # +elected-leader master redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:13.151 # +failover-state-select-slave master redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:13.218 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:13.218 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:13.295 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:14.257 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:14.257 # +failover-state-reconf-slaves master redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:14.338 * +slave-reconf-sent slave 127.0.0.1:6380 127.0.0.1 6380 @ redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:15.330 * +slave-reconf-inprog slave 127.0.0.1:6380 127.0.0.1 6380 @ redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:15.330 * +slave-reconf-done slave 127.0.0.1:6380 127.0.0.1 6380 @ redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:15.381 # +failover-end master redis6379 127.0.0.1 6379
6698:X 13 Aug 2020 20:33:15.381 # +switch-master redis6379 127.0.0.1 6379 127.0.0.1 6381
6698:X 13 Aug 2020 20:33:15.381 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ redis6379 127.0.0.1 6381
6698:X 13 Aug 2020 20:33:15.381 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ redis6379 127.0.0.1 6381
6698:X 13 Aug 2020 20:33:45.425 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ redis6379 127.0.0.1 6381

From this output we can see that, after the Master node is shut down, the sentinel does the following:

1. try-failover master redis6379 (start the failover)
2. vote-for-leader (vote for a leader sentinel)
3. elected-leader master redis6379 (leader elected)
4. failover-state-select-slave (failover state: select a slave)
5. selected-slave (slave 6381 selected)
6. failover-state-send-slaveof-noone (failover state: send SLAVEOF NO ONE to the selected slave)
7. failover-state-wait-promotion (failover state: wait for the promotion to complete)
8. promoted-slave (the slave has been promoted)
9. failover-state-reconf-slaves (failover state: reconfigure the remaining slaves)
10. slave-reconf-sent (reconfiguration command sent to slave 6380)
11. slave-reconf-done (slave reconfiguration done)
12. failover-end (failover completed)
13. switch-master redis6379 127.0.0.1 6379 127.0.0.1 6381 (6381 replaces 6379 as the Master)
14. slave slave 127.0.0.1:6380 127.0.0.1 6380 @ redis6379 127.0.0.1 6381 (6380 becomes a Slave of 6381)
15. slave slave 127.0.0.1:6379 127.0.0.1 6379 @ redis6379 127.0.0.1 6381 (6379 is demoted to a Slave of 6381)

From the information printed by the sentinel, we can see that after the 6379 Master node is shut down, the sentinel performs a series of operations: it elects 6381 as the new Master and reassigns 6379 and 6380 to 6381 as its Slaves.
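
You can also ask the sentinel which node it now considers the Master. Assuming the default sentinel port 26379, the following should now report 6381:

redis-cli -p 26379 sentinel get-master-addr-by-name redis6379   #expected to return 127.0.0.1 and 6381 after the failover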

Did 6381 really become the Master, as the sentinel log claims? Let's take a look.

Check the replication info of the two former Slave nodes. First, 6381:

127.0.0.1:6381> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6380,state=online,offset=83338,lag=1
master_replid:d20833d54acc941615c6c0b4a989e6f8f4d1de42
master_replid2:47c74f39e93685c12f13ddc96bca80b76f1ecce1
master_repl_offset:83338
second_repl_offset:18322
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:83338
127.0.0.1:6381> 

Replication info of 6380:

127.0.0.1:6380> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6381
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:90594
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:d20833d54acc941615c6c0b4a989e6f8f4d1de42
master_replid2:47c74f39e93685c12f13ddc96bca80b76f1ecce1
master_repl_offset:90594
second_repl_offset:18322
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:90594
127.0.0.1:6380> 

Sure enough, 6381 has become the Master node, but it seems to have only one Slave, 6380. Why is 6379 not shown? Because 6379 is still down.
Let's start 6379 again and see what its replication role looks like now.
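
To bring 6379 back, start it again with the configuration file it was originally launched with; the file name below is only an assumption, use whatever your 6379 instance actually uses:

redis-server redis6379.conf   #config file name assumed; start the old Master again
redis-cli -p 6379             #reconnect to the restarted instance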

127.0.0.1:6379> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:6381
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:97041
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:d20833d54acc941615c6c0b4a989e6f8f4d1de42
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:97041
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:95665
repl_backlog_histlen:1377
127.0.0.1:6379> 

Sure enough, 6379 has come back as a Slave of 6381. Now that 6379 is running again, it also shows up in 6381's replication info:

127.0.0.1:6381> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=101561,lag=0
slave1:ip=127.0.0.1,port=6379,state=online,offset=101561,lag=0
master_replid:d20833d54acc941615c6c0b4a989e6f8f4d1de42
master_replid2:47c74f39e93685c12f13ddc96bca80b76f1ecce1
master_repl_offset:101561
second_repl_offset:18322
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:101561
127.0.0.1:6381> 

From the operations above, we can see that the sentinel gives us automatic failover when the Master goes down, which reduces the downtime of the system.

In a real project, however, this minimal configuration is not enough. In practice you should deploy multiple sentinels, and also configure things such as notifying the relevant people (for example by email) when a fault is detected.
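
As a rough sketch of such a setup: run (for example) three sentinels, each with its own sentinel.conf along the lines below. The quorum of 2 means at least two sentinels must agree that the Master is down, and notification-script points at a script of your own that does the alerting; the path and script name here are assumptions, Redis does not ship one.

port 26379                                          #each sentinel instance needs its own port (26379, 26380, 26381, ...)
sentinel monitor redis6379 127.0.0.1 6379 2         #quorum of 2: at least two sentinels must agree the Master is down
sentinel notification-script redis6379 /usr/local/redis/notify.sh   #your own script, e.g. one that emails the team

With three sentinels and a quorum of 2, losing any single sentinel still leaves enough of them to agree on and carry out a failover.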
