Redis (6) master-slave mode and sentinel mechanism


1. Master-slave mode

Configure a cluster with one master and two slaves

Start three Linux machines and install Redis on each.
Run the info replication command to view the replication information of the current instance.
On each slave, specify the master:
replicaof 192.168.31.238 6379

Restart the Redis service, then check the replication information again on the master and on each slave.

Test: write on the master, read on the slaves
The master can both read and write, but it is mostly used for writes.

The slaves can read but cannot write.

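When the master loses power or crashes, the roles of the slaves do not change by default.
The cluster simply loses the ability to write.
When the master recovers, the slaves reconnect to it and the cluster returns to its original state.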

Master goes down
The slaves can still read data.
The roles of the two slaves have not changed.
By default, no new master appears after the master fails. There are two ways to get a new master:

  • Run slaveof no one on a slave to promote it to master (a temporary, manual fix).
    The other slave still belongs to the previous master.
    If the master goes down, we can run SLAVEOF no one on one slave to make it the master. The other nodes must then be pointed at the new master manually. If the old master is repaired at this point, it has to be reconnected.
  • Sentinel mechanism

2. Sentinel Mechanism

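Sentinel is a working mode of Redis whose main job is to monitor node status and perform failover.
Sentinel discovers nodes and checks for failures at a fixed frequency. When it detects that the master node has failed, it performs failover in a safe way to keep the cluster highly available.
Sentinel is a separate process and runs independently.
It works by sending commands and waiting for the Redis servers to respond, and in this way monitors multiple running Redis instances.

The role of Sentinel: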

  • It monitors the running status of the Redis servers, both the master and the slaves, by sending them commands and waiting for their replies.
  • When Sentinel detects that the master is down, it automatically promotes a slave to master, then notifies the other slaves through publish/subscribe, modifies their configuration files, and has them switch to the new master.

To sum up, Sentinel has three responsibilities: monitoring, master election, and notification.

Sentinel mode demo:

Configure 3 sentinels and a Redis deployment with 1 master and 2 slaves to demonstrate this process.

Service type   Master?   IP address        Port
Redis          yes       192.168.31.238    6379
Redis          no        192.168.31.130    6379
Redis          no        192.168.31.128    6379
Sentinel       -         192.168.31.238    26379
Sentinel       -         192.168.31.130    26379
Sentinel       -         192.168.31.128    26379

The Redis installation directory contains a sentinel.conf file. Make a copy of it and edit the copy.
Check that the copy succeeded: find | grep sentinel.conf
Edit redis.conf

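# Comment out the bind line so the Redis server can be reached from other hosts
# bind 127.0.0.1 -::1
# Turn off protected mode
protected-mode no
# Specify the master. Note: replicaof is set only on the slaves; the master does not need it
replicaof 192.168.31.238 6379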

Edit sentinel.conf

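# Turn off protected mode
protected-mode no
# Run in the background
daemonize yes
# Master to monitor. In "sentinel monitor mymaster 192.168.31.238 6379 2":
#   mymaster        the name given to the master
#   192.168.31.238  the IP address of the monitored master
#   6379            the port of the master
#   2               failover starts only when two or more sentinels consider the master unavailable
sentinel monitor mymaster 192.168.31.238 6379 2
# Sentinel port
port 26379
# If the master has a password set:
# sentinel auth-pass <master-name> <password>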

Start the services

[root@heng bin]# redis-server /etc/redis.conf
[root@heng bin]# redis-sentinel /etc/sentinel.conf

Check that both services started successfully.
Pay attention to the startup order: first start the Redis process on the master (192.168.31.238), then the Redis processes on the slaves, and finally the 3 Sentinel processes.
View the master information, then the slave information.
Simulate a master crash: redis-cli shutdown
You can see that the slave 192.168.31.128 has become the master.

Restart 192.168.31.238 (the original master) and you will find that it has become a slave.

How Sentinel monitors nodes

Each sentinel periodically sends PING commands to all master and slave nodes. When a node receives the PING command, it sends a reply to the sentinel, which uses the reply to judge whether the node is running normally.

"Subjective offline" and "objective offline"

If a master or slave node does not respond to a sentinel's PING command within the specified time, that sentinel marks it as "subjectively offline". The specified time is set by the down-after-milliseconds configuration option, in milliseconds.
The master node may not actually be faulty: it may simply be under heavy load, or the network may be congested, so that it fails to respond to the sentinel's PING command within the specified time.
To reduce such misjudgments, Sentinel is not deployed as a single node but as a cluster of several sentinels (at least three machines are needed for a Sentinel cluster), which make the judgment together. This prevents a single sentinel with a poor network connection from wrongly deciding that the master is offline. The probability that the networks of several sentinels are unstable at the same time is small, so deciding together lowers the misjudgment rate.

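 down-after-milliseconds
 The number of milliseconds a monitored Redis instance may go without answering before a single sentinel considers it subjectively offline. The default is 30000 (30 seconds).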
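
The subjective-offline rule can be pictured as a simple timeout check. The following is only a minimal sketch of that idea, not Sentinel's actual implementation; the function name and the millisecond timestamps are hypothetical.

```python
# Default value of down-after-milliseconds mentioned in the text.
DOWN_AFTER_MS = 30000


def is_subjectively_down(last_reply_ms, now_ms, down_after_ms=DOWN_AFTER_MS):
    """Return True when the node has not replied for longer than down_after_ms.

    last_reply_ms and now_ms are hypothetical timestamps in milliseconds.
    """
    return (now_ms - last_reply_ms) > down_after_ms


# A node that last replied 31 seconds ago is considered subjectively offline.
print(is_subjectively_down(last_reply_ms=0, now_ms=31000))
```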

When a sentinel determines that the master node is "subjectively offline", it sends the following command to the other sentinels:

SENTINEL is-master-down-by-addr <ip> <port> <current-epoch> <runid>

After the other sentinels receive this command, each responds with an approving or rejecting vote, based on the state of its own network connection to the master node.

When the number of approving votes collected by this sentinel reaches the quorum value set in the sentinel configuration file, the sentinel marks the master node as "objectively offline".
Here quorum is 2, so a sentinel needs 2 approving votes to mark the master node as "objectively offline". These 2 votes are the sentinel's own vote plus the approving vote of one other sentinel.
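
The vote counting described above can be sketched as follows. This is an illustrative model only, assuming votes are plain booleans; the function and parameter names are hypothetical and are not part of Redis.

```python
def is_objectively_down(own_vote, other_votes, quorum):
    """Count the sentinel's own vote plus approvals from the other sentinels.

    own_vote: True if this sentinel considers the master subjectively offline.
    other_votes: replies from the other sentinels (True = approve).
    """
    approvals = int(own_vote) + sum(1 for vote in other_votes if vote)
    return approvals >= quorum


# quorum = 2: the sentinel's own vote plus one other approval is enough.
print(is_objectively_down(True, [True, False], quorum=2))
```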

After the Sentinel determines that the master node is objectively offline, the Sentinel will begin to select a slave node from multiple "slave nodes" to be the new master node.

The two states "subjective offline" and "objective offline" exist for the master node because the master may not actually be faulty. It may simply be under heavy load, or the network may be congested, so that it fails to respond to the sentinel's PING command within the specified time.

How does Sentinel elect a new master node

  1. Filter out slave nodes with bad network status. First, filter out the slave nodes that have gone offline, and then filter out the slave nodes with poor network connection status in the past.

  2. Sort by slave priority. The smaller the priority value, the higher the ranking.
    You can set a node's priority with the replica-priority option in redis.conf.

  3. If the priorities are equal, select the slave node with higher replication progress.

  4. If the priority and replication progress of two slave nodes are the same, compare their ID numbers; the slave node with the smaller ID wins.

After a slave node is chosen, the sentinel sends it the SLAVEOF no one command so that it becomes the new master node.
After sending SLAVEOF no one, the Sentinel Leader sends an INFO command to the promoted node once per second (before failover, INFO is sent once every 10 seconds) and watches the role field in the reply. When the role changes from slave to master, the Sentinel Leader knows that the chosen slave has been promoted successfully.
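
The four selection steps above can be sketched as a filter followed by a sort. This is a simplified model under stated assumptions, not Sentinel's real code: the Replica class and its field names are hypothetical, and IDs are compared as plain strings.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    run_id: str                # hypothetical ID, compared as a string
    priority: int              # smaller value ranks higher
    repl_offset: int           # larger value means more data replicated
    online: bool = True
    healthy_link: bool = True  # False if the link was poor in the past


def pick_new_master(replicas):
    """Apply the steps in order: filter, priority, offset, ID."""
    candidates = [r for r in replicas if r.online and r.healthy_link]
    if not candidates:
        return None
    # Lowest priority value first, then highest offset, then smallest ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


replicas = [
    Replica("id-b", priority=100, repl_offset=500),
    Replica("id-a", priority=100, repl_offset=900),
    Replica("id-c", priority=50, repl_offset=10, online=False),
]
print(pick_new_master(replicas).run_id)
```

Here id-c is dropped because it is offline, and id-a wins over id-b on replication progress since their priorities are equal.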

Which sentinel performs the failover

A leader must be elected within the Sentinel cluster, and the leader performs the master-slave switch.
Electing the Leader is a voting process. Before voting begins there must be a candidate: whichever sentinel determines that the master node is "objectively offline" becomes a candidate, that is, a sentinel that wants to be the Leader.
The candidate sends a command to the other sentinels stating that it wants to become the Leader and perform the master-slave switch, and asks them all to vote.
Each sentinel has only one vote; once it is used, the sentinel cannot vote again. A vote can go to the sentinel itself or to another sentinel, but only a candidate can vote for itself.
To be elected, a candidate must meet two conditions:

  • First, it receives more than half of the votes;
  • Second, the number of votes it receives is greater than or equal to the quorum value in the sentinel configuration file.

Each candidate first votes for itself and then requests votes from the other sentinels. If a voter receives the request from "Candidate A" first, it votes for A. Once it has used its vote, it rejects a later request from "Candidate B". If candidate A is the first to meet both conditions above, it is elected Leader.

If the Sentinel cluster has only 2 sentinel nodes, a sentinel needs 2 votes, not 1, to become the leader, because it must receive more than half of the votes.
So if one of the two sentinels dies, the remaining sentinel can collect at most 1 vote, cannot become the leader, and the master-slave switch cannot be performed.
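
The voting rules above, including the 2-sentinel case, can be modelled roughly as follows. This is a simplified sketch under stated assumptions, not the real election protocol: the function name, the sentinel names, and the first_request_seen mapping are all hypothetical.

```python
from collections import Counter


def elect_leader(total_sentinels, quorum, candidates, first_request_seen):
    """Return the elected leader, or None if no candidate qualifies.

    candidates: sentinels that marked the master objectively offline;
        each one votes for itself.
    first_request_seen: maps every other reachable sentinel to the
        candidate whose vote request reached it first.
    """
    votes = Counter(set(candidates))  # one self-vote per candidate
    for voter, candidate in first_request_seen.items():
        if voter in votes or candidate not in votes:
            continue  # candidates already used their vote on themselves
        votes[candidate] += 1
    for candidate, count in votes.items():
        # Condition 1: more than half of all sentinels; condition 2: >= quorum.
        if count > total_sentinels / 2 and count >= quorum:
            return candidate
    return None


# 3 sentinels: C hears from A first, so A collects 2 of 3 votes.
print(elect_leader(3, 2, ["A", "B"], {"C": "A"}))
# 2 sentinels, one dead: the survivor has only its own vote.
print(elect_leader(2, 2, ["A"], {}))
```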

Note: It is recommended to set quorum to half the number of sentinels (rounded down) plus 1. For example, set it to 2 for 3 sentinels and to 3 for 5 sentinels. The number of sentinel nodes should be odd.
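
As a quick check of that arithmetic, the recommendation can be written as integer division. The function name below is hypothetical.

```python
def recommended_quorum(sentinel_count):
    """Half the number of sentinels, rounded down, plus 1."""
    return sentinel_count // 2 + 1


for count in (3, 5):
    print(count, recommended_quorum(count))
```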

How to notify the client of the new master node information?

It is implemented through the publish/subscribe mechanism of Redis.
When the sentinel has selected the new master node, the client sees the following switch-master event. This event indicates that the master node has been switched, and it carries the IP address and port of the new master node. The client can then communicate using the new master's address and port.

switch-master <master-name> <master-old-ip> <master-old-port> <master-new-ip> <master-new-port>
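
A client that receives this event needs to pull the new address out of it. The sketch below assumes the message payload consists of just the five space-separated fields shown above, without the leading event name; the class and function names are hypothetical.

```python
from typing import NamedTuple


class SwitchMaster(NamedTuple):
    master_name: str
    old_ip: str
    old_port: int
    new_ip: str
    new_port: int


def parse_switch_master(payload):
    """Split a payload of the assumed form
    '<master-name> <old-ip> <old-port> <new-ip> <new-port>'."""
    name, old_ip, old_port, new_ip, new_port = payload.split()
    return SwitchMaster(name, old_ip, int(old_port), new_ip, int(new_port))


event = parse_switch_master("mymaster 192.168.31.238 6379 192.168.31.128 6379")
print(event.new_ip, event.new_port)
```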

Check the log: you can see that the address of the master has changed.

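How is the Sentinel cluster formed?

In a master-slave cluster, the master node has a channel named __sentinel__:hello. The sentinels use this channel to discover each other and communicate.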

[Image: master node]
[Image: slave node]
Sentinel A publishes its IP address and port information to the __sentinel__:hello channel, and Sentinels B and C subscribe to this channel. At this time, Sentinels B and C can directly obtain the IP address and port number of Sentinel A from this channel.

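The Sentinel cluster also monitors the running status of the slave nodes. How does it learn about them?
The master node knows about all of its slave nodes, so the sentinels send the INFO command to the master node to obtain information about every slave node.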
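
To illustrate how the slave list can be read from the master's reply, here is a minimal parsing sketch. It assumes the replication section lists each slave on a line such as slave0:ip=...,port=...,state=...; the sample text uses made-up offset and lag values, and the function name is hypothetical.

```python
# Illustrative sample of a master's replication info (values are made up).
SAMPLE_INFO = """# Replication
role:master
connected_slaves:2
slave0:ip=192.168.31.130,port=6379,state=online,offset=1234,lag=0
slave1:ip=192.168.31.128,port=6379,state=online,offset=1234,lag=1
"""


def parse_slaves(info_text):
    """Return (ip, port, state) for every slaveN line in the text."""
    slaves = []
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Keep only keys of the form slave0, slave1, ...
        if not sep or not key.startswith("slave") or not key[5:].isdigit():
            continue
        fields = dict(item.split("=", 1) for item in value.split(","))
        slaves.append((fields["ip"], int(fields["port"]), fields.get("state")))
    return slaves


for slave in parse_slaves(SAMPLE_INFO):
    print(slave)
```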

Origin blog.csdn.net/qq_45637894/article/details/130905482