Redis (6) - High availability with Sentinel: configuring and starting sentinels, and recovering from master and slave failures

First, high availability with master-slave replication

# Problems with master-slave replication:
1 When the master node fails, a failover is needed. It can be done manually by promoting one of the slaves to master.
2 Only the master accepts writes, so write capacity and storage capacity are limited to a single node.
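
The manual failover in point 1 can be sketched roughly as follows. This is a minimal illustration, not the article's own code: manual_failover is a hypothetical helper, and it assumes client objects that expose a slaveof(host=None, port=None) method, as redis-py's redis.Redis does (calling it with no arguments promotes the node to master).

```python
def manual_failover(new_master, new_master_addr, other_slaves):
    """Promote one slave to master and repoint the remaining slaves at it.

    new_master and other_slaves are client objects with a
    slaveof(host=None, port=None) method (assumed: redis-py's redis.Redis).
    new_master_addr is the (host, port) the other slaves should follow.
    """
    host, port = new_master_addr
    # Equivalent to SLAVEOF NO ONE: the chosen slave stops replicating.
    new_master.slaveof()
    # Every remaining slave now replicates from the promoted node.
    for slave in other_slaves:
        slave.slaveof(host, port)


# Hypothetical usage with redis-py (needs running servers on these ports):
#   import redis
#   manual_failover(redis.Redis(port=6380), ("127.0.0.1", 6380),
#                   [redis.Redis(port=6381)])
```

Doing this by hand for every failure is exactly the work that Sentinel automates.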

 

 

Redis Sentinel is a system that monitors running Redis instances. It runs as a separate, independent process and has two functions:
  • It monitors the master and slave servers by sending them commands and checking the status they return.

  • When a sentinel detects that the master is down, it automatically promotes a slave to master, then uses publish/subscribe to notify the other slaves so that their configuration is updated and they switch to the new master.

 

Second, the architecture description

Sentinel performs fault detection, failover, and client notification (each sentinel is simply a process). Clients connect to the sentinel addresses directly and learn the current master from them.

 

 

Process

1 Multiple sentinels detect and confirm that the master has a problem

2 The sentinels elect one of themselves as leader

3 The leader selects one slave to become the new master

4 The remaining slaves are told to become slaves of the new master

5 Clients are notified of the master change

6 When the old master comes back, it becomes a slave of the new master
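
Steps 1 and 2 can be illustrated with a small pure-Python simulation. It is only a sketch of the decision rules, not Sentinel's actual implementation: the function names are made up for this example, the default values mirror the configuration used later in this article (30000 ms, quorum 2), and the election rule (a majority of all sentinels, and at least quorum votes) is stated here as an assumption.

```python
def is_subjectively_down(ms_since_last_reply, down_after_ms=30000):
    """One sentinel's own view: no valid reply for longer than down_after_ms."""
    return ms_since_last_reply > down_after_ms


def is_objectively_down(sdown_votes, quorum=2):
    """The master is objectively down once at least quorum sentinels see it down."""
    return sum(1 for vote in sdown_votes if vote) >= quorum


def elect_leader(votes, total_sentinels, quorum=2):
    """Return the winning sentinel ID, or None if nobody has enough votes.

    votes maps each voting sentinel to the candidate it voted for.
    Assumption: the winner needs a majority of all sentinels and at
    least quorum votes.
    """
    tally = {}
    for candidate in votes.values():
        tally[candidate] = tally.get(candidate, 0) + 1
    needed = max(total_sentinels // 2 + 1, quorum)
    for candidate, count in tally.items():
        if count >= needed:
            return candidate
    return None


if __name__ == "__main__":
    # Two of three sentinels have not heard from the master for over 30 s.
    views = [is_subjectively_down(ms) for ms in (31000, 45000, 1200)]
    print(is_objectively_down(views))
    print(elect_leader({"s1": "s3", "s2": "s3", "s3": "s3"}, total_sentinels=3))
```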

Third, configure Sentinel

Usually several sentinels are deployed. Besides monitoring each Redis server, the sentinels also monitor one another.

1. Environment Configuration

Role                 Host IP      Redis port   Sentinel port
master (primary)     127.0.0.1    6379         26379
slave (replica)      127.0.0.1    6380         26380
slave (replica)      127.0.0.1    6381         26381

Redis ships with a default Sentinel configuration file, sentinel.conf.

2. Create a custom sentinel file

In the Redis server folder, create a configuration file named redis6379_sentinel.conf:

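# Port this Sentinel listens on; every Sentinel file needs a different port
port 26379
daemonize no
dir /root/data
protected-mode no
bind 0.0.0.0
logfile "redis6379_sentinel.log"
# sentinel monitor <alias> <ip> <port> <quorum>: mymaster is the alias given to the master, 127.0.0.1 and 6379 are the master's IP and port, and 2 means a failover starts only when at least two sentinels consider the master unavailable
sentinel monitor mymaster 127.0.0.1 6379 2
# How long a master must be unreachable before this sentinel subjectively marks it as down, in milliseconds (default 30 seconds)
sentinel down-after-milliseconds mymaster 30000
# Maximum number of slaves that may resync with the new master at the same time during a failover; a smaller number makes the failover take longer but keeps more slaves available while it runs
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
# failover-timeout is used in the following ways:
# 1. The interval between two failover attempts by the same sentinel against the same master.
# 2. The time counted from when a slave starts replicating from the wrong master until it is corrected to replicate from the right one.
# 3. The time needed to cancel a failover that is already in progress.
# 4. The maximum time allowed for reconfiguring all slaves to point at the new master during a failover. Even after this timeout the slaves are still reconfigured to point at the master, but no longer according to the parallel-syncs rule.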

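Three sentinels are needed, so create three sentinel configuration files.

redis6380_sentinel.conf

Note: the port must be changed to 26380.

In the same way, make another copy named redis6381_sentinel.conf. That completes the configuration of the three sentinels.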
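
Since the three files differ only in a couple of values, they can also be generated from one template. The sketch below is one possible way to do it, not part of the original setup: render_sentinel_configs is a hypothetical helper, and it assumes that each sentinel port is the Redis port plus 20000 (as in the environment table) and that each sentinel writes to its own log file named after the Redis port.

```python
SENTINEL_TEMPLATE = """\
port {sentinel_port}
daemonize no
dir /root/data
protected-mode no
bind 0.0.0.0
logfile "redis{redis_port}_sentinel.log"
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
"""


def render_sentinel_configs(redis_ports=(6379, 6380, 6381)):
    """Return {filename: config text}, one entry per sentinel."""
    configs = {}
    for redis_port in redis_ports:
        # Assumption: sentinel port = Redis port + 20000 (26379, 26380, 26381).
        sentinel_port = redis_port + 20000
        filename = f"redis{redis_port}_sentinel.conf"
        configs[filename] = SENTINEL_TEMPLATE.format(
            sentinel_port=sentinel_port, redis_port=redis_port
        )
    return configs


if __name__ == "__main__":
    # Print the files instead of writing them, so nothing is overwritten.
    for filename, text in render_sentinel_configs().items():
        print(f"--- {filename} ---")
        print(text)
```

All three files monitor the same master (127.0.0.1:6379); only the sentinel's own port and log file change.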

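Fourth, start the sentinels

1. First start the Redis master and slave servers: redis-server redis.conf

2. Then start the three sentinels, once per configuration file: redis-sentinel sentinel.conf (for example, redis-sentinel redis6379_sentinel.conf)

3. View the sentinel information

After startup, redis6381_sentinel.conf contains some content that Sentinel has added.

If a failure occurs, the sentinel configuration file is updated automatically to match.

Connect with the client using redis-cli -p 26379, then enter info. Part of the output: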

# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
# The master is mymaster at 127.0.0.1:6379, with 2 slaves and 4 sentinels
master0:name=mymaster,status=ok,address=127.0.0.1:6379,slaves=2,sentinels=4

This means the configuration succeeded.
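
For scripted checks, the master0 line of the info output can be split into fields. The following is a small sketch using only the standard library; parse_sentinel_master_line is a name invented for this example, and it assumes the line has exactly the shape shown in the output above.

```python
def parse_sentinel_master_line(line):
    """Parse 'master0:name=...,status=...,address=host:port,slaves=N,sentinels=N'."""
    # Split off the 'master0' key at the first colon only.
    key, _, fields = line.strip().partition(":")
    info = {"key": key}
    for item in fields.split(","):
        name, _, value = item.partition("=")
        info[name] = value
    # Convert the address and the counters to convenient Python types.
    host, _, port = info["address"].rpartition(":")
    info["address"] = (host, int(port))
    info["slaves"] = int(info["slaves"])
    info["sentinels"] = int(info["sentinels"])
    return info


if __name__ == "__main__":
    line = "master0:name=mymaster,status=ok,address=127.0.0.1:6379,slaves=2,sentinels=4"
    print(parse_sentinel_master_line(line))
```

A check such as info["status"] == "ok" and info["slaves"] == 2 is then enough to confirm that the sentinel sees the expected topology.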

Fifth, using Sentinel mode in Python

import redis
from redis.sentinel import Sentinel

# Connect to the sentinels (a domain name also works as the host).
# 101.132.167.242 is the IP address of the remote server.
sentinel = Sentinel([('101.132.167.242', 26379),
                     ('101.132.167.242', 26380),
                     ('101.132.167.242', 26381)],
                    socket_timeout=5)

print(sentinel)    # Sentinel<sentinels=[101.132.167.242:26379,101.132.167.242:26380,101.132.167.242:26381]>

# Get the master address
master = sentinel.discover_master('mymaster')
print(master)      # ('127.0.0.1', 6379)

# Get the slave addresses
slave = sentinel.discover_slaves('mymaster')
print(slave)       # [('127.0.0.1', 6380), ('127.0.0.1', 6381)]

# Get a client for the master and write a value through it
master = sentinel.master_for('mymaster', socket_timeout=0.5)
w_ret = master.set('foo', 'bar')

# Get a client for a slave and read the value back from it
slave = sentinel.slave_for('mymaster', socket_timeout=0.5)
r_ret = slave.get('foo')
print(r_ret)

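If an error appears at this point, the sentinel ports may not have been opened in Alibaba Cloud.

Open the sentinel ports (26379, 26380, 26381) in Alibaba Cloud.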
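
Before changing any code, it can help to confirm that the sentinel ports are reachable from the client machine at all. The sketch below uses only the standard library; port_is_open and check_sentinel_ports are hypothetical helper names, and the default port list is the one used in this article.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def check_sentinel_ports(host, ports=(26379, 26380, 26381), timeout=3.0):
    """Map each sentinel port to whether it is reachable from this machine."""
    return {port: port_is_open(host, port, timeout) for port in ports}


# Hypothetical usage against the remote server from this article:
#   print(check_sentinel_ports("101.132.167.242"))
```

A port that reports False here is blocked or not listening, which points to the cloud firewall or the sentinel process rather than to the Python code.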

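A timeout error may appear because the timeout set in the code is too short (for example, the socket_timeout=0.5 passed to master_for and slave_for).

Setting a longer timeout fixes it.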

 

 

Sixth, master failover

1. Shut down the Redis master first, wait a moment, then look at the sentinel log.

2. Check the log of the sentinel on 26379

 

 

 

15637:X 12 Jan 2020 21:44:38.983 # +sdown master mymaster 127.0.0.1 6379    # this sentinel finds the master unreachable (subjectively down)
15637:X 12 Jan 2020 21:44:39.063 # +new-epoch 1
15637:X 12 Jan 2020 21:44:39.065 # +vote-for-leader c8391221c81f7c2b0c0ed04bb7de6ae84a8f7afd 1   # vote for which sentinel becomes leader
15637:X 12 Jan 2020 21:44:39.066 # +odown master mymaster 127.0.0.1 6379 #quorum 3/2   # objectively down: 3 sentinels agree, and the quorum is 2
15637:X 12 Jan 2020 21:44:39.066 # Next failover delay: I will not start a failover before Sun Jan 12 21:50:40 2020
15637:X 12 Jan 2020 21:44:40.066 # +config-update-from sentinel c8391221c81f7c2b0c0ed04bb7de6ae84a8f7afd 127.0.0.1 26381 @ mymaster 127.0.0.1 6379   # the sentinel on 26381 is the leader and updates the configuration; 6381 is promoted to master
15637:X 12 Jan 2020 21:44:40.066 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381   # the master changes from 6379 to 6381
15637:X 12 Jan 2020 21:44:40.066 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381   # 6380 is added as a slave of 6381
15637:X 12 Jan 2020 21:44:40.066 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381   # 6379 is added as a slave of 6381
15637:X 12 Jan 2020 21:45:10.088 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381   # 6379 is still down; wait for it to recover
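
Log lines in this format are regular enough to be read by a script. The sketch below extracts the event names and finds the +switch-master event; the regular expression and the function names are written for this example and assume the line layout shown in the log above (pid:role, date, time, a # or * mark, then a +event or -event).

```python
import re

# Layout assumed from the log above: pid:role date time mark event details
LOG_LINE = re.compile(
    r"^(?P<pid>\d+):(?P<role>\w) "
    r"(?P<timestamp>\d+ \w+ \d+ [\d:.]+) "
    r"(?P<mark>[#*]) (?P<event>[+-][\w-]+) ?(?P<details>.*)$"
)


def parse_sentinel_events(lines):
    """Yield (timestamp, event, details) for lines that carry a +/- event."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            yield match["timestamp"], match["event"], match["details"]


def find_switch_master(lines):
    """Return ((old_ip, old_port), (new_ip, new_port)) or None if no switch happened."""
    for _, event, details in parse_sentinel_events(lines):
        if event == "+switch-master":
            # details: <master name> <old ip> <old port> <new ip> <new port>
            _name, old_ip, old_port, new_ip, new_port = details.split()[:5]
            return (old_ip, int(old_port)), (new_ip, int(new_port))
    return None


if __name__ == "__main__":
    sample = [
        "15637:X 12 Jan 2020 21:44:38.983 # +sdown master mymaster 127.0.0.1 6379",
        "15637:X 12 Jan 2020 21:44:40.066 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6381",
    ]
    print(find_switch_master(sample))
```

Lines without a +event or -event token, such as the "Next failover delay" message, are skipped.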

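3. Connect with a client to check that the master has moved

Connect a client to 6381 and enter info replication

The output shows that 6381 is now the master and has one slave, 6380.

Connect a client to 6380 and enter info replication

The output shows that 6380 is a slave and its master is 6381.

4. Restart 6379 and check its status

Start 6379:

Connect a client to 6381 and check the status:

6379 has been set as a slave of 6381.

Check the sentinel log for 6379 (redis6379_sentinel.log):

15637:X 12 Jan 2020 23:47:40.287 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381   # 6379 is back in service
15637:X 12 Jan 2020 23:47:50.234 * +convert-to-slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381   # 6379 is set as a slave of 6381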

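Seventh, slave failure and recovery

Shut down the slave on 6380.

Check the sentinel log for 6380.

The log shows that the sentinel has detected that slave 6380 is down. If the service on port 6380 is restored, will it rejoin replication automatically?

A client connected to 6381 also shows that 6380 is down: the number of slaves drops to 1.

Restart the slave on 6380 and check the sentinel log for 6380 again.

The log shows that slave 6380 has rejoined replication; -sdown means the service has recovered.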
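
The pairing of +sdown and -sdown can be replayed in code to see which instances are still down at the end of a log. This is a rough sketch, not a Sentinel tool: instances_down is a hypothetical helper, and it assumes each event is followed by the instance type and name, as in the 6379 log lines earlier in this article.

```python
def instances_down(log_lines):
    """Replay +sdown / -sdown events and return the instances still marked down."""
    down = set()
    for line in log_lines:
        parts = line.split()
        for i, token in enumerate(parts):
            if token in ("+sdown", "-sdown") and i + 2 < len(parts):
                # Layout: <event> <type> <name> ..., e.g. "+sdown slave 127.0.0.1:6379"
                instance = (parts[i + 1], parts[i + 2])
                if token == "+sdown":
                    down.add(instance)
                else:
                    # -sdown: the instance answered again, so it has recovered.
                    down.discard(instance)
                break
    return down


if __name__ == "__main__":
    log = [
        "15637:X 12 Jan 2020 21:45:10.088 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381",
        "15637:X 12 Jan 2020 23:47:40.287 # -sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6381",
    ]
    print(instances_down(log[:1]))
    print(instances_down(log))
```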


Origin www.cnblogs.com/wangcuican/p/12185385.html