Redis Sentinel Introduction
Redis Sentinel is the official high-availability solution for Redis.
Redis Sentinel provides high availability for Redis. In practice, this means that with Sentinel you can create a Redis deployment that withstands certain kinds of failures without human intervention.
Redis Sentinel also handles other ancillary tasks such as monitoring and notification, and acts as a configuration provider for clients.
This is the full list of Sentinel capabilities at a macro level (i.e., the big picture):
- Monitoring. Sentinel constantly checks whether the master and slave instances are working as expected.
- Notification. Sentinel can notify the system administrator, or another computer program, via an API, that something is wrong with one of the monitored Redis instances.
- Automatic failover. If the master is not working as expected, Sentinel can start a failover process in which a slave is promoted to master, the other slaves are reconfigured to use the new master, and applications using the Redis server are informed of the new address to use when connecting.
- Configuration provider. Sentinel acts as a source of authority for client service discovery: clients connect to Sentinels to ask for the address of the current Redis master responsible for a given service. If a failover occurs, Sentinels report the new address.
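To make the configuration-provider role concrete, here is a minimal sketch of the exchange a client performs with a Sentinel: it sends `SENTINEL get-master-addr-by-name <master-name>` and reads back a two-element reply (IP, port). The class `SentinelQuerySketch` and its methods are hypothetical helpers written for this illustration using only the Java standard library; they assume an unauthenticated Sentinel and do no error recovery. In practice a client library normally performs this lookup for you.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any Redis client library.
final class SentinelQuerySketch {

    // Encode a command as a RESP array of bulk strings.
    static String encode(String... parts) {
        StringBuilder sb = new StringBuilder("*").append(parts.length).append("\r\n");
        for (String part : parts) {
            int byteLength = part.getBytes(StandardCharsets.UTF_8).length;
            sb.append('$').append(byteLength).append("\r\n").append(part).append("\r\n");
        }
        return sb.toString();
    }

    // Read a two-element array reply (ip, port) and return "ip:port".
    // Returns null for any other reply, e.g. a nil array for an unknown master name.
    static String parseMasterAddr(BufferedReader in) {
        try {
            String header = in.readLine();
            if (!"*2".equals(header)) {
                return null;
            }
            in.readLine();               // "$<len>" header of the ip
            String ip = in.readLine();
            in.readLine();               // "$<len>" header of the port
            String port = in.readLine();
            return ip + ":" + port;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Ask one Sentinel for the current master address (requires a running Sentinel).
    static String queryMaster(String host, int port, String masterName) {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            String command = encode("SENTINEL", "get-master-addr-by-name", masterName);
            out.write(command.getBytes(StandardCharsets.UTF_8));
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            return parseMasterAddr(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Offline demonstration with a canned reply; no server is contacted here.
        String cannedReply = "*2\r\n$9\r\n127.0.0.1\r\n$5\r\n17008\r\n";
        System.out.println(parseMasterAddr(new BufferedReader(new StringReader(cannedReply))));
    }
}
```

Against a live deployment, `SentinelQuerySketch.queryMaster("127.0.0.1", 17107, "mymaster")` would be the call to try; a real client would also iterate over all known Sentinels until one answers.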
Building Redis Sentinel
In this example everything is built on a single machine; in a real deployment, the nodes can be spread across three machines.
Service type | Role | IP address | Port |
---|---|---|---|
Redis | master | 127.0.0.1 | 17007 |
Redis | slave | 127.0.0.1 | 17008 |
Redis | slave | 127.0.0.1 | 17009 |
Sentinel | - | 127.0.0.1 | 17107 |
Sentinel | - | 127.0.0.1 | 17108 |
Sentinel | - | 127.0.0.1 | 17109 |
1. Set up the three Redis nodes in master-slave mode; for reference, see: [Redis] Building Redis master-slave mode
2. Set up the Sentinel nodes. The configuration file for node 17107 is as follows:
    # Comment out bind; for access from external networks, set protected-mode to no
    protected-mode no

    # Port
    port 17107

    # Run in the background
    daemonize yes

    # PID file
    pidfile sentinel_17107.pid

    # Log file
    logfile "/data/log/redis-sentinel-log/sentinel-17107-log/sentinel-17107.log"

    # Working directory
    dir /data/soft/redis-sentinel/sentinel-17107/

    # sentinel monitor <master-name> <ip> <redis-port> <quorum>
    # Configure the master that this Sentinel monitors.
    # The monitored master is named mymaster and lives at 127.0.0.1:17007.
    # In a Sentinel cluster, several Sentinels must talk to each other to confirm that a master is really down.
    # The number 2 means the master is only considered truly unavailable once 2 Sentinels in the cluster consider it down.
    sentinel monitor mymaster 127.0.0.1 17007 2

    # sentinel auth-pass <master-name> <password>
    # auth-pass sets the password for the service: mymaster is the service name, 123456 is the Redis server password
    sentinel auth-pass mymaster 123456

    # sentinel down-after-milliseconds <master-name> <milliseconds>
    # Sentinel sends heartbeat PINGs to the master to check that it is alive.
    # If the master does not reply with PONG, or replies with an error, within a given time window,
    # this Sentinel subjectively considers the master unavailable (SDOWN).
    # down-after-milliseconds specifies that time window, in milliseconds.
    sentinel down-after-milliseconds mymaster 30000

    # sentinel parallel-syncs <master-name> <numreplicas>
    # During a failover, this option sets the maximum number of slaves that may sync with the new master at the same time.
    # The smaller the number, the longer the failover takes to complete;
    # the larger the number, the more slaves are unavailable because of replication.
    # Setting it to 1 guarantees that only one slave at a time is unable to serve command requests.
    sentinel parallel-syncs mymaster 1

    # sentinel failover-timeout <master-name> <milliseconds>
    # Maximum time allowed for a master-slave switch, i.e. for completing a failover.
    # If Sentinel cannot complete the failover within this time, the failover is considered failed.
    sentinel failover-timeout mymaster 180000

    # Alert script to call when Sentinel detects a problem with the master instance identified by master-name.
    # sentinel notification-script mymaster <script-path>

    # Security
    # Prevent scripts from being reconfigured; the default is yes.
    # By default, SENTINEL SET cannot change notification-script and client-reconfig-script at runtime.
    # This avoids a trivial security issue where a client could set the script to anything and trigger a failover to execute a program.
    sentinel deny-scripts-reconfig yes
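One point about the `quorum` argument of `sentinel monitor` is easy to miss: quorum only decides when the master is flagged as objectively down. To actually carry out the failover, the Sentinel leading it must also be authorized by a majority of all known Sentinels. The following is a small sketch of that arithmetic only, with hypothetical class and method names; it is an illustration, not Sentinel's actual implementation.

```java
// Illustrative arithmetic only; names are hypothetical and this is not Sentinel's real code.
final class QuorumSketch {

    // The master becomes ODOWN once at least `quorum` Sentinels report it as SDOWN.
    static boolean isObjectivelyDown(int sdownReports, int quorum) {
        return sdownReports >= quorum;
    }

    // Strict majority of all known Sentinels.
    static int majority(int totalSentinels) {
        return totalSentinels / 2 + 1;
    }

    // Votes the would-be leader needs: a majority, and never fewer than the quorum.
    static int votesNeeded(int totalSentinels, int quorum) {
        return Math.max(majority(totalSentinels), quorum);
    }

    public static void main(String[] args) {
        int totalSentinels = 3; // the three Sentinels of this example
        int quorum = 2;         // from: sentinel monitor mymaster 127.0.0.1 17007 2
        System.out.println("ODOWN with 2 reports: " + isObjectivelyDown(2, quorum));
        System.out.println("Votes needed to fail over: " + votesNeeded(totalSentinels, quorum));
    }
}
```

With three Sentinels and a quorum of 2, as configured here, both thresholds coincide at 2, so the deployment tolerates the loss of one Sentinel.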
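3. Start Redis Sentinel
Mind the startup order: first start the Redis server process of the master node, then the server processes of the slaves, and finally the three Sentinel processes.
Ways to start Sentinel
Method 1: redis-sentinel redis-sentinel.conf
Method 2: redis-server sentinel.conf --sentinel
This example uses a script (start-all.sh) to start everything; its contents are as follows: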
    #!/bin/bash

    # Start Redis-Server
    echo "Start Redis-Server ..."

    cd /data/soft/redis-sentinel || exit 1
    redis-5.0.5/src/redis-server redis-17007/redis-17007.conf

    # sleep 1   sleeps for 1 second
    # sleep 1s  sleeps for 1 second
    # sleep 1m  sleeps for 1 minute
    # sleep 1h  sleeps for 1 hour
    sleep 3

    redis-5.0.5/src/redis-server redis-17008/redis-17008.conf
    redis-5.0.5/src/redis-server redis-17009/redis-17009.conf

    # Start Redis-Sentinel
    echo "Start Redis-Sentinel ..."

    redis-5.0.5/src/redis-sentinel sentinel-17107/sentinel-17107.conf
    redis-5.0.5/src/redis-sentinel sentinel-17108/sentinel-17108.conf
    redis-5.0.5/src/redis-sentinel sentinel-17109/sentinel-17109.conf
Shutdown script (stop-all.sh):
    #!/bin/bash

    # Stop Redis-Sentinel
    echo "Shutdown Redis-Sentinel ..."

    cd /data/soft/redis-sentinel || exit 1

    redis-5.0.5/src/redis-cli -h 127.0.0.1 -p 17107 shutdown
    redis-5.0.5/src/redis-cli -h 127.0.0.1 -p 17108 shutdown
    redis-5.0.5/src/redis-cli -h 127.0.0.1 -p 17109 shutdown

    # Stop Redis-Server
    echo "Shutdown Redis-Server ..."

    redis-5.0.5/src/redis-cli -h 127.0.0.1 -p 17007 -a 123456 shutdown
    redis-5.0.5/src/redis-cli -h 127.0.0.1 -p 17008 -a 123456 shutdown
    redis-5.0.5/src/redis-cli -h 127.0.0.1 -p 17009 -a 123456 shutdown
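4. Test by connecting to Redis with a client and running some commands, using: redis-5.0.5/src/redis-cli -h 127.0.0.1 -p 17007 -a 123456
Connecting to Redis Sentinel from Java
Connect to Redis using Jedis; the test class is as follows: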
    package com.test.jedis;

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    import org.junit.Test;

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.JedisPoolConfig;
    import redis.clients.jedis.JedisSentinelPool;

    public class TestSentinels {

        @Test
        public void testSentinel() {
            JedisPoolConfig jedisPoolConfig = new JedisPoolConfig();
            jedisPoolConfig.setMaxTotal(10);
            jedisPoolConfig.setMaxIdle(5);
            jedisPoolConfig.setMinIdle(5);
            // Sentinel addresses
            Set<String> sentinels = new HashSet<String>(
                    Arrays.asList("127.0.0.1:17107", "127.0.0.1:17108", "127.0.0.1:17109"));
            // Create the connection pool, passing in the pool configuration built above,
            // then borrow a client; try-with-resources closes both when the test ends
            try (JedisSentinelPool pool = new JedisSentinelPool("mymaster", sentinels, jedisPoolConfig, "123456");
                    Jedis jedis = pool.getResource()) {
                // Run two commands
                jedis.set("mykey", "myvalue");
                String value = jedis.get("mykey");
                System.out.println(value);
            }
        }
    }
Testing failover
1. Simulate a failure by shutting down the Redis master node 17007, with the command: redis-5.0.5/src/redis-cli -h 127.0.0.1 -p 17007 -a 123456 shutdown
2. Check the log of the Redis Sentinel node (17107), as follows:
    2564:X 25 Aug 2019 16:18:42.215 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
    2564:X 25 Aug 2019 16:18:42.215 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=2564, just started
    2564:X 25 Aug 2019 16:18:42.215 # Configuration loaded
    2571:X 25 Aug 2019 16:18:42.222 * Running mode=sentinel, port=17107.
    2571:X 25 Aug 2019 16:18:42.234 # Sentinel ID is 4e78a59c5ac6f8aa23bd0a22bbb1741faa6d6ce5
    2571:X 25 Aug 2019 16:18:42.235 # +monitor master mymaster 127.0.0.1 17007 quorum 2
    2571:X 25 Aug 2019 16:18:42.240 * +slave slave 127.0.0.1:17008 127.0.0.1 17008 @ mymaster 127.0.0.1 17007
    2571:X 25 Aug 2019 16:18:42.244 * +slave slave 127.0.0.1:17009 127.0.0.1 17009 @ mymaster 127.0.0.1 17007
    2571:X 25 Aug 2019 16:18:44.282 * +sentinel sentinel 0c009020ab689e56f755e8756818c667f04baa45 127.0.0.1 17108 @ mymaster 127.0.0.1 17007
    2571:X 25 Aug 2019 16:18:44.319 * +sentinel sentinel 7248affc9dd5089ef46b6943b89652a33f23f4cf 127.0.0.1 17109 @ mymaster 127.0.0.1 17007
    2571:X 25 Aug 2019 16:19:42.017 # +sdown master mymaster 127.0.0.1 17007
    2571:X 25 Aug 2019 16:19:42.043 # +new-epoch 1
    2571:X 25 Aug 2019 16:19:42.048 # +vote-for-leader 0c009020ab689e56f755e8756818c667f04baa45 1
    2571:X 25 Aug 2019 16:19:42.093 # +odown master mymaster 127.0.0.1 17007 #quorum 3/2
    2571:X 25 Aug 2019 16:19:42.093 # Next failover delay: I will not start a failover before Sun Aug 25 16:25:42 2019
    2571:X 25 Aug 2019 16:19:42.641 # +config-update-from sentinel 0c009020ab689e56f755e8756818c667f04baa45 127.0.0.1 17108 @ mymaster 127.0.0.1 17007
    2571:X 25 Aug 2019 16:19:42.641 # +switch-master mymaster 127.0.0.1 17007 127.0.0.1 17008
    2571:X 25 Aug 2019 16:19:42.641 * +slave slave 127.0.0.1:17009 127.0.0.1 17009 @ mymaster 127.0.0.1 17008
    2571:X 25 Aug 2019 16:19:42.641 * +slave slave 127.0.0.1:17007 127.0.0.1 17007 @ mymaster 127.0.0.1 17008
    2571:X 25 Aug 2019 16:20:12.693 # +sdown slave 127.0.0.1:17007 127.0.0.1 17007 @ mymaster 127.0.0.1 17008
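The log shows the following:
a. Subjectively down (sdown)
When a Sentinel's heartbeat check against the master times out, that Sentinel marks the master as sdown.
+sdown master mymaster 127.0.0.1 17007
b. Objectively down (odown)
When the number of Sentinels that consider the master sdown is >= quorum, the master is confirmed to be down, i.e. odown.
+odown master mymaster 127.0.0.1 17007 #quorum 3/2
c. Electing a Sentinel leader
The Sentinels negotiate and elect a leader, which carries out the failover.
+vote-for-leader 0c009020ab689e56f755e8756818c667f04baa45 1
d. Failover
One slave is chosen as the new master, and the other nodes are set as slaves of the new master (the configuration file of the old master that just went offline will also be set with slaveof…).
+switch-master mymaster 127.0.0.1 17007 127.0.0.1 17008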
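The `+switch-master` event is the one that tells you a failover has completed; its fields are the master name, the old IP and port, and the new IP and port. As a rough sketch, a monitoring tool could extract the new master address from a Sentinel log line like this. The class `SwitchMasterParser` and its method are hypothetical names invented for this example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; it only understands the +switch-master event.
final class SwitchMasterParser {

    // +switch-master <master-name> <old-ip> <old-port> <new-ip> <new-port>
    private static final Pattern SWITCH_MASTER =
            Pattern.compile("\\+switch-master (\\S+) (\\S+) (\\d+) (\\S+) (\\d+)");

    // Returns "new-ip:new-port" if the line contains a +switch-master event, else empty.
    static Optional<String> newMasterAddress(String logLine) {
        Matcher matcher = SWITCH_MASTER.matcher(logLine);
        if (!matcher.find()) {
            return Optional.empty();
        }
        return Optional.of(matcher.group(4) + ":" + matcher.group(5));
    }

    public static void main(String[] args) {
        String line = "2571:X 25 Aug 2019 16:19:42.641 # "
                + "+switch-master mymaster 127.0.0.1 17007 127.0.0.1 17008";
        System.out.println(newMasterAddress(line).orElse("no switch-master event"));
    }
}
```

Lines carrying other events, such as `+sdown` or `+odown`, simply yield an empty result.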
After the failover succeeds, Redis has one master and one slave, as shown below:
    127.0.0.1:17008> info Replication
    # Replication
    role:master
    connected_slaves:1
    slave0:ip=127.0.0.1,port=17009,state=online,offset=261636,lag=0
3. Failure recovery
Simulate recovery by restarting the redis-server node 17007, then check the replication info again; the old master has now become a slave, as shown below:
    127.0.0.1:17008> info Replication
    # Replication
    role:master
    connected_slaves:2
    slave0:ip=127.0.0.1,port=17009,state=online,offset=303086,lag=0
    slave1:ip=127.0.0.1,port=17007,state=online,offset=303086,lag=0
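If you want to check this state from a program rather than by eye, the `info Replication` output is easy to parse because every data line has the form `key:value`. The sketch below assumes you already have the raw text of the reply as a string; the class `ReplicationInfo` is a hypothetical helper, not part of Jedis or Redis.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; parses the text of an INFO replication reply.
final class ReplicationInfo {

    // Splits the reply into key/value pairs, skipping blank lines and "# Section" headers.
    static Map<String, String> parse(String info) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String rawLine : info.split("\\r?\\n")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon > 0) {
                // Split on the first colon only; slaveN values contain further text after it.
                fields.put(line.substring(0, colon), line.substring(colon + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "# Replication\r\n"
                + "role:master\r\n"
                + "connected_slaves:2\r\n"
                + "slave0:ip=127.0.0.1,port=17009,state=online,offset=303086,lag=0\r\n"
                + "slave1:ip=127.0.0.1,port=17007,state=online,offset=303086,lag=0\r\n";
        Map<String, String> fields = parse(sample);
        System.out.println("role=" + fields.get("role")
                + ", connected_slaves=" + fields.get("connected_slaves"));
    }
}
```

A recovery check could then assert that `role` is `master` on 17008 and that `connected_slaves` is back to 2 once node 17007 has rejoined.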