【redis之哨兵原理】

哨兵也是 Redis 服务器,只是它与我们平时提到的 Redis 服务器职能不同,哨兵负责监视普通的 Redis 服务器,提高一个服务器集群的健壮和可靠性。哨兵和普通的 Redis 服务器所用的是同一套服务器框架,这包括:网络框架,底层数据结构,订阅发布机制等。

定时程序是哨兵服务器的重要角色,所做的工作主要包括:监视普通的 Redis 服务器(包括主机和从机),执行故障修复,执行脚本命令。

一、Sentinel的作用:

A、Master 状态监测

B、如果Master 异常,则会进行Master-slave 转换,将其中一个Slave作为Master,将之前的Master作为Slave 

C、Master-Slave切换后,master_redis.conf、slave_redis.conf和sentinel.conf的内容都会发生改变,即master_redis.conf中会多一行slaveof的配置,sentinel.conf的监控目标会随之调换 

二、Sentinel的工作方式:

1)每个Sentinel以每秒钟一次的频率向它所知的Master,Slave以及其他 Sentinel 实例发送一个 PING 命令 

2)如果一个实例(instance)距离最后一次有效回复 PING 命令的时间超过 down-after-milliseconds 选项所指定的值, 则这个实例会被 Sentinel 标记为主观下线。 

3)如果一个Master被标记为主观下线,则正在监视这个Master的所有 Sentinel 要以每秒一次的频率确认Master的确进入了主观下线状态。 

4)当有足够数量的 Sentinel(大于等于配置文件指定的值)在指定的时间范围内确认Master的确进入了主观下线状态, 则Master会被标记为客观下线 

5)在一般情况下, 每个 Sentinel 会以每 10 秒一次的频率向它已知的所有Master,Slave发送 INFO 命令 

6)当Master被 Sentinel 标记为客观下线时,Sentinel 向下线的 Master 的所有 Slave 发送 INFO 命令的频率会从 10 秒一次改为每秒一次 

7)若没有足够数量的 Sentinel 同意 Master 已经下线, Master 的客观下线状态就会被移除。 

若 Master 重新向 Sentinel 的 PING 命令返回有效回复, Master 的主观下线状态就会被移除。

Redis Sentinel Documentation

Redis Sentinel provides high availability for Redis. In practical terms this means that using Sentinel you can create a Redis deployment that resists without human intervention to certain kind of failures.

Redis Sentinel also provides other collateral tasks such as monitoring, notifications and acts as a configuration provider for clients.

This is the full list of Sentinel capabilities at a macroscopical level (i.e. the big picture):

Monitoring. Sentinel constantly checks if your master and slave instances are working as expected.

Notification. Sentinel can notify the system administrator, another computer programs, via an API, that something is wrong with one of the monitored Redis instances.

Automatic failover. If a master is not working as expected, Sentinel can start a failover process where a slave is promoted to master, the other additional slaves are reconfigured to use the new master, and the applications using the Redis server informed about the new address to use when connecting.

Configuration provider. Sentinel acts as a source of authority for clients service discovery: clients connect to Sentinels in order to ask for the address of the current Redis master responsible for a given service. If a failover occurs, Sentinels will report the new address.

Redis 的 Sentinel 系统用于管理多个 Redis 服务器(instance), 该系统执行以下三个任务:

监控(Monitoring): Sentinel 会不断地检查你的主服务器和从服务器是否运作正常。

提醒(Notification): 当被监控的某个 Redis 服务器出现问题时, Sentinel 可以通过 API 向管理员或者其他应用程序发送通知。

自动故障迁移(Automatic failover): 当一个主服务器不能正常工作时, Sentinel 会开始一次自动故障迁移操作, 它会将失效主服务器的其中一个从服务器升级为新的主服务器, 并让失效主服务器的其他从服务器改为复制新的主服务器; 当客户端试图连接失效的主服务器时, 集群也会向客户端返回新主服务器的地址, 使得集群可以使用新主服务器代替失效服务器。

配置提供者:哨兵作为Redis客户端发现的权威来源:客户端连接到哨兵请求当前可靠的master的地址。如果发生故障,哨兵将报告新地址。

Distributed nature of Sentinel

Redis Sentinel is a distributed system:

Sentinel itself is designed to run in a configuration where there are multiple Sentinel processes cooperating together. The advantage of having multiple Sentinel processes cooperating are the following:

  1. Failure detection is performed when multiple Sentinels agree about the fact a given master is no longer available. This lowers the probability of false positives.
  2. Sentinel works even if not all the Sentinel processes are working, making the system robust against failures. There is no fun in having a fail over system which is itself a single point of failure, after all.

The sum of Sentinels, Redis instances (masters and slaves) and clients connecting to Sentinel and Redis, are also a larger distributed system with specific properties. In this document concepts will be introduced gradually starting from basic information needed in order to understand the basic properties of Sentinel, to more complex information (that are optional) in order to understand how exactly Sentinel works.

Sentinel commands

The following is a list of accepted commands, not covering commands used in order to modify the Sentinel configuration, which are covered later.

  • PING This command simply returns PONG.
  • SENTINEL masters Show a list of monitored masters and their state.
  • SENTINEL master <master name> Show the state and info of the specified master.
  • SENTINEL slaves <master name> Show a list of slaves for this master, and their state.
  • SENTINEL sentinels <master name> Show a list of sentinel instances for this master, and their state.
  • SENTINEL get-master-addr-by-name <master name> Return the ip and port number of the master with that name. If a failover is in progress or terminated successfully for this master it returns the address and port of the promoted slave.
  • SENTINEL reset <pattern> This command will reset all the masters with matching name. The pattern argument is a glob-style pattern. The reset process clears any previous state in a master (including a failover in progress), and removes every slave and sentinel already discovered and associated with the master.
  • SENTINEL failover <master name> Force a failover as if the master was not reachable, and without asking for agreement to other Sentinels (however a new version of the configuration will be published so that the other Sentinels will update their configurations).
  • SENTINEL ckquorum <master name> Check if the current Sentinel configuration is able to reach the quorum needed to failover a master, and the majority needed to authorize the failover. This command should be used in monitoring systems to check if a Sentinel deployment is ok.
  • SENTINEL flushconfig Force Sentinel to rewrite its configuration on disk, including the current Sentinel state. Normally Sentinel rewrites the configuration every time something changes in its state (in the context of the subset of the state which is persisted on disk across restart). However sometimes it is possible that the configuration file is lost because of operation errors, disk failures, package upgrade scripts or configuration managers. In those cases a way to to force Sentinel to rewrite the configuration file is handy. This command works even if the previous configuration file is completely missing.

Sentinel 命令

下面是一系列的接收命令,而不是全部的命令,用于修改Sentinel配置。

PING - 这个命令简单的返回PONE。

SENTINEL masters - 展示监控的master清单和它们的状态。

SENTINEL master [master name] - 展示指定master的状态和信息。

SENTINEL slaves [master name] - 展示master的slave清单和它们的状态。

SENTINEL sentinels [master name] - 展示master的sentinel实例的清单和它们的状态。

SENTINEL get-master-addr-by-name [master name] - 返回master的IP和端口。如果故障转移在处理中或成功终止,返回晋升的slave的IP和端口。

SENTINEL reset [pattern] - 这个命令将重置所有匹配名字的masters。参数是blog风格的。重置的过程清空master的所有状态,并移除已经发现和关联master的所有slave和sentinel。

SENTINEL failover [master name] - 如果master不可到达,强制执行一个故障转移,而不征求其他Sentinel的同意。

SENTINEL ckquorum [master name] - 检查当前的Sentinel配置是否能够到达故障转移需要的法定人数,并且需要授权故障转移的多数。这个命令应该用于监控系统检查部署是否正确。

SENTINEL flushconfig - 强制Sentinel在磁盘上重写它的配置,包括当前的Sentinel状态。通常Sentinel每次重写配置改变它的状态。然而有时由于操作错误、硬盘故障、包升级脚本或配置管理器可能导致配置文件丢失。在这种情况下收到强制Sentinel重写配置文件。这个命令即使上面的配置文件完全不见了。

运行sentinel有两种方式:

第一种

redis-sentinel /path/to/sentinel.conf

第二种

redis-server /path/to/sentinel.conf --sentinel

以上两种方式,都必须指定一个sentinel的配置文件sentinel.conf,如果不指定,将无法启动sentinel。sentinel默认监听26379端口,所以运行前必须确定该端口没有被别的进程占用。

猜你喜欢

转载自gaojingsong.iteye.com/blog/2384761