Sentinel mode
Sentinel mode is one implementation of Redis high availability. A system of one or more Sentinel instances monitors the Redis nodes; if the master node fails, one of its slave nodes is promoted to master, and this failover keeps the system available.
How the Sentinel nodes become aware of every node (master / slave / sentinel) in the system
- First, the master node's information is configured in the Sentinel configuration file.
- The Sentinel establishes two connections to the configured master node: a command connection and a subscription connection.
- Over the command connection, the Sentinel sends the INFO command every 10 s. In its INFO reply the master node returns its own run_id and information about its slave nodes.
- The Sentinel then establishes the same two connections, a command connection and a subscription connection, to each slave node.
- Over the command connection, the Sentinel also sends INFO to each slave node and obtains, among other things:
A. run_id
B. role
C. offset (the slave's replication offset)
D. etc.
- Because every Sentinel holds both a command connection and a subscription connection to each node (master and slaves) in the current cluster:
a. Over the command connection, it sends a message to the server's __sentinel__:hello channel, containing its own ip, port, run_id, configuration epoch (used later when voting), etc.
b. Over the subscription connection, it listens on the server's __sentinel__:hello channel, so it receives every message that any Sentinel sends to that channel.
c. By parsing the messages it receives, it learns which other Sentinel nodes are monitoring the same master and slave nodes, and updates its records of those Sentinel nodes.
d. It establishes a command connection, but no subscription connection, to each of the other Sentinels it has discovered.
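Steps a to c can be sketched roughly as follows. This is a simplified illustration, not Sentinel's actual code: the comma-separated payload with only four fields (ip, port, run_id, epoch) and the names `SentinelInfo`, `parse_hello` and `record_hello` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SentinelInfo:
    ip: str
    port: int
    run_id: str
    epoch: int


def parse_hello(payload: str) -> SentinelInfo:
    # Assumed layout for illustration: "ip,port,run_id,epoch"
    ip, port, run_id, epoch = payload.split(",")[:4]
    return SentinelInfo(ip, int(port), run_id, int(epoch))


def record_hello(known: dict, my_run_id: str, payload: str) -> None:
    """Update the table of known Sentinels from one hello message."""
    info = parse_hello(payload)
    if info.run_id == my_run_id:
        return  # our own message comes back on the channel too; ignore it
    known[info.run_id] = info  # add a new Sentinel or refresh an existing one


known_sentinels = {}
record_hello(known_sentinels, "me", "10.0.0.2,26379,abc,3")
record_hello(known_sentinels, "me", "10.0.0.1,26379,me,3")
```

After both calls, the table holds only the other Sentinel (`abc`), which is the one a command connection would then be opened to.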
Failover in Sentinel mode
Subjective offline
Once per second, the Sentinel sends a PING command to every instance it has a connection to. If an instance returns no valid reply (PONG / LOADING / MASTERDOWN) within down-after-milliseconds milliseconds, the Sentinel marks it as SRI_S_DOWN (subjectively offline) in its own state structure for that instance.
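A minimal sketch of this check, assuming timestamps in milliseconds. The `Instance` class and the function names are hypothetical, and the flag is kept as a string in a set rather than as the bit flag Redis uses.

```python
from dataclasses import dataclass, field

VALID_REPLIES = {"PONG", "LOADING", "MASTERDOWN"}


@dataclass
class Instance:
    last_valid_reply_ms: int = 0
    flags: set = field(default_factory=set)


def on_reply(inst: Instance, reply: str, now_ms: int) -> None:
    # Only valid replies refresh the timestamp; anything else is ignored.
    if reply in VALID_REPLIES:
        inst.last_valid_reply_ms = now_ms


def check_subjectively_down(inst: Instance, now_ms: int, down_after_ms: int) -> None:
    if now_ms - inst.last_valid_reply_ms > down_after_ms:
        inst.flags.add("SRI_S_DOWN")
    else:
        inst.flags.discard("SRI_S_DOWN")
```

Clearing the flag in the `else` branch reflects that the judgement is only this Sentinel's own view and is revised as soon as a valid reply arrives.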
Objective offline
When a Sentinel node finds the master node subjectively offline, it asks the other Sentinel nodes whether they also consider that master offline. If the number of Sentinels that consider it offline reaches the configured quorum, the Sentinel marks the master node as SRI_O_DOWN (objectively offline) in the structure it maintains.
Query command: SENTINEL is-master-down-by-addr <ip> <port> <current_epoch> <run_id>
parameter | meaning |
---|---|
ip / port | ip and port of the master node currently considered offline |
current_epoch | the Sentinel's current configuration epoch |
run_id | * when only asking whether the master is offline; the asking Sentinel's own run_id when it wants the receiver to set it as leader |
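The quorum decision can be sketched as below. The function name and the boolean reply list are assumptions for illustration; the sketch counts the asking Sentinel's own opinion as one of the votes.

```python
def is_objectively_down(i_think_down: bool, peer_replies: list, quorum: int) -> bool:
    """peer_replies holds one boolean per other Sentinel that answered:
    True if that Sentinel also considers the master offline."""
    if not i_think_down:
        return False  # only a subjectively-down master is ever escalated
    agree = 1 + sum(1 for reply in peer_replies if reply)  # 1 = ourselves
    return agree >= quorum


# Three Sentinels, quorum 2: one agreeing peer is enough.
print(is_objectively_down(True, [True, False], quorum=2))
```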
Leader election
Once the master node is considered objectively offline, the Sentinel nodes hold an election among themselves. It uses the same command as above, SENTINEL is-master-down-by-addr <ip> <port> <current_epoch> <run_id>, but this time run_id carries the sender's own run_id, asking the receiver to set the sender as its leader. A Sentinel that more than half of the Sentinel nodes report as their leader becomes the leader and carries out the failover.
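The majority rule described above can be sketched as a simple tally. The `votes` mapping (voter run_id to the run_id it set as leader) and the function name are assumptions for the example; real Sentinels also track epochs, which this sketch leaves out.

```python
from collections import Counter


def elect_leader(votes: dict, total_sentinels: int):
    """Return the run_id backed by more than half of all Sentinels, or None."""
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count > total_sentinels // 2 else None
```

If no candidate reaches a majority, the function returns `None`, which corresponds to the election having to be retried.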
Failover
- Select the new master node from among the slave nodes:
a. It communicates normally
b. It has the highest priority
c. Among slaves of equal priority, it has the largest replication offset
- Promote the selected slave to master with SLAVEOF no one, and confirm through subsequent INFO commands that it now reports the master role.
- Make the other slave nodes replicate from the new master node with the SLAVEOF command.
- Turn the old master node into a slave of the new master node.
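The selection rules a to c can be expressed as a filter followed by a sort key. This is a sketch under assumptions: slaves are plain dicts with hypothetical keys `online`, `priority` and `offset`, and a lower `priority` number is treated as a higher priority, as with Redis's slave-priority setting.

```python
def pick_new_master(slaves: list):
    # a. keep only slaves that communicate normally
    candidates = [s for s in slaves if s["online"]]
    if not candidates:
        return None
    # b. best priority first (lower number wins), c. then largest offset
    return min(candidates, key=lambda s: (s["priority"], -s["offset"]))


slaves = [
    {"name": "s1", "online": True, "priority": 100, "offset": 500},
    {"name": "s2", "online": True, "priority": 100, "offset": 900},
    {"name": "s3", "online": False, "priority": 10, "offset": 999},
]
chosen = pick_new_master(slaves)
```

Here `s3` is skipped because it is unreachable, and `s2` beats `s1` on replication offset.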
Advantages and disadvantages
- Advantages: high availability; when the master node fails, failover happens automatically.
- Disadvantages: there seems to be no way to scale horizontally when the data volume is very large.
Cluster mode
The official distributed solution (slot assignment / resharding / failover).
Every node in the cluster keeps data structures that store information about the whole cluster:
// the cluster as a whole
struct clusterState {
    clusterNode *mySelf;
    ....
    dict *nodes;  // all nodes in the cluster
};
// a single node
struct clusterNode {
    char name[];
    char ip[];
    int port;
    clusterLink *link;  // connection information between nodes
    int flags;          // status flags
};
// connection information between nodes
struct clusterLink {
    mstime_t ctime;  // creation time
    int fd;          // TCP socket descriptor
    sds sndbuf;      // output buffer
    sds rcvbuf;      // input buffer
    struct clusterNode *node;
};
Slot assignment
A Redis cluster's key space is divided into 16,384 slots. Only when every slot has been assigned to a node is the cluster in the online state (ok).
When a command operates on a key in a Redis cluster, the slot is computed from the key, and the storage or other operation is carried out on the node responsible for that slot. This is how the cluster scales storage horizontally.
def slot_number(key):
    return CRC16(key) & 16383  # the result is the slot number
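To make the formula above concrete, here is a self-contained sketch that spells out the CRC16 step. It assumes the XMODEM variant of CRC16 (polynomial 0x1021, initial value 0) and ignores hash tags; the bitwise implementation is written for clarity, not speed.

```python
def crc16_xmodem(data: bytes) -> int:
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def slot_number(key: str) -> int:
    # 16383 == 16384 - 1, so the mask is equivalent to "mod 16384"
    return crc16_xmodem(key.encode()) & 16383


print(slot_number("123456789"))  # 12739 (CRC16 check value 0x31C3)
```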
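How slot assignment information is stored
struct clusterState {
    clusterNode *slots[16384];
};
struct clusterNode {
    unsigned char slots[16384/8];
};
As the two structures above show, slot assignment information is stored in two forms.
Benefits of storing it in two forms
1. To check which slots a given node is responsible for, only the second form (the bitmap in clusterNode) needs to be consulted.
2. To find which node is responsible for a given slot, a single lookup in the first form (the array in clusterState) is enough.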
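The two forms can be mirrored in a few lines of Python to show how they are kept in sync. The class and method names below (`Node`, `ClusterState`, `owns`, `assign`) are hypothetical; only the array sizes come from the structures above.

```python
NUM_SLOTS = 16384


class Node:
    def __init__(self, name: str):
        self.name = name
        self.slots = bytearray(NUM_SLOTS // 8)  # one bit per slot

    def owns(self, slot: int) -> bool:
        return bool(self.slots[slot >> 3] & (1 << (slot & 7)))


class ClusterState:
    def __init__(self):
        self.slots = [None] * NUM_SLOTS  # slot number -> owning Node

    def assign(self, node: Node, slot: int) -> None:
        old = self.slots[slot]
        if old is not None:
            # clear the previous owner's bit so both views stay consistent
            old.slots[slot >> 3] &= ~(1 << (slot & 7)) & 0xFF
        node.slots[slot >> 3] |= 1 << (slot & 7)
        self.slots[slot] = node


cluster = ClusterState()
a, b = Node("a"), Node("b")
cluster.assign(a, 12739)
cluster.assign(b, 12739)  # moving the slot to another node
```

`cluster.slots[slot]` answers "which node owns this slot" with one index, while `node.owns(slot)` answers "does this node own it" with one bit test.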
Resharding
Resharding reassigns slots that are already assigned to one node to a new node.
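Failover
Detecting a failed node
- Nodes in the cluster send PING commands to other nodes to check whether they are online.
- If a node does not reply with PONG within the specified time, it is marked as probably offline.
- If more than half of the master nodes that are responsible for slots have marked master node X as probably offline, master node X is considered offline.
- A message that master node X is offline is broadcast to the cluster; every node that receives it marks master node X as offline in the structure it maintains.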
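The step from "probably offline" to "offline" is again a majority count, this time over the masters that are responsible for slots. A minimal sketch, with hypothetical names and node ids represented as strings:

```python
def is_offline(reporters: set, slot_masters: set) -> bool:
    """reporters: ids of nodes that marked master X as probably offline.
    slot_masters: ids of all masters that are responsible for slots."""
    # Reports from nodes that are not slot-owning masters do not count.
    counted = reporters & slot_masters
    return len(counted) > len(slot_masters) // 2
```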
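Slave election
- When a slave node finds that the master node it replicates is offline, it broadcasts a message to the cluster asking every node with voting rights to vote for it (all master nodes that are responsible for slots have voting rights).
- A master node replies with its support to the first slave node that sends it an election message.
- Once a slave node has collected at least N/2 + 1 supporting votes, where N is the number of master nodes with voting rights, it is elected as the new master node.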
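The "first request wins, N/2 + 1 votes needed" rule can be simulated as follows. The names are hypothetical, and `arrivals` is an assumed list of `(master_index, slave_id)` pairs in the order the election messages reach each master.

```python
from collections import Counter


class VotingMaster:
    def __init__(self):
        self.voted_for = None

    def request_vote(self, slave_id: str) -> bool:
        # Support only the first slave that asks; refuse everyone after that.
        if self.voted_for is None:
            self.voted_for = slave_id
            return True
        return False


def run_election(num_masters: int, arrivals: list):
    masters = [VotingMaster() for _ in range(num_masters)]
    tally = Counter()
    for master_index, slave_id in arrivals:
        if masters[master_index].request_vote(slave_id):
            tally[slave_id] += 1
    needed = num_masters // 2 + 1
    for slave_id, count in tally.items():
        if count >= needed:
            return slave_id
    return None  # no majority; a new election would be needed


winner = run_election(3, [(0, "s1"), (0, "s2"), (1, "s2"), (2, "s2")])
```

Because each master votes once and a majority is required, at most one slave can win a given election.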
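Failover migration
- The newly elected slave node executes SLAVEOF no one and becomes a master node.
- The new master node revokes all slot assignments of the offline old master node and assigns those slots to itself.
- The new master node broadcasts a message to the cluster, telling the other nodes that it has become a master node and which slots it is now responsible for.
- The new master node starts handling commands for the slots it is responsible for.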
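Differences between cluster mode and sentinel mode
- In sentinel mode, monitoring is delegated to the Sentinel system; in cluster mode, the working nodes monitor each other themselves.
- In sentinel mode, the election chooses a leader Sentinel node, which then carries out the failover; in cluster mode, the election chooses a new master node from among the slave nodes, and that node carries out the failover.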