Redis study notes (16): Sentinel (part 2)

I was unwell for a while and disappeared for a bit, but I am back now. Without further ado, let's continue with Sentinel.

Detect subjective offline status

By default, Sentinel sends a PING command once per second to every instance with which it has established a command connection (master servers, slave servers, and other Sentinels), and uses the instance's reply to PING to judge whether the instance is online.

The instance's response to the PING command can be divided into two situations:

Valid reply: the instance returns one of three replies: +PONG, -LOADING, or -MASTERDOWN.

Invalid reply: the instance returns any reply other than +PONG, -LOADING, or -MASTERDOWN, or returns no reply within the specified time.

The length of time Sentinel waits before judging an instance to be subjectively offline is specified by the down-after-milliseconds option in the Sentinel configuration file.

If the master server keeps returning invalid replies to Sentinel for down-after-milliseconds milliseconds, Sentinel marks the master as subjectively offline by turning on the SRI_S_DOWN flag in the flags attribute of the instance structure corresponding to that master.
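The check described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Sentinel's actual code: the Instance class, the function names, and the bit value chosen for SRI_S_DOWN are assumptions made for this example, and a missing reply (timeout) is represented by None.

```python
from dataclasses import dataclass

# The three replies the notes list as valid.
VALID_REPLIES = ("+PONG", "-LOADING", "-MASTERDOWN")
SRI_S_DOWN = 1 << 3  # illustrative bit value, chosen for this sketch


@dataclass
class Instance:
    """Hypothetical stand-in for Sentinel's per-instance structure."""
    down_after_ms: int        # the down-after-milliseconds option
    last_valid_reply_ms: int  # time of the last valid PING reply
    flags: int = 0


def is_valid_reply(reply):
    """None models 'no reply in time'; real replies may carry trailing text."""
    return reply is not None and reply.startswith(VALID_REPLIES)


def on_ping_result(inst, reply, now_ms):
    """Update the instance after one PING and report whether it is subjectively offline."""
    if is_valid_reply(reply):
        inst.last_valid_reply_ms = now_ms
    if now_ms - inst.last_valid_reply_ms > inst.down_after_ms:
        inst.flags |= SRI_S_DOWN   # turn the flag on
    else:
        inst.flags &= ~SRI_S_DOWN  # turn the flag off
    return bool(inst.flags & SRI_S_DOWN)
```

For example, with down_after_ms=3000 and a last valid reply at 1000 ms, an invalid reply at 4500 ms turns the flag on, and the next valid reply turns it off again.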

Check objective offline status

When Sentinel judges a master server to be subjectively offline, it asks the other Sentinels that also monitor this master whether they too consider it offline, in order to confirm that the master really is offline. When Sentinel has received a sufficient number of offline judgments from the other Sentinels, it judges the master to be objectively offline and performs a failover on it.

Sentinel asks by sending the command SENTINEL is-master-down-by-addr <ip> <port> <current_epoch> <runid>

When the target Sentinel receives the SENTINEL is-master-down-by-addr command from the source Sentinel, it parses the parameters, checks whether the master server identified by the IP address and port number is offline, and then replies to the source Sentinel with three values: <down_state> <leader_runid> <leader_epoch>

Based on the SENTINEL is-master-down-by-addr replies sent by the other Sentinels, the source Sentinel counts how many of them agree that the master server is offline. When this number reaches the number that the configuration requires for judging objective offline, Sentinel turns on the SRI_O_DOWN flag in the flags attribute of the master's instance structure, indicating that the master has entered the objectively offline state.
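As a rough sketch of the counting step, the following Python models each reply as a (down_state, leader_runid, leader_epoch) tuple. The function names are made up for this example. It also assumes that the asking Sentinel counts its own subjective judgment toward the required number, and that a reply to a pure status query carries "*" and 0 in the leader fields; treat both as assumptions rather than something these notes spell out.

```python
SRI_O_DOWN = 1 << 4  # illustrative bit value, chosen for this sketch


def parse_is_master_down_reply(reply):
    """Split a reply into (is_down, leader_runid, leader_epoch)."""
    down_state, leader_runid, leader_epoch = reply
    return int(down_state) == 1, leader_runid, int(leader_epoch)


def is_objectively_down(self_thinks_down, replies, quorum):
    """Return True when enough Sentinels agree that the master is offline.

    Assumption: the asking Sentinel's own judgment counts as one vote.
    """
    if not self_thinks_down:
        return False
    votes = 1  # our own subjective judgment
    for reply in replies:
        is_down, _, _ = parse_is_master_down_reply(reply)
        if is_down:
            votes += 1
    return votes >= quorum


def update_flags(flags, self_thinks_down, replies, quorum):
    """Turn SRI_O_DOWN on or off in a flags integer."""
    if is_objectively_down(self_thinks_down, replies, quorum):
        return flags | SRI_O_DOWN
    return flags & ~SRI_O_DOWN
```

With replies [(1, "*", 0), (0, "*", 0)] and a required number of 2, the master is judged objectively offline; with a required number of 3 it is not.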

Electing the leader Sentinel

When a master server is judged to be objectively offline, the Sentinels monitoring it negotiate to elect a leader Sentinel, and the leader Sentinel performs the failover on the offline master.

1. Every online Sentinel is eligible to be elected leader Sentinel. In other words, any of the online Sentinels monitoring the same master server may become the leader Sentinel.

2. After each leader election, whether or not it succeeds, every Sentinel's configuration epoch is incremented by one. The configuration epoch is simply a counter.

3. Within one configuration epoch, each Sentinel has one opportunity to set some Sentinel as its local leader Sentinel, and once the local leader is set, it cannot be changed again in that configuration epoch.

4. Every Sentinel that finds that the master server has entered the objectively offline state asks the other Sentinels to set it as their local leader Sentinel.

5. When the source Sentinel sends the SENTINEL is-master-down-by-addr command to the target Sentinel and the runid parameter in the command is not * but the run ID of the source Sentinel, the source Sentinel is asking the target Sentinel to set it as the target's local leader Sentinel.

6. Sentinel's rule for setting its local leader Sentinel is first come, first served: the first source Sentinel to ask becomes the target's local leader, and later requests in the same epoch are refused.

7. After receiving the SENTINEL is-master-down-by-addr command, the target Sentinel returns a command reply to the source Sentinel. The leader_runid and leader_epoch parameters in the reply record the run ID and configuration epoch of the target Sentinel's local leader Sentinel.

8. After the source Sentinel receives the reply returned by the target Sentinel, it checks whether the value of the leader_epoch parameter in the reply is the same as its own configuration epoch. If it is, the source Sentinel then takes out the leader_runid parameter. If the value of leader_runid is the same as the source Sentinel's own run ID, the target Sentinel has set the source Sentinel as its local leader Sentinel.

9. If a Sentinel is set as the local leader Sentinel by more than half of the Sentinels, that Sentinel becomes the leader Sentinel.

10. If no Sentinel is elected leader within a given time limit, the Sentinels hold another election after a while, until a leader Sentinel is elected.
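The voting rules in the list above can be illustrated with a small simulation. This is a simplified model and not Sentinel's implementation: the Sentinel class, handle_vote_request, elect, and the arrivals mapping (which candidate's request reaches each target first) are all names invented for this sketch, and network timing is replaced by an explicit arrival order.

```python
class Sentinel:
    """Hypothetical, minimal model of one Sentinel's voting state."""

    def __init__(self, runid):
        self.runid = runid
        self.leader_runid = None  # local leader Sentinel
        self.leader_epoch = 0     # epoch in which the local leader was set

    def handle_vote_request(self, candidate_runid, epoch):
        """Target side: first come, first served within one epoch."""
        if epoch > self.leader_epoch:
            self.leader_runid = candidate_runid
            self.leader_epoch = epoch
        # The reply always reports the current local leader and its epoch.
        return self.leader_runid, self.leader_epoch


def elect(sentinels, arrivals, epoch):
    """Run one election round.

    arrivals maps each target's run ID to the candidate run IDs in the
    order their requests reach that target. Returns the leader's run ID,
    or None if no candidate wins more than half of the Sentinels.
    """
    votes = {}
    for target in sentinels:
        for candidate in arrivals[target.runid]:
            leader, leader_epoch = target.handle_vote_request(candidate, epoch)
            # Source side (rule 8): same epoch and our own run ID means a vote.
            if leader_epoch == epoch and leader == candidate:
                votes[candidate] = votes.get(candidate, 0) + 1
    majority = len(sentinels) // 2 + 1
    for candidate, count in votes.items():
        if count >= majority:
            return candidate
    return None
```

With three Sentinels a, b, and c, where a's request reaches a and c first and b's request reaches only b first, a collects two of three votes and becomes the leader. With four Sentinels split two against two, elect returns None, and the round has to be repeated in a later epoch.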

Failover

After the leader Sentinel is elected, it performs a failover on the offline master server, which consists of three steps:

1. Among all the slave servers under the offline master server, select a slave server and convert it to the master server.

Selection process:

(1) Remove from the list all slave servers that are offline or disconnected.

(2) Remove from the list all slave servers that have not replied to the leader Sentinel's INFO command within the last 5 seconds.

(3) Remove all slave servers whose connection to the offline master server has been down for more than down-after-milliseconds * 10 milliseconds.

(4) Sort the remaining slave servers by priority and pick the one with the highest priority. If several share the highest priority, pick the one with the largest replication offset. If several still tie, pick the one with the smallest run ID.

2. Change all slave servers under the offline master server to replicate the new master server.

Send each of them the command SLAVEOF <new master IP> <new master port>

3. Set the offline master server as the slave server of the new master server. When the old master server comes back online, it will become the slave server of the new master server.
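The three failover steps can be sketched as follows. The Replica fields and both function names are hypothetical and exist only for this example. Two things in it are not stated in the notes above and should be read as assumptions: a lower priority number is treated as a higher priority (as with Redis's replica priority setting), and the chosen slave is promoted with SLAVEOF NO ONE.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical view of what the leader Sentinel knows about a slave."""
    ip: str
    port: int
    runid: str
    priority: int             # assumption: lower number = higher priority
    repl_offset: int          # replication offset
    online: bool
    info_reply_age_ms: int    # time since the last INFO reply
    master_link_down_ms: int  # time disconnected from the old master


def select_new_master(replicas, down_after_ms):
    """Apply filters (1)-(3), then the ordering in (4)."""
    candidates = [
        r for r in replicas
        if r.online
        and r.info_reply_age_ms <= 5000
        and r.master_link_down_ms <= down_after_ms * 10
    ]
    if not candidates:
        return None
    # Best priority first, then largest offset, then smallest run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.runid))


def failover_commands(new_master, replicas, old_master_addr):
    """List (address, command) pairs for the three failover steps."""
    plan = [((new_master.ip, new_master.port), "SLAVEOF NO ONE")]
    follow = f"SLAVEOF {new_master.ip} {new_master.port}"
    for r in replicas:
        if r is not new_master:
            plan.append(((r.ip, r.port), follow))
    # Step 3: applied to the old master once it comes back online.
    plan.append((old_master_addr, follow))
    return plan
```

Expressing the ordering as a single sort key keeps the three tie-breakers in one place, so the rule in (4) can be read directly off the tuple.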


Learn a little every day, there will always be gains.

Note: Out of respect for the author's intellectual property, please be aware that the content of this article is based on "Redis Design and Implementation"; these are only my study notes, shared here.



Origin blog.csdn.net/xuetian0546/article/details/106630351