22_Redis Sentinel master-slave failover data loss: asynchronous replication and split brain

1. Two data loss scenarios

A master-slave failover can lose data in two ways

(1) Data loss caused by asynchronous replication

Because master -> slave replication is asynchronous, the master may go down before some writes have been copied to the slave. Those writes are lost.
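
The loss window can be illustrated with a toy in-memory model. This is only a sketch: the Node and Master classes and the replicate() step are hypothetical stand-ins, not Redis code. It assumes the client is answered as soon as the master applies the write locally, and that replication happens in a later, separate step.

```python
class Node:
    """Toy stand-in for a Redis node (hypothetical, not a real client)."""
    def __init__(self):
        self.data = {}

class Master(Node):
    def __init__(self, slave):
        super().__init__()
        self.slave = slave
        self.backlog = []  # writes already acknowledged but not yet replicated

    def set(self, key, value):
        # The client gets "OK" once the master has applied the write locally.
        self.data[key] = value
        self.backlog.append((key, value))
        return "OK"

    def replicate(self):
        # Replication runs later, independently of the reply to the client.
        for key, value in self.backlog:
            self.slave.data[key] = value
        self.backlog.clear()

def simulate_async_loss():
    slave = Node()
    master = Master(slave)
    master.set("a", 1)
    master.replicate()   # "a" reaches the slave
    master.set("b", 2)   # acknowledged, but the master dies before replicating
    # Failover: the slave is promoted and holds the only surviving copy.
    present = sorted(slave.data)
    lost = sorted(set(master.data) - set(slave.data))
    return present, lost

present, lost = simulate_async_loss()
print("on promoted slave:", present)  # ['a']
print("lost:", lost)                  # ['b']
```

The key "b" was confirmed to the client, yet it does not exist on the promoted slave.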

(2) Data loss caused by split brain

Split brain means that the machine hosting a master suddenly drops off the normal network and can no longer reach the slave machines, while the master process itself is still running

At this point, the sentinels may conclude that the master is down and start an election to promote one of the slaves to master.

The cluster then has two masters at the same time, which is the so-called split brain

Although a slave has been promoted to master, clients may not have switched to the new master yet, so they keep writing to the old master, and that data may be lost.

The reason is that when the old master reconnects, it is attached to the new master as a slave: its own data is cleared and it copies the data from the new master again
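
The sequence above can be traced with another toy model. All names here (ToyNode, resync_from, the keys) are made up for illustration, and the promotion done by the sentinels is reduced to a single assignment.

```python
class ToyNode:
    """Toy stand-in for a Redis instance (hypothetical, not Redis code)."""
    def __init__(self, name, role="slave"):
        self.name = name
        self.role = role
        self.data = {}

    def resync_from(self, master):
        # Becoming a slave of the new master discards the local dataset
        # and replaces it with a copy of the master's data.
        self.role = "slave"
        self.data = dict(master.data)

def simulate_split_brain():
    a = ToyNode("A", role="master")
    b = ToyNode("B")
    a.data["k1"] = "v1"
    b.data["k1"] = "v1"          # replicated before the partition

    # Partition: the sentinels cannot reach A and promote B.
    b.role = "master"
    masters = [n.name for n in (a, b) if n.role == "master"]

    a.data["k2"] = "v2"          # a client that has not switched writes to A
    b.data["k3"] = "v3"          # a client that has switched writes to B

    written_to_old = set(a.data)
    a.resync_from(b)             # partition heals, A rejoins as a slave of B
    lost = sorted(written_to_old - set(a.data))
    return masters, lost, sorted(a.data)

masters, lost_keys, final_keys = simulate_split_brain()
print(masters)     # ['A', 'B']
print(lost_keys)   # ['k2']
print(final_keys)  # ['k1', 'k3']
```

During the partition both A and B act as master, and the write "k2" that only A received disappears once A resyncs from B.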

------------------------------------------------------------------

2. Reducing the data loss caused by asynchronous replication and split brain

min-slaves-to-write 1
min-slaves-max-lag 10

The master requires at least one slave whose replication and synchronization delay does not exceed 10 seconds

If the replication and synchronization delay of every slave exceeds 10 seconds, the master stops accepting write requests
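
The rule can be written down as a small predicate. This is a minimal sketch of the decision only, not the Redis implementation: the function names are hypothetical, and slave_lags is assumed to hold, for each connected slave, the number of seconds since its last ack. (Newer Redis versions spell these options min-replicas-to-write and min-replicas-max-lag and keep the old names as aliases.)

```python
def count_good_slaves(slave_lags, max_lag):
    """Count slaves whose last ack is at most max_lag seconds old."""
    return sum(1 for lag in slave_lags if lag <= max_lag)

def master_accepts_writes(slave_lags, min_slaves_to_write=1, min_slaves_max_lag=10):
    """Return True if enough slaves are within the allowed lag."""
    good = count_good_slaves(slave_lags, min_slaves_max_lag)
    return good >= min_slaves_to_write

print(master_accepts_writes([2, 15]))   # True: one slave is within 10 seconds
print(master_accepts_writes([11, 30]))  # False: every slave lags too far
print(master_accepts_writes([]))        # False: no slave is connected at all
```

With min-slaves-to-write 1, a single healthy slave is enough to keep the master writable; only when no slave qualifies are writes refused.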

The above two configurations can reduce data loss caused by asynchronous replication and split brain

(1) Reduce the data loss of asynchronous replication

With min-slaves-max-lag configured, once the slaves' replication and ack delay grows too long, the master assumes that too much data would be lost if it went down and rejects write requests. This keeps the data lost when the master goes down, caused by writes not yet synchronized to the slaves, within a controllable range.

(2) Reduce data loss due to split brain

If a master is cut off from its slaves by a split brain, the two settings above ensure that once it can no longer send data to the specified number of slaves, and no slave has sent it an ack for more than 10 seconds, it directly rejects client write requests

In this way, the old master stops accepting new client data after the split, which limits the data loss

In other words, if the master loses its connection to all slaves and finds after 10 seconds that no slave has sent it an ack, new write requests are rejected

So in a split-brain scenario, at most about 10 seconds of data is lost
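
The 10-second bound can be checked with a rough timeline. This is a sketch under simplifying assumptions: the last ack arrives exactly when the partition starts (t = 0), a client writes to the old master once per second, and accepted_write_times is a hypothetical helper, not a Redis API.

```python
def accepted_write_times(partition_seconds, max_lag=10):
    """Return the times (in seconds) at which the isolated master accepts a write.

    max_lag=None models a master without min-slaves-max-lag protection.
    """
    accepted = []
    for t in range(1, partition_seconds + 1):
        lag = t  # seconds since the last ack, which arrived at t = 0
        if max_lag is None or lag <= max_lag:
            accepted.append(t)
    return accepted

# Every write the old master accepts during the partition is lost later.
print(len(accepted_write_times(60)))                # 10
print(len(accepted_write_times(60, max_lag=None)))  # 60
```

Without the setting, every write made during a 60-second partition is lost; with min-slaves-max-lag 10, only the writes from the first 10 seconds are.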

Origin www.cnblogs.com/hg-super-man/p/12723406.html