Walking through the configuration, explaining the principles, and going over real interview questions: this is as far as I can take you...

When I started writing about master-slave, cluster, and sentinel, many readers could not wait and left messages asking to see these modes. Today we will approach the topic from three angles: configuration files, design principles, and real interview questions. Let's talk about Redis master-slave replication.

Redis replication makes master-slave setups very simple to use and configure. It lets a slave Redis server (hereinafter referred to as replica) keep an exact copy of the contents of a master Redis server (hereinafter referred to as master). Whenever the connection between the replica and the master breaks, the replica automatically reconnects to the master, and no matter what happened to the master in the meantime, the replica tries to become an exact copy of it again.

Starting from version 5.0.0, Redis officially renamed the SLAVEOF command to REPLICAOF and has been gradually phasing out the original SLAVEOF command.

Redis uses asynchronous replication by default, which is characterized by low latency and high performance, and is the natural replication mode for the vast majority of Redis use cases. However, replicas periodically and asynchronously acknowledge to the master the amount of data they have received.

Master-slave topology

The master handles writes and the replicas handle reads, which suits workloads with many reads and few writes. Under high write concurrency, having many replicas forces the master to send every write command multiple times, which consumes excessive network bandwidth, increases the load on the master, and affects service stability.

A replica can accept connections from other replicas. Besides connecting several replicas to the same master, replicas can also be connected to other replicas in a cascading structure. Since Redis 4.0, all sub-replicas receive exactly the same replication stream from the master.

When the master needs multiple replicas, in order to avoid performance interference to the master, a tree-like master-slave structure can be used to reduce the pressure on the master node.

In order to let everyone have a clearer understanding of the concept, let's take a look at the introduction of the parameters of the master-slave replication in the configuration file:

REPLICATION

replicaof <masterip> <masterport>

By setting the ip and port of the master, the current Redis instance can be made a copy of another Redis instance. When Redis starts, it will automatically synchronize data from the master.

  • Redis replication is asynchronous, but the master can be configured to stop accepting writes when it does not appear to be connected to at least a given number of replicas;
  • If the replication link is lost for a relatively short time, the Redis replica can perform a partial resynchronization with the master; for this to work, configure a reasonable backlog size (see below);
  • Replication is automatic and requires no user intervention. After a network partition, the replica will automatically try to reconnect to the master and resynchronize with it;

masterauth <master-password>

When the master is password protected, this is the password the replica uses to authenticate to the master.

replica-serve-stale-data yes

When the replica loses its connection with the master, or while replication is still in progress, the replica can behave in two different ways:

  • replica-serve-stale-data yes (the default): the replica still responds to client requests, possibly with stale data, or with an empty dataset if this is the first synchronization.
  • replica-serve-stale-data no: the replica replies with the error "SYNC with master in progress" to all commands except INFO, REPLICAOF, AUTH, PING, SHUTDOWN, REPLCONF, ROLE, CONFIG, SUBSCRIBE, UNSUBSCRIBE, PSUBSCRIBE, PUNSUBSCRIBE, PUBLISH, PUBSUB, COMMAND, POST, HOST: and LATENCY.
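The two settings can be sketched as a small decision function. This is only an illustrative model, not Redis source code: the helper name `reply_for` and its arguments are assumptions made for this example, and the allowed-command set is copied from the list above.

```python
# Commands that are still served when replica-serve-stale-data is "no"
# (copied from the list in the text above).
ALWAYS_ALLOWED = {
    "INFO", "REPLICAOF", "AUTH", "PING", "SHUTDOWN", "REPLCONF", "ROLE",
    "CONFIG", "SUBSCRIBE", "UNSUBSCRIBE", "PSUBSCRIBE", "PUNSUBSCRIBE",
    "PUBLISH", "PUBSUB", "COMMAND", "POST", "HOST:", "LATENCY",
}


def reply_for(command: str, link_up: bool, serve_stale_data: bool) -> str:
    """Hypothetical helper: decide how a replica answers one command."""
    if link_up or serve_stale_data:
        # Link is healthy, or stale answers are explicitly allowed.
        return "serve"
    if command.upper() in ALWAYS_ALLOWED:
        # Administrative commands keep working even without a master link.
        return "serve"
    return "error: SYNC with master in progress"
```

For example, `reply_for("GET", link_up=False, serve_stale_data=False)` yields the error, while the same call with `"INFO"` is still served.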

replica-read-only

You can configure whether the replica is read-only: yes means it is read-only and all write commands are rejected; no means it accepts writes. Since Redis 2.6, replicas support read-only mode, and it is enabled by default. It can be turned on or off at runtime using CONFIG SET.

Writing to a replica can be useful for storing temporary data (data written to a replica is easily discarded after a resync with the master). For example, computing slow Set or Sorted Set operations and storing the result in a local key is a use case for writable replicas that has been observed multiple times. But a writable replica can also cause problems if clients write to it because of a misconfiguration.

In the cascade structure, even if the replica B node is writable, Sub-replica C will not see the write of B, but will have the same data set as master A.

Setting it to yes does not mean that a client which enters a cluster through the replica cannot perform set operations. The data of the set operation is simply not placed in the replica's slot; it is placed in the slot of a master instead.

Note: A read-only replica is not designed to be exposed to untrusted clients on the Internet; it is just a layer of protection against misuse of the instance. By default, read-only replicas still export all administrative commands such as CONFIG, DEBUG, etc. To a certain extent, rename-command can be used to hide all administrative/dangerous commands, thus improving the security of the read-only replica.

repl-diskless-sync

Replication synchronization strategy: disk or socket. The default is no, which means disk-backed.

New replicas, and reconnecting replicas that cannot continue the replication process by receiving only the differences, need to perform a "full sync". An RDB file is transferred from the master to the replicas. The transfer can be done in two different ways:

  1. Disk-backed: The Redis master node creates a new process that writes the RDB file to disk, and the parent process then transfers the file incrementally to the replicas;
  2. Diskless: The Redis master node creates a new process that writes the RDB file directly to the replica sockets, without touching the disk.
  • With disk-backed replication, while the RDB file is being generated, more replicas can be queued, and they are all served with the same RDB file once it is ready.
  • With diskless replication, the master waits for a period of time (the repl-diskless-sync-delay setting below) before starting the transfer, in the hope that more replicas will connect, so that it can stream to several replicas at the same time. A replica that arrives after the transfer has started has to queue and wait for the current transfer to finish before a new one begins.

Diskless replication works better when disk performance is poor and network performance is good.

WARNING: Diskless replication is currently experimental

repl-diskless-sync-delay

When diskless replication is enabled, this option sets how long the master node waits before creating the child process, that is, how long the start of the data transfer is delayed. The purpose is to give more replicas a chance to arrive after the first one is ready. The unit is seconds, and the default is 5 seconds.
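The effect of the delay can be illustrated with a rough sketch. The function `batch_replicas` is a hypothetical helper, not part of Redis, and it makes a simplifying assumption: it only groups replicas by arrival time and ignores how long each transfer actually takes.

```python
def batch_replicas(arrivals, delay=5.0):
    """Group replica arrival times (in seconds) into shared diskless transfers.

    A replica joins the current batch if it arrives within `delay` seconds
    of the first replica in that batch; otherwise it starts a new batch.
    """
    batches = []
    for t in sorted(arrivals):
        if batches and t <= batches[-1][0] + delay:
            batches[-1].append(t)   # arrived inside the waiting window
        else:
            batches.append([t])     # too late: needs a transfer of its own
    return batches
```

With the default 5-second delay, replicas arriving at 0, 2 and 4.9 seconds share one transfer, while one arriving at 7 seconds has to wait for the next.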

repl-ping-replica-period

The interval at which the master sends PING to its replicas. The default is 10 seconds.

repl-timeout

The default value is 60 seconds. This option is used to set the timeout judgment in the following situations:

  • Bulk transfer I/O during SYNC, from the point of view of the replica: the RDB snapshot data sent by the master has not been received;
  • Master timeout from the point of view of the replica (data, pings): the replica has not received any data or ping from the master;
  • Replica timeout from the point of view of the master (REPLCONF ACK pings): the master has not received the REPLCONF ACK confirmation from the replica.

Note that this option must be greater than repl-ping-replica-period; otherwise timeouts will often be detected whenever traffic between the master and the replica is low.

repl-disable-tcp-nodelay

Whether to turn off the TCP_NODELAY option on the connection between the master and its replicas.

  • If you choose "yes", Redis uses fewer TCP packets and less bandwidth to send data to replicas, but data takes longer to appear on the replica side. With the default Linux kernel configuration, the delay can reach 40 milliseconds.
  • If you choose "no", data appears on the replica side with less latency, but replication uses more bandwidth.

This option actually controls a TCP-layer setting, which is applied with setsockopt. The default is no, which means the TCP layer disables the Nagle algorithm and sends data as soon as possible. Setting it to yes means the TCP layer enables the Nagle algorithm, so data is sent only after enough of it has accumulated or after a certain period of time.

By default we optimize for low latency, but in very high traffic situations, or when master and replicas are many hops away, it may be better to change this option to "yes".
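As a rough illustration of how such a setting maps onto setsockopt, here is a sketch using Python's standard socket module. The function name `make_replication_socket` is an assumption for this example; it only creates and configures a socket and does not connect to anything.

```python
import socket


def make_replication_socket(disable_tcp_nodelay: bool) -> socket.socket:
    """Create a TCP socket configured like the repl-disable-tcp-nodelay option."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # "no"  -> TCP_NODELAY = 1: Nagle disabled, data sent as soon as possible.
    # "yes" -> TCP_NODELAY = 0: Nagle enabled, small writes are coalesced.
    nodelay = 0 if disable_tcp_nodelay else 1
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, nodelay)
    return sock
```

Calling `make_replication_socket(False)` corresponds to the default (low latency), and `make_replication_socket(True)` to the bandwidth-saving setting.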

repl-backlog-size

Sets the replication backlog size; the default is 1mb. The backlog is a buffer that accumulates replication data while a replica is disconnected. When the replica reconnects, a full synchronization is therefore usually not needed; a partial synchronization is enough, transferring only the data the replica missed while it was disconnected.

A larger backlog means a replica can stay disconnected for longer and still be able to resume with a partial resynchronization after reconnecting.

The backlog buffer only needs to be created when at least one replica node is connected to the master node.
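To make the idea concrete, here is a toy model of a fixed-size backlog indexed by replication offset. It is a sketch for illustration, not how Redis implements the buffer internally; the class name `ReplicationBacklog` and its methods are assumptions.

```python
class ReplicationBacklog:
    """Toy backlog that keeps only the most recent `size` bytes of the stream."""

    def __init__(self, size: int):
        self.size = size
        self.buf = bytearray()   # the retained tail of the replication stream
        self.master_offset = 0   # total number of bytes ever fed

    def feed(self, data: bytes) -> None:
        """Append newly replicated bytes, discarding the oldest if full."""
        self.buf.extend(data)
        self.master_offset += len(data)
        if len(self.buf) > self.size:
            del self.buf[: len(self.buf) - self.size]

    def read_from(self, replica_offset: int):
        """Return the bytes the replica is missing, or None if a partial
        resync is impossible because they were already discarded."""
        first_kept = self.master_offset - len(self.buf)
        if replica_offset < first_kept or replica_offset > self.master_offset:
            return None
        return bytes(self.buf[replica_offset - first_kept:])
```

A replica that was disconnected only briefly gets the missing bytes back from `read_from`; one that fell further behind than the buffer size gets `None`, which corresponds to needing a full synchronization.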

repl-backlog-ttl

After replicas disconnect, the master node releases the backlog buffer once some time has passed. This option sets how many seconds the master waits, counted from the moment the last replica disconnected, before releasing the buffer. The default is 3600 seconds; 0 means never release.

A replica never releases its own backlog on timeout, because it may later be promoted to master and should then be able to perform "incremental synchronization" with other replicas.

replica-priority

replica-priority is an integer published by Redis through the INFO output, and the default value is 100. When the master node stops working normally, Redis Sentinel uses this value to decide which replica to promote to master. The smaller the value, the higher the priority for promotion. If there are three replicas with priorities 10, 100, and 25, Sentinel selects the one with priority 10. A value of 0 means the replica can never be promoted to master.
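The priority rule alone can be sketched as follows. This is a simplified illustration: the function `pick_by_priority` is hypothetical, and a real Sentinel also takes other criteria into account when selecting a replica, which this sketch ignores.

```python
def pick_by_priority(priorities: dict):
    """Return the replica name with the lowest non-zero replica-priority.

    `priorities` maps a replica name to its replica-priority value.
    Replicas with priority 0 are never eligible for promotion.
    """
    candidates = {name: p for name, p in priorities.items() if p != 0}
    if not candidates:
        return None  # nobody may be promoted
    return min(candidates, key=candidates.get)
```

With the priorities from the text, `pick_by_priority({"r1": 10, "r2": 100, "r3": 25})` selects `"r1"`.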

min-replicas-to-write

min-replicas-max-lag

# Require at least 3 replicas with a lag of <= 10 seconds
# (min-replicas-to-write is N and min-replicas-max-lag is M in the text below)
min-replicas-to-write 3
min-replicas-max-lag 10

Starting from Redis 2.8, if the number of connected replicas whose latency is less than or equal to M seconds is less than N (N replicas need to be in the "online" state), the master may stop accepting writes and reply with an error. Since Redis uses asynchronous replication, there is no guarantee that a replica actually received a given write command, so there is always a window of data loss.

The principle is as follows:

  • The replica pings the master every second, acknowledging the amount of replication stream it has processed;
  • The master remembers the last time it received a ping from each replica, and the lag is calculated from that last ping;
  • Users can configure the minimum number of replicas whose lag must not exceed a maximum number of seconds;

This option does not guarantee that N replicas will accept the write, but limits the exposure window for lost writes to the specified number of seconds in the event that not enough replicas are available.

The default value of N is 0, and the default value of M is 10. Setting either one to 0 disables this feature.
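The check described above can be written down as a small sketch. The function name `master_accepts_writes` and the representation of replicas as a list of lag values in seconds are assumptions for illustration.

```python
def master_accepts_writes(replica_lags, min_replicas_to_write, min_replicas_max_lag):
    """Return True if the master should keep accepting writes.

    replica_lags: seconds since the last ping received from each connected replica.
    min_replicas_to_write: N in the text; min_replicas_max_lag: M in the text.
    """
    if min_replicas_to_write == 0 or min_replicas_max_lag == 0:
        return True  # feature disabled
    # Count replicas whose lag is within the allowed M seconds.
    good = sum(1 for lag in replica_lags if lag <= min_replicas_max_lag)
    return good >= min_replicas_to_write
```

With N = 3 and M = 10, lags of `[1, 2, 3]` allow writes, while `[1, 2, 30]` leave only two good replicas and the master would reply with an error.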

replica-announce-ip 5.5.5.5

replica-announce-port 1234

The Redis master can list the addresses and ports of its connected replicas in different ways. For example, Redis Sentinel uses the "INFO replication" command to obtain replica instance information, and the master's "ROLE" command also provides this information.

Generally speaking, this information is obtained and reported by the replica node in the following ways:

  • IP: detected automatically from the address of the socket the replica uses to connect to the master
  • Port: normally the listening port the replica uses to accept client connections

However, if port forwarding or NAT is in use, the replica may only be reachable through a different address and port. In that case, set these two options so that the replica reports the configured values to the master instead of the ones obtained by the default behavior. Depending on the situation, you can set just one of the two options if only that value needs to be overridden.
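The override rule is simple enough to show in a few lines. The function `announced_address` and its parameter names are made up for this sketch; they only mirror the two options described above.

```python
def announced_address(socket_ip, listening_port, announce_ip=None, announce_port=None):
    """Return the (ip, port) pair a replica reports to its master.

    Each replica-announce-* option, when set, overrides only its own value.
    """
    ip = announce_ip if announce_ip is not None else socket_ip
    port = announce_port if announce_port is not None else listening_port
    return ip, port
```

For instance, `announced_address("10.0.0.7", 6379, announce_ip="5.5.5.5", announce_port=1234)` reports `("5.5.5.5", 1234)`, while setting only `announce_port` keeps the automatically detected IP.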

This is the end of the introduction to the configuration. Next, let's connect the concepts mentioned above and talk about the principles of master-slave replication.

Principles

The operation of the system relies on three main mechanisms:

  • When a master and a replica are connected normally, the master keeps the replica updated by sending it a stream of commands that replicates every change to its own dataset, including client writes, keys expiring or being evicted, and so on.
  • When the connection between the master and the replica breaks, because of network problems or because either side detects a timeout, the replica reconnects to the master and attempts a partial resynchronization. This means it tries to obtain only the part of the command stream it missed while disconnected.
  • When a partial resynchronization is not possible, the replica requests a full resynchronization. This involves a more complex process in which the master creates a snapshot of all its data, sends it to the replica, and then continues sending the command stream as the dataset changes.

How Redis replication works

Every Redis master has a replication ID: a large pseudorandom string that labels a given history of the dataset. Each master also holds an offset, which grows by one for every byte of replication stream it produces to send to replicas. The purpose is that whenever a new operation modifies the dataset, the master can use the offset to update the state of the replicas.

The replication offset is incremented even when no replica is connected to the master, so essentially every given pair (replication ID, offset) identifies an exact version of the master's dataset.

When the replica connects to the master, it uses the PSYNC command to send the replication ID of its old master and the offset it has processed so far. This way, the master can send only the incremental part the replica needs. But if there is not enough backlog in the master's buffer, or if the replica refers to a history (replication ID) the master does not know, the master switches to a full resynchronization: in this case, the replica receives a complete copy of the dataset and starts from scratch.
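The decision the master makes on PSYNC can be sketched roughly like this. It is a simplified model under stated assumptions: the function `handle_psync` is hypothetical, the master is assumed to know only one replication ID, and the backlog is described just by its length.

```python
def handle_psync(master_replid, master_offset, backlog_len,
                 replica_replid, replica_offset):
    """Decide between a partial and a full resynchronization."""
    # Oldest offset the master can still serve from its backlog.
    backlog_start = master_offset - backlog_len
    same_history = replica_replid == master_replid
    in_backlog = backlog_start <= replica_offset <= master_offset
    if same_history and in_backlog:
        return "CONTINUE"
    # Unknown history or data already gone from the backlog: start over.
    return f"FULLRESYNC {master_replid} {master_offset}"
```

A replica that is 100 bytes behind a master with a 200-byte backlog gets `CONTINUE`; one that is 300 bytes behind, or one that reports a different replication ID, gets `FULLRESYNC`.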

Speaking of which, what is full synchronization, and what is incremental synchronization?

Full synchronization

  1. The replica connects to the master and sends the PSYNC command;
  2. The master executes bgsave to start a background save process in order to produce an RDB file. At the same time it starts buffering all new write commands received from clients.
  3. When the background save is completed, the master transfers the dataset file to all replicas, and continues to record the executed write commands during the sending;
  4. After replica receives the RDB file, it discards all old data, and then loads the new file into memory;
  5. After the replica has loaded the file, the master sends all buffered commands to the replica. This is done as a stream of commands in the same format as the Redis protocol itself;
  6. The replica starts receiving command requests and executes write commands from the master buffer.
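The steps above can be played through with a toy in-memory model. Nothing here is Redis code: the `Master` and `Replica` classes are assumptions for illustration, a plain dict copy stands in for the RDB file, and only SET commands are modeled.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.buffered = None  # write commands recorded while a snapshot is in flight

    def set(self, key, value):
        self.data[key] = value
        if self.buffered is not None:
            self.buffered.append(("SET", key, value))

    def start_full_sync(self):
        """Step 2: take a snapshot and start buffering new writes."""
        self.buffered = []
        return dict(self.data)  # stands in for the RDB file

    def finish_full_sync(self):
        """Step 5: hand over the commands buffered during the transfer."""
        commands, self.buffered = self.buffered, None
        return commands


class Replica:
    def __init__(self):
        self.data = {"stale": "x"}

    def load_snapshot(self, snapshot):
        """Step 4: discard old data and load the snapshot."""
        self.data = dict(snapshot)

    def apply(self, commands):
        """Step 6: replay the buffered write commands."""
        for op, key, value in commands:
            if op == "SET":
                self.data[key] = value


master, replica = Master(), Replica()
master.set("a", "1")
snapshot = master.start_full_sync()
master.set("b", "2")                     # arrives while the snapshot is being sent
replica.load_snapshot(snapshot)
replica.apply(master.finish_full_sync())
```

After the last line, the replica holds the same two keys as the master, and its old "stale" key is gone.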

Note: SYNC is an old protocol that is no longer used in the new Redis, but is still backward compatible. Because it does not allow partial resynchronization, PSYNC is now used instead of SYNC.

Under normal circumstances, a full resynchronization requires creating an RDB file on the disk, and then loading it from the disk into the memory, and then the replica uses it for data synchronization. This can be a stressful operation on the master if disk performance is low. Redis 2.8.18 is the first version to support diskless replication. In this setup, child processes send RDB files directly to replica sockets without using a disk as an intermediate storage medium.

Incremental synchronization

While the master sends the command to all replicas, it also writes the command into the backlog buffer.

When the replica is disconnected from the master and then reconnected, it is necessary to determine whether the difference between the offset of the replica and the offset of the master exceeds the size of the backlog.

  • If it does not, the master replies CONTINUE to the replica and then sends it the data from the backlog;
  • If it does, the master replies FULLRESYNC runid offset; the replica saves the runid and performs a full synchronization;

Finally, let's talk about a few interview questions that are often mentioned during the interview process.

Interview questions

In the process of master-slave replication, what problems will be caused by turning off the persistence of the master?

Data will be deleted from the master and all replicas. Let's illustrate with a case:

  1. We set node A as master and turn off its persistence settings, and set nodes B and C as replicas;
  2. When the master crashes, it restarts automatically, because an automatic restart script is configured on the system. But since persistence is turned off, the master's dataset is empty after the restart;
  3. At this point, when the replicas synchronize from the master, their datasets become empty as well.

Therefore, when using Redis replication, we strongly recommend enabling persistence on both the master and the replicas. If persistence cannot be enabled because very slow disks would cause latency problems, the node should be configured so that it does not restart automatically after a reboot.
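The scenario above can be replayed with a minimal simulation. The `Node` class is a made-up model, with a dict standing in for the disk, and is only meant to show why the data disappears.

```python
class Node:
    """Toy Redis node: `memory` is the live dataset, `disk` the persisted one."""

    def __init__(self, persistence: bool):
        self.persistence = persistence
        self.memory = {}
        self.disk = {}

    def write(self, key, value):
        self.memory[key] = value
        if self.persistence:
            self.disk[key] = value

    def crash_and_restart(self):
        # After a restart only what was persisted comes back.
        self.memory = dict(self.disk)

    def full_sync_from(self, master):
        # A replica replaces its whole dataset with the master's.
        self.memory = dict(master.memory)
        if self.persistence:
            self.disk = dict(self.memory)


a = Node(persistence=False)   # master without persistence
b = Node(persistence=True)    # replica
a.write("k", "v")
b.full_sync_from(a)
before = dict(b.memory)
a.crash_and_restart()         # master comes back empty
b.full_sync_from(a)           # replica faithfully copies the empty dataset
```

Even though the replica itself had persistence enabled, it ends up empty because it copies whatever the restarted master holds.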

How Redis replication handles key expiration

The expiration mechanism of Redis limits the lifetime of a key, and it depends on the ability of Redis to measure time. Even so, Redis replicas correctly replicate keys with expirations, including when those keys are modified by Lua scripts.

To make this work, Redis cannot rely on the master and the replicas having synchronized clocks, because that problem cannot be solved and would lead to race conditions and diverging datasets. Instead, Redis uses three main techniques to make the replication of expiring keys work correctly:

  • The replica does not expire keys itself; it waits for the master to expire them. When the master expires a key (or evicts it because of the LRU algorithm), it synthesizes a DEL command and transmits it to all replicas;
  • Because expiration is driven by the master, the master may not deliver the DEL command in time, so a replica can sometimes still hold keys in memory that are already logically expired. To handle this, the replica uses its own logical clock to report that such a key does not exist, but only for read operations that do not violate the consistency of the dataset. This way, replicas avoid reporting logically expired keys as still existing. In practice, an HTML fragment cache that uses replicas to scale will avoid returning items that are older than the desired time to live.
  • No key expiration is performed during Lua script execution. While a Lua script runs, time on the master is conceptually frozen, so a given key either exists or does not exist for the whole duration of the script. This prevents keys from expiring in the middle of a script, and it is needed so that the same script sent to the replica is guaranteed to have the same effect in both datasets.

Once a replica is promoted to a master, it will start expiring keys independently without any help from the old master.
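The first two techniques can be sketched with a toy replica. The class `ExpiringReplica` and its methods are assumptions for illustration only; the clock is passed in explicitly as `now` so the behavior is easy to follow.

```python
class ExpiringReplica:
    """Toy replica that hides expired keys but deletes only on the master's DEL."""

    def __init__(self):
        self.store = {}  # key -> (value, expire_at or None)

    def apply_from_master(self, command, key, value=None, expire_at=None):
        """Apply one command from the replication stream."""
        if command == "SET":
            self.store[key] = (value, expire_at)
        elif command == "DEL":
            self.store.pop(key, None)

    def get(self, key, now):
        """Read a key using the replica's own clock."""
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expire_at = entry
        if expire_at is not None and now >= expire_at:
            return None  # logically expired: report missing, but do not delete
        return value


replica = ExpiringReplica()
replica.apply_from_master("SET", "k", "v", expire_at=100)
visible = replica.get("k", now=50)
hidden = replica.get("k", now=150)       # the master's DEL has not arrived yet
still_stored = "k" in replica.store
replica.apply_from_master("DEL", "k")    # the synthesized DEL finally arrives
```

Before the DEL arrives, reads at `now=150` already report the key as missing even though it is still physically stored on the replica.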

That is all for today. If you have a different opinion or a better idea, please contact Ah Q. Add Ah Q to join the technical exchange group and take part in the discussion!

Origin blog.csdn.net/Qingai521/article/details/121415875