Principles and implementation of Redis master-slave replication

One, the concept of master-slave replication

For high availability, Redis copies data to replicas deployed on other nodes. Replication provides redundant backups of the data, which helps keep both the data and the service reliable.
Data replication is one-way: data flows only from the master node to the slave nodes. The following figure shows a simple master-slave replication architecture.
[Figure: a simple master-slave replication architecture]
The role of master-slave replication
1) Data redundancy
Master-slave replication realizes hot backup of data, which is a data redundancy method besides persistence.

2) Failure recovery
When the master node has a problem, the slave node can provide services to achieve rapid failure recovery; in fact, it is a kind of service redundancy.

3) Load balancing
Building on master-slave replication and combined with read-write separation, the master node serves writes and the slave nodes serve reads (that is, the application connects to the master node to write Redis data and to a slave node to read it), which spreads the server load. Especially in read-heavy, write-light scenarios, spreading the read load across multiple slave nodes can greatly increase the concurrency a Redis deployment can handle.

4) Read-write separation
The master handles writes and the slaves handle reads. Read-write separation not only increases the load capacity of the servers, it also lets you change the number of slaves as demand changes.

5) The cornerstone of high availability
In addition to the above four functions, master-slave replication is also the basis for the implementation of sentinels and clusters. Therefore, master-slave replication is the basis for Redis high availability.


Two, master-slave replication mechanism

Redis master-slave replication supports one Master node replicating to multiple slave nodes at the same time, and it also supports a slave node replicating onward to other slave nodes. This lets the architect organize the distribution of cached business data flexibly. For example, several Slaves can serve reads while one Slave node is dedicated to a streaming analysis tool.
Redis's master-slave replication function is divided into two data synchronization modes: full data synchronization and incremental data synchronization.

1. Full data synchronization
[Figure: full data synchronization from the Master node to the Slave node]
The figure above briefly illustrates the full data synchronization process from the Master node to the Slave node in Redis.

When is a full synchronization performed?
1) When the replication id given by the slave node does not match the replication id of the master;
2) or when the offset of the last incremental synchronization given by the slave can no longer be found in the master's ring memory (the replication backlog).
In either case, the Master starts a full synchronization with the Slave.

During full synchronization, regardless of whether RDB snapshotting is enabled on the Master, each full synchronization with a Slave node creates or updates the RDB file on the Master.
After the slave connects to the master and completes the first full synchronization, subsequent synchronization from the master to the slave generally takes the form of incremental synchronization (also called partial synchronization). Incremental synchronization no longer depends mainly on the RDB file. Instead, the Master stores newly generated data-change operations in an in-memory buffer, the replication backlog. This memory area is a ring buffer, that is, a fixed-size FIFO queue in which the newest data overwrites the oldest.
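
The ring-buffer behaviour described above can be sketched in a few lines of C. This is only an illustration under simplifying assumptions, not Redis's actual implementation: the type `backlog_t`, the functions `backlog_init`, `backlog_feed`, `backlog_first_offset`, and `backlog_read_from`, and the tiny 8-byte capacity are all hypothetical, and offsets start at 0 here.

```c
#include <stddef.h>
#include <string.h>

#define BACKLOG_SIZE 8              /* tiny on purpose, for illustration */

typedef struct {
    char buf[BACKLOG_SIZE];         /* ring storage */
    size_t idx;                     /* next write position in buf */
    size_t histlen;                 /* number of valid bytes currently held */
    long long master_offset;        /* total number of bytes ever written */
} backlog_t;

void backlog_init(backlog_t *b) {
    memset(b, 0, sizeof(*b));
}

/* Append len bytes, overwriting the oldest data once the ring is full. */
void backlog_feed(backlog_t *b, const char *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        b->buf[b->idx] = data[i];
        b->idx = (b->idx + 1) % BACKLOG_SIZE;
        if (b->histlen < BACKLOG_SIZE) b->histlen++;
    }
    b->master_offset += (long long)len;
}

/* Offset of the oldest byte that is still held in the ring. */
long long backlog_first_offset(const backlog_t *b) {
    return b->master_offset - (long long)b->histlen;
}

/* Copy everything from `offset` up to the newest byte into out.
 * Returns the number of bytes copied, or -1 if the offset is outside
 * the window the ring still covers. out must hold BACKLOG_SIZE bytes. */
long long backlog_read_from(const backlog_t *b, long long offset, char *out) {
    if (offset < backlog_first_offset(b) || offset > b->master_offset)
        return -1;
    size_t n = (size_t)(b->master_offset - offset);
    size_t start = (b->idx + BACKLOG_SIZE - n) % BACKLOG_SIZE;
    for (size_t i = 0; i < n; i++)
        out[i] = b->buf[(start + i) % BACKLOG_SIZE];
    return (long long)n;
}
```

A return value of -1 from `backlog_read_from` corresponds to the case where the requested offset has already been overwritten, which is when a full synchronization becomes necessary.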

Features:

  • The master node uses BGSAVE to fork a child process that performs RDB persistence, which is expensive in CPU, memory (page table copying), and disk I/O.
  • The master node sends the RDB file to the slave node over the network, which consumes a lot of the master node's bandwidth.
  • While the slave node clears its old data and loads the new RDB file, it is blocked and cannot respond to client commands; if the slave node then runs bgrewriteaof, that adds further overhead.

2. Incremental data synchronization
Starting with Redis 2.8, partial replication is available to handle data synchronization after a network interruption.
In incremental synchronization, the master behaves like an ordinary client of the slave and forwards all write operations to it. There is no special synchronization protocol. The specific process is as follows:
[Figure: the incremental synchronization process]

Q: Besides updating the RDB or AOF files according to the Master node's persistence settings, why are data changes on the Master also written into a ring memory structure (the replication backlog), with incremental updates to Slave nodes based on the latter? The main reasons are as follows:
1) Because networks are unstable, jitter or delay may temporarily disconnect the slave from the master. This is far more common than a brand-new slave connecting to the master. If every such case used a full update, the load on the master would increase greatly: writing RDB files involves a large amount of I/O, even though the Linux page cache reduces some of the cost.
2) In addition, once the data set reaches a certain size, using a full update for the first synchronization with a Slave is a last resort: the data gap between the Slave node and the Master node must be closed as soon as possible, so there is no choice but to spend the Master node's resources and network bandwidth.
3) Recording incremental operations in memory effectively reduces the Master node's I/O cost. The buffer is a ring so that memory usage stays as small as possible while still meeting the recording requirement. The size of this ring memory is set with the repl-backlog-size parameter.

After the slave reconnects, it sends the Master the replication id it previously received and the offset of its last partial synchronization. If the Master determines that this replication id matches one of its own replication ids (it keeps two) and can find the offset in the ring memory, it sends incremental data to the slave starting from that offset.
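
The decision just described can be written as a small function. The following is a simplified sketch, not the Redis source: the struct `master_view`, the enum `sync_mode`, and the function `choose_sync` are hypothetical names, and the sketch assumes that matching either of the master's two replication ids is sufficient, which leaves out any further conditions the real server applies.

```c
#include <string.h>

typedef enum { SYNC_PARTIAL, SYNC_FULL } sync_mode;

/* What the master knows about itself (hypothetical, simplified). */
typedef struct {
    const char *replid;        /* current replication id */
    const char *replid2;       /* second replication id kept by the master */
    long long backlog_first;   /* oldest offset still held in the backlog */
    long long backlog_end;     /* offset just past the newest byte written */
} master_view;

/* Decide how to serve a reconnecting slave that reports the replication
 * id it last saw and the offset it wants to continue from. */
sync_mode choose_sync(const master_view *m,
                      const char *slave_replid, long long slave_offset) {
    int id_ok = strcmp(slave_replid, m->replid) == 0 ||
                strcmp(slave_replid, m->replid2) == 0;
    if (!id_ok)
        return SYNC_FULL;      /* replication id mismatch */
    if (slave_offset < m->backlog_first || slave_offset > m->backlog_end)
        return SYNC_FULL;      /* offset no longer in the ring memory */
    return SYNC_PARTIAL;       /* send the backlog from slave_offset onward */
}
```

Both conditions for full synchronization listed earlier map directly onto the two early returns.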

Q: How do slave nodes that stay connected receive new data?
A normally connected slave node receives the replicated data from the master right after the master node writes it into the ring memory.

Q: What is an appropriate size for the replication backlog?
The default size of the replication backlog is 1MB, and this value can be adjusted. If the master executes a large number of write commands, or reconnecting after a master-slave disconnection takes a long time, this size may be too small. If the replication backlog is not sized properly, the partial resynchronization mode of the PSYNC command cannot work as intended. It is therefore important to estimate and set the size of the replication backlog correctly.

Reference formula: size = reconnect_time_second * write_size_per_second * 2

For example, if the average network interruption lasts 60s and the master node generates an average of 100KB of write commands per second (in the wire protocol format), the replication backlog needs about 6MB on average. To be safe, you can set it to 12MB so that partial replication can be used in most disconnection cases.
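
The reference formula is easy to turn into a helper when estimating a value for repl-backlog-size. A minimal sketch follows; the function name `backlog_size_bytes` is made up for this example, and the factor 2 is the safety margin from the formula above.

```c
/* size = reconnect_time_second * write_size_per_second * 2 */
long long backlog_size_bytes(long long reconnect_time_second,
                             long long write_size_per_second) {
    return reconnect_time_second * write_size_per_second * 2;
}
```

With a 60s interruption and 100KB (taken as 100 * 1024 bytes) written per second, `backlog_size_bytes(60, 100 * 1024)` gives 12,288,000 bytes, which is roughly the 12MB from the example.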


Three, master-slave replication realization

1. The activation of master-slave replication is completely initiated by the slave node, and nothing needs to be done on the master node.

There are 3 ways to enable master-slave replication:

1) Method 1: modify the configuration file
In the slave node's redis.conf file, add: slaveof <masterip> <masterport>

2) Method 2: use a startup option
When starting the slave node, append to the redis-server command: --slaveof <masterip> <masterport>

3) Method 3: use a client command
After the slave node's Redis server has started, run the command slaveof <masterip> <masterport> directly from a client; the Redis instance then becomes a slave node.

2. Master-slave replication in action:
1) Use the info replication command on the master and on the slave to view their replication information, as shown in the figure below.
[Figure: info replication output on the master and on the slave]
2) Once master-slave replication is set up, data written on the master can also be read on the slave.
[Figure: writing data on the master and reading it on the slave]

3. What the slaveof command does

  • Checks whether the node is running in cluster mode, because the command is not allowed in cluster mode.
  • Checks whether the command is SLAVEOF NO ONE; if so, it ends the master-slave relationship and makes the current node a master.
  • Otherwise, sets the IP and port of the master node that this slave node belongs to by calling the replicationSetMaster() function.

Source code of the slaveof command implementation in Redis 5.0.8:

void replicaofCommand(client *c) {
    /* SLAVEOF is not allowed in cluster mode as replication is automatically
     * configured using the current address of the master node. */
    if (server.cluster_enabled) {
        // In cluster mode, reply with an error and return without executing
        addReplyError(c,"REPLICAOF not allowed in cluster mode.");
        return;
    }

    /* The special host/port combination "NO" "ONE" turns the instance
     * into a master. Otherwise the new master address is set. */
    if (!strcasecmp(c->argv[1]->ptr,"no") &&
        !strcasecmp(c->argv[2]->ptr,"one")) {
        if (server.masterhost) {
            replicationUnsetMaster();  // Cancel the existing replication
            sds client = catClientInfoString(sdsempty(),c);
            serverLog(LL_NOTICE,"MASTER MODE enabled (user request from '%s')",
                client);
            sdsfree(client);
        }
    } else {
        long port;

        if (c->flags & CLIENT_SLAVE)
        {
            /* If a client is already a replica they cannot run this command,
             * because it involves flushing all replicas (including this
             * client) */
            addReplyError(c, "Command is not valid when client is a replica.");
            return;
        }

        if ((getLongFromObjectOrReply(c, c->argv[2], &port, NULL) != C_OK))
            return;

        /* Check if we are already attached to the specified slave */
        if (server.masterhost && !strcasecmp(server.masterhost,c->argv[1]->ptr)
            && server.masterport == port) {
            serverLog(LL_NOTICE,"REPLICAOF would result into synchronization with the master we are already connected with. No operation performed.");
            addReplySds(c,sdsnew("+OK Already connected to specified master\r\n"));
            return;
        }
        /* There was no previous master or the user specified a different one,
         * we can continue. */
        replicationSetMaster(c->argv[1]->ptr, port); // Set the IP and port of the master to replicate from
        sds client = catClientInfoString(sdsempty(),c);
        serverLog(LL_NOTICE,"REPLICAOF %s:%d enabled (user request from '%s')",
            server.masterhost, server.masterport, client);
        sdsfree(client);
    }
    addReply(c,shared.ok);
}

// The replicationSetMaster function
/* Set replication to the specified master address and port. */
void replicationSetMaster(char *ip, int port) {
    int was_master = server.masterhost == NULL;

    // Free the old master host string
    sdsfree(server.masterhost);

    // Set the new IP and port
    server.masterhost = sdsnew(ip);
    server.masterport = port;

    // Release the client connected to the previous master, if any
    if (server.master) {
        freeClient(server.master);
    }
    // Unblock all blocked clients
    disconnectAllBlockedClients(); /* Clients blocked in master, now slave. */

    /* Force our slaves to resync with us as well. They may hopefully be able
     * to partially resync with us, but we can notify the replid change. */
    // Close the connections of all slaves, forcing them to resynchronize
    disconnectSlaves();
    // Cancel any replication handshake in progress
    cancelReplicationHandshake();
    /* Before destroying our master state, create a cached master using
     * our own parameters, to later PSYNC with the new master. */
    if (was_master) {
        replicationDiscardCachedMaster();    // Discard any previously cached master structure
        replicationCacheMasterUsingMyself(); // Cache our own state as the master for a later PSYNC
    }
    server.repl_state = REPL_STATE_CONNECT; // Mark that the node must now connect to the master
}

The slaveof command is asynchronous. When it runs, the slave node saves the master node's information and returns as soon as the master-slave relationship has been recorded; the rest of the replication process runs asynchronously inside the node. So what triggers the actual replication?
It is triggered by a periodically executed function, replicationCron(). This function is called from serverCron(), the server's time event callback, which is registered as the time event handler when the Redis server is initialized.
replicationCron() is executed once per second.
The code in replicationCron() that handles this case is as follows:
[Figure: the relevant part of replicationCron()]
Depending on the state of the slave node, replicationCron() calls connectWithMaster() to connect to the master node in a non-blocking way. The code is as follows:

// Connect to the master node in a non-blocking way
int connectWithMaster(void) {
    int fd;

    // Connect to the master node
    fd = anetTcpNonBlockBestEffortBindConnect(NULL,
        server.masterhost,server.masterport,NET_FIRST_BIND_ADDR);
    if (fd == -1) {
        serverLog(LL_WARNING,"Unable to connect to MASTER: %s",
            strerror(errno));
        return C_ERR;
    }
    // Watch the master fd for readable and writable events, with syncWithMaster as the handler
    if (aeCreateFileEvent(server.el,fd,AE_READABLE|AE_WRITABLE,syncWithMaster,NULL) ==
            AE_ERR)
    {
        close(fd);
        serverLog(LL_WARNING,"Can't create readable event for SYNC");
        return C_ERR;
    }
    // Time of the most recent read of RDB file content
    server.repl_transfer_lastio = server.unixtime;
    // Socket used for synchronization between slave and master
    server.repl_transfer_s = fd;
    // The node is now in the "connecting to master" state
    server.repl_state = REPL_STATE_CONNECTING;
    return C_OK;
}
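
To make the asynchronous flow easier to follow, here is a stripped-down model of it: the slaveof command only records a state, and a periodic tick later notices that state and starts the connection. This is a self-contained sketch rather than Redis code. The names `replica`, `set_master`, `try_connect`, and `cron_tick` are hypothetical stand-ins for the real server state, replicationSetMaster(), connectWithMaster(), and replicationCron(), and the `reachable` flag simply simulates whether the connection attempt succeeds.

```c
#include <stdbool.h>

typedef enum { ST_NONE, ST_CONNECT, ST_CONNECTING } repl_state;

typedef struct {
    repl_state state;       /* current replication state */
    int connect_attempts;   /* how many times a connection was tried */
} replica;

/* Stand-in for the slaveof command: record the intent and return at once. */
void set_master(replica *r) {
    r->state = ST_CONNECT;
}

/* Stand-in for connectWithMaster(): on success the state advances. */
static bool try_connect(replica *r, bool reachable) {
    r->connect_attempts++;
    if (!reachable)
        return false;        /* stay in ST_CONNECT, retry on the next tick */
    r->state = ST_CONNECTING;
    return true;
}

/* Stand-in for the once-per-second periodic function. */
void cron_tick(replica *r, bool reachable) {
    if (r->state == ST_CONNECT)
        try_connect(r, reachable);
}
```

Because a failed attempt leaves the state at `ST_CONNECT`, the next tick retries automatically, which is the behaviour the periodic design is meant to give.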

4. Related parameter configuration of master-slave replication
1) Related configuration of master node

  • repl-timeout 60: It is related to the judgment of the connection timeout of the master and slave nodes in each stage.

  • repl-diskless-sync no: Applies to the full replication stage and controls whether the master node uses diskless replication. With diskless replication, the master node no longer writes the data to an RDB file first during full replication but writes it directly to the slave's socket, so the disk is not involved at all. Diskless replication has an advantage when disk I/O is slow and the network is fast. Note that as of Redis 3.0, diskless replication is experimental and disabled by default.

  • repl-diskless-sync-delay 5: Applies to the full replication stage. When the master node uses diskless replication, this setting is how long, in seconds, the master waits before it starts sending data to the slaves. It only takes effect when diskless replication is enabled, and the default is 5s. The delay exists for two reasons: (1) once the transfer to the slave sockets has started, a newly connected slave must wait for the current transfer to finish before a new one can start; (2) several slave nodes are fairly likely to set up replication within a short time of each other, so waiting lets one transfer serve them all.

  • client-output-buffer-limit slave 256MB 64MB 60: Limits the size of the master node's replication buffer during the full replication stage.

  • repl-disable-tcp-nodelay no: Related to the delay in the command propagation phase.

  • masterauth <master-password>: It is related to the authentication during the connection establishment phase.

  • repl-ping-slave-period 10: It is related to the timeout judgment of the master and slave nodes in the command propagation phase.

  • repl-backlog-size 1MB: The size of the replication backlog buffer.

  • repl-backlog-ttl 3600: How long the master keeps the replication backlog buffer after it no longer has any slave nodes, so that a disconnected slave can still perform partial replication when it reconnects; the default is 3600s. If set to 0, the replication backlog buffer is never released.

  • min-slaves-to-write 3 and min-slaves-max-lag 10: Specify the minimum number of slave nodes the master must have and the maximum lag those slaves may have; if fewer slaves meet the lag bound, the master stops accepting writes.
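
As a rough illustration of how the last pair of settings works together, the check can be modelled like this. It is a sketch with hypothetical names (`count_good_slaves`, `writes_allowed`), not the server's code; `lag_seconds[i]` is assumed to be the number of seconds since slave i was last heard from, and treating a minimum of 0 as "check disabled" is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Count slaves whose lag is at most max_lag seconds. */
size_t count_good_slaves(const long *lag_seconds, size_t n, long max_lag) {
    size_t good = 0;
    for (size_t i = 0; i < n; i++) {
        if (lag_seconds[i] <= max_lag)
            good++;
    }
    return good;
}

/* Writes are accepted only if enough slaves are within the lag bound. */
bool writes_allowed(const long *lag_seconds, size_t n,
                    size_t min_slaves, long max_lag) {
    if (min_slaves == 0)
        return true;    /* check disabled (assumption of this sketch) */
    return count_good_slaves(lag_seconds, n, max_lag) >= min_slaves;
}
```

With lags of 1, 5, 12, and 30 seconds and the values 3 and 10 from the bullet above, only two slaves qualify, so `writes_allowed` returns false.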

2) Slave node related configuration

  • slaveof <masterip> <masterport>: Takes effect when Redis starts. It establishes the replication relationship: a Redis server with this setting enabled becomes a slave node after startup. The setting is commented out by default, that is, a Redis server is a master node by default.
  • repl-timeout 60: It is related to the judgment of the connection timeout of the master and slave nodes in each stage.
  • slave-serve-stale-data yes: Controls whether the slave node responds to client commands while its data is out of date.
  • slave-read-only yes: Whether the slave node is read-only; the default is read-only. Because writes on a slave node easily lead to inconsistency between master and slave, this setting should generally be left unchanged.

Four, points to note for master-slave replication

1. Persistence
Make sure persistence is enabled on the master, or make sure the master does not restart automatically after a crash.
Because the slave is a complete copy of the master, if the master restarts with an empty data set, the slave will be emptied as well.

2. Password
If a password is set on the master when configuring Redis replication, set that password with the masterauth parameter in the slave's configuration file, so that the slave automatically authenticates with the auth command when it connects to the master. The effect is similar to a password-free login.

3. Read-write separation
1) Delay and inconsistency
Master-slave replication is asynchronous, so some delay and data inconsistency are unavoidable. Ways to reduce them:
  • Optimize the network between master and slave.
  • Monitor the replication delay of the slave nodes (judged by offset). If a slave node lags too far behind, tell the application to stop reading from it.
  • Use a cluster to scale out write load and read load together.
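
The offset-based monitoring idea can be sketched as follows. The function `pick_read_node` and the byte threshold `max_lag_bytes` are hypothetical, and how the offsets are collected (for example from the output of info replication) is left out. The sketch only shows the decision: read from the first slave whose offset is close enough to the master's, otherwise fall back to the master.

```c
/* Returns the index of the first slave whose replication offset is within
 * max_lag_bytes of the master's offset, or -1 when no slave qualifies and
 * the application should read from the master instead. */
int pick_read_node(long long master_offset,
                   const long long *slave_offsets, int n,
                   long long max_lag_bytes) {
    for (int i = 0; i < n; i++) {
        long long lag = master_offset - slave_offsets[i];
        if (lag < 0)
            lag = 0;    /* offsets sampled at slightly different times */
        if (lag <= max_lag_bytes)
            return i;
    }
    return -1;
}
```

A real monitor would probably also smooth the measurements over time so that one slow sample does not take a slave out of rotation.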

2) Data expiration
Standalone Redis uses two deletion strategies: lazy deletion and periodic deletion.
Lazy deletion: the server does not delete data proactively. Only when a client accesses a key does the server check whether it has expired and decide whether to delete it.
Periodic deletion: a scheduled task deletes expired data. Because this costs memory and CPU, its frequency and running time are limited.

With master-slave replication: to keep master and slave data consistent, the master node controls the deletion of expired data on the slave nodes. Since lazy and periodic deletion on the master cannot guarantee that expired data is deleted promptly, a client reading from a slave node can easily read data that has already expired. (Since Redis 3.2, a slave node also checks the expiry time on reads and does not return keys that have already expired.)

3) Failover
In a read-write separation scenario that does not use Sentinel, the application connects to different Redis nodes for reads and writes. When the master node or a slave node fails and the topology changes, the application's read and write connections must be updated promptly. The switch can be done manually, or by a monitoring program you write yourself.

4. Replication timeout
Why a timeout is set:
- On the master node, a timeout releases the slave's connection and its resources, so that dead connections do not keep occupying output buffers, bandwidth, connections, and so on.
- On the slave node, a timeout causes it to re-establish the connection, which avoids drifting out of sync with the master node.

Problems it can cause:
1) Data synchronization phase
Problem description: if the RDB file is too large, the master node spends too long forking the child process and saving the RDB file, so the slave node may receive no data for a long time and trigger a timeout. The slave node then reconnects to the master node, starts a full replication again, times out again, reconnects again, and so on in a vicious circle.
Solution: keep the amount of data on a single Redis instance moderate, and increase the repl-timeout value appropriately.

2) Command propagation phase
Network jitter can cause individual PING commands to be lost, which leads to a false timeout.

3) Blocking caused by slow queries
If the master node or a slave node runs a slow command (such as keys * or hgetall on a large hash), the server blocks. While blocked, it cannot respond to the other side of the replication connection, which may cause a replication timeout.

5. Replication interruption
Replication can be interrupted in many ways. A timeout between master and slave is one cause; another very important one is overflow of the replication buffer.

Problem description:
During the full replication phase, the master node puts the write commands it executes into the replication buffer. The buffer holds the write commands executed by the master node during the following period:
BGSAVE generates the RDB file -> the RDB file is sent from the master node to the slave node -> the slave node clears its old data and loads the RDB file.
When the master node holds a lot of data, or the network latency between master and slave is high, the buffer may exceed its limit, and the master node then disconnects the slave node. This can lead to a loop: full replication -> replication buffer overflow interrupts the connection -> reconnect -> full replication -> replication buffer overflow interrupts the connection -> ...

Solution:
The size of the replication buffer is configured with client-output-buffer-limit slave {hard limit} {soft limit} {soft seconds}; the default is 256MB 64MB 60.
Meaning: if the buffer grows beyond 256MB, or stays above 64MB for 60 consecutive seconds, the master node disconnects the slave node.
The parameter can be changed at runtime with the config set command (it takes effect without restarting Redis).

Note that the replication buffer is a kind of client output buffer: the master node allocates one replication buffer per slave node. The replication backlog buffer, by contrast, exists only once per master node, no matter how many slave nodes it has.
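
The hard limit / soft limit / soft seconds rule can be expressed as a small check. The sketch below follows the meaning given above and is not the server's actual code: the types `buf_limit` and `buf_state` and the function `should_disconnect` are hypothetical, and the exact comparison operators are an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t hard_limit_bytes;     /* e.g. 256MB */
    size_t soft_limit_bytes;     /* e.g. 64MB */
    long   soft_limit_seconds;   /* e.g. 60 */
} buf_limit;

typedef struct {
    long soft_reached_at;  /* time the soft limit was first exceeded, -1 if below it */
} buf_state;

/* Called whenever the buffer usage of one slave changes.
 * `now` is the current time in seconds. */
bool should_disconnect(const buf_limit *l, buf_state *s,
                       size_t used_bytes, long now) {
    if (used_bytes > l->hard_limit_bytes)
        return true;                      /* hard limit: disconnect at once */
    if (used_bytes > l->soft_limit_bytes) {
        if (s->soft_reached_at < 0) {
            s->soft_reached_at = now;     /* start the soft-limit clock */
            return false;
        }
        return now - s->soft_reached_at >= l->soft_limit_seconds;
    }
    s->soft_reached_at = -1;              /* back under the soft limit: reset */
    return false;
}
```

Since each slave has its own replication buffer, one `buf_state` would be kept per slave connection.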

6. Restart of the master node
A master node restart falls into two cases: a crash caused by a failure, and a planned restart.
1) Crash caused by a failure
After the master node goes down and comes back, its runid changes, so partial replication is not possible and only full replication can be performed.
When the master node goes down, a failover should take place: one of the slave nodes is promoted to master, and the other slave nodes replicate from the new master node. Failover should also be automated (Sentinel mode).

2) Planned restart: safe restart with debug reload
If the master node's memory fragmentation ratio is too high, or you want to change parameters that can only be set at startup, a safe restart is needed.
If the master node is restarted in the ordinary way, its runid changes, which may cause unnecessary full replication.

To solve this, Redis provides the debug reload restart method: after the reload, the master node's runid and offset are unchanged, which avoids full replication.
Note, however, that debug reload clears the data currently in memory and reloads it from the RDB file. This blocks the server while it runs, so use it with caution.

7. Restart of a slave node
After a slave node goes down and restarts, the master runid it had saved is lost, so even if slaveof is run again, partial replication cannot be performed.

8. Network interruption
1) First case
The network problem is extremely brief and causes only a few lost packets. Neither the master nor the slave node detects a timeout (repl-timeout is not triggered). In this case, the lost data only needs to be filled in through REPLCONF ACK.

2) Second case
The network problem lasts a long time, the master and slave nodes detect a timeout (repl-timeout is triggered), and the lost data exceeds what the replication backlog buffer holds. In this case, the master and slave nodes cannot perform partial replication and must perform full replication.
To avoid this as far as possible, size the replication backlog buffer according to actual conditions; detecting and repairing network interruptions promptly also reduces full replications.

3) Third case
Between the two cases above: the master and slave nodes detect a timeout, but the lost data is still within the replication backlog buffer. In this case, the master and slave nodes can perform partial replication.


Origin blog.csdn.net/locahuang/article/details/110817368