Small research - Design and application of MySQL fast full synchronous replication technology (1)

MySQL semi-synchronous replication is widely used in high-performance data management, but its reliability is insufficient. This paper optimizes semi-synchronous replication and proposes a fast full synchronous replication technology. Techniques such as adjusting the transaction commit process, making efficient use of thread resources, and applying logs in batches during replication reduce the performance loss of replication while still ensuring data reliability, thereby achieving fast full synchronous replication. Test results show that fast full synchronous replication strikes a good balance among performance, reliability, and consistency, and effectively improves the business carrying capacity of MySQL storage clusters.

Table of contents

1 Introduction

1 Principle of Fast Full Synchronous Replication Technology

1.1 Fast full synchronous replication

1.2 Features of fast full synchronous replication

1.2.1 Thread reuse

1.2.2 Data reliability guarantee

1.2.3 Batch write


1 Introduction

Data replication applies the data on the master node server, and the changes to that data, to one or more standby node servers so that the master and standby nodes hold the same data. Replication is the foundation of high availability, scalability, disaster recovery, and backup, and it is widely used in scenarios such as multi-replica reads and writes in distributed databases, read/write splitting, and backup and disaster recovery. According to when replication takes place and what it guarantees, MySQL replication can be divided into asynchronous replication, synchronous replication, and semi-synchronous replication.

Asynchronous replication is the replication mode natively supported by MySQL. When the master node writes a transaction to the Binlog file, it sends the new Binlog changes to the standby node through the Binlog dump thread, but it commits the transaction without waiting for any response from the standby node. Asynchronous replication therefore cannot guarantee that the Binlog data for these changes has been reliably transmitted to, and applied on, any standby node, so data can be lost.

Semi-synchronous replication was introduced in MySQL 5.5, released in 2010, to address the data reliability problem of asynchronous replication. The master node must wait until at least one standby node has received the log and successfully written it to its relay log file. Only then does the client receive confirmation that replication is complete, and the transaction moves on to its next stage. Compared with asynchronous replication, semi-synchronous replication guarantees that the data exists in at least two places when a commit returns successfully, which improves data integrity.

However, semi-synchronous replication can still lose data. If a failure occurs and no standby node returns a confirmation, the transaction waits until it times out. The master node then degrades to asynchronous replication, and it resumes semi-synchronous replication only once at least one semi-synchronous standby node is back to normal. To improve data reliability, MySQL 5.7.17, released in 2016, introduced a new technology, MySQL Group Replication, described here as full synchronous replication. In full synchronous replication, when the master node executes a transaction it must wait for all standby nodes to execute that transaction before returning to the client, which guarantees that the data has been successfully replicated to every node. Because the master node must wait for all standby nodes, however, the time it needs to complete a transaction increases and performance drops sharply.

1 Principle of Fast Full Synchronous Replication Technology

1.1 Fast full synchronous replication

Fast full synchronous replication is a technology with which the master and standby nodes of a MySQL storage cluster replicate through the Binlog. Using optimizations such as thread resource reuse and batch acknowledgement during data replication, it responds quickly to client requests while ensuring that the master node's change log has already been transferred to the standby node. The architecture of the fast full synchronous replication technology is shown in Figure 1, and the specific steps are as follows:


Step 1: The master node (Master) receives a commit request from the client program. After completing the local commit, it sends the data change log to the standby node (Slave) through the Binlog Dump thread. At this point it does not yet report to the client program that the request succeeded.

Step 2: After the standby node has received n Binlog change logs, its IO thread writes them to the standby node's Relay Log. Once the write completes, it sends a write-success acknowledgement (ACK) back to the thread pool that serves the master node's ACK wait queue (ACK Wait Queue).

Step 3: After receiving the acknowledgement from the standby node, the master node's Wait thread pool responds to the client program, and processing of the request ends.
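The three steps can be sketched as a small single-process simulation. This is only an illustration under stated assumptions, not the actual implementation: the class and method names (Master, Standby, dump_binlog, on_ack) are hypothetical, the "ACK" is simply the highest relay-log position written, and networking and threads are left out.

```python
from collections import deque


class Standby:
    """Hypothetical standby node that appends received events to a relay log."""

    def __init__(self):
        self.relay_log = []

    def receive(self, events):
        # Step 2: write the received change logs to the relay log ...
        self.relay_log.extend(events)
        # ... and acknowledge the highest position that is now written.
        return len(self.relay_log)


class Master:
    """Hypothetical master node with an ACK wait queue."""

    def __init__(self, standby):
        self.standby = standby
        self.binlog = []
        self.sent = 0                   # binlog events already shipped
        self.ack_wait_queue = deque()   # (binlog position, client id)
        self.replies = []               # responses sent to clients

    def commit(self, client_id, change):
        # Step 1: commit locally, but do not answer the client yet.
        self.binlog.append(change)
        self.ack_wait_queue.append((len(self.binlog), client_id))

    def dump_binlog(self):
        # Step 1 (continued): ship every event that has not been sent yet.
        pending = self.binlog[self.sent:]
        self.sent = len(self.binlog)
        self.on_ack(self.standby.receive(pending))

    def on_ack(self, acked_pos):
        # Step 3: answer every client whose transaction the ACK covers.
        while self.ack_wait_queue and self.ack_wait_queue[0][0] <= acked_pos:
            _, client_id = self.ack_wait_queue.popleft()
            self.replies.append((client_id, "OK"))


master = Master(Standby())
master.commit("c1", "INSERT 1")
master.commit("c2", "INSERT 2")
print(master.replies)   # no client has been answered yet
master.dump_binlog()
print(master.replies)   # both clients answered after the ACK
```

Note that a single ACK releases both waiting clients, which is the batch acknowledgement idea mentioned in section 1.1.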

1.2 Features of fast full synchronous replication

The basic principles of semi-synchronous replication in MySQL 5.6 and MySQL 5.7 are shown in Figure 2.

Semi-synchronous replication in MySQL 5.6 uses the after-commit mechanism: the user transaction is committed on the master node first, and the user thread then waits for the standby node's acknowledgement (ACK) before reporting success to the client program. The enhanced semi-synchronous replication in MySQL 5.7 and later uses the after-sync mechanism: before the user transaction is committed on the master node, the user thread waits for the standby node's acknowledgement (ACK), and only then completes the commit and reports success to the client program. The difference between the two mechanisms affects data consistency between different transactions on the master node, but it brings no fundamental change in performance or reliability.
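The ordering difference between the two mechanisms can be made concrete with a toy model. The sketch below is illustrative only: the step names and both functions are hypothetical, and the wait-point labels borrow the values of the MySQL 5.7 system variable rpl_semi_sync_master_wait_point (AFTER_COMMIT and AFTER_SYNC).

```python
def commit_steps(wait_point):
    """Return the order of events on the master for one transaction."""
    steps = ["write binlog", "send binlog to standby"]
    if wait_point == "AFTER_SYNC":
        # Enhanced semi-sync: wait for the ACK, then commit in the engine.
        steps += ["wait for ACK", "engine commit"]
    elif wait_point == "AFTER_COMMIT":
        # Older semi-sync: commit in the engine, then wait for the ACK.
        steps += ["engine commit", "wait for ACK"]
    else:
        raise ValueError(f"unknown wait point: {wait_point}")
    steps.append("reply to client")
    return steps


def visible_before_ack(wait_point):
    """True if other sessions can see the change before the standby confirms."""
    steps = commit_steps(wait_point)
    return steps.index("engine commit") < steps.index("wait for ACK")


for wp in ("AFTER_COMMIT", "AFTER_SYNC"):
    print(wp, commit_steps(wp), visible_before_ack(wp))
```

In both orderings the client is answered only after the ACK. What changes is whether other transactions on the master can already read the committed data while the ACK is still outstanding.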

1.2.1 Thread reuse

In both the semi-synchronous replication of version 5.6 and the enhanced semi-synchronous replication of version 5.7 and later, a user session that is waiting for the standby node's acknowledgement (Wait ACK) occupies a thread the whole time and does not release it until the transaction completes. In a heavily loaded system, a large number of user sessions therefore tie up a large number of threads while waiting for ACKs, which hurts performance.

Fast full synchronous replication uses thread pool technology. For every transaction commit (including DDL, autocommit statements, COMMIT, XA PREPARE, XA COMMIT, and so on), once all commit steps are finished the master node waits for the standby node's acknowledgement before sending the response packet to the client. Because a thread pool is used, the transaction and its session occupy no operating system thread while waiting for the ACK, and the database's worker threads go on to process requests from other connections. This mechanism avoids wasting resources and thereby significantly improves performance.
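The effect of not pinning a thread to each waiting session can be illustrated with an analogy in Python's asyncio, where many parked coroutines share one operating system thread. This is a sketch by analogy only, not how the database thread pool is implemented; commit and run_sessions are hypothetical names, and a single asyncio.Event stands in for the standby node's ACK.

```python
import asyncio
import threading


async def commit(session_id, ack, log):
    # Local commit is done; record which OS thread served this session.
    log.append((session_id, "waiting for ACK", threading.get_ident()))
    await ack.wait()   # the session is parked and the thread is released
    log.append((session_id, "reply to client", threading.get_ident()))


async def run_sessions(n):
    log = []
    ack = asyncio.Event()   # stands in for the standby node's ACK
    tasks = [asyncio.create_task(commit(i, ack, log)) for i in range(n)]
    await asyncio.sleep(0)  # let every session reach its wait point
    waiting = len(log)      # sessions parked before any ACK arrived
    ack.set()               # the ACK arrives
    await asyncio.gather(*tasks)
    return waiting, log


waiting, log = asyncio.run(run_sessions(3))
print(waiting, len({thread_id for _, _, thread_id in log}))
```

All three sessions reach the wait point before the ACK arrives, yet only one thread identifier appears in the log, so no thread was held idle on behalf of a waiting session.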

1.2.2 Data reliability guarantee

In semi-synchronous replication, if data replication fails (the standby node is unavailable or the replication network is abnormal), the master node stops responding to the application for a while (about 10 s by default in MySQL), and the replication mode is then downgraded to asynchronous replication. It returns to semi-synchronous replication only after data replication is back to normal. While replication is downgraded to asynchronous, data reliability cannot be guaranteed, which is unacceptable in some business scenarios. By default, fast full synchronous replication cannot be downgraded to asynchronous replication (downgrading can be enabled through parameter configuration only in special cases), which ensures that data is not lost under any circumstances. The following analyzes, in two different scenarios, how fast full synchronous replication handles such exceptions.
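The difference in timeout handling can be sketched as follows. This is a minimal illustration with hypothetical names (wait_for_ack, allow_downgrade, max_waits); the article does not name the real configuration parameter. In stock MySQL semi-synchronous replication, the wait is governed by rpl_semi_sync_master_timeout, whose default is 10000 ms. The loop is bounded by max_waits only so that the example terminates.

```python
import queue


def wait_for_ack(acks, timeout_s, allow_downgrade, max_waits=3):
    """Wait for a standby ACK and report what the master does next.

    allow_downgrade=True models semi-synchronous replication, which falls
    back to asynchronous replication after one timeout.
    allow_downgrade=False models the default of fast full synchronous
    replication, which keeps waiting instead of answering the client.
    """
    for _ in range(max_waits):
        try:
            acks.get(timeout=timeout_s)
            return "acked"
        except queue.Empty:
            if allow_downgrade:
                return "downgraded"
    # No ACK arrived and downgrading is not allowed: the client is not answered.
    return "still waiting"


no_acks = queue.Queue()
print(wait_for_ack(no_acks, 0.01, allow_downgrade=True))
print(wait_for_ack(no_acks, 0.01, allow_downgrade=False))
```

The trade-off is visible in the return values: downgrading restores availability at the cost of reliability, while refusing to downgrade preserves reliability at the cost of blocking the commit.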

1.2.3 Batch write

The performance optimizations of fast full synchronous replication also include batch Relay Log writes and group commit. Batch Relay Log writing means that, according to a configured parameter, the standby node writes to the Relay Log only after it has received a certain number of Binlog events, and then returns ACK messages to the master node in batches. This improves the standby node's write efficiency to some extent and significantly improves replication performance.
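Batch Relay Log writing can be sketched as a buffer that is flushed once it holds a configured number of events, with one ACK per flush. The class name BatchingStandby and the parameter batch_size are hypothetical, and the explicit flush() call stands in for whatever mechanism the real system uses to write out a partially filled batch, which the article does not describe.

```python
class BatchingStandby:
    """Hypothetical standby that writes the relay log in batches."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []      # received but not yet written
        self.relay_log = []   # written events
        self.acks = []        # relay-log positions acknowledged so far
        self.writes = 0       # number of relay-log write operations

    def receive(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # One write and one ACK cover the whole batch.
        self.relay_log.extend(self.buffer)
        self.buffer.clear()
        self.writes += 1
        self.acks.append(len(self.relay_log))


standby = BatchingStandby(batch_size=4)
for i in range(10):
    standby.receive(f"event-{i}")
print(standby.writes, standby.acks, len(standby.buffer))
standby.flush()   # write out the partially filled last batch
print(standby.writes, standby.acks, len(standby.buffer))
```

Ten events cost three writes and three ACKs here instead of ten of each, which is where the efficiency gain comes from. The price is that a transaction in a partially filled batch waits longer for its ACK.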

Origin blog.csdn.net/Dream_Weave/article/details/132123908