MySQL master-slave replication principle

Original author: inspiration cloud

Original address: In-depth exploration of the principle of MySQL master-slave replication

Table of contents

1. MySQL master-slave replication concept

2. The main purpose of MySQL master-slave replication

3. MySQL master-slave form

4. MySQL master-slave replication principle

5. MySQL master-slave replication mode

6. Binlog record format

7. GTID replication mode


What is MySQL Replication (MySQL master-slave replication)? Why use master-slave replication, and how is it implemented?

1. MySQL master-slave replication concept

MySQL master-slave replication means that data is replicated from one MySQL database server (the master node) to one or more other servers (the slave nodes). MySQL uses asynchronous replication by default, so a slave node does not need to stay connected to the master permanently in order to update its own data, and updates can be carried over a remote connection. A slave node can replicate all databases on the master, a specific database, or a specific table.

2. The main purpose of MySQL master-slave replication

  • Read/write splitting: In development work, an SQL statement sometimes has to lock a table, which temporarily blocks reads and affects the running business. With master-slave replication, the master handles writes and the slaves handle reads, so even when a table is locked on the master, the business can keep running by reading from a slave.
  • Real-time data backup: when a node in the system fails, failing over to another node is straightforward.
  • High availability (HA).
  • Architecture scaling: As business traffic grows, a database deployed on a single machine suffers from an excessive I/O access frequency. With master-slave replication, more data storage nodes are added and the load is distributed across multiple slave nodes, which lowers the disk I/O frequency on each machine and improves per-machine I/O performance.
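The read/write splitting idea can be sketched in a few lines of Python. This is only an illustration: the `ReadWriteRouter` class, the node names, and the rule "statements starting with SELECT or SHOW are reads" are assumptions made for the example, not part of MySQL or of any particular proxy.

```python
import itertools


class ReadWriteRouter:
    """Route writes to the master and spread reads across the slaves."""

    # Statement prefixes treated as reads in this sketch (an assumption;
    # a real router must also handle transactions, locking reads, etc.).
    READ_PREFIXES = ("select", "show")

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slave is configured.
        self._readers = itertools.cycle(slaves or [master])

    def route(self, sql):
        """Return the name of the node that should run this statement."""
        if sql.lstrip().lower().startswith(self.READ_PREFIXES):
            return next(self._readers)  # round-robin over the slaves
        return self.master


router = ReadWriteRouter("master", ["slave-1", "slave-2"])
targets = [router.route(q) for q in (
    "SELECT * FROM orders",
    "UPDATE orders SET paid = 1 WHERE id = 7",
    "SELECT COUNT(*) FROM orders",
)]
```

Because replication is asynchronous by default, a read routed to a slave may briefly return older data than the master holds; an application that needs to read its own writes immediately would have to send those reads to the master.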

3. MySQL master-slave form

One master, one slave:

One master, multiple slaves: One-master-one-slave and one-master-multi-slave are the most common master-slave architectures, and they are simple and effective to implement. They provide HA and also allow reads and writes to be separated, which improves the concurrency the cluster can handle.

Multiple masters, one slave (supported since 5.7): With this layout, several MySQL databases can be backed up to one server with better storage performance.

Dual-master replication: the two servers replicate from each other, so each one is both a master and a slave of the other. Changes made on either side are replicated to the other side's database.

Cascading replication: In cascading replication, some slaves synchronize not from the master node but from another slave node. If the master has too many slaves, it loses part of its performance to replication, so 3 to 5 slaves can be connected to the master and the remaining slaves connected to those slaves as a second or third tier. This relieves the pressure on the master without a negative impact on data consistency.
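As a rough sketch of how a cascading layout could be planned, the helper below assigns each slave an upstream node so that no node feeds more than a fixed number of slaves. The function name `build_cascade`, the node names, and the breadth-first fill order are assumptions made for illustration; MySQL itself does not arrange the topology for you.

```python
from collections import deque


def build_cascade(master, slaves, max_fanout=3):
    """Return {slave: upstream} so that no node feeds more than max_fanout slaves."""
    if max_fanout < 1:
        raise ValueError("max_fanout must be at least 1")
    upstream = {}
    children = {master: 0}
    # Nodes that may still accept downstream slaves, in breadth-first order.
    parents = deque([master])
    for slave in slaves:
        while children[parents[0]] >= max_fanout:
            parents.popleft()  # this node is full, move on to the next one
        parent = parents[0]
        upstream[slave] = parent
        children[parent] += 1
        children[slave] = 0
        parents.append(slave)  # the new slave can itself become an upstream
    return upstream


topology = build_cascade("master", [f"s{i}" for i in range(1, 8)], max_fanout=3)
```

With seven slaves and a fan-out of 3, the first three slaves attach to the master and the rest attach to those slaves as a second tier.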

4. MySQL master-slave replication principle

MySQL master-slave replication involves three threads, one running on the master node (log dump thread), and the other two (I/O thread, SQL thread) running on the slave node, as shown in the following figure:

  • Binary log dump thread of the master node

When a slave node connects to the master node, the master creates a log dump thread to send the contents of the binary log (binlog). While reading an event from the binlog, this thread locks the binlog on the master. The lock is released as soon as the read completes, even before the event is sent to the slave.

  • Slave node I/O thread

After `start slave` is executed on the slave node, the slave creates an I/O thread that connects to the master and requests the updates recorded in the master's binlog. When the I/O thread receives the updates sent by the master's binlog dump thread, it saves them in the local relay log.

  • Slave node SQL thread

The SQL thread reads the contents of the relay log, parses them into concrete operations, and executes them, which ultimately keeps the master and slave data consistent.

Each master-slave connection needs three threads. When the master has multiple slaves, it creates one binary log dump thread for each currently connected slave, and each slave has its own I/O thread and SQL thread. Using two threads on the slave splits pulling updates from the master and applying them into independent tasks, so that pulling updates is not slowed down by applying them. For example, if the slave has not been running for a while, its I/O thread can quickly fetch all pending updates from the master once the slave starts, even if the SQL thread is still far behind. If the slave service then stops before the SQL thread has applied everything, the I/O thread has at least pulled the latest changes and stored them in the local relay log, so data synchronization can complete when the service comes back up.

To implement replication, the binary log (binlog) must first be enabled on the master; without it, replication is not possible. This is because the whole replication process consists of the slave fetching the log from the master and then executing the operations recorded in it, in exactly the same order, on itself. As shown below:

The basic replication process is as follows:

  • The I/O thread on the slave node connects to the master node and requests the log content after a specified position in a specified log file (or the log from the very beginning).
  • After the master receives the request from the slave's I/O thread, its binlog dump thread reads the log from the requested position onward and returns it to the slave. Besides the log content itself, the response includes the binlog file name and the binlog position that the content corresponds to. When the slave's I/O thread receives the content, it appends it to the local relay log and saves the binlog file name and position it has read up to in the master-info file, so that on the next request it can tell the master precisely: "I need the log from this position in this binlog file onward; please send it to me."
  • When the slave's SQL thread detects new content in the relay log, it parses that content into the operations that were actually performed on the master node and executes them in its own database.
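The three steps above can be mimicked with a small in-memory model. This is a deliberately simplified sketch, not how MySQL is implemented: the `Master` and `Slave` classes are hypothetical, the binlog is a single Python list whose index stands in for the file name plus position, and the "threads" are ordinary method calls.

```python
class Master:
    def __init__(self):
        # One list entry per event; the list index plays the role of the position.
        self.binlog = []

    def execute(self, statement):
        self.binlog.append(statement)

    def dump(self, position):
        """Binlog dump thread: return the events after `position` and the new position."""
        events = self.binlog[position:]
        return events, position + len(events)


class Slave:
    def __init__(self):
        self.master_position = 0  # stands in for the master-info file
        self.relay_log = []
        self.applied = 0          # how far the SQL thread has read the relay log
        self.data = []

    def io_thread_step(self, master):
        """I/O thread: fetch new events and append them to the relay log."""
        events, self.master_position = master.dump(self.master_position)
        self.relay_log.extend(events)

    def sql_thread_step(self):
        """SQL thread: replay relay-log events that have not been applied yet."""
        for event in self.relay_log[self.applied:]:
            self.data.append(event)  # "executing" an event just records it here
        self.applied = len(self.relay_log)


master, slave = Master(), Slave()
master.execute("INSERT 1")
master.execute("INSERT 2")
slave.io_thread_step(master)   # the relay log now holds both events
master.execute("INSERT 3")     # the master keeps writing in the meantime
slave.sql_thread_step()        # applies the two events already fetched
slave.io_thread_step(master)   # resumes from the saved position
slave.sql_thread_step()
```

Because the slave remembers the position it has read up to, the second fetch returns only the third event, and the slave ends up with the same sequence of operations as the master.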

5. MySQL master-slave replication mode

MySQL master-slave replication is asynchronous by default. All insert, delete, and update operations in MySQL are recorded in the binary log. When a slave node connects to the master, it actively fetches the latest binlog content from the master and stores it in its local relay log, from which it is then applied.

  • Asynchronous mode (mysql async-mode)

The asynchronous mode is shown in the figure below. In this mode, the master commits without waiting for any slave to confirm that it has received the binlog. As a result, if a failover occurs, the slave may not yet have synchronized the latest binlog content to its local copy.

  • Semi-synchronous mode (mysql semi-sync)

In this mode, the master commits as soon as it receives an acknowledgement from one of the slaves; otherwise it waits until the timeout expires, then switches to asynchronous mode and commits. The goal is to reduce the replication lag between master and slave and to improve data safety: once a transaction is committed, its binlog events are guaranteed to have been transferred to at least one slave, although there is no guarantee that the slave has already applied the transaction to its database. The price is some loss of performance and a longer response time. As shown below:


Semi-synchronous mode is not built into the MySQL server itself. It has been available since MySQL 5.5 as a plugin, which must be installed on both the master and the slaves to enable the mode.

  • Full synchronization mode

Full synchronization mode means that the master node and all slave nodes must execute the commit and confirm it before success is returned to the client.
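The difference between the three modes comes down to how many acknowledgements the master waits for before it commits. The sketch below models only that waiting rule. The function `commit`, its mode names, and the `ack_delays` list are assumptions made for illustration; in real semi-synchronous replication an acknowledgement means the slave has received the events, not that it has applied them.

```python
def commit(mode, ack_delays, timeout=1.0):
    """Return (wait_time, effective_mode) for one commit on the master.

    ack_delays: seconds each slave needs to acknowledge the transaction.
    """
    if mode == "async":
        return 0.0, "async"                    # never waits for a slave
    if mode == "semi-sync":
        # One acknowledgement is enough; no slaves means nobody can answer.
        fastest = min(ack_delays, default=float("inf"))
        if fastest <= timeout:
            return fastest, "semi-sync"
        return timeout, "async"                # timed out: fall back to async
    if mode == "full-sync":
        # Every slave has to confirm before the client sees success.
        return max(ack_delays, default=0.0), "full-sync"
    raise ValueError(f"unknown mode: {mode}")


fast = commit("semi-sync", [0.2, 0.5])
degraded = commit("semi-sync", [2.0, 3.0], timeout=1.0)
strict = commit("full-sync", [0.2, 0.5])
```

Semi-synchronous commit latency follows the fastest slave, full synchronization follows the slowest one, and a semi-synchronous master that gets no answer within the timeout continues as if it were asynchronous.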

6. Binlog record format

MySQL master-slave replication can work in three ways: statement-based replication (SBR), row-based replication (RBR), and mixed-format replication (MBR). The binlog correspondingly has three formats: STATEMENT, ROW, and MIXED.

  • Statement-based replication (SBR) records the SQL statements themselves in the binlog. This is the replication format used by MySQL 5.1.4 and earlier versions. The advantage is that only the SQL statements that modify data need to be recorded, which reduces the volume of the binlog, saves I/O, and improves performance. The disadvantage is that in some cases the data on the master and slave nodes becomes inconsistent (for example, with statements that call non-deterministic functions such as SLEEP() or UUID()).

  • Row-based replication (RBR) means the MySQL master breaks SQL statements down into row-level changes and records those in the binlog; that is, it records only which rows were modified and what they were changed to. The advantage is that it avoids the specific situations in which stored procedures, functions, or triggers cannot be replicated correctly. The disadvantage is that it generates a large volume of logs, especially when a statement modifies many rows of a table, so the log grows sharply and binlog synchronization takes longer. It is also impossible to recover the executed SQL statements by analyzing the binlog; only the resulting data changes are visible.

  • Mixed-format replication (MBR) is the format used by MySQL NDB Cluster 7.3 and 7.4. It is a mixture of the two modes above. Ordinary statements are saved to the binlog in STATEMENT format, and operations that cannot be replicated safely in STATEMENT format are saved in ROW format. MySQL chooses the logging format according to the SQL statement being executed.
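A small Python sketch can show why the format matters when a statement is non-deterministic. It is only a model: `run_statement` stands in for a statement that stores a freshly generated UUID (an assumed example), and `choose_format` is a heavily simplified stand-in for the decision MIXED format makes, not MySQL's actual rule set.

```python
import uuid


def run_statement():
    """Stand-in for a non-deterministic statement that stores a new UUID."""
    return uuid.uuid4().hex


# The master executes the statement once and stores the resulting row.
master_row = run_statement()

# STATEMENT format: the slave re-executes the statement and gets its own value.
sbr_slave_row = run_statement()

# ROW format: the slave receives the row image that was produced on the master.
rbr_slave_row = master_row


def choose_format(is_deterministic):
    """MIXED format, simplified: log rows only when the statement is unsafe."""
    return "STATEMENT" if is_deterministic else "ROW"
```

Replaying the statement leaves the slave with a different value than the master, while shipping the row keeps both sides identical at the cost of logging the data itself.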

7. GTID replication mode

  • In traditional replication, a failure requires a master-slave switchover: you have to find the binlog file and position and then point the slave nodes at the new master node, which is tedious and error-prone. Starting with MySQL 5.6, there is no need to look up the binlog file and position. You only need the IP address, port, account, and password of the master node, because MySQL finds the synchronization point automatically through its internal GTID mechanism.

  • Multi-threaded replication (per database): in versions before MySQL 5.6, replication on the slave was single-threaded and applied events one at a time, while the master writes concurrently, so lag was unavoidable. The only effective workaround was to spread multiple databases over multiple slaves, which wastes servers. In MySQL 5.6, tables can be distributed over multiple databases so that multi-threaded replication can apply those databases in parallel.
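The idea behind per-database parallelism can be sketched as follows: events for the same database always go to the same worker, so their order is preserved, while different databases can be applied side by side. The function `assign_to_workers`, the database names, and the use of a CRC32 hash to pick a worker are all assumptions made for illustration; this is not a description of MySQL's internal scheduler.

```python
import zlib
from collections import defaultdict


def assign_to_workers(events, worker_count):
    """Group (database, statement) events per worker, keyed by database name."""
    queues = defaultdict(list)
    for database, statement in events:
        # crc32 is used only because it is stable across runs; the real
        # scheduler inside MySQL is not claimed to work this way.
        worker = zlib.crc32(database.encode()) % worker_count
        queues[worker].append((database, statement))
    return dict(queues)


events = [
    ("shop", "INSERT 1"),
    ("blog", "INSERT 2"),
    ("shop", "UPDATE 3"),
    ("blog", "DELETE 4"),
]
queues = assign_to_workers(events, worker_count=4)
```

If all tables live in a single database, every event lands in the same queue and nothing is gained, which is why the data has to be spread over several databases for this scheme to help.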

How GTID-based replication works:

  • When the master node updates data, it generates a GTID before the transaction and records it in the binlog together with the transaction.
  • The I/O thread of the slave node writes the changed binlog content to the local relay log.
  • The SQL thread reads the GTID from the relay log and checks whether the local binlog already contains a record of it (which is why the MySQL slave node must have the binary log enabled).
  • If there is a record, the transaction with that GTID has already been executed, and the slave node ignores it.
  • If there is no record, the slave node executes the transaction with that GTID from the relay log and records it in its binlog.
  • While applying the changes, the slave checks whether the table has a primary key. If it does not, a secondary index is used; if there is no secondary index either, a full table scan is performed.
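The skip-or-apply decision in the steps above can be modelled in a few lines. This is a minimal sketch under stated assumptions: the `GtidSlave` class is hypothetical, the set `executed` stands in for the GTIDs recorded in the slave's own binlog, and the source UUID is an example value.

```python
class GtidSlave:
    def __init__(self):
        self.executed = set()  # GTIDs already recorded in the slave's own binlog
        self.binlog = []       # (gtid, statement) pairs applied locally

    def apply(self, relay_log):
        """SQL thread: apply each relay-log transaction at most once."""
        skipped = []
        for gtid, statement in relay_log:
            if gtid in self.executed:
                skipped.append(gtid)  # already executed: ignore it
                continue
            self.binlog.append((gtid, statement))
            self.executed.add(gtid)
        return skipped


# GTIDs have the form source_uuid:transaction_id; this UUID is an example value.
SOURCE = "3e11fa47-71ca-11e1-9e33-c80aa9429562"
relay_log = [(f"{SOURCE}:{n}", f"INSERT {n}") for n in (1, 2, 3)]

slave = GtidSlave()
slave.apply(relay_log[:2])
skipped = slave.apply(relay_log)  # transactions 1 and 2 arrive a second time
```

Because every transaction carries a globally unique identifier, the slave can be pointed at a new master and simply skip whatever it has already executed, instead of relying on a binlog file name and position.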

Summary

MySQL master-slave replication is the foundation of MySQL's high availability and high performance. With this foundation, MySQL deployments become simple, flexible, and diverse, and can be adjusted to fit different business scenarios.

Origin blog.csdn.net/sanmi8276/article/details/113094650