Detailed explanation of the principles of MySQL master-slave synchronization

The principle in detail

MySQL master-slave replication involves three threads: one (the log dump thread) runs on the master node, and the other two (the I/O thread and the SQL thread) run on the slave node.

  • Binary log dump thread of the master node
    When a slave node connects to the master node, the master node creates a log dump thread to send the contents of the bin-log. While reading an event from the bin-log, this thread locks the bin-log on the master node. As soon as the read completes, the lock is released, even before the event is sent to the slave node.
  • Slave node I/O thread
    When the START SLAVE command is executed on the slave node, the slave node creates an I/O thread that connects to the master node and requests the updates recorded in the master's bin-log. When the I/O thread receives the updates sent by the master node's binlog dump thread, it saves them in the local relay-log.
  • Slave node SQL thread
    The SQL thread reads the content of the relay log, parses it into specific operations and executes them, which ultimately keeps the master and slave data consistent.
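
The hand-off between the three threads can be pictured with a small Python simulation. This is only a sketch under simplifying assumptions: the queues stand in for the network connection and the relay log, and every name in it (run_replication, dump_thread and so on) is made up for illustration rather than taken from MySQL.

```python
import queue
import threading


def run_replication(binlog):
    """Push bin-log events through dump thread -> I/O thread -> SQL thread."""
    network = queue.Queue()    # stands in for the master-slave connection
    relay_log = queue.Queue()  # stands in for the slave's relay log
    applied = []               # operations executed on the slave
    done = object()            # sentinel marking the end of the stream

    def dump_thread():
        # Master side: read each bin-log event and send it to the slave.
        for event in binlog:
            network.put(event)
        network.put(done)

    def io_thread():
        # Slave side: receive events and append them to the relay log.
        while True:
            event = network.get()
            relay_log.put(event)
            if event is done:
                break

    def sql_thread():
        # Slave side: read the relay log and execute each event in order.
        while True:
            event = relay_log.get()
            if event is done:
                break
            applied.append(event)

    threads = [threading.Thread(target=fn)
               for fn in (dump_thread, io_thread, sql_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return applied


binlog = ["INSERT a", "UPDATE a", "DELETE a"]
replayed = run_replication(binlog)
print(replayed)  # the slave ends up with the same operations, in the same order
```

Because each stage only talks to the next one through a queue, receiving events and applying them can proceed independently, which is the point of giving the slave two separate threads.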

Description:

  • Each master-slave connection requires three threads. When the master node has multiple slave nodes, it creates one binary log dump thread for each currently connected slave node, and each slave node has its own I/O thread and SQL thread.
  • The slave node uses two separate threads, so pulling updated logs from the master and replaying the SQL statements locally are independent tasks: reading the log from the master is not slowed down while the synchronized data is being applied.
  • To implement replication, the binary log (bin-log) must be enabled on the master; without it, replication is not possible.

Figure: replication execution flow.
The basic process of replication is as follows:

  • The I/O thread on the slave node connects to the master node and requests the log content after a specified position in a specified log file (or from the very beginning of the log);
  • When the master node receives the request, its log dump thread reads the log content after the specified position in the specified file and returns it to the slave node. Besides the log content, the response includes the bin-log file name and bin-log position that the returned content reaches;
  • When the slave node's I/O thread receives the content, it appends the log content to the local relay log and saves the bin-log file name and position it has read up to in the master-info file (author's note: this file no longer seems to exist and may have been replaced by another one; newer MySQL versions can keep this information in the mysql.slave_master_info table), so that on the next request it can tell the master exactly which bin-log file and position to continue from;
  • When the slave's SQL thread detects that new content has been added to the relay-log, it parses that content into the operations that were actually performed on the master node and executes them in its own database.
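
The role of the saved file name and position can be sketched in Python as follows. This is a simplified model, not MySQL code: a bin-log file is a plain list, the list index plays the role of the bin-log position, and the classes Master and Slave and their methods are hypothetical names chosen for the example.

```python
class Master:
    def __init__(self, file_name, events):
        self.file_name = file_name
        self.events = list(events)

    def dump(self, file_name, position):
        """Return the events after `position` and the coordinates reached."""
        if file_name != self.file_name:
            raise ValueError("unknown bin-log file: " + file_name)
        return self.events[position:], self.file_name, len(self.events)


class Slave:
    def __init__(self, file_name):
        # Plays the role of the master-info record: where to resume from.
        self.master_info = {"file": file_name, "position": 0}
        self.relay_log = []
        self.applied = 0   # how much of the relay log has been executed
        self.data = []

    def io_step(self, master):
        # I/O thread: request everything after the saved coordinates.
        events, file_name, position = master.dump(
            self.master_info["file"], self.master_info["position"])
        self.relay_log.extend(events)
        self.master_info = {"file": file_name, "position": position}

    def sql_step(self):
        # SQL thread: execute only the relay-log entries not yet applied.
        for event in self.relay_log[self.applied:]:
            self.data.append(event)
        self.applied = len(self.relay_log)


master = Master("mysql-bin.000001", ["e1", "e2"])
slave = Slave("mysql-bin.000001")
slave.io_step(master)
slave.sql_step()

master.events.append("e3")   # a new write arrives on the master
slave.io_step(master)        # only "e3" is requested this time
slave.sql_step()
print(slave.data, slave.master_info)
```

Since the slave remembers how far it has read, the second request transfers only the new event, and nothing is applied twice.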

Replication modes

MySQL master-slave replication is asynchronous by default. All insert, delete and update operations on the master are recorded in the binary log. When a slave node connects to the master, it actively fetches the latest bin log content from the master, and its I/O thread writes the events from the bin log into the local relay log.

  • Asynchronous mode (mysql async-mode)
    In this mode, the master node does not actively push the bin log to the slave nodes and does not wait for them before committing. As a result, if a failover occurs, the slave node may not yet have synchronized the latest bin log.
  • Semi-synchronous mode (mysql semi-sync)
    In this mode, the master node commits as soon as it receives an acknowledgment from one of the slave nodes; otherwise it waits until the timeout expires, switches to asynchronous mode, and then commits. The purpose is to reduce the data delay between master and slave and to improve data safety: when a transaction is committed, its binlog has been transmitted to at least one slave node. There is no guarantee, however, that the slave node has already applied the transaction to its database. Performance drops somewhat, and response times become longer.
    Semi-synchronous mode is not built into MySQL. Starting from MySQL 5.5, a plug-in must be installed on both the master and the slave to enable it.
  • Full synchronization mode
    In full synchronization mode, the master node returns success to the client only after the master node and all slave nodes have executed and confirmed the commit. The principle is the same as in semi-synchronous mode.
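
The difference between the three modes comes down to what the master waits for before reporting success. The Python sketch below encodes only the rules described above; the function may_commit and its parameters are hypothetical and do not correspond to any MySQL API.

```python
def may_commit(mode, slave_acks, timeout_hit=False):
    """Decide whether the master may report success to the client.

    mode        -- "async", "semi-sync" or "full-sync"
    slave_acks  -- one bool per slave: has it confirmed the transaction?
                   (semi-sync: binlog received; full-sync: commit executed)
    timeout_hit -- semi-sync only: has the wait for an ack timed out?
    Returns (committed, effective_mode).
    """
    if mode == "async":
        # The master never waits for the slaves.
        return True, "async"
    if mode == "semi-sync":
        if any(slave_acks):
            return True, "semi-sync"   # one acknowledgment is enough
        if timeout_hit:
            return True, "async"       # fall back to async, then commit
        return False, "semi-sync"      # keep waiting
    if mode == "full-sync":
        # Every slave must have confirmed.
        return all(slave_acks), "full-sync"
    raise ValueError("unknown mode: " + mode)


print(may_commit("async", [False, False]))
print(may_commit("semi-sync", [False, True]))
print(may_commit("semi-sync", [False, False], timeout_hit=True))
print(may_commit("full-sync", [True, False]))
```

The trade-off is visible in the conditions: the more confirmations the master requires, the safer the data and the longer the client waits.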

Binlog record formats

MySQL master-slave replication can work in three ways:

  • Statement-based replication (SBR)
  • Row-based replication (RBR)
  • Mixed-format replication (MBR)

Correspondingly, there are three binlog formats:

  • STATEMENT
  • ROW
  • MIXED

SQL statement-based replication

Statement-based replication (SBR) records SQL statements in the bin log. It was the format used in MySQL 5.1.4 and earlier versions.
Advantages: only the SQL statements that modify data need to be recorded in the binlog, which reduces the volume of the binlog, saves I/O, and improves performance.
Disadvantages: in some cases the data on the master and slave nodes becomes inconsistent (for example with functions such as sleep(), now(), etc.).

Row-based replication

Row-based replication (RBR) means that the MySQL master breaks SQL statements down into row changes and records those in the bin log, that is, it records which data was modified and what it was changed to.
Advantages: it avoids the specific cases in which stored procedures, functions, or triggers cannot be replicated correctly.
Disadvantages: a large amount of log data is generated. Especially when a statement changes many rows of a table, the log grows sharply and bin log synchronization takes longer. It is also not possible to recover the executed SQL statements by analyzing the bin log; only the resulting data changes can be seen.
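
The contrast between the two formats can be reproduced with Python's built-in sqlite3 module standing in for the master and slave servers. This is only an illustration under stated assumptions: node_name() is a made-up SQL function, registered per connection, that represents any function whose result depends on the server that evaluates it.

```python
import sqlite3


def make_node(name):
    """Create an in-memory database that plays the role of one server."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical function: returns a different value on every server.
    conn.create_function("node_name", 0, lambda: name)
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, origin TEXT)")
    return conn


def rows(conn):
    return conn.execute("SELECT id, origin FROM t ORDER BY id").fetchall()


statement = "INSERT INTO t (id, origin) VALUES (1, node_name())"

master = make_node("master")
master.execute(statement)

# Statement-based: the slave re-executes the SQL text itself.
sbr_slave = make_node("slave")
sbr_slave.execute(statement)

# Row-based: the slave receives the row values produced on the master.
rbr_slave = make_node("slave")
rbr_slave.executemany("INSERT INTO t (id, origin) VALUES (?, ?)", rows(master))

print(rows(master))     # [(1, 'master')]
print(rows(sbr_slave))  # [(1, 'slave')]   differs from the master
print(rows(rbr_slave))  # [(1, 'master')]  identical to the master
```

The statement-based slave logs less data but re-evaluates the function and diverges; the row-based slave copies the final values and stays consistent, at the cost of shipping every changed row.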

Mixed mode replication

Mixed-format replication (MBR) is the format used by MySQL NDB Cluster 7.3 and 7.4. It is a mixture of the two modes above: ordinary statements are saved to the binlog in STATEMENT format, and operations that cannot be replicated safely in STATEMENT format are saved in ROW format. MySQL chooses the logging format according to the SQL statement being executed.
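
The per-statement choice can be sketched in Python as below. This is a deliberately rough approximation: the real rules MySQL applies are considerably more involved, the UNSAFE_FUNCTIONS set is only an illustrative subset, and choose_format is a hypothetical helper, not a MySQL function.

```python
import re

# Illustrative subset of functions that are unsafe to replay as a statement.
UNSAFE_FUNCTIONS = {"UUID", "USER", "FOUND_ROWS"}


def choose_format(sql):
    """Pick a binlog format for one statement (rough sketch only)."""
    # Collect every identifier that is directly followed by "(".
    called = {name.upper() for name in re.findall(r"([A-Za-z_]+)\s*\(", sql)}
    return "ROW" if called & UNSAFE_FUNCTIONS else "STATEMENT"


print(choose_format("UPDATE t SET n = n + 1 WHERE id = 1"))  # STATEMENT
print(choose_format("INSERT INTO t VALUES (1, UUID())"))     # ROW
```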

GTID replication mode

In traditional replication, a failure requires a master-slave switch: you have to find the binlog file and position (pos) and then point the slave nodes to the new master node, which is tedious and error-prone. Since MySQL 5.6 there is no need to search for the binlog and pos. We only need to know the IP, port, account and password of the master node, because replication positions itself automatically: MySQL finds the point to synchronize from through its internal GTID mechanism.

In versions prior to MySQL 5.6, replication on the slave was single-threaded: events were read and applied one at a time while the master wrote concurrently, so delay was unavoidable. The only effective workaround was to spread multiple databases across multiple slaves, which wastes servers. In MySQL 5.6, we can place the tables in multiple databases, so that multi-threaded replication can be used.
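
The idea of applying events for different databases in parallel can be sketched in Python. This is a minimal illustration under the assumption that events from different databases are independent; apply_parallel and the event tuples are hypothetical and do not mirror MySQL's internal scheduler.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def apply_parallel(events, workers=2):
    """Apply (database, operation) events, one task per database."""
    per_db = defaultdict(list)
    for database, operation in events:
        per_db[database].append(operation)  # bin-log order is kept per database

    def apply_all(item):
        database, operations = item
        applied = []
        for operation in operations:
            applied.append(operation)       # stands in for executing the event
        return database, applied

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(apply_all, per_db.items()))


events = [("shop", "a"), ("blog", "b"), ("shop", "c")]
result = apply_parallel(events)
print(result)
```

Order is preserved inside each database, while different databases can progress at the same time. If all tables live in a single database, there is only one task and nothing runs in parallel, which is why the tables have to be spread over several databases.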

  • How GTID-based replication works
  1. When the master node updates data, it generates a GTID before the transaction and records both together in the binlog;
  2. The I/O thread of the slave node writes the changed bin log to the local relay log;
  3. The SQL thread reads the GTID from the relay log and checks whether the local binlog already contains a record of it (which is why the MySQL slave node must enable the binary log);
  4. If there is a record, the transaction with that GTID has already been executed and the slave node ignores it; if there is no record, the slave node executes the transaction from the relay log and records it in its bin log;
  5. While applying the transaction, the slave checks whether the table has a primary key; if not, it uses a secondary index; if there is none either, it performs a full scan.
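
Steps 3 and 4 amount to a membership check against the set of GTIDs the slave has already executed. The Python sketch below illustrates that check; the class GtidSlave is hypothetical, and the GTID strings are made-up placeholders that only imitate the general source_id:transaction_id shape.

```python
class GtidSlave:
    def __init__(self):
        self.executed_gtids = set()  # GTIDs recorded in the slave's own bin log
        self.binlog = []
        self.data = []

    def apply_relay_log(self, relay_log):
        """Apply (gtid, operation) pairs; return the GTIDs that were skipped."""
        skipped = []
        for gtid, operation in relay_log:
            if gtid in self.executed_gtids:
                skipped.append(gtid)               # already executed: ignore
                continue
            self.data.append(operation)            # execute the transaction
            self.binlog.append((gtid, operation))  # record it in the bin log
            self.executed_gtids.add(gtid)
        return skipped


slave = GtidSlave()
first = slave.apply_relay_log([("srv-a:1", "INSERT 1"), ("srv-a:2", "INSERT 2")])
# After a reconnect the master resends transaction 2 along with transaction 3.
second = slave.apply_relay_log([("srv-a:2", "INSERT 2"), ("srv-a:3", "INSERT 3")])
print(slave.data, second)
```

Because every transaction carries its own identifier, the slave does not need a file name and position to know where to resume: anything it has already executed is simply skipped.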

Summary

MySQL master-slave replication is the basis of MySQL's high availability and high performance. With this foundation, MySQL deployments become simple, flexible and diverse, and can be adjusted to suit different business scenarios.



Origin blog.csdn.net/weixin_44729138/article/details/115311756