Monitoring and mitigation of MySQL master-slave replication delay

MySQL master-slave replication can fall behind for many reasons, and this is widely recognized. This article covers how to monitor replication delay and how to reduce it as far as possible.

 

Monitoring the delay

Seconds_Behind_Master

Run SHOW SLAVE STATUS on the slave and watch the value of the Seconds_Behind_Master column. The standby obtains Seconds_Behind_Master by comparing the server's current timestamp with the event timestamp recorded in the binary log. The clocks of the master and slave servers may differ, but the slave compares its time with the master's once after connecting and records the offset. As long as the I/O thread is not lagging, this value is accurate.
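
The calculation described above can be sketched roughly as follows. This is only an illustration of the arithmetic, not the server's actual code; the function and parameter names are made up for the example.

```python
def seconds_behind_master(slave_now: int, clock_offset: int, event_timestamp: int) -> int:
    """Estimate replication lag in seconds.

    slave_now:       current Unix time on the slave
    clock_offset:    slave clock minus master clock, recorded once after connecting
    event_timestamp: master-side Unix timestamp of the binlog event being applied
    """
    lag = slave_now - clock_offset - event_timestamp
    return max(lag, 0)  # clamp small negative values caused by clock jitter


# Hypothetical numbers: the slave clock runs 5 s ahead of the master,
# and the event being applied was written 12 s ago in master time.
print(seconds_behind_master(slave_now=1_000_017, clock_offset=5, event_timestamp=1_000_000))  # 12
```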

 

Master_Log_Pos

If the I/O thread is lagging, the value of the Seconds_Behind_Master column is inaccurate. In that case, run SHOW MASTER STATUS on the master and record the File and Position values, then run SHOW SLAVE STATUS on the slave and check Read_Master_Log_Pos and Exec_Master_Log_Pos. Comparing the binlog positions on the master and on the slave shows whether the slave is behind.
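
A minimal sketch of that comparison, assuming the two result rows have already been fetched into dictionaries keyed by column name. The helper name binlog_gap and the sample values are hypothetical; the column names are those printed by SHOW MASTER STATUS and SHOW SLAVE STATUS.

```python
def binlog_gap(master_status: dict, slave_status: dict) -> dict:
    """Compare binlog coordinates from SHOW MASTER STATUS and SHOW SLAVE STATUS.

    Each gap is a byte count, or None when the two sides are on different
    binlog files (the positions are then not directly comparable).
    """
    def gap(file_a, pos_a, file_b, pos_b):
        return pos_a - pos_b if file_a == file_b else None

    return {
        # how far the I/O thread is behind the master's current write position
        "io_gap": gap(master_status["File"], master_status["Position"],
                      slave_status["Master_Log_File"], slave_status["Read_Master_Log_Pos"]),
        # how far the SQL thread is behind what the I/O thread has already fetched
        "sql_gap": gap(slave_status["Master_Log_File"], slave_status["Read_Master_Log_Pos"],
                       slave_status["Relay_Master_Log_File"], slave_status["Exec_Master_Log_Pos"]),
    }


# Hypothetical sample values
master = {"File": "mysql-bin.000012", "Position": 4096}
slave = {
    "Master_Log_File": "mysql-bin.000012", "Read_Master_Log_Pos": 3072,
    "Relay_Master_Log_File": "mysql-bin.000012", "Exec_Master_Log_Pos": 1024,
}
print(binlog_gap(master, slave))  # {'io_gap': 1024, 'sql_gap': 2048}
```

A non-zero io_gap points at the I/O thread (network or master side), while a non-zero sql_gap points at the SQL thread applying events on the slave.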

 

Use pt-heartbeat or mk-heartbeat monitoring tools

The tool measures MySQL replication delay: it can update the master or monitor a slave, and it can read its configuration from my.cnf. It works by comparing timestamps. First, the clocks of the master and slave servers must agree, so both should be synchronized against the same NTP server. The tool needs a heartbeat table on the master whose ts column holds the current timestamp now(); the table structure is replicated to the slave as well. Once the table exists, a background process on the master updates a row in it at a regular interval, 1 second by default. On the slave, a monitoring command also runs in the background. It checks at the master's interval plus 0.5 s (the default check delay) and compares the replicated ts value, which is the value written on the master, with the current time. A difference of 0 means no delay; the larger the difference, the longer the delay in seconds.
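
The timestamp comparison at the heart of this approach can be sketched as below. This only illustrates the idea and is not the tool's code; heartbeat_lag is a made-up name, and the sketch assumes both clocks are synchronized as described.

```python
from datetime import datetime, timedelta


def heartbeat_lag(replicated_ts: datetime, slave_now: datetime) -> float:
    """Lag in seconds: the slave's current time minus the ts value that arrived via replication."""
    return max((slave_now - replicated_ts).total_seconds(), 0.0)


# Hypothetical example: the master wrote ts at 12:00:00 and the slave
# checks 3.5 seconds later while still holding that row.
master_write = datetime(2024, 1, 1, 12, 0, 0)
check_time = master_write + timedelta(seconds=3.5)
print(heartbeat_lag(master_write, check_time))  # 3.5
```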

 

Use pt-table-checksum to determine whether the master and standby are consistent

Replication delay or network problems do not always leave the master and standby data completely inconsistent. Master-slave consistency should be the norm, not the exception, so checking it should be a routine task, especially when a standby is used for backups. pt-table-checksum can confirm whether the data on the master and the standby match.
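
To show the general idea of checksum-based comparison, here is a small sketch that checksums rows in chunks and reports the chunks that differ. It is not how pt-table-checksum is implemented; the function names, the chunk size, and the in-memory row lists are assumptions made for illustration.

```python
import zlib


def chunk_checksums(rows, chunk_size=2):
    """Return one CRC32 per chunk of rows; rows must be in the same order on both sides."""
    sums = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        data = "\n".join("\t".join(map(str, row)) for row in chunk)
        sums.append(zlib.crc32(data.encode("utf-8")))
    return sums


def differing_chunks(master_rows, standby_rows, chunk_size=2):
    """Indexes of chunks whose checksums differ between master and standby."""
    m = chunk_checksums(master_rows, chunk_size)
    s = chunk_checksums(standby_rows, chunk_size)
    diffs = [i for i, (a, b) in enumerate(zip(m, s)) if a != b]
    # chunks that exist on only one side also count as different
    diffs.extend(range(min(len(m), len(s)), max(len(m), len(s))))
    return diffs


# Hypothetical data: the last row differs on the standby
master_rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
standby_rows = [(1, "a"), (2, "b"), (3, "c"), (4, "x")]
print(differing_chunks(master_rows, standby_rows))  # [1]
```

Comparing per-chunk checksums instead of whole rows keeps the amount of data exchanged small and narrows a mismatch down to a range of rows.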

 

Mitigating the delay (mitigation only, not a complete fix)

The easiest way is to configure InnoDB

Configure InnoDB to flush to disk less often so that transactions commit faster; setting innodb_flush_log_at_trx_commit=2 achieves this. You can also disable binary logging on the standby, set innodb_locks_unsafe_for_binlog to 1, and set MyISAM's delay_key_write to ALL. These settings trade safety for speed, however. If you need to promote the standby to master, remember to set these options back to safe values.
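
Before promoting a standby, a check along the following lines could flag options that are still set for speed. This is a sketch under assumptions: the variables dictionary stands for the output of SHOW VARIABLES, and the values in SAFE_VALUES are the server defaults, treated here as the "safe" ones.

```python
# Assumption: the server defaults are taken as the safe values to restore.
SAFE_VALUES = {
    "innodb_flush_log_at_trx_commit": "1",
    "innodb_locks_unsafe_for_binlog": "OFF",
    "delay_key_write": "ON",
    "log_bin": "ON",
}


def unsafe_settings(variables: dict) -> dict:
    """Return {name: current_value} for every variable that differs from SAFE_VALUES."""
    return {
        name: variables[name]
        for name, safe in SAFE_VALUES.items()
        if name in variables and str(variables[name]).upper() != safe
    }


# Hypothetical standby tuned for speed as described above
standby = {
    "innodb_flush_log_at_trx_commit": "2",
    "innodb_locks_unsafe_for_binlog": "ON",
    "delay_key_write": "ALL",
    "log_bin": "OFF",
}
for name, value in unsafe_settings(standby).items():
    print(f"{name} = {value} (expected {SAFE_VALUES[name]})")
```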

 

Don't duplicate the expensive part of write operations

Refactoring the application or optimizing queries is usually the best way to keep the standby in sync. If the expensive work can be moved to a standby, it only needs to run on one standby, and the result can then be written back to the master, for example with LOAD DATA INFILE.
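
As a rough illustration of this pattern, the sketch below aggregates rows (work that would run against one standby), renders the result as tab-separated text, and builds the statement that loads the file on the master. The table name, file path, and the (user_id, amount) row shape are hypothetical.

```python
from collections import defaultdict


def summary_tsv(rows):
    """Aggregate (user_id, amount) rows into per-user totals, one tab-separated line per user."""
    totals = defaultdict(int)
    for user_id, amount in rows:
        totals[user_id] += amount
    return "".join(f"{user_id}\t{totals[user_id]}\n" for user_id in sorted(totals))


def load_statement(path, table):
    """Statement to run on the master; tab and newline are LOAD DATA's default separators."""
    return f"LOAD DATA INFILE '{path}' INTO TABLE {table}"


# Hypothetical rows read from the standby
rows = [(2, 5), (1, 10), (2, 7)]
print(summary_tsv(rows), end="")
print(load_statement("/tmp/user_totals.tsv", "user_totals"))
```

The expensive aggregation runs once, outside the replication stream, and only the compact result travels through the master.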

  

Limit oversized packets on the master

Adjust the max_allowed_packet value on the master; packets that are too large complicate binary log transactions.

 

Use semi-synchronous replication (MySQL 5.5 and later)

Semi-synchronous replication can provide enough flexibility to improve performance in some scenarios, because it makes it safer to turn off sync_binlog on the master. Writing to remote memory (an acknowledgement from a standby) is faster than writing to local disk (write and flush). Some tests have reported that semi-synchronous replication performs about twice as well as enforcing strong durability on the master. No system offers absolute durability, only higher levels of it, and semi-synchronous replication appears to be a cheaper way to make data durable than the alternatives.

 

Parallel writes outside of replication

Should all write operations be passed from primary to standby? If you can determine that some writes can easily be performed outside of replication, you can parallelize these operations to take advantage of the standby's write capacity. For example, some archived data can be archived separately on the primary and secondary.

 

Prefetch cache for replication thread

A program reads the relay log ahead of the SQL thread and converts the upcoming statements into SELECT statements, executing them before the SQL thread applies the updates. This makes the server load the data from disk into memory, so the SQL thread does not have to read from disk when it reaches the corresponding statement.

A tool called relayfetch already implements this method. The idea is for the program to read the statements in the relay log slightly ahead of the slave's SQL thread and execute them as SELECT statements. The server then reads the needed data from disk into memory, so when the slave's SQL thread executes the commands from the relay log it does not have to wait for disk reads. The SELECTs parallelize I/O that the slave would otherwise have to perform serially. The tool and its usage will be introduced separately later.
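
The statement conversion can be illustrated with a small sketch. It is not relayfetch's code: it handles only simple single-table UPDATE and DELETE statements with a WHERE clause, and the function name is made up for the example.

```python
import re

# Deliberately simple patterns: one table, one WHERE clause, optional trailing semicolon.
_UPDATE = re.compile(r"^\s*UPDATE\s+(\S+)\s+SET\s+.+?\s+WHERE\s+(.+?);?\s*$", re.I | re.S)
_DELETE = re.compile(r"^\s*DELETE\s+FROM\s+(\S+)\s+WHERE\s+(.+?);?\s*$", re.I | re.S)


def to_prefetch_select(statement: str):
    """Turn a simple UPDATE/DELETE into a SELECT over the same rows.

    Returns None for statements this sketch does not understand.
    """
    for pattern in (_UPDATE, _DELETE):
        match = pattern.match(statement)
        if match:
            table, where = match.groups()
            return f"SELECT * FROM {table} WHERE {where}"
    return None


print(to_prefetch_select("UPDATE orders SET status = 'paid' WHERE id = 42;"))
# SELECT * FROM orders WHERE id = 42
```

Running the generated SELECT touches the same rows the later write will touch, which is what warms the cache. A real implementation would need a proper SQL parser rather than regular expressions.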

 

Multi-threaded synchronization

A tool called MySQL-Transfer (hereinafter Transfer) already exists for this method; it is a master-slave synchronization tool built on MySQL plus a patch. It works by reading the relay log and applying the updates to the slave with multiple threads. The tool and its usage will be introduced separately later.
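
One way multi-threaded apply can preserve ordering is to route all events for the same table to the same worker. The sketch below shows that idea only; it is an assumption for illustration and does not describe how Transfer itself distributes work. The event shape (table, statement) and the function names are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def partition_events(events, workers):
    """Split (table, statement) events into per-worker queues.

    Events for the same table always land in the same queue, in original order.
    """
    queues = [[] for _ in range(workers)]
    for table, statement in events:
        index = zlib.crc32(table.encode("utf-8")) % workers
        queues[index].append((table, statement))
    return queues


def apply_in_parallel(events, workers, apply):
    """Run each queue on its own thread; `apply` is called once per event."""
    queues = partition_events(events, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for completion and re-raises any worker exception
        list(pool.map(lambda queue: [apply(event) for event in queue], queues))


# Hypothetical relay log events
events = [("orders", "UPDATE 1"), ("users", "UPDATE 2"), ("orders", "UPDATE 3")]
apply_in_parallel(events, workers=4, apply=print)
```

Events on different tables may be applied in a different order than on the master, so this partitioning is only safe when statements on different tables do not depend on one another.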
