MySQL performance optimization study notes

MySQL performance optimization principles and notes:


1. MySQL allocates a block of memory (the sort_buffer) to each thread for sorting; its size is controlled by sort_buffer_size


  1> If the data to be sorted fits in sort_buffer_size, the sort is done entirely in memory
  2> If the data is too large to fit in memory, temporary disk files are used to assist the sort, also known as an external sort
  3> For an external sort, MySQL splits the data into several sorted temporary files and then merges them into one large ordered file — a quick way to observe this is shown below
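
A minimal way to check whether a sort stayed in memory or spilled to disk (the variable and status counter are standard MySQL; the workflow is just an illustration):

```sql
-- Current per-thread sort buffer size, in bytes
SHOW VARIABLES LIKE 'sort_buffer_size';

-- Run an ORDER BY query, then check the session counters:
-- a non-zero Sort_merge_passes means temporary files were merged,
-- i.e. an external (disk-assisted) sort happened
SHOW SESSION STATUS LIKE 'Sort_merge_passes';
```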
 


2. MySQL reads the rows that match the query conditions into the sort_buffer by traversing the index, then quicksorts them on the sort field


  1> If the queried columns are not all contained in the secondary index, MySQL must go back to the clustered index, using the primary key stored in the secondary index record, to fetch the remaining columns
  2> These lookups cause random IO. MySQL 5.6 introduced the MRR mechanism: the primary keys of the matching secondary index records are collected and sorted in memory first, and only then are the table lookups performed
  3> Where appropriate, create a composite index so that rows come out of the index already sorted, avoiding the cost of sorting; if possible, build a covering index as well to avoid going back to the table (see the sketch below)
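
A sketch of point 3, using an assumed demo table t(id, city, name, age) — the table and column names are illustrative only:

```sql
-- Assumed demo table
CREATE TABLE t (
  id   INT PRIMARY KEY AUTO_INCREMENT,
  city VARCHAR(16),
  name VARCHAR(16),
  age  INT
) ENGINE = InnoDB;

-- Composite index: rows for one city come out of the index already
-- ordered by name, so EXPLAIN no longer shows "Using filesort"
ALTER TABLE t ADD INDEX idx_city_name (city, name);
EXPLAIN SELECT name, age FROM t WHERE city = 'Shanghai' ORDER BY name;

-- Covering index: age is included too, so the query is answered from
-- the index alone (Extra shows "Using index", no back-to-table)
ALTER TABLE t ADD INDEX idx_city_name_age (city, name, age);
EXPLAIN SELECT name, age FROM t WHERE city = 'Shanghai' ORDER BY name;
```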

 

How the two sorting methods work:


Full-field sort


1. Read all the columns needed by the query into the sort_buffer via the index
2. Sort the rows in the sort_buffer on the sort field
3. Return the sorted result set to the client


Disadvantages:


1. The sort_buffer holds fewer rows, because every queried column is stored alongside the sort field, so the buffer is used inefficiently
2. When the amount of data to be sorted is large, many temporary files are needed for the merge, and sorting performance becomes poor

Advantages: when memory is large enough, MySQL prefers full-field sorting, because compared with rowid sorting it avoids an extra back-to-table operation



Rowid sort


1. MySQL limits the length of each row placed in the sort_buffer so that more rows fit; when a row's queried columns exceed max_length_for_sort_data, it switches to rowid sorting
2. Only the sort field and the primary key are read into the sort_buffer, and the rows are sorted on the sort field
3. Following the sorted order, use the primary key (id) to go back to the table and fetch the columns the query actually needs
4. Return the result set to the client

Advantages: the sort_buffer memory is used more effectively for the sort itself, minimizing disk access during the sort phase

Disadvantages: the back-to-table lookups are random IO and can cause many random reads, so total disk access is not necessarily lower than with full-field sorting
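
One way to observe which sort mode a query used is the optimizer trace. The sketch below reuses the assumed table t from earlier; the threshold value 16 is artificial, and note that max_length_for_sort_data is deprecated from MySQL 8.0.20:

```sql
SET optimizer_trace = 'enabled=on';
SET max_length_for_sort_data = 16;  -- artificially low: forces rowid sort

SELECT name, age FROM t WHERE city = 'Shanghai' ORDER BY name LIMIT 100;

-- sort_mode shows <sort_key, rowid> for rowid sort, and
-- <sort_key, additional_fields> (or packed_additional_fields)
-- for full-field sort
SELECT JSON_EXTRACT(trace, '$**.filesort_summary.sort_mode') AS sort_mode
  FROM information_schema.OPTIMIZER_TRACE;
```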


3. Return to the client the number of rows it asked for, taken in order from the sorted result

 

1. Primary/standby delay:

This is the difference between the time a transaction finishes executing on the standby and the time the same transaction finished on the primary. It includes the time for the transaction to complete on the primary, for its binlog to be sent to the standby, and for the standby to finish executing it. seconds_behind_master is computed per transaction: each transaction in the binlog carries a time field recording when it was written on the primary, and the standby subtracts the time field of the transaction it is currently executing from the current system time.
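
The lag can be read directly on the standby (on MySQL 8.0.22+ the command is SHOW REPLICA STATUS and the column is Seconds_Behind_Source):

```sql
-- Run on the standby; look at the Seconds_Behind_Master field
SHOW SLAVE STATUS\G
```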


2. Sources of primary/standby delay:

① Under some deployments, the machine hosting the standby performs worse than the one hosting the primary, for example because several standby instances share one machine and heavy query traffic makes them compete for IO resources. One mitigation is to relax the "double 1" durability settings on the standby, so that the redo log and binlog are only written to the filesystem page cache rather than flushed to disk on every commit

② The standby is under heavy pressure: a large number of queries run against it, consuming a lot of CPU and delaying replication. The solution is a one-primary, multiple-standby topology, so that several standbys share the read traffic and relieve the pressure on any single one

③ Large transactions: if a large DML operation takes a long time to execute on the primary, its binlog only reaches the standby after it commits, and the standby then needs roughly as long to replay it, creating lag. The solution is to avoid large transactions as far as possible, for example deleting in batches with LIMIT, which both prevents a large transaction and narrows the lock scope (see the sketch after ④).
④ DDL on a large table: the primary sends the DDL binlog to the standby, which replays it from the relay log; subsequent DML entries in the relay log must wait for the DDL's MDL write lock to be released before they can be applied, so the standby falls behind.
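
A minimal sketch of the batched delete from ③, assuming a table t with an indexed created_at column (names and cutoff date are illustrative):

```sql
-- Delete old rows in small batches instead of one huge transaction;
-- re-run until ROW_COUNT() reports 0 rows affected
DELETE FROM t
 WHERE created_at < '2020-01-01'
 LIMIT 1000;

SELECT ROW_COUNT();  -- rows deleted in this batch
```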


3. Reliability-first strategy:

① Check whether seconds_behind_master on standby B is below some threshold (for example 5 seconds); if so, continue to the next step, otherwise keep retrying this step

② Switch primary A to read-only status, i.e. set read_only to true

③ Wait until seconds_behind_master on standby B drops to 0; then make standby B readable and writable (set read_only to false) and switch business traffic over to it. My own understanding: if the relay log still holds several transactions to apply, the business is unavailable for the total time those transactions take to replay. An abnormal power-off of the primary is more problematic: if the standby's lag is short, service can resume normally once the relay log has been fully applied, but if traffic is switched to the standby before the relay log is fully applied, transactions that had already committed on the primary appear "lost", which is unacceptable in some business scenarios.
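
The steps above, sketched as the statements an operator (or HA tool) might run — a simplified manual outline, not a complete failover procedure:

```sql
-- On primary A: stop accepting writes
SET GLOBAL read_only = ON;
-- (5.7+) also block users with the SUPER privilege, which plain
-- read_only does not restrict
SET GLOBAL super_read_only = ON;

-- On standby B: wait for the relay log to drain
SHOW SLAVE STATUS\G        -- repeat until Seconds_Behind_Master = 0

-- Then open standby B for writes
SET GLOBAL read_only = OFF;
```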


4. Availability-first strategy, and its problems:

In a double-M (dual-master) setup with binlog_format=mixed, switching over immediately can leave the primary and standby data inconsistent. With row-format binlog the inconsistency is easier to detect, because row-based binlog records the complete values of every column.

 


Today the teacher also talked about prevention coming first. Prevention roughly comes down to these points:


1. Permission control and allocation (both database and server permissions)
2. Establish operation standards
3. Hold regular training for developers
4. Build a delayed standby database (see the sketch after this list)
5. Do SQL review properly: any statement that changes online data (DML and DDL) must be reviewed
6. Take backups. Backups divide into two cases:
(1) If the data volume is large, use physical backup with XtraBackup: regular full backups, plus incremental backups.
(2) If the data volume is small, use mysqldump or mydumper, then restore data by replaying the binlog or by building a replica.
The binlog files themselves must also be backed up regularly, and the backup files must be checked regularly to confirm they are usable. If a misoperation happens, the data needs restoring, and the backup file turns out to be unusable, that is even more tragic.
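
A minimal sketch of point 4, using the MASTER_DELAY option available since MySQL 5.6 (the one-hour delay is just an example value):

```sql
-- On the standby: apply replicated events with a fixed delay,
-- leaving a window to stop replication before a misoperation lands
STOP SLAVE SQL_THREAD;
CHANGE MASTER TO MASTER_DELAY = 3600;  -- seconds; here 1 hour behind
START SLAVE SQL_THREAD;
```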



If data is deleted by mistake, it can be recovered along the following lines:


1. DML misoperations that damage or lose data can be undone with a flashback tool; we currently use Meituan's MyFlash, which works well. The essence of all such tools is the same: parse the binlog events and invert them, turning DELETE into INSERT, INSERT into DELETE, and swapping the before and after images of an UPDATE. This requires binlog_format=row and binlog_row_image=full (the settings are shown after this list).
Remember: when restoring data, restore to a temporary instance first, verify it, and only then restore back to the primary.
2. DDL misoperations (TRUNCATE and DROP) are harder: for DDL, whether binlog_format is row or statement, the binlog records only the statement itself, not row images, so the data can only be restored from a full backup plus replaying the binlog. Once the data volume is large, recovery takes especially long.
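
The binlog prerequisites for flashback, as standard MySQL settings (putting them in my.cnf is the durable option; SET GLOBAL only affects new sessions):

```sql
SET GLOBAL binlog_format    = 'ROW';   -- record row events, not statements
SET GLOBAL binlog_row_image = 'FULL';  -- keep complete before/after images

-- Verify
SHOW VARIABLES LIKE 'binlog_format';
SHOW VARIABLES LIKE 'binlog_row_image';
```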

 

 

 
