Talk about MySQL's dirty-page flushing and table data

Let's look at the first problem: why a SQL statement suddenly slows down.

Cause Analysis

A SQL statement that normally executes very fast can sometimes, for no apparent reason, become very slow, and the scenario is hard to reproduce: it is not only random but also short-lived, as if the database "shook" for a moment.

Our usual update statement only updates the data page in memory and writes the redo log to disk; but sooner or later the dirty pages in memory must be written back to disk, which is the flush operation. It is this flush that can affect the execution of SQL statements.

To summarize, the scenarios that trigger a flush are:

  1. InnoDB's redo log is full. The system then stops all update operations and advances the checkpoint to free up redo log space for further writes. To advance the checkpoint, all dirty pages corresponding to the log between the old and new checkpoint positions must be flushed to disk; afterwards, the region from write pos to checkpoint can be written again. While this is happening, the whole system can no longer accept updates: all updates are blocked. On monitoring graphs, the number of updates drops to 0 at this point.
  2. System memory is insufficient. When new memory pages are needed and memory has run out, some data pages must be evicted to free memory for others; if the evicted pages are dirty, they must first be written to disk. InnoDB manages memory through the buffer pool, and pages in the buffer pool are in one of three states: unused, used and clean, or used and dirty. InnoDB's strategy is to use memory as much as possible, so on a long-running instance there are few unused pages. When a data page to be read is not in memory, a page must be allocated in the buffer pool, which means evicting the least recently used page: a clean page is simply freed and reused, but a dirty page must first be flushed to disk and become clean before it can be reused.
  3. When MySQL considers the system "idle", it flushes some dirty pages.
  4. When MySQL shuts down normally, it flushes all dirty pages in memory to disk, so that on the next startup data can be read directly from disk and startup is fast.
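The first scenario can be sketched with a toy model of the circular redo log (an illustration of the write pos / checkpoint interaction, not InnoDB's actual implementation):

```python
# Toy model: the redo log is a ring; the distance from checkpoint to
# write pos is the part that still maps to unflushed dirty pages.
class RedoLog:
    def __init__(self, capacity):
        self.capacity = capacity   # total redo log size
        self.write_pos = 0         # logical write position (ever-increasing)
        self.checkpoint = 0        # log before this point is fully flushed

    def free_space(self):
        return self.capacity - (self.write_pos - self.checkpoint)

    def write(self, nbytes):
        if nbytes > self.free_space():
            return False           # log full: updates block until a flush
        self.write_pos += nbytes
        return True

    def advance_checkpoint(self, nbytes):
        # Flushing the corresponding dirty pages lets the checkpoint move,
        # reclaiming the space between old and new checkpoint for writes.
        self.checkpoint = min(self.checkpoint + nbytes, self.write_pos)

log = RedoLog(capacity=100)
assert log.write(90)          # plenty of room
assert not log.write(20)      # would overrun the checkpoint -> blocked
log.advance_checkpoint(30)    # flush dirty pages, advance checkpoint
assert log.write(20)          # now the update can proceed
```

This is exactly why monitoring shows updates dropping to 0: every `write` fails until flushing has advanced the checkpoint.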

Therefore, although flushing dirty pages is normal, the following two situations noticeably affect performance:

  • A query has to evict too many dirty pages, which noticeably lengthens its response time;
  • The redo log is full, all updates are blocked, and write throughput drops to 0, which is unacceptable for latency-sensitive businesses.

InnoDB's dirty-page flushing control strategy

The parameter innodb_io_capacity tells InnoDB the disk's capacity. I suggest setting this value to the disk's IOPS, which can be measured with the fio tool:

fio -filename=$filename -direct=1 -iodepth 1 -thread -rw=randrw -ioengine=psync -bs=16k -size=500M -numjobs=10 -runtime=10 -group_reporting -name=mytest 

If innodb_io_capacity is set too low, InnoDB assumes the system's IO capacity is poor and flushes dirty pages especially slowly, possibly even slower than dirty pages are generated; dirty pages then accumulate, hurting both query and update performance. The symptom is that MySQL writes slowly and TPS is low, yet the IO pressure on the database host is not high.

However, this parameter only defines the maximum flushing capability; InnoDB must also serve user requests. InnoDB's flushing speed is based on two factors: the dirty-page ratio and the redo log write speed. InnoDB first computes a number from each of these factors.

  1. The parameter innodb_max_dirty_pages_pct is the upper limit of the dirty-page ratio, defaulting to 75%. From the current dirty-page ratio (call it M), InnoDB computes a number between 0 and 100. The pseudocode is roughly:
    F1(M)
    {
      if M >= innodb_max_dirty_pages_pct then
          return 100;
      return 100 * M / innodb_max_dirty_pages_pct;
    }
  2. Every log record InnoDB writes has a sequence number. Let N be the difference between the current write sequence number and the sequence number at the checkpoint. From N, InnoDB computes another number between 0 and 100; call this F2(N). The F2(N) formula is fairly complex; it is enough to know that the larger N is, the larger the result.
  3. Finally, take the larger of F1(M) and F2(N) as R. The engine then flushes dirty pages at R% of the capacity defined by innodb_io_capacity.
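The three steps above can be sketched in Python. F1 follows the pseudocode in the text; the real F2 formula is not given in the source, so the stand-in below is only an assumption that grows with N:

```python
def f1(m_pct, innodb_max_dirty_pages_pct=75):
    """Score from the current dirty-page ratio M (both in percent)."""
    if m_pct >= innodb_max_dirty_pages_pct:
        return 100
    return 100 * m_pct / innodb_max_dirty_pages_pct

def f2(n, n_max):
    """Placeholder: the real formula is more complex; larger N => larger score."""
    return min(100, 100 * n / n_max)

def flush_rate(m_pct, n, n_max, innodb_io_capacity):
    r = max(f1(m_pct), f2(n, n_max))      # take the larger of the two scores
    return innodb_io_capacity * r / 100   # flush at R% of disk capacity

# With 60% dirty pages and a half-full redo log on a 1000-IOPS disk:
assert f1(80) == 100          # above the 75% limit -> full speed
assert f1(60) == 80.0
assert flush_rate(60, 50, 100, 1000) == 800.0
```

The point of the `max` is that either pressure source alone, too many dirty pages or a redo log close to full, is enough to push flushing toward full capacity.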

Now you know that InnoDB flushes dirty pages in the background, and flushing a dirty page means writing a memory page to disk. So whether a query of yours has to evict dirty pages to obtain memory, or the background flushing logic is running, IO resources are consumed and your update statements may be affected. This is often the cause of the "shake" in MySQL that you perceive from the business side. In other words, keep an eye on the dirty-page ratio and don't let it approach 75% too often. The dirty-page ratio is Innodb_buffer_pool_pages_dirty / Innodb_buffer_pool_pages_total; for the specific commands, refer to the following code (from MySQL 5.7 the global_status table lives in performance_schema; in earlier versions it is in information_schema):

mysql> select VARIABLE_VALUE into @a from performance_schema.global_status where VARIABLE_NAME = 'Innodb_buffer_pool_pages_dirty';
mysql> select VARIABLE_VALUE into @b from performance_schema.global_status where VARIABLE_NAME = 'Innodb_buffer_pool_pages_total';
mysql> select @a/@b;

However, another MySQL mechanism can also slow your query down: when InnoDB prepares to flush a dirty page, if the neighboring data page happens to be dirty too, that neighbor is dragged along and flushed with it; and for each such neighbor the same rule applies again, so whole runs of adjacent dirty pages can be flushed together.

In InnoDB, the innodb_flush_neighbors parameter controls this behavior: a value of 1 enables the "guilt by association" mechanism above, while 0 means each page is flushed by itself without pulling in neighbors. This optimization was very meaningful in the era of mechanical hard drives, where it could eliminate a lot of random IO: a mechanical disk typically delivers only a few hundred random IOPS, so reducing random IO for the same logical work means a significant performance improvement.
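The neighbor-flushing behavior can be sketched as follows (a toy illustration; the page numbers are hypothetical identifiers, not real InnoDB page IDs):

```python
def pages_to_flush(start, dirty, flush_neighbors=1):
    """Collect the batch of pages flushed together with `start`."""
    if not flush_neighbors:
        return [start]            # innodb_flush_neighbors=0: only the page itself
    batch = [start]
    left, right = start - 1, start + 1
    while left in dirty:          # extend leftwards over contiguous dirty pages
        batch.insert(0, left)
        left -= 1
    while right in dirty:         # and rightwards
        batch.append(right)
        right += 1
    return batch

dirty = {3, 4, 5, 9}
assert pages_to_flush(4, dirty) == [3, 4, 5]              # neighbors dragged along
assert pages_to_flush(4, dirty, flush_neighbors=0) == [4]
```

On a mechanical disk the three-page batch is one sequential write instead of three random ones; on an SSD the batching mostly just delays the flush that triggered it.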

If you use a high-IOPS device such as an SSD, I suggest setting innodb_flush_neighbors to 0: IOPS is usually not the bottleneck there, and flushing only the page itself completes the necessary flushes sooner, reducing SQL response time.

Besides the flushing-speed control above, the redo log must not be set too small. The redo log is written on every transaction commit; if it is too small it fills up quickly, and write pos constantly catches up with the checkpoint. The system then has to stop all updates and advance the checkpoint, and what you observe is low disk pressure but intermittently collapsing database performance.

Next, let's look at the second question: why, after half of a table's rows are deleted, the table file size remains unchanged.

Problem analysis

An InnoDB table consists of two parts: the table structure definition and the data. Before MySQL 8.0, the table structure was stored in a file with the .frm suffix; since MySQL 8.0, the table structure definition may be stored in system data tables.

The parameter innodb_file_per_table controls whether table data is stored in the shared tablespace or in a separate file:

  • OFF means the table's data is placed in the system shared tablespace, together with the data dictionary;
  • ON means each InnoDB table's data is stored in its own file with the .ibd suffix.

Since MySQL 5.6.6 the default is ON. A table stored in its own file is easier to manage, and when you no longer need the table, drop table deletes the file directly. If the data lives in the shared tablespace, the space is not reclaimed even after the table is dropped.

When deleting an entire table, drop table reclaims the table space. But the more common deletion scenario is deleting certain rows, and that is where we hit the problem from the beginning of this article: the rows are deleted, but the table space is not reclaimed.

Data deletion process

We know that InnoDB organizes data in a B+ tree. When we delete a record, the InnoDB engine only marks it as deleted; a record inserted later may reuse that slot. The disk file, however, does not shrink.

So if we delete every record on a data page, the whole page can be reused. Page reuse differs from record reuse, though: a deleted record's slot can only be reused by data that falls within the same key range, whereas a page removed from the B+ tree can be reused anywhere.

If two adjacent data pages both have very low utilization, the system merges their data onto one of them and marks the other as reusable. Going further, what if we delete all of a table's data with the delete command? Then every data page is marked reusable, but on disk the file does not get smaller. In other words, table space cannot be reclaimed with the delete command. The reusable-but-unused space looks like "holes".

In fact, deletion is not the only source of holes; inserting data creates them too.

If data is inserted in increasing index order, the index stays compact. But if data is inserted at random, page splits may occur in the index.

Suppose a data page of some index is full and I need to insert a row whose key falls inside its range: a new page must be allocated to hold part of the data. After the split, a hole is left at the end of the old page, and possibly more than one record's worth of space.
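A toy B+-tree leaf split shows where the holes come from (for illustration only, with a tiny hypothetical page capacity): inserting into a full, sorted page splits it, leaving both resulting pages partly empty.

```python
PAGE_CAPACITY = 4  # hypothetical: real InnoDB pages hold far more records

def insert(pages, key):
    # Find the page whose key range should hold `key` (last page otherwise).
    for i, page in enumerate(pages):
        if i == len(pages) - 1 or key < pages[i + 1][0]:
            break
    page = pages[i]
    if len(page) < PAGE_CAPACITY:
        page.append(key)
        page.sort()
        return
    # Page full: split it, moving the upper half to a new page.
    page.append(key)
    page.sort()
    mid = len(page) // 2
    pages[i:i + 1] = [page[:mid], page[mid:]]

pages = [[10, 20, 30, 40]]       # one full page
insert(pages, 25)                # an in-range insert forces a split
assert pages == [[10, 20], [25, 30, 40]]
assert all(len(p) < PAGE_CAPACITY for p in pages)  # both pages now have holes
```

Random-order inserts keep triggering such splits, which is why they leave a looser index than inserting in increasing key order.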

In addition, updating an indexed value can be understood as deleting the old value and inserting the new one, which also creates holes. In short, a table that has undergone many inserts, deletes, and updates may be full of holes, and if these holes can be removed, the table space can be shrunk.

Rebuild table

Based on the analysis above, removing the holes shrinks the space; all we need to do is rebuild the table.

The process of rebuilding the table:

Create a new table with the same structure as the original, then read rows from the source table one by one in increasing primary-key order and insert them into the new table. The new table contains none of the old table's primary-key-index holes, so its primary key index is obviously more compact and its data pages are better utilized. If we treat the new table as a temporary table and, after the data is imported, swap it in for the old table, the effect is that the old table's space has been shrunk.

You can rebuild a table with the alter table A engine=InnoDB command. Before MySQL 5.5, this command's execution flow was similar to what we just described, except that you don't create the temporary table yourself: MySQL automatically dumps the data, swaps the table names, and drops the old table.

In this flow, the most time-consuming step is inserting the data into the temporary table. If new data is written to the old table during that window, it is lost. Therefore the old table must accept no updates during the entire DDL; in other words, this DDL is not Online.

The Online DDL introduced in MySQL 5.6 optimizes this process.

With Online DDL, rebuilding the table proceeds as follows:

  1. Create a temporary file and scan all data pages of the original table's primary key index;
  2. Use the records in those data pages to generate a B+ tree and store it in the temporary file;
  3. While the temporary file is being generated, record every operation on the original table in a log file (the row log);
  4. After the temporary file is generated, apply the operations in the row log to it, producing a data file logically identical to the original table;
  5. Replace table A's data file with the temporary file.
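The steps above can be modeled with a toy sketch (tables as dicts, DML as tuples; this only illustrates the snapshot-plus-row-log idea, not InnoDB's real file format):

```python
def rebuild_online(table, concurrent_ops):
    new_table = dict(table)       # steps 1-2: scan and copy the snapshot
    row_log = []
    for op in concurrent_ops:     # step 3: DML arriving during the copy
        row_log.append(op)        # ...is recorded in the row log (in real
                                  # InnoDB it is also applied to the original)
    for kind, key, *val in row_log:   # step 4: replay the row log
        if kind == "delete":
            new_table.pop(key, None)
        else:                          # "insert" or "update"
            new_table[key] = val[0]
    return new_table              # step 5: swap in for the original

table = {1: "a", 2: "b"}
ops = [("insert", 3, "c"), ("update", 1, "a2"), ("delete", 2)]
assert rebuild_online(table, ops) == {1: "a2", 3: "c"}
```

The row log is what makes the operation Online: writes keep flowing during the long copy, and the replay in step 4 brings the new file up to date before the swap.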

Normally an MDL write lock is needed for DDL. The alter statement does acquire the MDL write lock when it starts, but before the data is actually copied the write lock is downgraded to a read lock. Why downgrade? To make the operation Online: an MDL read lock does not block inserts, deletes, or updates. And why not release the lock entirely? To prevent other threads from running DDL on the table at the same time.

For a large table, the most time-consuming part of Online DDL is copying the data into the temporary file, and inserts, deletes, and updates are accepted during that step. So relative to the whole DDL, the lock is held only briefly, and from the business's point of view the operation can be considered Online.

It should be added that either rebuild method scans the original table's data and builds temporary files, which for very large tables consumes a lot of IO and CPU. So on a production service you must carefully choose the operation window. For a safer operation, I recommend GitHub's open-source gh-ost.

Online and inplace

Having discussed Online, we should distinguish it from another easily confused DDL concept: inplace.

As mentioned above, before version 5.5 rebuilding a table inserted data into a temporary table, while from version 5.6 the data goes into a temporary file. The former is done at the server layer; the latter happens inside the InnoDB engine.

So from the server layer's perspective no data is moved into a temporary table, making it an "in-place" operation; this is where the name "inplace" comes from. Note, however, that the temporary file still occupies temporary space.

Our rebuild statement alter table t engine=InnoDB actually implies alter table t engine=innodb, ALGORITHM=inplace;

The counterpart of inplace is copying the table: alter table t engine=innodb, ALGORITHM=copy;

With ALGORITHM=copy, the table is forcibly copied, following the temporary-table flow described earlier.

At first glance inplace seems to imply Online, but that is not so: the table rebuild merely happens to be both inplace and able to accept DML.

For example, adding a full-text index to a column of an InnoDB table, written as alter table t add FULLTEXT(field_name);, is an inplace operation, yet it blocks inserts, deletes, and updates: it is not Online.

The relationship between the two can be summarized as:

  • If a DDL is Online, it must be inplace;
  • The converse does not hold: an inplace DDL is not necessarily Online. As of MySQL 8.0, this is the case for adding a full-text index (FULLTEXT index) or a spatial index (SPATIAL index).

As an extension, here are the differences between optimize table, analyze table, and alter table, three commands related to rebuilding a table:

  • From MySQL 5.6, alter table t engine=InnoDB (i.e. recreate) defaults to the temporary-file flow described above;
  • analyze table t does not actually rebuild the table; it only re-collects the table's index statistics without modifying data, holding an MDL read lock during the process;
  • optimize table t equals recreate plus analyze.

Regarding table rebuilding, there is one extreme case:

sometimes alter table t engine=InnoDB makes a table occupy more space.

Reason: when rebuilding the table, InnoDB does not fill pages completely; it reserves 1/16 of each page for future updates. So the table is not at its "most" compact right after a rebuild. If DML has occurred since the last rebuild, it may have consumed part of that reserved space; rebuilding again re-reserves 1/16 per page, so the file can come out larger after the "shrink".
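The arithmetic behind the 1/16 reservation, assuming InnoDB's default 16 KB page size:

```python
# On a default 16 KB InnoDB page, a rebuild leaves 1/16 of the page
# (1 KB) free for future updates, so at most 15 KB holds record data.
PAGE_SIZE = 16 * 1024
reserved = PAGE_SIZE // 16
usable = PAGE_SIZE - reserved

assert reserved == 1024         # 1 KB reserved per page
assert usable == 15 * 1024      # 15 KB usable right after a rebuild
```

So if pages were filled beyond 15/16 before the rebuild, the rebuilt file needs more pages to hold the same rows, which is exactly how the file can grow.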

Content source: Lin Xiaobin, "MySQL实战45讲" (45 Lectures on Practical MySQL)

Origin blog.csdn.net/qq_24436765/article/details/112557501