[MySQL] Storage Engine (2): InnoDB memory structure

InnoDB's memory structure is mainly divided into four parts: the Buffer Pool, the Change Buffer, the Adaptive Hash Index, and the (redo) Log Buffer. The official MySQL documentation has a diagram of InnoDB's memory structure and disk structure.


1. Buffer Pool (buffer pool)

First of all, InnoDB data is stored on disk. The smallest logical unit InnoDB uses when it operates on data is the page (index pages and data pages). We do not touch the disk every time we manipulate data, because the disk is too slow. Instead, InnoDB uses a buffer pool technique: pages read from disk are placed in a memory area called the Buffer Pool.


Next time you read the same page, first determine whether it is in the buffer pool. If it is, read it directly without accessing the disk again.

When modifying data, InnoDB first modifies the page in the buffer pool. A page in memory whose content differs from the copy on disk is called a dirty page. InnoDB has dedicated background threads that write Buffer Pool data to disk, writing many modifications at once at regular intervals. This action is called flushing.
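The read and write paths described above can be sketched in a few lines of Python. This is a toy model, not InnoDB's implementation: the `BufferPool` class, its method names, and the dict standing in for the data files are all assumptions made for illustration.

```python
class BufferPool:
    """Toy page cache: reads hit memory first, writes only mark pages dirty."""

    def __init__(self, disk):
        self.disk = disk        # dict page_no -> content, stands in for the data files
        self.pages = {}         # pages currently cached in memory
        self.dirty = set()      # pages modified in memory but not yet written to disk
        self.disk_reads = 0

    def read(self, page_no):
        if page_no not in self.pages:          # miss: one disk read
            self.pages[page_no] = self.disk[page_no]
            self.disk_reads += 1
        return self.pages[page_no]

    def write(self, page_no, content):
        self.read(page_no)                     # the page must be in memory before it is modified
        self.pages[page_no] = content
        self.dirty.add(page_no)                # memory and disk now differ

    def flush(self):
        """Write all dirty pages back in one batch, as a background thread would."""
        for page_no in sorted(self.dirty):
            self.disk[page_no] = self.pages[page_no]
        flushed = len(self.dirty)
        self.dirty.clear()
        return flushed


disk = {1: "a", 2: "b"}
pool = BufferPool(disk)
pool.read(1)
pool.read(1)            # second read is served from memory
pool.write(2, "B")      # disk still holds "b" until flush() runs
```

Calling `pool.flush()` afterwards writes the single dirty page back and empties the dirty set.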

Buffer Pool caches page information, including data pages and index pages.

SHOW STATUS LIKE '%innodb_buffer_pool%'; -- view Buffer Pool information in the server status
SHOW VARIABLES LIKE '%innodb_buffer_pool%'; -- view the parameters (system variables)

The detailed meaning of these status values can be looked up in the official documentation using its search function. The default size of the Buffer Pool is 128M (134217728 bytes), and it can be adjusted.

What happens when the buffer pool is full? (Compare: what happens when the memory configured for Redis is full?) InnoDB manages the buffer pool with an LRU algorithm implemented as a linked list. It is not a traditional LRU: the list is split into a young (hot) sublist and an old (cold) sublist, and the pages that get evicted are cold data from the tail of the old sublist, not hot data.
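A minimal sketch of the young/old split follows, to show why a one-off scan does not push out hot pages. It is a simplification under stated assumptions: the class `MidpointLRU` and its methods are hypothetical names, the 3/8 old share mirrors the default `innodb_old_blocks_pct` of 37, and a page is promoted to the young sublist on its second access (the real engine also applies a time window, which is ignored here).

```python
from collections import OrderedDict


class MidpointLRU:
    """Toy LRU with a young (hot) and an old (cold) sublist."""

    def __init__(self, capacity, old_ratio=3 / 8):
        self.capacity = capacity
        self.old_capacity = max(1, int(capacity * old_ratio))
        self.young = OrderedDict()   # first key = most recently used
        self.old = OrderedDict()     # last key = next eviction victim

    def _push_front(self, sublist, page_no):
        sublist[page_no] = True
        sublist.move_to_end(page_no, last=False)

    def access(self, page_no):
        if page_no in self.young:            # already hot: refresh its position
            self.young.move_to_end(page_no, last=False)
        elif page_no in self.old:            # touched again: promote to young
            del self.old[page_no]
            self._push_front(self.young, page_no)
        else:                                # new page enters at the head of old
            self._push_front(self.old, page_no)
        self._rebalance()

    def _rebalance(self):
        # Pages pushed out of the young sublist are demoted to the head of old.
        while len(self.young) > self.capacity - self.old_capacity:
            page_no, _ = self.young.popitem(last=True)
            self._push_front(self.old, page_no)
        # Eviction always takes the tail of the old sublist (cold data).
        while len(self.young) + len(self.old) > self.capacity:
            self.old.popitem(last=True)


cache = MidpointLRU(capacity=8)
for page in (1, 1, 2, 2):        # pages 1 and 2 are accessed twice: hot
    cache.access(page)
for page in range(100, 120):     # a one-off scan of 20 other pages
    cache.access(page)
```

After the scan, pages 1 and 2 are still in the young sublist; the scanned pages only evicted each other from the old sublist.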

The memory buffer has a great effect on improving read and write performance.

Consider a question: when a data page needs to be updated and the page is already in the Buffer Pool, it is updated directly. Otherwise the page has to be loaded from disk into memory before it can be modified. In other words, a buffer pool miss costs at least one disk I/O. Is there a way to optimize this? Read on…

2. Change Buffer (write buffer)

If the page being modified belongs to a non-unique index (note: a unique index does not allow the same value to appear twice in the indexed field), there is no need to load the index page from disk to check whether the data is duplicated (the uniqueness check). In this case the modification can first be recorded in a buffer in memory, which speeds up write statements (INSERT, DELETE, UPDATE).

This area is the Change Buffer. Before MySQL 5.5 it was called the Insert Buffer; it now also supports delete and update. Applying the changes recorded in the Change Buffer to the data page is called a merge.

When does the merge happen? There are several situations: when the page is accessed, when a background thread triggers it, when the database shuts down, or when the redo log is full.
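The buffering and the merge-on-access case can be sketched as follows. This is an illustrative model only: `ChangeBufferDemo`, its fields, and the list-of-entries page format are assumptions, and only the "page is accessed" merge trigger is modelled.

```python
class ChangeBufferDemo:
    """Toy model: writes to uncached non-unique index pages are buffered, then merged on read."""

    def __init__(self, disk_index_pages):
        self.disk = disk_index_pages   # page_no -> list of index entries on disk
        self.buffer_pool = {}          # index pages currently cached in memory
        self.change_buffer = {}        # page_no -> entries waiting to be merged
        self.disk_reads = 0

    def insert(self, page_no, entry):
        if page_no in self.buffer_pool:
            self.buffer_pool[page_no].append(entry)       # page cached: modify it directly
        else:
            # Page not cached and no uniqueness check needed: record the change, skip the disk read.
            self.change_buffer.setdefault(page_no, []).append(entry)

    def read_page(self, page_no):
        if page_no not in self.buffer_pool:
            self.buffer_pool[page_no] = list(self.disk[page_no])
            self.disk_reads += 1
            # Merge: apply the buffered changes to the freshly loaded page.
            self.buffer_pool[page_no].extend(self.change_buffer.pop(page_no, []))
        return self.buffer_pool[page_no]


demo = ChangeBufferDemo({7: ["k1"]})
demo.insert(7, "k2")
demo.insert(7, "k3")
reads_before_access = demo.disk_reads   # still 0: both writes avoided disk I/O
merged_page = demo.read_page(7)         # first access loads the page and merges
```

The two inserts cost no disk reads; the single read on first access picks up both buffered entries.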

If most of the indexes in the database are non-unique, and the workload writes much more than it reads and does not read data immediately after writing it, the Change Buffer (write buffer) is a good fit. For such write-heavy, read-light workloads, consider increasing this value:

SHOW VARIABLES LIKE 'innodb_change_buffer_max_size';


innodb_change_buffer_max_size is the maximum size of the Change Buffer as a percentage of the Buffer Pool; the default is 25 (that is, 25%).
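As a quick worked example of that ratio, the snippet below computes the upper bound for the default 128M Buffer Pool mentioned earlier. The helper name `change_buffer_limit` is made up for this sketch.

```python
def change_buffer_limit(buffer_pool_bytes, max_size_pct=25):
    """Upper bound of the Change Buffer, given the pool size and the percentage setting."""
    return buffer_pool_bytes * max_size_pct // 100


default_pool = 134217728                      # 128M, the default Buffer Pool size
limit = change_buffer_limit(default_pool)     # 33554432 bytes, i.e. 32M
```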

3. Adaptive Hash Index

Indexes are stored on disk, so why would a hash index be kept in memory? I won't go into it here; the answer is left for the detailed walkthrough of how the index storage structure is derived ...

4. Log Buffer (Redo log)

When MySQL updates data, it reduces random disk I/O by not updating the data on disk directly. It first updates the cached page in the Buffer Pool and then, at a suitable point in time, persists the cached page to disk. All cached pages in the Buffer Pool live in memory, so when MySQL crashes or the machine loses power, the data in memory is lost. To prevent updates that exist only in cached pages from being lost, MySQL introduced the redo log mechanism.

When performing inserts, deletes, and updates, MySQL records a redo log entry for the operation as it updates the cached page in the Buffer Pool. If MySQL crashes or the power fails before the cached page has been flushed to disk, then on restart MySQL can redo the changes from the redo log file and restore the data to its state before the crash, so that updated data is not lost. This is why it is called the redo log. Its essence is to guarantee that updated data is not lost once the transaction has committed; in other words, it is what implements transaction durability.

This cooperation between the log and the data files is the WAL (Write-Ahead Logging) technique in MySQL. Its key point is to write the log first and write the data pages to disk later.
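A minimal sketch of the write-ahead idea and of crash recovery is shown below. It is an assumption-laden toy: `TinyEngine` and its methods are hypothetical, the log is a Python list treated as durable, and each log entry simply stores the new value of a page.

```python
class TinyEngine:
    """Toy WAL: every update is appended to a durable log before the page changes in memory."""

    def __init__(self):
        self.disk_pages = {}    # durable data files
        self.redo_log = []      # durable, append-only log
        self.buffer_pool = {}   # volatile: lost on crash

    def update(self, page_no, value):
        self.redo_log.append((page_no, value))   # 1. log first
        self.buffer_pool[page_no] = value        # 2. then change the cached page

    def crash(self):
        self.buffer_pool = {}                    # memory content is gone

    def recover(self):
        # Replay the log in order; later entries overwrite earlier ones.
        for page_no, value in self.redo_log:
            self.disk_pages[page_no] = value


engine = TinyEngine()
engine.update(1, "x")
engine.update(1, "y")
engine.crash()                                   # dirty page was never flushed
pages_after_crash = dict(engine.disk_pages)      # {} : the data files never saw the update
engine.recover()
```

After `recover()`, page 1 holds the last logged value even though the dirty page itself was never flushed.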

1) The memory part of Redo log: Log Buffer

Of course, redo log content is not written directly to disk every time. There is a dedicated memory area, the Log Buffer, that holds the data waiting to be written to the log file, which also saves disk I/O. Its size is set by innodb_log_buffer_size; the default was 8M in older versions (16M since MySQL 5.7.6).

SHOW VARIABLES LIKE 'innodb_log_buffer_size';

2) When is the Log Buffer flushed?

So when is the Log Buffer written to the log file, and when is the log flushed to disk? The timing is controlled by the parameter innodb_flush_log_at_trx_commit, which is 1 by default.

Note: when we write data to disk, the operating system has its own cache. A write puts the data into the operating system cache; a flush forces the operating system cache out to the physical disk.

Value  Meaning
0      (delayed write, delayed flush) The log buffer is written to the log file once per second, and the log file is flushed to disk at the same time. In this mode, committing a transaction does not actively trigger a write to disk.
1      (default; real-time write, real-time flush) Each time a transaction commits, MySQL writes the log buffer to the log file and flushes it to disk.
2      (real-time write, delayed flush) Each time a transaction commits, MySQL writes the log buffer to the log file, but does not flush at the same time. In this mode, MySQL performs the flush once per second.
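The three settings can be compared with a small simulation that tracks where a committed record sits: in the log buffer, in the operating system cache, or on disk. The class `RedoLogWriter` and its `tick()` method (standing in for the once-per-second background work) are hypothetical names used only for this sketch.

```python
class RedoLogWriter:
    """Toy model of the three commit policies (values 0, 1 and 2)."""

    def __init__(self, policy):
        self.policy = policy
        self.log_buffer = []   # process memory
        self.os_cache = []     # written to the file, but not yet flushed
        self.disk = []         # durable

    def _write(self):
        self.os_cache.extend(self.log_buffer)
        self.log_buffer.clear()

    def _flush(self):
        self.disk.extend(self.os_cache)
        self.os_cache.clear()

    def commit(self, record):
        self.log_buffer.append(record)
        if self.policy in (1, 2):   # write at commit time
            self._write()
        if self.policy == 1:        # flush at commit time as well
            self._flush()

    def tick(self):
        """Once-per-second background work: write and flush whatever is pending."""
        self._write()
        self._flush()


writers = {policy: RedoLogWriter(policy) for policy in (0, 1, 2)}
for writer in writers.values():
    writer.commit("trx-1")
```

Right after the commit, the record is still in the log buffer with value 0, already on disk with value 1, and in the operating system cache with value 2; one `tick()` later it is on disk in all three cases.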


SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';


3) Both are writes to disk, so why not write the data file directly instead of writing the log first?

Let's first understand the concepts of random I/O and sequential I/O. The smallest unit of a disk is the sector, usually 512 bytes. The smallest unit the operating system uses when dealing with memory is the page. The smallest unit the operating system uses when reading from or writing to the disk is the block.


If the data we need is scattered randomly across different sectors of different pages, then for each piece of data the disk arm has to move to the right position and the disk has to locate the right sector before the piece we need can be read. This process is repeated until all the data has been found. This is random I/O, and reading data this way is slow.

Assuming that we have found the first piece of data, and other required data is just behind this piece of data, then there is no need to re-address, and we can get the data we need in turn. This is called sequential IO.

Flushing data pages is random I/O, while writing the log is sequential I/O, which is more efficient. So the engine first modifies the data in the buffer pool in memory and writes the change to the log to guarantee that it will not be lost, then waits for a suitable moment (when the system is relatively idle) to apply the changes to the data files on disk. Delaying the flush in this way improves system throughput. The main significance of the redo log is that it relaxes the requirements on when data pages must be flushed.
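The difference can be illustrated with a deliberately crude cost model that counts a "seek" whenever the next block does not directly follow the previous one. The function `count_seeks` and the block addresses are made up for this sketch; real disk costs depend on far more than this.

```python
def count_seeks(block_addresses):
    """Count how many accesses need repositioning, i.e. do not follow the previous block."""
    seeks = 0
    previous = None
    for address in block_addresses:
        if previous is None or address != previous + 1:
            seeks += 1
        previous = address
    return seeks


log_appends = [1000, 1001, 1002, 1003]   # redo log: appended one block after another
page_flushes = [17, 942, 305, 18]        # dirty pages: scattered across the data file

sequential_cost = count_seeks(log_appends)
random_cost = count_seeks(page_flushes)
```

In this model the four log appends need a single positioning step, while the four page flushes need one each.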

4) What are the characteristics of redo log?

1) The redo log is implemented by the InnoDB storage engine; not every storage engine has one.

2) It does not record the state of a data page after the update, nor a row-level description of how certain rows were modified. It records what change was made to which page, so it is a physical log. It is used to restore physical data pages after a crash (it restores the data pages, and only up to the point of the last commit).

Logs are divided into physical logs and logical logs:

  • A physical log records the data directly, together with the offset of the modification within the page. The advantage is that it does not depend on the content of the original page: the log content can be written over the page on disk as-is. The disadvantage is that it takes up a lot of space, for example when a new B-tree index is added or for an update operation.
  • A logical log records only the high-level operations on the relational table, such as updating a row or deleting a row. The advantage is that it is more concise and takes up little space. The disadvantage is that it depends on the content of the original page, and it is exposed to partial-execution and operation-consistency problems.
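The dependence on the original content can be made concrete with a small sketch. The record formats here are invented for illustration: a physical record is an `(offset, bytes)` pair applied to a page image, and a logical record is a "add this amount to this row" operation applied to a dict.

```python
def apply_physical(page, record):
    """Physical record: overwrite bytes at a fixed offset, regardless of what was there."""
    offset, data = record
    page[offset:offset + len(data)] = data


def apply_logical(table, record):
    """Logical record: an operation whose result depends on the current row content."""
    name, delta = record
    table[name] += delta


# Physical: replaying the same record twice leaves the page in the same state.
page = bytearray(b"alice:100;bob:200;")
physical_record = (6, b"150")            # bytes 6..8 hold alice's balance
apply_physical(page, physical_record)
apply_physical(page, physical_record)

# Logical: replaying the same record twice applies the change twice.
table = {"alice": 100}
logical_record = ("alice", 50)
apply_logical(table, logical_record)
apply_logical(table, logical_record)
```

The physical record is safe to replay during recovery without knowing whether it was already applied, while the logical record is not.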

3) Redo log is divided into two parts: memory and disk:

  • Memory part: the Log Buffer, whose content has to be flushed to disk.
  • Disk part: corresponding to ib_logfile0 and ib_logfile1 in the /var/lib/mysql/ directory, each 48M.

4) Because the redo log records changes to data pages, and there is no need to keep every such change record forever, the size of the redo log is fixed. When the end is reached, writing wraps around and overwrites the earliest content.
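The fixed-size, wrap-around behaviour can be sketched as a ring buffer. The names `write_pos` and `checkpoint` are the terms commonly used when describing this structure and are assumptions here, as is the rule that only records before the checkpoint (changes already flushed to the data files) may be overwritten.

```python
class CircularRedoLog:
    """Toy fixed-size redo log that reuses space behind the checkpoint."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.write_pos = 0      # total number of records written so far
        self.checkpoint = 0     # records before this are already in the data files

    def free_slots(self):
        return self.capacity - (self.write_pos - self.checkpoint)

    def append(self, record):
        if self.free_slots() == 0:
            raise RuntimeError("redo log full: flush dirty pages and advance the checkpoint")
        self.slots[self.write_pos % self.capacity] = record   # wraps around at the end
        self.write_pos += 1

    def advance_checkpoint(self, flushed_records):
        self.checkpoint = min(self.write_pos, self.checkpoint + flushed_records)


log = CircularRedoLog(capacity=4)
for name in ("r0", "r1", "r2", "r3"):
    log.append(name)
free_when_full = log.free_slots()   # 0: the next append would have to wait
log.advance_checkpoint(2)           # the changes of r0 and r1 reached the data files
log.append("r4")                    # reuses the slot that held r0
```

When no slot is free, the append has to wait for dirty pages to be flushed, which matches the "redo log is full" trigger mentioned for the Change Buffer merge.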

Once again, the redo log is used mainly for crash recovery. The data in the data files on disk comes from the buffer pool. The redo log is a log file written to disk, not a data file.

The above is the memory structure of InnoDB. To sum up, it is divided into: Buffer Pool, Change Buffer, Adaptive Hash Index, and Log Buffer.


Origin blog.csdn.net/weixin_43935927/article/details/113982766