Detailed explanation of MySQL storage engine (1) - InnoDB architecture

Table of contents

Foreword

1. Supported storage engines

2. InnoDB engine

1. Buffer Pool

Traditional LRU algorithm

Read-ahead

Read-ahead failure

2. Log Buffer

3. Adaptive Hash Index

4. Change Buffer

See also



Foreword

At present, MySQL 8.x supports quite a few storage engines, but in practice we only ever use a handful of them. It is easy to get stuck in a habit and never try the others, missing many of the storage optimizations they provide. It is therefore worth getting a clear picture of the nine storage engines listed by the current version; since FEDERATED is not enabled by default, this article clarifies the functions, roles, and usage scenarios of the remaining eight.

This series is included in my column article "Quickly Learn SQL Database Operations", which covers most of what is needed to handle daily business with SQL: routine query building, database analysis, and more complex operations. It took a lot of time and effort to build, from creating databases and tables to handling complex operations and explaining common SQL functions, so that you can pick up the most practical, most common knowledge quickly. The post is long but worth reading and practicing along with. I will maintain it over the long term, so if you find any mistakes or have doubts, please point them out in the comment area. Thank you for your support.


1. Supported storage engines

Log in to MySQL and run the following statement to list every storage engine the server supports:

SHOW ENGINES;

In the output, the FEDERATED engine is listed but not supported (it is disabled by default), so we only need to understand the other eight storage engines.

The most commonly used engines in MySQL are InnoDB, MyISAM, and Memory, so let's first get to know these three.
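As a quick illustration (the table name demo_orders below is made up for the example), you can specify the engine explicitly when creating a table and check which engine existing tables use:

-- InnoDB is the default, but the engine can be stated explicitly.
CREATE TABLE demo_orders (
    id BIGINT PRIMARY KEY,
    amount DECIMAL(10, 2)
) ENGINE = InnoDB;

-- List the engine used by every table in the current schema.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE();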

2. InnoDB engine

InnoDB is MySQL's default storage engine and supports transaction safety. Data in MySQL is ultimately stored on the physical disk, while the real data processing happens in memory. Since disk reads and writes are very slow, performance would be very poor if every operation had to read from or write to the disk.

To solve this problem, InnoDB divides data into pages and uses the page as the basic unit of exchange between disk and memory. The default page size is 16KB, so at least one page of data is read into memory, or one page of data is written to disk, at a time. Reducing the number of memory-disk interactions in this way improves performance.
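You can confirm the page size on a running server; the check below assumes the default configuration (the page size can only be chosen when the data directory is initialized):

-- Returns 16384 (16KB) unless the server was initialized with another page size.
SHOW VARIABLES LIKE 'innodb_page_size';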

This is essentially a classic cache design idea. Cache design is generally approached from the time dimension or the space dimension:

  • Time dimension: if a piece of data is being used, there is a high probability that it will be used again in the near future. Hot-data caching is an implementation of this idea.
  • Space dimension: if a piece of data is being used, there is a high probability that data stored near it will also be used soon. InnoDB's data pages and the operating system's page cache are embodiments of this idea.

Below is the official InnoDB architecture diagram, which is divided into two main parts: the in-memory structures and the on-disk structures.

[Figure: official InnoDB architecture diagram, showing the in-memory structures and the on-disk structures]

The memory structure mainly includes four components: Buffer Pool, Change Buffer, Adaptive Hash Index and Log Buffer.

1. Buffer Pool

The Buffer Pool (BP for short) caches data pages, index pages, the insert buffer, the adaptive hash index, lock information, and the data dictionary. The BP manages memory in units of pages (16KB by default) and uses linked lists internally to organize those pages. When InnoDB accesses table records and indexes, the corresponding pages are cached in the buffer pool, so later accesses avoid disk I/O and run faster.

Simply put, the buffer pool is a memory area that uses the speed of RAM to compensate for the impact of slow disks on database performance. When a page is read, it is first fetched from disk and placed in the buffer pool; this is called "FIXing" the page in the buffer pool. The next time the same page is needed, InnoDB first checks whether it is already in the buffer pool. If it is, the page is said to be hit in the buffer pool and is read directly from memory; otherwise it is read from disk. For modifications, the page in the buffer pool is changed first and then flushed to disk at a certain frequency. Note that flushing is not triggered every time a page is updated; dirty pages are written back through a mechanism called Checkpoint, again to improve the overall performance of the database.
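As a rough sketch of how to inspect the buffer pool on a running server (the variable and information_schema table names are the standard MySQL 5.7/8.0 ones):

-- Total buffer pool size in bytes (default 128MB) and the number of instances.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'innodb_buffer_pool_instances';

-- Per-instance statistics: total pages, free pages, and data pages currently cached.
SELECT POOL_ID, POOL_SIZE, FREE_BUFFERS, DATABASE_PAGES
FROM information_schema.INNODB_BUFFER_POOL_STATS;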

Traditional LRU algorithm

The buffer pool is managed with an LRU (Least Recently Used) algorithm: the most recently used pages sit at the front of the LRU list and the least recently used pages sit at the end. When the buffer pool has no room for a newly read page, the page at the tail of the LRU list is released first:

(1) If the page is already in the buffer pool, it is simply moved to the head of the LRU list and no page is evicted;

(2) If the page is not in the buffer pool, then besides putting it at the head of the LRU list, the page at the tail of the list must also be evicted;

However, InnoDB's LRU algorithm is not the traditional LRU algorithm.

There are two problems here:

(1) Read-ahead failure;

(2) Buffer pool pollution;

Let's first understand what read-ahead is.

Read-ahead

Disk reads are not performed on demand but by page, with at least one page of data (usually 4KB at the operating-system level) read at a time. If data needed in the future is already in that page, subsequent disk I/O is saved and efficiency improves. Data access generally follows the principle of locality: when some data is used, nearby data is very likely to be used soon. This is why loading data ahead of time is effective and can indeed reduce disk I/O.

Read-ahead failure

Because of read-ahead, a page may be put into the buffer pool in advance, but MySQL never ends up reading data from it. This is called a read-ahead failure.

To mitigate read-ahead failures, the idea is:

(1) Let pages loaded by a failed read-ahead stay in the buffer pool LRU for as short a time as possible;

(2) Let pages that are actually read be moved to the head of the buffer pool LRU;

so that the hot data that is actually read stays in the buffer pool for as long as possible.

The specific method is:

(1) Divide the LRU into two parts:

  •     New generation (new sublist)
  •     Old generation (old sublist)

(2) The new and old generations are joined end to end, that is, the tail of the new generation connects to the head of the old generation;

(3) When a new page (such as a read-ahead page) is added to the buffer pool, it is only inserted at the head of the old generation:

  •     If the data is actually read (the read-ahead succeeded), the page is moved to the head of the new generation
  •     If the data is never read, the page is evicted from the buffer pool earlier than the "hot" data pages in the new generation

This LRU improved with new and old generations still cannot, by itself, solve the problem of buffer pool pollution.
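The split described above is controlled by two server variables; the sketch below uses the standard variable names and the usual MySQL 5.7/8.0 defaults:

-- Share of the LRU list reserved for the old generation (default 37%).
SHOW VARIABLES LIKE 'innodb_old_blocks_pct';

-- A page must sit in the old generation at least this many milliseconds
-- (default 1000) before another access can promote it to the new generation;
-- this time window is what limits pollution from one-off full scans.
SHOW VARIABLES LIKE 'innodb_old_blocks_time';

-- Both can be adjusted at runtime, for example:
SET GLOBAL innodb_old_blocks_time = 2000;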

2. Log Buffer

Log Buffer is used to cache redo logs.

InnoDB has two very important logs: the undo log and the redo log.
(1) Through the undo log you can see earlier versions of the data, which supports MVCC and transaction rollback.
(2) The redo log is used to guarantee transaction durability.

 

The redo log buffer is an in-memory storage area that holds data to be written to the log file on disk. The log buffer size is defined by the innodb_log_buffer_size variable, and the default size is 16MB.

The contents of the log buffer are periodically flushed to disk. A large log buffer lets large transactions run without writing their redo log data to disk before they commit. So if you have transactions that update, insert, or delete many rows, increasing the size of the log buffer saves disk I/O.

innodb_flush_log_at_trx_commit : Controls how the contents of the log buffer are written and flushed to disk.
innodb_flush_log_at_timeout : Controls how often the log is flushed.
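A minimal sketch for checking these settings (standard variable names; the values described are the common defaults):

-- 1 (default) = write and flush redo to disk at every commit (full durability);
-- 2 = write at commit, flush to disk about once per second;
-- 0 = write and flush about once per second (fastest, least durable).
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- Log buffer size in bytes (default 16MB) and the flush interval in seconds.
SHOW VARIABLES LIKE 'innodb_log_buffer_size';
SHOW VARIABLES LIKE 'innodb_flush_log_at_timeout';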

If disk I/O is causing performance problems, keep an eye on transactions that generate a lot of redo, such as those involving many BLOB entries. Whenever the InnoDB log buffer fills up, it is flushed to disk, so increasing the buffer size reduces this extra I/O.

By default there are two redo log files: ib_logfile0 and ib_logfile1.
Each log file has a fixed size; the default size depends on the MySQL version.
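To see the redo log layout on your own version (the variables below apply up to MySQL 8.0.29; from 8.0.30 the redo log is sized with innodb_redo_log_capacity instead):

-- Size of each redo log file and the number of files in the group.
SHOW VARIABLES LIKE 'innodb_log_file_size';
SHOW VARIABLES LIKE 'innodb_log_files_in_group';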

3. Adaptive Hash Index

The Adaptive Hash Index (AHI) is a key-value storage structure whose entries point to records located in hot pages. The InnoDB storage engine automatically builds hash indexes for certain pages based on the frequency and pattern of access.

Comparison between the B+ tree index and the Adaptive Hash Index:

  •     Query time complexity: O(tree height) for the B+ tree index vs. O(1) for the adaptive hash index
  •     Persistence: the B+ tree index is persistent, with logging to guarantee integrity; the adaptive hash index lives only in memory
  •     Index object: the B+ tree index addresses pages; the adaptive hash index addresses the records in hot pages

The comparison above shows the difference between the B+ tree index and the adaptive hash index. The feature is turned on or off with the parameter innodb_adaptive_hash_index and is enabled by default.
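A minimal way to check or toggle the feature at runtime (standard variable names; AHI activity also shows up in SHOW ENGINE INNODB STATUS under the "INSERT BUFFER AND ADAPTIVE HASH INDEX" section):

-- ON by default.
SHOW VARIABLES LIKE 'innodb_adaptive_hash_index';

-- Disable it if hash index maintenance or contention hurts your workload.
SET GLOBAL innodb_adaptive_hash_index = OFF;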

4. Change Buffer

Change Buffer: data in MySQL lives in two places, memory and disk. Hot data pages and index pages are cached in the buffer pool to reduce disk reads; the change buffer is a mechanism for reducing disk writes.

When a data page needs to be updated and the page is already in memory, it is updated directly. If the page is not in memory, then, as long as data consistency is not affected, InnoDB caches the update operation in the change buffer so the page does not have to be read from disk first. When a later query needs that page, the page is read into memory and the operations recorded for it in the change buffer are applied. This keeps the data logically correct.
Although it is called a buffer, the change buffer is actually persistent data: it has a copy in memory and is also written to disk (in ibdata).

The process of merging the operations recorded in the change buffer into the original data page to produce the latest result is called merge. A merge is triggered when:

  • the data page is accessed;
  • the background master thread performs its periodic merge;
  • the database buffer pool runs short of space;
  • the database is shut down normally;
  • the redo log is full;

In short, the change buffer is a technique for writes to non-unique secondary index pages that are not in the buffer pool: the change is recorded in the change buffer first, and when the data is read later, the buffered operations are merged into the original page. Before MySQL 5.5 it was called the insert buffer and optimized only inserts; it now also covers deletes and updates, hence the name change buffer.
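A rough sketch of how to inspect and tune it (standard variables; by default all operation types are buffered and the change buffer may use at most 25% of the buffer pool):

-- Which operations are buffered: all (default), none, inserts, deletes, changes, purges.
SHOW VARIABLES LIKE 'innodb_change_buffering';

-- Maximum share of the buffer pool the change buffer may occupy (default 25).
SHOW VARIABLES LIKE 'innodb_change_buffer_max_size';

-- Example: restrict buffering to insert operations only.
SET GLOBAL innodb_change_buffering = 'inserts';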


See also

Detailed explanation of the MySQL storage engine InnoDB, seeing the InnoDB data structures from the bottom up

MySQL 5.7 Adaptive Hash Index

Change buffer (write buffer)

Buffer pool: this time I fully understand it!!!
