No fluff, all substance: InnoDB's underlying principles

Storage engine

Many articles jump straight into which storage engines are available without first explaining what a storage engine is. So what is a storage engine? Have you ever wondered how MySQL actually stores the data we throw at it?

A storage engine is actually quite simple: think of it as a storage solution that implements operations such as inserting data, updating data, and building indexes.

What are the existing storage engines we can choose from?

InnoDB, MyISAM, Memory, CSV, Archive, Blackhole, Merge, Federated, Example

There are many of them, but the only two in common use are InnoDB and MyISAM, and those two are what I will focus on.
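
If you want to check which engines your own server ships with and which one is the default, two standard statements do it (output varies by MySQL version and build):

```sql
-- List all available storage engines; the Support column shows
-- DEFAULT for the default engine and YES/NO for the others.
SHOW ENGINES;

-- The engine used when CREATE TABLE does not name one explicitly.
SHOW VARIABLES LIKE 'default_storage_engine';
```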

InnoDB is currently the most widely used MySQL storage engine and has been the default since MySQL 5.5. So do you know why InnoDB is used so widely? Put that question aside for now; let's first understand the underlying principles of the InnoDB storage engine.

InnoDB's memory architecture is divided into three major parts: the buffer pool (Buffer Pool), the redo log buffer (Redo Log Buffer), and the additional memory pool.

Buffer pool

InnoDB stores data on disk so that it is persistent. But under a large volume of requests, the gap between the CPU's processing speed and the disk's I/O speed is too wide, so InnoDB introduces a buffer pool to improve overall efficiency.

When a query arrives and the data is not in the buffer pool, InnoDB reads from disk and places the matching pages into the buffer pool. Likewise, when a request modifies data, MySQL does not touch the disk directly; it modifies the pages in the buffer pool and flushes the data back to disk later. That is the buffer pool's job: speed up reads, speed up writes, and reduce I/O round trips to the disk.
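
A quick sketch of how you would inspect the buffer pool on a live server (these are standard server variables and status counters):

```sql
-- Total buffer pool size in bytes (134217728, i.e. 128MB, is the default).
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Logical read requests vs. reads that had to hit the disk; comparing
-- the two gives a rough cache hit rate.
SHOW STATUS LIKE 'Innodb_buffer_pool_read%';
```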

To put it bluntly, the buffer pool takes data that lives on disk and keeps it in memory. And since memory is finite, free space eventually runs out, so the buffer pool uses the LRU algorithm to evict pages when no free pages remain. A naive LRU, however, brings a problem called buffer pool pollution.

When you run a batch scan or even a full table scan, all the hot pages in the buffer pool may be evicted, which can make MySQL performance fall off a cliff. So InnoDB made some optimizations to LRU to avoid this problem.
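
For reference, the optimization InnoDB uses is a midpoint insertion strategy: the LRU list is split into a young sublist and an old sublist, newly read pages enter the head of the old sublist, and a page is only promoted to the young end if it is touched again after a minimum time window. Two server variables control this:

```sql
-- Percentage of the LRU list reserved for the old sublist (default 37).
SHOW VARIABLES LIKE 'innodb_old_blocks_pct';

-- A page must sit in the old sublist for this many milliseconds before
-- another access can promote it; raising it hardens the pool against scans.
SHOW VARIABLES LIKE 'innodb_old_blocks_time';

-- Example: make full table scans even less likely to evict hot pages.
SET GLOBAL innodb_old_blocks_time = 2000;
```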

MySQL logs first and writes later: before actually writing data, it records a log called the Redo Log, and it periodically uses the CheckPoint mechanism to flush the logged changes to disk. More on that later.

Besides data pages, the buffer pool also holds index pages, Undo pages, the insert buffer, the adaptive hash index, InnoDB lock information, and the data dictionary. Let's briefly go over a few of the more important ones.
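
You can actually see this mix on a running server through information_schema (a sketch; note this table can be expensive to query when the pool is large):

```sql
-- Count buffer pool pages by type: INDEX, UNDO_LOG, IBUF_INDEX
-- (insert buffer), SYSTEM, and so on.
SELECT page_type, COUNT(*) AS pages
FROM information_schema.INNODB_BUFFER_PAGE
GROUP BY page_type
ORDER BY pages DESC;
```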

Insert buffer

The insert buffer comes into play on update or insert operations. Consider the worst case, where the data to be updated is not in the buffer pool. At that point there are two options:

1. Write each piece of data to disk directly as it arrives.
2. Buffer the writes until they reach a certain threshold (say, 50 entries), then write them to disk in one batch.
Obviously, the second option is better: it reduces the number of I/O interactions with the disk.
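
In current MySQL versions this mechanism is generalized into the change buffer, and it is configurable. A sketch of the relevant knob (the listed values are the documented options):

```sql
-- What the change buffer may buffer:
-- all (default), none, inserts, deletes, changes, purges.
SHOW VARIABLES LIKE 'innodb_change_buffering';

-- Example: turn buffering off entirely, forcing option 1 above:
-- every secondary index change goes straight to its page.
SET GLOBAL innodb_change_buffering = 'none';
```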

Doublewrite

Since we have talked about the insert buffer, we have to mention doublewrite, because I think these two InnoDB features complement each other.

The insert buffer improves MySQL's performance, and doublewrite improves data reliability on top of that. We know that while data is still in the buffer pool, a crash that loses a write can be recovered from using the Redo Log. But what if the machine crashes while a page is being flushed from the buffer pool back to disk?

This situation is called a partial page write, and the redo log cannot solve it on its own: the redo log records changes made to pages, so it needs an intact page to replay against, and a half-written page is no longer intact.

When a dirty page is flushed, it is not written straight to its final position. It is first copied to the Doublewrite Buffer in memory, then written to a shared tablespace on disk (think of it as a scratch area on disk) in 1MB chunks; only after that completes are the pages in the Doublewrite Buffer written to their real locations in the data files.

With the doublewrite mechanism, even if the machine goes down while a dirty page is being flushed, a copy of the page can be found in the shared tablespace when the instance recovers and used to overwrite the damaged data page.
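
Doublewrite is enabled by default; a minimal check, along with the counters that show it working:

```sql
-- Whether the doublewrite buffer is in use (ON by default).
SHOW VARIABLES LIKE 'innodb_doublewrite';

-- Pages written through it vs. physical write operations; the ratio
-- reflects the batched 1MB copies described above.
SHOW STATUS LIKE 'Innodb_dblwr%';
```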

Adaptive hash index

The adaptive hash index works much like the JVM at runtime, which dynamically compiles hot code paths into machine code: InnoDB monitors queries against all indexes and builds hash indexes for hot pages to speed up access.
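
The feature is on by default and can be toggled; its activity is visible in the engine status report. A quick sketch:

```sql
-- Toggle for the adaptive hash index (ON by default).
SHOW VARIABLES LIKE 'innodb_adaptive_hash_index';

-- The "INSERT BUFFER AND ADAPTIVE HASH INDEX" section of this report
-- shows hash searches/s versus non-hash searches/s.
SHOW ENGINE INNODB STATUS\G
```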

You may have noticed the keyword "page" appearing many times by now, so let's talk about what a page actually is.

Page

The page is the smallest unit of data management in InnoDB. When we query data, whole pages are loaded from disk into the buffer pool, and likewise, when we update data, our changes are flushed back to disk page by page. Each page is 16KB by default and contains several rows of data. The page structure is shown in the figure below.

(Figure: InnoDB page structure)
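
The 16KB figure is itself a server variable, fixed when the instance's data files are first created (a sketch; 16384 bytes is the default):

```sql
-- Page size in bytes; chosen at initialization and unchangeable for an
-- existing instance (valid values are 4k, 8k, 16k, 32k, 64k).
SHOW VARIABLES LIKE 'innodb_page_size';
```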

Don't worry too much about what each region does; we only need to see where the benefit of this design lies. Each page's FileHeader holds pointers to the previous and next pages, linking the pages into a doubly linked list, because in actual physical storage the data is not laid out contiguously. You can think of it like the distribution of G1 Regions in memory.

The rows contained in a page form a singly linked list from row to row. The row data we save ultimately ends up in User Records, which initially occupies no storage space. As we store more and more data, User Records grows larger and larger while Free Space shrinks, until Free Space is used up and a new data page is requested.

The rows in User Records are sorted by primary key id. When we look up by primary key, we walk along this singly linked list.

Redo log buffer

As discussed above, page data in InnoDB's buffer pool is updated before the data on disk, and InnoDB uses the Write Ahead Log strategy when refreshing data. What does that mean? When a transaction starts, the Redo Log is first recorded into the Redo Log Buffer, and only then is the page data in the buffer pool updated.

The contents of the Redo Log Buffer are written to the redo log file at a certain frequency. Pages that have been changed are marked as dirty pages, and InnoDB flushes dirty pages to disk according to the CheckPoint mechanism.
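
How aggressively the Redo Log Buffer reaches disk is controlled by a well-known variable; a brief sketch of the trade-off:

```sql
-- When the redo log is flushed to disk:
--   1 (default) = write and flush at every transaction commit (safest),
--   0 = write and flush roughly once per second,
--   2 = write at commit, flush roughly once per second.
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- Size of the in-memory Redo Log Buffer itself.
SHOW VARIABLES LIKE 'innodb_log_buffer_size';
```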

Log

The redo log was mentioned above; this section focuses on logging as a whole. Logs can be divided along the following two dimensions.

MySQL level

InnoDB level

MySQL log

MySQL-level logs can be divided into the error log, the binary log, the general query log, and the slow query log.

The error log is easy to understand: it records serious errors that occur while the service is running. When your database refuses to start, this is where to look for the specific reason.
The binary log has another name you should be familiar with, Binlog; it records all changes made to the database.
The general query log records every statement received from clients.
The slow query log records every SQL statement whose response time exceeds a threshold. We can set this threshold ourselves via the long_query_time parameter; the default is 10 seconds, and the log is disabled by default, so it must be enabled manually.
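
Enabling the slow query log and tightening the threshold at runtime looks like this (a sketch; put the same settings in my.cnf to make them survive a restart):

```sql
-- Turn the slow query log on for the running server.
SET GLOBAL slow_query_log = ON;

-- Log any statement slower than 1 second (the default is 10).
SET GLOBAL long_query_time = 1;

-- Confirm where the log file is written.
SHOW VARIABLES LIKE 'slow_query_log_file';
```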

InnoDB log

InnoDB has only two logs: the Redo Log and the Undo Log.

The Redo Log records the changes made by transaction operations, i.e., the modified values, and it is recorded whether or not the transaction commits. For example, when updating data, the updated record is written to the Redo Log first, then the page data in the cache is updated, and finally the in-memory data is flushed back to disk according to the configured flush strategy.
The Undo Log records the version of the data as it was before the transaction started, so it can be used to roll back when the transaction fails.
The Redo Log records modifications to specific data pages and can only be used by the current InnoDB instance, while the Binlog is logical and can be consumed regardless of storage engine. That is also where the Binlog plays an important role: master-slave replication, and its other role, data recovery.

As mentioned above, all modifications to the database are recorded in the Binlog, and the Binlog comes in three formats: Statement, Row, and Mixed.

Statement records every SQL statement that modifies data. It records only the SQL itself, not all the rows the SQL affects, which reduces log volume and improves performance. However, because only the executed statement is recorded, correct replay on a slave node is not guaranteed, so some additional context information has to be recorded as well.
Row saves only the modified records. Compared with Statement, which records only the executed SQL, Row generates a great deal more log data, but it needs no context information; it cares only about what was changed.
Mixed uses Statement and Row together.
Which format to use depends on the actual situation. For example, for an UPDATE statement that touches many rows, Statement saves space, but Row is correspondingly more reliable.
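
Checking and switching the format is straightforward (a sketch; since MySQL 5.7.7 the default format is ROW):

```sql
-- Current binlog format: STATEMENT, ROW, or MIXED.
SHOW VARIABLES LIKE 'binlog_format';

-- Switch the format for the current session only, e.g. for one bulk UPDATE.
SET SESSION binlog_format = 'STATEMENT';

-- List the binlog files the server currently holds.
SHOW BINARY LOGS;
```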

The difference between InnoDB and MyISAM

Since MyISAM is not commonly used, I am not going to dig into its underlying principles and implementation. Here we simply compare the two storage engines, point by point.

Transactions: InnoDB supports transactions, rollback, transaction safety, and crash recovery. MyISAM supports none of these, but its queries are faster than InnoDB's.
Primary keys: InnoDB stipulates that if no primary key is set, a 6-byte primary key is generated automatically. MyISAM allows a table with no indexes or primary key at all; its indexes store the address of the row.
Foreign keys: InnoDB supports foreign keys; MyISAM does not.
Locks: InnoDB supports both row locks and table locks, while MyISAM supports only table locks.
Full-text indexes: InnoDB does not support full-text indexing itself, though plugins can provide the equivalent functionality, whereas MyISAM supports full-text indexing natively.
Row count: to get a row count, InnoDB has to scan the whole table, while MyISAM stores the table's current total and can read it back directly.
So, to briefly summarize: MyISAM suits only scenarios where queries far outnumber updates. If your workload is almost entirely reads (a reporting system, for example), MyISAM is an option; in every other case, InnoDB is recommended.
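
Since the engine is chosen per table, a single system can follow exactly this advice (a minimal sketch; the table names are made up for illustration):

```sql
-- Transactional, write-heavy data: InnoDB.
CREATE TABLE orders (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    amount DECIMAL(10, 2) NOT NULL
) ENGINE = InnoDB;

-- Read-mostly reporting data: MyISAM.
CREATE TABLE report_snapshot (
    id BIGINT PRIMARY KEY,
    total_rows INT NOT NULL
) ENGINE = MyISAM;
```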

End

Due to time constraints, this article only sketches InnoDB's overall architecture and does not dig deep into certain points: for example, how InnoDB improves LRU to solve buffer pool pollution and what that algorithm looks like, or how CheckPoint works. Consider this a first pass; I'll go into detail when time allows.

This article address: https://www.linuxprobe.com/underlying-principles-innodb.html
