MySQL's three major logs: undo log, redo log, binlog

Table of contents

A SQL execution process

Why do you need undo log?

 How does undo log flush (persist to disk)?

Why do you need Buffer Pool?

What does Buffer Pool cache?

 What does the Undo page record?

When querying a record, do I only need to buffer one record?

Why do we need redo log?

What is redo log?

If the Undo page is modified, do I need to record the corresponding redo log?

What is the difference between redo log and undo log?

The redo log needs to be written to the disk, and the data must also be written to the disk. Why bother?

When will the redo log be flushed?

Why do you need binlog?

Why do we need redo log when we have binlog?

What is the difference between redo log and binlog?

If the entire database data is accidentally deleted, can the data be restored using the redo log file?


A SQL execution process

We already know the process a query statement goes through, which is the process of "reading" a record. So what happens when an update statement is executed, for example the following one?

UPDATE t_user SET name = 'xiaolin' WHERE id = 1;

An update statement goes through the same stages as a query statement:

  • The client first establishes a connection through the connector, and the connector verifies the user's identity;
  • Because this is an update statement, it does not need to go through the query cache. However, any update statement on a table clears the query cache for that entire table. The query cache is therefore of little use, and it was removed in MySQL 8.0;
  • The parser identifies the keyword update, the table name, and so on through lexical analysis, then performs syntax analysis to check whether the statement conforms to MySQL syntax and builds a syntax tree;
  • The preprocessor checks whether the table and fields exist;
  • The optimizer determines the execution plan. Because id in the where condition is the primary key, it decides to use the primary key index;
  • The executor carries out the plan: it finds the row and then updates it.

However, executing an update statement also involves three kinds of logs: the undo log (rollback log), the redo log, and the binlog (archive log):

  • undo log (rollback log): generated by the InnoDB storage engine layer. It implements atomicity in transactions and is mainly used for transaction rollback and MVCC;
  • redo log: generated by the InnoDB storage engine layer. It implements durability in transactions and is mainly used for recovery from failures such as power outages;
  • binlog (archive log): generated by the server layer. It is mainly used for data backup and master-slave replication.

Why do you need undo log?

When we execute an insert, delete, or update statement without explicitly typing begin to start a transaction and commit to commit it, MySQL implicitly starts a transaction to execute the statement and automatically commits it afterwards. This guarantees that the result of the statement is visible in the table as soon as it finishes.

Whether a transaction is committed automatically after each statement is determined by the autocommit parameter, which is enabled by default. Therefore, even a single update statement runs inside a transaction.

So consider a question: if MySQL crashes during the execution of a transaction, before the transaction is committed, how do we roll back to the data as it was before the transaction?

If we record the information needed for rollback in a log each time the transaction makes a change, then even if MySQL crashes midway through the transaction, we can still use this log to roll back to the data as it was before the transaction.

The mechanism that implements this is the undo log (rollback log), which guarantees atomicity, the A in the ACID properties of transactions.

The undo log is a log used for rollback. Before a transaction is committed, MySQL first records the pre-update data in the undo log. When the transaction is rolled back, the undo log is used to restore the data.

Whenever the InnoDB engine operates on a record (insert, delete, update), it must record all the information required for rollback in the undo log. For example:

  • When inserting a record, write down its primary key value, so that a later rollback only needs to delete the record with that primary key value;
  • When deleting a record, write down the full contents of the record, so that a later rollback can insert a record with those contents back into the table;
  • When updating a record, write down the old values of the updated columns, so that a rollback can set those columns back to their old values.

When a rollback occurs, the data in the undo log is read, and then the reverse operation is performed. For example, when a record is deleted, the contents of the record will be recorded in the undo log, and then when a rollback operation is performed, the data in the undo log will be read, and then the insert operation will be performed.
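The three cases above can be sketched with a toy model. This is only an illustration under simplifying assumptions: the table is a Python dict keyed by primary key, the undo log is a plain list, and the helper names (insert, delete, update, rollback) are made up for this example rather than taken from InnoDB.

```python
# Toy model: the table is a dict keyed by primary key, the undo log is a list.
table = {1: {"name": "old", "age": 20}}
undo_log = []


def insert(pk, row):
    undo_log.append(("insert", pk))                   # the primary key is enough to undo
    table[pk] = dict(row)


def delete(pk):
    undo_log.append(("delete", pk, dict(table[pk])))  # keep the whole row
    del table[pk]


def update(pk, changes):
    old = {col: table[pk][col] for col in changes}    # old values of the updated columns
    undo_log.append(("update", pk, old))
    table[pk].update(changes)


def rollback():
    # Read the undo log from newest to oldest and apply the reverse operation.
    while undo_log:
        entry = undo_log.pop()
        kind, pk = entry[0], entry[1]
        if kind == "insert":
            del table[pk]                # undo an insert by deleting
        elif kind == "delete":
            table[pk] = entry[2]         # undo a delete by re-inserting
        else:
            table[pk].update(entry[2])   # undo an update by restoring old values


snapshot = {pk: dict(row) for pk, row in table.items()}
update(1, {"name": "xiaolin"})
insert(2, {"name": "b", "age": 30})
delete(1)
rollback()
print(table == snapshot)  # True
```

Note that the undo records are replayed in reverse order, which is why the deleted row comes back before the earlier update to it is reverted.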

Different operations need to record different contents, so the undo log format differs for each type of operation (update, delete, insert). I will not go into the specific undo log format for each operation here; if you are interested, you can look it up yourself.

The undo log format generated by each update operation of a record has a roll_pointer pointer and a trx_id transaction id:

  • Through trx_id, you can know which transaction modified the record;
  • These undo logs can be strung into a linked list through the roll_pointer pointer. This linked list is called a version chain;

[Figure: the version chain formed by undo log records linked through roll_pointer]

In addition, the undo log has another function: implementing MVCC (multi-version concurrency control) through Read View + undo log.

For transactions at the "read committed" and "repeatable read" isolation levels, snapshot reads (ordinary select statements) are implemented through Read View + undo log. The difference lies in when the Read View is created:

  • The "read committed" isolation level generates a new Read View for each select. This means that if the same data is read multiple times within a transaction, two reads may return different results, because another transaction may have modified the record and committed in between.
  • The "repeatable read" isolation level generates a Read View when the transaction starts and then uses that Read View for the entire transaction. This ensures that all data read during the transaction reflects the state before the transaction started.

Both isolation levels work by comparing the fields in the transaction's Read View with two hidden columns in the record (trx_id and roll_pointer). If the current version of the record is not visible to the transaction, InnoDB follows the undo log version chain until it finds a version that is visible. This controls how concurrent transactions behave when they access the same record, and it is what is called MVCC (multi-version concurrency control).
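To make the visibility check more concrete, here is a minimal sketch of a version chain and a Read View. The field names (creator_trx_id, active_ids, min_trx_id, max_trx_id) are simplified, assumed names for the usual Read View contents; the article does not list them, and this is not InnoDB's actual data structure.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class Version:
    value: str
    trx_id: int                        # transaction that wrote this version
    roll_pointer: Optional["Version"]  # older version in the undo log, if any


@dataclass
class ReadView:
    creator_trx_id: int         # transaction that owns this view
    active_ids: FrozenSet[int]  # transactions still uncommitted when the view was created
    min_trx_id: int             # smallest id in active_ids
    max_trx_id: int             # id the next new transaction would receive

    def is_visible(self, trx_id: int) -> bool:
        if trx_id == self.creator_trx_id:
            return True                     # our own change
        if trx_id < self.min_trx_id:
            return True                     # committed before the view was created
        if trx_id >= self.max_trx_id:
            return False                    # started after the view was created
        return trx_id not in self.active_ids


def snapshot_read(latest: Optional[Version], view: ReadView) -> Optional[str]:
    version = latest
    # Follow roll_pointer until a version visible to this view is found.
    while version is not None and not view.is_visible(version.trx_id):
        version = version.roll_pointer
    return version.value if version is not None else None


# Version chain, newest first: trx 53 -> trx 51 -> trx 50.
v1 = Version("old", 50, None)
v2 = Version("mid", 51, v1)
v3 = Version("new", 53, v2)

# View created by trx 52 while trx 51 was still active and trx 53 did not exist yet.
rr_view = ReadView(creator_trx_id=52, active_ids=frozenset({51, 52}),
                   min_trx_id=51, max_trx_id=53)
# View created later by the same trx 52, after trx 51 and trx 53 have committed.
rc_view = ReadView(creator_trx_id=52, active_ids=frozenset({52}),
                   min_trx_id=52, max_trx_id=54)

print(snapshot_read(v3, rr_view))  # old
print(snapshot_read(v3, rc_view))  # new
```

Reusing rr_view for every select models "repeatable read", while building a fresh view such as rc_view for each select models "read committed".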

Therefore, undo log has two major functions:

  • Implementing transaction rollback and ensuring the atomicity of transactions. During transaction processing, if an error occurs or the user executes a ROLLBACK statement, MySQL can use the historical data in the undo log to restore the data to its state before the transaction started.
  • Serving as one of the key components of MVCC (multi-version concurrency control). MVCC is implemented through Read View + undo log. The undo log keeps multiple historical versions of each record. When MySQL executes a snapshot read (an ordinary select statement), it follows the undo log version chain to find the version that is visible according to the transaction's Read View.

 How does undo log flush (persist to disk)?

The undo log uses the same flush strategy as data pages: both rely on the redo log to guarantee durability.

The buffer pool contains undo pages, and modifications to undo pages are also recorded in the redo log. The redo log is flushed to disk every second and also whenever a transaction is committed. Data pages and undo pages rely on this mechanism for durability.

Why do you need Buffer Pool?

MySQL data is stored on disk, so when we want to update a record, we must first read the record from the disk and then modify the record in memory. After modifying this record, should I choose to write it directly back to the disk, or choose to cache it?

Of course, it is better to cache it, so that next time a query statement hits this record, the record in the cache can be read directly without the need to obtain data from the disk.

To this end, the InnoDB storage engine has a buffer pool (Buffer Pool) to improve the read and write performance of the database.

With the Buffer Pool in place:

  • When reading data, if the data is in the Buffer Pool, the client reads it directly from the Buffer Pool; otherwise the data is read from disk.
  • When modifying data, if the data is in the Buffer Pool, InnoDB directly modifies the page that holds the data and marks that page as a dirty page (the in-memory contents of the page no longer match the contents on disk). To reduce disk I/O, dirty pages are not written to disk immediately. Instead, a background thread later picks an appropriate time to write them to disk.
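The read and write paths above can be sketched roughly as follows. This is a simplified model with assumed names (BufferPool, get_page, flush): pages are plain dicts, there is no eviction, and on a cache miss the sketch simply loads the whole page from the simulated disk first.

```python
class BufferPool:
    def __init__(self, disk):
        self.disk = disk        # page_no -> {record_key: value}, stands in for disk
        self.pages = {}         # cached pages (in memory)
        self.dirty = set()      # page numbers whose cached copy differs from disk
        self.disk_reads = 0

    def get_page(self, page_no):
        if page_no not in self.pages:                 # cache miss: load the whole page
            self.disk_reads += 1
            self.pages[page_no] = dict(self.disk[page_no])
        return self.pages[page_no]

    def read(self, page_no, key):
        return self.get_page(page_no)[key]

    def write(self, page_no, key, value):
        self.get_page(page_no)[key] = value           # modify only the cached page
        self.dirty.add(page_no)                       # mark it as a dirty page

    def flush(self):
        # What a background thread would do later: write dirty pages back to disk.
        for page_no in sorted(self.dirty):
            self.disk[page_no] = dict(self.pages[page_no])
        self.dirty.clear()


disk = {0: {1: "a", 2: "b"}}
pool = BufferPool(disk)
pool.read(0, 1)
pool.read(0, 2)                  # same page, served from the cache
print(pool.disk_reads)           # 1
pool.write(0, 1, "xiaolin")
print(disk[0][1])                # a (the disk copy is stale until the flush)
pool.flush()
print(disk[0][1])                # xiaolin
```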

What does Buffer Pool cache?

InnoDB divides stored data into "pages" and uses the page as the basic unit of interaction between disk and memory. The default page size is 16KB. The Buffer Pool is therefore also divided into pages.

When MySQL starts, InnoDB requests a contiguous block of memory for the Buffer Pool and divides it into pages of the default 16KB size. The pages in the Buffer Pool are called cache pages. At this point all cache pages are free. Later, as the program runs, pages from disk are cached in the Buffer Pool.

This is why, right after MySQL starts, you can observe that it uses a lot of virtual memory but very little physical memory. The operating system triggers a page fault only when the virtual memory is actually accessed; it then allocates physical memory and establishes the mapping between the virtual address and the physical address.

In addition to "index pages" and "data pages", the Buffer Pool also holds undo pages, the insert buffer, the adaptive hash index, lock information, and so on.

 What does the Undo page record?

After a transaction starts, the InnoDB layer must record the corresponding undo log before it updates a record. For an update operation, the old values of the updated columns need to be recorded; in other words, an undo log is generated, and this undo log is written to an Undo page in the Buffer Pool.

When querying a record, do I only need to buffer one record?

No.

When we query a record, InnoDB loads the entire page into the Buffer Pool. Once the page is in the Buffer Pool, InnoDB locates the specific record through the "page directory" in the page.

Why do we need redo log?

The Buffer Pool does improve read and write efficiency, but it raises a problem. The Buffer Pool lives in memory, and memory is volatile. If the power fails and the machine restarts, any dirty page data that has not yet been written to disk is lost.

To prevent data loss caused by power failures, when a record needs to be updated, the InnoDB engine first updates the page in memory (and marks it as a dirty page), then records the modification to this page in the form of a redo log. At this point the update is considered complete.

Later, a background thread flushes the dirty pages cached in the Buffer Pool to disk at an appropriate time. This is the WAL (Write-Ahead Logging) technique.

WAL means that a MySQL write operation does not write the data to disk immediately. Instead, the log is written first, and the data is written to disk later at an appropriate time.

What is redo log?

The redo log is a physical log that records the modifications made to a data page, for example: an AAA update was made at offset ZZZ of data page YYY in tablespace XXX. Whenever a transaction executes, it generates one or more such physical log records.

When a transaction is committed, only the redo log needs to be persisted to disk first. There is no need to wait until the dirty pages cached in the Buffer Pool are persisted to disk.

If the system crashes, the dirty page data may not have been persisted, but the redo log has been. After MySQL restarts, it can restore all data to the latest state based on the contents of the redo log.
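A rough sketch of this crash-recovery idea is shown below. It is a toy model with assumed names: a redo record here is a (page number, key, new value) tuple rather than a real byte-level page modification, and the redo record is treated as durable the moment it is appended.

```python
disk_pages = {7: {"name": "old"}}   # durable data pages
redo_log_file = []                  # durable, append-only redo log
buffer_pool = {}                    # volatile: lost on a crash


def update(page_no, key, value):
    page = buffer_pool.setdefault(page_no, dict(disk_pages[page_no]))
    page[key] = value                              # 1. modify the page in memory (now dirty)
    redo_log_file.append((page_no, key, value))    # 2. persist the redo record


def crash():
    buffer_pool.clear()                            # dirty pages never reached the disk


def recover():
    # Replay the redo log in order to bring the data pages up to date.
    for page_no, key, value in redo_log_file:
        disk_pages[page_no][key] = value


update(7, "name", "xiaolin")
print(disk_pages[7]["name"])   # old (the data page on disk is still stale)
crash()
recover()
print(disk_pages[7]["name"])   # xiaolin
```

The committed change survives even though the dirty page itself was never written, because the redo record was written ahead of the data.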

If the Undo page is modified, do I need to record the corresponding redo log?

Yes.

After a transaction starts, the InnoDB layer must record the corresponding undo log before it updates a record. For an update operation, the old values of the updated columns need to be recorded; in other words, an undo log is generated, and this undo log is written to an Undo page in the Buffer Pool.

However, once the Undo page has been modified in memory, the corresponding redo log must be recorded as well.

What is the difference between redo log and undo log?

Both logs belong to the InnoDB storage engine. Their differences are:

  • The redo log records the state of the data "after" the transaction's modification, that is, the updated value;
  • The undo log records the state of the data "before" the transaction's modification, that is, the value before the update.

If a crash occurs before the transaction is committed, the transaction is rolled back through the undo log after restart. If a crash occurs after the transaction is committed, the transaction's changes are restored through the redo log after restart.

Therefore, with the redo log and WAL, InnoDB can guarantee that previously committed records are not lost even if the database restarts abnormally. This capability is called crash-safe. The redo log thus guarantees durability, the D in the ACID properties of transactions.

The redo log needs to be written to the disk, and the data must also be written to the disk. Why bother?

Writing the redo log is an append operation, so the disk access is sequential. Writing data pages requires first locating where to write and then writing there, so the disk access is random.

"Sequential writes" to disk are much more efficient than "random writes", so writing the redo log to disk costs less.

As for why "sequential writes" are faster than "random writes", think of a notebook: writing page after page in order is much faster than having to look up the right page before writing each word.

This is another advantage of WAL: MySQL's disk writes change from "random writes" to "sequential writes", which improves statement execution performance. This works because MySQL does not apply each write to the data on disk immediately; it records the change in the log first and updates the data on disk later at an appropriate time.

At this point, we have two answers to the question of why the redo log is needed:

  • It implements transaction durability and makes MySQL crash-safe, ensuring that if MySQL crashes at any moment, previously committed records are not lost after restart;
  • It turns write operations from "random writes" into "sequential writes", improving the performance of MySQL's writes to disk.

When will the redo log be flushed?

The redo log cached in the redo log buffer is still in memory. So when is it flushed to disk?

The main triggers are:

  • When MySQL is shut down normally;
  • When the amount of data written to the redo log buffer exceeds half of the redo log buffer's memory space, a flush to disk is triggered;
  • InnoDB's background thread persists the redo log buffer to disk every second;
  • Each time a transaction is committed, the redo log cached in the redo log buffer is persisted to disk directly (this behavior is controlled by the innodb_flush_log_at_trx_commit parameter).
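The four triggers can be modeled with a small sketch. Everything here is an assumption made for illustration: the class and method names are invented, the buffer capacity is counted in records rather than bytes, and the once-per-second background thread is represented by an explicit tick() call.

```python
class RedoLogBuffer:
    def __init__(self, capacity):
        self.capacity = capacity   # buffer size, counted in records for simplicity
        self.buffer = []           # redo records still in memory
        self.file = []             # redo records persisted to disk
        self.flush_reasons = []    # why each flush happened

    def _flush(self, reason):
        if self.buffer:
            self.file.extend(self.buffer)
            self.buffer.clear()
            self.flush_reasons.append(reason)

    def append(self, record):
        self.buffer.append(record)
        if len(self.buffer) > self.capacity / 2:   # more than half of the buffer is used
            self._flush("half full")

    def commit(self):
        self._flush("commit")                      # flush on transaction commit

    def tick(self):
        self._flush("background")                  # background thread, once per second

    def shutdown(self):
        self._flush("shutdown")                    # normal shutdown


log = RedoLogBuffer(capacity=4)
log.append("r1")
log.append("r2")
log.append("r3")   # third record exceeds half of the capacity
log.append("r4")
log.commit()
log.append("r5")
log.tick()
print(log.flush_reasons)  # ['half full', 'commit', 'background']
```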

Why do you need binlog?

The undo log and redo log introduced earlier are both generated by the InnoDB storage engine.

After MySQL completes an update operation, the server layer also generates a binlog record. When the transaction is later committed, all binlog records generated during the transaction are written to the binlog file together.

The binlog file records all changes to table structure and all modifications to table data. It does not record query operations such as SELECT and SHOW.

Why do we need redo log when we have binlog?

This question has to do with MySQL's history.

Originally, MySQL did not include the InnoDB engine. MySQL's built-in engine was MyISAM, but MyISAM is not crash-safe, and the binlog can only be used for archiving.

InnoDB was developed by another company and introduced into MySQL as a plug-in. Since the binlog alone does not provide crash-safe capability, InnoDB uses the redo log to achieve it.

What is the difference between redo log and binlog?

There are four differences between these two logs.

1. Different applicable objects:

  • The binlog is implemented by the MySQL server layer and can be used by all storage engines;
  • The redo log is implemented by the InnoDB storage engine;

2. Different file formats:

  • The binlog has three formats: STATEMENT (the default before MySQL 5.7.7), ROW (the default since MySQL 5.7.7), and MIXED. The differences are as follows:
    • STATEMENT: every SQL statement that modifies data is recorded in the binlog (this amounts to recording logical operations, so in this format the binlog can be called a logical log), and the slave in master-slave replication replays the SQL statements. However, STATEMENT has a problem with non-deterministic functions. For example, if a statement uses uuid or sysdate, the result on the slave differs from the result on the master, so the replicated data becomes inconsistent;
    • ROW: records the final result of how each row was modified (logs in this format cannot be called logical logs), so the non-deterministic function problem of STATEMENT does not arise. The disadvantage of ROW is that the change to every row is recorded. For example, a batch update statement produces one record for each row it updates, which can make the binlog file very large, whereas the STATEMENT format records only the single update statement;
    • MIXED: combines STATEMENT and ROW, automatically choosing between the two depending on the situation;
  • The redo log is a physical log that records the modifications made to a data page, for example: an AAA update was made at offset ZZZ of data page YYY in tablespace XXX;
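The difference between replaying a statement and replaying row results can be illustrated with a small simulation. This is only a sketch under stated assumptions: tables are Python dicts, run_statement is a made-up stand-in for an update that calls a uuid function, and the two "binlogs" are plain lists rather than real binlog events.

```python
import uuid


def run_statement(table):
    # Stand-in for an update statement that sets a column to a freshly generated uuid.
    table[1]["token"] = str(uuid.uuid4())


source = {1: {"token": None}}
replica_statement = {1: {"token": None}}
replica_row = {1: {"token": None}}

# The source executes the statement once and logs it in both styles.
run_statement(source)
statement_binlog = [run_statement]       # STATEMENT style: log the statement itself
row_binlog = [(1, dict(source[1]))]      # ROW style: log the resulting row

# A replica using the STATEMENT log re-executes the statement and gets a new uuid.
for statement in statement_binlog:
    statement(replica_statement)

# A replica using the ROW log copies the recorded row contents.
for pk, row in row_binlog:
    replica_row[pk] = dict(row)

print(replica_row == source)         # True
print(replica_statement == source)   # False
```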

3. Different writing methods:

  • The binlog is written by appending. When a file is full, a new file is created and writing continues there. Earlier logs are not overwritten, so the full log is kept.
  • The redo log is written in a circular fashion in a fixed-size log space. When the end is reached, writing starts over from the beginning, so the redo log only holds the log records for dirty pages that have not yet been flushed to disk.
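Circular writing can be sketched as a fixed-size ring with two positions. The names write_pos and checkpoint are assumptions used for illustration (the article does not name them), and the sizes are counted in records rather than bytes.

```python
class CircularRedoLog:
    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.write_pos = 0    # total number of records written so far
        self.checkpoint = 0   # records before this point have had their dirty pages flushed

    def free_slots(self):
        return self.size - (self.write_pos - self.checkpoint)

    def append(self, record):
        if self.free_slots() == 0:
            # No reusable space: dirty pages must be flushed before writing more.
            raise RuntimeError("redo log full: flush dirty pages first")
        self.slots[self.write_pos % self.size] = record   # wraps around to the start
        self.write_pos += 1

    def advance_checkpoint(self, flushed):
        # Called after dirty pages are flushed; their log records may now be overwritten.
        self.checkpoint = min(self.checkpoint + flushed, self.write_pos)

    def pending(self):
        # Records that are still needed for crash recovery.
        return [self.slots[i % self.size]
                for i in range(self.checkpoint, self.write_pos)]


log = CircularRedoLog(size=3)
for record in ("a", "b", "c"):
    log.append(record)
print(log.free_slots())      # 0
log.advance_checkpoint(2)    # the pages changed by "a" and "b" have been flushed
log.append("d")              # overwrites the slot that held "a"
print(log.slots)             # ['d', 'b', 'c']
print(log.pending())         # ['c', 'd']
```

Once "a" has been overwritten it is gone from the log, which is the reason the redo log cannot serve as a full history of changes.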

4. Different uses:

  • binlog is used for backup and recovery, master-slave replication;
  • redo log is used for power failure and other fault recovery.

If the entire database data is accidentally deleted, can the data be restored using the redo log file?

The redo log file cannot be used for this. Only the binlog file can.

Because the redo log file is written circularly, old log records are erased as new ones are written. It holds only the physical log for data that has not yet been flushed to disk; the log records for data that has been flushed to disk are erased from the redo log file.

The binlog file keeps the full log, that is, all data changes. In theory, any data recorded in the binlog can be recovered, so if the entire database is accidentally deleted, the binlog file is what you must use to restore the data.

Origin blog.csdn.net/m0_62609939/article/details/132053123