Detailed explanation of MySQL transaction log (redo log and undo log)

Foreword

Transactions have four properties: atomicity, consistency, isolation, and durability. What mechanisms are these four properties built on?

  • Transaction isolation is achieved by the locking mechanism.
  • The atomicity, consistency, and durability of a transaction are guaranteed by its redo log and undo log.
    The REDO LOG (redo log) provides the ability to replay writes: it restores the page modifications made by committed transactions, which guarantees durability.
    The UNDO LOG (rollback log) rolls rows back to a specific version, which guarantees atomicity and consistency.

Some DBAs may think that UNDO is the reverse process of REDO, but it is not.

redo log

Why do you need REDO logs

On the one hand, the buffer pool bridges the speed gap between the CPU and the disk, and the checkpoint mechanism guarantees that data eventually reaches disk. However, a checkpoint is not triggered on every change; the master thread processes checkpoints at intervals. So in the worst case, a transaction commits, its changes have only just been written to the buffer pool, and the database goes down. That data is then lost and cannot be recovered.

On the other hand, transactions have the durability property: for a committed transaction, even if the system crashes after the commit, the changes the transaction made to the database must not be lost.

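So how is durability guaranteed? A simple approach: before the transaction commits, flush all the pages it modified to disk. This simple and crude approach has some problems: changing a single byte forces a whole page to be written, and the pages a transaction touches are usually scattered, so the flush becomes slow random I/O.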

Another approach: we only want the changes that a committed transaction made to the database to be permanent, so that even if the system crashes later, the changes can be restored after a restart. We therefore do not need to flush every page the transaction modified in memory to disk on each commit; it is enough to record what was modified. For example, a transaction changes the byte at offset 100 in page 10 of the system tablespace from 1 to 2. We only need to record: update the value at offset 100 of page 10 of tablespace 0 to 2.
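
To make the idea concrete, here is a minimal Python sketch of such a record and of replaying it after a restart. The `RedoRecord` layout and the `apply_redo` helper are invented for illustration; real InnoDB redo records have many types and a compact binary encoding.

```python
from dataclasses import dataclass

PAGE_SIZE = 16 * 1024  # InnoDB's default page size in bytes


@dataclass(frozen=True)
class RedoRecord:
    """Toy physical redo record: 'set this byte of this page to this value'."""
    space_id: int  # tablespace number
    page_no: int   # page number inside the tablespace
    offset: int    # byte offset inside the page
    value: int     # new byte value


def apply_redo(pages: dict, rec: RedoRecord) -> None:
    """Replay one record against the on-disk copy of a page."""
    page = pages.setdefault((rec.space_id, rec.page_no), bytearray(PAGE_SIZE))
    page[rec.offset] = rec.value


# The on-disk page still holds the old value 1 when the crash happens.
disk_pages = {(0, 10): bytearray(PAGE_SIZE)}
disk_pages[(0, 10)][100] = 1

# The committed transaction only had to persist this small record.
record = RedoRecord(space_id=0, page_no=10, offset=100, value=2)

# Recovery after the restart replays the record.
apply_redo(disk_pages, record)
print(disk_pages[(0, 10)][100])  # 2
```

The record carries four small integers, while flushing the page itself would write all 16 KB of it.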

image-1670913108219

The benefits and characteristics of REDO logs

  1. Benefits
    Redo logs reduce the frequency of disk flushing
    Redo logs take up very little space
  2. Features
    The redo log is written to the disk sequentially.
    Redo log records are written continuously throughout the execution of the transaction.

Composition of redo

The redo log can be divided into the following two parts:
Redo log buffer: stored in memory and volatile.
Parameter setting: innodb_log_buffer_size:
the redo log buffer size. The default is 16M, the maximum is 4096M, and the minimum is 1M.

mysql> show variables like '%innodb_log_buffer_size%';
+------------------------+----------+
| Variable_name          | Value    |
+------------------------+----------+
| innodb_log_buffer_size | 16777216 |
+------------------------+----------+

Redo log file: stored on disk and persistent.

The overall process of redo

Taking an update transaction as an example, the redo log flow process is shown in the following figure:
image-1670913363449

Step 1: Read the original data from disk into memory and modify the in-memory copy.
Step 2: Generate a redo log record, which holds the modified value of the data, and write it to the redo log buffer.
Step 3: When the transaction commits, flush the contents of the redo log buffer to the redo log file by appending to it.
Step 4: Periodically flush the modified data in memory to disk.

Takeaway:
Write-Ahead Log (the log is persisted first): before a data page is persisted, the corresponding log pages in memory are persisted first.
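
The four steps, and why the ordering matters, can be sketched with a toy model in Python. `ToyEngine` and its method names are hypothetical and are not InnoDB APIs. The model also ignores that a real engine may write redo for uncommitted changes, which the undo log then has to reverse.

```python
class ToyEngine:
    """Toy model of the four steps; none of these names are InnoDB APIs."""

    def __init__(self, disk_rows):
        self.disk_rows = dict(disk_rows)  # data files (persistent)
        self.redo_file = []               # redo log file (persistent)
        self.buffer_pool = {}             # cached rows (volatile)
        self.log_buffer = []              # redo log buffer (volatile)

    def update(self, key, value):
        # Step 1: modify the in-memory copy of the row.
        self.buffer_pool[key] = value
        # Step 2: record the new value in the redo log buffer.
        self.log_buffer.append((key, value))

    def commit(self):
        # Step 3: append the buffered redo records to the redo log file.
        self.redo_file.extend(self.log_buffer)
        self.log_buffer.clear()

    def flush_dirty_pages(self):
        # Step 4: happens later, in the background.
        self.disk_rows.update(self.buffer_pool)

    def crash_and_recover(self):
        # A crash wipes everything volatile ...
        self.buffer_pool.clear()
        self.log_buffer.clear()
        # ... and recovery replays the persistent redo log.
        for key, value in self.redo_file:
            self.disk_rows[key] = value


# Committed, but the crash happens before step 4 ever runs.
engine = ToyEngine({"id=1": "old"})
engine.update("id=1", "new")
engine.commit()
engine.crash_and_recover()
print(engine.disk_rows["id=1"])  # new

# Never committed: nothing reached the redo file, so nothing is replayed.
other = ToyEngine({"id=1": "old"})
other.update("id=1", "new")
other.crash_and_recover()
print(other.disk_rows["id=1"])  # old
```

Because the log reaches persistent storage at commit, the data page itself can be written lazily without putting the committed change at risk.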

Redo log flushing strategy

The redo log is not written directly to disk. InnoDB first writes redo log records to the redo log buffer and then flushes them to the actual redo log file at a certain frequency. What is that frequency? That is what the flushing strategy defines.
image-1670913470846

Note that flushing the redo log buffer to the redo log file does not actually write to disk. It only writes to the file system cache (page cache), an optimization modern operating systems use to make file writes more efficient. The actual write is left to the operating system to decide (for example, once the page cache has grown large enough). This creates a problem for InnoDB: if synchronization is left to the operating system and the system goes down, the data is lost (although the probability of the whole system going down is relatively small).

In response, InnoDB provides the innodb_flush_log_at_trx_commit parameter, which controls how the logs in the redo log buffer are flushed to the redo log file when a transaction commits. It supports three strategies:

  • Set to 0: no flush is performed when a transaction commits. (By default, the master thread synchronizes the redo log once every second.)
  • Set to 1: the log is synchronized to disk every time a transaction commits (the default value).
  • Set to 2: on each commit, only the contents of the redo log buffer are written to the page cache, without synchronization. The OS decides when to synchronize to the disk file.
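
The practical difference between the three settings is which kind of crash a just-committed transaction survives. The following Python sketch models only the moment right after COMMIT returns, before any background flush has run. The `survives` function and the three in-memory lists are a simplified model, not InnoDB code.

```python
def survives(policy: int, crash: str) -> bool:
    """Does the redo of a just-committed transaction survive a crash that
    happens right after COMMIT returns (before any background flush runs)?

    crash is "mysqld" (the process dies, the OS keeps running)
    or "os" (the whole machine goes down).
    """
    log_buffer, page_cache, disk = ["redo"], [], []

    if policy in (1, 2):   # written to the OS page cache at commit
        page_cache += log_buffer
        log_buffer = []
    if policy == 1:        # and synchronized to disk at commit
        disk += page_cache
        page_cache = []

    if crash == "mysqld":  # process memory is lost
        log_buffer = []
    elif crash == "os":    # process memory and the page cache are lost
        log_buffer, page_cache = [], []
    else:
        raise ValueError(f"unknown crash type: {crash!r}")

    # After a mysqld-only crash the OS still writes its page cache out.
    return bool(disk or page_cache)


for policy in (0, 1, 2):
    print(policy, survives(policy, "mysqld"), survives(policy, "os"))
# 0 False False
# 1 True True
# 2 True False
```

Setting 1 is the only one that protects against both kinds of crash, which is the price paid for a disk synchronization on every commit.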

Demonstration of different flushing strategies

flow chart

image-1670913665943
image-1670913681984
image-1670913693252

Write redo log buffer process

Supplementary concept: Mini-Transaction

A transaction can contain several statements. Each statement is actually made up of several mtrs, and each mtr can contain several redo log records. Their relationship looks like this:
image-1670913932243

Redo log is written to log buffer

image-1670913958160
Each mtr generates a group of redo log records. The following schematic shows the logs generated by these mtrs:

image-1670913990513

Different transactions may execute concurrently, so the mtrs of T1 and T2 may be interleaved.

image-1670914030772

Structure diagram of redo log block

image-1670914058265

image-1670914068800

redo log file

Related parameter settings

  • innodb_log_group_home_dir: specifies the path of the redo log file group. The default value is ./, which means the database's data directory. MySQL's default data directory (/var/lib/mysql) contains two files named ib_logfile0 and ib_logfile1 by default, and the logs in the log buffer are flushed to these two disk files by default. The location of the redo log files can also be changed.
  • innodb_log_files_in_group: specifies the number of redo log files, named ib_logfile0, ib_logfile1 ... ib_logfilen. The default is 2, and the maximum is 100.
mysql> show variables like 'innodb_log_files_in_group';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| innodb_log_files_in_group | 2     |
+---------------------------+-------+
#ib_logfile0
#ib_logfile1

  • innodb_flush_log_at_trx_commit: controls the strategy for flushing the redo log to disk. The default is 1.
  • innodb_log_file_size: sets the size of a single redo log file. The default is 48M. The maximum is 512G; note that this maximum applies to the combined size of all redo log files, that is, innodb_log_files_in_group * innodb_log_file_size cannot exceed 512G.

mysql> show variables like 'innodb_log_file_size';
+----------------------+----------+
| Variable_name        | Value    |
+----------------------+----------+
| innodb_log_file_size | 50331648 |
+----------------------+----------+

Adjust the size to suit the workload so that larger transactions can be accommodated. Edit the my.cnf file and restart the database for the change to take effect, as shown below:

vim /etc/my.cnf
innodb_log_file_size=200M

log file group

image-1670914301004
The total redo log file size is actually: innodb_log_file_size × innodb_log_files_in_group
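
As a quick check of this arithmetic, the Python sketch below computes the combined size for the default values shown earlier and for the 200M setting from the my.cnf example. The helper name `total_redo_size` is made up for illustration, and treating 512G as 512 × 1024³ bytes is an assumption.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
MAX_TOTAL_REDO = 512 * GIB  # assumed byte value of the 512G ceiling


def total_redo_size(log_file_size: int, files_in_group: int) -> int:
    """Combined size in bytes of the redo log file group."""
    total = log_file_size * files_in_group
    if total > MAX_TOTAL_REDO:
        raise ValueError("combined redo log size exceeds 512G")
    return total


default_total = total_redo_size(50331648, 2)   # the defaults shown earlier
resized_total = total_redo_size(200 * MIB, 2)  # innodb_log_file_size=200M

print(default_total // MIB, "MiB")  # 96 MiB
print(resized_total // MIB, "MiB")  # 400 MiB
```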

If data is written to the redo log file group in a circular manner, will redo log records written later overwrite those written earlier? Certainly! That is why the designers of InnoDB introduced the concept of the checkpoint.

checkpoint

image-1670914356093

If the write pos catches up with the checkpoint, the log file group is full and no new redo log records can be written. MySQL has to pause, clear some records, and advance the checkpoint.
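
The interplay of write pos and checkpoint can be sketched as a ring buffer in Python. The `RedoLogGroup` class is a toy model with invented names. It tracks both positions as ever-growing byte counters and maps them onto the circular files with a modulo.

```python
class RedoLogGroup:
    """Toy circular redo log; positions are monotonically increasing byte counts."""

    def __init__(self, capacity: int):
        self.capacity = capacity  # combined size of all redo log files
        self.write_pos = 0        # total bytes ever written
        self.checkpoint = 0       # bytes whose dirty pages are already on disk

    def free_space(self) -> int:
        return self.capacity - (self.write_pos - self.checkpoint)

    def write(self, nbytes: int) -> bool:
        """Append redo; refuse if write pos would overtake the checkpoint."""
        if nbytes > self.free_space():
            return False
        self.write_pos += nbytes
        return True

    def advance_checkpoint(self, nbytes: int) -> None:
        """Called after dirty pages are flushed; frees the oldest redo."""
        self.checkpoint = min(self.write_pos, self.checkpoint + nbytes)

    def file_offset(self, pos: int) -> int:
        """Where a logical position lands inside the circular file group."""
        return pos % self.capacity


log = RedoLogGroup(capacity=100)
first = log.write(60)        # fits: 100 bytes free
blocked = log.write(50)      # refused: only 40 bytes free
log.advance_checkpoint(30)   # flushing pages frees 30 bytes of old redo
retried = log.write(50)      # fits now: 70 bytes free

print(first, blocked, retried)          # True False True
print(log.file_offset(log.write_pos))   # 10
```

The second write is refused until the checkpoint advances, and the write position then wraps around to the start of the file group.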

image-1670914384820

Undo log

The redo log guarantees transaction durability, and the undo log guarantees transaction atomicity. Before a transaction updates data, it must first write an undo log record.

How to understand Undo logs

Transactions must be atomic: the operations in a transaction are either all completed or none are done. But sometimes things go wrong partway through a transaction, for example:

  • Situation 1: Various errors may occur during transaction execution, such as errors in the server itself, operating system errors, or even errors caused by a sudden power failure.
  • Situation 2: The programmer can manually issue a ROLLBACK statement during transaction execution to end the current transaction.

When either situation occurs, the data must be changed back to its original state. This process is called rollback. It creates the illusion that the transaction never did anything, so atomicity is satisfied.

The role of undo logs

  • Role 1: Rolling back data
  • Role 2: MVCC

undo storage structure

InnoDB manages undo logs in segments, namely rollback segments. Each rollback segment records 1024 undo log segments, and undo pages are allocated within each undo log segment.

  • Before InnoDB 1.1 (not including 1.1), there was only one rollback segment, so the number of concurrent online transactions was limited to 1024, although this was sufficient for most applications.
  • Since version 1.1, InnoDB supports up to 128 rollback segments, so the limit on concurrent online transactions has risen to 128*1024.
mysql> show variables like 'innodb_undo_logs';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| innodb_undo_logs | 128   |
+------------------+-------+

Rollback segments and transactions

  1. Each transaction will only use one rollback segment, and one rollback segment may serve multiple transactions at the same time.
  2. When a transaction starts, a rollback segment is assigned to it. During the transaction, when data is modified, the original data is copied to the rollback segment.
  3. In the rollback segment, the transaction keeps filling extents until it ends or all the space is used up. If the current extent is not enough, the transaction requests the next extent in the segment. If all allocated extents are used up, the transaction either wraps around and overwrites the first extent or, if the rollback segment allows it, allocates a new extent.
  4. The rollback segment exists in the undo tablespace. There can be multiple undo tablespaces in the database, but only one undo tablespace can be used at a time.
  5. When the transaction is committed, the InnoDB storage engine does the following two things:
    it puts the undo log into a list for the subsequent purge operation;
    it checks whether the page holding the undo log can be reused, and if so, allocates it to the next transaction.

Data Classification in Rollback Segments

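  1. Uncommitted rollback data (uncommitted undo information)
  2. Rollback data that has been committed but has not expired (committed undo information)
  3. Data whose transaction has been committed and which has expired (expired undo information)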

Types of undo

In the InnoDB storage engine, the undo log is divided into:

  • insert undo log
  • update undo log

The life cycle of undo log

Brief generation process

With only the Buffer Pool:

image-1670915089128

With Redo Log and Undo Log:

image-1670915122995

Detailed generation process

image-1670915146806

When we do an INSERT:

begin;
INSERT INTO user (name) VALUES ("tom");

image-1670915178958

When we do an UPDATE:

image-1670915194888

UPDATE user SET id=2 WHERE id=1;

image-1670915242452

How undo log is rolled back

Continuing the example above, if ROLLBACK is executed, the process should be as follows:

  1. Delete the row with id=2 using the undo no=3 log
  2. Restore the delete mark of the row with id=1 to 0 using the undo no=2 log
  3. Restore the name of the row with id=1 to tom using the undo no=1 log
  4. Delete the row with id=1 using the undo no=0 log
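
These four steps amount to applying the undo records in reverse order. The Python sketch below reproduces that with a toy in-memory table. The record formats and helper names are invented for illustration, and the intermediate name value `assumed_name` is a placeholder because the text does not state what the name was changed to.

```python
table = {}     # primary key -> {"name": str, "delete_mark": int}
undo_log = []  # the index in this list plays the role of "undo no"


def insert(pk, name):
    table[pk] = {"name": name, "delete_mark": 0}
    undo_log.append(("insert", pk))  # undo: delete the row


def update_name(pk, new_name):
    undo_log.append(("update", pk, table[pk]["name"]))  # undo: old name
    table[pk]["name"] = new_name


def update_pk(old_pk, new_pk):
    # Changing the primary key = delete-mark the old row + insert a new row.
    undo_log.append(("delete_mark", old_pk))  # undo: clear the mark
    table[old_pk]["delete_mark"] = 1
    insert(new_pk, table[old_pk]["name"])


def rollback():
    """Apply the undo records newest first."""
    while undo_log:
        rec = undo_log.pop()
        if rec[0] == "insert":
            del table[rec[1]]
        elif rec[0] == "update":
            table[rec[1]]["name"] = rec[2]
        elif rec[0] == "delete_mark":
            table[rec[1]]["delete_mark"] = 0


insert(1, "tom")                  # undo no=0
update_name(1, "assumed_name")    # undo no=1 (placeholder value)
update_pk(1, 2)                   # undo no=2 and undo no=3
state_before = {pk: dict(row) for pk, row in table.items()}

rollback()
print(table)  # {}
```

After the rollback the table is empty again, as if the transaction had never run.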

Undo log deletion

  • For the insert undo log: because the records of an insert operation are visible only to the transaction itself and not to other transactions, the undo log can be deleted directly after the transaction commits, with no purge operation needed.
  • For the update undo log: the undo log may be needed by the MVCC mechanism, so it cannot be deleted when the transaction commits. At commit it is put on the undo log linked list, where it waits for the purge thread to perform the final deletion.

Summary

image-1670915379561
The undo log is a logical log. When a transaction is rolled back, it just logically restores the database to its original state.
The redo log is a physical log, which records the physical changes of the data page. The undo log is not the reverse process of the redo log.

Origin blog.csdn.net/qq_49619863/article/details/128302410