foreword
Transactions have four properties: atomicity, consistency, isolation, and durability. So what mechanisms are these four properties built on?
- Transaction isolation is achieved by the locking mechanism.
- The atomicity, consistency, and durability of a transaction are guaranteed by its redo log and undo log.
The REDO LOG is the redo log: it records how to reapply the page modifications made by committed transactions, which guarantees the durability of the transaction.
The UNDO LOG is the rollback log: it allows rows to be rolled back to a specific version, which guarantees the atomicity and consistency of the transaction.
Some DBAs may think that UNDO is the reverse process of REDO, but it is not.
redo log
Why do you need REDO logs
On the one hand, the buffer pool bridges the speed gap between the CPU and the disk, and the checkpoint mechanism guarantees that data is eventually written to disk. However, a checkpoint is not triggered on every change; the master thread processes it at intervals. So in the worst case, a transaction has committed, its changes have only reached the buffer pool, and the database goes down. That data is then lost and cannot be recovered.
On the other hand, transactions must be durable: for a committed transaction, even if the system crashes after the commit, the changes the transaction made to the database must not be lost.
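So how can this durability be guaranteed? One simple approach: before the transaction commits, flush every page it modified to disk. This crude approach has problems: a transaction that changes a single byte still forces a whole page to be flushed, and the modified pages may be scattered across the disk, which turns the flush into slow random I/O.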
Another approach: all we want is for the changes a committed transaction made to the database to take effect permanently, so that even if the system crashes later, the changes can be restored after a restart. So we do not actually need to flush every page the transaction modified in memory to disk at each commit; we only need to record what was changed. For example, suppose a transaction changes the byte at offset 100 of page 10 in the system tablespace to 2. We just need to record: update the value at offset 100 of page 10 of tablespace 0 to 2.
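As a rough illustration of the idea above, the following Python sketch models a redo record as "write these bytes at this offset of this page" and replays it onto a stale page after a simulated crash. The `RedoRecord` class and `apply_redo` function are hypothetical names invented for this example, not InnoDB's actual record format; the new value 2 and the old value 1 are assumptions.

```python
from dataclasses import dataclass

PAGE_SIZE = 16 * 1024  # InnoDB's default page size is 16 KB


@dataclass(frozen=True)
class RedoRecord:
    """Hypothetical physical redo record: 'set these bytes at this offset'."""
    space_id: int
    page_no: int
    offset: int
    data: bytes


def apply_redo(page: bytearray, rec: RedoRecord) -> None:
    """Reapply a logged change to an in-memory copy of the page."""
    page[rec.offset:rec.offset + len(rec.data)] = rec.data


# The change from the text: tablespace 0, page 10, offset 100, new value 2.
rec = RedoRecord(space_id=0, page_no=10, offset=100, data=bytes([2]))

# After a crash, the on-disk page still holds the old value (assumed to be 1).
stale_page = bytearray(PAGE_SIZE)
stale_page[100] = 1

apply_redo(stale_page, rec)
print(stale_page[100])  # value restored from the log
print(len(rec.data), "byte of payload logged versus", PAGE_SIZE, "bytes in a page")
```

The record carries only the location and the new bytes, which is why logging the change is so much cheaper than flushing the whole page at commit time.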
The benefits and characteristics of REDO logs
- Benefits
  - Redo logs reduce the frequency of flushing to disk.
  - Redo logs take up very little space.
- Features
  - Redo logs are written to disk sequentially.
  - Redo logs are recorded continuously while the transaction executes.
Composition of redo
Redo log can be simply divided into the following two parts:
- The redo log buffer, which is stored in memory and is volatile.
Parameter setting: innodb_log_buffer_size sets the redo log buffer size. The default is 16M, the maximum value is 4096M, and the minimum value is 1M.
mysql> show variables like '%innodb_log_buffer_size%';
+------------------------+----------+
| Variable_name | Value |
+------------------------+----------+
| innodb_log_buffer_size | 16777216 |
+------------------------+----------+
- The redo log file, which is stored on disk and is persistent.
The overall process of redo
Taking an update transaction as an example, the redo log flows through the following steps:
Step 1: Read the original data from disk into memory and modify the in-memory copy.
Step 2: Generate a redo log record and write it into the redo log buffer; the record holds the modified value of the data.
Step 3: When the transaction commits, flush the contents of the redo log buffer to the redo log file, which is written in append mode.
Step 4: Periodically flush the modified data in memory to disk.
Takeaway:
Write-Ahead Logging (WAL): before a data page is persisted, the corresponding in-memory log page is persisted first.
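The four steps and the write-ahead rule can be sketched with a toy key-value engine. This is a minimal simulation under simplifying assumptions (one value per key, no pages, no checkpoints); `MiniEngine` and all of its methods are hypothetical names, not InnoDB internals.

```python
class MiniEngine:
    """Toy key-value engine; every name here is hypothetical."""

    def __init__(self):
        self.disk_data = {}     # durable data
        self.disk_log = []      # durable redo log file (append-only)
        self.buffer_pool = {}   # volatile copies of data
        self.log_buffer = []    # volatile redo log buffer

    def update(self, key, value):
        # Step 1: load the original into memory and modify the copy.
        self.buffer_pool.setdefault(key, self.disk_data.get(key))
        self.buffer_pool[key] = value
        # Step 2: record the modified value in the redo log buffer.
        self.log_buffer.append((key, value))

    def commit(self):
        # Step 3: append the log buffer to the redo log file.
        self.disk_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def flush_dirty_pages(self):
        # Step 4: periodically write modified data back to disk.
        self.disk_data.update(self.buffer_pool)

    def crash(self):
        # Everything held in memory is lost.
        self.buffer_pool.clear()
        self.log_buffer.clear()

    def recover(self):
        # Replay the durable redo log over the durable data.
        for key, value in self.disk_log:
            self.disk_data[key] = value


engine = MiniEngine()
engine.update("row1", "new")
engine.commit()
engine.crash()    # the dirty data was never flushed (step 4 did not run)
engine.recover()
print(engine.disk_data)
```

Because the log reached disk at commit time, the committed change survives even though step 4 never ran before the crash.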
Redo log flushing strategy
Redo log records are not written straight to disk. When InnoDB writes redo log, it first writes to the redo log buffer and then flushes the buffer to the real redo log file at a certain frequency. What is that frequency? That is the flushing strategy discussed here.
Note that flushing the redo log buffer to the redo log file does not actually put the data on disk; it only writes into the file system cache (page cache), an optimization modern operating systems make to improve file write efficiency. The actual write is left for the system to decide (for example, when the page cache has grown large enough). This creates a problem for InnoDB: if synchronization is left to the system and the system goes down, the data is lost as well (although the probability of the whole system going down is relatively small).
To address this, InnoDB provides the innodb_flush_log_at_trx_commit parameter, which controls how the logs in the redo log buffer are flushed to the redo log file when a transaction commits. It supports three strategies:
- Set to 0: no flush is performed when a transaction commits. (By default the master thread synchronizes the redo log once every 1s.)
- Set to 1: the log is written and synchronized to disk every time a transaction commits (default value).
- Set to 2: every time a transaction commits, the contents of the redo log buffer are only written into the page cache, and no synchronization is performed. The OS decides when to synchronize to the disk file.
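To make the trade-off between the three settings concrete, here is a small Python simulation of the three layers a redo record passes through (log buffer, OS page cache, disk). It is a sketch under the simplified model described above; `RedoLogSim` and its methods are hypothetical names, and the crash is assumed to happen right after commit, before any background flush runs.

```python
class RedoLogSim:
    """Toy model of the three redo log layers; all names are hypothetical."""

    def __init__(self, flush_log_at_trx_commit):
        self.policy = flush_log_at_trx_commit
        self.log_buffer = []   # inside the MySQL process
        self.page_cache = []   # inside the operating system
        self.disk = []         # durable redo log file

    def _write(self):
        # Move records from the log buffer to the OS page cache.
        self.page_cache.extend(self.log_buffer)
        self.log_buffer.clear()

    def _sync(self):
        # Force page cache contents down to the disk file.
        self.disk.extend(self.page_cache)
        self.page_cache.clear()

    def commit(self, txn):
        self.log_buffer.append(txn)
        if self.policy == 1:
            self._write()
            self._sync()
        elif self.policy == 2:
            self._write()
        # policy 0: nothing happens at commit time

    def background_tick(self):
        # The once-per-second background write and sync.
        self._write()
        self._sync()

    def surviving(self, crash):
        # A MySQL crash loses the log buffer; an OS crash also loses the page cache.
        if crash == "mysql":
            return self.disk + self.page_cache
        return list(self.disk)


results = {}
for policy in (0, 1, 2):
    sim = RedoLogSim(policy)
    sim.commit("T1")  # the crash strikes before the next background tick
    results[policy] = (sim.surviving("mysql"), sim.surviving("os"))
    print(policy, results[policy])
```

In this model, setting 1 keeps the committed transaction in both crash scenarios, setting 2 keeps it only if the MySQL process crashes while the operating system stays up, and setting 0 can lose it in either case.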
Demonstration of different flushing strategies
flow chart
Write redo log buffer process
Supplementary concept: Mini-Transaction
A transaction can contain several statements. Each statement is actually composed of several mtrs, and each mtr can contain several redo log records. Their relationship can be pictured like this:
Redo log is written to log buffer
Each mtr generates a group of redo log records. The logs generated by these mtrs can be described with a schematic diagram:
Different transactions may run concurrently, so the mtrs of T1 and T2 may execute alternately.
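The nesting and the interleaving can be sketched in a few lines of Python. The transactions, mtr contents, and the round-robin schedule below are all made up for illustration; a real system interleaves mtrs in whatever order the concurrent threads happen to finish them. The point of the sketch is only that each mtr's records enter the log buffer together as one group.

```python
from itertools import zip_longest

# Hypothetical transactions: each is a list of mtrs, each mtr a list of redo records.
t1 = [["T1.mtr1.r1", "T1.mtr1.r2"], ["T1.mtr2.r1"]]
t2 = [["T2.mtr1.r1"], ["T2.mtr2.r1", "T2.mtr2.r2"]]


def interleave_mtrs(*transactions):
    """Copy mtr groups into the log buffer, alternating between transactions."""
    log_buffer = []
    for round_of_mtrs in zip_longest(*transactions):
        for mtr in round_of_mtrs:
            if mtr is not None:
                # The whole group of an mtr is copied at once, never split.
                log_buffer.extend(mtr)
    return log_buffer


log_buffer = interleave_mtrs(t1, t2)
for record in log_buffer:
    print(record)
```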
Structure diagram of redo log block
redo log file
Related parameter settings
innodb_log_group_home_dir: Specifies the path where the redo log file group is located. The default value is ./, which means the data directory of the database. By default, MySQL's data directory (/var/lib/mysql) contains two files named ib_logfile0 and ib_logfile1, and the logs in the log buffer are flushed to these two disk files by default. The location of the redo log files can also be changed.
innodb_log_files_in_group: Specifies the number of redo log files, named ib_logfile0, ib_logfile1 ... ib_logfilen. The default is 2, and the maximum is 100.
mysql> show variables like 'innodb_log_files_in_group';
+---------------------------+-------+
| Variable_name | Value |
+---------------------------+-------+
| innodb_log_files_in_group | 2 |
+---------------------------+-------+
#ib_logfile0
#ib_logfile1
innodb_flush_log_at_trx_commit: Controls the strategy for flushing the redo log to disk. The default is 1.
innodb_log_file_size: Sets the size of a single redo log file. The default value is 48M and the maximum value is 512G. Note that the maximum refers to the total size of all redo log files, that is, (innodb_log_files_in_group * innodb_log_file_size) cannot be greater than 512G.
mysql> show variables like 'innodb_log_file_size';
+----------------------+----------+
| Variable_name | Value |
+----------------------+----------+
| innodb_log_file_size | 50331648 |
+----------------------+----------+
Adjust the size to suit the workload so that larger transactions can be accommodated. Edit the my.cnf file and restart the database for the change to take effect, as shown below:
vim /etc/my.cnf
innodb_log_file_size=200M
log file group
The total redo log file size is actually: innodb_log_file_size × innodb_log_files_in_group
The redo log file group is written in a circular fashion, so will redo log written later overwrite redo log written earlier? Certainly! That is why the designers of InnoDB introduced the concept of the checkpoint.
checkpoint
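Within the file group, write pos marks where the next record will be written, and checkpoint marks the position up to which records can be erased; both move forward and wrap around to the start of the group. If write pos catches up with checkpoint, the log file group is full and no new redo log records can be written. MySQL has to pause, clear some records, and advance the checkpoint.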
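The interplay between write pos and checkpoint can be sketched as a ring buffer. The sketch below is a simplification under stated assumptions: sizes are in arbitrary units, positions are tracked as ever-growing counters and mapped onto the file group with a modulo, and `CircularRedoLog` with its methods is a hypothetical name, not an InnoDB structure.

```python
class CircularRedoLog:
    """Toy circular log file group; all names are hypothetical."""

    def __init__(self, file_size, files_in_group):
        # Total capacity = innodb_log_file_size * innodb_log_files_in_group.
        self.capacity = file_size * files_in_group
        self.write_pos = 0    # total units ever written
        self.checkpoint = 0   # total units that may already be erased

    def free_space(self):
        return self.capacity - (self.write_pos - self.checkpoint)

    def append(self, units):
        # Refuse the write if write pos would catch up with checkpoint.
        if units > self.free_space():
            return False
        self.write_pos += units
        return True

    def advance_checkpoint(self, units):
        # Called once the matching dirty data has been flushed to disk.
        self.checkpoint = min(self.write_pos, self.checkpoint + units)

    def physical_offset(self, position):
        # Where a logical position falls inside the circular file group.
        return position % self.capacity


log = CircularRedoLog(file_size=4, files_in_group=2)   # capacity of 8 units
first = log.append(6)        # fits
second = log.append(3)       # would overwrite records not yet checkpointed
log.advance_checkpoint(4)    # frees 4 units
third = log.append(3)        # fits now and wraps around
print(first, second, third, log.physical_offset(log.write_pos))
```

The rejected second write corresponds to the stall described above: nothing more can be logged until the checkpoint moves forward.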
Undo log
The redo log guarantees transaction durability, and the undo log guarantees transaction atomicity. Before a transaction updates data, it must first write an undo log record as a preliminary step.
How to understand Undo logs
Transactions need to guarantee atomicity, that is, the operations in a transaction are either all completed or none of them take effect. But sometimes things go wrong in the middle of a transaction, such as:
- Case 1: Various errors may be encountered during transaction execution, such as errors in the server itself, operating system errors, or even errors caused by a sudden power failure.
- Case 2: The programmer can manually issue a ROLLBACK statement during transaction execution to end the current transaction.
When either case occurs, the data must be changed back to its original state. This process is called rollback, and it creates the appearance that the transaction never did anything, so the atomicity requirement is met.
The role of undo logs
- Role 1: Roll back data
- Role 2: MVCC
undo storage structure
InnoDB manages undo logs in segments, namely rollback segments. Each rollback segment records 1024 undo log segments, and undo pages are allocated within each undo log segment.
- Before InnoDB 1.1 (excluding version 1.1), there was only one rollback segment, so the number of concurrently online transactions was limited to 1024, although that is sufficient for most applications.
- Since version 1.1, InnoDB supports up to 128 rollback segments, so the limit on concurrently online transactions rose to 128*1024.
mysql> show variables like 'innodb_undo_logs';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| innodb_undo_logs | 128 |
+------------------+-------+
Rollback segments and transactions
- Each transaction will only use one rollback segment, and one rollback segment may serve multiple transactions at the same time.
- When a transaction starts, a rollback segment is assigned to it. During the transaction, when data is modified, the original data is copied to the rollback segment.
- Within the rollback segment, the transaction keeps filling the current extent until the transaction ends or all the space is used up. If the current extent is not enough, the transaction requests the next extent in the segment. If all allocated extents are used up, the transaction either overwrites the original extent or, if the rollback segment allows it, extends a new extent for use.
- The rollback segment exists in the undo tablespace. There can be multiple undo tablespaces in the database, but only one undo tablespace can be used at a time.
- When the transaction commits, the InnoDB storage engine does the following two things:
  - puts the undo log into a list for subsequent purge operations;
  - determines whether the page holding the undo log can be reused, and if so, allocates it to the next transaction.
Data Classification in Rollback Segments
- Uncommitted undo information
- Committed but not yet expired undo information
- Expired undo information (the transaction has committed and the data has expired)
Types of undo
In the InnoDB storage engine, the undo log is divided into:
- insert undo log
- update undo log
The life cycle of undo log
brief generation process
With the Buffer Pool only:
With Redo Log and Undo Log:
Detailed generation process
When we do an INSERT:
begin;
INSERT INTO user (name) VALUES ('tom');
When we do an UPDATE:
UPDATE user SET id=2 WHERE id=1;
How undo log is rolled back
Taking the example above, if a rollback is executed, the process should be as follows:
- Delete the row with id=2 using the undo no=3 log
- Restore the delete mark of the row with id=1 to 0 using the undo no=2 log
- Restore the name of the row with id=1 to tom using the undo no=1 log
- Delete the row with id=1 using the undo no=0 log
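The key point in the steps above is that undo records are applied in reverse order, newest first. A minimal Python sketch of that idea follows. It uses a plain dictionary as the table and a simplified two-operation history, not the exact four undo records of the example; the function names, the record layout, and the value "jerry" are hypothetical.

```python
def run_with_undo(table, operations):
    """Apply operations to the table and return the undo log they produce."""
    undo_log = []
    for op, key, value in operations:
        if op == "insert":
            table[key] = value
            # To undo an insert, delete the row.
            undo_log.append(("delete", key, None))
        elif op == "update":
            # To undo an update, restore the old value (saved before changing it).
            undo_log.append(("restore", key, table[key]))
            table[key] = value
    return undo_log


def rollback(table, undo_log):
    """Undo the changes newest first, then discard the undo records."""
    for action, key, old_value in reversed(undo_log):
        if action == "delete":
            del table[key]
        else:
            table[key] = old_value
    undo_log.clear()


table = {5: "amy"}   # a row that existed before the transaction
undo_log = run_with_undo(table, [("insert", 1, "tom"), ("update", 1, "jerry")])
before_rollback = dict(table)

rollback(table, undo_log)
print(before_rollback)
print(table)
```

After the rollback only the pre-existing row remains, which is the "as if the transaction never ran" effect that atomicity requires.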
Undo log deletion
- For the insert undo log: because a record created by an insert operation is visible only to the transaction itself and not to other transactions, the undo log can be deleted directly after the transaction commits, with no purge operation needed.
- For the update undo log: this undo log may be needed to support the MVCC mechanism, so it cannot be deleted when the transaction commits. On commit it is put into the undo log linked list and waits for the purge thread to perform the final deletion.
summary
The undo log is a logical log. When a transaction is rolled back, it just logically restores the database to its original state.
The redo log is a physical log, which records the physical changes of the data page. The undo log is not the reverse process of the redo log.