What the undo log does in MySQL

Today we introduce MySQL's undo log to give you a deeper understanding of what it does.

01. Why do we need the undo log?

Consider a question: if MySQL crashes while a transaction is executing, before the transaction has committed, how can the data be rolled back to its state before the transaction?

If every change made during a transaction also records the information needed to reverse it in a log, then a crash in the middle of the transaction is no longer a problem: MySQL can use this log to roll the data back to its state before the transaction. Likewise, when the user issues a ROLLBACK statement, the same undo information is used to restore the data to its state before the modification.

The mechanism that achieves this is the undo log (rollback log), which guarantees atomicity, the A in a transaction's ACID properties.

Besides rollback, the undo log has a second function: MVCC. When a transaction reads a row that is currently locked by another transaction, it can read an earlier version of the row through the undo log, which gives a non-locking read.

MVCC is implemented with the read view plus the undo log. The undo log keeps multiple historical copies of each record, which together form a version chain. (The read view will be covered next time, in the MVCC article.)

02. Undo log version chain

As long as transactions keep modifying a row, undo logs keep being generated, and they eventually form an undo log version chain.

Before talking about the version chain, let's look at the hidden columns stored in each row.

Besides the real data, each row in the database carries 3 hidden columns: row_id, trx_id and roll_pointer.

row_id (row number)

If the table has a primary key, InnoDB uses it as the clustered index and row_id is not needed.

If there is no primary key, InnoDB uses the first unique index whose columns are all NOT NULL as the clustered index instead.

If neither exists, InnoDB automatically generates a monotonically increasing integer as row_id and builds a hidden clustered index on it.

trx_id (transaction number)

InnoDB assigns each transaction a globally increasing transaction id. Afterwards, whenever the transaction inserts, deletes, or updates a row, it records its own transaction id in that row's trx_id.

roll_pointer (rollback pointer)

When a transaction changes a row, it first writes the old data into the undo log, then writes the new data into the row, and sets the row's roll_pointer to point to the undo log just written. The previous version of the row can therefore be found by following roll_pointer.

We create a person table with the following statement:

CREATE TABLE `person` (
  `id` int(11) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

Now start the first transaction, with transaction id 1, and execute the following insert statement:

INSERT INTO `person`(`id`, `name`, `age`) VALUES (1, '张三', 18);

The current row can then be pictured as follows: id = 1, name = '张三', age = 18, trx_id = 1, roll_pointer = empty.

Because the data is newly inserted, the undo log pointed to by its roll_pointer is empty.

Then start the second transaction, with transaction id 2, and execute the following update:

UPDATE `person` SET `name` = '李四' WHERE `id` = 1;

Next, start the third transaction, with transaction id 3, and execute the following update:

UPDATE `person` SET `age` = 22 WHERE `id` = 1;

Each time a transaction changes the row, an undo log is generated to save the previous version, and the roll_pointer of the new version points to the undo log just generated.

In this way roll_pointer links the undo logs of the different versions together into an undo log version chain.
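The chain built by the three transactions above can be pictured with a small Python model. This is only a sketch to make the pointer structure concrete: the Version class and the insert, update and history helpers are invented for illustration, and each old version is kept here as a full copy of the row, which is a simplification and not a description of InnoDB's actual undo record layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    """One version of a row plus the two hidden columns used by the chain."""
    data: dict
    trx_id: int
    roll_pointer: Optional["Version"]  # previous version, None for a fresh insert


def insert(data: dict, trx_id: int) -> Version:
    # A newly inserted row has no earlier version to point back to.
    return Version(dict(data), trx_id, None)


def update(current: Version, changes: dict, trx_id: int) -> Version:
    # The old version stays reachable (it plays the role of the undo log),
    # and the new version points at it through roll_pointer.
    return Version({**current.data, **changes}, trx_id, current)


def history(row: Version) -> list:
    """Walk roll_pointer from the newest version back to the oldest."""
    out = []
    node = row
    while node is not None:
        out.append((node.trx_id, node.data))
        node = node.roll_pointer
    return out


row = insert({"id": 1, "name": "张三", "age": 18}, trx_id=1)
row = update(row, {"name": "李四"}, trx_id=2)
row = update(row, {"age": 22}, trx_id=3)

for trx_id, data in history(row):
    print(trx_id, data)
```

Walking the chain from the current row visits the versions written by transactions 3, 2 and 1 in that order, which is the order a reader would follow when looking for an older version.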

03. Is undo a logical log or a physical log?

Users often misunderstand undo, assuming that it physically restores the database to its state before the statement or transaction ran. That is not the case. Undo is a logical log, so it only restores the data logically to its original state. All modifications are logically undone, but the data structures and pages themselves may look very different after the rollback.

For example, while one transaction is modifying some records in a page, other transactions may be modifying other records in the same page. The page therefore cannot simply be restored to how it looked when the transaction began, because that would wipe out the work of the other transactions.

Suppose the user runs a transaction that inserts 100 records, and this causes a new segment to be allocated, so the tablespace grows. When the user executes ROLLBACK, the inserts are rolled back, but the tablespace does not shrink.

So when a transaction is rolled back, InnoDB actually performs the reverse of what it did before.

For example:

  1. For each insert, the InnoDB storage engine performs a delete.

  2. For each delete, the InnoDB storage engine performs an insert.

  3. For each update, the InnoDB storage engine performs a reverse update, putting back the row as it was before the modification.
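The three reverse operations can be sketched as a toy in-memory table. Everything here is hypothetical (the ToyTable class, its method names, and the allocated_slots counter that stands in for tablespace size); it only illustrates the idea of logical undo under those assumptions and is not how InnoDB stores its records.

```python
class ToyTable:
    """Hypothetical in-memory table keyed by primary key, with a logical undo log."""

    def __init__(self):
        self.rows = {}
        self.undo = []            # undo records of the open transaction
        self.allocated_slots = 0  # stands in for tablespace size; never shrinks

    def insert(self, pk, row):
        self.rows[pk] = dict(row)
        self.allocated_slots = max(self.allocated_slots, len(self.rows))
        self.undo.append(("delete", pk, None))   # reverse of an insert

    def delete(self, pk):
        old = self.rows.pop(pk)
        self.undo.append(("insert", pk, old))    # reverse of a delete

    def update(self, pk, changes):
        old = dict(self.rows[pk])
        self.rows[pk].update(changes)
        self.undo.append(("update", pk, old))    # reverse of an update

    def commit(self):
        self.undo.clear()

    def rollback(self):
        # Apply the reverse operations newest-first.
        while self.undo:
            op, pk, old = self.undo.pop()
            if op == "delete":
                del self.rows[pk]
            else:  # "insert" and "update" both put the old row back
                self.rows[pk] = old


table = ToyTable()
table.insert(1, {"name": "张三", "age": 18})
table.commit()

table.update(1, {"name": "李四"})
for pk in range(2, 102):          # insert 100 records, as in the text
    table.insert(pk, {"name": "tmp", "age": 0})
table.delete(1)
table.rollback()

print(table.rows)
print(table.allocated_slots)
```

After the rollback the table again holds only the original row, while allocated_slots keeps the value it grew to during the transaction. The data is restored logically, but the space that was allocated is not given back.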

04. How is the undo log stored?

The InnoDB storage engine also manages undo with segments. InnoDB has rollback segments, each of which records 1024 undo log segments, and undo pages are allocated within each undo log segment.

Starting from InnoDB 1.2, the rollback segments can be configured further through parameters, including:

  1. innodb_undo_directory sets the path where the rollback segment files are located.

    The default value of this parameter is ".", which means the current InnoDB storage engine directory. By default the rollback segments are stored in the shared tablespace (ibdataX); this parameter lets you move them to another location.

  2. innodb_undo_logs sets the number of rollback segments. The default value is 128.

  3. innodb_undo_tablespaces sets the number of files that make up the rollback segments.

    In this way, the rollback segments can be distributed evenly across multiple files.

    After setting this parameter, you will see files prefixed with undo in the innodb_undo_directory path; these are the rollback segment files.

  4. innodb_undo_log_truncate (new in 5.7) is disabled by default. If enabled, an undo tablespace that exceeds innodb_max_undo_log_size is truncated back to its initial size. This requires that (1) the undo logs inside it are no longer in use, and (2) there are at least 2 undo tablespaces. The truncate frequency can be adjusted with innodb_purge_rseg_truncate_frequency.

When a transaction allocates a page in an undo log segment and writes undo log to it, that write must itself be recorded in the redo log. When a transaction commits, the InnoDB storage engine does the following two things:

  1. Puts the undo log on a list for a later purge operation.

  2. Determines whether the page holding the undo log can be reused; if so, the page is given to the next transaction.

After a transaction commits, its undo log and the page holding it cannot be deleted immediately, because other transactions may still need the undo log to obtain an earlier version of a row. So at commit the undo log is put on a linked list, and whether the undo log and its page can finally be deleted is decided by the purge thread (the purge thread will be explained in detail next time).

Undo pages are designed to be reused in InnoDB. When a transaction commits, its undo log is first put on the linked list, and InnoDB then checks whether the used space of the undo page is less than 3/4. If so, the undo page can be reused, and the next undo log is recorded after the current one. Because the list that stores undo logs is organized by record, and a single undo page may hold undo logs from different transactions, the purge operation involves scattered reads from disk, which makes it a relatively slow process.
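The reuse check can be written down in a few lines. The sketch below assumes InnoDB's default 16 KB page size; the UndoPage class, the page_for_next_transaction helper and the byte sizes of the undo logs are all invented for illustration.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 16 * 1024            # InnoDB's default page size, assumed here
REUSE_LIMIT = PAGE_SIZE * 3 // 4


@dataclass
class UndoPage:
    used: int = 0
    logs: list = field(default_factory=list)   # (trx_id, size) pairs

    def append(self, trx_id: int, size: int) -> None:
        self.logs.append((trx_id, size))
        self.used += size

    def reusable(self) -> bool:
        # The rule from the text: reuse only if less than 3/4 of the page is used.
        return self.used < REUSE_LIMIT


def page_for_next_transaction(last_page: UndoPage) -> UndoPage:
    """Return the page the next transaction should write its undo log to."""
    return last_page if last_page.reusable() else UndoPage()


page = UndoPage()
page.append(trx_id=1, size=4000)              # transaction 1 commits

second = page_for_next_transaction(page)      # page is under the limit: reused
second.append(trx_id=2, size=9000)

third = page_for_next_transaction(second)     # now over 3/4: a fresh page
print(second is page, third is page)
```

Because the second transaction's undo log lands on the same page as the first one's, that page now holds undo logs from two transactions, which is the situation that makes purge touch scattered pages.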

05. Does the undo log need to be persisted?

Generating undo log also generates redo log, because the undo log itself needs to be protected by persistence. (The redo log will be covered in detail next time.)

06. Undo log format

  • insert undo log

  • update undo log

The insert undo log is the undo log generated by insert operations. Because a record being inserted is visible only to the inserting transaction and not to other transactions, this undo log can be deleted directly once the transaction commits; no purge operation is needed.

The update undo log is the undo log generated by delete and update operations. It may still be needed for MVCC, so it cannot be deleted when the transaction commits. Instead it is put on the undo log linked list at commit and waits for the purge thread to delete it.
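The different treatment of the two types at commit can be sketched as follows. The function names, the history_list queue and the purge condition are assumptions made for illustration: an entry is treated as safe to purge once it was committed before the oldest open snapshot was taken, which is a simplified stand-in for the purge thread's real rules.

```python
from collections import deque

history_list = deque()   # committed update undo logs, oldest first


def commit(trx_no: int, insert_undo: list, update_undo: list) -> None:
    """Hypothetical commit step: free insert undo, queue update undo."""
    insert_undo.clear()                  # no other transaction can need it
    if update_undo:
        history_list.append((trx_no, list(update_undo)))
        update_undo.clear()


def purge(oldest_snapshot_no: int) -> int:
    """Drop queued undo that no open snapshot can still read.

    Assumption: an entry is safe to drop once it was committed before the
    oldest open snapshot was taken.
    """
    purged = 0
    while history_list and history_list[0][0] < oldest_snapshot_no:
        history_list.popleft()
        purged += 1
    return purged


commit(10, ["insert row 1"], [])                # insert undo only: nothing queued
commit(11, [], ["update row 1: old name"])      # queued for purge
commit(12, [], ["delete row 7: old row"])       # queued for purge

removed = purge(oldest_snapshot_no=12)          # only entry 11 is old enough
print(removed, list(history_list))
```

The entry from transaction 12 stays on the list because a snapshot taken at 12 may still need it to see the older version of the row.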

To summarize, the undo log has two functions:

Implementing transaction rollback, which guarantees the atomicity of transactions. During transaction processing, if an error occurs or the user executes a ROLLBACK statement, MySQL can use the historical data in the undo log to restore the data to its state before the transaction started.

Serving as one of the key pieces of MVCC (Multi-Version Concurrency Control). MVCC is implemented with the read view plus the undo log. The undo log keeps multiple historical copies of each record. When MySQL executes a snapshot read (an ordinary SELECT statement), it follows the undo log version chain and, using the information in the transaction's read view, finds the version of the record that is visible to it.
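A snapshot read over the version chain of the person row can be sketched like this. The read view is not described in this article, so the ReadView class below is a simplified assumption: it only records which transactions were still running when the snapshot was taken and which transaction id had not been handed out yet. The class, its field names and the snapshot_read helper are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadView:
    """Simplified snapshot: which transaction ids were not yet committed."""
    creator_trx_id: int
    active_trx_ids: frozenset   # transactions still running when the view was made
    next_trx_id: int            # first id that had not been assigned yet

    def can_see(self, trx_id: int) -> bool:
        if trx_id == self.creator_trx_id:
            return True                  # a transaction always sees its own changes
        if trx_id >= self.next_trx_id:
            return False                 # started after the snapshot was taken
        return trx_id not in self.active_trx_ids


def snapshot_read(chain, view):
    """chain lists (trx_id, row) from newest to oldest, as roll_pointer would."""
    for trx_id, row in chain:
        if view.can_see(trx_id):
            return row
    return None  # the row did not exist for this snapshot


chain = [
    (3, {"id": 1, "name": "李四", "age": 22}),
    (2, {"id": 1, "name": "李四", "age": 18}),
    (1, {"id": 1, "name": "张三", "age": 18}),
]

# Snapshot taken by transaction 4 while transaction 3 was still uncommitted.
view = ReadView(creator_trx_id=4, active_trx_ids=frozenset({3}), next_trx_id=5)
print(snapshot_read(chain, view))
```

The read skips the version written by the uncommitted transaction 3 and returns the one written by transaction 2, without taking any lock on the row.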

END

Origin blog.csdn.net/s827292890/article/details/129463600