Undo and redo logs in MySQL, and solutions for distributed transactions


undo log and redo log

A database system has files for storing data and files for storing logs. Logs also have an in-memory cache, the log buffer, in addition to the log file on disk. MySQL has two types of log that relate to transactions: undo logs and redo logs.

undo log

Database transactions are atomic: if a transaction fails, its changes must be rolled back.

Atomicity can be achieved with undo logging. The principle of the undo log is simple: to satisfy atomicity, before modifying any data, first back it up to the undo log, then modify the data. If an error occurs or the user executes a ROLLBACK statement, the system uses the backup in the undo log to restore the data to its state before the transaction started. The database caches data in memory first and writes it to disk when the transaction is committed. A simplified process for achieving atomic and durable transactions with the undo log:

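Suppose there are two data items, A and B, with values 1 and 2.

A. The transaction begins.
B. Record A=1 in the undo log.
C. Modify A=3.
D. Record B=2 in the undo log.
E. Modify B=4.
F. Write the undo log to disk.
G. Write the data to disk.
H. The transaction commits.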

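How is durability guaranteed?

The modified data is written to disk before the transaction commits, so once a transaction has committed, its data is guaranteed to be persistent.

How is atomicity guaranteed?

Every modification to the database first records the old data in the undo log, so when a rollback is needed, the undo log can be read to restore the data.

If the system crashes between G and H

The transaction has not committed, so it must be rolled back. The undo log has already been persisted, so the data can be restored from it.

If the system crashes before G

The data has not yet been persisted to disk and is still in its state before the transaction.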
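
The back-up-then-modify idea above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how MySQL implements its undo log; the class name `UndoLogStore` and its methods are hypothetical, and disk writes are left out.

```python
class UndoLogStore:
    """Toy in-memory store that backs up old values before modifying them.

    Hypothetical illustration only; not MySQL's implementation.
    """

    def __init__(self, data):
        self.data = dict(data)  # current values (stands in for the data files)
        self.undo_log = []      # (key, old_value) backups of the active transaction

    def begin(self):
        self.undo_log = []

    def write(self, key, new_value):
        # Record the old value first, then modify the data.
        self.undo_log.append((key, self.data[key]))
        self.data[key] = new_value

    def commit(self):
        # The transaction is complete, so its backups are no longer needed.
        self.undo_log = []

    def rollback(self):
        # Apply the backups newest-first to return to the starting state.
        while self.undo_log:
            key, old_value = self.undo_log.pop()
            self.data[key] = old_value


store = UndoLogStore({"A": 1, "B": 2})
store.begin()
store.write("A", 3)
store.write("B", 4)
store.rollback()
print(store.data)  # {'A': 1, 'B': 2}
```

Restoring newest-first matters when the same item is modified more than once in a transaction: the oldest backup is applied last, so the item ends up with its original value.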

Drawback: both the data and the undo log must be written to disk before each transaction commits. This causes a lot of disk I/O, so performance is poor.

If data could be cached for a while before being written, I/O would be reduced and performance would improve, but the durability of the transaction would be lost. Another mechanism is therefore introduced to achieve durability: the redo log.

In contrast to the undo log, the redo log records a backup of the new data. As long as the redo log is persisted before the transaction commits, the data itself does not need to be persisted at that point, which reduces the number of I/O operations.

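Suppose there are two data items, A and B, with values 1 and 2.

A. The transaction begins.
B. Record A=1 in the undo log buffer.
C. Modify A=3.
D. Record A=3 in the redo log buffer.
E. Record B=2 in the undo log buffer.
F. Modify B=4.
G. Record B=4 in the redo log buffer.
H. Write the undo log to disk.
I. Write the redo log to disk.
J. The transaction commits.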

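- How is atomicity guaranteed?

  If a failure occurs before the transaction commits, the data is restored from the undo log. If the undo log has not been written yet, then the data has not been persisted either, and no rollback is needed.

- How is durability guaranteed?

  Notice that the data itself is never persisted in these steps. The changes have been written to the redo log, and the redo log has been persisted to disk, so the transaction can commit as soon as step `I` is complete.

- When is the database data in memory persisted to disk?

  Because the redo log has already been persisted, it matters little whether the database data has been written to disk yet. However, to avoid dirty data (memory and disk out of sync), the in-memory data is also flushed to disk after the transaction commits (it can also be flushed at a fixed, configured frequency).

- When is the redo log written to disk?

  The redo log is written to disk before the transaction commits, or when the redo log buffer is full.

Why use the redo log

- Writing database data is random I/O, which performs poorly.
- The redo log allocates a contiguous region of space when it is initialized, so writing it is sequential I/O, which performs well.
- In practice the undo log is not written directly to disk. It is first written into the redo log buffer, and when the redo log is persisted the undo log is persisted with it, so only the redo log needs to be persisted before the transaction commits.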
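
As a rough sketch of the points above, the following Python model flushes only the redo log at commit and leaves the data files untouched until later. All names (`RedoLoggedStore`, `flush_pages`, `crash_and_recover`) are hypothetical and the "disk" is just a Python object, so this only illustrates the ordering of the steps, not MySQL's actual I/O.

```python
class RedoLoggedStore:
    """Toy model: commit persists only the redo log. Hypothetical names."""

    def __init__(self, data):
        self.disk_data = dict(data)  # data files on disk
        self.memory = dict(data)     # data cached in memory
        self.redo_buffer = []        # redo log buffer in memory
        self.redo_disk = []          # redo log file, only ever appended to

    def write(self, key, new_value):
        # Modify the cached data and record the new value in the redo buffer.
        self.memory[key] = new_value
        self.redo_buffer.append((key, new_value))

    def commit(self):
        # Only the redo log is flushed at commit: one sequential append.
        self.redo_disk.extend(self.redo_buffer)
        self.redo_buffer = []

    def flush_pages(self):
        # The data itself is written out later, for example on a schedule.
        self.disk_data = dict(self.memory)

    def crash_and_recover(self):
        # Memory is lost; rebuild it by replaying the redo log over the disk data.
        self.memory = dict(self.disk_data)
        self.redo_buffer = []
        for key, new_value in self.redo_disk:
            self.memory[key] = new_value


store = RedoLoggedStore({"A": 1, "B": 2})
store.write("A", 3)
store.write("B", 4)
store.commit()
print(store.disk_data)  # {'A': 1, 'B': 2}
store.crash_and_recover()
print(store.memory)     # {'A': 3, 'B': 4}
```

The data files still hold the old values after commit, yet the committed changes survive the simulated crash because they can be replayed from the redo log.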

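- The data recorded in the redo log may include transactions that have not yet committed. If the database crashes at that point, how is the data recovered?

There are two recovery strategies:

Strategy 1: during recovery, redo only the transactions that have committed.

Strategy 2: during recovery, redo all transactions, including uncommitted and rolled-back ones, and then use the undo log to roll back the transactions that had not committed.

The InnoDB engine uses the second strategy, so the undo log must be persisted before the redo log.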
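
The second strategy can be illustrated with a small Python function. The log format here is an assumption made for the sketch: each update record carries both the old value (the undo information) and the new value (the redo information), and a separate record marks a commit. InnoDB's real log format is different.

```python
def recover(disk_data, log):
    """Redo every logged update, then undo those of uncommitted transactions.

    Hypothetical record formats: ("update", txid, key, old_value, new_value)
    and ("commit", txid).
    """
    data = dict(disk_data)
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo pass: reapply every update, whether or not its transaction committed.
    for rec in log:
        if rec[0] == "update":
            _, txid, key, old_value, new_value = rec
            data[key] = new_value

    # Undo pass: walk the log backwards and restore the old values
    # written by transactions that never committed.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, txid, key, old_value, new_value = rec
            data[key] = old_value

    return data


log = [
    ("update", "T1", "A", 1, 3),
    ("update", "T2", "B", 2, 4),
    ("commit", "T1"),
    # The crash happens here: T2 never committed.
]
print(recover({"A": 1, "B": 2}, log))  # {'A': 3, 'B': 2}
```

The undo pass can only work if the old values are available after the crash, which is why the undo information has to reach disk no later than the redo information it belongs to.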

Distributed transaction

Once databases have been split horizontally and services split vertically, a single business operation usually has to span several databases and services. In a distributed network environment we cannot guarantee that every service and database is 100% available, so some of the operations will succeed while others fail. When part of a business operation succeeds and part fails, the business data becomes inconsistent.

For example, suppose the payment table and the inventory table are in different databases. If deducting the amount fails when a user makes a purchase, the payment table is rolled back, but because the inventory table is in a different database from the payment table, the inventory change is not rolled back.

 

Ideas for Solving Distributed Transactions

For the CAP theorem and BASE theory, refer to the previous articles.

Solution 1: Phased commit

In 1994, the X/Open organization (now the Open Group) defined the DTP model for distributed transaction processing. The model includes the following roles:

  • Application (AP): our microservice

  • Transaction Manager (TM): the global transaction manager

  • Resource Manager (RM): usually a database

  • Communication Resource Manager (CRM): the communication middleware between TM and RM

In this model, a distributed transaction (global transaction) can be split into many local transactions running on different APs and RMs. Each local transaction can guarantee ACID on its own, but the global transaction must guarantee that all the local transactions it contains succeed together; if one local transaction fails, all the others must be rolled back. The problem is that while a local transaction is being processed, it does not know the running status of the other transactions. Each local transaction therefore has to be notified through the CRM so that the execution status is synchronized.

two-phase commit

Stage 1: Preparation stage. Each participant prepares its local transaction.

Stage 2: Execution stage. Each local transaction commits or rolls back according to the results of the previous stage.

This process requires a coordinator as well as transaction participants (voters).

 

Voting phase: The coordinator asks each transaction participant whether the transaction can be executed. Each participant executes the transaction, writes the redo and undo logs, and then reports back: agree if execution succeeded, disagree if it failed.

Commit phase: If the coordinator finds that every participant can execute the transaction (agree), it issues a commit instruction to each participant, and each participant commits. Otherwise it issues a rollback instruction, and each participant rolls back.
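
The two phases can be sketched as a single coordinator function. This is a minimal in-process illustration with hypothetical names (`Participant`, `two_phase_commit`); it leaves out networking, timeouts, and coordinator failure.

```python
class Participant:
    """A transaction participant (voter). Hypothetical illustration."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Execute the local transaction without committing it, then vote.
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    # Voting phase: ask every participant whether it can execute the transaction.
    votes = [p.prepare() for p in participants]

    # Commit phase: commit only if every participant agreed.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


payment = Participant("payment")
inventory = Participant("inventory", can_commit=False)
print(two_phase_commit([payment, inventory]))  # False
print(payment.state, inventory.state)          # rolled_back rolled_back
```

A single disagree vote is enough to roll back every participant, including those that prepared successfully.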

Drawbacks:

Single point of failure: if the coordinator goes down, the participants cannot determine whether to commit or roll back next.

Blocking: during the preparation and commit phases, each transaction participant locks local resources and waits for the execution results of the other transactions. Resources stay locked for a long time, so execution efficiency is relatively low.

Solution 2: TCC mode

TCC is essentially a compensation mechanism. The transaction operation consists of three methods:

  • Try: check and reserve the resources;

  • Confirm: commit the business operation; if Try succeeds, Confirm must succeed;

  • Cancel: release the reserved resources.

Execution has two stages:

  • Preparation stage (try): check and reserve the resources;

  • Execution stage (confirm/cancel): the result of the previous stage determines which method runs. If every transaction participant succeeded in the previous stage, confirm is executed; otherwise cancel is executed.

Try, confirm, and cancel are each independent transactions. They are not affected by other participants and do not block waiting for them.

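For example: suppose the original balance of account A is 100 yuan and 30 yuan needs to be deducted. Try checks that the balance is sufficient and reserves 30 yuan, confirm completes the deduction of the reserved amount, and cancel releases the reservation so that the balance returns to 100.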
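
The account example can be sketched in Python as follows. The `frozen` field used to hold the reserved amount is an assumption of this sketch, as are the class and method names; a real TCC implementation would also persist state and make confirm and cancel idempotent.

```python
class Account:
    """Toy account with Try / Confirm / Cancel operations (hypothetical)."""

    def __init__(self, balance):
        self.balance = balance  # amount available to spend
        self.frozen = 0         # amount reserved by Try (assumed field)

    def try_deduct(self, amount):
        # Try: check the resource and reserve it.
        if self.balance < amount:
            return False
        self.balance -= amount
        self.frozen += amount
        return True

    def confirm(self, amount):
        # Confirm: the reserved amount is really spent.
        self.frozen -= amount

    def cancel(self, amount):
        # Cancel: release the reservation back to the available balance.
        self.frozen -= amount
        self.balance += amount


account = Account(100)
account.try_deduct(30)
print(account.balance, account.frozen)  # 70 30
account.cancel(30)
print(account.balance, account.frozen)  # 100 0
```

Because try has already moved the 30 yuan out of the available balance, confirm has nothing left to check and cannot fail for lack of funds, which is what lets each of the three steps run as its own short local transaction.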

Disadvantages: try, confirm, and cancel have to be written by hand, which is intrusive to the business code and increases development complexity. In addition, if the cancel action fails, the resources cannot be released, so a retry mechanism has to be introduced. Retries can lead to repeated execution, so the operations must also be idempotent.

Solution 3: Use MQ

The transaction initiator A executes its local transaction and sends the information about the transaction to be executed to transaction participant B through MQ. Participant B executes its local transaction after receiving the message. This relies on the reliability of MQ messaging: participant B must ensure that the message is eventually consumed, and if consumption fails it must be retried, possibly several times.
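
A minimal in-process sketch of this flow is shown below. A `deque` stands in for the message queue, and the function names, the message fields, and the `simulate_failure` switch are all hypothetical. The duplicate check on the message id is an addition of this sketch, included because retries can deliver the same message more than once.

```python
from collections import deque

queue = deque()              # stands in for the message queue
processed_ids = set()        # ids of messages B has already applied
inventory = {"item-1": 10}   # B's local data


def initiator_a(order_id, item, count):
    # A has committed its local transaction; publish the follow-up work for B.
    queue.append({"id": order_id, "item": item, "count": count})


def participant_b(message, simulate_failure=False):
    # Skip messages that were already applied (duplicate delivery).
    if message["id"] in processed_ids:
        return True
    if simulate_failure:
        return False
    inventory[message["item"]] -= message["count"]
    processed_ids.add(message["id"])
    return True


def deliver(message, max_attempts=3, failing_attempts=0):
    # Retry until B reports success; the first `failing_attempts` tries fail.
    for attempt in range(1, max_attempts + 1):
        if participant_b(message, simulate_failure=attempt <= failing_attempts):
            return attempt
    return None  # retries exhausted; the message needs separate handling


initiator_a("order-1", "item-1", 2)
message = queue.popleft()
print(deliver(message, failing_attempts=1))  # 2
print(deliver(message))                      # 1
print(inventory)                             # {'item-1': 8}
```

The second delivery of the same message is acknowledged without changing the inventory again.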

Disadvantages: the approach depends on the reliability of MQ. The message initiator can roll back, but a failure on the participant side cannot trigger a rollback of the initiator's transaction. Timeliness is also poor, since it depends on how promptly the MQ message is delivered.

Solution 4: AT mode

Similar to the TCC mode, it is divided into two stages, but you do not need to write the second stage code yourself, which is implemented by Seata.

In the first stage, Seata intercepts the "business SQL" and parses its semantics to find the business data that the "business SQL" will update. Before the data is updated, Seata saves it as the "before image", and then executes the "business SQL" to update the data. After the update, it saves the new data as the "after image", and finally acquires the global row lock and commits the transaction. All of these operations are completed within one database transaction, which ensures the atomicity of the first-stage operations.

The before image and after image here are similar to the undo and redo logs of the database, but they are actually simulated using the database.

If the second stage is a commit, the "business SQL" has already been committed to the database in the first stage, so the Seata framework only needs to delete the snapshot data and row locks saved in the first stage to complete the cleanup.

If the second stage is a rollback, Seata needs to roll back the "business SQL" that was executed in the first stage and restore the business data. It does this by using the "before image" to restore the business data.
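
The before image / after image idea can be imitated with Python's built-in `sqlite3` module. This is not Seata's code or API: the table layout, the function names, and the JSON encoding of the images are assumptions of this sketch, SQL parsing is skipped by hard-coding the statement, and global row locks are left out entirely.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE undo_log (xid TEXT, before_image TEXT, after_image TEXT)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()

SELECT_ROW = "SELECT id, balance FROM account WHERE id = ?"


def phase_one(xid, account_id, amount):
    # The with-block commits the update and the saved images together,
    # or rolls both back if an error occurs.
    with conn:
        before = conn.execute(SELECT_ROW, (account_id,)).fetchone()
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
        after = conn.execute(SELECT_ROW, (account_id,)).fetchone()
        conn.execute(
            "INSERT INTO undo_log VALUES (?, ?, ?)",
            (xid, json.dumps(before), json.dumps(after)),
        )


def phase_two_commit(xid):
    # The business change is already committed; only the images are deleted.
    with conn:
        conn.execute("DELETE FROM undo_log WHERE xid = ?", (xid,))


def phase_two_rollback(xid):
    # Restore the business data from the before image, then clean up.
    with conn:
        rows = conn.execute(
            "SELECT before_image FROM undo_log WHERE xid = ?", (xid,)
        ).fetchall()
        for (before_image,) in rows:
            account_id, old_balance = json.loads(before_image)
            conn.execute(
                "UPDATE account SET balance = ? WHERE id = ?",
                (old_balance, account_id),
            )
        conn.execute("DELETE FROM undo_log WHERE xid = ?", (xid,))


def balance(account_id):
    return conn.execute(SELECT_ROW, (account_id,)).fetchone()[1]


phase_one("tx-1", 1, 30)
print(balance(1))  # 70
phase_two_rollback("tx-1")
print(balance(1))  # 100
```

Calling `phase_two_commit` instead of `phase_two_rollback` would leave the balance at 70 and simply delete the saved images.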


Origin blog.csdn.net/weixin_52210557/article/details/124223667