MySQL interview essentials (1): How transactions are implemented

1. What is a transaction

A transaction is a set of database operations that execute as a single atomic, indivisible unit: either all of them take effect or, after a rollback, none of them do. This prevents the situation where one operation succeeds and another fails, leaving the data inconsistent.
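The all-or-nothing behavior can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL only because it needs no server; the `account` table, the `transfer` function, and the balance check are assumptions made for illustration.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither does."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute("COMMIT")
        return True
    except ValueError:
        # The debit above has already run; ROLLBACK discards it.
        conn.execute("ROLLBACK")
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))

# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN / COMMIT / ROLLBACK statements above fully control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
```

Calling `transfer(conn, "alice", "bob", 500)` runs the debit, fails the balance check, and rolls back, so `balances(conn)` is the same as before the call.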

2. Four major characteristics of transactions

Atomicity, Consistency, Isolation, and Durability, abbreviated as ACID.
Goal of transactions:
Consistency is the ultimate goal of a transaction. Atomicity, isolation, and durability all exist to achieve consistency.

2.1. Problems caused by concurrent transactions (problems caused by insufficient isolation)

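Problems may occur when two concurrently executing transactions operate on the same records, because concurrent operations can make the data inconsistent. The typical problems are dirty reads (reading data that another transaction has not yet committed), non-repeatable reads (the same row read twice in one transaction returns different values because another transaction modified it and committed in between), and phantom reads (the same range query run twice in one transaction returns a different set of rows because another transaction inserted or deleted rows and committed in between). The database system provides isolation levels so that we can choose how strictly transactions are isolated and avoid these inconsistencies.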


2.2. Transaction isolation levels (solving the problems caused by insufficient isolation)

Transaction isolation comes in levels. The SQL standard defines four, from weakest to strongest: Read Uncommitted, Read Committed, Repeatable Read, and Serializable. At the weaker levels, there is no guarantee that a transaction running concurrently with others is free from their interference, and that interference is what produces dirty reads, non-repeatable reads, and phantom reads.

In MySQL, if you use InnoDB, the default isolation level is Repeatable Read.
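As a compact reference, the mapping between the SQL-standard isolation levels and the anomalies each one still permits can be written down as data. This is a minimal sketch of the standard's definitions, not of InnoDB's actual behavior; the names `PERMITTED` and `weakest_level_preventing` are made up for illustration.

```python
# Anomalies that each SQL-standard isolation level still permits,
# listed from the weakest level to the strongest.
PERMITTED = {
    "READ UNCOMMITTED": {"dirty read", "non-repeatable read", "phantom read"},
    "READ COMMITTED": {"non-repeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}

def weakest_level_preventing(anomalies):
    """Return the weakest level that permits none of the given anomalies."""
    unwanted = set(anomalies)
    for level, permitted in PERMITTED.items():  # dicts keep insertion order
        if not (permitted & unwanted):
            return level
    # Not reached for known anomalies: SERIALIZABLE permits nothing.
    return "SERIALIZABLE"
```

For example, `weakest_level_preventing({"non-repeatable read"})` returns `"REPEATABLE READ"`.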

3. Database log

3.1 redo log

Content: A log in physical format that records the modifications made to physical data pages. Redo log records are written sequentially to the redo log file.
The redo log consists of two parts: the redo log buffer, which is held in memory, and the redo log file, which is stored on disk. To improve performance, MySQL does not synchronize every modification of a data page to disk in real time. Instead, it first applies the modification to the page in the Buffer Pool, which acts as a cache, and records it in the redo log; a background thread later synchronizes the Buffer Pool with the disk.

Role: Ensures transaction durability. It guards against the case where dirty pages have not yet been written to disk when a failure occurs. When the MySQL service restarts, the changes are redone from the redo log, which is how transaction durability is achieved.

3.2 undo log (rollback log)

Content: A log in logical format. Executing an undo only restores the data logically to its state before the transaction, rather than operating on physical pages. This is what distinguishes it from the redo log.

Function: It saves the version of the data from before the transaction, which can be used for rollback. It also supports multi-version concurrency control (MVCC) reads, that is, non-locking reads.

3.3 binlog (binary log, also called the archive log)

Content: A log in logical format, which can be thought of simply as the SQL statements executed in the transaction.
It is more than just the SQL statements, though: it also carries the information needed to reverse each executed statement (insert, delete, update). A delete records the delete itself and enough to derive its reverse insert; an update records the row versions before and after the update; an insert records the insert itself and enough to derive its reverse delete.

Function: Used for replication. In master-slave replication, the slave replays the master's binlog to stay synchronized with it.
It is also used for point-in-time recovery of databases.

The binlog has three formats: Statement (statement-based replication), Row (row-based replication), and Mixed (mixed mode).
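The two uses of the binlog, replay and reversal, can be illustrated with a toy model of row-format events. This is only a sketch under simplifying assumptions: the tuple layout, the `BINLOG` list, and the functions `replay` and `reverse` are invented for illustration and do not reflect MySQL's real binlog format or tooling.

```python
# Toy row-format events: (position, operation, key, before_image, after_image).
BINLOG = [
    (1, "insert", "a", None, 10),
    (2, "insert", "b", None, 20),
    (3, "update", "a", 10, 15),
    (4, "delete", "b", 20, None),
]

def replay(events, stop_position=None):
    """Rebuild a table by applying events in order.

    With stop_position set, replay stops after that position, which is the
    idea behind point-in-time recovery.
    """
    table = {}
    for position, op, key, before, after in events:
        if stop_position is not None and position > stop_position:
            break
        if op == "delete":
            del table[key]
        else:  # insert and update both install the after image
            table[key] = after
    return table

def reverse(event):
    """Derive the inverse event by flipping the operation and swapping images."""
    position, op, key, before, after = event
    flipped = {"insert": "delete", "delete": "insert", "update": "update"}[op]
    return (position, flipped, key, after, before)
```

A replica that replays the whole list ends up with the same table as the source, while `replay(BINLOG, stop_position=2)` yields the table as it was after the second event.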

4. How to implement transactions

The implementation of transactions relies on InnoDB's redo log, undo log, and locks.

4.1 Atomicity

Atomicity is implemented mainly through the undo log. It matters most when an error occurs while SQL is executing and a rollback follows. A rollback returns the data to its state before execution, and the only way to do that is to have recorded that earlier state. So whether the rollback is triggered by a system error or by an explicit rollback operation, the undo log holds the information needed to restore the data to the state it was in before it was modified.
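The idea of recording the old value before each change can be sketched in a few lines. This is a toy model over a Python dict, not InnoDB's actual undo log format; the class `UndoTransaction` and its methods are hypothetical names.

```python
class UndoTransaction:
    """Toy transaction over a dict that records old values so writes can be undone."""

    _MISSING = object()  # marks "the key did not exist before the write"

    def __init__(self, table):
        self.table = table
        self.undo_log = []  # (key, previous value) pairs, oldest first

    def write(self, key, value):
        # Record the old value first, then modify the data.
        self.undo_log.append((key, self.table.get(key, self._MISSING)))
        self.table[key] = value

    def rollback(self):
        # Apply undo records newest first, so the oldest saved value wins.
        while self.undo_log:
            key, old = self.undo_log.pop()
            if old is self._MISSING:
                self.table.pop(key, None)
            else:
                self.table[key] = old

    def commit(self):
        # The changes stay; this toy simply drops the undo records.
        self.undo_log.clear()
```

After `write("x", 2)`, `write("x", 3)`, and `write("y", 9)` on a table that started as `{"x": 1}`, calling `rollback()` restores `{"x": 1}`.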

4.2 Durability

We need to first understand how InnoDB reads and writes data. We know that database data is stored on disk, but the cost of disk I/O is very high. If the disk needs to be accessed every time to read and write data, the efficiency of the database will be very low. In order to solve this problem, InnoDB provides Buffer Pool as a buffer for accessing database data.

The Buffer Pool is located in memory and caches copies of some of the data pages on disk. When data needs to be read, InnoDB first tries to read it from the Buffer Pool; if the page is not there, InnoDB reads it from disk and puts it into the Buffer Pool. When data is written, InnoDB first writes to the page in the Buffer Pool, marks the page as dirty, and puts it on a dedicated flush list. These modified pages are flushed to disk at a later time (this process is called flushing dirty pages and is handled by background threads).

As described above, InnoDB uses the Buffer Pool to improve read and write performance. However, the Buffer Pool lives in memory and is volatile. If MySQL crashes right after a transaction commits, and the modified data in the Buffer Pool has not yet been flushed to disk, the data is lost and the durability of the transaction cannot be guaranteed. To solve this problem, InnoDB introduced the redo log to make data modifications durable. It follows the write-ahead logging (WAL) approach: the log is written first, and the data pages are written to disk afterwards. With the redo log, InnoDB can ensure that committed records are not lost even if the database restarts abnormally. This capability is called crash-safe.
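The write-ahead idea can be sketched with a toy engine. This is a simplified model under stated assumptions: the class `ToyEngine` and all of its attributes are hypothetical, only one transaction is open at a time, background flushing of dirty pages is left out, and real redo records describe physical page changes rather than whole values.

```python
class ToyEngine:
    """Toy storage engine that separates volatile memory from durable disk."""

    def __init__(self):
        self.disk_pages = {}   # durable: data pages on disk
        self.redo_file = []    # durable: append-only redo log file
        self.buffer_pool = {}  # volatile: cached (dirty) pages
        self.redo_buffer = []  # volatile: redo records of the open transaction

    def write(self, page, value):
        self.redo_buffer.append((page, value))
        self.buffer_pool[page] = value  # the page is now dirty

    def commit(self):
        # WAL: the log reaches durable storage before the data pages do.
        self.redo_file.extend(self.redo_buffer)
        self.redo_buffer.clear()

    def crash(self):
        # Everything held in memory is lost.
        self.buffer_pool.clear()
        self.redo_buffer.clear()

    def recover(self):
        # Redo every committed change in log order.
        for page, value in self.redo_file:
            self.disk_pages[page] = value
```

If one transaction writes a page and commits, a second writes another page without committing, and the engine then crashes, `recover()` restores only the committed page.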

4.3 Isolation

Database isolation is achieved through locking and MVCC.
Under the SQL standard, the Repeatable Read isolation level still allows phantom reads. MySQL's default isolation level is Repeatable Read, but InnoDB also addresses phantom reads at this level: snapshot reads are served by MVCC, and locking reads are protected by gap locks (next-key locks). Simply put, MySQL's default isolation level handles dirty reads, non-repeatable reads, and, in most cases, phantom reads.

There are three database concurrency scenarios:
Read-read: there are no problems and no concurrency control is required.
Read-write: there are thread safety issues, which may cause transaction isolation problems such as dirty reads, phantom reads, and non-repeatable reads.
Write-write: there are thread safety issues, and updates may be lost, for example the first and second types of lost update.
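The write-write case can be made concrete with a small sketch of a lost update and of the same work done under a lock. The `Row` class and both functions are hypothetical; the unsafe interleaving is replayed deterministically instead of depending on thread timing, and a `threading.Lock` stands in for a row lock.

```python
import threading

class Row:
    """A single value plus a lock that stands in for a row lock."""

    def __init__(self, value):
        self.value = value
        self.lock = threading.Lock()

def add_without_lock_interleaved(row, amounts):
    """Replay the bad schedule: every transaction reads before any writes."""
    reads = [row.value for _ in amounts]      # all transactions read the old value
    for seen, amount in zip(reads, amounts):  # each writes from its stale read
        row.value = seen + amount             # later writes overwrite earlier ones

def add_with_lock(row, amount):
    """Read and write as one unit, so no other writer can slip in between."""
    with row.lock:
        row.value = row.value + amount
```

Starting from 100, the interleaved version applied to `[10, 20]` leaves 120 because the first update is overwritten, while two threads calling `add_with_lock` with 10 and 20 always leave 130.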

Write-write safety is achieved through locking, but locking every operation would seriously hurt the performance and concurrency of the database, so MVCC (multi-version concurrency control) was introduced. MVCC is a lock-free form of concurrency control for resolving read-write conflicts, where the read is a snapshot read. Its implementation relies mainly on three hidden fields in each record (DB_TRX_ID, DB_ROLL_PTR, and DB_ROW_ID), the undo log, and the Read View. MVCC solves the following problems for the database:

  • When the database is read and written concurrently, reads do not block writes and writes do not block reads, which improves the performance of concurrent reads and writes.
  • For snapshot reads, it also solves transaction isolation problems such as dirty reads, non-repeatable reads, and phantom reads, but it cannot solve the lost update problem.
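How a snapshot read picks a row version can be sketched as follows. This is a simplified model of the visibility check, not InnoDB's actual code: `Version`, `ReadView`, `snapshot_read`, and all field names are hypothetical, `trx_id` plays the part of the hidden transaction id field, and `prev` plays the part of the rollback pointer into the undo log.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    value: str
    trx_id: int                       # transaction that wrote this version
    prev: Optional["Version"] = None  # older version, reached through the undo log

@dataclass
class ReadView:
    creator_trx_id: int    # the transaction that owns this view
    active_ids: frozenset  # transactions still uncommitted when the view was taken
    next_trx_id: int       # first transaction id not yet assigned at that time

    def sees(self, trx_id):
        if trx_id == self.creator_trx_id:
            return True   # a transaction always sees its own changes
        if trx_id >= self.next_trx_id:
            return False  # written by a transaction that started after the view
        return trx_id not in self.active_ids  # visible only if already committed

def snapshot_read(latest, view):
    """Walk the version chain from newest to oldest until a visible version is found."""
    version = latest
    while version is not None and not view.sees(version.trx_id):
        version = version.prev
    return None if version is None else version.value
```

Because the reader only follows the version chain and never waits for a lock, a concurrent writer is not blocked. Under Repeatable Read the same view is reused for the whole transaction, which is why repeated snapshot reads return the same result.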

5. Summary

The redo log, the undo log, and locking are the basis for implementing transactions.

  • The atomicity of transactions is achieved through undo log;
  • Transaction durability is achieved through redo log;
  • Transaction isolation is achieved through (read-write lock + MVCC);
  • Transaction consistency is achieved through atomicity, durability, and isolation.

In short, ACID is just a concept. Atomicity, durability, and isolation are all for achieving data consistency. The ultimate goal of transactions is to ensure data consistency.

Origin blog.csdn.net/weixin_42774617/article/details/132217209