ACID characteristics and implementation of InnoDB transactions

1. ACID characteristics

2. Redo log

3. Undo log

4. The 4 transaction isolation levels


1. ACID characteristics

1.1 Atomicity

    A transaction can be composed of a single simple SQL statement or a set of complex SQL statements. Atomicity requires that either all of the operations in a transaction are performed or none of them are: if any SQL statement in the transaction fails, the statements that have already executed successfully must also be undone, and the database must return to the state it was in before the transaction started. Atomicity is achieved by redo logs.

1.2 Consistency

    Transaction consistency means that the database moves from one consistent state to the next consistent state: the integrity constraints of the database are not violated either before the transaction starts or after it ends. Consistency is usually achieved by undo logs.

1.3 Isolation

    Isolation requires that the objects read and written by each transaction be separated from the objects operated on by other transactions; that is, a transaction's changes are invisible to other transactions until it commits. Isolation is usually achieved by using locks.

1.4 Durability

    Durability requires that once a transaction is committed, its result is permanent: the database can recover the data even after a failure such as a crash. Durability is usually achieved using redo logs.

2. Redo log

    The redo log is used to achieve transaction durability. It consists of two parts: the redo log buffer in memory, which is volatile, and the redo log file on disk, which is persistent. Each time a transaction commits, all of the transaction's log records must be written to the redo log file and persisted before the commit is considered complete. Each time the redo log buffer is written to the redo log file, the InnoDB engine needs to call fsync, because the write first lands in the file system cache; to guarantee that the redo log actually reaches disk, an fsync operation must be performed.
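    For reference, MySQL controls this flush-at-commit behavior through the innodb_flush_log_at_trx_commit server variable (a real variable; changing it globally requires sufficient privileges):

SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
# 1 (default): write the buffer and fsync the file at every commit -- full durability
# 0: write and fsync roughly once per second -- up to about a second of transactions may be lost on a crash
# 2: write to the file system cache at every commit, fsync roughly once per second
SET GLOBAL innodb_flush_log_at_trx_commit = 1;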

2.1 Log block

    Redo logs are stored in 512-byte units; that is, both the redo log buffer and the redo log files are organized in blocks, called redo log blocks. If the redo log generated for a page is larger than 512 bytes, it must be split across multiple redo log blocks for storage. Since a redo log block is the same size as a disk sector, both being 512 bytes, redo log writes are guaranteed to be atomic and do not require a doublewrite buffer.

    A log block is composed of three parts: the log block header, the log block tailer, and the log body itself. The header occupies 12 bytes and the tailer occupies 8 bytes, so the actual log capacity of each block is 492 bytes.

2.2 LSN

    LSN is the abbreviation of log sequence number. In the InnoDB storage engine, the LSN occupies 8 bytes and increases monotonically. The LSN can represent:

  • The total amount of redo log written
  • The location of the checkpoint
  • The version of a page

    The LSN is recorded not only in the redo log but also in every page. At the head of each page there is a field, FIL_PAGE_LSN, which records the LSN of the page, i.e., the LSN at the time the page was last flushed. Because the redo log records the changes made to each page, the LSN in the page is used to determine whether the page needs to be recovered.
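    The current LSN, the LSN flushed to the redo log file, and the checkpoint LSN can all be observed in the LOG section of SHOW ENGINE INNODB STATUS (the exact labels vary slightly between MySQL versions):

SHOW ENGINE INNODB STATUS\G
# the LOG section contains lines such as:
#   Log sequence number ...  # total redo log generated so far
#   Log flushed up to   ...  # LSN persisted to the redo log file
#   Last checkpoint at  ...  # all pages are flushed up to at least this LSN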

2.3 Recovery

    When the database starts, it always attempts a recovery operation, regardless of whether it was shut down cleanly the last time it ran. For example, if the LSN of page P1 is 1000 and, at startup, the LSN recorded in the redo log is detected to be 1300 with the corresponding transaction committed, then the database needs to perform recovery and apply the redo log to this page. Conversely, redo log records whose LSN is less than the LSN of page P1 do not need to be replayed, because the LSN of page P1 indicates that the page has already been flushed up to that point.

3. Undo log

    The undo log is used to support transaction rollback and the MVCC mechanism. When InnoDB rolls back, it actually performs the logical opposite of the earlier work: for each INSERT, the engine executes a DELETE; for each DELETE, it executes an INSERT; and for each UPDATE, it executes a reverse UPDATE.
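    A minimal sketch of this logical rollback, assuming a hypothetical table t(id, val):

START TRANSACTION;
INSERT INTO t (id, val) VALUES (1, 'a');  # undo log records a compensating DELETE
UPDATE t SET val = 'b' WHERE id = 1;      # undo log records the old value 'a'
ROLLBACK;                                 # undo records are replayed in reverse:
                                          # reverse UPDATE first, then DELETE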

3.1 Undo storage management

    While the redo log is stored in the redo log files, undo is stored in a special segment inside the database, called the undo segment; undo segments are located in the tablespace. The InnoDB storage engine has rollback segments, and each rollback segment manages 1024 undo log segments. Before InnoDB 1.0 there was only one rollback segment, so the number of concurrent online transactions was limited to 1024. Starting from InnoDB 1.1, up to 128 rollback segments are supported, raising the limit on concurrent online transactions to 128 * 1024. It should be noted that writing the undo log itself also requires writing the redo log. When an InnoDB transaction commits, it does two things:

  • Put the undo log on a linked list for later purge operations;
  • Determine whether the page where the undo log resides can be reused; if so, allocate it to the next transaction.

    After a transaction commits, the undo log and the page it resides on cannot be deleted immediately, because other transactions may still need the undo log to obtain a previous version of a row record. Therefore, at commit time the undo log is put on a linked list, and whether it can finally be deleted is decided by the purge thread.

3.2 Undo log format

    In InnoDB, undo log is divided into:

  • insert undo log: generated by INSERT operations. Because an inserted record is visible only to the transaction itself, this undo log can be deleted directly after the transaction commits, without a purge operation.
  • update undo log: generated by DELETE and UPDATE operations. This undo log may still be needed by the MVCC mechanism, so it cannot be deleted when the transaction commits; it is placed on the undo log list and waits for the purge thread.

3.3 Purge

    The purge operation finally completes earlier DELETE and UPDATE operations. For example, a DELETE initially only marks the record as deleted; the record is actually removed only when the purge thread determines that no other transaction still needs its old version through the undo log.
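    Purge work is handled by dedicated background threads, and the backlog of undo still waiting to be purged can be observed as the history list length (both items below are real and observable; the exact output wording varies by MySQL version):

SHOW VARIABLES LIKE 'innodb_purge_threads';  # number of purge threads
SHOW ENGINE INNODB STATUS\G                  # "History list length" in the TRANSACTIONS
                                             # section counts unpurged undo logs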

4. The 4 transaction isolation levels

4.1 Problems that occur under different isolation levels

    Before introducing the four transaction isolation levels, we first introduce a few concepts: dirty read, non-repeatable read, and phantom read.

  • Dirty read:

    Dirty data refers to a row record in the buffer pool that has been modified by a transaction but not yet committed. A dirty read means reading such dirty data, i.e., one transaction reads another transaction's uncommitted data, which obviously violates the isolation of the database.

  • Non-repeatable read:

    A non-repeatable read occurs when the same data set is read multiple times within one transaction: before the first transaction ends, another transaction accesses the same data set and performs some DML operations on it. As a result, the two reads in the first transaction may return different data because of the other transaction's modifications. This situation, where the same data read twice within one transaction differs, is called a non-repeatable read. Non-repeatable reads are mainly about DELETE and UPDATE operations.

  • Phantom Problem:

    A phantom read means that, within the same transaction, two consecutive executions of the same SQL statement may produce different results, with the second execution possibly returning rows that did not exist before. Phantom reads are mainly about INSERT operations.

4.2 InnoDB lock types

    InnoDB's isolation is achieved mainly through locks and the MVCC mechanism; different row lock algorithms and MVCC strategies combine to implement the different isolation levels. InnoDB has two main lock types, shared locks (S locks) and exclusive locks (X locks), and the following row lock algorithms: record lock, gap lock, and next-key lock.

  • Shared lock: allows a transaction to read a row of data. A SELECT adds a shared lock only when it explicitly asks for one; by default it adds no lock.
  • Exclusive lock: allows a transaction to delete or update a row of data. Among the four kinds of operations (insert, delete, update, select), insert, delete, and update all add exclusive locks (X locks).
  • Consistent non-locking read: InnoDB reads the current version of a row through MVCC. MVCC stands for Multi-Version Concurrency Control, a technique under which a row record may have more than one snapshot version. If the row being read is undergoing a DELETE or UPDATE, the read does not wait for the row lock to be released; instead, the InnoDB engine reads a snapshot of the row, which is why this is called a consistent non-locking read. Under both the read committed and repeatable read isolation levels, InnoDB uses consistent non-locking reads, but the definition of the snapshot differs: under read committed, the snapshot read is always the latest committed version of the locked row; under repeatable read, the snapshot read is the version of the row as of the start of the transaction.
  • Consistent locking read: by default, InnoDB's SELECT uses consistent non-locking reads, but in some cases the user needs to explicitly lock the rows being read to guarantee the logical consistency of the data, which requires the database to support locking statements for read-only SELECT operations. InnoDB supports two consistent locking read operations for SELECT:
select ... for update         # adds an X lock on the rows read
select ... lock in share mode # adds an S lock on the rows read
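    These locks are held until the transaction ends, so locking reads are only meaningful inside an explicit transaction; a minimal usage sketch, assuming a hypothetical table t(id, val):

START TRANSACTION;
SELECT * FROM t WHERE id = 1 FOR UPDATE;  # X lock held until commit/rollback
UPDATE t SET val = 'c' WHERE id = 1;
COMMIT;                                   # lock released here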
  • record lock: locks a single row record.
  • gap lock: locks a range in the index, but not the record itself.
  • next-key lock: gap lock + record lock. InnoDB row queries use this locking algorithm by default. For example, if an index contains the four values 10, 11, 13, and 20, next-key locking may lock the intervals (-∞, 10], (10, 11], (11, 13], (13, 20], (20, +∞). The next-key lock is designed to solve the phantom read problem: with this locking technique, what is locked is not a single value but a range. However, when the queried index has a unique attribute, InnoDB optimizes the lock and downgrades it to a record lock, i.e., it locks only the record itself, not the range. For example, suppose a table z contains the following 5 rows: (1, 1), (3, 1), (5, 3), (7, 6), (10, 8), where the first column a is the clustered index and the second column b is a secondary index, and we execute the SQL statement:
select * from z where b = 3 for update

    For this query, on the clustered index, only a record lock is added on the row where column a equals 5. On the secondary index, a next-key lock is added covering the range (1, 3], and in addition InnoDB adds a gap lock on the gap before the next secondary index value, so the range (3, 6) is locked as well. Therefore, each of the following SQL statements, run in a new session B, will block:

select * from z where a = 5 lock in share mode # blocked: session A holds an X record lock on the clustered index row a = 5
insert into z select 4,2                       # blocked: b = 2 falls inside the next-key lock range (1, 3]
insert into z select 6,5                       # blocked: b = 5 falls inside the gap lock range (3, 6)

    In the above example, the purpose of the gap lock is to prevent multiple transactions from inserting records into the same range, which would lead to phantom read problems. In the InnoDB storage engine, for an INSERT operation, the engine checks whether the record following the insert position is locked; if it is already locked, the insert is not allowed to proceed.
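    For reference, the example above can be reproduced with a setup like the following (the table and column names come from the text; the index name is assumed):

CREATE TABLE z (
    a INT PRIMARY KEY,   # clustered index
    b INT,
    KEY idx_b (b)        # secondary index
) ENGINE = InnoDB;
INSERT INTO z VALUES (1, 1), (3, 1), (5, 3), (7, 6), (10, 8);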

  • Lock escalation: the InnoDB storage engine does not have a lock escalation problem, because it does not generate locks per record. Instead, it manages the locks each transaction holds on each page it accesses, using a bitmap, so the overhead is usually the same whether a transaction locks one record or many records on a page.

4.3 InnoDB's 4 isolation levels and implementation

  • Read uncommitted

    Meaning: at this isolation level, SELECT statements take no locks, and a transaction can see the uncommitted results of all other transactions. This isolation level is rarely used in practice, because its performance is not much better than that of the other levels. Reading uncommitted data is also known as a dirty read (Dirty Read).

    Implementation: reads take no locks; writes take exclusive locks. If every write operation adds an exclusive lock, how can uncommitted data be read at all? Recall the earlier explanation of exclusive locks: an exclusive lock prevents other transactions from adding read or write locks to the locked data, but it has no effect on reads that take no locks.
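    A minimal sketch of a dirty read, assuming a hypothetical table t(id, val) that contains the row (1, 'a'):

# session A:
START TRANSACTION;
UPDATE t SET val = 'x' WHERE id = 1;  # X lock taken, not yet committed

# session B:
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT val FROM t WHERE id = 1;       # returns 'x' -- a dirty read

# session A:
ROLLBACK;                             # the 'x' that session B read never took effect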

  • Read committed

    Meaning: this is the default isolation level of most database systems (but not MySQL's default). It satisfies the simple definition of isolation: a transaction can only see changes made by transactions that have already committed. This isolation level still suffers from so-called non-repeatable reads (Nonrepeatable Read): because other transactions may commit new changes while this transaction is running, the same SELECT may return different results.

    Implementation: writes take exclusive locks; reads take no locks and instead use the MVCC mechanism. At the read committed level, therefore, each read obtains the latest committed snapshot of the data through MVCC, taking no locks and being blocked by no locks (since the snapshot is reconstructed from history, no lock can be held on it). However, this still leaves the non-repeatable read and phantom read problems. The key is the timing of MVCC version generation: at this level, a new version (read view) is generated at every SELECT. This means that if we perform multiple SELECTs in transaction A, and other transactions update the data we read and commit between those SELECTs, non-repeatable reads occur: the data read twice is inconsistent. This is also the root cause of the classic overselling problem.
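    A minimal sketch of a non-repeatable read, again assuming the hypothetical table t(id, val) with the row (1, 'a'):

# session A:
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT val FROM t WHERE id = 1;  # returns 'a'

# session B:
UPDATE t SET val = 'b' WHERE id = 1;
COMMIT;

# session A, same transaction:
SELECT val FROM t WHERE id = 1;  # returns 'b' -- a non-repeatable read
COMMIT;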

  • Repeatable read

    Meaning: this is MySQL's default transaction isolation level. It ensures that multiple reads within the same transaction see the same data rows even under concurrent access. In theory, this still allows another tricky problem, the phantom read (Phantom Read): when a user reads a range of rows, another transaction inserts a new row into that range, and when the user reads the range again, the new "phantom" row appears. The InnoDB storage engine uses the next-key lock algorithm at the repeatable read isolation level to avoid phantom reads. Therefore, under its default isolation level, InnoDB already fully guarantees the isolation requirements of transactions, i.e., it effectively reaches the SQL standard's serializable isolation level.

    Implementation: writes take exclusive locks; reads use MVCC. The difference from read committed is the timing of MVCC version generation: at this level, the version (read view) is generated only at the first SELECT of the transaction, and all subsequent queries read against that version, which achieves repeatable reads. However, plain snapshot reads alone do not prevent phantom rows from appearing to locking reads and writes; InnoDB solves this with next-key locks.
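    The same sequence as before, run under repeatable read with the hypothetical table t(id, val) holding (1, 'a'), shows the snapshot being reused:

# session A:
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT val FROM t WHERE id = 1;  # returns 'a'; the read view is created at this first read

# session B:
UPDATE t SET val = 'b' WHERE id = 1;
COMMIT;

# session A, same transaction:
SELECT val FROM t WHERE id = 1;  # still returns 'a' -- the first read view is reused
COMMIT;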

  • Serializable

    Meaning: this is the highest isolation level. It solves the phantom read problem by forcing transactions to be ordered so that they cannot conflict with one another; in short, it adds a shared lock to every row of data that is read. At this level, a large number of timeouts and lock contention may occur. Serializable is usually only used in distributed transactions.

    Implementation: writes take exclusive locks, and reads automatically take shared locks; that is, InnoDB implicitly appends LOCK IN SHARE MODE to every SELECT statement. In this case reads hold locks, consistent non-locking reads are no longer supported, and transactions essentially execute serially.
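    A minimal sketch of this behavior, using the same hypothetical table t:

# session A:
SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
SELECT * FROM t WHERE id = 1;  # implicitly behaves like SELECT ... LOCK IN SHARE MODE

# session B:
UPDATE t SET val = 'c' WHERE id = 1;  # blocks until session A commits or rolls back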

Source: blog.csdn.net/MOU_IT/article/details/113730714