MySQL's redolog, binlog, undolog and MVCC

Background knowledge

Important concepts:

Logical log: Can be loosely understood as a record of the SQL statements that were executed.
Physical log: Records the actual changes made to the data.
Crash-safe: When the database hits an extreme situation such as a crash or power outage, it can recover the data that existed only in memory and had not yet been flushed to disk.
WAL: Write-ahead logging. The log is written first, and the data itself is written to disk afterwards.

When InnoDB updates data, it first loads the data into the Buffer Pool in memory, modifies it there, and flushes it to disk at a suitable later time. If the server crashes or loses power before that flush and there is no log, the update is lost, because the modified data never reached the disk.
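The idea can be sketched with a toy model. This is only an illustration under simplifying assumptions, not how InnoDB is implemented: `ToyEngine` and all of its attributes are hypothetical names, "disk" and "log" are plain Python containers standing in for durable storage, and a crash is simulated by clearing the in-memory buffer pool.

```python
class ToyEngine:
    """Toy model: 'disk' and 'log' survive a crash, 'buffer_pool' does not."""

    def __init__(self):
        self.disk = {}         # durable data
        self.log = []          # durable, append-only log records
        self.buffer_pool = {}  # in-memory copies, lost on a crash

    def update(self, key, value):
        self.log.append((key, value))  # 1. write the log first (WAL)
        self.buffer_pool[key] = value  # 2. then change the data in memory

    def flush(self):
        # Write the modified in-memory data back to disk at some later time.
        self.disk.update(self.buffer_pool)

    def crash(self):
        self.buffer_pool = {}  # everything held only in memory is gone

    def recover(self):
        # Replay the log in order so that disk reflects every logged update.
        for key, value in self.log:
            self.disk[key] = value


engine = ToyEngine()
engine.update("balance", 100)
engine.crash()    # the modified data never reached disk
engine.recover()  # but the log did, so the update can be restored
print(engine.disk)  # {'balance': 100}
```

Without the `self.log.append(...)` step, `recover()` would have nothing to replay and the update would be lost, which is exactly the failure described above.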

Binlog

Binlog is a logical log, mainly used for master-slave replication and data recovery. For master-slave replication, binlog is enabled on the master, the binlog is copied to the slave, and the slave replays it.

Redolog

Redolog is a physical log. InnoDB exchanges data with the disk in units of pages, so writing out a whole page when only a few field values have changed is too time-consuming. Redolog records only the changes themselves and is written with sequential IO, which is much faster than random IO. When the database has to recover from a crash, it reads the log directly. Redolog takes up very little space, and each update generates several redo log records describing what was changed.

Just as InnoDB introduces the Buffer Pool to work around slow disk access, redolog has its own buffer, the log buffer, which is divided into redolog blocks. The default size of the log buffer is 16MB (innodb_log_buffer_size in MySQL 5.7 and 8.0). Redo log records are written to the log buffer first, and the log buffer is flushed to disk in the following situations:

  1. The log buffer is about 50% full
  2. A transaction commits
  3. The server is shut down gracefully
  4. A checkpoint is taken
  5. A background thread flushes the log buffer to disk about once per second
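Two of these triggers (the half-full threshold and transaction commit) can be sketched as follows. This is a minimal toy under stated assumptions: `ToyLogBuffer`, its `capacity` counted in records rather than bytes, and the list `on_disk` are all hypothetical. In a real server, whether a commit forces a flush also depends on the innodb_flush_log_at_trx_commit setting, which this sketch ignores.

```python
class ToyLogBuffer:
    """Toy log buffer that flushes when half full or on commit."""

    def __init__(self, capacity=16):
        self.capacity = capacity  # hypothetical size, counted in records
        self.pending = []         # records still only in memory
        self.on_disk = []         # records already flushed

    def flush(self):
        self.on_disk.extend(self.pending)
        self.pending.clear()

    def append(self, record):
        self.pending.append(record)
        # Trigger 1: the buffer has reached 50% of its capacity.
        if len(self.pending) * 2 >= self.capacity:
            self.flush()

    def commit(self):
        # Trigger 2: a transaction commits.
        self.flush()


buf = ToyLogBuffer(capacity=4)
buf.append("r1")  # 1 of 4 slots used, stays in memory
buf.append("r2")  # 2 of 4 slots used, half full, flushed
buf.append("r3")  # stays in memory until the next trigger
buf.commit()      # commit flushes the remainder
print(buf.on_disk)  # ['r1', 'r2', 'r3']
```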

Two-phase commit:

Writing redolog is split into two phases, prepare and commit, in order to keep redolog and binlog consistent with each other.

  1. Update the data in the Buffer Pool
  2. Write redolog and mark it as being in the prepare state
  3. Write binlog
  4. Set redolog to the commit state

With this scheme, if a redolog record is found in the prepare state during crash recovery, the server checks whether binlog contains the corresponding log. If it does, the state is changed to commit; otherwise the redolog record is treated as invalid and the transaction is rolled back.
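The recovery decision described above can be written as a small function. This is a sketch under simplifying assumptions: the function name `recover`, the dictionary of redolog states, and the set of transaction ids found in binlog are hypothetical stand-ins for the real on-disk structures.

```python
def recover(redo_states, binlog_xids):
    """Decide what happens to each transaction found in redolog after a crash.

    redo_states: dict mapping a transaction id to "prepare" or "commit".
    binlog_xids: set of transaction ids whose log is present in binlog.
    """
    decisions = {}
    for xid, state in redo_states.items():
        if state == "commit":
            decisions[xid] = "commit"    # both phases finished
        elif xid in binlog_xids:
            decisions[xid] = "commit"    # prepared, and binlog has it
        else:
            decisions[xid] = "rollback"  # prepared, but binlog lacks it
    return decisions


redo_states = {1: "commit", 2: "prepare", 3: "prepare"}
binlog_xids = {1, 2}
print(recover(redo_states, binlog_xids))
# {1: 'commit', 2: 'commit', 3: 'rollback'}
```

Transaction 3 crashed between steps 2 and 3, so neither the slave (which replays binlog) nor the master ends up with it, and the two logs stay consistent.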

Redolog and binlog comparison

  1. Redolog is implemented by the InnoDB storage engine. Binlog is implemented by the server layer.
  2. Redolog is written circularly, reusing a fixed amount of space. Binlog is append-only.
  3. Redolog is suited to crash recovery. Binlog is suited to master-slave replication and data recovery.
  4. Redolog is a physical log. Binlog is a logical log.

Undolog

Like redolog, undolog is specific to InnoDB, but it is a logical log. Every time a record is changed, an undo log entry describing the inverse operation is recorded. Each undo log entry has a roll_pointer, and these pointers link the entries into a version chain that is used for rollback.
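A version chain of this kind can be modelled as a singly linked list. The sketch below is illustrative only: the `Version` class and the `update` and `rollback` helpers are hypothetical, and each node stores a full old value, whereas a real undo log entry stores the information needed to reverse the change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    value: str
    trx_id: int                               # transaction that wrote this version
    roll_pointer: Optional["Version"] = None  # link to the previous version


def update(current, new_value, trx_id):
    """Create a new version whose roll_pointer points at the old one."""
    return Version(new_value, trx_id, roll_pointer=current)


def rollback(current):
    """Undo the latest change by following roll_pointer one step back."""
    return current.roll_pointer


v1 = Version("a", trx_id=10)
v2 = update(v1, "b", trx_id=20)
v3 = update(v2, "c", trx_id=30)

restored = rollback(v3)
print(restored.value, restored.trx_id)  # b 20
```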

MVCC

Prerequisite knowledge:

Each record has two hidden columns:

  1. Trx_id: the id of the transaction that most recently modified this record
  2. Roll_pointer: links the version chain. When the record is updated, the previous version (including its roll_pointer) is written to the undo log and the record's roll_pointer is set to point at that undo log entry, so the earlier version can be found and rolled back to.

Readview

A readview is a "consistency view". Each transaction has its own independent readview. Its purpose is to decide whether the current transaction may read a given version of a record. It consists of four attributes:

  1. M_ids: the set of ids of the transactions that were active when the readview was created.
  2. Min_trx_id: the smallest active transaction id when the readview was created, that is, the minimum value in M_ids.
  3. Max_trx_id: the id that will be assigned to the next transaction. This is not the largest active transaction id; it is the largest transaction id allocated so far (that transaction is not necessarily active and may already have ended) plus 1.
  4. Creator_trx_id: the id of the transaction that created the readview, that is, the owner of the readview.

Once a transaction T has a readview, it decides whether a given version of a record can be read as follows:

  1. The record's trx_id = creator_trx_id. This version was modified by transaction T itself and can be read.
  2. The record's trx_id < min_trx_id. The modifying transaction had committed before the readview was created, so the version can be read.
  3. min_trx_id <= the record's trx_id < max_trx_id. In this case m_ids has to be searched. If trx_id is in m_ids, the transaction that modified the record had not committed when the readview was created. Under both read committed and repeatable read this version must not be read, and the transaction has to step back to an older version through the version chain. Otherwise, the modifying transaction had already committed and the version can be read.
  4. The record's trx_id >= max_trx_id. This version was written after the readview was created, so it cannot be read. (This situation is essentially specific to the repeatable read isolation level: under read committed a readview is generated for every read, so trx_id >= max_trx_id is not normally observed.)

Following the version chain, we only need to step back to the first version that the rules above allow us to read.
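The four rules and the walk along the version chain can be sketched as follows. The attribute names follow the readview description above, but the `ReadView` class, the `read` helper, and the representation of the version chain as a list of `(trx_id, value)` pairs ordered newest first are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadView:
    m_ids: frozenset     # ids of transactions active at creation time
    min_trx_id: int      # smallest id in m_ids
    max_trx_id: int      # id that the next transaction will receive
    creator_trx_id: int  # id of the transaction that owns this readview

    def is_visible(self, trx_id):
        if trx_id == self.creator_trx_id:  # rule 1: our own change
            return True
        if trx_id < self.min_trx_id:       # rule 2: committed before the view
            return True
        if trx_id >= self.max_trx_id:      # rule 4: written after the view
            return False
        return trx_id not in self.m_ids    # rule 3: visible only if committed


def read(version_chain, view):
    """Return the value of the newest version that the readview may see."""
    for trx_id, value in version_chain:
        if view.is_visible(trx_id):
            return value
    return None  # no version is visible to this readview


view = ReadView(m_ids=frozenset({5, 7}), min_trx_id=5,
                max_trx_id=9, creator_trx_id=7)
chain = [(9, "d"), (5, "c"), (6, "b"), (3, "a")]  # newest first
print(read(chain, view))  # b
```

Here version "d" is skipped by rule 4, version "c" is skipped because transaction 5 is still in m_ids, and version "b" is returned because transaction 6 falls in the middle range but is not in m_ids.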

Note that read committed and repeatable read have different isolation requirements, so they also create the readview at different times:

  • Read committed: the data read must have been committed by other transactions, never an uncommitted update, so a new readview is created every time data is read.

  • Repeatable read: every read must return data consistent with the first read, so a readview is created only when data is read for the first time.

A readview is created only for read operations. An update modifies the record directly, adds an undo log entry, and extends the version chain.
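The difference in timing between the two isolation levels can be sketched like this. It is a toy under stated assumptions: the `Transaction` class and `readview_for_select` method are hypothetical, and the readview itself is replaced by a numbered string produced by `make_readview` so that it is easy to see when a new one is created.

```python
import itertools


class Transaction:
    """Toy transaction that decides when to create a new readview."""

    def __init__(self, isolation, make_readview):
        self.isolation = isolation
        self.make_readview = make_readview
        self.readview = None

    def readview_for_select(self):
        # Read committed: create a fresh readview for every read.
        # Repeatable read: create one on the first read, then reuse it.
        if self.isolation == "READ COMMITTED" or self.readview is None:
            self.readview = self.make_readview()
        return self.readview


_counter = itertools.count(1)


def make_readview():
    # Stand-in for building a real readview: just a numbered label.
    return f"readview#{next(_counter)}"


rc = Transaction("READ COMMITTED", make_readview)
rr = Transaction("REPEATABLE READ", make_readview)

rc_views = [rc.readview_for_select(), rc.readview_for_select()]
rr_views = [rr.readview_for_select(), rr.readview_for_select()]
print(rc_views)  # ['readview#1', 'readview#2']
print(rr_views)  # ['readview#3', 'readview#3']
```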

Reposted from: blog.csdn.net/hesorchen/article/details/124012764