The three major MySQL logs: binlog, redo log, undo log

Reprinted from: https://juejin.cn/post/6860252224930070536

For personal backup only; please read the original post.

 

Table of contents

binlog

binlog usage scenarios

Timing of binlog flushing

binlog log format

redo log

Why do we need redo log

Basic concepts of redo log

redo log record form

The difference between redo log and binlog

undo log


Logs are an important part of a MySQL database; they record various kinds of state information while the database is running. MySQL logs mainly include the error log, the query log, the slow query log, the transaction logs, and the binary log. As developers, the ones we need to focus on are the binary log (binlog) and the transaction logs (redo log and undo log). This article introduces these three logs in detail.

binlog

binlog records the write operations (not queries) performed by the database and is stored on disk in binary form. binlog is MySQL's logical log and is recorded by the Server layer, so a MySQL database using any storage engine records binlog.

Logical log: can be simply understood as a record of the SQL statements.

Physical log: because MySQL data is ultimately stored in data pages, the physical log records changes to data pages.

binlog is written by appending. The max_binlog_size parameter sets the size of each binlog file; when a file reaches that size, a new file is created to hold subsequent log entries.
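The append-and-rotate behaviour can be sketched with a toy in-memory model. This is only an illustration, not how MySQL stores binlog: the class RotatingLog and its max_size argument (standing in for max_binlog_size) are hypothetical names, and sizes are counted in bytes of the appended events.

```python
class RotatingLog:
    """Toy append-only log that rolls over to a new file at a size limit."""

    def __init__(self, max_size):
        self.max_size = max_size      # hypothetical stand-in for max_binlog_size
        self.files = [[]]             # each "file" is a list of events
        self.current_size = 0         # bytes written to the newest file

    def append(self, event: bytes):
        # Once the newest file has reached the limit, start a new one.
        if self.current_size >= self.max_size:
            self.files.append([])
            self.current_size = 0
        self.files[-1].append(event)  # events are only ever appended
        self.current_size += len(event)


log = RotatingLog(max_size=10)
for i in range(5):
    log.append(b"evnt" + str(i).encode())   # each event is 5 bytes

events_per_file = [len(f) for f in log.files]
print(events_per_file)
```

With a 10-byte limit and 5-byte events, every file fills up after two events, so the five events end up spread over three files.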

binlog usage scenarios

In practice, binlog has two main usage scenarios: master-slave replication and data recovery.

  1. Master-slave replication: binlog is enabled on the Master, which sends the binlog to each Slave; each Slave replays the binlog to keep master and slave data consistent.
  2. Data recovery: use the mysqlbinlog tool to recover data.

Timing of binlog flushing

For the InnoDB storage engine, binlog is recorded only when a transaction commits. At that point the record is still in memory, so when is binlog flushed to disk? MySQL controls the flush timing with the sync_binlog parameter, whose value ranges from 0 to N:

  • 0: no mandatory requirement; the system decides when to write to disk;
  • 1: binlog is written to disk on every commit;
  • N: binlog is written to disk once every N transactions.

As the above shows, the safest setting for sync_binlog is 1, which is also the default in MySQL 5.7.7 and later. Setting a larger value improves database performance, so in practice you can increase the value appropriately, trading some consistency for better performance.
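The trade-off can be made concrete with a small counting model. This is a sketch under simplifying assumptions, not MySQL's implementation: the function simulate_sync_binlog is a hypothetical name, one commit equals one transaction, and with a value of 0 the model never issues an explicit flush because that is left to the operating system.

```python
def simulate_sync_binlog(commits, sync_binlog):
    """Return (explicit flushes, commits not yet flushed) for a toy model."""
    fsyncs = 0
    unsynced = 0
    for _ in range(commits):
        unsynced += 1
        # A value of 0 means the server never forces a flush itself.
        if sync_binlog > 0 and unsynced >= sync_binlog:
            fsyncs += 1
            unsynced = 0
    return fsyncs, unsynced


for setting in (0, 1, 4):
    print(setting, simulate_sync_binlog(10, setting))
```

For ten commits, a value of 1 costs ten flushes and leaves nothing at risk, while a value of 4 costs two flushes but leaves the last two commits unflushed until the next group completes.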

binlog log format

binlog has three formats: STATEMENT, ROW and MIXED.

Before MySQL 5.7.7 the default format was STATEMENT; from MySQL 5.7.7 onward the default is ROW. The log format is specified with binlog-format.

  • STATEMENT: statement-based replication (SBR). Every SQL statement that modifies data is recorded in the binlog. Advantages: changes to individual rows do not need to be recorded, which reduces the volume of binlog, saves IO, and improves performance. Disadvantages: in some cases master and slave data become inconsistent, for example when executing sysdate(), sleep() and the like.
  • ROW: row-based replication (RBR). It does not record the context of each SQL statement, only which rows were modified. Advantages: it avoids the cases in which calls to stored procedures, functions, or triggers cannot be replicated correctly. Disadvantages: it generates a large volume of logs; alter table in particular makes the log grow sharply.
  • MIXED: a hybrid of STATEMENT and ROW replication (mixed-based replication, MBR). STATEMENT mode is normally used to store binlog, and ROW mode is used for operations that STATEMENT mode cannot replicate.
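Why a time-dependent function can break statement-based replication is easier to see in a toy model. The sketch below is an illustration only: tables are plain dictionaries, make_clock is a hypothetical fake clock standing in for a function such as sysdate(), and the replica is assumed to evaluate the function later than the master.

```python
import itertools


def make_clock(start):
    """Fake clock: each call returns the next integer 'timestamp'."""
    counter = itertools.count(start)
    return lambda: next(counter)


def run_statement(table, clock):
    # The "statement": set row 1 to the current time of whoever executes it.
    table[1] = clock()


master_clock = make_clock(100)
replica_clock = make_clock(200)   # the replica runs the statement later

master = {}
run_statement(master, master_clock)

# STATEMENT format: the statement text is logged and re-executed on the replica.
stmt_replica = {}
run_statement(stmt_replica, replica_clock)

# ROW format: the resulting row image is logged and copied as-is.
row_event = (1, master[1])
row_replica = {}
key, value = row_event
row_replica[key] = value

print(master, stmt_replica, row_replica)
```

The statement-based replica ends up with its own timestamp, while the row-based replica holds exactly the value the master wrote, at the cost of logging every changed row.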

redo log

Why do we need redo log

We all know that one of the four major properties of a transaction is durability: once a transaction commits successfully, its changes to the database are saved permanently and cannot revert to the previous state for any reason. So how does MySQL guarantee durability? The simplest approach is to flush every data page touched by the transaction to disk each time a transaction commits. But that causes serious performance problems, mainly in two respects:

  1. InnoDB interacts with the disk in units of pages, while a transaction may modify only a few bytes within a page. Flushing the whole data page to disk in that case wastes resources.
  2. A transaction may modify several data pages that are not physically contiguous, and writing them with random IO performs poorly.

Therefore MySQL designed the redo log, which records only what changes a transaction made to which data pages. This solves the performance problem neatly (the file is comparatively small and is written with sequential IO).
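A rough back-of-the-envelope comparison shows the size difference. The numbers are illustrative assumptions: 16 KiB is InnoDB's default page size, but the per-record overhead RECORD_OVERHEAD and the three example changes are made up for this sketch and do not reflect the real redo record format.

```python
PAGE_SIZE = 16 * 1024     # InnoDB's default page size in bytes
RECORD_OVERHEAD = 16      # hypothetical header size per redo record

# One transaction touching three pages: (page number, offset, new bytes).
changes = [
    (3, 120, b"abcd"),
    (41, 8, b"xy"),
    (977, 4000, b"12345678"),
]

# Option 1: flush every touched page in full at commit time.
touched_pages = {page for page, _, _ in changes}
full_page_bytes = len(touched_pages) * PAGE_SIZE

# Option 2: log only "which page, where, what changed".
redo_bytes = sum(RECORD_OVERHEAD + len(data) for _, _, data in changes)

print(full_page_bytes, redo_bytes)
```

Under these assumptions, flushing full pages writes 49152 bytes to three scattered locations, while the log records total 62 bytes and can be written sequentially.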

Basic concepts of redo log

redo log consists of two parts: the log buffer in memory (redo log buffer) and the log file on disk (redo log file). Each time MySQL executes a DML statement, it first writes the record to the redo log buffer, and at a later point it writes multiple records to the redo log file at once. This technique of writing the log first and writing to disk later is the WAL (Write-Ahead Logging) technique often mentioned in connection with MySQL.

In a computer operating system, buffer data in user space generally cannot be written directly to disk; it must pass through the operating system's kernel-space buffer (OS Buffer) on the way. Accordingly, writing from the redo log buffer to the redo log file actually means first writing to the OS Buffer and then flushing to the redo log file with the fsync() system call. The path is: redo log buffer, then OS Buffer, then redo log file.

MySQL supports three timings for writing the redo log buffer to the redo log file, configured with the innodb_flush_log_at_trx_commit parameter. The meaning of each value is as follows:

Parameter value | Meaning
0 (delayed write) | On commit, the log in the redo log buffer is not written to the os buffer; instead, once per second it is written to the os buffer and fsync() is called to write it to the redo log file. In other words, with 0 the log is flushed to disk (approximately) every second, and a system crash loses up to 1 second of data.
1 (real-time write, real-time flush) | On every commit, the log in the redo log buffer is written to the os buffer and fsync() is called to flush it to the redo log file. No data is lost even if the system crashes, but because every commit writes to disk, IO performance is poor.
2 (real-time write, delayed flush) | On every commit, the log is only written to the os buffer; then, once per second, fsync() is called to flush the log in the os buffer to the redo log file.
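The three settings can be compared with a toy model of the three layers a log record passes through. Everything here is a simplified sketch: the class RedoLogPath and its methods are hypothetical names, the once-per-second background task is omitted, and the model assumes that records already in the os buffer survive a crash of the MySQL process alone but not a crash of the operating system.

```python
class RedoLogPath:
    """Toy model: redo log buffer -> os buffer -> redo log file."""

    def __init__(self, policy):
        self.policy = policy      # stands in for innodb_flush_log_at_trx_commit
        self.log_buffer = []      # user space, lost if the MySQL process dies
        self.os_buffer = []       # kernel space, lost if the OS dies
        self.log_file = []        # on disk

    def commit(self, record):
        self.log_buffer.append(record)
        if self.policy in (1, 2):
            # write: move records from the log buffer into the os buffer
            self.os_buffer.extend(self.log_buffer)
            self.log_buffer.clear()
        if self.policy == 1:
            # fsync: force the os buffer to disk
            self.log_file.extend(self.os_buffer)
            self.os_buffer.clear()

    def survives_process_crash(self):
        # Assumption: the OS keeps running and eventually writes its buffer.
        return self.log_file + self.os_buffer

    def survives_os_crash(self):
        return list(self.log_file)


def after_three_commits(policy):
    path = RedoLogPath(policy)
    for record in ("t1", "t2", "t3"):
        path.commit(record)
    # The crash happens before the next once-per-second flush.
    return path.survives_process_crash(), path.survives_os_crash()


for policy in (0, 1, 2):
    print(policy, after_three_commits(policy))
```

In this model, setting 0 loses all three commits in either kind of crash, setting 1 loses none, and setting 2 loses them only when the operating system itself goes down.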

redo log record form

As mentioned earlier, the redo log records changes to data pages, and it is not necessary to keep all of these change records forever. The redo log is therefore implemented with a fixed size and circular writes: when writing reaches the end, it wraps around to the beginning and continues from there.

At the same time, it is easy to see that in InnoDB both the redo log and the data pages need to be flushed to disk; the main purpose of the redo log is to reduce the flushing requirements for data pages. write pos is the LSN (log sequence number) position at which the redo log is currently being written, and check point is the LSN position in the redo log up to which data page changes have already been flushed to disk. The part from write pos to check point is the empty portion of the redo log, available for new records; the part from check point to write pos holds the data page changes that have yet to be flushed to disk. When write pos catches up with check point, check point is pushed forward first to free up space, and then the new log record is written.
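The interplay of write pos and check point can be sketched as a small ring buffer. This is a toy model, not InnoDB's on-disk layout: CircularRedoLog is a hypothetical class, positions are counted in records rather than bytes, and advancing the checkpoint merely pretends that the corresponding data pages were flushed.

```python
class CircularRedoLog:
    """Toy fixed-size redo log with write_pos and check_point."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.write_pos = 0      # position of the next record, only grows
        self.check_point = 0    # everything before this is already flushed

    def free_slots(self):
        # Records between check_point and write_pos still occupy slots.
        return self.capacity - (self.write_pos - self.check_point)

    def advance_checkpoint(self, count=1):
        # Stand-in for flushing the data pages of the oldest records.
        self.check_point = min(self.check_point + count, self.write_pos)

    def append(self, record):
        if self.free_slots() == 0:
            # write_pos caught up with check_point: push it forward first.
            self.advance_checkpoint()
        self.slots[self.write_pos % self.capacity] = record
        self.write_pos += 1


redo = CircularRedoLog(capacity=4)
for i in range(6):
    redo.append(f"r{i}")

print(redo.slots, redo.write_pos, redo.check_point)
```

After six appends into four slots, the two oldest records have been overwritten, which is only safe because the checkpoint was advanced past them first.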

When InnoDB starts, it always performs recovery, regardless of whether the last shutdown was normal or abnormal. Because the redo log records physical changes to data pages, recovery is much faster than with a logical log (such as binlog). On restart, InnoDB first checks the LSN of the data pages on disk; if a data page's LSN is smaller than the LSN in the log, recovery starts from the checkpoint. There is also the case where a checkpoint flush was in progress before the crash and the flushing of data pages got ahead of the flushing of log pages. The LSN recorded in the data page is then greater than the LSN in the log, and the part by which it exceeds the log is not redone, because those changes have already been applied and do not need to be redone.
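The LSN comparison during recovery can be sketched as follows. This is a simplified illustration, not InnoDB's recovery code: the function recover, the dictionary layout of a page, and the tuple layout of a redo record are all assumptions made for the example.

```python
def recover(pages, redo_records, checkpoint_lsn):
    """Replay redo records newer than the checkpoint onto stale pages.

    pages: {page_id: {"lsn": int, "data": dict}}
    redo_records: list of (lsn, page_id, key, value), sorted by lsn
    Returns the LSNs of the records that were actually applied.
    """
    applied = []
    for lsn, page_id, key, value in redo_records:
        if lsn <= checkpoint_lsn:
            continue              # already covered by the checkpoint
        page = pages[page_id]
        if page["lsn"] >= lsn:
            continue              # the page already contains this change
        page["data"][key] = value
        page["lsn"] = lsn
        applied.append(lsn)
    return applied


pages = {
    1: {"lsn": 10, "data": {"a": 1}},   # stale: flushed at the checkpoint
    2: {"lsn": 13, "data": {"b": 9}},   # flushed ahead of the checkpoint
}
redo_records = [
    (8, 1, "a", 1),
    (11, 1, "a", 2),
    (12, 2, "b", 8),
    (13, 2, "b", 9),
    (14, 1, "c", 3),
]

applied = recover(pages, redo_records, checkpoint_lsn=10)
print(applied, pages)
```

Only the records for page 1 are replayed; page 2 was already ahead of the log positions in question, so its records are skipped.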

The difference between redo log and binlog

  | redo log | binlog
File size | The size of the redo log is fixed. | The size of each binlog file can be set with the max_binlog_size configuration parameter.
Implementation | Implemented by the InnoDB engine layer; not all engines have it. | Implemented by the Server layer; all engines can use binlog.
Recording method | Written circularly: when writing reaches the end, it wraps around to the beginning. | Written by appending: when a file exceeds the given size, subsequent logs are written to a new file.
Applicable scenario | Suitable for crash recovery (crash-safe). | Suitable for master-slave replication and data recovery.

This shows the difference between binlog and redo log: binlog is only for archiving, and relying on binlog alone provides no crash-safe capability. But redo log alone is not enough either, because redo log is specific to InnoDB and its records are overwritten after the changes have reached disk. Both binlog and redo log must be recorded to ensure that no data is lost when the database restarts after a crash.

undo log

Atomicity is one of the four properties of a database transaction: a series of operations on the database either all succeed or all fail; partial success is impossible. Underneath, atomicity is implemented by the undo log. The undo log mainly records logical changes to the data: for an INSERT statement it records a corresponding DELETE, and for each UPDATE statement it records the opposite UPDATE, so that when an error occurs the data can be rolled back to its state before the transaction. The undo log is also the key to implementing MVCC (multi-version concurrency control).
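The idea of recording the inverse operation can be sketched with a dictionary standing in for a table. This is a toy model, not InnoDB's undo format: the Transaction class and its methods are hypothetical, insert assumes the key does not exist yet, and update assumes it does.

```python
class Transaction:
    """Toy transaction that logs the inverse of every change it makes."""

    def __init__(self, table):
        self.table = table
        self.undo_log = []

    def insert(self, key, value):
        self.table[key] = value
        # The inverse of an INSERT is a DELETE of the same row.
        self.undo_log.append(("delete", key, None))

    def update(self, key, value):
        old_value = self.table[key]
        self.table[key] = value
        # The inverse of an UPDATE is an UPDATE back to the old value.
        self.undo_log.append(("update", key, old_value))

    def rollback(self):
        # Apply the inverse operations in reverse order.
        while self.undo_log:
            operation, key, old_value = self.undo_log.pop()
            if operation == "delete":
                del self.table[key]
            else:
                self.table[key] = old_value


table = {"a": 1}
tx = Transaction(table)
tx.insert("b", 2)
tx.update("a", 10)
tx.update("a", 20)
state_before_rollback = dict(table)

tx.rollback()
print(state_before_rollback, table)
```

Undoing in reverse order matters: the two updates to the same row are unwound one after the other, so the row ends at its original value rather than at the intermediate one.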


Origin blog.csdn.net/chushoufengli/article/details/115084782