MySQL binlog formats → a discussion of MySQL's default isolation level

Happy moment

When a product goes straight to production without being tested, well, damn...

Background question

Before talking about binlog, let's review the default isolation levels of the mainstream relational databases. Note that the question is about each database's default level, not about transaction isolation levels in general, so don't mix the two up.

1. What is the default isolation level of Oracle and SQL Server, and what about MySQL?

2. Why is the default isolation level of MySQL RR?

This question is actually not very rigorous. As we know, MySQL 5.5 replaced MyISAM with InnoDB as the default storage engine, and only transactions have isolation levels. Since MyISAM does not support transactions, the question did not really apply before MySQL 5.5.

Strictly speaking, it should be: why is the default transaction isolation level of MySQL 5.5 and later RR? Or: why is the default transaction isolation level of InnoDB RR?

Everyone can probably answer question 1: the default isolation level of Oracle and SQL Server is Read Committed (RC), while the default isolation level of MySQL is Repeatable Read (RR).

Question 2, though, will make many readers hesitate: uh..., well..., hmm, it has been too long and my memory is not what it was...

The cheekier readers may try to change the subject: if you are going to talk about binlog, just talk about binlog. Why bring up the default isolation level? Is MySQL's default isolation level really related to binlog?

Want to know? That will cost you extra.

Whether they are actually related, the blogger does not know yet either; let's read on together.

binlog format

binlog is short for binary log; it is sometimes also called the archive log. It records every operation that changes the MySQL database, including table structure changes (CREATE, ALTER, DROP TABLE...) and table data modifications (INSERT, UPDATE, DELETE...). It does not include operations such as SELECT and SHOW, because they do not modify data. A change operation that ends up changing nothing is still written to the binlog, for example:

create table tbl_t1(name varchar(32));
insert into tbl_t1 values('zhangsan');
-- matches no rows, so no data changes
update tbl_t1 set name = 'lisi' where name = '123';
-- \G already terminates the statement, so no trailing semicolon is needed
show master status\G
show binlog events in 'mysql-bin.000002'\G

Here update tbl_t1 set name = 'lisi' where name = '123'; changed nothing in the database, yet it was still recorded in the binlog. (This is the behavior of the STATEMENT format, which logs the SQL text; the ROW format logs only rows that actually changed.)

binlog has three formats: STATEMENT, ROW, and MIXED. At first there was only STATEMENT; ROW and MIXED were added later.

Before MySQL 5.1.5, STATEMENT was the only binlog format. ROW has been supported since 5.1.5, and MIXED since 5.1.8.

Before MySQL 5.7.7, the default binlog format was STATEMENT. In 5.7.7 and later, the default value of binlog_format is ROW.
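The version boundaries above are easy to mix up, so here is a small Python sketch that encodes them. It only restates the boundaries given in this article and does not query a server; the function names (parse_version, supported_formats, default_format) are made up for this illustration.

```python
from typing import List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '5.7.30' into (5, 7, 30) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def supported_formats(version: str) -> List[str]:
    """binlog formats available in a given MySQL version, per the text above."""
    v = parse_version(version)
    formats = ["STATEMENT"]
    if v >= (5, 1, 5):
        formats.append("ROW")
    if v >= (5, 1, 8):
        formats.append("MIXED")
    return formats


def default_format(version: str) -> str:
    """Default binlog_format as stated in this article (ROW from 5.7.7)."""
    return "ROW" if parse_version(version) >= (5, 7, 7) else "STATEMENT"


for ver in ("5.1.4", "5.1.5", "5.5.8", "5.7.30"):
    print(ver, supported_formats(ver), default_format(ver))
```

Comparing tuples rather than strings matters here: as strings, '5.1.30' would sort before '5.1.5'.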

What do the three formats look like, how do they differ, and what are the advantages and disadvantages of each? Let's read on.

STATEMENT

STATEMENT has been a binlog format from the first version of MySQL through the latest 8.0.x, but starting with 5.7.7 it stepped back and gave the default spot to ROW.

The binlog is different from the application logs we write during development. It consists of two kinds of files:

Index file: file name.index, which lists the log files in use.

Log file: file name.00000*, which records all operations that changed the MySQL database.

Because the binlog log file is binary, it cannot be read directly in a text editor and needs a dedicated tool. MySQL provides mysqlbinlog for viewing the contents of log files.

mysqlbinlog has many options; mysqlbinlog.exe --help lists them:

mysqlbinlog.exe Ver 3.3 for Win64 at x86
Copyright (c) 2001, 2010, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Dumps a MySQL binary log in a format usable for viewing or for piping to
the mysql command line client.

Usage: mysqlbinlog.exe [options] log-files
  -?, --help          Display this help and exit.
  --base64-output[=name]
                      Determine when the output statements should be
                      base64-encoded BINLOG statements: 'never' disables it and
                      works only for binlogs without row-based events;
                      'decode-rows' decodes row events into commented SQL
                      statements if the --verbose option is also given; 'auto'
                      prints base64 only when necessary (i.e., for row-based
                      events and format description events); 'always' prints
                      base64 whenever possible. 'always' is deprecated, will be
                      removed in a future version, and should not be used in a
                      production system.  --base64-output with no 'name'
                      argument is equivalent to --base64-output=always and is
                      also deprecated.  If no --base64-output[=name] option is
                      given at all, the default is 'auto'.
  --character-sets-dir=name
                      Directory for character set files.
  -d, --database=name List entries for just this database (local log only).
  --debug-check       Check memory and open file usage at exit .
  --debug-info        Print some debug info at exit.
  -D, --disable-log-bin
                      Disable binary log. This is useful, if you enabled
                      --to-last-log and are sending the output to the same
                      MySQL server. This way you could avoid an endless loop.
                      You would also like to use it when restoring after a
                      crash to avoid duplication of the statements you already
                      have. NOTE: you will need a SUPER privilege to use this
                      option.
  -F, --force-if-open Force if binlog was not closed properly.
                      (Defaults to on; use --skip-force-if-open to disable.)
  -f, --force-read    Force reading unknown binlog events.
  -H, --hexdump       Augment output with hexadecimal and ASCII event dump.
  -h, --host=name     Get the binlog from server.
  -l, --local-load=name
                      Prepare local temporary files for LOAD DATA INFILE in the
                      specified directory.
  -o, --offset=#      Skip the first N entries.
  -p, --password[=name]
                      Password to connect to remote server.
  -P, --port=#        Port number to use for connection or 0 for default to, in
                      order of preference, my.cnf, $MYSQL_TCP_PORT,
                      /etc/services, built-in default (3306).
  --protocol=name     The protocol to use for connection (tcp, socket, pipe,
                      memory).
  -R, --read-from-remote-server
                      Read binary logs from a MySQL server.
  -r, --result-file=name
                      Direct output to a given file.
  --server-id=#       Extract only binlog entries created by the server having
                      the given id.
  --set-charset=name  Add 'SET NAMES character_set' to the output.
  --shared-memory-base-name=name
                      Base name of shared memory.
  -s, --short-form    Just show regular queries: no extra info and no row-based
                      events. This is for testing only, and should not be used
                      in production systems. If you want to suppress
                      base64-output, consider using --base64-output=never
                      instead.
  -S, --socket=name   The socket file to use for connection.
  --start-datetime=name
                      Start reading the binlog at first event having a datetime
                      equal or posterior to the argument; the argument must be
                      a date and time in the local time zone, in any format
                      accepted by the MySQL server for DATETIME and TIMESTAMP
                      types, for example: 2004-12-25 11:25:56 (you should
                      probably use quotes for your shell to set it properly).
  -j, --start-position=#
                      Start reading the binlog at position N. Applies to the
                      first binlog passed on the command line.
  --stop-datetime=name
                      Stop reading the binlog at first event having a datetime
                      equal or posterior to the argument; the argument must be
                      a date and time in the local time zone, in any format
                      accepted by the MySQL server for DATETIME and TIMESTAMP
                      types, for example: 2004-12-25 11:25:56 (you should
                      probably use quotes for your shell to set it properly).
  --stop-position=#   Stop reading the binlog at position N. Applies to the
                      last binlog passed on the command line.
  -t, --to-last-log   Requires -R. Will not stop at the end of the requested
                      binlog but rather continue printing until the end of the
                      last binlog of the MySQL server. If you send the output
                      to the same MySQL server, that may lead to an endless
                      loop.
  -u, --user=name     Connect to the remote server as username.
  -v, --verbose       Reconstruct SQL statements out of row events. -v -v adds
                      comments on column data types.
  -V, --version       Print version and exit.
  --open-files-limit=#
                      Used to reserve file descriptors for use by this program.

Variables (--variable-name=value)
and boolean options {FALSE|TRUE}  Value (after reading options)
--------------------------------- ----------------------------------------
base64-output                     (No default value)
character-sets-dir                (No default value)
database                          (No default value)
debug-check                       FALSE
debug-info                        FALSE
disable-log-bin                   FALSE
force-if-open                     TRUE
force-read                        FALSE
hexdump                           FALSE
host                              (No default value)
local-load                        (No default value)
offset                            0
port                              3307
read-from-remote-server           FALSE
server-id                         0
set-charset                       (No default value)
shared-memory-base-name           (No default value)
short-form                        FALSE
socket                            E:/soft/mysql5.5.8/tmp/mysql.sock
start-datetime                    (No default value)
start-position                    4
stop-datetime                     (No default value)
stop-position                     18446744073709551615
to-last-log                       FALSE
user                              (No default value)
open-files-limit                  18432

These options will not be covered in detail; interested readers can explore them on their own. We will focus on the content of the log file. Execute mysqlbinlog.exe ../data/mysql-bin.000004

As the output shows, the changes made to the database

insert tbl_t1 values ('aaa'),('bbb');
update tbl_t1 set name = 'a1' where name = 'aaa';
delete from tbl_t1 where name = 'bbb';

are all recorded in the log file as plain-text SQL. The advantages and disadvantages will be compared after we have looked at the other two formats.

ROW

For MySQL 5.7.7 and later, the default binlog format is ROW. Using version 5.7.30, let's see what a binlog in ROW format looks like.

First, generate some database changes. The change operations are:

create table tbl_row(
    name varchar(32),
    age int
);
insert into tbl_row values('qq',23),('ww',24);
update tbl_row set age = 18 where name = 'aa';
update tbl_row set age = 18 where name = 'qq';
delete from tbl_row where name = 'aa';
delete from tbl_row where name = 'ww';

The binlog file the master is currently writing to is mysql-bin.000002, and these operations occupy positions 2885 to 3929.

Next, let's see how they are recorded in the log file. Execute mysqlbinlog.exe --start-position=2885 --stop-position=3929 ../data/mysql-bin.000002

As the output shows, the table structure change is recorded as plain-text SQL (the same as STATEMENT), but the table data changes are recorded as base64-encoded events, which are not convenient to read.

Fortunately, mysqlbinlog provides the -v and -vv options to decode them. Execute mysqlbinlog.exe --base64-output=decode-rows -v --start-position=2885 --stop-position=3929 ../data/mysql-bin.000002

INSERT needs no special attention: each column is given the value that was inserted.

insert into tbl_row values('qq',23),('ww',24);

corresponds to

### INSERT INTO `my_project`.`tbl_row`
### SET
###   @1='qq'
###   @2=23
### INSERT INTO `my_project`.`tbl_row`
### SET
###   @1='ww'
###   @2=24

UPDATE deserves attention. Although we modify only one column and filter on only one column, the log records all columns on both sides: the WHERE part lists every column and the SET part lists every column. Each column value is a concrete value; functions such as NOW() and UUID() never appear.

update tbl_row set age = 18 where name = 'qq';

corresponds to

### UPDATE `my_project`.`tbl_row`
### WHERE
###   @1='qq'
###   @2=23
### SET
###   @1='qq'
###   @2=18

In this example the table has no explicit primary key and only one record matches the update condition. As an exercise, give the table an explicit primary key, make several records match the update condition, and see what the binlog records.

DELETE is like UPDATE: although the condition names only one column, the log records all columns.

delete from tbl_row where name = 'ww';

corresponds to

### DELETE FROM `my_project`.`tbl_row`
### WHERE
###   @1='ww'
###   @2=24

Compared with STATEMENT, ROW is more complex and records much more content. So what are the advantages of ROW? Let's read on.
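To make the full before/after image idea concrete, here is a minimal Python sketch of what a ROW-format UPDATE carries. It is a toy model using dicts, not MySQL's actual event encoding, and the name update_row_events is hypothetical.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def update_row_events(table: List[Row],
                      where: Callable[[Row], bool],
                      changes: Row) -> List[Dict[str, Row]]:
    """Apply an UPDATE to `table` in place and return one event per matching row.

    Each event holds the full before image and the full after image,
    mirroring the WHERE/SET pairs printed by mysqlbinlog -v.
    """
    events = []
    for row in table:
        if where(row):
            before = dict(row)   # every column, not only the condition column
            row.update(changes)
            after = dict(row)    # every column, not only the modified column
            events.append({"WHERE": before, "SET": after})
    return events


tbl_row = [{"name": "qq", "age": 23}, {"name": "ww", "age": 24}]

# update tbl_row set age = 18 where name = 'aa';  matches nothing in this model
no_match = update_row_events(tbl_row, lambda r: r["name"] == "aa", {"age": 18})

# update tbl_row set age = 18 where name = 'qq';  one event, all columns
one_match = update_row_events(tbl_row, lambda r: r["name"] == "qq", {"age": 18})
print(no_match)
print(one_match)
```

The second call yields a single event whose WHERE image is the old row and whose SET image is the new row, which is the shape of the decoded UPDATE shown earlier.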

MIXED

Literally, it means mixed. Mixed from what? What else could it be: an intelligent mix of STATEMENT and ROW.

In most cases the binlog is recorded in STATEMENT format (because MySQL's default isolation level is RR and few people change it). When the isolation level is RC, it switches to ROW.

Some special cases are also recorded in ROW format, whether the isolation level is RR or RC. (Excerpted from: Something about binary log-a long article for serious code)

[Image: the special cases that MIXED records in ROW format]

Time and identity functions belong here too. To put it bluntly, only concrete values are fully reliable: functions and system variables that depend on context and environment, such as UUID() or SYSDATE(), can evaluate differently on the slave. (NOW() is a partial exception, because the binlog carries the statement's timestamp and the slave reuses it.)
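The point about concrete values can be shown with a small Python sketch. It is only a toy model: Python's uuid.uuid4() stands in for a non-deterministic SQL function, and run_on_master and the two log lists are names invented for this illustration.

```python
import uuid
from typing import Callable, List

# A "statement" here is a function that is re-executed wherever it is replayed.
Statement = Callable[[], str]


def run_on_master(stmt: Statement, statement_log: List[Statement],
                  row_log: List[str]) -> str:
    """Execute stmt on the master and log it in both styles."""
    value = stmt()
    statement_log.append(stmt)   # STATEMENT style: log the text, not the result
    row_log.append(value)        # ROW style: log the concrete value
    return value


def insert_with_uuid() -> str:
    # Stands in for a statement such as: INSERT INTO t VALUES (UUID());
    return str(uuid.uuid4())


statement_log: List[Statement] = []
row_log: List[str] = []
master_value = run_on_master(insert_with_uuid, statement_log, row_log)

# The slave replays each log independently of the master.
slave_from_statement = statement_log[0]()   # re-evaluates the function
slave_from_row = row_log[0]                 # copies the recorded value

print(master_value == slave_from_statement)
print(master_value == slave_from_row)
```

Replaying the statement generates a fresh value on the slave, while replaying the row copies what the master actually stored.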

The specific log content is not shown here; interested readers can run it and look at the results themselves.

Summary of advantages and disadvantages

All three formats have now been introduced. With the comparison in mind, you should have a fair understanding of their characteristics, advantages, and disadvantages.

binlog has three uses: master-slave replication, data recovery, and auditing. Taking master-slave replication as the basis, the blogger summarizes the advantages and disadvantages of the three formats:

[Image: table of the advantages and disadvantages of STATEMENT, ROW, and MIXED]

MIXED's intent is good: combine the advantages of STATEMENT and ROW into one perfect format. In practice it falls short and still has some problems.

Accuracy takes priority over performance (as technology develops, hardware performance is no longer an unacceptable bottleneck), so the ROW format is recommended.

The relationship between MySQL binlog and its default isolation level RR

Judging from the binlog formats described above, binlog seems to have nothing to do with the default isolation level RR. Be patient and read on.

Table data modification under RC and STATEMENT in different MySQL versions

The table engine is InnoDB, the isolation level is RC, and binlog_format=STATEMENT. With these settings fixed, let's look at how table data modifications behave in MySQL 5.0.96, MySQL 5.1.30, MySQL 5.5.8, and MySQL 5.7.30.

MySQL 5.0.96 executes the modification normally.

MySQL 5.1.30 fails with the error:

ERROR 1598 (HY000): Binary logging not possible. Message: Transaction level 'READ-COMMITTED' in InnoDB is not safe for binlog mode 'STATEMENT'

Both MySQL 5.5.8 and MySQL 5.7.30 fail with the error:

ERROR 1665 (HY000): Cannot execute statement: impossible to write to binary log since BINLOG_FORMAT = STATEMENT and at least one table uses a storage engine limited to row-based logging. InnoDB is limited to row-logging when transaction isolation level is READ COMMITTED or READ UNCOMMITTED.

In other words, from MySQL 5.1.30 on, InnoDB at the RC isolation level restricts binlog_format: it cannot be STATEMENT, otherwise table data cannot be modified.
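The rule can be written down as a tiny Python check. This is a sketch of the observed behavior, not MySQL's implementation: check_write_allowed, is_allowed, and BinlogFormatError are hypothetical names, and the message is shortened from the ERROR 1665 text quoted above.

```python
ROW_ONLY_ISOLATION_LEVELS = {"READ COMMITTED", "READ UNCOMMITTED"}


class BinlogFormatError(Exception):
    """Raised by this toy check; stands in for MySQL's ERROR 1665."""


def check_write_allowed(engine: str, isolation: str, binlog_format: str) -> None:
    """Reject the combination that the tests above show failing on 5.5.8 and 5.7.30."""
    if (engine.upper() == "INNODB"
            and isolation.upper() in ROW_ONLY_ISOLATION_LEVELS
            and binlog_format.upper() == "STATEMENT"):
        raise BinlogFormatError(
            "InnoDB is limited to row-logging when transaction isolation "
            "level is READ COMMITTED or READ UNCOMMITTED."
        )


def is_allowed(engine: str, isolation: str, binlog_format: str) -> bool:
    """Convenience wrapper that turns the exception into a boolean."""
    try:
        check_write_allowed(engine, isolation, binlog_format)
        return True
    except BinlogFormatError:
        return False


print(is_allowed("InnoDB", "READ COMMITTED", "STATEMENT"))
print(is_allowed("InnoDB", "READ COMMITTED", "ROW"))
print(is_allowed("InnoDB", "REPEATABLE READ", "STATEMENT"))
```

Only the first combination is rejected; switching either the isolation level to RR or the format to ROW lets the write through.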

The MySQL 4.x series could not be tested because it is no longer available from the official download site. If you have a 4.x version (or a 5.1.x version before 5.1.21), please send me a private message. Thank you!

The order in which operations of different sessions are recorded in binlog

We use two sessions to perform update operations and look at the order in which operations from different sessions are recorded in the binlog.

As the result shows, update tbl_rr_test set age = 20 where id = 1; executes first but commits later, while update tbl_rr_test set age = 21 where id = 2; executes later but commits first. In the log, the transaction that commits first is recorded first and the one that commits later is recorded after it, regardless of execution time. Within a single session this is easy to understand: the order of execution is the order of recording. Across sessions, whichever commits first is recorded first.

The master applies changes in order of execution time, while the binlog records them in commit order. In theory, this allows the problem shown in MySQL Bug #23051 to occur.
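Why this ordering difference matters can be sketched in Python. This is a toy model under stated assumptions: the second transaction is not blocked by the first (the situation the bug report describes for early 5.1 under RC), the log is statement-based, and the sample rows are invented for the sketch rather than taken from the bug report.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Statement = Callable[[List[Row]], None]


def set_a_11_where_b_2(table: List[Row]) -> None:
    # UPDATE t1 SET a = 11 WHERE b = 2;
    for row in table:
        if row["b"] == 2:
            row["a"] = 11


def set_b_2_where_b_1(table: List[Row]) -> None:
    # UPDATE t1 SET b = 2 WHERE b = 1;
    for row in table:
        if row["b"] == 1:
            row["b"] = 2


def fresh_table() -> List[Row]:
    # Sample data chosen for this sketch, not taken from the bug report.
    return [{"a": 1, "b": 1}, {"a": 2, "b": 2}]


# Master: statements take effect in execution order.
execution_order: List[Statement] = [set_a_11_where_b_2, set_b_2_where_b_1]
master = fresh_table()
for stmt in execution_order:
    stmt(master)

# Binlog: the second transaction commits first, so it is recorded first.
commit_order: List[Statement] = [set_b_2_where_b_1, set_a_11_where_b_2]
slave = fresh_table()
for stmt in commit_order:
    stmt(slave)

print("master:", master)
print("slave: ", slave)
```

On the master only the row that originally had b = 2 gets a = 11. On the slave both rows already have b = 2 by the time the first statement is replayed, so both get a = 11 and the two copies diverge.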

The relationship between the default isolation level RR and binlog

Let's look at MySQL Bug #23051. It reports that in early MySQL 5.1 versions, with the RC isolation level and the STATEMENT binlog format, InnoDB master-slave replication has a bug (fixed in 5.1.21), while 5.0.x is not affected. Let's run the example from Bug #23051 on 5.0.96.

As the run shows, with InnoDB on 5.0.96 at the RC level and binlog_format=STATEMENT, while the transaction containing UPDATE t1 SET a = 11 where b = 2; is uncommitted, the transaction containing UPDATE t1 SET b = 2 where b = 1; is blocked, so the data replicated to the slave is correct.

So, as far as these tests go, from MySQL 5.0 onward InnoDB master-slave replication at the RC level with binlog_format=STATEMENT has no bug (5.0 is fine; 5.1.x before 5.1.21 is affected, but those versions are no longer offered for official download; 5.1.21 and later do not allow binlog to be set to STATEMENT under the RC isolation level).

The relationship between binlog and the default level RR is now clear. It is what the article "[Original] Which transaction isolation level should MySQL choose for Internet projects" says:

Before version 5.0, MySQL's binlog supported only the STATEMENT format, and with this format master-slave replication has a bug under the Read Committed isolation level, so MySQL made Repeatable Read the default isolation level!

In other words, before MySQL 5.0, RR was made the default isolation level to avoid most master-slave replication bugs (for the specifics, see the case in Bug #23051 or the case in "[Original] Which transaction isolation level should MySQL choose for Internet projects"), and it has stayed the default ever since. Why only most and not all? Because even at the RR isolation level with binlog_format=STATEMENT, statements that use non-deterministic system functions (UUID() and the like) can still leave the master and slave inconsistent.

Summary

1. Three formats of binlog

MySQL binlog currently has three formats: STATEMENT, ROW, and MIXED. For the sake of data accuracy, ROW is recommended.

2. Binlog default format

Before MySQL 5.1.5, only the STATEMENT format was supported; binlog_format=ROW has been supported since 5.1.5. Before MySQL 5.7.7, the default binlog format was STATEMENT; in 5.7.7 and later, the default value of binlog_format is ROW.

3. Master-slave replication bug (InnoDB engine)

From MySQL 5.1.30 on, with InnoDB at the RC isolation level, binlog_format=STATEMENT cannot be used.

At the RC or RR isolation level with binlog_format=MIXED, master-slave replication can still produce inconsistent data (affected by system functions).

At the RR isolation level with binlog_format=STATEMENT, master-slave replication can still produce inconsistent data (affected by system functions).

With binlog_format=ROW, master-slave replication does not produce inconsistent data at either the RC or the RR isolation level.

4. Why the default isolation level of MySQL is RR

RR was chosen to avoid the master-slave replication problem in versions before MySQL 5.0, and it has remained the default ever since.

5. Engine selection problem

In MySQL 5.6 and later, InnoDB has been heavily optimized and its performance is no lower than MyISAM's. Unless there is a special reason, MyISAM can basically be abandoned.

Origin blog.csdn.net/AI_mashimanong/article/details/109220591