Understanding the InnoDB record structure in MySQL [Part 1]

Foreword

So far, MySQL has been a black box for us: we use a client to send requests and wait for the server to return results. Where is the data in a table stored? In what format is it stored? How does MySQL access it? We do not yet know the answers to any of these questions. From the previous article we know that the part of the MySQL server responsible for reading and writing table data is the storage engine, and that the server supports different storage engines, such as InnoDB, MyISAM and Memory. Storage engines are generally developed by different people to provide different features, so the format in which the real data is stored generally differs between engines. Some engines, such as Memory, do not store data on disk at all, which means the data in their tables disappears once the server is shut down. Since InnoDB is MySQL's default storage engine and also the most commonly used one, this article walks through InnoDB's storage structure. Once we are familiar with the data storage structure of one storage engine, the others can be understood by analogy.

1. Introduction to InnoDB pages

InnoDB is a storage engine that stores table data on disk, so our data still exists after the server is shut down and restarted. The actual data processing happens in memory, however, so data on disk must first be loaded into memory, and when a write or modification request is processed, the contents of memory must be flushed back to disk. Reading and writing disk is very slow, several orders of magnitude slower than reading and writing memory. So when we want to fetch some records from a table, InnoDB's approach is to divide the data into pages and use the page as the basic unit of exchange between disk and memory. The default page size in InnoDB is 16KB, which means that in general at least 16KB is read from disk into memory at a time, and at least 16KB is flushed from memory to disk at a time.
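
To make the idea of the page as the unit of I/O concrete, here is a minimal Python sketch. It assumes the default 16KB page size and that pages are laid out back to back in the data file. The names locate and page_bounds are made up for this illustration and are not part of MySQL.

```python
PAGE_SIZE = 16 * 1024  # assumed default InnoDB page size, in bytes


def locate(file_offset, page_size=PAGE_SIZE):
    """Map a byte offset in a data file to (page number, offset inside that page)."""
    return divmod(file_offset, page_size)


def page_bounds(page_no, page_size=PAGE_SIZE):
    """Return the [start, end) byte range of a page: the whole range is read or flushed together."""
    start = page_no * page_size
    return start, start + page_size


# A byte at file offset 16504 lives in page 1, 120 bytes into that page.
print(locate(16504))      # (1, 120)
# Touching that byte means the full 16KB range of page 1 is transferred.
print(page_bounds(1))     # (16384, 32768)
```

Even when only a few bytes of a record are needed, the whole page containing them is what moves between disk and memory.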

2. InnoDB row format

We usually insert data into tables record by record. The way these records are laid out on disk is called the row format or record format. So far the designers of the InnoDB storage engine have created four row formats:

  • Compact
  • Redundant
  • Dynamic
  • Compressed

More row formats may be designed over time, but however they change, the principle stays roughly the same.

3. Syntax for specifying the row format

create table table_name (column_definitions) row_format=row_format_name
or
alter table table_name row_format=row_format_name
For example, when we create a practice table demo1 in the testdb database, we can specify its row format like this:

mysql> use testdb;
Database changed
mysql> create table demo1( c1 varchar(10), c2 varchar(10) not null, c3 char(10), c4 varchar(10), c5 varchar(1024)) charset=ascii row_format=compact;
Query OK, 0 rows affected (0.01 sec)

The table we just created uses the compact row format. We also set its character set to ascii, which contains only spaces, punctuation marks, digits, uppercase and lowercase letters, and some invisible characters. Now we insert two records into this table:

mysql> insert into demo1 values('aaaaa','bbbb','ccc','dd','e');
Query OK, 1 row affected (0.00 sec)

mysql> insert into demo1 values('eeeee','ffff',null,null,'abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz');
Query OK, 1 row affected (0.01 sec)

The records in the table now look like this:

mysql> select * from demo1;
+-------+------+------+------+------------------------------------------------------------------------------------------------------------------------------------+
| c1    | c2   | c3   | c4   | c5                                                                                                                                 |
+-------+------+------+------+------------------------------------------------------------------------------------------------------------------------------------+
| aaaaa | bbbb | ccc  | dd   | e                                                                                                                                  |
| eeeee | ffff | NULL | NULL | abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz |
+-------+------+------+------+------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

Now that the demo1 table contains data, let's look at how each row format stores it.

4. Compact row format

[Figure: structure of a record in the Compact row format]
As shown in the figure, a complete record can be divided into two parts: the record's extra information and the record's real data. We look at each part in detail below.

4.1 Extra information of the record

This part holds information that the server adds in order to describe the record. It falls into 3 categories: the variable-length field length list, the NULL value list, and the record header information. Let's look at each in turn.

4.1.1 Variable length field length list

We know that MySQL supports some variable-length data types, such as varchar(M), varbinary(M), the various TEXT types and the various BLOB types. Columns with these data types can be called variable-length fields. The number of bytes stored in a variable-length field is not fixed, so when the real data is stored, the number of bytes it occupies must be stored along with it; otherwise the MySQL server could not tell where the value ends. The storage occupied by a variable-length field therefore has two parts:

  • The real data content
  • The number of bytes occupied

In the Compact row format, the byte lengths of the real data of all variable-length fields are stored at the beginning of the record, forming the variable-length field length list. The lengths are stored in reverse column order.

Let's take the first record in the demo1 table as an example. The c1, c2, c4 and c5 columns of demo1 are all variable-length data types, so the byte lengths of the values of these four columns need to be stored at the beginning of the record. The length in bytes of each variable-length field in the first record is:

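Column name | Stored content | Length in bytes (decimal) | Length in bytes (hex)
------------+----------------+---------------------------+----------------------
c1          | 'aaaaa'        | 5                         | 0x05
c2          | 'bbbb'         | 4                         | 0x04
c4          | 'dd'           | 2                         | 0x02
c5          | 'e'            | 1                         | 0x01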

Because these length values are stored in reverse column order, the final variable-length field length list, written in hexadecimal, is the byte string 01 02 04 05. Filling this byte string into the schematic above gives:

[Figure: the record schematic with the variable-length field length list 01 02 04 05 filled in]
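
The reverse-order rule can be checked with a small Python sketch. It is only an illustration: the function name varlen_length_list is made up here, and the sketch handles only short values whose length fits in a single byte (at most 127 bytes), which covers the first record.

```python
def varlen_length_list(values):
    """Build the length list for the non-NULL variable-length values of one record.

    values: the values in column order, as bytes.
    Returns one length byte per value, in reverse column order.
    Sketch only: lengths above 127 may need 2 bytes and are rejected here.
    """
    lengths = [len(v) for v in values]
    if any(n > 127 for n in lengths):
        raise ValueError("this sketch only handles 1-byte lengths")
    return bytes(reversed(lengths))


# Variable-length columns of the first demo1 record: c1, c2, c4, c5.
first_record = [b"aaaaa", b"bbbb", b"dd", b"e"]
print(varlen_length_list(first_record).hex(" "))  # 01 02 04 05
```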
We can also inspect the underlying storage file, demo1.ibd, by opening it with a hex editor; I use Notepad++ with its HEX-Editor plug-in here. You can find the bytes 01 02 04 05 in it (other data that MySQL generates for the row may differ on your machine, but the contents of the row we inserted should be the same and have exactly the same length, so you can search for the characters to locate the data):

[Figure: demo1.ibd in the hex editor, showing the bytes 01 02 04 05]
The values in the first row occupy only a few bytes each, so each length fits in 1 byte. If the content of a variable-length column occupies many bytes, 2 bytes may be needed to represent its length. InnoDB has its own rules for deciding whether 1 or 2 bytes are used. We first define W, M and L:

Assume that the maximum number of bytes needed to represent one character in a character set is W.

This is the Maxlen column in the result of the show charset statement. For example, W is 4 for the utf8mb4 character set, 3 for utf8, 2 for gbk, and 1 for ascii.

For the variable-length type varchar(M), the column can store at most M characters, so the maximum number of bytes a string of this type can occupy is M×W.

Assume that the number of bytes occupied by the string actually stored is L.

The rule for deciding whether 1 byte or 2 bytes are used to represent the number of bytes occupied by the real string is:

  • If M×W <= 255, then use 1 byte to represent the number of bytes occupied by the real string

  • If M×W > 255, there are two cases:

    • If L <= 127, use 1 byte to represent the number of bytes occupied by the real string
    • If L > 127, use 2 bytes to represent the number of bytes occupied by the real string

In other words, 2 bytes are used only if the maximum number of bytes the variable-length field may store (M×W) exceeds 255 and the number of bytes actually stored (L) exceeds 127; otherwise 1 byte is used. Also note that the variable-length field length list stores lengths only for columns whose value is non-NULL; the length of a column whose value is NULL is not stored.
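
The 1-byte versus 2-byte rule can be written as a tiny Python function. This is a sketch of the rule as stated above, not InnoDB's own code; the name length_field_size is invented for the example, and m, w and l correspond to M, W and L.

```python
def length_field_size(m, w, l):
    """Number of bytes used to record the length of one variable-length value.

    m: declared maximum number of characters, as in varchar(M)
    w: maximum bytes per character in the column's character set
    l: bytes actually occupied by the stored value
    """
    if m * w <= 255:
        return 1
    return 1 if l <= 127 else 2


# varchar(10) in ascii: M*W = 10, always 1 byte.
print(length_field_size(10, 1, 5))       # 1
# varchar(1024) in ascii holding a 1-byte value: still 1 byte.
print(length_field_size(1024, 1, 1))     # 1
# varchar(1024) in ascii holding a 130-byte value: 2 bytes.
print(length_field_size(1024, 1, 130))   # 2
```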

Now look at the second row. Its c5 column can hold at most 1024 bytes (M×W = 1024×1), and the string actually stored occupies 130 bytes, so two bytes are used to represent the length. They are stored in reverse order, that is:

[Figure: the length bytes of c5 in the second record]
So the schematic of the two rows of data is as follows:

[Figure: both records with their variable-length field length lists filled in]

4.1.2 NULL value list

Some columns in a table may store NULL values. Storing these NULL values inside the real data of the record would waste space, so the Compact row format manages the columns whose value is NULL together and records them in the NULL value list. It works like this:

  • First, determine which columns in the table are allowed to store NULL.

As mentioned earlier, primary key columns and columns declared NOT NULL cannot store NULL values, so they are not counted. For example, the four columns c1, c3, c4 and c5 of the demo1 table allow NULL values, while the c2 column is declared NOT NULL and does not.

  • If no column in the table allows NULL, the NULL value list does not exist. Otherwise, each column that allows NULL corresponds to one binary bit, and the bits are arranged in reverse column order. The meaning of each bit is as follows:

    • When the value of the binary bit is 1, it means that the value of the column is NULL.
    • When the value of the binary bit is 0, it means that the value of the column is not NULL

    Because the demo1 table has 4 columns whose values are allowed to be NULL, the correspondence between these 4 columns and the binary bits is as follows:

[Figure: the bits for c5, c4, c3 and c1, from the highest bit to the lowest]

  • MySQL stipulates that the list of NULL values ​​must be represented by an integer number of bytes. If the number of binary digits used is not an integer number of bytes, 0 is added to the high-order bits of the byte.

For the first record, the values of the four columns c1, c3, c4 and c5 are all non-NULL, so their bits are all 0. Four bits are less than one byte, so the high-order bits of the byte are padded with 0, giving:

[Figure: the NULL value list of the first record, 00000000]

So the hexadecimal representation of the NULL value list of the first record is 0x00. By the same logic, if a table has 9 columns that allow NULL, the NULL value list of each record needs 2 bytes.

[Figure]
In the second record, the values of c3 and c4 are both NULL, so the binary bits corresponding to these 4 columns are:

[Figure: the NULL value list of the second record, 00000110]

So the NULL value list of the second record, in hexadecimal, is 0x06. Viewing the file:

[Figure: demo1.ibd in the hex editor, showing the NULL value lists]
So the schematic of these two records after the NULL value list is filled in looks like this:

[Figure: both records with their NULL value lists filled in]
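
The NULL value list of both records can be reproduced with a short Python sketch. The function null_value_list is a name invented for this example. For tables with more than 8 nullable columns the sketch assumes the byte holding the first columns comes last, which is what reverse order implies but is not shown in this article.

```python
def null_value_list(nullable_columns, row):
    """Compute the NULL value list bytes for one record.

    nullable_columns: names of the columns that allow NULL, in column order.
    row: dict mapping column name to value, with None standing for NULL.
    """
    bits = 0
    for i, name in enumerate(nullable_columns):
        if row[name] is None:
            bits |= 1 << i  # reverse order: the first nullable column is the lowest bit
    size = (len(nullable_columns) + 7) // 8  # round up to whole bytes, high bits stay 0
    return bits.to_bytes(size, "big")


nullable = ["c1", "c3", "c4", "c5"]  # c2 is NOT NULL, so it has no bit
row1 = {"c1": "aaaaa", "c2": "bbbb", "c3": "ccc", "c4": "dd", "c5": "e"}
row2 = {"c1": "eeeee", "c2": "ffff", "c3": None, "c4": None, "c5": "abc"}

print(null_value_list(nullable, row1).hex())  # 00
print(null_value_list(nullable, row2).hex())  # 06
```

With 9 nullable columns the computed size becomes 2 bytes, matching the statement above.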

4.1.3 Record header information

In addition to the variable-length field length list and the NULL value list, there is also the record header information, which describes the record and has a fixed size of 5 bytes. 5 bytes are 40 binary bits, and different bits have different meanings, as shown in the figure:
[Figure: layout of the 40 bits of the record header]
The meaning of these bits is as follows:

Name           | Size (bits) | Description
---------------+-------------+------------------------------------------------------------
Reserved bit 1 | 1           | Not used
Reserved bit 2 | 1           | Not used
delete_mask    | 1           | Marks whether the record has been deleted
min_rec_mask   | 1           | Set on the smallest record in each level of non-leaf nodes of the B+ tree
n_owned        | 4           | Number of records owned by the current record
heap_no        | 13          | Position of the current record in the record heap
record_type    | 3           | Type of the current record: 0 = ordinary record, 1 = B+ tree non-leaf node record, 2 = smallest record, 3 = largest record
next_record    | 16          | Relative position of the next record

Let's look at the demo1.ibd storage file:

[Figure: demo1.ibd in the hex editor, showing the record headers]
From it we can read off the record header information of the two records in the demo1 table:

Row 1: 00 00 10 00 3a
Row 2: 00 00 18 ff bb

Following the layout of the record header, we convert these bytes from hexadecimal to binary:
Row 1: 00000000 00000000 00010000 00000000 00111010
Row 2: 00000000 00000000 00011000 11111111 10111011

Splitting the bits according to the record header structure gives the following:

Field               | Row 1                  | Row 2
--------------------+------------------------+------------------------
Reserved bit 1 (1b) | 0                      | 0
Reserved bit 2 (1b) | 0                      | 0
delete_mask (1b)    | 0                      | 0
min_rec_mask (1b)   | 0                      | 0
n_owned (4b)        | 0000                   | 0000
heap_no (13b)       | 00000000 00010 (2)     | 00000000 00011 (3)
record_type (3b)    | 000                    | 000
next_record (16b)   | 00000000 00111010 (58) | 11111111 10111011 (-69)

The last 16 bits of the second row, 11111111 10111011, form a negative number in two's complement. Inverting the bits and adding 1 gives 1000101, which is 69 in decimal, so the value is -69.
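
The bit splitting above can be done mechanically. The following Python sketch decodes a 5-byte record header according to the field sizes in the table (1, 1, 1, 1, 4, 13, 3 and 16 bits) and treats next_record as a signed 16-bit value. The function name parse_record_header is made up for this illustration.

```python
def parse_record_header(header):
    """Split a 5-byte Compact record header into its fields (40 bits, high bit first)."""
    if len(header) != 5:
        raise ValueError("the record header is exactly 5 bytes")
    bits = int.from_bytes(header, "big")
    next_record = bits & 0xFFFF
    if next_record >= 0x8000:      # two's complement: high bit set means negative
        next_record -= 0x10000
    return {
        "delete_mask": (bits >> 37) & 0x1,
        "min_rec_mask": (bits >> 36) & 0x1,
        "n_owned": (bits >> 32) & 0xF,
        "heap_no": (bits >> 19) & 0x1FFF,
        "record_type": (bits >> 16) & 0x7,
        "next_record": next_record,
    }


# Header bytes of the second demo1 record, as listed above.
row2 = parse_record_header(bytes.fromhex("000018ffbb"))
print(row2["heap_no"], row2["record_type"], row2["next_record"])  # 3 0 -69
```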

So the schematic of these two records after the record header information is filled in looks like this:

[Figure: both records with their record header information filled in]

4.2 Real data of the record

For the demo1 table, besides the data of the columns c1, c2, c3, c4 and c5 that we defined ourselves, MySQL adds some columns (also called hidden columns) to each record by default. They are:

Column name    | Required | Size    | Description
---------------+----------+---------+-------------------------------------
row_id         | No       | 6 bytes | Row ID, uniquely identifies a record
transaction_id | Yes      | 6 bytes | Transaction ID
roll_pointer   | Yes      | 7 bytes | Rollback pointer

Tip:
The real names of these columns are DB_ROW_ID, DB_TRX_ID and DB_ROLL_PTR; we write them as row_id, transaction_id and roll_pointer for readability. This is a good place to mention InnoDB's primary key generation strategy: a user-defined primary key is preferred. If the user does not define a primary key, a Unique key that does not allow NULL values is selected as the primary key. If the table does not have such a Unique key either, InnoDB adds a hidden column named row_id to the table as the primary key. So, as the table above shows, InnoDB adds the DB_TRX_ID and DB_ROLL_PTR columns to every record, but row_id is optional (it is added only when the table has neither a user-defined primary key nor a usable Unique key). We do not need to worry about the values of these hidden columns; the InnoDB storage engine generates them for us.

Because the demo1 table does not define a primary key, the MySQL server adds all three columns to each record.
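
The hidden-column rule can be summarised in a few lines of Python. This is a sketch of the decision described in the tip, with invented names (hidden_columns, has_primary_key, has_unique_key), not an InnoDB API. The sizes are the ones listed in the table.

```python
def hidden_columns(has_primary_key, has_unique_key):
    """Return the hidden columns added to each record as (name, size in bytes) pairs."""
    columns = []
    if not has_primary_key and not has_unique_key:
        # No key the engine can use as the primary key: a row_id column is generated.
        columns.append(("row_id", 6))
    columns.append(("transaction_id", 6))
    columns.append(("roll_pointer", 7))
    return columns


# demo1 defines neither a primary key nor a Unique key.
demo1_hidden = hidden_columns(has_primary_key=False, has_unique_key=False)
print([name for name, _ in demo1_hidden])        # ['row_id', 'transaction_id', 'roll_pointer']
print(sum(size for _, size in demo1_hidden))     # 19
```

So each demo1 record carries 19 bytes of hidden columns in front of its user-defined columns.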

[Figure: both records with the hidden columns filled in]
Let's look at the part of the file that holds the first row:

[Figure: demo1.ibd in the hex editor, showing the first record]
The real data of the first record is as follows:

[Figure: the real data of the first record]
Compare with the ASCII table: hexadecimal 61 is a, 62 is b, 63 is c, 20 is a space (the 7 bytes of the CHAR(10) column c3 not used by 'ccc' are all padded with space characters), 64 is d and 65 is e.
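
The byte-to-character mapping can be verified in Python. The byte string below is rebuilt from the mapping just described (it is not copied from the hex dump), and the helper split_fields is a name made up for this example.

```python
def split_fields(data, widths):
    """Cut a byte string into consecutive fields of the given widths."""
    fields, pos = [], 0
    for width in widths:
        fields.append(data[pos:pos + width])
        pos += width
    return fields


# User columns of the first demo1 record: c1 (5 bytes), c2 (4), c3 CHAR(10), c4 (2), c5 (1).
raw = bytes.fromhex("61" * 5 + "62" * 4 + "63" * 3 + "20" * 7 + "64" * 2 + "65")
values = [field.decode("ascii") for field in split_fields(raw, [5, 4, 10, 2, 1])]
print(values)  # ['aaaaa', 'bbbb', 'ccc       ', 'dd', 'e']
```

The third value shows the 7 trailing spaces that pad 'ccc' to the full 10 bytes of the CHAR(10) column.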

4.3 Storage format of CHAR(M) column

The columns c1, c2, c4 and c5 of the demo1 table are VARCHAR columns, while the c3 column is CHAR(10). The demo1 table uses the ascii character set, which is a fixed-length character set: every character takes a fixed number of bytes. If a variable-length character set is used instead (one where the number of bytes needed to represent a character is not fixed; for example, gbk needs 1 to 2 bytes per character and utf8 needs 1 to 3), the length of the c3 column is also stored in the variable-length field length list.

In other words, for a column of type CHAR(M), the number of bytes occupied by the column is not added to the variable-length field length list when the column uses a fixed-length character set, but it is added when the column uses a variable-length character set.

Tip:
A CHAR(M) column in a variable-length character set occupies at least M bytes, while VARCHAR(M) has no such requirement. For example, for a CHAR(10) column using the utf8 character set, the data stored in the column occupies between 10 and 30 bytes. Even an empty string stored in this column takes up 10 bytes. The reason is that if a later update sets a value whose byte length is greater than that of the original value but no more than 10 bytes, the record can be updated in place, instead of allocating new space for the record and leaving the original space behind as so-called fragmentation.

Let's create a demo2 table to see this:

mysql> create table demo2( c1 varchar(10), c2 varchar(10) not null, c3 char(10), c4 varchar(10), c5 varchar(1024)) charset=utf8 row_format=compact;
Query OK, 0 rows affected, 1 warning (0.03 sec)

mysql> insert into demo2 values('aaaaa','bbbb','ccc','dd','e');
Query OK, 1 row affected (0.01 sec)

Without further explanation, let's open the demo2.ibd file directly:

[Figure: demo2.ibd in the hex editor]
In the schematic, it looks like this:

[Figure: the demo2 record in the schematic]
Now we insert another record:

mysql> insert into demo2 values('一一一一一','一一 一','测试测试测试测测试','dd','e');

The byte lengths of these values are, in column order: 5×3=15, 3×3+1=10, 9×3=27, 2 and 1. Stored in reverse order, the corresponding hexadecimal bytes are 01 02 1b 0a 0f. We will not draw the schematic this time; let's look directly at the demo2.ibd file:

[Figure: demo2.ibd in the hex editor, showing the bytes 01 02 1b 0a 0f]
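
The length list for demo2 can be recomputed with a short Python sketch. It assumes that Python's utf-8 encoding yields the same byte counts as MySQL's utf8 for these characters (3 bytes per Chinese character, 1 per ASCII character) and that a CHAR(M) value is padded to at least M bytes, as described in the tip. The names stored_length and length_list are invented for the example.

```python
def stored_length(value, char_m=None):
    """Bytes occupied by a value; a CHAR(M) value occupies at least M bytes."""
    n = len(value.encode("utf-8"))
    return n if char_m is None else max(n, char_m)


def length_list(row):
    """row: (value, char_m) pairs in column order; char_m is None for VARCHAR columns.

    Returns the length bytes in reverse column order as a hex string.
    Sketch only: every length here fits in one byte.
    """
    lengths = [stored_length(value, char_m) for value, char_m in row]
    return bytes(reversed(lengths)).hex(" ")


record1 = [("aaaaa", None), ("bbbb", None), ("ccc", 10), ("dd", None), ("e", None)]
record2 = [("一一一一一", None), ("一一 一", None), ("测试测试测试测测试", 10), ("dd", None), ("e", None)]

print(length_list(record1))  # 01 02 0a 04 05
print(length_list(record2))  # 01 02 1b 0a 0f
```

Unlike in demo1, the CHAR(10) column c3 now contributes a length byte, because utf8 is a variable-length character set.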
That is the end of this article. We will cover the other row formats and row overflow later. This article is rather conceptual, so take your time to digest it.

That concludes today's study. I hope you become an indestructible version of yourself.
~~~

You can’t connect the dots looking forward; you can only connect them looking backwards. So you have to trust that the dots will somehow connect in your future. You have to trust in something - your gut, destiny, life, karma, whatever. This approach has never let me down, and it has made all the difference in my life.


Origin blog.csdn.net/liang921119/article/details/130521668