Foreword
So far, MySQL has been a black box to us. We only use the client to send requests and wait for the server to return results. Where is the data in a table stored? In what format is it stored? How does MySQL access that data? We do not know the answers to any of these questions yet. From the previous article we know that the part of the MySQL server responsible for reading and writing table data is the storage engine, and that the server supports different storage engines, such as InnoDB, MyISAM, and Memory. Different storage engines are generally developed by different people to provide different features, and the format in which they store the real data generally differs. Some storage engines, such as Memory, do not use disk to store data at all, which means the data in the table disappears once the server is shut down. Since InnoDB is the default storage engine of MySQL and also the most commonly used one, this article walks through the storage structure of InnoDB. Once we are familiar with the data storage structure of one storage engine, the others are easier to understand by analogy.
1. Introduction to InnoDB page
InnoDB is a storage engine that stores table data on disk, so our data still exists after the server is shut down and restarted. The actual data processing, however, happens in memory, so data on disk must first be loaded into memory, and when a write or modification request is processed, the contents in memory must eventually be flushed back to disk. Reading and writing disk is very slow, several orders of magnitude slower than reading and writing memory. So when we want to fetch some records from a table, the approach InnoDB adopts is to divide the data into pages and use the page as the basic unit of exchange between disk and memory. The size of a page in InnoDB is generally 16 KB. In other words, under normal circumstances at least 16 KB is read from disk into memory at a time, and at least 16 KB is flushed from memory to disk at a time.
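To make the page-granularity point concrete, here is a minimal Python sketch. It assumes the 16 KB page size mentioned above; the helper name `pages_touched` is made up for illustration and is not part of MySQL. It works out which pages would have to be loaded to read an arbitrary byte range.

```python
PAGE_SIZE = 16 * 1024  # 16 KB, the usual InnoDB page size mentioned above


def pages_touched(offset: int, length: int, page_size: int = PAGE_SIZE) -> range:
    """Return the page numbers needed to read `length` bytes starting at
    byte `offset`. Hypothetical helper, for illustration only."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return range(first, last + 1)


# Reading a 100-byte record still loads one whole 16 KB page.
pages = pages_touched(offset=20_000, length=100)
print(list(pages), len(pages) * PAGE_SIZE, "bytes loaded")
```

Even a very small read is rounded up to whole pages, which is why the rest of this article cares about how records are packed inside a page.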
2. InnoDB row format
We usually insert data into tables record by record. The way these records are laid out on disk is called the row format, or record format. So far, the designers of the InnoDB storage engine have designed four row formats:
Compact
Redundant
Dynamic
Compressed
Over time, they may design more row formats, but however these change, the principle is roughly the same.
3. Syntax for specifying the row format
create table table_name (column_definitions) row_format=row_format_name
or
alter table table_name row_format=row_format_name
For example, to create a practice table demo1 in the testdb database, we can specify its row format like this:
mysql> use testdb;
Database changed
mysql> create table demo1( c1 varchar(10), c2 varchar(10) not null, c3 char(10), c4 varchar(10), c5 varchar(1024)) charset=ascii row_format=compact;
Query OK, 0 rows affected (0.01 sec)
You can see that the row format of the table we just created is compact, and that we also set its character set to ascii. The ascii character set only includes spaces, punctuation marks, digits, uppercase and lowercase letters, and some invisible characters. We now insert two records into this table:
mysql> insert into demo1 values('aaaaa','bbbb','ccc','dd','e');
Query OK, 1 row affected (0.00 sec)
mysql> insert into demo1 values('eeeee','ffff',null,null,'abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz');
Query OK, 1 row affected (0.01 sec)
The records in the table now look like this
mysql> select * from demo1;
+-------+------+------+------+------------------------------------------------------------------------------------------------------------------------------------+
| c1 | c2 | c3 | c4 | c5 |
+-------+------+------+------+------------------------------------------------------------------------------------------------------------------------------------+
| aaaaa | bbbb | ccc | dd | e |
| eeeee | ffff | NULL | NULL | abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz |
+-------+------+------+------+------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
The demo1 table now has data in it. Next, let's look at how each row format stores it.
4. Compact row format
As shown in the figure, a complete record can be divided into two parts, the extra information of the record and the real data of the record. We look at the composition of each part in detail below.
4.1 Extra information of the record
This part is information that the server adds in order to describe the record. It falls into 3 categories: the variable-length field length list, the NULL value list, and the record header information. Let's look at each in turn.
4.1.1 Variable-length field length list
We know that MySQL supports some variable-length data types, such as varchar(M), varbinary(M), the various TEXT types, and the various BLOB types. Columns with these data types can be called variable-length fields. The number of bytes stored in a variable-length field is not fixed, so when storing the real data we also need to store how many bytes it occupies; otherwise the MySQL server could not tell where one value ends. The storage space occupied by a variable-length field therefore has two parts:
- the real data content
- the number of bytes that content occupies
In the Compact row format, the byte lengths of the real data of all variable-length fields are stored together at the beginning of the record, forming the variable-length field length list. The lengths are stored in reverse column order.
Let's take the first record in the demo1 table as an example. The c1, c2, c4, and c5 columns of demo1 are all variable-length data types, so the byte lengths of the values of these four columns need to be stored at the beginning of the record. The byte length of each variable-length field in the first record is:
column name | stored content | byte length (decimal) | byte length (hex) |
---|---|---|---|
c1 | 'aaaaa' | 5 | 0x05 |
c2 | 'bbbb' | 4 | 0x04 |
c4 | 'dd' | 2 | 0x02 |
c5 | 'e' | 1 | 0x01 |
Because these length values are stored in reverse column order, the final variable-length field length list, written in hexadecimal, is the byte string 01 02 04 05. The schematic diagram above can now be filled in with this byte string.
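The construction of this list can be sketched in a few lines of Python. This is only an illustration of the reverse-order rule described above, not InnoDB's actual code; the function name `varlen_length_list` is made up, and every length is assumed to fit in a single byte, which holds for the first record.

```python
def varlen_length_list(lengths_in_column_order):
    """Build the variable-length field length list of one record.

    `lengths_in_column_order` holds the byte length of each non-NULL
    variable-length column, in table column order. Each length is assumed
    to fit in one byte (hypothetical simplification)."""
    if any(not 0 <= n <= 0xFF for n in lengths_in_column_order):
        raise ValueError("this sketch only handles one-byte lengths")
    # The lengths are stored in reverse column order.
    return bytes(reversed(lengths_in_column_order))


# First record of demo1: only the variable-length columns take part.
row1 = {"c1": b"aaaaa", "c2": b"bbbb", "c4": b"dd", "c5": b"e"}
lengths = [len(value) for value in row1.values()]
print(varlen_length_list(lengths).hex(" "))  # 01 02 04 05
```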
We can also inspect the underlying storage file, demo1.ibd, by opening it with a hex editor; I use Notepad++ with its HEX-Editor plug-in here. In it you can find the bytes 01 02 04 05. (Other bytes that MySQL generates for the row may differ on your machine, but the content and length of the row data we inserted should be exactly the same, so you can search for the characters to locate it.)
The values in the first row occupy few bytes, so each length fits in 1 byte. If the content of a variable-length column occupies many bytes, 2 bytes may be needed. InnoDB has its own rules for deciding whether to use 1 or 2 bytes for the length of the real data. First, let's define W, M, and L:
- W is the maximum number of bytes needed to represent one character in the character set, that is, the Maxlen column in the output of the show charset statement. For example, W is 4 for utf8mb4, 3 for utf8, 2 for gbk, and 1 for ascii.
- M comes from the variable-length type varchar(M), which can store at most M characters, so the maximum number of bytes a string of this type can occupy is M×W.
- L is the number of bytes occupied by the string actually stored.
The rule for deciding whether 1 or 2 bytes are used to represent the number of bytes occupied by the real string is:
- If M×W <= 255, use 1 byte.
- If M×W > 255, there are two cases:
  - If L <= 127, use 1 byte.
  - If L > 127, use 2 bytes.
In other words, 2 bytes are used only if the maximum number of bytes the field may store (M×W) exceeds 255 and the number of bytes actually stored (L) exceeds 127; otherwise 1 byte is used. Also note that the variable-length field length list only stores the lengths of columns whose values are non-NULL; the length of a column whose value is NULL is not stored.
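The rule above translates directly into a small function. The sketch below is a restatement of the rule as described in this article; the name `length_prefix_size` and its parameters are made up for illustration.

```python
def length_prefix_size(m: int, w: int, l: int) -> int:
    """Number of bytes (1 or 2) used to store the length of one
    variable-length value, following the rule described above.

    m: declared maximum number of characters, as in varchar(M)
    w: maximum bytes per character in the character set (Maxlen)
    l: bytes actually occupied by the stored string
    """
    if l > m * w:
        raise ValueError("the value cannot be longer than M*W bytes")
    if m * w <= 255:
        return 1
    return 1 if l <= 127 else 2


# Columns of demo1, which uses ascii (W = 1):
print(length_prefix_size(10, 1, 5))      # c1 'aaaaa' in varchar(10)
print(length_prefix_size(1024, 1, 1))    # c5 'e' in varchar(1024)
print(length_prefix_size(1024, 1, 130))  # c5 of the second row, 130 bytes
```

Only the last call returns 2, because it is the only one where both M×W > 255 and L > 127 hold.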
Now look at the c5 column of the second row. The maximum number of bytes its string can occupy is 1024, and the string actually stored occupies 130 bytes, so two bytes are used to represent its length, and they too are stored in reverse order.
So the schematic diagram of the two rows of data is as follows:
4.1.2 NULL value list
We know that some columns in a table may store NULL values. Storing these NULL values in the real data of the record would waste space, so the Compact row format manages the columns whose values are NULL together, in the NULL value list. It works like this:
- First, determine which columns in the table are allowed to store NULL.
As we said earlier, primary key columns and columns declared NOT NULL cannot store NULL values, so they are not counted. For example, the four columns c1, c3, c4, and c5 of the demo1 table allow NULL values, while the c2 column is declared NOT NULL and does not.
- If no column in the table allows NULL, the NULL value list does not exist. Otherwise, each column that allows NULL corresponds to one binary bit, and the bits are arranged in reverse column order. The meaning of each bit is:
  - A bit value of 1 means the value of the column is NULL.
  - A bit value of 0 means the value of the column is not NULL.
Because the demo1 table has 4 columns that allow NULL, these 4 columns map to binary bits as follows (from the high bit to the low bit: c5, c4, c3, c1):
- MySQL stipulates that the NULL value list must occupy a whole number of bytes. If the number of bits used is not a whole number of bytes, the high-order bits of the byte are padded with 0.
For the first record, the values of the four columns c1, c3, c4, and c5 are all non-NULL, so their 4 bits are all 0. Four bits do not fill a byte, so the high 4 bits are padded with 0, giving the binary value 00000000.
So the NULL value list of the first record in hexadecimal is 0x00. By the same logic, if a table has 9 columns that allow NULL, the NULL value list of each record needs 2 bytes.
In the second record, the values of c3 and c4 are both NULL, so the bits for these 4 columns are 0110 (c5, c4, c3, c1), which is padded to 00000110. The NULL value list of the second record is therefore 0x06 in hexadecimal, which can be confirmed by viewing the file.
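The NULL value list of both records can be reproduced with a short Python sketch. It follows the three steps described above; the function name `null_value_list` is made up, and the byte order used when more than 8 columns allow NULL is an assumption of this sketch rather than something shown in this article.

```python
def null_value_list(nullable_columns, row):
    """Build the NULL value list of one record.

    nullable_columns: names of the columns that allow NULL, in column order
    row: mapping of column name to value, with None standing for NULL
    """
    if not nullable_columns:
        return b""  # no nullable column: the list does not exist
    bits = 0
    for position, name in enumerate(nullable_columns):
        if row[name] is None:
            # Reverse order: the first nullable column is the lowest bit.
            bits |= 1 << position
    # Round up to whole bytes; unused high bits stay 0.
    n_bytes = (len(nullable_columns) + 7) // 8
    return bits.to_bytes(n_bytes, "big")


NULLABLE = ["c1", "c3", "c4", "c5"]  # c2 is NOT NULL, so it is not counted
row1 = {"c1": "aaaaa", "c2": "bbbb", "c3": "ccc", "c4": "dd", "c5": "e"}
row2 = {"c1": "eeeee", "c2": "ffff", "c3": None, "c4": None, "c5": "abc"}

print(null_value_list(NULLABLE, row1).hex())  # 00
print(null_value_list(NULLABLE, row2).hex())  # 06
```

The value of c5 in `row2` is shortened here because only whether a value is NULL matters for this list.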
So the schematic diagram of these two records after filling the NULL value list is like this:
4.1.3 Record header information
In addition to the variable-length field length list and the NULL value list, there is also record header information that describes the record. It consists of a fixed 5 bytes, that is, 40 binary bits, and different bits have different meanings, as shown in the figure. The details of these bits are as follows:
name | size (bits) | description |
---|---|---|
Reserved bit 1 | 1 | Not used |
Reserved bit 2 | 1 | Not used |
delete_mask | 1 | Marks whether the record has been deleted |
min_rec_mask | 1 | Set on the smallest record in each non-leaf node of the B+ tree |
n_owned | 4 | The number of records owned by the current record |
heap_no | 13 | The position of the current record in the record heap |
record_type | 3 | The type of the current record: 0 is an ordinary record, 1 is a B+ tree non-leaf node record, 2 is the smallest record, and 3 is the largest record |
next_record | 16 | The relative position of the next record |
Let's look at the demo1.ibd storage file. From it we can read off the record header information of the two records in the demo1 table:
First row: 00 00 10 00 3a
Second row: 00 00 18 ff bb
Following the layout of the record header, convert these values from hexadecimal to binary:
First row: 00000000 00000000 00010000 00000000 00111010
Second row: 00000000 00000000 00011000 11111111 10111011
Splitting these bits according to the record header structure gives the following:
row | Reserved bit 1 (1b) | Reserved bit 2 (1b) | delete_mask (1b) | min_rec_mask (1b) | n_owned (4b) | heap_no (13b) | record_type (3b) | next_record (16b) |
---|---|---|---|---|---|---|---|---|
first row | 0 | 0 | 0 | 0 | 0000 | 00000000 00010 (2) | 000 | 00000000 00111010 (58) |
second row | 0 | 0 | 0 | 0 | 0000 | 00000000 00011 (3) | 000 | 11111111 10111011 (-69) |
The last 16 bits of the second row, 11111111 10111011, form a negative number in two's complement. Inverting the bits and adding 1 gives 1000101, which is 69 in decimal, so the value is -69.
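The same splitting can be done mechanically. The following Python sketch decodes a 5-byte record header using the field sizes from the table above; the function name `parse_record_header` is made up, and the example input is the header of the second record.

```python
def parse_record_header(raw: bytes) -> dict:
    """Split a 5-byte Compact record header into its fields.

    Bit layout, from the highest bit down: 2 reserved bits, delete_mask (1),
    min_rec_mask (1), n_owned (4), heap_no (13), record_type (3),
    next_record (16, signed)."""
    if len(raw) != 5:
        raise ValueError("the record header is exactly 5 bytes")
    bits = int.from_bytes(raw, "big")
    next_record = bits & 0xFFFF
    if next_record >= 0x8000:       # interpret as 16-bit two's complement
        next_record -= 0x10000
    return {
        "delete_mask": (bits >> 37) & 0x1,
        "min_rec_mask": (bits >> 36) & 0x1,
        "n_owned": (bits >> 32) & 0xF,
        "heap_no": (bits >> 19) & 0x1FFF,
        "record_type": (bits >> 16) & 0x7,
        "next_record": next_record,
    }


second = parse_record_header(bytes.fromhex("00 00 18 ff bb"))
print(second["heap_no"], second["record_type"], second["next_record"])  # 3 0 -69
```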
So the schematic diagram of these two records after filling the record header information is like this:
4.2 Real data of the record
For the demo1 table, besides the data of the columns c1, c2, c3, c4, and c5 that we defined ourselves, MySQL adds some columns (also called hidden columns) to each record by default. They are:
column name | required | space used | description |
---|---|---|---|
row_id | no | 6 bytes | Row ID, which uniquely identifies a record |
transaction_id | yes | 6 bytes | Transaction ID |
roll_pointer | yes | 7 bytes | rollback pointer |
Tip:
The real names of these columns are DB_ROW_ID, DB_TRX_ID, and DB_ROLL_PTR; we write them as row_id, transaction_id, and roll_pointer for readability. Here we need to mention how InnoDB chooses the primary key of a table: a user-defined primary key is preferred; if the user does not define one, a Unique key is chosen as the primary key; if the table has no Unique key either, InnoDB adds a hidden column named row_id as the primary key by default. So, as the table above shows, the InnoDB storage engine adds the DB_TRX_ID and DB_ROLL_PTR columns to every record, while row_id is optional (it is only added when the table has neither a user-defined primary key nor a Unique key). We do not need to worry about the values of these hidden columns; the InnoDB storage engine generates them for us.
Because the demo1 table does not define a primary key, the MySQL server adds all three of the above columns to each record.
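The decision about which hidden columns are present can be written down as a tiny Python sketch. It simply restates the strategy from the tip above; the function name `hidden_columns` is made up, and "has a Unique key" is taken to mean a Unique key that InnoDB can use as the primary key.

```python
def hidden_columns(has_primary_key: bool, has_unique_key: bool):
    """Return (name, size in bytes) for each hidden column of a record."""
    columns = []
    if not has_primary_key and not has_unique_key:
        # Only added when there is nothing else to use as the primary key.
        columns.append(("row_id", 6))
    columns.append(("transaction_id", 6))
    columns.append(("roll_pointer", 7))
    return columns


# demo1 has neither a primary key nor a Unique key.
demo1_hidden = hidden_columns(has_primary_key=False, has_unique_key=False)
print(demo1_hidden, sum(size for _, size in demo1_hidden), "bytes")
```

For demo1 this gives all three columns, 19 bytes in total, stored in front of the user-defined columns.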
Let's look at the contents of the first row in the file. The real data we inserted into the first record is stored as follows.
Referring to an ASCII table: hexadecimal 61 corresponds to a, 62 to b, 63 to c, 20 to a space (the 7 bytes of the char column beyond its content are all padded with space characters), 64 to d, and 65 to e. Does it still look confusing?
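The bytes of the user-defined columns of the first record can be reproduced in Python. This sketch covers only the five columns we inserted; the hidden columns are left out because their values are generated by InnoDB and will differ from machine to machine. The helper names are made up for illustration.

```python
def char_value(text: str, m: int) -> bytes:
    """A CHAR(M) value in the ascii character set: padded with spaces to M bytes."""
    return text.encode("ascii").ljust(m, b" ")


def demo1_user_columns(c1: str, c2: str, c3: str, c4: str, c5: str) -> bytes:
    """Concatenate the real data of c1..c5 for a demo1 row without NULLs."""
    return b"".join([
        c1.encode("ascii"),   # varchar(10): stored as is
        c2.encode("ascii"),   # varchar(10)
        char_value(c3, 10),   # char(10): space padded
        c4.encode("ascii"),   # varchar(10)
        c5.encode("ascii"),   # varchar(1024)
    ])


data = demo1_user_columns("aaaaa", "bbbb", "ccc", "dd", "e")
print(len(data), "bytes:", data.hex(" "))
```

Searching the .ibd file for this byte sequence is a quick way to locate the first record in a hex editor.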
4.3 Storage format of CHAR(M) column
The columns c1, c2, c4, and c5 of the demo1 table are VARCHAR types (VARCHAR(10) for c1, c2, and c4, and VARCHAR(1024) for c5), while column c3 is CHAR(10). The demo1 table uses the ascii character set, which is a fixed-length character set, meaning every character takes a fixed number of bytes. If the table instead used a variable-length character set (one in which the number of bytes needed per character varies; for example, gbk needs 1 to 2 bytes per character and utf8 needs 1 to 3), the length of the c3 column would also be stored in the variable-length field length list.
In short, for a CHAR(M) column: when the column uses a fixed-length character set, the number of bytes it occupies is not added to the variable-length field length list; when it uses a variable-length character set, the number of bytes it occupies is added to the list.
Tip:
A CHAR(M) column in a variable-length character set occupies at least M bytes, whereas VARCHAR(M) has no such requirement. For example, for a CHAR(10) column using the utf8 character set, the stored data ranges from 10 to 30 bytes in length. Even if we store an empty string in the column, it still occupies 10 bytes. The reason is that if a later update makes the value longer than the original but still no more than 10 bytes, the record can be updated in place instead of allocating new record space in the storage area and leaving the original record space behind as so-called fragmentation.
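The CHAR(M) rule can be summarized in a small Python sketch. The function name `char_column_layout` and its parameters are made up for illustration; in particular, taking the lower bound as M times the minimum bytes per character is an assumption of this sketch that matches the utf8 example above.

```python
def char_column_layout(m: int, min_bytes_per_char: int, max_bytes_per_char: int) -> dict:
    """Describe how a CHAR(M) column is stored in the Compact row format.

    min_bytes_per_char / max_bytes_per_char: the fewest and most bytes one
    character can take in the column's character set (1 and 3 for utf8)."""
    variable_length_charset = min_bytes_per_char != max_bytes_per_char
    return {
        # Only variable-length character sets put CHAR into the length list.
        "in_length_list": variable_length_charset,
        "min_bytes": m * min_bytes_per_char,
        "max_bytes": m * max_bytes_per_char,
    }


print(char_column_layout(10, 1, 1))  # CHAR(10) in ascii: always 10 bytes
print(char_column_layout(10, 1, 3))  # CHAR(10) in utf8: 10 to 30 bytes
```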
Let's create a demo2 table to see this:
mysql> create table demo2( c1 varchar(10), c2 varchar(10) not null, c3 char(10), c4 varchar(10), c5 varchar(1024)) charset=utf8 row_format=compact;
Query OK, 0 rows affected, 1 warning (0.03 sec)
mysql> insert into demo2 values('aaaaa','bbbb','ccc','dd','e');
Query OK, 1 row affected (0.01 sec)
I won't repeat the details here; just open the demo2.ibd file to look.
The effect in the diagram is as follows:
Now let's insert another record:
mysql> insert into demo2 values('一一一一一','一一 一','测试测试测试测测试','dd','e');
The byte lengths of the values in this record are:
- c1: 5*3=15
- c2: 3*3+1=10
- c3: 3*9=27
- c4: 2
- c5: 1
Stored in reverse order, the corresponding hexadecimal bytes are 01 02 1b 0a 0f. We will not draw a picture here; let's look directly at the demo2.ibd file.
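These lengths can be double-checked with Python. The sketch below assumes that Python's "utf-8" codec yields the same bytes as MySQL's utf8 character set for these characters, and that every length fits in one byte, which is true for this record. The names `COLUMNS`, `stored_length`, and `length_list` are made up for illustration.

```python
# (name, kind, M) for each column of demo2, in column order
COLUMNS = [
    ("c1", "varchar", 10),
    ("c2", "varchar", 10),
    ("c3", "char", 10),
    ("c4", "varchar", 10),
    ("c5", "varchar", 1024),
]


def stored_length(kind: str, m: int, value: str) -> int:
    """Bytes occupied by one value in a utf8 table."""
    n = len(value.encode("utf-8"))
    # CHAR(M) in a variable-length character set occupies at least M bytes.
    return max(n, m) if kind == "char" else n


def length_list(row) -> bytes:
    """Variable-length field length list: non-NULL lengths, reverse column order."""
    lengths = [
        stored_length(kind, m, value)
        for (_, kind, m), value in zip(COLUMNS, row)
        if value is not None
    ]
    if any(n > 127 for n in lengths):
        raise ValueError("this sketch only handles one-byte lengths")
    return bytes(reversed(lengths))


row = ("一一一一一", "一一 一", "测试测试测试测测试", "dd", "e")
print(length_list(row).hex(" "))  # 01 02 1b 0a 0f
```

Unlike in demo1, the CHAR(10) column c3 now contributes an entry (1b) to the list, because utf8 is a variable-length character set.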
That is the end of this article. We will talk about the other row formats and row overflow later. This article is fairly conceptual, so take your time digesting it.