ByteDance interview: Do you know how MySQL stores NULL values?

Hello everyone. A reader who interviewed at ByteDance was asked the question in the title: how does MySQL store NULL values?

If you know the storage structure of a row of MySQL records, then this question will not be difficult for you.

It doesn't matter if you don't know. This time I will walk you through how a row of records is stored in MySQL.

Once you know this, besides the interview question above, you will also be able to answer these:

  • Do MySQL's NULL values take up space?

  • How does MySQL know the size of data actually occupied by varchar(n)?

  • What is the maximum value of n in varchar(n)?

  • How does MySQL handle row overflow?

These questions seem unrelated, but they all revolve around one topic: the storage structure of a row of MySQL records. Once you understand that topic, all of them become easy to answer.

Well, without further ado, let's go!

Which file is MySQL data stored in?

Everyone knows that MySQL data is stored on the disk, so which file is it stored in?

The behavior of MySQL storage is implemented by the storage engine. MySQL supports multiple storage engines, and the files saved by different storage engines are naturally different.

InnoDB is our commonly used storage engine and the default storage engine for MySQL. Therefore, this article mainly discusses the InnoDB storage engine.

Let's see which directory the MySQL database files are stored in:

    mysql> SHOW VARIABLES LIKE 'datadir';
    +---------------+-----------------+
    | Variable_name | Value           |
    +---------------+-----------------+
    | datadir       | /var/lib/mysql/ |
    +---------------+-----------------+
    1 row in set (0.00 sec)

Every time we create a database, a directory named after that database is created under /var/lib/mysql/, and the files that hold the table structure and table data are stored in that directory.

For example, I have a database named my_test, and there is a database table named t_order in this database.

Then we go into the /var/lib/mysql/my_test directory to see what files are in it:

    [root@xiaolin ~]# ls /var/lib/mysql/my_test
    db.opt  t_order.frm  t_order.ibd

As you can see, there are three files in total, which represent:

  • db.opt stores the default character set and collation of the current database.

  • t_order.frm stores the table structure of t_order. Creating a table generates a .frm file that holds the table's metadata, mainly the table structure definition. (MySQL 8.0 removed .frm files and keeps this metadata in its data dictionary.)

  • t_order.ibd stores the table data of t_order. Table data can be stored either in the shared tablespace file (file name: ibdata1) or in an exclusive tablespace file (file name: table_name.ibd). This behavior is controlled by the parameter innodb_file_per_table. If innodb_file_per_table is set to 1, the table's data, indexes and other information are stored in its own exclusive tablespace. Since MySQL 5.6.6 the default value is 1, so from that version on the data of each table is stored in a separate .ibd file.

OK, now we know that the data of a database table is stored in the file "table_name.ibd", which is also called an exclusive tablespace file.

What is the structure of the table space file?

A tablespace consists of segments, extents, pages, and rows. The logical storage structure of the InnoDB storage engine is roughly as follows:

Let's take a look at them one by one from bottom to top.

1. Row

The records in the database table are stored in rows, and each row of records has a different storage structure according to different row formats.

Later, we will introduce the row format of the InnoDB storage engine in detail, which is also the focus of this article.

2. Page

Records are stored in rows, but the reading of the database does not take "rows" as the unit, otherwise one reading (that is, one I/O operation) can only process one row of data, and the efficiency will be very low.

Therefore, InnoDB reads and writes data in units of "pages". That is, when a record needs to be read, InnoDB does not read just that row from disk; it reads the whole page containing the row into memory.

The default size of each page is 16KB, which means at most 16KB of contiguous storage space is guaranteed.

A page is the smallest unit of disk management in the InnoDB storage engine, which means every database read and write works in units of 16KB: at least 16KB is read from disk into memory at a time, and at least 16KB is flushed from memory to disk at a time.

There are many types of pages; common ones are data pages, undo log pages, overflow pages, and so on. The row records of a table are managed by "data pages". I won't go into the structure of a data page here, since I covered it in a previous article. If you are interested, you can read it: Look at the B+ Tree from another angle.

In short, it is enough to know that the records in the table are stored in the "data page".

3. Extent

We know that the InnoDB storage engine uses B+ trees to organize data.

Each level of the B+ tree is connected by a doubly linked list. If storage space is allocated page by page, two adjacent pages in the linked list are not necessarily physically contiguous and may be very far apart, so a disk query causes a lot of random I/O, and random I/O is very slow.

The fix is simple: make pages that are adjacent in the linked list physically adjacent as well, so that sequential I/O can be used and range queries (which scan the leaf nodes) perform very well.

How to solve it?

When the table holds a large amount of data, space for an index is no longer allocated page by page but in units of extents. Each extent is 1MB, so for 16KB pages, 64 consecutive pages form one extent. This way, pages that are adjacent in the linked list are also physically adjacent, and sequential I/O can be used.
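The arithmetic above can be checked with a minimal Python sketch. The constant names and the helper extent_of are made up for illustration, and the sketch assumes the default 16KB page size with 1MB extents described in the text.

```python
PAGE_SIZE = 16 * 1024        # default InnoDB page size, in bytes
EXTENT_SIZE = 1024 * 1024    # extent size for 16KB pages, in bytes

PAGES_PER_EXTENT = EXTENT_SIZE // PAGE_SIZE


def extent_of(page_no: int) -> int:
    """Return the index of the extent that contains the given page number."""
    return page_no // PAGES_PER_EXTENT


print(PAGES_PER_EXTENT)               # 64
print(extent_of(63), extent_of(64))   # 0 1
```

Pages 0 to 63 fall into the same extent, so scanning them in order stays within one contiguous 1MB region.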

4. Segment

A tablespace is made up of segments, and a segment is made up of multiple extents. Segments are generally divided into data segments, index segments, and rollback segments.

  • Index segment: the collection of extents that store the non-leaf nodes of the B+ tree;

  • Data segment: the collection of extents that store the leaf nodes of the B+ tree;

  • Rollback segment: the collection of extents that store rollback data. When we talked about transaction isolation, we mentioned that MVCC uses the rollback segment to read earlier versions of data.

Well, finally finished the structure of the table space. Next, let's talk about the row format of InnoDB in detail.

The reason for this long detour before getting to the row format is to show which file row records are stored in, and which part of the tablespace file they live in. With this top-down view, the row format will feel much less abstract.

What are the InnoDB row formats?

The row format (row_format) is the storage structure of a record.

InnoDB provides four row formats, namely Redundant, Compact, Dynamic and Compressed row formats.

  • Redundant is a very old row format. It was used before MySQL 5.0 and is basically no longer used.

  • Because Redundant is not a compact row format, the Compact row format was introduced after MySQL 5.0. It was designed so that more row records fit in one data page. From MySQL 5.1 on, the default row format was Compact.

  • Dynamic and Compressed are both compact row formats. They are similar to Compact, because they are based on Compact with a few improvements. Since MySQL 5.7, Dynamic is the default row format.

I will not cover the Redundant row format here, because almost no one uses it now. This time I will focus on the Compact row format, because Dynamic and Compressed are very similar to it.

Once you understand the Compact row format, the other row formats will be quick to pick up.

What does the COMPACT row format look like?

First, get familiar with the Compact row format. It looks like this:

As you can see, a complete record is divided into two parts: "the record's additional information" and "the record's real data".

Next, let's talk about each in detail.

The record's additional information

The additional information of the record includes 3 parts: variable-length field length list, NULL value list, and record header information.

1. Variable length field length list

I believe everyone knows the difference between varchar(n) and char(n): char is fixed-length and varchar is variable-length, so the length (size) of the data actually stored in a variable-length field is not fixed.

Therefore, when storing data, the size occupied by the data must also be stored, and it goes into the "variable-length field length list". When reading data, this list tells InnoDB how many bytes to read for each field. The same is true for other variable-length fields such as TEXT and BLOB.

To show how the "variable-length field length list" saves the number of bytes occupied by the real data of each variable-length field, we first create the table below. The character set is ascii (so each character occupies 1 byte), the row format is Compact, and the name and phone fields in the t_user table are both variable-length fields:

    CREATE TABLE `t_user` (
      `id` int(11) NOT NULL,
      `name` VARCHAR(20) NOT NULL,
      `phone` VARCHAR(20) DEFAULT NULL,
      `age` int(11) DEFAULT NULL,
      PRIMARY KEY (`id`) USING BTREE
    ) ENGINE = InnoDB DEFAULT CHARACTER SET = ascii ROW_FORMAT = COMPACT;

Now there are these three records in the t_user table:

Next, let's look at how the "variable-length field length list" is stored in the row format of these three records.

Let's look at the first record first:

  • The value of the name column is a, and the number of bytes occupied by the real data is 1 byte, which is 0x01 in hexadecimal;

  • The value of the phone column is 123, and the number of bytes occupied by the real data is 3 bytes, which is 0x03 in hexadecimal;

  • The age column and id column are not variable-length fields, so don't worry about them here.

The byte counts of these variable-length fields are stored in the reverse of the column order (we will explain why later), so the content of the "variable-length field length list" is "03 01" rather than "01 03".

In the same way, in the row format of the second record the content of the "variable-length field length list" is "04 02", as shown in the following figure:

The value of the phone column in the third record is NULL. NULL is not stored in the real data part of the record, so the "variable-length field length list" does not store a length for a variable-length field whose value is NULL.
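The rules for the three records can be put into a minimal Python sketch. The function name varlen_length_list is made up for illustration, it assumes every length fits in 1 byte, and the name values of the second and third rows are hypothetical, chosen only so that the lengths match the records described above.

```python
def varlen_length_list(columns, row, encoding="ascii"):
    """Build the variable-length field length list for one row.

    columns: names of the variable-length columns, in table order.
    row:     dict mapping column name -> str value, or None for NULL.
    Assumes every length fits in 1 byte (columns of at most 255 bytes).
    """
    lengths = [
        len(row[col].encode(encoding))
        for col in columns
        if row[col] is not None        # NULL columns contribute no length
    ]
    return bytes(reversed(lengths))    # stored in reverse column order


VARLEN_COLUMNS = ["name", "phone"]

# Hypothetical rows whose lengths match the three records described above.
rows = [
    {"name": "a", "phone": "123"},
    {"name": "bb", "phone": "1234"},
    {"name": "ccc", "phone": None},
]

for row in rows:
    print(varlen_length_list(VARLEN_COLUMNS, row).hex(" "))
# 03 01
# 04 02
# 03
```

The third row produces a single byte because its NULL phone column is skipped entirely.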

Why should the information in the "Variable Field Length List" be stored in reverse order?

This design is deliberate. The pointer to the next record in the "record header information" points to the position between the "record header information" and the "real data" of the next record. From there, reading to the left gives the record header information and reading to the right gives the real data, which is convenient.

Storing the "variable-length field length list" in reverse order means that the real data of the columns near the front of the record and the length information for those columns sit close together, so they can land in the same CPU cache line, which improves the CPU cache hit rate.

For the same reason, the information in the NULL value list is also stored in reverse order.

If you don't know what the CPU cache is, you can read this article; it is part of computer organization fundamentals.

Does the row format of each database table have a "variable-length field byte count list"?

In fact, the "variable-length field length list" is not mandatory.

When a table has no variable-length fields, for example when all fields are of type int, the row format has no "variable-length field length list". It would serve no purpose, so it is left out to save space.

Therefore, the "variable-length field length list" only appears when the data table has variable-length fields.

2. List of NULL values

Some columns in the table may store NULL values. Putting these NULL values into the real data of the record would waste space, so the Compact row format records which columns are NULL in the NULL value list instead.

If there are columns that allow NULL values, each such column corresponds to one binary bit, and the bits are arranged in the reverse of the column order.

  • When the value of the bit is 1, the value of the column is NULL.

  • When the value of the bit is 0, the value of the column is not NULL.

In addition, the NULL value list must occupy a whole number of bytes (1 byte is 8 bits). If the number of bits used does not fill a whole number of bytes, the high bits of the byte are padded with 0.

Still take these three records of the t_user table as an example:

Next, let's look at how the NULL value list is stored in the row format of these three records.

Let's look at the first record first. Only the columns that allow NULL, phone and age, get a bit (id and name are NOT NULL); in reverse column order the bits are age, then phone. Both columns of the first record have values, so both bits are 0: binary 00.

But InnoDB represents the NULL value list with a whole number of bytes, and only 2 bits are used so far, so the high bits are padded with 0, giving binary 00000000.

So, for the first record, the NULL value list is 0x00 in hexadecimal.

Next, the second record. Its age column is NULL, so the bits are 00000010 and the NULL value list is 0x02 in hexadecimal.

Finally, the third record. Its phone and age columns are both NULL, so the bits are 00000011 and the NULL value list is 0x03 in hexadecimal.

After we have filled the NULL value lists of the three records, their row format is as follows:

Does the row format of each database table have a "NULL value list"?

The NULL value list is not mandatory either.

When all fields of a table are defined as NOT NULL, the row format has no NULL value list. Therefore, when designing a table, it is usually recommended to define fields as NOT NULL, which saves at least 1 byte per row (the NULL value list occupies at least 1 byte).

Is the "NULL value list" a fixed 1 byte? If so, how would a table with 9 columns that allow NULL be represented?

The size of the "NULL value list" is not fixed at 1 byte.

When a table has 9 columns that allow NULL, a 2-byte "NULL value list" is used, and so on: one bit per such column, rounded up to a whole number of bytes.
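Here is a minimal Python sketch of how such a bitmap could be built, assuming one bit per column that takes part in the list, with the first column in the lowest bit and the bytes laid out in reverse order. The function null_bitmap and the columns c1 to c9 are hypothetical, not taken from InnoDB's source.

```python
def null_bitmap(bitmap_columns, row):
    """Build the NULL value list for one row.

    bitmap_columns: names of the columns that own a bit, in table order.
    row:            dict mapping column name -> value, None meaning NULL.
    """
    bits = 0
    for position, col in enumerate(bitmap_columns):
        if row[col] is None:
            bits |= 1 << position           # first column owns the lowest bit
    size = (len(bitmap_columns) + 7) // 8   # round up to whole bytes
    # Big-endian puts the byte of the later columns first (reverse order).
    return bits.to_bytes(size, "big")


# Hypothetical table with nine columns in the bitmap; c1 and c9 are NULL.
columns = [f"c{i}" for i in range(1, 10)]
row = {col: "x" for col in columns}
row["c1"] = None
row["c9"] = None

print(null_bitmap(columns, row).hex(" "))                          # 01 01
print(null_bitmap(columns[:2], {"c1": "x", "c2": None}).hex(" "))  # 02
```

With nine columns the list needs 2 bytes; with two columns a single byte is enough and the unused high bits stay 0.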

3. Record header information

There are many contents contained in the record header information, so I will not list them one by one. Here are some important ones:

  • delete_mask: indicates whether this record has been deleted. From this we can see that when we execute DELETE, the record is not actually removed; its delete_mask is simply set to 1.

  • next_record: the position of the next record. From this we can see that records are organized as a linked list. As mentioned earlier, it points to the position between the "record header information" and the "real data" of the next record. The advantage is that reading to the left gives the record header information and reading to the right gives the real data, which is convenient.

  • record_type: the type of the current record. 0 means a normal record, 1 means a B+ tree non-leaf node record, 2 means the minimum record, and 3 means the maximum record.

The record's real data

In addition to the fields we defined, there are three hidden fields in the record real data part, namely: row_id, trx_id, roll_pointer, let's see what these three fields are.

  • row_id

If we specify a primary key or a NOT NULL unique index when we create the table, there is no row_id hidden field. If neither is specified, InnoDB adds a row_id hidden field to the record. row_id is optional and occupies 6 bytes.

  • trx_id

Transaction id, indicating which transaction generated this data. trx_id is required and occupies 6 bytes.

  • roll_pointer

A pointer to the previous version of this record. roll_pointer is required and takes 7 bytes.
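As a quick check of the sizes listed above, here is a small Python sketch that adds up the hidden fields. The constant and function names are made up for illustration.

```python
ROW_ID_BYTES = 6        # present only when the table has no usable key
TRX_ID_BYTES = 6        # always present
ROLL_POINTER_BYTES = 7  # always present


def hidden_column_bytes(has_primary_or_unique_key: bool) -> int:
    """Bytes taken by the hidden fields in the real data part of a record."""
    total = TRX_ID_BYTES + ROLL_POINTER_BYTES
    if not has_primary_or_unique_key:
        total += ROW_ID_BYTES
    return total


print(hidden_column_bytes(True))    # 13
print(hidden_column_bytes(False))   # 19
```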

If you are familiar with the MVCC mechanism, you already know what trx_id and roll_pointer are for. If you are not, you can read this article, and make sure you master it: interviews often ask how MVCC is implemented.

What is the maximum value of n in varchar(n)?

First, be clear that MySQL stipulates that, apart from large object types such as TEXT and BLOB, the total number of bytes occupied by all other columns (excluding hidden columns and record header information) cannot exceed 65535 bytes.

That is to say, apart from TEXT and BLOB columns, the maximum size of a row is 65535 bytes. Note that this is the total length of a row, not of a single column.

With this premise in mind, let's return to the question: "What is the maximum value of n in varchar(n)?"

The n in varchar(n) is the maximum number of characters that can be stored, not the number of bytes.

To calculate the maximum number of bytes that varchar(n) can store, we also need to look at the character set of the table, because the character set determines how many bytes a character takes. For example, in the ascii character set one character occupies one byte, so varchar(100) allows at most 100 bytes of data.

The case of a single field

As we know earlier, a row of records can only store 65535 bytes of data at most.

Suppose the database table has only one column of type varchar(n) and the character set is ascii. In this case, is the maximum value of n in varchar(n) 65535?

Don't worry about the conclusion, let's do an experiment to verify it first.

We define a field of type varchar(65535) and a database table whose character set is ascii.

    CREATE TABLE test (
      `name` VARCHAR(65535) NULL
    ) ENGINE = InnoDB DEFAULT CHARACTER SET = ascii ROW_FORMAT = COMPACT;

See if you can successfully create a table:

As you can see, the creation failed.

From the error message, we can see that the maximum number of bytes in a row is 65535 (excluding large object types such as TEXT and BLOB), and that this figure includes storage overhead.

So what is this storage overhead? It is the "variable-length field length list" and the "NULL value list". In other words, the 65535-byte maximum for a row includes the bytes occupied by these two lists. Therefore, when we calculate the maximum value of n in varchar(n), we need to subtract the bytes taken by this storage overhead.

This is because when we store data whose field type is varchar(n), it is actually divided into three parts for storage:

  • real data

  • The number of bytes occupied by real data

  • The NULL flag; if NULL is not allowed, this part is not needed

In this case, what is the number of bytes occupied by the "NULL value list"?

When I created the table earlier, the field is allowed to be NULL, so 1 byte will be used to represent the "NULL value list".

In this case, what is the number of bytes occupied by the "variable field length list"?

The number of bytes occupied by the "variable field length list" = the sum of the number of bytes occupied by all "variable field lengths".

Therefore, we must first know how many bytes are needed to represent the "variable-length field length" of each variable-length field. There are two cases:

  • Condition 1: If the maximum number of bytes allowed to be stored in the variable-length field is less than or equal to 255 bytes, 1 byte will be used to represent the "length of the variable-length field";

  • Condition 2: If the maximum number of bytes allowed to be stored in the variable-length field is greater than 255 bytes, 2 bytes will be used to indicate the "length of the variable-length field";

The field type here is varchar(65535), and the character set is ascii, which means that the maximum number of bytes allowed to be stored in a variable-length field is 65535, which meets the second condition, so 2 bytes are used to represent the "variable-length field length".

Because our case has only 1 variable-length field, so "variable-length field length list" = the number of bytes occupied by 1 "variable-length field length", that is, 2 bytes.

When we calculate the maximum value of n in varchar(n), we need to subtract the bytes occupied by the "variable-length field length list" and the "NULL value list". Therefore, when the table has only one varchar(n) field and the character set is ascii, the maximum value of n in varchar(n) = 65535 - 2 - 1 = 65532.

Let's first test whether varchar(65533) works.

As you can see, it still fails. Now let's see whether varchar(65532) works.

As you can see, the table is created successfully. This shows that our deduction is correct: when calculating the maximum value of n in varchar(n), we need to subtract the bytes occupied by the "variable-length field length list" and the "NULL value list".

Of course, the example above is for the ascii character set. With UTF-8, the maximum amount of data varchar(n) can store is calculated differently:

  • Under MySQL's utf8 (utf8mb3) character set, a character requires at most three bytes, so the maximum value of n in varchar(n) is 65532/3 = 21844.

What is said above is only for the calculation method of a field.
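The single-field calculation can be written as a minimal Python sketch. The function max_varchar_n is made up for illustration, and it assumes the column is large enough to need 2 length bytes, as in the experiment above.

```python
MAX_ROW_BYTES = 65535


def max_varchar_n(bytes_per_char: int, nullable: bool) -> int:
    """Largest n for a table whose only column is varchar(n)."""
    length_bytes = 2                   # the column can exceed 255 bytes
    null_bytes = 1 if nullable else 0  # NULL value list, only if nullable
    return (MAX_ROW_BYTES - length_bytes - null_bytes) // bytes_per_char


print(max_varchar_n(1, nullable=True))    # 65532
print(max_varchar_n(3, nullable=True))    # 21844
print(max_varchar_n(1, nullable=False))   # 65533
```

The last call shows that declaring the column NOT NULL frees the byte of the NULL value list, so n can be one larger.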

The multi-field case

If there are multiple fields, ensure that the length of all fields + the number of bytes occupied by the variable-length field byte list + the number of bytes occupied by the NULL value list <= 65535.
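The inequality above can be checked with a small Python sketch. It is my own illustration, limited to tables made only of varchar columns, and the helpers row_bytes and fits are hypothetical names.

```python
MAX_ROW_BYTES = 65535


def row_bytes(varchar_columns, bytes_per_char=1):
    """Worst-case row size for a table made only of varchar columns.

    varchar_columns: list of (n, nullable) pairs, one per varchar(n) column.
    """
    data = sum(n * bytes_per_char for n, _ in varchar_columns)
    # 1 length byte if the column holds at most 255 bytes, otherwise 2.
    lengths = sum(
        1 if n * bytes_per_char <= 255 else 2
        for n, _ in varchar_columns
    )
    nullable_count = sum(1 for _, nullable in varchar_columns if nullable)
    null_list = (nullable_count + 7) // 8   # whole bytes for the NULL list
    return data + lengths + null_list


def fits(varchar_columns, bytes_per_char=1):
    """True if the worst-case row size stays within the 65535-byte limit."""
    return row_bytes(varchar_columns, bytes_per_char) <= MAX_ROW_BYTES


print(row_bytes([(65532, True)]))              # 65535
print(fits([(65533, True)]))                   # False
print(fits([(32765, True), (32765, True)]))    # True
print(fits([(32766, True), (32766, True)]))    # False
```

Two nullable columns share one byte of the NULL value list, but each still pays for its own 2 length bytes.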

Here is an example of a multi-field situation (thanks to @Emoji for the example provided)

How does MySQL handle row overflow?

The basic unit of interaction between disk and memory in MySQL is the page. A page is generally 16KB, that is 16384 bytes, while a column of type varchar(n) can store up to 65532 bytes, and large objects such as TEXT and BLOB may store even more. A single page may therefore be unable to hold one record. When that happens, row overflow occurs and the extra data is stored in a separate "overflow page".

If a data page cannot hold a record, the InnoDB storage engine automatically stores the overflowing data in an "overflow page". Normally InnoDB data is stored in "data pages", but when row overflow occurs, the overflowing data goes to the "overflow page".

When row overflow occurs, only part of the column's data is kept in the real data of the record and the rest is placed in the "overflow page". The real data then uses 20 bytes to store the address of the overflow page, so that the page holding the remaining data can be found. It looks roughly like the figure below.

The above is the processing of the Compact row format after row overflow occurs.

The two row formats, Compressed and Dynamic, are very similar to Compact, the main difference is that there are some differences in handling row overflow data.

These two formats adopt a complete row overflow method. Part of the data of the column will not be stored in the real data of the record, and only a 20-byte pointer is stored to point to the overflow page. The actual data is stored in the overflow page, which looks like this:
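The difference between the two overflow strategies can be sketched in Python. This is only an illustration: the function split_overflowing_column is made up, it assumes InnoDB has already decided to move the column off-page, and the 768-byte prefix kept by Compact is the figure given in the MySQL manual, not something stated in this article.

```python
POINTER_BYTES = 20          # overflow-page pointer, as described above
COMPACT_PREFIX_BYTES = 768  # assumption: prefix that Compact keeps in the record


def split_overflowing_column(value_bytes: int, row_format: str):
    """Return (bytes kept in the record, bytes moved to overflow pages)
    for a column that has already been chosen for off-page storage."""
    if row_format == "COMPACT":
        prefix = min(value_bytes, COMPACT_PREFIX_BYTES)
        return prefix + POINTER_BYTES, value_bytes - prefix
    if row_format in ("DYNAMIC", "COMPRESSED"):
        return POINTER_BYTES, value_bytes
    raise ValueError(f"unsupported row format: {row_format}")


print(split_overflowing_column(60000, "COMPACT"))   # (788, 59232)
print(split_overflowing_column(60000, "DYNAMIC"))   # (20, 60000)
```

For the same 60000-byte value, Compact keeps a prefix plus the pointer in the record, while Dynamic keeps only the pointer.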

Summary

How is the NULL value stored in MySQL?

MySQL's Compact row format uses the "NULL value list" to mark which columns are NULL; NULL values are not stored in the real data part of the record.

The NULL value list occupies at least 1 byte (one bit per column that allows NULL, rounded up to whole bytes). When all fields in the table are defined as NOT NULL, the row format has no NULL value list, which saves that space.

How does MySQL know the size of data actually occupied by varchar(n)?

In MySQL's Compact row format, the "variable-length field length list" is used to store the actual data size occupied by variable-length fields.

What is the maximum value of n in varchar(n)?

A row of records can store a maximum of 65535 bytes of data, but this includes "the number of bytes occupied by the variable-length field byte list" and "the number of bytes occupied by the NULL value list". Therefore, when we calculate the maximum value of n in varchar(n), we need to subtract the number of bytes occupied by these two lists.

If a table has only one varchar(n) field, NULL is allowed, and the character set is ascii, the maximum value of n in varchar(n) is 65532.

Calculation formula: 65535 - the number of bytes occupied by the variable-length field byte list - the number of bytes occupied by the NULL value list = 65535 - 2 - 1 = 65532.

If there are multiple fields, ensure that the length of all fields + the number of bytes occupied by the variable-length field byte list + the number of bytes occupied by the NULL value list <= 65535.

How does MySQL handle row overflow?

If a data page cannot store a record, the InnoDB storage engine will automatically store the overflowed data in the "overflow page".

The Compact row format handles row overflow as follows: when row overflow occurs, only part of the column's data is kept in the real data of the record and the rest is placed in the "overflow page". The real data then uses 20 bytes to store the address of the overflow page, so that the page holding the remaining data can be found.

These two formats, Compressed and Dynamic, adopt a complete row overflow method. Part of the data of the column will not be stored in the real data of the record, and only a 20-byte pointer is stored to point to the overflow page. The actual data is stored in overflow pages.

 

 


Origin blog.csdn.net/Javatutouhouduan/article/details/128150631