MySQL Compact row format

 

 

 

 

 

InnoDB row format type (Compact format)

Schematic diagram of Compact row format:

 

 

 

 

How MySQL stores variable-length fields

MySQL supports variable-length data types such as VARCHAR(M), VARBINARY(M), TEXT, and BLOB. The storage space occupied by a column of one of these types is divided into two parts:

   1. The real data content

   2. The number of bytes occupied by that content

If the number of bytes occupied by the real data were not saved, the MySQL server could not tell where the data ends and could not read it back correctly. So when the real data is stored, its length in bytes is stored along with it.

In the Compact row format, the lengths of all variable-length columns are stored together at the beginning of the record as a list, in the reverse order of the columns.

CHAR is a fixed-length type and VARCHAR is a variable-length type. In VARCHAR(M), M is the maximum number of characters that can be stored. (Before MySQL 5.0.3, M was counted in bytes; from 5.0.3 onward it is counted in characters.)

 

Basic storage method

variable-length field length list, NULL value list, record header, value of column01, value of column02, ..., value of column0n

How to store variable length fields

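  • A row with one variable-length field
    # Suppose there are three fields id, name, and age, where name is a variable-length type (VARCHAR)
    |id|name|age|
    |1|wang|18|
    
    Stored on disk as:
    0x04 [NULL value list] [record header] 1 wang 18
    
    # 0x04 means that name is 4 bytes long
  • Multiple rows are stored one after another
    # Suppose there are three fields id, name, and age, where name is a variable-length type (VARCHAR)
    |id|name|age|
    |1|wang|18|
    |2|li|20|
    
    Stored on disk as:
    0x04 [NULL value list] [record header] 1 wang 18 0x02 [NULL value list] [record header] 2 li 20
  • A row with multiple variable-length fields
    # Suppose there are four fields id, name, desc, and age, where name and desc are variable-length types (VARCHAR)
    |id|name|desc|age|
    |1|wang|shuaige|18|
    
    Stored on disk as:
    0x07 0x04 [NULL value list] [record header] 1 wang shuaige 18
    
    # 0x04 means that name is 4 bytes long and 0x07 means that desc is 7 bytes long; the lengths appear in reverse column order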
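The examples above can be sketched in Python. This is only an illustration, not InnoDB's actual code: the function name `varlen_length_list` and the way columns are described are made up for this example, no column is NULL, and every length is assumed to fit in a single byte.

```python
def varlen_length_list(columns, row):
    """Return the length bytes for the variable-length columns of one row.

    columns: list of (name, is_variable_length) in table definition order.
    row: dict mapping each column name to its stored bytes.
    Simplification (assumption): each length fits in one byte.
    """
    lengths = []
    for name, is_varlen in columns:
        if is_varlen:
            size = len(row[name])
            if size >= 128:
                raise ValueError("this sketch only handles one-byte lengths")
            lengths.append(size)
    # The list is written in the reverse order of the columns.
    return bytes(reversed(lengths))


columns = [("id", False), ("name", True), ("desc", True), ("age", False)]
row = {"id": b"1", "name": b"wang", "desc": b"shuaige", "age": b"18"}
print(varlen_length_list(columns, row).hex(" "))  # 07 04
```

The output matches the last example: the length of desc (7) comes first, followed by the length of name (4).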

 

 

 

 

How the NULL value list is stored

The Compact row format manages all columns that may be NULL together: each such column gets one bit in the NULL value list. If the table has no column that allows NULL, the NULL value list does not exist.
When the bit is 1, the value of the column is NULL.
When the bit is 0, the value of the column is not NULL.

 

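    # Suppose there are three fields id, name, and age, where name is a variable-length type (VARCHAR)
    |id|name|age|
    |1|wang|18|
    
    Stored on disk as:
    0x04 00 [record header] 1 wang 18
    
    # 00 means that neither name nor age is NULL; id is the primary key here, so it can never be NULL and has no bit in the list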
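A rough sketch of how such a bitmap could be built. The function name `null_bitmap` is hypothetical, and the bit order (the first nullable column uses the lowest bit, and the list is padded to whole bytes) is an assumption that the text above does not spell out.

```python
def null_bitmap(nullable_columns, row):
    """Build the NULL value list for one row.

    nullable_columns: names of the columns that allow NULL, in definition order.
    row: dict mapping column name to a value (None stands for NULL).
    Assumption: the first nullable column maps to the lowest bit.
    """
    if not nullable_columns:
        return b""  # no nullable column, so the list does not exist
    bits = 0
    for position, name in enumerate(nullable_columns):
        if row[name] is None:
            bits |= 1 << position  # 1 means the column is NULL
    n_bytes = (len(nullable_columns) + 7) // 8
    return bits.to_bytes(n_bytes, "big")


row = {"id": 1, "name": "wang", "age": 18}
print(null_bitmap(["name", "age"], row).hex())  # 00
```

For the example row, neither name nor age is NULL, so the single byte is 00, as in the listing above.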

 

 

 

Record header information

In addition to the variable-length field length list and the NULL value list, each record has record header information that describes the record. It has a fixed size of 5 bytes, that is 40 bits, and different bits have different meanings.
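To make the idea of "40 bits with different meanings" concrete, here is a small sketch that splits a 5-byte header into bit fields. The field names and widths in `FIELDS` are an assumption based on commonly cited descriptions of the Compact record header; they are not taken from this text. The `next_record` field is treated as an unsigned number for simplicity.

```python
# Assumed layout of the 40 header bits, from the highest bit to the lowest.
FIELDS = [
    ("reserved_1", 1),
    ("reserved_2", 1),
    ("delete_mask", 1),
    ("min_rec_mask", 1),
    ("n_owned", 4),
    ("heap_no", 13),
    ("record_type", 3),
    ("next_record", 16),
]


def parse_record_header(header):
    """Split a 5-byte record header into named bit fields."""
    if len(header) != 5:
        raise ValueError("the record header is exactly 5 bytes")
    value = int.from_bytes(header, "big")
    remaining = 40
    fields = {}
    for name, width in FIELDS:
        remaining -= width
        fields[name] = (value >> remaining) & ((1 << width) - 1)
    return fields


print(parse_record_header(bytes([0x00, 0x00, 0x10, 0x00, 0x20])))
```

The widths add up to 40 bits, which is consistent with the fixed 5-byte size.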

Real data recorded

In addition to the columns we define ourselves, the real data of a record also contains three hidden columns. Their actual names are DB_ROW_ID, DB_TRX_ID, and DB_ROLL_PTR.

If a table does not explicitly define a primary key, a unique key (one whose columns are all NOT NULL) is selected as the primary key. If no such unique key exists either, a hidden column named row_id is added to the table as the primary key. So row_id exists only when the table has neither a user-defined primary key nor a usable unique key.
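The selection order described above can be written as a short decision function. This is a sketch only: `choose_primary_key` and its argument format are hypothetical, and the requirement that the unique key be NOT NULL is an assumption about InnoDB behaviour that goes slightly beyond the wording of the text.

```python
def choose_primary_key(primary_key, unique_keys):
    """Return the columns used as the primary key of a table.

    primary_key: list of column names, or None if none is defined.
    unique_keys: list of (columns, all_not_null) tuples in definition order.
    """
    if primary_key:
        return primary_key
    for columns, all_not_null in unique_keys:
        if all_not_null:  # assumption: only NOT NULL unique keys qualify
            return columns
    return ["DB_ROW_ID"]  # the hidden row_id column


print(choose_primary_key(None, [(["email"], True)]))  # ['email']
print(choose_primary_key(None, []))                   # ['DB_ROW_ID']
```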

Row overflow data

A VARCHAR(M) column can occupy at most 65535 bytes, where M is the maximum number of characters the column can store. With the ascii character set, one character takes one byte. Trying to create a table with a VARCHAR(65535) column under that character set nevertheless fails with an error.
The error message means that MySQL limits the maximum storage space of a single record: apart from BLOB and TEXT columns, all other columns together (excluding hidden columns and the record header information) may not occupy more than 65535 bytes. These 65535 bytes cover more than the column data itself. For example, storing a VARCHAR(M) column actually takes three parts of storage space:

  1. The real data
  2. The length of the real data of the variable-length field
  3. The NULL value flag

If the VARCHAR column does not have the NOT NULL attribute, it can store at most 65532 bytes of data, because the length of the variable-length field takes 2 bytes and the NULL value flag takes 1 byte. If the column is declared NOT NULL, the NULL value flag is not needed and the limit is 65533 bytes.
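The arithmetic behind these limits can be checked with a few lines of Python. The sketch assumes a table whose only column is the VARCHAR in question; the function names and the `bytes_per_char` parameter are introduced here for illustration.

```python
MAX_ROW_BYTES = 65535  # row size limit, not counting BLOB and TEXT columns


def max_varchar_bytes(nullable, length_bytes=2):
    """Bytes left for the real data of a single VARCHAR column."""
    null_flag_bytes = 1 if nullable else 0
    return MAX_ROW_BYTES - length_bytes - null_flag_bytes


def max_varchar_chars(nullable, bytes_per_char=1):
    """Largest M for VARCHAR(M) when each character needs bytes_per_char bytes."""
    return max_varchar_bytes(nullable) // bytes_per_char


print(max_varchar_bytes(nullable=True))    # 65532
print(max_varchar_bytes(nullable=False))   # 65533
print(max_varchar_chars(nullable=True, bytes_per_char=3))  # 21844
```

With a one-byte character set such as ascii the character limit equals the byte limit; with a character set that needs up to 3 bytes per character, M has to be correspondingly smaller.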

Overflow caused by too much data in the record

The size of a page is generally 16KB, which is 16384 bytes, while a VARCHAR(M) column can store up to 65533 bytes, so a single record may not fit in one page.

In the Compact and Redundant row formats, when a column occupies a very large amount of storage space, only part of the column data (the first 768 bytes) is stored in the real data of the record, and the rest is spread over several other pages. The real data of the record then uses 20 bytes to store the address of these pages (these 20 bytes also include the number of bytes stored in the other pages), so that the remaining data can be found.

 


 

 

Dynamic and Compressed formats

These two row formats are similar to the Compact row format but handle row overflow data slightly differently. They do not keep part of the column data in the real data of the record; instead they store all of it in other pages and keep only the addresses of those pages in the record. In addition, the Compressed row format uses a compression algorithm to compress pages.
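The difference between the two overflow strategies can be illustrated with a small sketch. It assumes the column has already been chosen for off-page storage, that Compact and Redundant keep a 768-byte prefix in the record, and that the pointer to the overflow pages takes 20 bytes; the function names are hypothetical.

```python
LOCAL_PREFIX_BYTES = 768  # assumption: prefix kept in the record by Compact/Redundant
POINTER_BYTES = 20        # address of the overflow pages plus overflow length


def split_column(value, row_format):
    """Return (bytes kept in the record, bytes moved to overflow pages)."""
    if row_format in ("COMPACT", "REDUNDANT"):
        return value[:LOCAL_PREFIX_BYTES], value[LOCAL_PREFIX_BYTES:]
    if row_format in ("DYNAMIC", "COMPRESSED"):
        return b"", value  # everything goes to other pages
    raise ValueError(f"unknown row format: {row_format}")


def in_record_size(value, row_format):
    """Bytes the overflowing column occupies inside the record itself."""
    local, _overflow = split_column(value, row_format)
    return len(local) + POINTER_BYTES


value = b"x" * 10000
print(in_record_size(value, "COMPACT"))  # 788
print(in_record_size(value, "DYNAMIC"))  # 20
```

Under these assumptions a 10000-byte value leaves 788 bytes in a Compact record but only the 20-byte pointer in a Dynamic record.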

 

 

 



Origin blog.csdn.net/liuming690452074/article/details/113820877