Index creation error scenarios in MySQL

A colleague reported an error when creating an index in a MySQL database. The error, reproduced here, was as follows:

CREATE INDEX t_reg_code_idx USING BTREE ON t(reg_code)
BLOB/TEXT column 'reg_code' used in key specification without a key length

From this message, we can see that a BTREE index is being created on the reg_code column of table t, and that reg_code is a BLOB or TEXT column. The error says that the key specification must include a key length. What does this mean?

This database is MySQL 8.0. The official manual has the following description of Index Prefixes (as shown below): if you create an index on a BLOB or TEXT column, you must specify the prefix length of the index. For InnoDB tables using the REDUNDANT or COMPACT row format, the index prefix can be up to 767 bytes. For InnoDB tables using the DYNAMIC or COMPRESSED row format, the index prefix can be up to 3072 bytes. For MyISAM tables, the prefix length can be up to 1000 bytes.

https://dev.mysql.com/doc/refman/8.0/en/column-indexes.html

Index Prefixes

With col_name(N) syntax in an index specification for a string column, you can create an index that uses only the first N characters of the column. Indexing only a prefix of column values in this way can make the index file much smaller. When you index a BLOB or TEXT column, you must specify a prefix length for the index. For example:

CREATE TABLE test (blob_col BLOB, INDEX(blob_col(10)));

Prefixes can be up to 767 bytes long for InnoDB tables that use the REDUNDANT or COMPACT row format. The prefix length limit is 3072 bytes for InnoDB tables that use the DYNAMIC or COMPRESSED row format. For MyISAM tables, the prefix length limit is 1000 bytes.

Note: Prefix limits are measured in bytes, whereas the prefix length in CREATE TABLE, ALTER TABLE, and CREATE INDEX statements is interpreted as number of characters for nonbinary string types (CHAR, VARCHAR, TEXT) and number of bytes for binary string types (BINARY, VARBINARY, BLOB). Take this into account when specifying a prefix length for a nonbinary string column that uses a multibyte character set.

If a search term exceeds the index prefix length, the index is used to exclude non-matching rows, and the remaining rows are examined for possible matches.
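
Applied to the failing statement at the top of this article, this requirement means adding a prefix length to the column. The following is only a sketch: the prefix lengths of 10 and 20 are assumed values, not taken from the original case, and the first query is one common way to compare how selective different candidate prefixes would be on table t.

```sql
-- Compare candidate prefix lengths: the closer a ratio is to full_col,
-- the less selectivity the prefix loses compared with the whole column.
SELECT
    COUNT(DISTINCT LEFT(reg_code, 10)) / COUNT(*) AS prefix_10,
    COUNT(DISTINCT LEFT(reg_code, 20)) / COUNT(*) AS prefix_20,
    COUNT(DISTINCT reg_code) / COUNT(*)           AS full_col
FROM t;

-- Assumption: the first 20 characters of reg_code are selective enough.
CREATE INDEX t_reg_code_idx USING BTREE ON t (reg_code(20));
```

A shorter prefix keeps the index smaller, but if it is too short many rows share the same prefix and the index filters poorly.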

In the MySQL 5.7 manual, the index prefix limits are described differently. It says that the index prefix of an InnoDB table can be up to 1000 bytes (judging from other chapters of the manual and from experiments, I think this is wrong and should be 3072 bytes), but only if innodb_large_prefix is set (this parameter only takes effect for the DYNAMIC or COMPRESSED row format and has no effect for REDUNDANT or COMPACT); otherwise the limit is 767 bytes.

https://dev.mysql.com/doc/refman/5.7/en/column-indexes.html

Index Prefixes

With col_name(N) syntax in an index specification for a string column, you can create an index that uses only the first N characters of the column. Indexing only a prefix of column values in this way can make the index file much smaller. When you index a BLOB or TEXT column, you must specify a prefix length for the index. For example:

CREATE TABLE test (blob_col BLOB, INDEX(blob_col(10)));

Prefixes can be up to 1000 bytes long (767 bytes for InnoDB tables, unless you have innodb_large_prefix set).

Note: Prefix limits are measured in bytes, whereas the prefix length in CREATE TABLE, ALTER TABLE, and CREATE INDEX statements is interpreted as number of characters for nonbinary string types (CHAR, VARCHAR, TEXT) and number of bytes for binary string types (BINARY, VARBINARY, BLOB). Take this into account when specifying a prefix length for a nonbinary string column that uses a multibyte character set.

If a search term exceeds the index prefix length, the index is used to exclude non-matching rows, and the remaining rows are examined for possible matches.
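
Which limit applies to a given table can be checked on the server itself. A rough sketch, assuming the table is the t from the error above and that it lives in the current schema; innodb_large_prefix exists in MySQL 5.7 but was removed in 8.0, so the first statement returns an empty result there.

```sql
-- MySQL 5.7 only: this variable was removed in 8.0.
SHOW VARIABLES LIKE 'innodb_large_prefix';

-- The row format decides between the 767-byte and 3072-byte limits.
SELECT TABLE_NAME, ENGINE, ROW_FORMAT
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 't';
```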

So MySQL 5.7 and 8.0 differ somewhat in how the index prefix length limit for InnoDB tables is determined, but the limit itself is still there. This is one way MySQL differs from Oracle and some other databases.

You can verify the MySQL 8.0 prefix length limits by experiment. For example, create an InnoDB table with the COMPACT row format and a prefix length of 10000. The error reports that the maximum key length is 767 bytes.

create table test01 (
    id int(30) not null auto_increment,
    t_a text,
    primary key(id),
    index idx_t_a(t_a(10000))
) COLLATE='gbk_chinese_ci' ENGINE=InnoDB ROW_FORMAT=COMPACT;

SQL Error [1071] [42000]: Specified key was too long; max key length is 767 bytes

Create an InnoDB table with the COMPRESSED row format and a prefix length of 10000, and the error reports that the maximum key length is 3072 bytes.

create table test01 (
    id int(30) not null auto_increment,
    t_a text,
    primary key(id),
    index idx_t_a(t_a(10000))
) COLLATE='gbk_chinese_ci' ENGINE=InnoDB ROW_FORMAT=COMPRESSED;

SQL Error [1071] [42000]: Specified key was too long; max key length is 3072 bytes
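
For contrast, a prefix that fits within the limit should be accepted. This sketch assumes that gbk stores at most 2 bytes per character, so 383 characters (766 bytes) is the largest prefix that stays within the 767-byte limit of the COMPACT row format. The table name test02 is made up for this example.

```sql
-- 383 characters * 2 bytes (gbk maximum) = 766 bytes, within the 767-byte limit.
CREATE TABLE test02 (
    id  INT NOT NULL AUTO_INCREMENT,
    t_a TEXT,
    PRIMARY KEY (id),
    INDEX idx_t_a (t_a(383))
) COLLATE='gbk_chinese_ci' ENGINE=InnoDB ROW_FORMAT=COMPACT;

-- The Sub_part column shows the prefix length recorded for the index.
SHOW INDEX FROM test02;
```

With a 4-byte character set such as utf8mb4, the same byte limit would allow far fewer characters, which is what the manual's note about multibyte character sets is warning about.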

Setting the technical issue aside, I asked my colleague about the background of this operation. The original requirement was that a vendor's ETL task needed to import data from a source database into a target database. The source column was of VARCHAR type, while the target column was defined as TEXT, which indirectly caused this problem. I can guess at two possible reasons. One is the assumption that, since both VARCHAR and TEXT can store string data, there is no real difference between them as long as the required data fits. The other is convenience: rather than checking the string lengths defined in the source and target databases, the column was simply set to TEXT to guarantee that the data could be stored.

Regardless of the reason, large types such as TEXT are generally not recommended for indexed search columns, because they often store many characters, which takes up more index storage space and reduces the index's selectivity.
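
If the source column really is VARCHAR, one option is to define the target column the same way so that it can be indexed without a prefix. This is a sketch under stated assumptions: the length 64 is hypothetical and should be taken from the source definition, and the first query is there to check that the existing data fits before the column is changed.

```sql
-- Check the longest value currently stored before shrinking the type.
SELECT MAX(CHAR_LENGTH(reg_code)) AS max_len FROM t;

-- Assumption: the source column is VARCHAR(64); adjust to the real definition.
-- MODIFY replaces the whole column definition, so restate NOT NULL / DEFAULT if used.
ALTER TABLE t MODIFY COLUMN reg_code VARCHAR(64);

-- A VARCHAR column that fits within the key length limit can be indexed in full.
CREATE INDEX t_reg_code_idx USING BTREE ON t (reg_code);
```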

Therefore, although this looks like a technical problem, it actually stems from an unreasonable design. If we think more carefully about what is reasonable when designing applications and databases, and avoid so-called shortcuts, things will go more smoothly in actual use, and we will get twice the result with half the effort.

Origin blog.csdn.net/bisal/article/details/133053232