How to create indexes for long byte fields in MySQL


Preface

When a field to be indexed holds long values, indexing the whole value makes the index larger than necessary and slows down searches. This article describes three ways to index long fields sensibly.



Indexing method

1. Prefix index

Given that MySQL supports prefix indexes, we can create a prefix index for long fields. The syntax is as follows:

alter table T add index index01(email(6));

The statement above creates a prefix index on the first 6 characters of email. For a QQ mailbox, the first six characters are part of the QQ number, so each index entry takes up little space.

But what if it is not a QQ mailbox? If the first 6 characters of the address are letters, duplicates are very likely. For example, two different addresses may both start with zhangs. The prefix index cannot tell them apart, so MySQL has to read more rows from the table and compare the full value, which increases the number of rows examined per search.

So how do you choose the prefix length? It depends on the discrimination (selectivity) of the index: the more distinct values a short prefix already produces, the shorter the prefix can be.

Here is an algorithm for reference

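-- First compute the number of distinct values in the full column
select count(distinct field_name) as dis from T;
-- Then try different prefix lengths in turn
select
	count(distinct left(field_name,4)) as dis4,
	count(distinct left(field_name,5)) as dis5,
	count(distinct left(field_name,6)) as dis6,
	count(distinct left(field_name,7)) as dis7
from T;
-- A prefix index always loses some discrimination, so set an acceptable loss, for example 5%.
-- Then, among dis4 to dis7, take the values that are not lower than dis * 0.95
-- and use the shortest corresponding length as the prefix index length.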
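The selection rule above can be sketched outside the database as well. The following is a minimal Python illustration, not part of the original article: the function name `choose_prefix_length`, its parameters, and the sample addresses are all made up for the example, and Python string slicing stands in for MySQL's left().

```python
def choose_prefix_length(values, lengths=(4, 5, 6, 7), max_loss=0.05):
    """Return the shortest prefix length whose distinct count is at least
    (1 - max_loss) of the distinct count of the full values, or None."""
    dis = len(set(values))                # like count(distinct field_name)
    threshold = dis * (1 - max_loss)      # like dis * 0.95 for a 5% loss
    for n in sorted(lengths):
        dis_n = len({v[:n] for v in values})  # like count(distinct left(field_name, n))
        if dis_n >= threshold:
            return n
    return None


# Hypothetical sample data, only for illustration
emails = [
    "zhangsan@example.com",
    "zhangsi@example.com",
    "zhangwu@example.com",
    "lisi@example.com",
    "wangwu@example.com",
]

print(choose_prefix_length(emails))  # prints 7
```

With this sample, prefixes of length 4 to 6 still merge the addresses that begin with zhang, so only length 7 keeps all five values distinct. On a real table the counts would come from the SQL statements above rather than from an in-memory list.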

Note: A prefix index cannot be used as a covering index. Because the index stores only part of the value, MySQL still has to read the row to check the full value.

2. Reverse order storage

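Take ID card numbers as an example. The first six digits of an ID card number are shared by many people, so a prefix index on them discriminates poorly. If the number is stored in reverse order, a prefix index on the first six characters of the stored value actually covers the last six digits of the ID card number, which provide a much greater degree of discrimination.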
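A small Python sketch may make the effect easier to see. It is only an illustration under stated assumptions: the ID numbers are invented, and the helper names `to_stored` and `prefix_distinct` are hypothetical. In MySQL itself the reversal would typically be done with the built-in reverse() function, both when writing the value and when building the search condition.

```python
def to_stored(id_card):
    """Reverse the string, as it would be stored in the reversed column."""
    return id_card[::-1]


def prefix_distinct(values, n):
    """Count distinct prefixes of length n, like count(distinct left(col, n))."""
    return len({v[:n] for v in values})


# Invented 18-digit numbers that share the same first six digits
ids = [
    "110101199001011234",
    "110101199002022345",
    "110101199003033456",
    "110101198512124567",
]
stored = [to_stored(v) for v in ids]

print(prefix_distinct(ids, 6), prefix_distinct(stored, 6))  # prints 1 4

# An equality lookup must reverse the search value in the same way
target = "110101199002022345"
print(to_stored(target) in stored)  # prints True
```

In this sample a 6-character prefix of the original values cannot distinguish any of the rows, while the same prefix length on the reversed values distinguishes all of them.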

3. Hash field

To use a hash field, add another integer field to the table and store in it the hash of the long field, for example the result of crc32().
This keeps the indexed value short while still giving good discrimination. After hashing, however, only equality searches are possible on that field; range searches are not.

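alter table T add id_crc int unsigned, add index(id_crc); -- create the hash field
select field from T where id_crc = crc32('target value');
-- crc32() values can collide, so also compare the original field if an exact match is required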
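The lookup pattern can be sketched in Python as follows. This is a minimal illustration rather than the article's implementation: the class `HashIndexedTable` and its methods are hypothetical, a dictionary plays the role of index(id_crc), and Python's zlib.crc32 is used as a stand-in for MySQL's crc32(). Because two different strings can produce the same CRC32 value, the sketch rechecks the original value after the hash lookup.

```python
import zlib
from collections import defaultdict


def crc(value):
    """32-bit CRC of the UTF-8 bytes; the result fits an unsigned 32-bit integer."""
    return zlib.crc32(value.encode("utf-8"))


class HashIndexedTable:
    """Toy table with one long field and an index on its hash."""

    def __init__(self):
        self.rows = []                    # the long field values
        self.by_crc = defaultdict(list)   # hash value -> row positions

    def insert(self, value):
        self.rows.append(value)
        self.by_crc[crc(value)].append(len(self.rows) - 1)

    def find(self, value):
        # Narrow down by hash first, then compare the full value
        # so that a hash collision cannot return the wrong row.
        candidates = self.by_crc.get(crc(value), [])
        return [i for i in candidates if self.rows[i] == value]


table = HashIndexedTable()
for v in ["a@example.com", "b@example.com", "c@example.com"]:
    table.insert(v)

print(table.find("b@example.com"))  # prints [1]
```

The dictionary lookup supports only exact matches, which mirrors the restriction that a hash field can serve equality searches but not range searches.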


Origin blog.csdn.net/qq_45596525/article/details/115056777