Index model and design considerations under the InnoDB storage engine

Index Model

Under the InnoDB storage engine, only the leaf nodes of the primary key index (the clustered index) store the full row data. The leaf nodes of every other (non-clustered) index store only the values of the indexed columns together with the primary key value of the record.

  • Once id is declared as the primary key, a primary key index (the clustered index) is automatically built on the id column. If the table has no primary key, the storage engine uses the first unique index whose columns are all NOT NULL as the clustered index. If there is no such index either, the storage engine generates a hidden row ID column and clusters the data on it. The B+tree diagram of the clustered index is as follows:

[Figure: B+tree diagram of the clustered index]
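As a minimal sketch of the three cases in the bullet above, the following statements show where the clustered index comes from. The column types and the tables tb_unique_only and tb_no_key are assumptions made up for illustration; the original only names the table tb and its columns id, name, age, and class_id.

```sql
-- Case 1: explicit primary key. InnoDB clusters the rows on id.
-- Column types are assumed for illustration.
CREATE TABLE tb (
    id       BIGINT      NOT NULL AUTO_INCREMENT,
    name     VARCHAR(32) NOT NULL DEFAULT '',
    age      INT         NOT NULL DEFAULT 0,
    class_id INT         NOT NULL DEFAULT 0,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

-- Case 2 (hypothetical table): no primary key, but a UNIQUE index on a
-- NOT NULL column. InnoDB uses that index as the clustered index.
CREATE TABLE tb_unique_only (
    code VARCHAR(16) NOT NULL,
    note VARCHAR(64),
    UNIQUE KEY uk_code (code)
) ENGINE = InnoDB;

-- Case 3 (hypothetical table): neither a primary key nor a usable unique
-- index. InnoDB clusters the rows on a hidden, internally generated row ID.
CREATE TABLE tb_no_key (
    note VARCHAR(64)
) ENGINE = InnoDB;
```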

In the clustered index, the leaf nodes store the row data, sorted by primary key, and the directory (non-leaf) pages are built on top of them. The primary key should therefore be kept short: a page is 16 KB by default, so the shorter the key, the more keys fit into one directory page and the fewer directory pages the tree needs, which makes it cheaper to maintain and search.

Note that the root directory page is normally kept in memory. Every primary key lookup starts at the root page, and every page that has to be loaded from disk costs one I/O, so keeping the root page cached saves one I/O per lookup. The root directory page stores the minimum primary key value of each directory page on the second level, which is enough to decide which of those pages a given key belongs to. The same comparison is repeated level by level until the search reaches the leaf page that holds the primary key and its row. Because the primary key is unique, the search can stop as soon as a matching record is found.
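To make the effect of key length concrete, here is a rough back-of-the-envelope estimate. It assumes about 9 bytes of per-entry overhead in a directory page and ignores page headers and fill factor, so the numbers are only indicative and are not exact InnoDB figures.

```sql
-- Page size of the running server (16384 bytes by default).
SELECT @@innodb_page_size AS page_bytes;

-- Approximate number of directory entries per page for two key sizes.
-- The 9 bytes of per-entry overhead is an assumption for illustration.
SELECT
    FLOOR(@@innodb_page_size / (8  + 9)) AS entries_with_8_byte_key,   -- e.g. BIGINT
    FLOOR(@@innodb_page_size / (36 + 9)) AS entries_with_36_byte_key;  -- e.g. a UUID string
```

Under these assumptions, the 8-byte key fits more than twice as many entries into one directory page as the 36-byte key, so the tree needs fewer directory pages and often fewer levels.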

Example:

select * from tb where id = 20

To find the row with id equal to 20 (page numbers refer to the diagram above), start at the root directory page. The root shows that the minimum key of page 25 is 28, which is greater than 20, so the search goes to page 19. Page 19 shows that the minimum key of page 73 is 24, which is also greater than 20, so the search goes to page 24, where the record is finally found.
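One way to confirm that such a query goes through the clustered index is to inspect its execution plan. This is a sketch that assumes tb has id as its primary key; the exact EXPLAIN output varies by MySQL version.

```sql
-- Assumes tb(id PRIMARY KEY, name, age, class_id).
EXPLAIN SELECT * FROM tb WHERE id = 20;
-- For a primary key equality lookup on an existing row, the plan
-- typically reports key = PRIMARY and type = const.
```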

  • If an index is added to the name field, the B+tree diagram of that index is as follows:

[Figure: B+tree diagram of the non-clustered index on name]

The leaf nodes of the non-clustered index B+tree do not store the full row; they store the indexed value and the primary key value. Once the matching primary key value has been found, the engine performs a back-to-table lookup, that is, it takes the id and searches the clustered index for the row.

SQL example:

select * from tb where name = 'aabbcc'

Here the search runs on the name index: it goes from page 19 to page 24, locates the entries with name = 'aabbcc', and then takes the primary key stored in each entry to the clustered index for the back-to-table lookup.

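Why does the directory node save the primary key in addition to the name field? Because name values can repeat, the primary key is appended so that every entry is unique and duplicates have a well-defined order. It also means that if the query condition includes the primary key, the primary keys can be compared inside the index.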
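The back-to-table step can be observed by comparing two queries on the name index. This sketch assumes a single-column index on tb.name; the index name idx_name is hypothetical.

```sql
-- Hypothetical single-column index on name.
CREATE INDEX idx_name ON tb (name);

-- Needs every column, so each match found in idx_name is followed by a
-- lookup in the clustered index (back-to-table).
EXPLAIN SELECT * FROM tb WHERE name = 'aabbcc';

-- Needs only name and id, both already stored in the idx_name leaf nodes,
-- so no clustered index lookup is required. EXPLAIN typically shows
-- "Using index" in the Extra column for this case.
EXPLAIN SELECT id, name FROM tb WHERE name = 'aabbcc';
```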

  • For a joint index on the name, age, and class_id fields, the B+tree diagram is as follows:
    [Figure: B+tree diagram of the joint index on name, age, class_id]

Here the joint index is created with its columns in this order:

create index mul_name_age_class_id on tb(name,age,class_id);

With this index, entries are sorted by name first; entries with the same name are sorted by age; and entries with the same name and age are sorted by class_id. In other words, if you ignore the name field, the age and class_id values are not in order. A query must therefore include a condition on name (the leftmost column); otherwise the index cannot be used to locate rows.

select * from tb where age = 18 and name = 'zs' and class_id = 1

The SQL above can use the index, even though the index columns are in the order name, age, class_id. The conditions in the where clause can be written in any order, because the optimizer rearranges them to match the index.
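The leftmost-column requirement can be checked with a few execution plans. This is a sketch that assumes the joint index mul_name_age_class_id from above exists on tb; the literal values are arbitrary.

```sql
-- Assumes the joint index mul_name_age_class_id on tb(name, age, class_id).

-- Can seek on the index: the leading column name is constrained.
EXPLAIN SELECT * FROM tb WHERE name = 'zs';
EXPLAIN SELECT * FROM tb WHERE name = 'zs' AND age = 18;
EXPLAIN SELECT * FROM tb WHERE class_id = 1 AND age = 18 AND name = 'zs';

-- Cannot seek on the index: name is missing, so the age and class_id
-- values are not in sorted order across the index.
EXPLAIN SELECT * FROM tb WHERE age = 18;
EXPLAIN SELECT * FROM tb WHERE age = 18 AND class_id = 1;
```

Depending on the MySQL version and on which columns the query selects, the optimizer may still choose to read the whole index instead of the table for the last two queries. That is a scan of the index, not a direct seek to the matching entries.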

Index Design Specifications

  1. Suitable for indexing
  • Tables with a large amount of data
  • Fields that often appear in the where clause
  • Columns with high selectivity (many distinct values)
  • Tables with few inserts, deletes and updates
  • Columns that are queried by equality lookups
  2. Not suitable for indexing
  • Tables with fewer than about 1000 rows. When a table is this small, an index makes almost no difference to query speed, yet it still has to be maintained, which costs performance.
  • Columns with low selectivity (few distinct values).
  • Fields that are never used in a where clause.
  • Redundant or duplicate indexes.
  • Too many indexes on frequently updated tables.
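The criteria in the two lists above can be checked against real data before creating an index. The following is a minimal sketch that assumes the table tb with its name and class_id columns.

```sql
-- Selectivity of candidate columns: values close to 1 mean almost every
-- row has a distinct value; values close to 0 mean many duplicates.
SELECT
    COUNT(DISTINCT name)     / COUNT(*) AS name_selectivity,
    COUNT(DISTINCT class_id) / COUNT(*) AS class_id_selectivity
FROM tb;

-- List the indexes already defined on the table, to spot redundant or
-- duplicate ones before adding another.
SHOW INDEX FROM tb;
```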
  3. Index failure
  • An implicit type conversion happens when the where condition is evaluated
  • A function is applied to the indexed column in the where condition
  • The conditions do not include the leading column of the joint index
  • is not null or != conditions often prevent the index from being used
  • The or keyword prevents index use (unless the columns on both sides of the or are indexed)
  • A like condition with a left fuzzy match (a leading %) cannot use the index
  • In a joint index, the fields after a range condition cannot use the index
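Several of the cases in this list can be reproduced with EXPLAIN. This is a sketch that assumes tb has a string column name and the joint index on (name, age, class_id); the literal values are arbitrary, and the plans the optimizer picks can differ between MySQL versions and data distributions.

```sql
-- Assumes tb(id PRIMARY KEY, name VARCHAR, age INT, class_id INT) with
-- the joint index on (name, age, class_id).

-- Implicit conversion: name is a string column compared with a number.
EXPLAIN SELECT * FROM tb WHERE name = 123;

-- Function applied to the indexed column.
EXPLAIN SELECT * FROM tb WHERE LEFT(name, 2) = 'aa';

-- Left fuzzy match: the leading wildcard prevents a seek.
EXPLAIN SELECT * FROM tb WHERE name LIKE '%bcc';
-- A right fuzzy match keeps the prefix and can still seek.
EXPLAIN SELECT * FROM tb WHERE name LIKE 'aab%';

-- Range on age: name and age narrow the index range, but class_id,
-- which comes after the range column, can only be checked entry by entry.
EXPLAIN SELECT * FROM tb WHERE name = 'zs' AND age > 18 AND class_id = 1;
```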

  4. Index design considerations
  • For a single-column index, choose a column with high selectivity
  • Keep the primary key monotonically increasing where possible, so that rows are inserted in order
  • For a joint index, put fuzzy-search and range-search columns (such as the creation time) at the back, and put frequently queried columns and equality-search columns at the front
  • Try not to have more than 6 indexes on one table
  • When indexing a string column, consider indexing only a fixed-length prefix of the field
  • Give nullable fields a default value, for example an empty string for varchar and 0 for int, so that queries need no is not null condition that could keep the index from being used
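The prefix-index and default-value advice above can be sketched as follows. The prefix length 10, the index name idx_name_prefix, and the VARCHAR(32) column definition are all assumptions for illustration; pick the prefix length by comparing selectivities on your own data.

```sql
-- Compare the selectivity of a 10-character prefix with that of the
-- full column. If the two are close, the shorter prefix index is enough.
SELECT
    COUNT(DISTINCT LEFT(name, 10)) / COUNT(*) AS prefix_selectivity,
    COUNT(DISTINCT name)           / COUNT(*) AS full_selectivity
FROM tb;

-- Prefix index on the first 10 characters of name (hypothetical name).
CREATE INDEX idx_name_prefix ON tb (name(10));

-- Replace NULL with a default so queries do not need IS NOT NULL.
-- Existing NULLs are updated first; otherwise the ALTER can fail
-- in strict SQL mode.
UPDATE tb SET name = '' WHERE name IS NULL;
ALTER TABLE tb MODIFY name VARCHAR(32) NOT NULL DEFAULT '';
```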

Origin blog.csdn.net/qq_43750656/article/details/124448420