MySQL database indexes: how indexes work, creating indexes in practice, and adding, deleting, modifying, and querying indexes

Table of contents

1. What is a database index?

1. The role of index

2. Classification of indexes

2. Principle of indexing

① Index structure: B-tree index, balanced tree

② Inserting into a B-tree

③ B-tree read process

④ Comparison between B-tree and B+tree

3. How to create an index

1. Create a test table

2. Clustered index (primary key index)

① Clustered index (primary key index)

3. Non-clustered index

① Ordinary index

② Unique index

③ Full text index

④ Composite index

4. Creating indexes based on the SQL (indexes in practice)

1. =, >=, <=, <>, and, or

① from test_excel where user_cardid = 'xxx'

② from test_excel where user_cardid = 'xxx' and user_age = 'xxx'  and user_name = 'xxx'

③ from test_excel where user_cardid = 'xxx' and user_age = 'xxx'  or user_name = 'xxx'

④ from test_excel where user_cardid = 'xx' and user_age >= 80 and user_age <= 100 and user_name <> 'xx'

2. in, not in, is null, is not null

3. like, order by

① from test_excel where user_name like '张三129%'

4. Join queries (left join)

Summary


Take the MySQL database as an example:

1. What is a database index?

1. The role of index

        When a table holds a large amount of data, an index can greatly speed up queries on it, reduce the number of I/O operations, and lower resource consumption. It is a classic way of trading space for time;

2. Classification of indexes

        There is no single standard classification of index types, because storage engines differ (MySQL's InnoDB engine is used as the example here), but they can generally be divided into three categories:

                1. Clustered index: also called the primary key index; all rows are stored sorted by the primary key;

                2. Non-clustered index: the family of ordinary indexes can be grouped under this name, for example: ordinary index, unique index, full-text index;

                3. Composite index: often also called a combined index. As the name suggests, it is one index built over several columns. Strictly speaking it is also a non-clustered index unless it is the primary key; it is listed separately here because it is used differently;

2. Principle of indexing

        Without an index, the database matches rows one by one from top to bottom (a full table scan). If the table has 1 million rows and the row you want happens to be the last one, you are out of luck: every row has to be checked first. Indexes exist to solve exactly this problem;
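        To make the cost difference concrete, here is a minimal sketch that counts comparisons for a top-to-bottom scan versus a lookup over sorted keys. It is only an analogy: a plain sorted Python list stands in for the index, and the function names `full_scan` and `sorted_lookup` are made up for this illustration.

```python
def full_scan(rows, target):
    """Check rows one by one; return (position, comparisons made)."""
    for comparisons, value in enumerate(rows, start=1):
        if value == target:
            return comparisons - 1, comparisons
    return -1, len(rows)


def sorted_lookup(sorted_rows, target):
    """Binary search over sorted keys; return (position, comparisons made)."""
    lo, hi, comparisons = 0, len(sorted_rows), 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_rows[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_rows) and sorted_rows[lo] == target
    return (lo if found else -1), comparisons


rows = list(range(1, 1_000_001))       # 1 million keys, already sorted
print(full_scan(rows, 1_000_000))      # the worst case: the last row
print(sorted_lookup(rows, 1_000_000))  # at most 20 comparisons for 1 million keys
```

        The scan needs one comparison per row, while the sorted lookup halves the search space on every step. A real index adds disk blocks on top of this idea, as described next.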

       ① Index structure: B-tree index, a balanced tree (some people read the name as "B-minus tree", but that literal reading is misleading; the hyphen is not a minus sign)

        A B-tree index has several important concepts: order, root node, internal node, and leaf node. You also need the concept of a disk block, because both the index and the data are stored on disk, and every read fetches one disk block (which can loosely be thought of as a page). A disk block holds many records. Each disk read is one I/O operation that loads all the nearby data into a buffer at once, and the value you want is then located in the buffer. How much data is read at a time depends on your operating system;

        So what is a B-tree? See the figure (I could not find a suitable one online, so I drew it myself, in the style of a legendary "soul painter")

         As shown in the picture:

                1. Root node: disk block 1;

                2. Internal nodes: disk blocks 2 and 3;

                3. Leaf nodes: disk blocks 4, 5, 6, 7, 8, 9, and 10;

                4. Order: the maximum number of child nodes a node can have is called the order. In the figure above the maximum is 4, so it is called a 4th-order B-tree;

        These nodes obey fixed proportions. A balanced tree must satisfy the following rules (time to test your math):

                1. Let m be the maximum number of child nodes per node (that is, the order). Every internal node other than the root then has at least m/2 child nodes, rounded up;

                2. If the root is not a leaf node, it must have at least 2 child nodes, which may be internal nodes or leaf nodes;

                3. (This is very important) All leaf nodes are on the same level (that is, at the same depth) and have no child nodes;
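        The first two rules can be written down as a small helper. This is only a sketch of the arithmetic; the name `child_bounds` is invented here and is not part of any library.

```python
import math


def child_bounds(order, is_root=False):
    """Return (minimum, maximum) child count for a non-leaf node of the given order."""
    # Rule 2: a non-leaf root needs at least 2 children.
    # Rule 1: every other internal node needs at least ceil(order / 2).
    minimum = 2 if is_root else math.ceil(order / 2)
    return minimum, order


print(child_bounds(4))                # internal node of a 4th-order tree
print(child_bounds(5))                # odd orders round up
print(child_bounds(5, is_root=True))  # the root is allowed to be emptier
```

        For a 4th-order tree an internal node therefore has between 2 and 4 children, and for a 5th-order tree between 3 and 5.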

        ② Inserting into a B-tree

                Taking the figure above as an example, we insert elements into a 4th-order B-tree:

                1. Insert 4 elements: 1, 2, 3, 4

                2. Insert another element: 5. In this walkthrough a node holds at most 4 elements, so the fifth element forces a split. Why is 3 the element that gets split out? This introduces another concept: taking the median. Of 1, 2, 3, 4, 5 the median is 3, so 3 moves up to become the parent, with 1, 2 on its left and 4, 5 on its right

                3. Insert 2 more elements: 6, 7

                4. Insert another element: 8, which causes another split. The node now holds 5 elements (4, 5, 6, 7, 8), so the median 6 is split out and moves up next to 3.

                5. Later insertions work the same way, so I will not go through them one by one. Deletion is roughly the reverse: a node that becomes too small borrows from or merges with a sibling.
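                The walkthrough above can be replayed with a short sketch. It assumes, as the steps do, that a node holds at most 4 elements and splits around the median when a fifth arrives. The names `Node`, `insert`, and `dump` are illustrative, and this is not how InnoDB is implemented.

```python
from bisect import bisect_left, insort

MAX_KEYS = 4  # assumption taken from the walkthrough: a node splits on its 5th key


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # an empty list means this is a leaf


def _insert(node, key):
    """Insert key below node; return (median, right_node) if node split, else None."""
    if node.children:
        i = bisect_left(node.keys, key)
        split = _insert(node.children[i], key)
        if split:
            median, right = split
            node.keys.insert(i, median)         # promoted median joins the parent
            node.children.insert(i + 1, right)  # new right sibling sits after it
    else:
        insort(node.keys, key)
    if len(node.keys) <= MAX_KEYS:
        return None
    mid = len(node.keys) // 2                   # take the median
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split:
        median, right = split
        return Node([median], [root, right])    # the tree grows one level taller
    return root


def dump(node):
    """Render the tree as nested Python data for printing."""
    if not node.children:
        return node.keys
    return {"keys": node.keys, "children": [dump(c) for c in node.children]}


root = Node()
for k in range(1, 9):  # insert 1..8 as in the steps above
    root = insert(root, k)
print(dump(root))
# {'keys': [3, 6], 'children': [[1, 2], [4, 5], [7, 8]]}
```

                After inserting 5 the root is 3, and after inserting 8 the median 6 joins it, matching steps 2 and 4.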

        ③ B-tree read process

                

                 As shown in the figure, suppose the data for "7" needs to be read. The process is: first read the root node (disk block 1); comparing keys shows that "7" lies under disk block 2, so follow that pointer and read disk block 2; comparing again shows that "7" is in disk block 6, so follow that pointer, read disk block 6 into the buffer, and find the data for "7" there.

                The whole process performs only 3 I/O operations, which greatly reduces the number of I/O operations when the amount of data is large. As the figure shows, the number of I/O operations for an indexed lookup depends only on the height of the B-tree (key point).
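                The read process can be sketched by counting how many nodes a lookup touches, with each node standing in for one disk block. The three-level tree below is hypothetical (it is not the author's figure), and `leaf` and `search` are names invented for this example.

```python
from bisect import bisect_left


def leaf(*keys):
    """A leaf is a (keys, children) pair with no children."""
    return (list(keys), [])


# Hypothetical three-level tree; each tuple stands in for one disk block.
tree = ([9], [
    ([3, 6], [leaf(1, 2), leaf(4, 5), leaf(7, 8)]),
    ([12], [leaf(10, 11), leaf(13, 14)]),
])


def search(node, key):
    """Return (found, reads), where reads counts the nodes that were visited."""
    reads = 0
    while True:
        keys, children = node
        reads += 1                       # one node visited = one block read
        i = bisect_left(keys, key)
        if i < len(keys) and keys[i] == key:
            return True, reads
        if not children:                 # reached a leaf without a match
            return False, reads
        node = children[i]               # follow the pointer to the child


print(search(tree, 7))   # (True, 3): root, internal node, leaf
print(search(tree, 9))   # (True, 1): found in the root itself
print(search(tree, 15))  # (False, 3): never more reads than the tree height
```

                No lookup reads more nodes than the tree has levels, which is why the height is what matters.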

        ④ Comparison between B-tree and B+tree

                That is about it for the B-tree. Now let us compare the B-tree and the B+tree. Note: these three points come from a Baidu search; I have not studied the B+tree in depth yet.

                1. Space: the internal nodes of a B+tree hold only keys and no pointers to the row data (B-tree nodes do), so a B+tree's internal nodes take up less space.

                2. Read process: a B+tree lookup always travels from the root node down to a leaf node (a B-tree lookup may stop earlier), so every query costs about the same and there is no obvious difference in speed between them.

                3. Range search (the most important point): a B-tree does nothing to speed up range searches, whereas in a B+tree the leaf nodes are linked in key order, so a range search can walk from leaf to leaf.
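                Point 3 can be illustrated with a sketch of linked leaves. It models only the leaf level; the root-to-leaf descent that would normally locate the starting leaf is left out, so `start_leaf` is simply the first leaf. `Leaf`, `link`, and `range_scan` are illustrative names.

```python
from bisect import bisect_left


class Leaf:
    def __init__(self, keys):
        self.keys = keys  # sorted keys stored in this leaf
        self.next = None  # pointer to the next leaf in key order


def link(leaves):
    """Chain the leaves together and return the first one."""
    for current, following in zip(leaves, leaves[1:]):
        current.next = following
    return leaves[0]


def range_scan(start_leaf, low, high):
    """Collect the keys in [low, high] by walking the leaf chain."""
    result, current = [], start_leaf
    while current is not None:
        for key in current.keys[bisect_left(current.keys, low):]:
            if key > high:
                return result  # past the upper bound: stop walking
            result.append(key)
        current = current.next
    return result


first = link([Leaf([1, 2]), Leaf([4, 5]), Leaf([7, 8]), Leaf([10, 11])])
print(range_scan(first, 4, 8))  # [4, 5, 7, 8]
```

                Once the scan reaches the first matching leaf it only follows `next` pointers and never climbs back up the tree.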

3. How to create an index

        1. Create a test table

CREATE TABLE test_excel (
  user_id varchar(255) NOT NULL COMMENT 'Primary key id of the table',
  user_name varchar(255) NOT NULL COMMENT 'User name, cannot be null',
  user_age int(11) DEFAULT NULL COMMENT 'User age, may be null',
  user_cardid varchar(255) NOT NULL COMMENT 'ID card number, cannot be null, duplicates not allowed',
  PRIMARY KEY (user_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

                 While writing the EasyExcel import in my previous blog post I had already created this test table and loaded data into it. It holds 2 million test rows and is used for everything that follows.

        2. Clustered index (primary key index)

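                ① Clustered index (primary key index)

                        Create: alter table test_excel add primary key (user_id);  (only needed when the table has no primary key yet; the CREATE TABLE statement above already declares one)

                        Delete: alter table test_excel drop primary key;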

        3. Non-clustered index

                ① Ordinary index

                        Create: alter table test_excel add index index_name(user_cardid)

                        Delete: alter table test_excel drop index index_name

                ② Unique index

                        Create: alter table test_excel add unique index_age(user_age)  (this fails if the column already contains duplicate values, as user_age in the test data almost certainly does; user_cardid, which must not repeat, is the natural candidate)

                        Delete: alter table test_excel drop index index_age

                ③ Full-text index

                        Create: alter table test_excel add fulltext index_name(user_name,user_cardid)

                        Delete: alter table test_excel drop index index_name

                ④ Composite index

                        Create: alter table test_excel add index index_name(user_name,user_age,user_cardid)

                        Delete: alter table test_excel drop index index_name

4. Creating indexes based on the SQL (indexes in practice)

        1. =, >=, <=, <>, and, or

                ① from test_excel where user_cardid = 'xxx'

                        Use a normal index: alter table test_excel add index index_cardid(user_cardid)

                ② from test_excel where user_cardid = 'xxx' and user_age = 'xxx'  and user_name = 'xxx'

                        Contains (=, and). In this case either an ordinary index or a composite index can be used. An ordinary index is recommended here, because it only needs one column to take effect. Suggestion: write the indexed column first in the where clause (user_cardid here) to keep the query readable, although the optimizer does not depend on the textual order of conditions joined with and

                        Use a normal index: alter table test_excel add index index_cardid(user_cardid)

                        A composite index is recommended when the column values repeat a lot and many conditions are joined with and: alter table test_excel add index index_name(user_cardid,user_age,user_name)

                ③ from test_excel where user_cardid = 'xxx' and user_age = 'xxx'  or user_name = 'xxx'

                        Contains (=, and, or). For the and part, see case ②. The column on the other side of or needs its own separate index; without it the or condition forces a scan of the whole table and the other index has no effect.

                        Use a normal index: alter table test_excel add index index_name(user_name)

                ④ from test_excel where user_cardid = 'xx' and user_age >= 80 and user_age <= 100 and user_name <> 'xx'

                        Contains (=, and, >=, <=, <>). Either an ordinary index or a composite index can be used; a composite index is used here.

                        Use a composite index: alter table test_excel add index index_name(user_cardid,user_age,user_name)

                        Note: whether a range query uses the index depends on how many rows the range matches. If the range covers too large a share of the table, the index will not be used. When it is used, the best access type explain reports for a range condition is "range".

        2. in, not in, is null, is not null

                   Just create a normal index. This is the same as case ④ above, so there is no need to repeat it, but again pay attention to how many rows the query matches. If it matches too much of the table, the index will not be used.

        3. like, order by

                    ① from test_excel where user_name like '张三129%'

                        This also uses ordinary indexes: alter table test_excel add index index_name(user_name)

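                        Note: with like '%xxx' the index cannot be used for the lookup because % is on the far left; with % in any other position it can. If % must be on the far left, the only option is a covering index, where every column the query reads is contained in the index so that the index alone is scanned. For example, if the query also reads the user_age field: alter table test_excel add index index_name(user_name,user_age)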

                       ② from test_excel where user_age < 5000 order by user_age desc

                        Use a normal index: alter table test_excel add index index_age(user_age)
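                        Why the position of % matters in the like case above can be shown with a sorted list standing in for the index on user_name. This sketch assumes plain code-point ordering rather than a MySQL collation, and a non-empty prefix; `prefix_range` and `suffix_scan` are names made up for the illustration.

```python
from bisect import bisect_left


def prefix_range(sorted_names, prefix):
    """like 'prefix%': two binary searches bound one contiguous slice."""
    # Smallest string that sorts after every string starting with prefix.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    lo = bisect_left(sorted_names, prefix)
    hi = bisect_left(sorted_names, upper)
    return sorted_names[lo:hi]


def suffix_scan(names, suffix):
    """like '%suffix': matches are scattered, so every entry must be checked."""
    return [name for name in names if name.endswith(suffix)]


names = sorted(["张三128", "张三129", "张三1290", "张三1299000", "张三13", "李四129"])
print(prefix_range(names, "张三129"))  # ['张三129', '张三1290', '张三1299000']
print(suffix_scan(names, "129"))       # ['张三129', '李四129']
```

                        A prefix pattern maps to one contiguous range of the sorted keys, while a suffix pattern gives the sort order nothing to work with.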

        4. Join queries (left join)

from test_excel e 
    left join test_pay p 
    on e.user_id = p.user_id 
    where e.user_id = '1499561693366382961' and e.user_name = '张三1299000'
    and p.pay_id = 1 and p.pay_number = 'xxx'

        Use composite indexes: alter table test_excel add index index_sum(user_id,user_name)
                               alter table test_pay add index index_sum(pay_id,pay_number)

         Join queries usually involve many fields (this is only an example), so a composite index is generally recommended. An ordinary index also works, but MySQL generally uses only one index per table in a query, so when many fields are involved an ordinary index will not be as efficient as a composite index;

 

Summary

        When using indexes, remember the leftmost-prefix principle: a composite index can only be used if the query has a condition on its first (leftmost) column, and then on the following columns in index order without gaps. Pay close attention to the column order when you define the index, otherwise queries will struggle to use it;

        Composite index and query conditions: if you create an index with alter table test_excel add index index_name(A,B,C), the index can serve conditions on (A), (A, B), or (A, B, C). It is most effective when all three columns are constrained: where a='xxx' and b='xxx' and c='xxx';
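        The leftmost-prefix rule can be expressed as a tiny model. It deliberately considers only equality conditions joined with and, and ignores range conditions and anything else the optimizer may do; `usable_prefix` is a hypothetical helper, not a MySQL function.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that the equality conditions can use."""
    used = []
    for column in index_columns:  # walk the index from its leftmost column
        if column not in equality_columns:
            break                 # a gap ends the usable prefix
        used.append(column)
    return used


index = ["a", "b", "c"]                       # index_name(A,B,C)
print(usable_prefix(index, {"a", "b"}))       # ['a', 'b']
print(usable_prefix(index, {"a", "c"}))       # ['a']: the gap at b stops it
print(usable_prefix(index, {"b", "c"}))       # []: the leftmost column is missing
print(usable_prefix(index, {"c", "b", "a"}))  # ['a', 'b', 'c']
```

        The last call shows that it is the column order in the index that matters in this model, not the order in which the conditions are written.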

        Query all indexes of a table: show index from test_excel

        To check whether a statement uses an index, just put the explain keyword in front of it: explain select * from test_excel;

        Access type levels reported by explain, from worst to best: all, index, range, ref, eq_ref, const, system. "all" is a full table scan and "index" is a full scan of the index, so both are roughly equivalent to not using the index. A query should be optimized to at least the range level

        Note: Sometimes a query becomes slower after an index is created. First check whether the new index conflicts with an existing one. If the statement itself is fine, try dropping the index and creating it again.

Origin blog.csdn.net/xiaobug_zs/article/details/123222578