15 questions that explain MySQL indexes

Table of contents

What is an index?

Advantages and disadvantages of indexing?

The role of the index?

Under what circumstances do you need to build an index?

Under what circumstances is the index not built?

Index data structure

What is the difference between Hash index and B+ tree index?

Why is B+ tree more suitable for implementing database index than B tree?

What are the categories of indexes?

What is the leftmost matching principle?

What is a clustered index?

What is a covering index?

Index design principles?

When will the index become invalid?

What is a prefix index?


1. What is an index?

An index is a data structure that the storage engine uses to speed up access to a database table. It works like the index of a dictionary, which lets you find the entry you want without reading every page.

Indexes are generally stored in files on disk, which take up physical space.

2. What are the advantages and disadvantages of indexing?

Advantages:

  • Speeds up data lookups
  • Indexing fields used for sorting or grouping can speed up ORDER BY and GROUP BY
  • Speeds up table-to-table joins

Disadvantages:

  • Indexes take up physical space
  • Inserts, deletes, and updates become slower, because every change to a table record also requires the index to be maintained

3. What is the role of the index?

The data is stored on disk. Without an index, a query has to scan the table sequentially and read every row, so the number of disk reads is large. With an index, there is no need to read all the data: the height of a B+ tree is generally 2-4 levels, so locating a record takes only about 2-4 disk reads (fewer when the upper levels are already cached in memory), and the query is much faster.
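A rough back-of-the-envelope calculation suggests why so few levels are enough. The sketch below assumes a 16 KB page (the InnoDB default), about 14 bytes per key-plus-pointer entry in a non-leaf node, and about 1 KB per row in a leaf node. The last two figures are illustrative assumptions, not measurements.

```python
PAGE_SIZE = 16 * 1024      # bytes per page (InnoDB default)
ENTRY_SIZE = 14            # assumed bytes per (key, child pointer) entry
ROW_SIZE = 1024            # assumed bytes per row stored in a leaf page

fanout = PAGE_SIZE // ENTRY_SIZE         # children per non-leaf node
rows_per_leaf = PAGE_SIZE // ROW_SIZE    # rows per leaf page


def max_rows(height: int) -> int:
    """Rows reachable by a tree with `height` levels (leaf level included)."""
    return fanout ** (height - 1) * rows_per_leaf


for h in (2, 3, 4):
    print(f"height {h}: about {max_rows(h):,} rows")
```

Under these assumptions a three-level tree already addresses roughly 21.9 million rows, and each lookup touches one page per level.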

4. Under what circumstances do you need to build an index?

  1. Fields that are frequently used in query conditions
  2. Fields that are often used in joins, since an index speeds up the join
  3. Fields that often need to be sorted, since the index is already ordered and can speed up sorting

5. Under what circumstances is the index not built?

  1. Fields that are not used in WHERE conditions are not suitable for indexing
  2. The table has few records. For example, with only a few hundred rows there is no need to add an index
  3. Tables whose data is inserted and deleted frequently. Evaluate whether the lookup gain outweighs the maintenance cost
  4. Columns that only appear inside expressions or function calls in the query condition, because an ordinary index on the column cannot be used for such conditions
  5. Fields with low discrimination (selectivity), such as gender, which has only three values: male/female/unknown. An index on such a field will rarely improve query efficiency
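One common way to put a number on "discrimination" is the ratio of distinct values to total rows. The following is a minimal sketch of that ratio; the function name `selectivity` and the sample data are made up for illustration.

```python
def selectivity(values):
    """Fraction of distinct values: close to 1.0 means highly selective."""
    values = list(values)
    return len(set(values)) / len(values) if values else 0.0


# Hypothetical sample data: 1,000 rows.
genders = ["male", "female", "unknown"] * 333 + ["male"]
emails = [f"user{i}@example.com" for i in range(1000)]

print(f"gender: {selectivity(genders):.3f}")
print(f"email:  {selectivity(emails):.3f}")
```

A value near 0 (gender here) means each index entry matches a large share of the table, so the index filters out very little.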

6. Index data structure

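Indexes are mainly built on two data structures, the B+ tree and the hash table, giving B+ tree indexes and hash indexes respectively. In the InnoDB engine the default index type is the B+ tree index. InnoDB does not let you create a hash index explicitly; it uses hashing only internally, through the adaptive hash index that it builds automatically. Explicit hash indexes are available in the MEMORY engine.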

7. B+ tree index

The B+ tree is implemented based on the B tree and leaf node sequential access pointers. It has the balance of the B tree and improves the performance of interval queries through sequential access pointers.

In the B+ tree, the keys in a node are arranged in ascending order from left to right. If the keys to the left and right of a pointer are key_i and key_(i+1) respectively, then all the keys in the subtree that the pointer points to are greater than or equal to key_i and less than or equal to key_(i+1).

When performing a search, first do a binary search in the root node to find the pointer whose range contains the key, then search recursively in the node that the pointer points to. When a leaf node is reached, do a binary search in the leaf node to find the data item corresponding to the key.
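The search procedure can be sketched in a few lines. This is a toy in-memory model, not InnoDB's implementation: the `Leaf` and `Internal` classes are hypothetical, and the sketch assumes the common convention that the child to the right of a separator key holds keys greater than or equal to it.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass
class Leaf:
    keys: list
    values: list
    next: "Leaf | None" = None   # sequential access pointer to the next leaf


@dataclass
class Internal:
    keys: list        # separator keys, ascending
    children: list    # len(children) == len(keys) + 1


def search(node, key):
    """Return the value stored under `key`, or None if it is absent."""
    while isinstance(node, Internal):
        # Binary search picks the child whose key range contains `key`.
        node = node.children[bisect_right(node.keys, key)]
    i = bisect_left(node.keys, key)      # binary search inside the leaf
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i]
    return None


# A hand-built two-level tree with three linked leaves.
leaf1 = Leaf([1, 3], ["a", "b"])
leaf2 = Leaf([5, 7], ["c", "d"])
leaf3 = Leaf([9, 11], ["e", "f"])
leaf1.next, leaf2.next = leaf2, leaf3
root = Internal([5, 9], [leaf1, leaf2, leaf3])

print(search(root, 7))   # d
print(search(root, 4))   # None
```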

The most commonly used index type in MySQL database is BTREE index, and the underlying layer is implemented based on the B+ tree data structure.

mysql> SHOW INDEX FROM blog\G
*************************** 1. row ***************************
        Table: blog
   Not_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: blog_id
    Collation: A
  Cardinality: 4
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
      Visible: YES
   Expression: NULL

8. Hash index

The hash index is implemented on top of a hash table. For each row of data, the storage engine computes a hash code from the index column. The hash algorithm should, as far as possible, produce different hash codes for different column values. The hash code is used as the key of the hash table, and a pointer to the data row is used as the value. A lookup therefore takes O(1) time on average, so hash indexes are generally used for exact-match lookups.
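As a minimal sketch of the idea (not how any MySQL engine actually lays out its hash index), the toy class below hashes the column value to a bucket and stores row pointers there. The class name `HashIndex`, the bucket count, and the sample rows are all assumptions made for illustration.

```python
class HashIndex:
    """Toy hash index: hash code -> bucket of (column value, row pointer)."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, value):
        return self.buckets[hash(value) % len(self.buckets)]

    def add(self, value, row_pointer):
        self._bucket(value).append((value, row_pointer))

    def lookup(self, value):
        """Return row pointers of all rows whose indexed column equals `value`."""
        # Colliding values share a bucket, so compare the stored value too.
        return [ptr for v, ptr in self._bucket(value) if v == value]


# Hypothetical rows; the list position plays the role of the row pointer.
rows = [("alice", 30), ("bob", 25), ("carol", 41), ("bob", 52)]
index = HashIndex()
for pointer, (name, _age) in enumerate(rows):
    index.add(name, pointer)

print(index.lookup("bob"))    # [1, 3]
print(index.lookup("dave"))   # []
```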

9. What is the difference between Hash index and B+ tree index?

  • Hash indexes do not support sorting, because hash tables are unordered.
  • Hash indexes do not support range lookups.
  • Hash indexes do not support fuzzy queries (LIKE) or leftmost prefix matching of multi-column indexes.
  • Because hash tables are subject to hash collisions, the performance of a hash index is unstable, while the performance of a B+ tree index is relatively stable: every query goes from the root node to a leaf node.
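The range-lookup difference can be illustrated with plain Python containers: a sorted list stands in for the ordered leaf keys of a B+ tree, and a dict stands in for a hash index. The data and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical indexed column values mapped to row ids.
ages = {17: "r1", 25: "r2", 31: "r3", 42: "r4", 58: "r5"}

sorted_keys = sorted(ages)    # ordered, like B+ tree leaf keys
hash_index = dict(ages)       # unordered lookup structure


def range_ordered(lo, hi):
    """Range lookup on ordered keys: two binary searches, then a slice."""
    start = bisect_left(sorted_keys, lo)
    end = bisect_right(sorted_keys, hi)
    return [ages[k] for k in sorted_keys[start:end]]


def range_hashed(lo, hi):
    """Range lookup on a hash index: every key must be examined."""
    return sorted(v for k, v in hash_index.items() if lo <= k <= hi)


print(range_ordered(20, 45))   # ['r2', 'r3', 'r4']
print(range_hashed(20, 45))    # ['r2', 'r3', 'r4']
```

Both return the same rows, but the ordered version touches only the matching slice, while the hashed version has to inspect every key.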

10. Why is B+ tree more suitable for implementing database index than B tree?

  • In a B+ tree all the data is stored in the leaf nodes, and the leaf nodes are linked in key order, so a scan or range query only needs to walk the leaf nodes once. A B tree also stores data in its branch nodes, so reading the data in order requires an in-order traversal of the tree. Range queries are very frequent in databases, which is why B+ trees are usually used for database indexes.
  • The non-leaf nodes of a B+ tree store only index keys; the row data (or its address) is kept in the leaf nodes. Each index page can therefore hold more keys, the tree is shallower, and fewer I/O operations are needed.
  • The query efficiency of a B+ tree is more stable: every key lookup follows a path from the root node to a leaf node, so all lookups have the same path length and about the same cost.
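The effect of keeping row data out of the non-leaf nodes can be estimated with a deliberately crude model. All sizes below are assumptions chosen for illustration, and the model ignores details such as page headers and fill factor.

```python
PAGE_SIZE = 16 * 1024   # assumed page size in bytes
KEY_SIZE = 8            # assumed key size
POINTER_SIZE = 6        # assumed child-pointer size
ROW_SIZE = 200          # assumed size of the row data a B tree node carries


def levels_needed(n_keys: int, fanout: int) -> int:
    """Smallest number of levels whose fanout ** levels reaches n_keys."""
    levels = 1
    capacity = fanout
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels


bplus_fanout = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE)             # keys only
btree_fanout = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE + ROW_SIZE)  # keys + rows

n = 10_000_000
print("B+ tree:", bplus_fanout, "entries/page,", levels_needed(n, bplus_fanout), "levels")
print("B tree: ", btree_fanout, "entries/page,", levels_needed(n, btree_fanout), "levels")
```

With these numbers a page holds 1170 entries when it stores keys only but 76 when each entry also carries its row, which costs one extra level (and one extra page read per lookup) for ten million keys.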

11. What are the categories of indexes?

1. Primary key index: a unique index named PRIMARY whose columns do not allow NULL values. A table can have only one.

2. Unique index: the values in the index column must be unique, but NULL values are allowed. The difference from a primary key index is that a unique index field can be NULL, and can contain multiple NULL values, while a primary key field cannot be NULL. The purpose of a unique index is to uniquely identify each record in the table, mainly to prevent duplicate data from being inserted. The SQL statement to create a unique index is as follows:

ALTER TABLE table_name
ADD CONSTRAINT constraint_name UNIQUE KEY(column_1,column_2,...);

3. Composite index: an index created on a combination of multiple fields in the table. It is used only when the leftmost of these fields appears in the query condition; composite indexes follow the leftmost prefix principle.

4. Full-text index: can only be created on CHAR, VARCHAR, and TEXT fields.

5. Ordinary index: the most basic index. It has no uniqueness restriction, and the values can be NULL.

12. What is the leftmost matching principle?

If a query condition uses the leftmost column of a composite index, the SQL statement can use the composite index for matching. Matching proceeds column by column from the left and stops at the first range condition (>, <, BETWEEN, LIKE); that column can still use the index, but the columns after it cannot.

For an index on (a, b, c), queries that filter on a, on a and b, or on a, b, and c can use the index; a query that filters only on b and c cannot.

For an index on (a, b, c, d) and the query condition a = 1 and b = 2 and c > 3 and d = 4, the three fields a, b, and c can use the index, but d cannot, because a range condition was encountered at c.

Suppose an index (a, b) is established. In the index tree, a is globally ordered, while b is globally unordered and only locally ordered (rows with equal a are sorted by b). A query with only the condition b = 2 therefore cannot use the index.

Once the value of a is fixed, b is ordered. For example, when a = 1 the values of b are 1 and 2, which are in order; when a = 2 the values of b are 1 and 4, also in order. For a = 1 and b = 2, both the a and b fields can use the index. For a > 1 and b = 2, the a field can use the index but the b field cannot: a now covers a range of values, and b is not ordered across that range.
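The ordering argument can be reproduced with a sorted list of tuples standing in for the entries of an (a, b) index. This is only a sketch: the data extends the example values with a hypothetical a = 3 group, and the helper names are made up.

```python
from bisect import bisect_left, bisect_right

# Entries of a hypothetical composite index (a, b), kept sorted as in a B+ tree.
index = sorted([(1, 1), (1, 2), (2, 1), (2, 4), (3, 1), (3, 2)])

# b is ordered only within one value of a, not across the whole index.
print([b for _a, b in index])    # [1, 2, 1, 4, 1, 2]


def find_a_b(a, b):
    """a = ? AND b = ?: binary search on the full (a, b) key."""
    lo = bisect_left(index, (a, b))
    hi = bisect_right(index, (a, b))
    return index[lo:hi]


def find_b_only(b):
    """b = ? alone: no usable ordering, so every entry is checked."""
    return [entry for entry in index if entry[1] == b]


def find_a_gt_b(a, b):
    """a > ? AND b = ?: seek to the first entry with a larger a, then filter b."""
    start = bisect_right(index, (a, float("inf")))
    return [entry for entry in index[start:] if entry[1] == b]


print(find_a_b(1, 2))       # [(1, 2)]
print(find_b_only(2))       # [(1, 2), (3, 2)]
print(find_a_gt_b(1, 2))    # [(3, 2)]
```

Only `find_a_b` locates its result purely by binary search; the other two have to filter on b entry by entry.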

13. What is a clustered index?

InnoDB uses the primary key of the table to build the primary key index tree, and the record data of the entire table is stored in its leaf nodes. The leaf nodes of the clustered index are logically contiguous, connected by a doubly linked list, and sorted in primary key order, so sorting and range searches on the primary key are fast.

The leaf nodes of the clustered index are the row records of the entire table. InnoDB primary keys use clustered indexes. A lookup through the clustered index is generally faster than one through a non-clustered (secondary) index, because the row is found directly in the leaf node without a second lookup.

For InnoDB, the clustered index is generally the primary key index of the table. If the table does not explicitly define a primary key, the first unique index whose columns do not allow NULL is chosen. If there is neither a primary key nor a suitable unique index, InnoDB internally generates a hidden primary key as the clustered index. The hidden primary key is 6 bytes long, and its value increases automatically as rows are inserted.
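The selection rule in the last paragraph can be written as a small decision function. The `IndexDef` class, its fields, and the returned placeholder string are hypothetical; the sketch only mirrors the three rules stated above.

```python
from dataclasses import dataclass


@dataclass
class IndexDef:
    name: str
    columns: tuple
    unique: bool = False
    primary: bool = False
    nullable: bool = False   # True if any indexed column allows NULL


def clustered_key(indexes):
    """Pick the clustered index following the three rules described above."""
    for idx in indexes:                       # 1. explicit primary key
        if idx.primary:
            return idx.name
    for idx in indexes:                       # 2. first unique NOT NULL index
        if idx.unique and not idx.nullable:
            return idx.name
    return "<hidden 6-byte row id>"           # 3. internally generated key


with_pk = [IndexDef("uk_email", ("email",), unique=True),
           IndexDef("PRIMARY", ("id",), unique=True, primary=True)]
without_pk = [IndexDef("uk_nick", ("nick",), unique=True, nullable=True),
              IndexDef("uk_email", ("email",), unique=True)]
no_candidate = [IndexDef("idx_age", ("age",))]

print(clustered_key(with_pk))        # PRIMARY
print(clustered_key(without_pk))     # uk_email
print(clustered_key(no_candidate))   # <hidden 6-byte row id>
```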

14. What is a covering index?

With a covering index, all the selected columns can be obtained from the index alone, with no need to go back to the table for a second lookup; in other words, the queried columns must be covered by the index that is used. For a secondary index on an InnoDB table, if the index covers the queried columns, the second lookup in the primary key index can be avoided.

Not all types of indexes can be covering indexes. A covering index needs to store the values of the indexed columns, but hash indexes and full-text indexes do not, so MySQL uses B+ tree indexes as covering indexes.

For a query that uses a covering index, run EXPLAIN on the query and the Extra column of the output shows Using index.

For example, take a user_like table that records users' likes, with a composite index on (user_id, blog_id), where user_id and blog_id are NOT NULL.

explain select blog_id from user_like where user_id = 13;

The Extra column of the EXPLAIN result is Using index. The queried column is covered by the index and the WHERE condition conforms to the leftmost prefix principle, so the matching rows are found directly by an index lookup, without going back to the table.

explain select user_id from user_like where blog_id = 1;

The Extra column of the EXPLAIN result is Using where; Using index. The queried column is covered by the index, but the WHERE condition does not conform to the leftmost prefix principle, so the matching rows cannot be found by an index lookup. They can still be found by scanning the index, without going back to the table.
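The saving from a covering index can be modelled by counting how often a query has to go back to the clustered index. The sketch below is a toy model of the user_like example; the `id` and `created` columns and all row values are assumptions added for illustration.

```python
from bisect import bisect_left

# Clustered index: primary key -> full row (hypothetical data).
clustered = {
    1: {"id": 1, "user_id": 13, "blog_id": 7, "created": "2023-01-02"},
    2: {"id": 2, "user_id": 13, "blog_id": 9, "created": "2023-01-05"},
    3: {"id": 3, "user_id": 20, "blog_id": 7, "created": "2023-02-11"},
}
# Secondary index on (user_id, blog_id); each entry also carries the primary key.
secondary = sorted((r["user_id"], r["blog_id"], pk) for pk, r in clustered.items())


def query(user_id, columns):
    """SELECT columns FROM user_like WHERE user_id = ?; also count table lookups."""
    results, table_lookups = [], 0
    i = bisect_left(secondary, (user_id,))
    while i < len(secondary) and secondary[i][0] == user_id:
        uid, bid, pk = secondary[i]
        entry = {"user_id": uid, "blog_id": bid, "id": pk}
        if set(columns) <= set(entry):
            row = entry                 # covered: answered from the index alone
        else:
            row = clustered[pk]         # not covered: back to the clustered index
            table_lookups += 1
        results.append(tuple(row[c] for c in columns))
        i += 1
    return results, table_lookups


print(query(13, ["blog_id"]))              # ([(7,), (9,)], 0)
print(query(13, ["blog_id", "created"]))   # two rows, 2 table lookups
```

The first query is answered entirely from the secondary index; the second needs one extra lookup per matching row because `created` is not part of the index.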

15. What are the design principles for indexes?

  • Build indexes on fields that are often used as query conditions, to improve query speed
  • Index fields that frequently require sorting, grouping, and union operations
  • The higher the discrimination of the index column, the better the index works. A column with low discrimination, such as gender, makes a poor index.
  • Avoid indexing "large fields". Prefer fields with a small amount of data, because the index stores the field values; large values make the index take up more space and make comparisons slower when sorting.
  • Try to use short indexes. When indexing long strings, specify a prefix length, because smaller indexes involve less disk I/O and faster queries.
  • More indexes are not always better: each index requires additional physical space, and maintaining it also takes time.
  • Do not create indexes on fields that are frequently modified. If a field is modified often, the index has to be updated often, which inevitably affects the performance of MySQL.
  • Follow the leftmost prefix principle when designing composite indexes.
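For the "short index" principle, one possible way to choose a prefix length is to find the shortest prefix that keeps most of the full column's discrimination. The sketch below assumes that approach; the 0.95 threshold, the function names, and the sample e-mail addresses are illustrative choices, not values from MySQL.

```python
def prefix_selectivity(values, length):
    """Distinct prefixes of the given length divided by the number of rows."""
    return len({v[:length] for v in values}) / len(values)


def shortest_prefix(values, target=0.95, max_len=32):
    """Smallest prefix length whose selectivity reaches `target` of the full column's."""
    full = len(set(values)) / len(values)
    for length in range(1, max_len + 1):
        if prefix_selectivity(values, length) >= target * full:
            return length
    return max_len


# Hypothetical column of e-mail addresses.
emails = [
    "alice@example.com", "alina@example.com", "bob@example.com",
    "bobby@example.org", "carol@example.com", "carl@example.com",
    "dave@example.net", "david@example.com",
]
for n in (1, 3, 5):
    print(n, prefix_selectivity(emails, n))
print("chosen length:", shortest_prefix(emails))
```

For this sample, a 3-character prefix distinguishes only half of the rows, while a 4-character prefix already distinguishes all of them.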

Origin blog.csdn.net/agelee/article/details/129814150