MySQL database (8)

Table of contents

1. What is an index?

1.1 Principle of indexing

1.2 Advantages and Disadvantages of Indexes

2. Use of index

2.1 View index

2.2 Manually create indexes

2.3 Delete index

3. The underlying data structure of MySQL index

3.1 B-tree

3.2 B+ tree


1. What is an index?

An index is a special file that contains reference pointers to all records in the data table. You can create an index on one or more columns in the table and specify the type of index. Each type of index has its own data structure implementation.

In layman's terms, an index is like the lookup directory of a dictionary: it exists to speed up queries.

1.1 Principle of indexing

A database index is built on columns. Since a table can have many columns, it can also have many indexes.

  • Indexes are stored on the hard disk of the database server, specifically in files on the disk.

Because a database stores a large amount of data, we create indexes so that a query can quickly locate the rows it needs. An index built over such a large amount of data is itself large, and a database usually has several different indexes. Therefore the indexes generally cannot all be kept in memory.

  • The index trades space for time
  • If no index is created, a query falls back to a full table scan by default.
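
The difference between a full table scan and an index lookup can be observed with MySQL's explain statement. The following is a minimal sketch; the table student, its columns, and the index name idx_student_name are hypothetical names chosen for illustration, and the exact plan depends on the MySQL version and the data in the table.

```sql
-- Hypothetical table used only for illustration.
drop table if exists student;
create table student (
    id int,
    name varchar(20)
);

-- No index exists yet, so the plan typically reports type = ALL (full table scan).
explain select * from student where name = 'Tom';

create index idx_student_name on student (name);

-- With the index in place, the plan typically reports type = ref
-- and lists idx_student_name in the key column.
explain select * from student where name = 'Tom';
```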

1.2 Advantages and Disadvantages of Indexes

Disadvantages:

  • Indexes require additional disk space on the database server.
  • Indexes may slow down inserts, deletes, and updates. Each of these operations has to change not only the table data but also the index files that are maintained alongside it. (In day-to-day work, however, writes are usually much less frequent than queries.)
  • Maintaining indexes consumes database resources.

Advantages:

  • Indexes greatly improve query speed.
  • Indexes reduce the number of disk I/O operations.

2. Use of index

2.1 View index

show index from tb_name; -- tb_name is a custom table name

  • First create a basic table with no constraints and check its indexes: there are none.

  • Drop the table and create it again, this time with a primary key constraint. The database automatically creates an index named PRIMARY. (A primary key must not contain duplicates, so every insert or update first has to look up the existing values; the index is created automatically to make that lookup fast.)

  • Drop the table again and this time add a unique constraint. An index is created automatically here as well, and it is named after the column.

  • Drop the table once more and this time add both a primary key constraint and a foreign key constraint. Creating the foreign key constraint also creates an index automatically, and that index is not unique.

As the examples above show, MySQL automatically creates indexes for primary key, unique, and foreign key constraints.
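
The steps above can be reproduced in one go with a sketch like the following. The tables class and student and their columns are hypothetical names used only for illustration; the comments describe what show index is expected to list under the InnoDB storage engine.

```sql
-- Hypothetical tables for illustration only.
-- Drop the child table first, because it references the parent.
drop table if exists student;
drop table if exists class;

create table class (
    id int primary key,          -- primary key constraint
    name varchar(20) unique      -- unique constraint
);

create table student (
    id int primary key,
    name varchar(20),
    class_id int,
    foreign key (class_id) references class (id)   -- foreign key constraint
);

-- Expected: an index named PRIMARY on id and a unique index on name.
show index from class;

-- Expected: an index named PRIMARY on id and a non-unique index on class_id.
show index from student;
```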

2.2 Manually create indexes

create index index_name on tb_name (column_name);

Creating an index manually on an existing table carries some risk. On a large table the build can take a long time and consume a lot of disk I/O, and depending on the MySQL version and storage engine the table may be unavailable for writes while the index is built; in bad cases the creation fails or the database appears to freeze. It is therefore best to define indexes when the table is created.
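
As a small sketch of the two approaches, the following declares one index as part of the table definition and adds another afterwards. The table student and the index names are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table and index names, for illustration only.
drop table if exists student;

-- Option 1: declare the index as part of the table definition.
create table student (
    id int primary key,
    name varchar(20),
    age int,
    index idx_student_name (name)
);

-- Option 2: add an index to a table that already exists.
create index idx_student_age on student (age);

-- Lists PRIMARY, idx_student_name, and idx_student_age.
show index from student;
```

On an empty or small table the two options behave the same; the risk described above only matters once the table already holds a lot of data.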

When you really do need to add an index to a table that already holds a large amount of data, one approach is to back up the table data, prepare a new MySQL server, create the table together with its indexes on the new server, and load the data into it. Finally, shut down the old MySQL server and switch over to the new one.

2.3 Delete index

drop index index_name on tb_name;

An index that a foreign key constraint relies on cannot be dropped directly: the foreign key constraint (or the table that declares it) has to be removed first. This mirrors the rule for dropping tables, where the child table constrains the parent table.
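
A minimal sketch of both cases follows. All table, index, and constraint names (class, student, idx_student_name, idx_student_class_id, fk_student_class) are hypothetical and used only for illustration.

```sql
-- Hypothetical tables, indexes, and constraint names, for illustration only.
drop table if exists student;
drop table if exists class;

create table class (
    id int primary key
);

create table student (
    id int primary key,
    name varchar(20),
    class_id int,
    index idx_student_name (name),
    index idx_student_class_id (class_id),
    constraint fk_student_class foreign key (class_id) references class (id)
);

-- An ordinary index can be dropped directly.
drop index idx_student_name on student;

-- Dropping idx_student_class_id at this point is expected to be rejected,
-- because the foreign key still needs it. Remove the constraint first:
alter table student drop foreign key fk_student_class;
drop index idx_student_class_id on student;
```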

3. The underlying data structure of MySQL index

The underlying data structure of a MySQL index depends on which storage engine is in use. The current mainstream storage engine is InnoDB, whose index structures live on the hard disk. The rest of this section is based on InnoDB.

Indexes exist for fast lookups, so hash tables and binary search trees are the first data structures that come to mind. However, a database also has to serve fuzzy queries and range queries. A hash table supports only exact-match lookups, so it cannot serve those queries, and a plain binary search tree keeps its keys ordered but can degenerate into a tall, unbalanced tree. The red-black tree comes to mind next: it stays balanced and has O(log n) lookups, but it is still a binary tree, so with a large amount of data the tree becomes deep, and the deeper the tree, the slower the query. This matters little in memory, but a database's index structure lives on the hard disk, where every extra level greatly reduces query efficiency. This leads to the data structures used by databases: the B-tree and the B+ tree.
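
For reference, the three kinds of queries mentioned above look like this. The table student and the index idx_student_name are hypothetical names used only for illustration.

```sql
-- Hypothetical table and index, for illustration only.
drop table if exists student;
create table student (
    id int primary key,
    name varchar(20),
    index idx_student_name (name)
);

-- Exact-match query: a hash table could serve this.
select * from student where id = 10;

-- Range query: needs the keys to be kept in sorted order.
select * from student where id between 10 and 35;

-- Fuzzy (prefix) query: also relies on sorted keys.
select * from student where name like 'Zh%';
```

An ordered index structure can serve all three; a hash-based one can serve only the first.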

3.1 B-tree

The core idea of the B-tree is similar to that of the binary search tree, but the B-tree is an N-ary tree: each node holds several keys and has several children, which greatly reduces the height of the tree and the number of nodes. Why lower the height of the tree?

Because MySQL data is stored in files on the hard disk, data has to be loaded from disk into memory before it can be queried or processed. Disk I/O is very time-consuming, so the more nodes there are and the taller the tree is, the more disk I/O operations a lookup needs, and the lower the query efficiency becomes.
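
To get a rough feel for the effect, assume (purely for illustration) a balanced tree holding N keys in which every node has about m children. Its height h is then approximately:

```latex
h \approx \lceil \log_m N \rceil
\qquad
\text{e.g. } N = 10^{6}:\quad
\lceil \log_{2} 10^{6} \rceil = 20,
\qquad
\lceil \log_{100} 10^{6} \rceil = 3
```

Under this simplified model, a lookup in a binary tree touches about 20 nodes, while a lookup in a 100-way tree touches about 3. If each node visit costs one disk read, that is the difference between roughly 20 and roughly 3 disk I/O operations.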

The main features of B-trees are:

  • Each node of the B-tree stores multiple elements, and each internal node has multiple children.
  • Data is stored in all nodes, internal and leaf alike.
  • Elements in a parent node do not appear again in its child nodes.
  • All leaf nodes are at the same depth, and there are no pointers linking leaf nodes to each other.

The B-tree structure is roughly as follows:

[Figure: B-tree structure]

Although the B-tree is already ideal, there are still areas that can be optimized:

  • The B-tree does not support fast range queries. For example, still based on the figure above, suppose we want the data between 10 and 35. After finding 10, we have to go back to the root node and search again for the next key, so the query needs multiple traversals from the root and its efficiency leaves room for improvement.
  • The lookup cost of the B-tree is unstable because it depends on where the key sits in the tree. In the best case the key is in the root node and the lookup is O(1); in the worst case the search has to descend all the way to a leaf.

Because of these shortcomings of the B-tree, databases introduced the B+ tree.

3.2 B+ tree

The B+ tree is a modified version of the B-tree. It differs from the B-tree as follows:

  • Leaf nodes store the complete set of data. The other nodes no longer store data, only keys.
  • Leaf nodes are connected with bidirectional pointers, so the leaf level forms an ordered doubly linked list that makes range queries easy.

The general structure of the B+ tree:

[Figure: B+ tree structure]

Since the B+ tree places all index entries in the leaf nodes, every query has to descend to a leaf. The number of disk I/O operations per lookup is therefore tied directly to the height of the tree, which makes the query cost more stable.

In addition, since non-leaf nodes no longer store data, a single disk I/O operation can read more keys, and each internal node can cover a larger key range. Therefore, for the same amount of data, a B+ tree is in theory shorter than a B-tree, which reduces the number of disk I/O operations.
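
A back-of-the-envelope calculation illustrates the point. The numbers below are assumptions chosen only to keep the arithmetic simple: a node size P of 16 KB (InnoDB's default page size), 16 bytes per key-plus-child-pointer entry, and 240 bytes of row data stored next to each key in the B-tree case. The fan-out f is the number of entries that fit in one node:

```latex
f_{\mathrm{B+}} = \left\lfloor \frac{16384}{16} \right\rfloor = 1024,
\qquad
f_{\mathrm{B}} = \left\lfloor \frac{16384}{16 + 240} \right\rfloor = 64

\text{For } N = 10^{6} \text{ keys:}\quad
\lceil \log_{1024} 10^{6} \rceil = 2,
\qquad
\lceil \log_{64} 10^{6} \rceil = 4
```

Under these assumptions, keeping row data out of the internal nodes raises the fan-out from 64 to 1024 and cuts the estimated number of levels from about 4 to about 2. Real trees differ in the details, but the direction of the effect is the same.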

The InnoDB storage engine in MySQL uses the B+ tree.

Origin blog.csdn.net/x2656271356/article/details/131859262