MySQL database indexes (Part 1)

Previous review:

1. A data page consists of seven parts: File Header (general information about the page), Page Header (information about the records stored in the page), Infimum + Supremum (the virtual minimum and maximum records of the page), User Records (where the real user data is stored), Free Space (the unused space from which new records are allocated), Page Directory (the relative positions of records in the page, stored as slots), and File Trailer (used to check whether the 16 KB data page is complete).

2. The records in each page form a singly linked list, maintained by the next_record field in each record header, which stores the offset in bytes from the current record to the next one. The pages themselves form a doubly linked list, maintained by the current page number, previous page number, and next page number stored in each page's File Header.

3. The records in a page are divided into groups. The virtual minimum record is a group by itself, the group containing the virtual maximum record holds at most 8 records, and when a group is full it is split into two ordinary groups. The offset within the page of the last record in each group is stored as a slot. To find a record in the page, a binary search over the slots determines the group, and then the group is searched record by record.

Lookup without index:

In one page:

If the data is in a single page and the search condition is on the primary key, we can use binary search. As the previous article explained, the records in each page are grouped, and the Page Directory part of the page stores the slots. We locate the group that contains the record through the slots and then search within that group.

If the search condition is not on the primary key, binary search does not work, because the records are not ordered by that column. We can only check the records one by one.
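
The in-page lookup described here can be sketched roughly as follows. This is a simplified model, not InnoDB's actual layout: the page is a plain Python list of primary keys, the Infimum and Supremum records are left out, each slot holds the largest key of its group rather than a byte offset, and the names build_slots and find_in_page are made up for illustration.

```python
from bisect import bisect_left

# Hypothetical model of one page: primary keys in ascending order.
records = [1, 3, 5, 8, 10, 12, 20, 25, 30, 41]

def build_slots(keys, group_size=4):
    """Return the largest key of each group, one entry per slot."""
    return [keys[min(i + group_size, len(keys)) - 1]
            for i in range(0, len(keys), group_size)]

def find_in_page(keys, slots, target, group_size=4):
    """Binary-search the slots, then scan only the matching group."""
    slot_no = bisect_left(slots, target)   # first slot whose largest key >= target
    if slot_no == len(slots):
        return None                        # target is larger than every key in the page
    start = slot_no * group_size
    for key in keys[start:start + group_size]:
        if key == target:
            return key
    return None

slots = build_slots(records)
print(slots)                               # [8, 25, 41]
print(find_in_page(records, slots, 12))    # 12
print(find_in_page(records, slots, 13))    # None
```

Only one group of at most four records is scanned per lookup, however many records the page holds.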

In many data pages:

Searching for unindexed data across many data pages is even worse: we can only walk from page to page and check the records one at a time. If the database holds hundreds of millions of rows, this approach is too slow to be realistic.
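
As a rough illustration of why this is slow, the sketch below walks a chain of pages and counts how many rows it has to check. The page numbers, the column names c1 and c2, and the function scan_for_c2 are all assumptions made up for the example.

```python
# Hypothetical model: each page holds rows (c1 = primary key, c2 = non-indexed column)
# and the page number of the next page in the list.
pages = {
    10: {"rows": [(1, 4), (3, 9), (5, 3)], "next": 28},
    28: {"rows": [(8, 7), (10, 4), (12, 7)], "next": 9},
    9:  {"rows": [(20, 2), (25, 5)], "next": None},
}

def scan_for_c2(pages, first_page, wanted):
    """Walk the page list from the first page, checking every row."""
    matches, rows_checked = [], 0
    page_no = first_page
    while page_no is not None:
        page = pages[page_no]
        for c1, c2 in page["rows"]:
            rows_checked += 1
            if c2 == wanted:
                matches.append(c1)
        page_no = page["next"]
    return matches, rows_checked

print(scan_for_c2(pages, 10, 7))   # ([8, 12], 8)
```

Every row in every page is visited, so the cost grows in step with the size of the table.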

The implementation of the index:

1: First, let's create the table that we will use today:

2: We insert data:

 

3: We use a simplified diagram to show how these records are stored:

The previous article raised the question of how records are ordered. The figure above gives a clue: all records are arranged in order of their primary key values.

4: Suppose a page can hold only three records. A real page can hold many more, but assuming three makes the demonstration easier. Now we add one more record:

We can see that the newly inserted record has been placed in a new page. The odd thing is that the previous page is numbered 10 while the new page is numbered 28. Why? Page numbers do not have to be consecutive. The order of the pages is maintained by the doubly linked list, using the page number, previous page number, and next page number stored in the File Header, as mentioned earlier. We also said that records must be kept in order of the primary key, so the result should be as shown in the following figure:

5: Now that we understand how the records are arranged, let's insert several more records in a row and look at the result:

Do you see the problem? Slots help us locate a record quickly within a single page, but if the database holds many pages, how do we find the right page? Do we have to traverse the doubly linked list of pages? Obviously not, so we need to create something new for each page.

6: Directory entry:

We can see that four new things have been created. What are they? They are directory entries. Each page corresponds to one directory entry, and a directory entry stores the page number plus the smallest primary key value in that page. Does this look familiar? A directory entry is very similar to an ordinary record. Recall the record_type attribute in the record header: 0 means an ordinary record, 1 means a non-leaf node record, 2 means the virtual minimum record, and 3 means the virtual maximum record. This attribute is what distinguishes our own records from directory entry records. Apart from that, a directory entry does not have the three columns that the database adds automatically. We then need to allocate a page for these directory entries so that they can be stored:

7: At this point we can roughly understand the structure of the index. But then another question arises: what do we do when there are many directory entry pages? We simply build another level of directory entry pages above them. Each entry at this level stores the page number of a lower-level directory entry page and the smallest primary key value in that page.

8: An index search locates the record level by level. The top page is called the root node, the pages in the middle are called inner nodes, and the pages at the bottom are called leaf nodes. In each page we use binary search over the slots to quickly find the next page or the group that contains the record, and then traverse the group to find it. Finally, this structure is what we call the B+ tree, and it is not as complicated as it sounds.
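
The steps above can be sketched as a small in-memory model. It is only an illustration under the article's demo assumption of three entries per page: pages are plain dictionaries, a directory entry is the smallest key of a child page paired with the child itself instead of a page number, and the names build_level, build_tree, and search are hypothetical.

```python
from bisect import bisect_right

PAGE_CAPACITY = 3  # the article's demo assumption; real pages hold far more

def build_level(children):
    """Group (min_key, node) pairs into directory pages of PAGE_CAPACITY entries."""
    level = []
    for i in range(0, len(children), PAGE_CAPACITY):
        entries = children[i:i + PAGE_CAPACITY]
        node = {"leaf": False,
                "keys": [k for k, _ in entries],
                "children": [n for _, n in entries]}
        level.append((entries[0][0], node))
    return level

def build_tree(sorted_keys):
    """Build leaf pages from sorted primary keys, then stack directory levels."""
    level = []
    for i in range(0, len(sorted_keys), PAGE_CAPACITY):
        keys = sorted_keys[i:i + PAGE_CAPACITY]
        level.append((keys[0], {"leaf": True, "keys": keys}))
    while len(level) > 1:
        level = build_level(level)
    return level[0][1]

def search(node, target):
    """Descend from the root; return (found, pages_visited)."""
    visited = 1
    while not node["leaf"]:
        # Pick the last directory entry whose smallest key is <= target.
        idx = max(bisect_right(node["keys"], target) - 1, 0)
        node = node["children"][idx]
        visited += 1
    return target in node["keys"], visited

root = build_tree(list(range(1, 28)))   # 27 keys -> 9 leaf pages
print(search(root, 20))                 # (True, 3)
print(search(root, 99))                 # (False, 3)
```

With 27 keys the model has 9 leaf pages, 3 directory pages, and 1 root, so any lookup touches 3 pages instead of walking up to 9 leaf pages in sequence.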

Summary:

What we described above is the clustered index that the InnoDB storage engine builds on the primary key for each table. A clustered index is an index whose leaf nodes contain the complete row data.

Secondary index:

Now that we have covered the clustered index, let's look at the secondary index. A secondary index is an index that we create ourselves. Sometimes the business requires us to search, sort, or group by one or more columns, and the best way to speed this up is to create an index on those columns:

Let's look at the leaf nodes first. They consist of the indexed columns plus the primary key column, and they are sorted by the indexed columns. Then look at the directory entries, which consist of the indexed columns plus page numbers. From this we can draw the following conclusions:

  1. The difference between a secondary index and a clustered index is that the leaf nodes of a secondary index do not contain the complete row.

  2. A secondary index stores only the indexed columns and the primary key. What if we need data from other columns? We go back to the table: the primary key obtained from the secondary index is used to search the clustered index.
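
The two conclusions can be illustrated with a minimal sketch. The rows, the column names c1, c2, and c3, and the function find_by_c2 are assumptions made up for the example, and a plain dictionary stands in for the clustered index B+ tree.

```python
from bisect import bisect_left

# Hypothetical rows keyed by primary key c1; c2 is the indexed column, c3 is not indexed.
clustered = {1: (1, 4, "u"), 3: (3, 9, "d"), 5: (5, 3, "y"),
             8: (8, 7, "a"), 10: (10, 4, "o"), 12: (12, 7, "d")}

# Secondary index leaf entries: (c2, c1) pairs sorted by c2, then by c1.
secondary = sorted((c2, c1) for c1, c2, _ in clustered.values())

def find_by_c2(value):
    """Locate matching entries in the secondary index, then go back to the table."""
    rows = []
    i = bisect_left(secondary, (value,))      # first entry with c2 >= value
    while i < len(secondary) and secondary[i][0] == value:
        primary_key = secondary[i][1]
        rows.append(clustered[primary_key])   # the "back to the table" lookup
        i += 1
    return rows

print(secondary)        # [(3, 5), (4, 1), (4, 10), (7, 8), (7, 12), (9, 3)]
print(find_by_c2(4))    # [(1, 4, 'u'), (10, 4, 'o')]
```

Each match costs one extra lookup in the clustered index, which is the price of needing columns that the secondary index does not store.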

Joint index:

We looked at the clustered index and the secondary index above. Next, let's look at the joint index. A joint index is simply an index on a combination of multiple columns. Here we use c2 and c3 to create one:

We can see that a joint index is an index built on multiple columns and sorted by those columns in order. In the figure above we use the two columns c2 and c3, so the entries are sorted by c2 first, and entries with the same c2 are sorted by c3. One note in advance: to use a joint index, the query must start from the leftmost column, that is, the column that is sorted first. We'll go into more detail in the next article.
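
A small sketch may help show why the leftmost column matters. The rows and the helper names find_by_c2 and find_by_c3 are hypothetical, and a sorted Python list stands in for the leaf level of the joint index.

```python
from bisect import bisect_left, bisect_right

# Hypothetical rows (c1, c2, c3); c1 is the primary key.
rows = [(1, 4, "u"), (3, 9, "d"), (5, 3, "y"), (8, 4, "a"), (10, 4, "o"), (12, 9, "a")]

# Joint index on (c2, c3): entries are ordered by c2 first, then c3, then c1.
joint = sorted((c2, c3, c1) for c1, c2, c3 in rows)
c2_column = [entry[0] for entry in joint]   # sorted, because c2 leads the index

def find_by_c2(value):
    """A condition on the leftmost column maps to one contiguous range."""
    lo, hi = bisect_left(c2_column, value), bisect_right(c2_column, value)
    return joint[lo:hi]

def find_by_c3(value):
    """c3 alone is not sorted across the index, so every entry must be checked."""
    return [entry for entry in joint if entry[1] == value]

print(joint)
# [(3, 'y', 5), (4, 'a', 8), (4, 'o', 10), (4, 'u', 1), (9, 'a', 12), (9, 'd', 3)]
print(find_by_c2(4))             # [(4, 'a', 8), (4, 'o', 10), (4, 'u', 1)]
print([e[1] for e in joint])     # ['y', 'a', 'o', 'u', 'a', 'd']
```

The c3 values are ordered only within each run of equal c2 values, so a condition on c3 alone cannot use binary search over the index.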

Indexes of the MyISAM storage engine:

The storage engine we mainly use is InnoDB, but for completeness let's briefly introduce the indexes of the MyISAM storage engine. One sentence is enough: the leaf nodes of a MyISAM index do not store the real row data, only the indexed value together with the location of the row in the separate data file. In other words, every MyISAM index, including the primary key index, behaves like a secondary index.

Index creation and deletion syntax:

1: Create an index when creating a table (INDEX and KEY are interchangeable):

 CREATE TABLE table_name (column definitions, INDEX|KEY index_name (columns used by the index));

2: Add an index by altering the table:

 ALTER TABLE table_name ADD INDEX|KEY index_name (columns used by the index);

3: Drop an index by altering the table:

 ALTER TABLE table_name DROP INDEX|KEY index_name;


Origin http://43.154.161.224:23101/article/api/json?id=324834910&siteId=291194637