I studied with senior engineers at Alibaba for three months before compiling these MySQL study notes. With them, landing a big-company offer in spring recruiting is well within reach!

Preface

As the number of system users keeps growing, the importance of MySQL indexes is self-evident. Back-end engineers can keep improving system performance, and build systems with high performance, high concurrency, and high availability, only after they understand indexes and their optimization rules and apply them in real work.

This time the editor will first introduce the concepts behind MySQL indexes, then a number of rules for optimizing them, and finally use those rules for detailed case analysis of the points most often tested in interviews. The editor has also compiled more than 500 pages of MySQL optimization study notes to share with you, in the hope that you can master MySQL tuning faster. This article covers the following content:

The index material is explained in the following parts:

  • 1. An overview of the index (what an index is, its advantages and disadvantages)
  • 2. The basic use of the index (creating an index)
  • 3. The basic principle of the index (interview focus)
  • 4. The data structure of the index (B-tree, hash)
  • 5. Principles for creating an index (the most important part, and a must-ask interview topic! Bookmark it!)
  • 6. How to delete a million or more rows

1. An overview of the index

1) What is an index?

Indexes are a special type of file (the index on an InnoDB table is an integral part of the tablespace) that contains reference pointers to all records in the table. Put more plainly, an index is like a table of contents. Suppose you are using the Xinhua Dictionary and its index pages have been torn out: to look up idioms that begin with a given character, you can only leaf from the first page to the thousandth. Exhausting! Get the index pages back and you can locate the entry quickly.

2) The advantages and disadvantages of the index:

An index can greatly speed up data retrieval, which is the main reason for creating one. With an index in place, the query optimizer also has more efficient access paths to choose from, which improves system performance. However, indexes have disadvantages too: they carry extra maintenance cost. Because an index is a separate structure, every insert, update, and delete also triggers extra operations on the index, which cost additional I/O and reduce the efficiency of those write operations.

2. The basic use of the index

1) Create an index: (three ways)

The first way: define the index in the CREATE TABLE statement when the table is created

The second way: use the ALTER TABLE command to add an index:

ALTER TABLE table_name ADD INDEX index_name (column_list)
ALTER TABLE table_name ADD UNIQUE (column_list)
ALTER TABLE table_name ADD PRIMARY KEY (column_list)

ALTER TABLE can create ordinary indexes, UNIQUE indexes, or PRIMARY KEY indexes.

Here table_name is the name of the table to add the index to, and column_list specifies which columns to index. When there are multiple columns, separate them with commas.

The index name index_name is optional. If it is omitted, MySQL assigns a name based on the first indexed column. In addition, ALTER TABLE allows multiple changes to a table in a single statement, so several indexes can be created at the same time.

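The third way: use the CREATE INDEX command:

CREATE INDEX index_name ON table_name (column_list)
CREATE UNIQUE INDEX index_name ON table_name (column_list)

CREATE INDEX can add an ordinary index or a UNIQUE index to a table, but it cannot create a PRIMARY KEY index.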
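
The CREATE INDEX forms can be tried out without a MySQL server. The sketch below uses Python's built-in sqlite3 module as a stand-in, which is an assumption made only so the example runs anywhere: SQLite accepts the same CREATE INDEX and CREATE UNIQUE INDEX syntax, but it is not MySQL and does not support ALTER TABLE ... ADD INDEX. The table users and the index names are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

# Ordinary index and UNIQUE index, written the same way as in MySQL.
conn.execute("CREATE INDEX idx_users_name ON users (name)")
conn.execute("CREATE UNIQUE INDEX uk_users_email ON users (email)")

conn.execute("INSERT INTO users (name, email) VALUES ('ann', 'ann@example.com')")

# A UNIQUE index rejects a second row with the same indexed value.
try:
    conn.execute("INSERT INTO users (name, email) VALUES ('bob', 'ann@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# List the indexes that now exist on the table.
index_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'users'"
    )
)
print(duplicate_rejected, index_names)
```

The duplicate insert fails because the unique index enforces its constraint on every write, which is also a small reminder that each index adds work to inserts.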

3. The basic principle of the index

Indexes are used to quickly find records with specific values. Without an index, a query generally has to scan the entire table.
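
The difference between a full scan and an index lookup can be observed in a query plan. The following is a minimal sketch that again uses Python's sqlite3 module as a stand-in for MySQL (an assumption; in MySQL the EXPLAIN statement serves the same purpose, and its output looks different). The orders table, its data, and the index name are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [(f"customer{i % 100}", i) for i in range(1000)],
)

def plan(sql):
    """Return the plan description lines SQLite reports for a query."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM orders WHERE customer = 'customer7'"

# Without an index on customer, the plan describes a scan of the whole table.
plan_without_index = plan(query)

# After the index is created, the plan refers to the index instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan lines depends on the SQLite version, so the sketch prints them rather than assuming a fixed text.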

The principle of indexing is simple: turn a search over unordered data into a search over ordered data.

1. Sort the contents of the indexed column

2. Build an inverted list from the sorted result

3. Attach the chain of data addresses to each entry in the inverted list

4. When querying, first look up the entry in the inverted list, then follow its data address chain to fetch the actual rows
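
The four steps above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how MySQL stores an index: the "table" is a plain list, a row's position in that list stands in for its data address, and the names build_index and lookup are hypothetical.

```python
from bisect import bisect_left

# A tiny "table": the list position plays the role of the row address.
rows = [
    {"id": 1, "city": "Hangzhou"},
    {"id": 2, "city": "Beijing"},
    {"id": 3, "city": "Shanghai"},
    {"id": 4, "city": "Beijing"},
]

def build_index(rows, column):
    """Steps 1-3: sort the column values and attach row addresses to each value."""
    postings = {}
    for address, row in enumerate(rows):
        postings.setdefault(row[column], []).append(address)
    keys = sorted(postings)                     # step 1: ordered values
    return keys, [postings[k] for k in keys]    # steps 2-3: value -> address chain

def lookup(index, rows, value):
    """Step 4: binary-search the sorted keys, then follow the address chain."""
    keys, chains = index
    pos = bisect_left(keys, value)
    if pos == len(keys) or keys[pos] != value:
        return []
    return [rows[address] for address in chains[pos]]

city_index = build_index(rows, "city")
print(lookup(city_index, rows, "Beijing"))
```

Because the keys are sorted, the lookup is a binary search rather than a pass over every row, which is the whole point of the ordering step.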

4. The data structure of the index (B-tree, hash)

1) B-tree index

MySQL reads data through a storage engine, and the great majority of people use InnoDB. By implementation, the two index types discussed here are the BTREE (B-tree) index and the HASH index. InnoDB builds its regular indexes as B-trees and uses hashing only internally (the adaptive hash index), while explicit HASH indexes are available in engines such as MEMORY. The B-tree index is the most frequently used index type in MySQL, and basically all storage engines support it. When we talk about an index without further qualification, we usually mean the B-tree index (it is actually implemented as a B+ tree, but because MySQL always prints BTREE when showing a table's indexes, it is simply called a B-tree index).

Query paths:

Primary key index area (PI): each entry is associated with the address where the row data is saved. A query by primary key goes through this area.

Ordinary index area (SI): each entry is associated with the row's id, which is then used to reach the address above. This is why a query by primary key is the fastest.
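
The two query paths can be modeled with two dictionaries. This is only an illustrative sketch: real indexes are trees rather than Python dicts, and the names primary_index, name_index, find_by_id, and find_by_name are hypothetical.

```python
# Primary key index: id -> row data (stands in for "the address of the saved data").
primary_index = {
    1: {"id": 1, "name": "ann", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
}

# Ordinary index on name: name -> id, not the row itself.
name_index = {"ann": 1, "bob": 2}

def find_by_id(row_id):
    """One lookup: the primary key index leads straight to the row."""
    return primary_index.get(row_id)

def find_by_name(name):
    """Two lookups: the ordinary index gives the id, then the primary key index gives the row."""
    row_id = name_index.get(name)
    if row_id is None:
        return None
    return find_by_id(row_id)

print(find_by_name("bob"))
```

The extra hop in find_by_name is the reason a primary key query is the cheaper of the two.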

B+ tree properties:

1) A node with n subtrees contains n keys. These keys do not store data; they only store the index that leads to the data.

2) The leaf nodes together contain all the keys, along with pointers to the records that hold those keys, and the leaf nodes themselves are linked in ascending key order.

3) All non-leaf nodes can be regarded as the index part; each of them contains only the largest (or smallest) key of each of its subtrees.

4) In a B+ tree, data objects are inserted and deleted only at the leaf nodes.

5) A B+ tree has two head pointers: one points to the root node of the tree, and the other points to the leaf node with the smallest key.
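
Several of these properties can be seen in a deliberately small sketch. It assumes a tree with only two levels (one root above a row of leaves) and supports no insertion or node splitting, so it is a teaching model rather than a real B+ tree implementation. The names Leaf, build, and range_scan are hypothetical.

```python
from bisect import bisect_right

class Leaf:
    """Leaf node: sorted keys with their records, plus a link to the next leaf."""
    def __init__(self, keys, records):
        self.keys = keys
        self.records = records
        self.next = None

def build(pairs, leaf_size=2):
    """Build a two-level tree from (key, record) pairs; returns (root_keys, leaves)."""
    pairs = sorted(pairs)
    leaves = []
    for i in range(0, len(pairs), leaf_size):
        chunk = pairs[i:i + leaf_size]
        leaves.append(Leaf([k for k, _ in chunk], [r for _, r in chunk]))
    for left, right in zip(leaves, leaves[1:]):
        left.next = right                           # leaves linked in key order
    root_keys = [leaf.keys[0] for leaf in leaves]   # root keeps each leaf's smallest key
    return root_keys, leaves

def range_scan(tree, low, high):
    """Find the leaf that may hold `low` via the root, then walk the leaf links."""
    root_keys, leaves = tree
    if not leaves:
        return []
    leaf = leaves[max(bisect_right(root_keys, low) - 1, 0)]
    found = []
    while leaf is not None:
        for key, record in zip(leaf.keys, leaf.records):
            if key > high:
                return found
            if key >= low:
                found.append(record)
        leaf = leaf.next
    return found

tree = build([(k, f"row{k}") for k in [7, 1, 5, 3, 9, 11]])
print(range_scan(tree, 3, 9))
```

The root holds one key per leaf and no records, and the range scan never goes back up the tree once it reaches a leaf, because the linked leaves already give the keys in order.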

2) Hash index

Briefly, a hash index resembles the simple hash table from a data structures course. When MySQL uses a hash index, a hash algorithm (common ones include direct addressing, the mid-square method, the folding method, the division-remainder method, and the random number method) converts the column value into a fixed-length hash value, and the row pointer for that record is stored at the corresponding position in the hash table. If a hash collision occurs (two different keys have the same hash value), the entries are stored as a linked list under that hash key. Of course, this is only a simplified model.
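
The description above maps onto a short sketch. It assumes the division-remainder method and uses Python lists as the collision chains. The HashIndex class is hypothetical and is not how any MySQL storage engine implements its hash index.

```python
class HashIndex:
    """Hash index sketch: each bucket holds a chain of (key, row pointer) pairs."""

    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _slot(self, key):
        # Division-remainder method: hash value modulo the number of buckets.
        return hash(key) % len(self.buckets)

    def insert(self, key, row_pointer):
        self.buckets[self._slot(key)].append((key, row_pointer))

    def lookup(self, key):
        # Colliding keys share a bucket, so the stored key is compared as well.
        return [ptr for k, ptr in self.buckets[self._slot(key)] if k == key]

index = HashIndex(bucket_count=4)
for pointer, key in enumerate([10, 14, 21, 14]):
    index.insert(key, pointer)

# With 4 buckets, keys 10 and 14 land in the same bucket and are chained together.
print(index.lookup(14), index.lookup(10))
```

A structure like this answers equality lookups directly but keeps no key order, so it cannot serve range scans the way the linked leaves of a B+ tree can.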

PS: Readers who want to go deeper into data structures can follow me and check the [Data Structure] topic. I will not explain them in detail here.

5. Principles for creating an index (the top priority)

Although indexes are good, they should not be used without limit. It is best to follow these principles:

1) The leftmost prefix matching principle, a very important principle for composite indexes: MySQL keeps matching index columns from left to right until it meets a range condition (>, <, BETWEEN, LIKE), and then stops matching. For example, take a = 1 AND b = 2 AND c > 3 AND d = 4. With an index on (a, b, c, d), d cannot use the index. With an index on (a, b, d, c), all four columns can use it, and the order of a, b, and d can be adjusted freely.

2) Create indexes for fields that are frequently used as query conditions

3) Frequently updated fields are not suitable for index creation

4) Columns that cannot effectively distinguish rows are not suitable as index columns (for example gender: male, female, or unknown gives three values at most, so the selectivity is too low)

5) Extend an existing index where possible instead of creating a new one. For example, if the table already has an index on a and you now want an index on (a, b), you only need to modify the original index.

6) Data columns with foreign keys must be indexed.

7) Do not create indexes on columns that rarely appear in queries or on columns with many duplicate values.

8) Do not create indexes for columns defined as text, image, and bit data types.
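
Principles 1) and 4) lend themselves to a small sketch. The first function is a simplified model of leftmost prefix matching as described above and is not the MySQL optimizer's actual logic. The second computes selectivity as distinct values divided by total rows. The function names, the operator set (which adds >= and <= to the range operators listed above), and the sample data are assumptions made for illustration.

```python
RANGE_OPERATORS = {">", "<", ">=", "<=", "between", "like"}

def usable_index_columns(index_columns, conditions):
    """Return the leading index columns a query can use under the leftmost prefix rule.

    `conditions` maps a column name to its operator, e.g. {"a": "=", "c": ">"}.
    """
    used = []
    for column in index_columns:
        operator = conditions.get(column)
        if operator is None:
            break                  # a gap in the prefix: matching stops here
        used.append(column)
        if operator.lower() in RANGE_OPERATORS:
            break                  # a range condition is the last column matched
    return used

def selectivity(values):
    """Distinct values divided by total rows; values near 1.0 mean high selectivity."""
    return len(set(values)) / len(values) if values else 0.0

query = {"a": "=", "b": "=", "c": ">", "d": "="}
print(usable_index_columns(["a", "b", "c", "d"], query))
print(usable_index_columns(["a", "b", "d", "c"], query))
print(selectivity(["male", "female", "unknown", "male", "female", "male"]))
```

On a real table the gender column would score far lower than in this six-row sample, since the three distinct values would be divided by the full row count.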

6. How to delete a million or more rows

About indexes: an index carries extra maintenance cost. Because an index is a separate structure, every insert, update, and delete also triggers extra operations on it, which cost additional I/O and slow those writes down. Accordingly, the official MySQL manual notes that the time needed to delete rows grows in proportion to the number of indexes on the table.

So when we want to delete millions of rows, we can drop the indexes first (this took a little over three minutes)

Then delete the useless data (this step took less than two minutes)

After the deletion is finished, recreate the indexes (there is less data by now). Index creation is also fairly fast, about ten minutes.

This is definitely much faster than deleting directly with the indexes in place. Worse, if a direct deletion is interrupted, everything deleted so far is rolled back, which is an even bigger pitfall.
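
The drop, delete, and recreate sequence looks roughly like this. The sketch runs on Python's sqlite3 module with a small made-up logs table, so it only illustrates the order of the steps and says nothing about the timings quoted above. SQLite is a stand-in here (an assumption): in MySQL the index would be dropped with DROP INDEX index_name ON table_name or ALTER TABLE table_name DROP INDEX index_name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, level TEXT, created_day INTEGER)")
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.executemany(
    "INSERT INTO logs (level, created_day) VALUES (?, ?)",
    [("debug" if i % 2 else "error", i % 30) for i in range(10_000)],
)

# 1. Drop the index so the bulk delete does not have to maintain it.
conn.execute("DROP INDEX idx_logs_level")

# 2. Delete the rows that are no longer needed.
deleted = conn.execute("DELETE FROM logs WHERE level = 'debug'").rowcount

# 3. Recreate the index on the now smaller table.
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
print(deleted, remaining)
```

While the index is missing, queries that relied on it fall back to full scans, so a procedure like this fits best in a maintenance window.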

Summary

That is all for today's explanation of indexes. The key points are the basic principle of the index and the principles for creating an index, which come up in almost every interview, so bookmark this article and read it a few times until it sinks in. Finally, everyone is welcome to discuss together. If you like the article, remember to follow me and like it. Thank you for your support!

Origin blog.csdn.net/QLCZ0809/article/details/112385115