Detailed explanation of MySQL indexes (reprinted)

What is an index

Reprinted from: http://www.cnblogs.com/ggjucheng/archive/2012/11/04/2754128.html

Indexes are used to quickly find records with specific values. Most MySQL indexes (PRIMARY KEY, UNIQUE, and ordinary INDEX) are stored as B-trees. Without an index, MySQL must scan the table from the first record onward until it has found every record that meets the requirements. The more records in the table, the higher the cost of this operation. If an index exists on the column used as the search condition, MySQL can quickly locate the target record without scanning the whole table. For a table with 1000 records, looking up a record through an index can be at least 100 times faster than reading the records sequentially.

Suppose we create a table named people: 

CREATE TABLE people ( peopleid SMALLINT NOT NULL, name CHAR(50) NOT NULL ); 

Then we insert 1000 different name values into the people table in random order. The figure below shows a small part of the data file where the people table is stored:

As you can see, the name column has no particular order in the data file. If we create an index on the name column, MySQL sorts the name values in the index:

For each entry in the index, MySQL internally keeps a "pointer" to the actual location of the record in the data file. So, if we want to find the peopleid of the record whose name equals "Mike" (the SQL command is "SELECT peopleid FROM people WHERE name='Mike';"), MySQL can look up the value "Mike" in the index on name, go directly to the corresponding row in the data file, and return the peopleid (999) for that row. During this process, MySQL only needs to process one row to return the result. Without an index on the name column, MySQL would have to scan all the records in the data file, i.e. 1000 records. The fewer records MySQL needs to process, the faster it can complete the task.
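
The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: it presumes the two-column people table created earlier, and the index name idx_name is a hypothetical choice, not something the original article specifies.

```sql
-- Assumes the people (peopleid, name) table defined above already exists.
-- idx_name is a hypothetical index name chosen for this sketch.

-- Build a sorted index on name; each entry points back to its row.
CREATE INDEX idx_name ON people (name);

-- With the index in place, this lookup does not need a full table scan.
SELECT peopleid FROM people WHERE name = 'Mike';

-- EXPLAIN reports which index, if any, the optimizer chose for the query.
EXPLAIN SELECT peopleid FROM people WHERE name = 'Mike';
```

On a very small table the optimizer may still prefer a scan, so the difference is easiest to see once the table holds a realistic number of rows.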


Types of index

MySQL provides a variety of index types to choose from, and each index is explained in detail below.


Normal index

This is the most basic type of index, and it carries no restrictions such as uniqueness. Normal indexes can be created in the following ways: 
Create an index, such as CREATE INDEX <name of index> ON tablename (list of columns); 
Modify a table, such as ALTER TABLE tablename ADD INDEX [name of index] (list of columns); 
Specify the index when creating the table, e.g. CREATE TABLE tablename ( [...], INDEX [name of index] (list of columns) ); 
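
As a concrete sketch of the three forms, the statements below create ordinary indexes on a hypothetical products table. The table, its columns, and all index names are assumptions made up for this example.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE products (
  productid INT NOT NULL,
  sku       CHAR(20) NOT NULL,
  category  CHAR(30) NOT NULL,
  INDEX idx_category (category)          -- index declared with the table
);

-- Index added to the existing table with CREATE INDEX.
CREATE INDEX idx_sku ON products (sku);

-- Index added to the existing table with ALTER TABLE.
ALTER TABLE products ADD INDEX idx_category_sku (category, sku);

-- List the indexes that now exist on the table.
SHOW INDEX FROM products;
```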

Unique index

This kind of index is basically the same as the "normal index" above, with one difference: every value in the indexed column can appear only once, that is, the values must be unique. Unique indexes can be created in several ways: 
Create an index, such as CREATE UNIQUE INDEX <name of index> ON tablename (list of columns); 
Modify a table, such as ALTER TABLE tablename ADD UNIQUE [name of index] (list of columns); 
Specify the index when creating the table, e.g. CREATE TABLE tablename ( [...], UNIQUE [name of index] (list of columns) ); 
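
The uniqueness rule can be seen with a small sketch. The accounts table, its columns, and the index name idx_email are hypothetical and exist only for this example.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE accounts (
  accountid INT NOT NULL,
  email     VARCHAR(100) NOT NULL,
  UNIQUE idx_email (email)
);

-- The first insert succeeds.
INSERT INTO accounts (accountid, email) VALUES (1, 'mike@example.com');

-- The second insert repeats the email value, so MySQL rejects it
-- with a duplicate-entry error (error code 1062).
INSERT INTO accounts (accountid, email) VALUES (2, 'mike@example.com');
```

Because the second statement is expected to fail, a script that runs these statements in one batch will stop there unless errors are ignored.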


Primary key

A primary key is a unique index that must be declared with the keyword "PRIMARY KEY", and its columns cannot contain NULL values. If you have ever used AUTO_INCREMENT columns, you are probably already familiar with the concept of a primary key. The primary key is usually specified when the table is created, such as "CREATE TABLE tablename ( [...], PRIMARY KEY (list of columns) );". However, we can also add a primary key by modifying the table, such as "ALTER TABLE tablename ADD PRIMARY KEY (list of columns);". Each table can have only one primary key. 

Full-text index

MySQL has supported full-text indexing and full-text search since version 3.23.23. In MySQL, the index type of a full-text index is FULLTEXT. Full-text indexes can be created on columns of type CHAR, VARCHAR, or TEXT, either with the CREATE TABLE command or with the ALTER TABLE or CREATE INDEX commands. For large datasets, loading the data first and then creating the full-text index with ALTER TABLE (or CREATE INDEX) is faster than inserting records into an empty table that already has a full-text index. The rest of this article does not cover full-text indexing; to learn more, see the MySQL documentation. 
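
For orientation only, a full-text index and a matching query might look like the sketch below. The articles table, its columns, and the index name are hypothetical. Note as well that which storage engines accept FULLTEXT depends on the MySQL version (InnoDB gained support in 5.6; earlier versions require MyISAM).

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE articles (
  articleid INT NOT NULL AUTO_INCREMENT,
  title     VARCHAR(200) NOT NULL,
  body      TEXT NOT NULL,
  PRIMARY KEY (articleid),
  FULLTEXT idx_title_body (title, body)
);

-- Full-text search uses MATCH ... AGAINST; the column list must match
-- the column list of the FULLTEXT index.
SELECT articleid, title
FROM articles
WHERE MATCH (title, body) AGAINST ('index');
```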


Single-column index and multi-column index 

The index can be a single-column index or a multi-column index. Below we use specific examples to illustrate the difference between these two indexes. Suppose there is a people table like this: 

 
CREATE TABLE people (
peopleid SMALLINT NOT NULL AUTO_INCREMENT,
firstname CHAR(50) NOT NULL,
lastname CHAR(50) NOT NULL, 
age SMALLINT NOT NULL,
townid SMALLINT NOT NULL,
PRIMARY KEY (peopleid) );
 

Here's the data we inserted into the people table: 

This data fragment contains four people named Mike (two with the last name Sullivan, two with the last name McConnell), two people aged 17, and one person with an unrelated name, Joe Smith. 

The main purpose of this table is to return the corresponding peopleid based on the specified user's last name, first name, and age. For example, we might need to find the peopleid of a user named Mike Sullivan who is 17 years old (the SQL command is SELECT peopleid FROM people WHERE firstname='Mike' AND lastname='Sullivan' AND age=17;). Since we don't want MySQL to scan the entire table every time a query is executed, we need to consider using an index. 

First, we can consider creating an index on a single column, such as the firstname, lastname or age column. If we create an index on the firstname column (ALTER TABLE people ADD INDEX firstname (firstname);), MySQL uses this index to quickly limit the search to the records with firstname='Mike', and then applies the other criteria to this "intermediate result set": it first excludes the records whose lastname is not "Sullivan", and then the records whose age is not 17. Once a record meets all the search conditions, MySQL returns it as part of the final result. 

With the index on the firstname column, MySQL is much more efficient than with a full table scan, but the number of records it has to examine is still far greater than what is actually required. We could drop the index on the firstname column and create one on the lastname or age column instead, but in general the search efficiency is similar regardless of which single column is indexed. 

In order to improve search efficiency, we need to consider the use of multi-column indexes. If you create a multi-column index on the firstname, lastname, and age columns, MySQL will be able to find the correct result in just one search! Here is the SQL command to create this multicolumn index: 

ALTER TABLE people ADD INDEX fname_lname_age (firstname,lastname,age); 

Since the index is stored as a B-tree, MySQL can go straight to the appropriate firstname, then to the appropriate lastname, and finally to the appropriate age. MySQL finds the target record of the search without scanning any other record in the data file. 

So, if we create three single-column indexes on firstname, lastname, and age, is the effect the same as creating one multi-column index on firstname, lastname, and age? The answer is no; the two are completely different. In the MySQL versions this article targets, a query can use only one index per table. (Since MySQL 5.0 the index merge optimization can sometimes combine several indexes, but a suitable multi-column index is generally still more efficient.) If you have three single-column indexes, MySQL tries to choose the most restrictive one. However, even the most restrictive single-column index is far less restrictive than the multi-column index on the firstname, lastname, and age columns. 
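
One way to observe the difference is to compare query plans. The sketch below assumes the people table and the fname_lname_age index created above. The three idx_* index names are hypothetical, and the exact plans depend on the MySQL version and on the data in the table.

```sql
-- Assumes the people table and the fname_lname_age index created above.
-- The three idx_* names are hypothetical.
ALTER TABLE people
  ADD INDEX idx_firstname (firstname),
  ADD INDEX idx_lastname (lastname),
  ADD INDEX idx_age (age);

-- Let the optimizer choose among all available indexes.
EXPLAIN SELECT peopleid FROM people
WHERE firstname = 'Mike' AND lastname = 'Sullivan' AND age = 17;

-- Restrict the optimizer to one single-column index, then to the
-- multi-column index, and compare the key, key_len and rows columns.
EXPLAIN SELECT peopleid FROM people FORCE INDEX (idx_firstname)
WHERE firstname = 'Mike' AND lastname = 'Sullivan' AND age = 17;

EXPLAIN SELECT peopleid FROM people FORCE INDEX (fname_lname_age)
WHERE firstname = 'Mike' AND lastname = 'Sullivan' AND age = 17;

-- Remove the experimental single-column indexes again.
ALTER TABLE people
  DROP INDEX idx_firstname,
  DROP INDEX idx_lastname,
  DROP INDEX idx_age;
```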


Leftmost prefix

Multi-column indexes have another advantage, expressed through a concept called the leftmost prefix. Continuing the previous example, we now have a multi-column index on the firstname, lastname, and age columns, named fname_lname_age. MySQL can use the fname_lname_age index when the search criteria involve any of the following column combinations: 

firstname,lastname,age
firstname,lastname
firstname

In other words, it is equivalent to having indexes on the column combinations (firstname, lastname, age), (firstname, lastname) and (firstname). The following queries can use the fname_lname_age index: 

 
SELECT peopleid FROM people WHERE firstname='Mike' AND lastname='Sullivan' AND age=17; 
SELECT peopleid FROM people WHERE firstname='Mike' AND lastname='Sullivan'; 
SELECT peopleid FROM people WHERE firstname='Mike'; 

The following queries cannot use the index to narrow the search, because their conditions omit the leading firstname column:
SELECT peopleid FROM people WHERE lastname='Sullivan'; 
SELECT peopleid FROM people WHERE age=17; 
SELECT peopleid FROM people WHERE lastname='Sullivan' AND age=17; 
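
The leftmost-prefix behaviour can be checked with EXPLAIN. This sketch assumes the people table and the fname_lname_age index from above; no output is shown because the exact plan depends on the MySQL version, the storage engine, and the data.

```sql
-- Assumes the people table and the fname_lname_age index created above.

-- Conditions start with firstname, the leading column of the index,
-- so the key column of the plan is expected to name fname_lname_age.
EXPLAIN SELECT peopleid FROM people
WHERE firstname = 'Mike' AND lastname = 'Sullivan';

-- Conditions skip firstname, so MySQL cannot seek into the index.
-- Depending on version and engine, the plan may show a full table scan
-- (type ALL) or a scan of the entire index (type index).
EXPLAIN SELECT peopleid FROM people
WHERE lastname = 'Sullivan' AND age = 17;
```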
 

 

Choosing columns to index

Choosing which columns to create indexes on is one of the most important steps in the performance optimization process. There are two main types of columns that can be considered for indexes: columns that appear in the WHERE clause, and columns that appear in the join clause. See the following query: 

SELECT age ## no index needed
FROM people WHERE firstname = 'Mike' ## consider using an index
AND lastname = 'Sullivan'; ## consider using an index

This query is slightly different from the previous one, but it is still a simple query. Since age is referenced only in the SELECT list, MySQL does not use it to restrict which rows are selected. Therefore, for this query, it is not necessary to create an index on the age column. Here is a more complex example: 

SELECT people.age, ## no index needed
town.name ## no index needed
FROM people LEFT JOIN town ON
people.townid = town.townid ## consider using an index
WHERE firstname = 'Mike' ## consider using an index
AND lastname = 'Sullivan'; ## consider using an index


As in the previous example, since firstname and lastname appear in the WHERE clause, these two columns still need to be indexed. In addition to this, since the townid column of the town table is present in the join clause, we need to consider creating an index on this column. 
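
A possible setup for this join is sketched below. The article never defines the town table, so its definition here is an assumption, as is the index name idx_townid. The sketch also assumes the people table and the fname_lname_age index from the previous sections, whose leftmost prefix already covers the firstname and lastname conditions.

```sql
-- Hypothetical definition of the town table; the article does not show one.
CREATE TABLE town (
  townid SMALLINT NOT NULL,
  name   CHAR(50) NOT NULL
);

-- Index the join column on the town side.
ALTER TABLE town ADD INDEX idx_townid (townid);

-- Check how each table in the join is accessed.
EXPLAIN SELECT people.age, town.name
FROM people LEFT JOIN town ON people.townid = town.townid
WHERE firstname = 'Mike' AND lastname = 'Sullivan';
```

In a real schema townid would more likely be the primary key of town, which provides the same lookup ability without a separate index.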

So, can we simply assume that every column that appears in the WHERE clause or the join clause should be indexed? Almost, but not quite. We must also take into account the operator used to compare the column. For B-tree indexes, MySQL can use an index with comparison operators such as <, <=, =, >, >=, BETWEEN, IN, and sometimes LIKE. An index can be used for a LIKE comparison when the pattern does not start with a wildcard (% or _). For example, the query "SELECT peopleid FROM people WHERE firstname LIKE 'Mich%';" can use the index, but the query "SELECT peopleid FROM people WHERE firstname LIKE '%ike';" cannot. 
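
The two LIKE cases can be compared with EXPLAIN. The sketch assumes the people table and an index whose leading column is firstname, such as fname_lname_age from above. Actual plans vary with version and data, so none are reproduced here.

```sql
-- Assumes the people table and an index that starts with firstname.

-- The pattern has a fixed prefix, so MySQL can seek to the range of
-- index entries that begin with 'Mich'.
EXPLAIN SELECT peopleid FROM people WHERE firstname LIKE 'Mich%';

-- The pattern starts with a wildcard, so there is no prefix to seek to
-- and the index cannot narrow the search.
EXPLAIN SELECT peopleid FROM people WHERE firstname LIKE '%ike';
```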

Analyze Indexing Efficiency

Now we know a little about how to choose index columns, but we still cannot tell which index is the most efficient. MySQL provides a built-in SQL command to help with this task: EXPLAIN. The general syntax is EXPLAIN followed by the SQL statement to analyze. You can find more about this command in the MySQL documentation. Below is an example: 

EXPLAIN SELECT peopleid FROM people WHERE firstname='Mike' AND lastname='Sullivan' AND age=17; 

This command will return the following analysis result: 

Let's take a look at the meaning of the EXPLAIN analysis result. 


table: This is the name of the table. 
type: The type of the join operation. Here is what the MySQL documentation says about the ref join type: 

"All rows with matching index values are read from this table for each combination of rows from the previous tables. ref is used if the join uses only a leftmost prefix of the key or if the key is not a PRIMARY KEY or UNIQUE index (in other words, if the join cannot select a single row based on the key value). If the key that is used matches only a few rows, this is a good join type." 

In this case, since the index is not of type UNIQUE, ref is the best join type we can get. 

If EXPLAIN shows that the join type is "ALL", and you don't want to select most of the records from the table, then MySQL will be very inefficient because it scans the entire table. You can add more indexes to solve this problem. See the MySQL manual for more information. 

possible_keys: 
The names of the indexes that MySQL could use for this query. The name shown is the one specified when the index was created; if no name was given, it defaults to the name of the first column in the index (in this case, "firstname"). The meaning of default index names is often not obvious. 

key: 
It shows the name of the index actually used by MySQL. If it is empty (or NULL), MySQL does not use the index. 

key_len: 
The length of the used part of the index, in bytes. In this example, key_len is 102: firstname occupies 50 bytes, lastname 50 bytes, and age 2 bytes. These figures assume a single-byte character set; multi-byte character sets need more bytes per character. If MySQL used only the firstname part of the index, key_len would be 50. 

ref: 
It shows the names of the columns (or the word "const") on which MySQL will select rows. In this example, MySQL selects rows based on three constants. 

rows: 
The number of records that MySQL thinks it must scan before finding the correct result. Obviously, the ideal number here is 1. 

Extra: 
Many different values may appear here, and several of them (for example Using filesort or Using temporary) indicate extra work that slows the query down. In this case, MySQL just reminds us that it will limit the search result set with the WHERE clause. 
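
The key_len value discussed above is plain arithmetic over the indexed columns, which can be reproduced as follows. The character-set figures are assumptions about the table: CHAR(50) takes 50 bytes per column in a single-byte character set and up to 200 bytes in utf8mb4, while a NOT NULL SMALLINT always takes 2 bytes.

```sql
-- key_len is the sum of the bytes of the index parts that were used.
SELECT
  50 * 1 + 50 * 1 + 2 AS key_len_single_byte,    -- firstname + lastname + age = 102
  50 * 4 + 50 * 4 + 2 AS key_len_utf8mb4,        -- same columns in utf8mb4 = 402
  50 * 1              AS key_len_firstname_only; -- only the firstname part = 50
```

A key_len smaller than the full total is therefore a quick sign that MySQL is using only a leftmost prefix of the multi-column index.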

Disadvantages of Indexing 

So far, we've discussed the advantages of indexes. In fact, indexes also have disadvantages. 

First, indexes take up disk space. Usually, this problem is not very prominent. However, if you create an index for every possible combination of columns, the size of the index file will grow much faster than the data file. If you have a very large table, the size of the index file may reach the maximum file limit allowed by the operating system. 

Second, indexes can slow down operations that require writing data, such as DELETE, UPDATE, and INSERT operations. This is because MySQL not only writes changes to the data files, but it also writes those changes to the index files. 
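
To keep an eye on these costs, the indexes on a table and the space they occupy can be inspected. This sketch assumes the people table from above and a MySQL version that provides information_schema (5.0 or later); the reported sizes are in bytes and may be estimates, depending on the storage engine.

```sql
-- List every index on the table, one row per indexed column.
SHOW INDEX FROM people;

-- Compare the space used by the data with the space used by its indexes.
SELECT table_name, data_length, index_length
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name = 'people';
```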

[Conclusion] In large databases, indexing is a key factor in improving speed. No matter how simple the structure of the table is, a table scan of 500,000 rows is not going to be fast by any means. If you have such large-scale tables on your website, you should really spend some time analyzing what indexes can be used, and consider whether the query can be rewritten to optimize the application. To learn more, see the MySQL manual. Also note that this article assumes that you are using MySQL version 3.23, and some queries cannot be executed on MySQL version 3.22.
