MySQL database index

Why use an index?

Without an index, MySQL scans the entire table to find the rows that satisfy the query conditions, so the cost grows with the number of rows in the table. Indexing selected columns of a table can greatly speed up queries. How much an index helps depends on how selective the column is, and more indexes are not always better, because every index has to be updated whenever rows are inserted, updated, or deleted.

For MySQL's InnoDB storage engine, most indexes are stored as B+Trees, a variant of the B-Tree data structure (MEMORY tables also support hash indexes). The B-Tree is commonly used in databases and file systems. It is a balanced N-ary search tree: the keys within each node are kept in order, and for any key in a node, all keys in the subtree to its left are smaller and all keys in the subtree to its right are larger. Implementations also exploit the principle of locality, for example by storing many keys in each node so that a single disk read fetches a whole node. Together these properties ensure that: 1) a lookup needs only a small number of disk reads; 2) an insert or delete affects only a small part of the tree; 3) rebalancing the tree is efficient.

 

Scenarios where MySQL uses indexes

MySQL uses indexes in the following operating scenarios:

1) Quickly find the rows that satisfy a WHERE condition

2) Quickly narrow the candidate set. If several indexes could serve the WHERE condition, MySQL normally prefers the one that matches the fewest rows (the most selective index), so that rows that do not satisfy the conditions are eliminated as early as possible.

3) If a table has a composite (multi-column) index, any leftmost prefix of that index can also be used to speed up lookups.

For example, if a table has a composite index on three columns (c1, c2, c3), lookups on (c1), (c1, c2), and (c1, c2, c3) can use the index. A lookup on (c2, c3) cannot use it, and a lookup on (c1, c3) can use only the c1 part of the index.
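The leftmost-prefix rule can be checked with EXPLAIN. The following is a minimal sketch: the table name t_prefix, the index name idx_c1_c2_c3, and the extra payload column are made up for illustration, and the plan MySQL actually reports depends on the server version, the data, and the table statistics.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE t_prefix (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    c1 INT NOT NULL,
    c2 INT NOT NULL,
    c3 INT NOT NULL,
    payload VARCHAR(50) NOT NULL,
    INDEX idx_c1_c2_c3 (c1, c2, c3)
);

-- Leftmost prefixes: the composite index is a candidate for all three.
EXPLAIN SELECT * FROM t_prefix WHERE c1 = 1;
EXPLAIN SELECT * FROM t_prefix WHERE c1 = 1 AND c2 = 2;
EXPLAIN SELECT * FROM t_prefix WHERE c1 = 1 AND c2 = 2 AND c3 = 3;

-- No condition on c1: the index cannot be used for a direct lookup.
EXPLAIN SELECT * FROM t_prefix WHERE c2 = 2 AND c3 = 3;

-- c2 is skipped: only the c1 part of the index narrows the search.
EXPLAIN SELECT * FROM t_prefix WHERE c1 = 1 AND c3 = 3;
```

In the EXPLAIN output, the key column shows which index was chosen, and key_len shows how many bytes of it were used, which indicates how many leading columns took part in the lookup.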

4) Indexes are used for joins across multiple tables (if the columns that take part in the join are indexed in those tables)

5) If a column is indexed, MySQL can use the index to find the MIN() or MAX() of that column

6) MySQL can use an index when sorting (ORDER BY) or grouping (GROUP BY) on indexed columns

 

Which SQL statements will actually take advantage of indexes

According to the section "Comparison of B-Tree and Hash Indexes" in the official MySQL documentation, the following kinds of SQL may actually use indexes:

1) A B-Tree index can be used for column comparisons in expressions that use the =, >, >=, <, <=, or BETWEEN operators

2) A B-Tree index can also be used for a LIKE comparison if the pattern is a constant string that does not start with a wildcard

For example, SELECT * FROM tbl_name WHERE key_col LIKE 'Patrick%' or SELECT * FROM tbl_name WHERE key_col LIKE 'Pat%_ck%' can take advantage of indexes, while SELECT * FROM tbl_name WHERE key_col LIKE '%Patrick%' (starting with a wildcard) and SELECT * FROM tbl_name WHERE key_col LIKE other_col (like condition is not a constant string) cannot take advantage of indexes.

For conditions of the form LIKE '%string%', if the string between the wildcards is longer than three characters, MySQL uses the Turbo Boyer-Moore algorithm to perform the search.
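The difference between the LIKE patterns can be observed with EXPLAIN. This sketch reuses the names tbl_name, key_col, and other_col from the examples above; the column types and the index name idx_key_col are assumptions made for illustration.

```sql
-- Column definitions and index name are assumed for illustration.
CREATE TABLE tbl_name (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    key_col VARCHAR(50) NOT NULL,
    other_col VARCHAR(50) NOT NULL,
    INDEX idx_key_col (key_col)
);

-- Constant prefix: the optimizer can treat this as a range on idx_key_col.
EXPLAIN SELECT * FROM tbl_name WHERE key_col LIKE 'Patrick%';

-- Leading wildcard: the index cannot narrow the search.
EXPLAIN SELECT * FROM tbl_name WHERE key_col LIKE '%Patrick%';

-- Pattern taken from another column instead of a constant string.
EXPLAIN SELECT * FROM tbl_name WHERE key_col LIKE other_col;
```

On a table with realistic data, the first query would typically show idx_key_col in the key column of the EXPLAIN output, while the other two would not use it for a lookup.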

3) If the column col_name is indexed, a condition of the form "col_name IS NULL" can use the index

4) For a composite index, conditions on a leftmost prefix of its columns can use the index

5) If the WHERE clause contains conditions on several indexed columns, MySQL may apply the Index Merge optimization, which combines scans of several indexes on the same table to narrow the candidate set
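A minimal sketch of a query that is a candidate for Index Merge follows. The table t_merge, its columns, and the index names are hypothetical, and whether the optimizer actually chooses Index Merge depends on the data and the statistics.

```sql
-- Hypothetical table with two independent single-column indexes.
CREATE TABLE t_merge (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    a INT NOT NULL,
    b INT NOT NULL,
    INDEX idx_a (a),
    INDEX idx_b (b)
);

-- Each side of the OR can be resolved by its own index, so the
-- optimizer may scan both indexes and merge the two result sets.
EXPLAIN SELECT * FROM t_merge WHERE a = 10 OR b = 20;
```

When Index Merge is chosen, the type column of the EXPLAIN output shows index_merge and the Extra column names the algorithm used, for example Using union(idx_a,idx_b).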

 

 

Indexes are either single-column or composite. A single-column index covers exactly one column; a table can have several single-column indexes, but together they do not form a composite index. A composite index covers multiple columns.

1) Ordinary index

Create an index:

CREATE INDEX indexName ON tableName(columnName);

Drop index:

DROP INDEX indexName ON tableName;

 

2) Unique index

Similar to an ordinary index, except that the values in the indexed column must be unique; NULL values are still allowed. For a composite index, the combination of column values must be unique.

Create an index:

CREATE UNIQUE INDEX indexName ON tableName(columnName);

 

3) Primary key index

It is a special unique index that does not allow NULL values. The primary key index is usually created together with the table.
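As a brief sketch, a primary key can be declared in the CREATE TABLE statement or added afterwards. The table and column names below are made up for illustration.

```sql
-- Primary key declared together with the table.
CREATE TABLE pk_demo (
    id INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(50) NOT NULL,
    PRIMARY KEY (id)
);

-- Primary key added to an existing table that does not have one yet.
CREATE TABLE pk_demo_later (
    id INT NOT NULL,
    name VARCHAR(50) NOT NULL
);
ALTER TABLE pk_demo_later ADD PRIMARY KEY (id);
```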

 

4) Composite index

Here I will focus on the difference between single-column and composite indexes and on when each should be used.

1. First, determine the optimization goal: the business scenario, the size of the table, and so on. If the table is small, an index may not be needed at all.

2. Decide which columns to index. These are generally the columns that appear in WHERE, ORDER BY, or GROUP BY clauses.

3. An index has to be maintained whenever rows are modified, which adds overhead, so weigh the gains against the costs before building it.

 

Take a students table as an example. Names are rarely repeated, while ages are repeated often. A single-column index works best on a column with few repeated values.

For SELECT * FROM students WHERE name='Zhang San' AND age=18; there are two cases:
A. name and age each have their own index.
Generally MySQL will choose one of the two indexes, most likely the one on name, because MySQL keeps statistics on how often values repeat in each index and prefers the column with fewer repeats. The index on age then goes unused for this query but still has to be maintained, so it does not need to be created.

B. A composite index on name and age.
This index fits the query best, and MySQL will choose it directly. Compared with a single index on name, however, it costs more to maintain and its index data takes up more storage space.
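The two cases can be set up as follows. This is only a sketch: the column types and the index names idx_name, idx_age, and idx_name_age are assumptions, and the plan reported by EXPLAIN depends on the actual data.

```sql
-- Assumed definition of the students table.
CREATE TABLE students (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    age INT NOT NULL
);

-- Case A: one single-column index per column.
CREATE INDEX idx_name ON students (name);
CREATE INDEX idx_age ON students (age);
EXPLAIN SELECT * FROM students WHERE name = 'Zhang San' AND age = 18;

-- Case B: replace them with one composite index.
DROP INDEX idx_name ON students;
DROP INDEX idx_age ON students;
CREATE INDEX idx_name_age ON students (name, age);
EXPLAIN SELECT * FROM students WHERE name = 'Zhang San' AND age = 18;
```

Comparing the key and rows columns of the two EXPLAIN results on a populated table shows which index was picked in each case and how many rows MySQL expects to examine.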

 

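So, all things considered, is a composite index necessary here? In my view it is not, because few people in a school share the same name. The index on name alone finds the record fairly precisely, and even when there are duplicates there will be only a few.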

 

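When is a composite index the better choice?
Take an example: a university lets students choose their course teachers, so you need a relation table with two columns, student_id and teacher_id, and you want to check whether a given teacher and a given student have a teacher-student relationship.
A student chooses a few dozen teachers, and a teacher has a few hundred students.
If only student_id is indexed, the index narrows the search to a few dozen rows, and the WHERE condition is then applied in memory to discard the other teachers.
Conversely, if only teacher_id is indexed, the index narrows the search to a few hundred rows, and the WHERE condition is then applied in memory to discard the other students.
Neither case is optimal. A composite index is the best fit here, because the index leads directly to the matching row.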
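The relation table described above might look like the following sketch. The table name student_teacher, the index name, and the sample ids are hypothetical.

```sql
-- Hypothetical relation table with a composite index on both columns.
CREATE TABLE student_teacher (
    student_id INT NOT NULL,
    teacher_id INT NOT NULL,
    INDEX idx_student_teacher (student_id, teacher_id)
);

-- Does student 1001 study under teacher 42? (made-up sample ids)
SELECT COUNT(*) AS pair_exists
FROM student_teacher
WHERE student_id = 1001 AND teacher_id = 42;
```

Because both columns in the WHERE clause belong to the same index, the lookup goes straight to the matching index entry instead of filtering dozens or hundreds of rows afterwards.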

 

Let's analyze one more example:

 

CREATE TABLE myIndex (
testID INT NOT NULL AUTO_INCREMENT PRIMARY KEY, 
name VARCHAR(50) NOT NULL, 
city VARCHAR(50) NOT NULL, 
age INT NOT NULL, 
schoolID INT NOT NULL
);
 

 

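Insert 10,000 rows, six of which have name = 'jack' but with different combinations of city, age, and schoolID.

Consider this query: SELECT testID FROM myIndex WHERE name='jack' AND city='Shanghai' AND age=28;

First consider a single-column index:

Suppose an index is built on the name column. When the query runs, MySQL quickly narrows the target to the 6 rows with name='jack' and puts them into an intermediate result set. From this set it first discards the rows whose city is not 'Shanghai', then the rows whose age is not 28, and finally arrives at the single row that satisfies all the conditions.

With the index on name, MySQL does not have to scan the whole table, so efficiency improves, but it still falls short of what we want. A single-column index on city alone or on age alone would perform similarly.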

 

 

To squeeze more performance out of MySQL, consider a composite index, that is, put name, city, and age into a single index:

ALTER TABLE myIndex ADD INDEX name_city_age (name(10), city, age);

 

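The name column was declared with length 50, so why use 10 here? Because names are generally no longer than 10 characters, indexing only a 10-character prefix speeds up index lookups, reduces the size of the index file, and speeds up INSERT operations.

When the query runs, MySQL can locate the single matching row directly through the index, without scanning and filtering other rows.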

 

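If name, city, and age each had their own single-column index, so that the table had three single-column indexes, would the query be as efficient as with the composite index above? Not at all; it would be far less efficient. Even with three indexes available, MySQL will generally use only the one single-column index that it judges to be the most efficient.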

 

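Creating this composite index is effectively equivalent to creating three indexes: (name, city, age), (name, city), and (name). This follows from MySQL's "leftmost prefix" rule for composite indexes. Put simply, columns are combined only starting from the leftmost one. Not every query that involves these three columns will use the composite index.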
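The following sketch repeats the table and index definitions from this example so that it can be run on an empty database, then checks several column combinations with EXPLAIN. The actual plans depend on the data loaded and the MySQL version.

```sql
-- Table and index as defined in this example.
CREATE TABLE myIndex (
    testID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    city VARCHAR(50) NOT NULL,
    age INT NOT NULL,
    schoolID INT NOT NULL
);
ALTER TABLE myIndex ADD INDEX name_city_age (name(10), city, age);

-- These conditions start from name, so the index is a candidate.
EXPLAIN SELECT testID FROM myIndex
WHERE name = 'jack' AND city = 'Shanghai' AND age = 28;
EXPLAIN SELECT testID FROM myIndex
WHERE name = 'jack' AND city = 'Shanghai';
EXPLAIN SELECT testID FROM myIndex WHERE name = 'jack';

-- These conditions do not start from name, so the index cannot be
-- used for a direct lookup.
EXPLAIN SELECT testID FROM myIndex
WHERE city = 'Shanghai' AND age = 28;
EXPLAIN SELECT testID FROM myIndex WHERE age = 28;
```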

 

http://blog.csdn.net/weiwangchao_/article/details/50256673

 
