MySQL Index--BNL/ICP/MRR/BKA

MySQL join query algorithms:

BNL(Block Nested-Loop)
ICP(Index Condition Pushdown)
MRR(Multi-Range Read)
BKA(Batched Key Access)

 

BNL (Block Nested-Loop)
Scenario:
Assume TB1 is joined with TB2, with TB1 as the outer table. For each row of TB1, TB2 is scanned for matching rows. If no index on TB2 can be used, every lookup has to scan the whole of TB2, so the number of rows in the outer table TB1 determines how many times the inner table TB2 is scanned.

Optimization:
Split the rows of the outer table TB1 into N blocks of M rows each, and scan TB2 once per block, N times in total. Each row read from TB2 is matched against all rows of the current block, so the number of scans of TB2 drops from M * N to N.
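
The reduction in scan count can be illustrated with a small Python sketch. This is only a toy model, not MySQL's implementation: the tables are plain lists, `block_size` stands in for the number of outer rows that fit in the join buffer, and all names and data are hypothetical.

```python
def nested_loop_join(outer, inner, match):
    """Plain nested loop: one full scan of `inner` per outer row."""
    scans, result = 0, []
    for o in outer:
        scans += 1  # one full pass over the inner table
        for i in inner:
            if match(o, i):
                result.append((o, i))
    return result, scans


def block_nested_loop_join(outer, inner, match, block_size):
    """Block nested loop: one full scan of `inner` per block of outer rows."""
    scans, result = 0, []
    for start in range(0, len(outer), block_size):
        block = outer[start:start + block_size]  # stands in for the join buffer
        scans += 1  # one full pass over the inner table per block
        for i in inner:
            for o in block:
                if match(o, i):
                    result.append((o, i))
    return result, scans


# Hypothetical data: 12 outer rows (N = 3 blocks of M = 4 rows), 5 inner rows.
tb1 = list(range(12))
tb2 = [0, 3, 3, 7, 20]
same = lambda o, i: o == i

plain_rows, plain_scans = nested_loop_join(tb1, tb2, same)
block_rows, block_scans = block_nested_loop_join(tb1, tb2, same, block_size=4)

print(plain_scans, block_scans)  # 12 3
```

Both functions return the same matches (in a different order); only the number of passes over the inner table changes.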

Key points:
1. The inner table has no usable index.
2. The join order of the tables cannot be changed, for example in a LEFT JOIN.

This algorithm is already available in MySQL 5.1.

 

ICP(Index Condition Pushdown)

Scenario:
Assume table TB1 has an index IDX_C1_C2_C3 (C1, C2, C3) and the query is SELECT * FROM TB1 WHERE C1 = 'XXX' AND C3 = 'XXX'.

Before MySQL 5.6, because the query has no filter on C2, the InnoDB storage engine layer can use IDX_C1_C2_C3 only with the condition C1 = 'XXX' to find all matching index records. It then looks up each of those index records in the clustered index and returns the full rows to the MySQL Server layer, which applies the filter C3 = 'XXX' to produce the final result.


MySQL 5.6 introduced ICP. The InnoDB storage engine layer still scans IDX_C1_C2_C3 using only C1 = 'XXX', but it now filters the matching index records by C3 = 'XXX' before going to the clustered index. Only the index records that pass this filter are looked up in the clustered index, and only those rows are returned to the MySQL Server layer.

Assume 100,000 rows satisfy C1 = 'XXX' and 100 rows satisfy C1 = 'XXX' AND C3 = 'XXX'. Then:
1. In MySQL 5.5, 100,000 Index Seek operations are needed on the clustered index of TB1, and the InnoDB storage engine layer passes 100,000 rows to the MySQL Server layer.
2. In MySQL 5.6 with ICP, only 100 Index Seek operations are needed on the clustered index of TB1, and the InnoDB storage engine layer passes 100 rows to the MySQL Server layer.
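
The 100,000 versus 100 comparison can be reproduced with a toy Python model. It is a sketch under simplifying assumptions rather than InnoDB's actual code path: the clustered index is a dict keyed by a hypothetical primary key `id`, the secondary index is a sorted list of (C1, C2, C3, id) tuples, and the function name `query` is made up for illustration.

```python
# Hypothetical table: 100,000 rows match C1 = 'XXX'; 100 of them also have C3 = 'XXX'.
rows = {
    pk: {"id": pk, "c1": "XXX", "c2": pk % 50, "c3": "XXX" if pk < 100 else "YYY"}
    for pk in range(100_000)
}

# Secondary index IDX_C1_C2_C3 modelled as sorted (C1, C2, C3, primary key) entries.
index = sorted((r["c1"], r["c2"], r["c3"], pk) for pk, r in rows.items())


def query(use_icp):
    """Return (result rows, clustered-index lookups, rows passed to the server layer)."""
    clustered_lookups = 0
    rows_to_server = 0
    result = 0
    for c1, c2, c3, pk in index:
        if c1 != "XXX":
            continue  # the only condition usable for the index scan itself
        if use_icp and c3 != "XXX":
            continue  # with ICP, the engine filters on the index entry
        row = rows[pk]  # lookup in the clustered index
        clustered_lookups += 1
        rows_to_server += 1  # row handed over to the server layer
        if row["c3"] == "XXX":  # server-layer filter (always true with ICP)
            result += 1
    return result, clustered_lookups, rows_to_server


no_icp = query(use_icp=False)
with_icp = query(use_icp=True)
print(no_icp)    # (100, 100000, 100000)
print(with_icp)  # (100, 100, 100)
```

The final result is the same in both cases; ICP only changes how much work is done before the rows reach the server layer.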

ICP pushes filter conditions from the MySQL Server layer down into the storage engine layer, which:
1. reduces the number of lookups on the clustered index;
2. reduces the amount of data returned from the storage engine layer to the MySQL Server layer;
3. reduces the number of times the MySQL Server layer calls into the storage engine layer.

PS1: ICP applies only to non-clustered (secondary) indexes.
PS2: MySQL 5.6 supports ICP only on ordinary tables; MySQL 5.7 adds support for ICP on partitioned tables.

 

MRR (Multi-Range Read)
Scenario:
Assume table TB1 has an index IDX_C1 (C1) and the query is SELECT * FROM TB1 WHERE C1 IN ('XXX1', 'XXX2', ..., 'XXXN').

Before MySQL 5.6, the server first looks up C1 = 'XXX1' in IDX_C1 and, for each index record found, fetches the corresponding row from the clustered index of TB1. It then repeats this for C1 = 'XXX2' through C1 = 'XXXN' and combines the per-value results into the final result set. Because the index records found for the different C1 values carry clustered keys in effectively random order, the Index Seek operations on the clustered index cause a lot of random IO, which hurts the server's storage performance.

MySQL 5.6 introduced MRR. The server first looks up the conditions C1 = 'XXX1' through C1 = 'XXXM' in IDX_C1 and places the matching index records in a buffer. When the buffer is full, the records in it are sorted by clustered key, and the clustered index is then read in that sorted order. Sorting turns what would have been random lookups into sequential ones, converting part of the random IO into sequential IO, which improves query performance and reduces the IO load the query places on the server.
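
The effect of buffering and sorting can be sketched in Python. This is a toy model and not how the server is implemented: the index contents are invented, `buffer_rows` is a hypothetical stand-in for the buffer capacity, and `seek_distance` is only a crude proxy for random IO (the total distance travelled between consecutive clustered keys).

```python
from itertools import islice

# Hypothetical secondary index IDX_C1: C1 value -> clustered keys (ID) of matching rows.
idx_c1 = {
    "XXX1": [5, 90, 300],
    "XXX2": [2, 88, 301],
    "XXX3": [7, 91, 299],
}


def pk_order_without_mrr(values):
    """Clustered keys in the order they are probed when each value is handled in turn."""
    return [pk for v in values for pk in idx_c1[v]]


def pk_order_with_mrr(values, buffer_rows):
    """Fill a buffer with clustered keys, sort each buffer load, then probe in that order."""
    pks = iter(pk_order_without_mrr(values))
    order = []
    while True:
        batch = sorted(islice(pks, buffer_rows))  # one buffer load, sorted by clustered key
        if not batch:
            return order
        order.extend(batch)


def seek_distance(order):
    """Crude proxy for random IO: total jump distance between consecutive probes."""
    return sum(abs(b - a) for a, b in zip(order, order[1:]))


values = ["XXX1", "XXX2", "XXX3"]
plain = pk_order_without_mrr(values)
sorted_order = pk_order_with_mrr(values, buffer_rows=9)

print(seek_distance(plain))         # 1478
print(seek_distance(sorted_order))  # 299
```

With a smaller `buffer_rows`, only part of the access pattern becomes sequential, which matches the point that MRR converts some, not all, of the random IO.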

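PS1: MRR also applies only to non-clustered indexes, and only when the result set obtained from the non-clustered index is in random order with respect to the clustered key.
PS2: Assume the clustered index of TB1 above is on ID. Then IDX_C1 (C1) is equivalent to IDX_C1 (C1, ID). If the query is a single equality lookup on the non-clustered index, the result set is already ordered by the clustered key and MRR is not needed.
PS3: The size of the buffer used by MRR is determined by the read_rnd_buffer_size parameter.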

 

BKA(Batched Key Access)
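Scenario:
Assume TB1 is joined with TB2, with TB1 as the outer table driving the lookups into TB2, and TB2 has a usable index.

Before MySQL 5.6, the only option is to loop over the rows of TB1 and perform an index lookup on TB2 for each one. If the rows of TB1 are not ordered, the index lookups on TB2 are random as well, producing a large amount of random IO.
In MySQL 5.6, building on MRR, the rows of TB1 are first placed in a buffer. When the buffer is full, its contents are sorted by the join key, and the index lookups on TB2 are then performed in that order, converting part of the random IO into sequential IO.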
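
The batching step can be sketched in Python as follows. This is a simplified illustration of the description above, not the server's implementation: the TB2 index is a plain dict, `buffer_rows` is a hypothetical stand-in for the join buffer capacity, and the data is invented.

```python
# Hypothetical inner table TB2 with an index on the join key: key -> row.
tb2_index = {k: f"tb2-row-{k}" for k in range(1, 11)}

# Hypothetical outer table TB1: (row id, join key), keys in no particular order.
tb1 = [(1, 9), (2, 2), (3, 7), (4, 2), (5, 10), (6, 1)]


def join_row_by_row(outer):
    """Probe the TB2 index once per outer row, in outer-row order."""
    probes, result = [], []
    for rid, key in outer:
        probes.append(key)
        if key in tb2_index:
            result.append((rid, tb2_index[key]))
    return result, probes


def join_batched(outer, buffer_rows):
    """Buffer outer rows, sort each buffer load by join key, then probe in key order."""
    probes, result = [], []
    for start in range(0, len(outer), buffer_rows):
        batch = sorted(outer[start:start + buffer_rows], key=lambda r: r[1])
        for rid, key in batch:
            probes.append(key)
            if key in tb2_index:
                result.append((rid, tb2_index[key]))
    return result, probes


plain_result, plain_probes = join_row_by_row(tb1)
batched_result, batched_probes = join_batched(tb1, buffer_rows=6)

print(plain_probes)    # [9, 2, 7, 2, 10, 1]
print(batched_probes)  # [1, 2, 2, 7, 9, 10]
```

Both variants produce the same joined rows; the batched variant simply visits the TB2 index in key order within each buffer load.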

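PS1: BKA depends on MRR, so MRR must be enabled in order to use BKA. However, when mrr_cost_based is on, the cost-based estimate does not guarantee that MRR is actually used, so the official recommendation is to turn mrr_cost_based off.
PS2: The size of the buffer used by BKA is determined by the parameter join buffer siz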

Enable MRR and BKA and disable mrr_cost_based:
SET optimizer_switch='mrr=on,mrr_cost_based=off,batched_key_access=on';

 

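Differences between BKA and BNL:
1. Inner table index: BKA requires a usable index on the inner table, whereas BNL is a fallback optimization for when the inner table has no usable index.
2. Purpose: BKA aims to reduce random Index Seek operations on the inner table and lower random IO, whereas BNL aims to reduce the number of scans of the inner table and the IO cost of those scans.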

Origin www.cnblogs.com/gaogao67/p/11111585.html