This chapter covers the following parts of the TiDB source code:
1. How a table scan operator is converted into an index scan operator;
2. How the filter conditions of the Selection operator are simplified and converted into scan ranges.
Suppose we have the following table:
t1( id int primary key not null auto_increment, a int, b int, c varchar(256), index(a) );
Here, id is the primary key and column a has an index.
We execute the following SQL:
select a from t1 where a=5 or ( a>5 and (a>6 and a <8) and a<12);
The final execution plan for this SQL is:
+---------------+--------+-----------+-----------------------------------------------------------------------+
| id            | count  | task      | operator info                                                         |
+---------------+--------+-----------+-----------------------------------------------------------------------+
| IndexReader_6 | 260.00 | root      | index:IndexScan_5                                                     |
| └─IndexScan_5 | 260.00 | cop[tikv] | table:t1, index:a, range:[5,5], (6,8), keep order:false, stats:pseudo |
+---------------+--------+-----------+-----------------------------------------------------------------------+
This is an index scan plan; the scanned ranges are [5,5] and (6,8).
Let's go through the source code to see how such a plan is generated.
This is the initial execution plan produced after the SQL is parsed:
After the logicalOptimize function performs logical optimization, the execution plan looks like this:
Where did the Selection operator go?
The Selection operator has been pushed down into the DataSource operator, and the pushed-down filter conditions are stored in the DataSource's pushedDownConds, as follows:
A recursive filter expression tree like this is hard to use for an index scan, because the underlying index is ordered. The tree therefore has to be converted into scan ranges.
During physical optimization, DetachCondAndBuildRangeForIndex is called to generate the scan ranges. This function recursively calls the following two functions:
detachDNFCondAndBuildRangeForIndex: expands a disjunctive normal form (DNF) expression, that is, conditions joined by OR, and generates the union of their scan ranges;
detachCNFCondAndBuildRangeForIndex: expands a conjunctive normal form (CNF) expression, that is, conditions joined by AND, and generates the intersection of their scan ranges.
For the expression tree above, the ranges finally generated are [5,5], (6,8). Here "[" and "]" mark a closed (inclusive) bound, "(" and ")" mark an open (exclusive) bound, and the nested conditions have been flattened away.
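The conversion from a nested condition tree to flat ranges can be illustrated with a minimal Go sketch. It is not TiDB's implementation: the Range and Expr types and the build, intersect, and union functions are hypothetical names invented for this example, and only integer comparisons on a single column are handled. OR nodes take the union of their children's ranges, and AND nodes take the intersection.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Range is a hypothetical interval on column a; the Excl flags mark open bounds.
type Range struct {
	Low, High         int64
	LowExcl, HighExcl bool
}

// String renders a range using "[" / "]" for closed and "(" / ")" for open bounds.
func (r Range) String() string {
	lo, hi := map[bool]string{true: "(", false: "["}, map[bool]string{true: ")", false: "]"}
	return fmt.Sprintf("%s%d,%d%s", lo[r.LowExcl], r.Low, r.High, hi[r.HighExcl])
}

// Expr is a toy condition tree: a comparison on column a, or an "and"/"or" node.
type Expr struct {
	Op   string // "=", ">", "<", "and", "or"
	Val  int64  // operand of a comparison
	Args []Expr // children of "and" / "or"
}

// intersect handles AND: keep only the parts covered by both range lists.
func intersect(xs, ys []Range) []Range {
	var out []Range
	for _, x := range xs {
		for _, y := range ys {
			r := x
			if y.Low > r.Low || (y.Low == r.Low && y.LowExcl) {
				r.Low, r.LowExcl = y.Low, y.LowExcl
			}
			if y.High < r.High || (y.High == r.High && y.HighExcl) {
				r.High, r.HighExcl = y.High, y.HighExcl
			}
			if r.Low < r.High || (r.Low == r.High && !r.LowExcl && !r.HighExcl) {
				out = append(out, r) // non-empty overlap
			}
		}
	}
	return union(out)
}

// union handles OR: sort by lower bound, then merge overlapping or touching ranges.
func union(rs []Range) []Range {
	sort.Slice(rs, func(i, j int) bool {
		return rs[i].Low < rs[j].Low || (rs[i].Low == rs[j].Low && !rs[i].LowExcl && rs[j].LowExcl)
	})
	var out []Range
	for _, r := range rs {
		if n := len(out); n > 0 {
			last := &out[n-1]
			if r.Low < last.High || (r.Low == last.High && !(r.LowExcl && last.HighExcl)) {
				if r.High > last.High || (r.High == last.High && !r.HighExcl) {
					last.High, last.HighExcl = r.High, r.HighExcl
				}
				continue
			}
		}
		out = append(out, r)
	}
	return out
}

// build converts a condition tree into a sorted list of disjoint ranges.
func build(e Expr) []Range {
	all := Range{Low: math.MinInt64, High: math.MaxInt64}
	switch e.Op {
	case "=":
		return []Range{{Low: e.Val, High: e.Val}}
	case ">":
		return []Range{{Low: e.Val, LowExcl: true, High: all.High}}
	case "<":
		return []Range{{Low: all.Low, High: e.Val, HighExcl: true}}
	case "and":
		out := []Range{all}
		for _, a := range e.Args {
			out = intersect(out, build(a))
		}
		return out
	}
	var out []Range // "or"
	for _, a := range e.Args {
		out = append(out, build(a)...)
	}
	return union(out)
}

func main() {
	// a = 5 or (a > 5 and (a > 6 and a < 8) and a < 12)
	inner := Expr{Op: "and", Args: []Expr{{Op: ">", Val: 6}, {Op: "<", Val: 8}}}
	outer := Expr{Op: "and", Args: []Expr{{Op: ">", Val: 5}, inner, {Op: "<", Val: 12}}}
	where := Expr{Op: "or", Args: []Expr{{Op: "=", Val: 5}, outer}}
	fmt.Println(build(where)) // prints [[5,5] (6,8)]
}
```

The redundant conditions a > 5 and a < 12 disappear in the result because intersecting with (6,8) already implies them, which is why the plan shows only two ranges.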
Next, the index scan is added as a candidate to the DataSource's list of access paths.
The possible ways of accessing the table are stored in the DataSource's possibleAccessPaths. In this case there are two options:
1. A full table scan, filtered by the expression tree: a = 5 or (a > 5 and (a > 6 and a < 8) and a < 12);
2. An index scan on column a over the scan ranges [5,5], (6,8).
In the physical optimization phase, findBestTask is called recursively on every operator, starting from the root of the operator tree. The DataSource operator picks the best execution plan from possibleAccessPaths.
The skyline pruning algorithm is used here: it compares the candidate paths along multiple dimensions to decide which is better. In the end the DataSource operator is replaced with an index scan operator.
The execution plan finally generated is:
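The idea behind skyline pruning can be shown with a small Go sketch: a candidate is dropped when another candidate is at least as good in every dimension and strictly better in at least one. The candidate type, its three dimensions, and the values assigned to the two paths are assumptions made for illustration; they do not reproduce TiDB's actual data structures or comparison rules.

```go
package main

import "fmt"

// candidate is a hypothetical summary of one access path.
type candidate struct {
	name       string
	accessCols int  // columns constrained by range conditions (more is better)
	singleScan bool // true if no extra table lookup is needed
	matchOrder bool // true if rows already come back in the required order
}

// b2i maps a boolean dimension to a comparable score.
func b2i(b bool) int {
	if b {
		return 1
	}
	return 0
}

// dominates reports whether x is at least as good as y in every dimension
// and strictly better in at least one.
func dominates(x, y candidate) bool {
	xs := []int{x.accessCols, b2i(x.singleScan), b2i(x.matchOrder)}
	ys := []int{y.accessCols, b2i(y.singleScan), b2i(y.matchOrder)}
	better := false
	for i := range xs {
		if xs[i] < ys[i] {
			return false
		}
		if xs[i] > ys[i] {
			better = true
		}
	}
	return better
}

// skyline keeps only the candidates that no other candidate dominates.
func skyline(cands []candidate) []candidate {
	var out []candidate
	for i, c := range cands {
		dominated := false
		for j, other := range cands {
			if i != j && dominates(other, c) {
				dominated = true
				break
			}
		}
		if !dominated {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	// Assumed values: the query reads only column a, so the index on a
	// needs no table lookup, and its ranges constrain one column.
	paths := []candidate{
		{name: "table scan", accessCols: 0, singleScan: true},
		{name: "index scan on a", accessCols: 1, singleScan: true},
	}
	for _, c := range skyline(paths) {
		fmt.Println(c.name) // prints: index scan on a
	}
}
```

When two candidates each win in a different dimension, neither dominates the other and both survive; a later comparison, such as cost estimation, then has to choose between them.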
End;