A brief look at TiDB: converting a table scan operator into an index scan operator, conjunctive normal form (CNF), disjunctive normal form (DNF), and skyline pruning

This article covers the following parts of the TiDB source code:

  1. How a table scan operator is converted into an index scan operator;

  2. How the filter in a Selection operator is simplified and turned into scan ranges;

Suppose we have a table:

create table t1(
  id int primary key not null auto_increment,
  a int,
  b int,
  c varchar(256),
  index(a)
);

Here, id is the primary key and column a has an index.

We execute the following SQL:

select a from t1 where a=5 or ( a>5 and (a>6 and a <8)  and a<12);

The final execution plan for this SQL is:

+---------------+--------+-----------+-----------------------------------------------------------------------+
| id            | count  | task      | operator info                                                         |
+---------------+--------+-----------+-----------------------------------------------------------------------+
| IndexReader_6 | 260.00 | root      | index:IndexScan_5                                                     |
| └─IndexScan_5 | 260.00 | cop[tikv] | table:t1, index:a, range:[5,5], (6,8), keep order:false, stats:pseudo |
+---------------+--------+-----------+-----------------------------------------------------------------------+  

This is an index scan plan, and the index scan ranges are [5,5] and (6,8).

Let's go into the source code to see how such a plan is generated.

This is the initial execution plan produced after the SQL is parsed:

After the logicalOptimize function performs logical optimization, the execution plan looks like this:

Where did the Selection operator go?

The Selection operator has been pushed down into the DataSource operator. The pushed-down filter conditions are saved in the DataSource's pushedDownConds, like this:

A recursive filter tree like this is hard for an index scan to use, because the underlying index is stored in order. The tree therefore has to be converted into scan ranges.
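To make the shape of that filter tree concrete, here is a minimal Go sketch. The types Expr, Cmp, And and Or are hypothetical stand-ins invented for this illustration, not TiDB's real expression types. The sketch evaluates the where clause of the example SQL row by row, which is roughly what a full table scan with a filter has to do.

```go
package main

import "fmt"

// Expr is a hypothetical, simplified filter expression on column a.
// It is NOT TiDB's expression.Expression; it only mirrors the tree shape.
type Expr interface {
	Eval(a int) bool
}

// Cmp is a leaf comparison such as a = 5 or a > 6.
type Cmp struct {
	Op  string // one of "=", ">", "<"
	Val int
}

func (c Cmp) Eval(a int) bool {
	switch c.Op {
	case "=":
		return a == c.Val
	case ">":
		return a > c.Val
	case "<":
		return a < c.Val
	}
	return false
}

// And is true when every child is true.
type And []Expr

func (n And) Eval(a int) bool {
	for _, e := range n {
		if !e.Eval(a) {
			return false
		}
	}
	return true
}

// Or is true when at least one child is true.
type Or []Expr

func (n Or) Eval(a int) bool {
	for _, e := range n {
		if e.Eval(a) {
			return true
		}
	}
	return false
}

// Filter builds the tree for: a=5 or (a>5 and (a>6 and a<8) and a<12).
func Filter() Expr {
	return Or{
		Cmp{"=", 5},
		And{
			Cmp{">", 5},
			And{Cmp{">", 6}, Cmp{"<", 8}},
			Cmp{"<", 12},
		},
	}
}

// Matches returns the values in [lo, hi] that pass the filter.
func Matches(f Expr, lo, hi int) []int {
	var out []int
	for a := lo; a <= hi; a++ {
		if f.Eval(a) {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	fmt.Println(Matches(Filter(), 0, 15)) // prints [5 7]
}
```

Every candidate value has to walk the whole tree, so the tree alone does not tell the storage layer where in the ordered index to start or stop reading.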

During physical optimization, DetachCondAndBuildRangeForIndex is called to generate the scan ranges. It recursively calls the following two functions:

detachDNFCondAndBuildRangeForIndex: expands a disjunctive normal form (DNF) expression and builds scan ranges by taking the union of the ranges of its OR branches;
detachCNFCondAndBuildRangeForIndex: expands a conjunctive normal form (CNF) expression and builds scan ranges by intersecting the ranges of its AND branches;

For the expression tree above, the resulting ranges are [5,5], (6,8). Here "[" marks a closed (inclusive) bound and "(" an open (exclusive) bound; the recursive structure has been eliminated.
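The range-building idea can be sketched in a few lines of Go. This is only an illustration under simplifying assumptions (a single int column, no NULLs, no type conversion). Range, Eq, Gt, Lt, And and Or are hypothetical helpers made up for this sketch and are not the functions in TiDB's ranger package. And intersects the ranges of its children and Or merges them, so nested calls play the role of the recursion.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Range is a hypothetical, simplified scan range on an int column.
// math.MinInt64 and math.MaxInt64 stand in for -inf and +inf.
type Range struct {
	Low, High         int64
	LowExcl, HighExcl bool
}

func (r Range) String() string {
	l, h := "[", "]"
	if r.LowExcl {
		l = "("
	}
	if r.HighExcl {
		h = ")"
	}
	return fmt.Sprintf("%s%d,%d%s", l, r.Low, r.High, h)
}

// Leaf conditions each produce a single range.
func Eq(v int64) []Range { return []Range{{Low: v, High: v}} }
func Gt(v int64) []Range { return []Range{{Low: v, High: math.MaxInt64, LowExcl: true}} }
func Lt(v int64) []Range { return []Range{{Low: math.MinInt64, High: v, HighExcl: true}} }

// intersect narrows x by y and reports whether the result is non-empty.
func intersect(x, y Range) (Range, bool) {
	if y.Low > x.Low || (y.Low == x.Low && y.LowExcl) {
		x.Low, x.LowExcl = y.Low, y.LowExcl
	}
	if y.High < x.High || (y.High == x.High && y.HighExcl) {
		x.High, x.HighExcl = y.High, y.HighExcl
	}
	return x, x.Low < x.High || (x.Low == x.High && !x.LowExcl && !x.HighExcl)
}

// And intersects the range lists of its children (CNF-style node).
func And(children ...[]Range) []Range {
	acc := []Range{{Low: math.MinInt64, High: math.MaxInt64}}
	for _, ys := range children {
		var next []Range
		for _, x := range acc {
			for _, y := range ys {
				if r, ok := intersect(x, y); ok {
					next = append(next, r)
				}
			}
		}
		acc = next
	}
	return acc
}

// Or unions the range lists of its children (DNF-style node):
// sort by lower bound, then merge ranges that overlap or touch.
func Or(children ...[]Range) []Range {
	var all []Range
	for _, c := range children {
		all = append(all, c...)
	}
	sort.Slice(all, func(i, j int) bool {
		return all[i].Low < all[j].Low ||
			(all[i].Low == all[j].Low && !all[i].LowExcl && all[j].LowExcl)
	})
	var out []Range
	for _, r := range all {
		n := len(out)
		if n == 0 || r.Low > out[n-1].High ||
			(r.Low == out[n-1].High && r.LowExcl && out[n-1].HighExcl) {
			out = append(out, r) // gap before r: start a new range
			continue
		}
		if r.High > out[n-1].High || (r.High == out[n-1].High && !r.HighExcl) {
			out[n-1].High, out[n-1].HighExcl = r.High, r.HighExcl
		}
	}
	return out
}

// Filter is: a=5 or (a>5 and (a>6 and a<8) and a<12).
func Filter() []Range {
	return Or(Eq(5), And(Gt(5), And(Gt(6), Lt(8)), Lt(12)))
}

func main() {
	fmt.Println(Filter()) // prints [[5,5] (6,8)]
}
```

The result is a flat, sorted list of ranges with no nesting left, which is the form an ordered index can be scanned with.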

Next, the index scan is added as a candidate to the DataSource's list of access paths.

The possible ways to access the table are saved in the DataSource's possibleAccessPaths. Here there are two options:

  1. A full table scan, with the filter expression tree: a = 5 or (a > 5 and (a > 6 and a < 8) and a < 12);

  2. A scan of the index on column a, over the ranges [5,5], (6,8);

 

In the physical optimization phase, findBestTask is called recursively on each operator, starting from the root of the operator tree. The DataSource operator picks its best execution plan from possibleAccessPaths.

 

Skyline pruning is used here: the candidates are compared along several dimensions to decide which plan is better, and in the end the DataSource operator is replaced by an index scan operator.
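The general idea of skyline pruning can be sketched as follows. This is a simplified illustration, not TiDB's actual code: the AccessPath struct and its three dimensions (number of index columns used by the access conditions, whether a single scan is enough, whether the required order is already provided) are assumptions chosen for this example. A candidate is dropped when some other candidate is at least as good on every dimension and strictly better on at least one.

```go
package main

import "fmt"

// AccessPath is a hypothetical, simplified candidate for reading a table.
// The three fields are assumed comparison dimensions for this sketch.
type AccessPath struct {
	Name         string
	CoveredCols  int  // index columns constrained by the access conditions
	SingleScan   bool // true when no extra table lookup is needed
	MatchesOrder bool // true when the path already yields the required order
}

func score(b bool) int {
	if b {
		return 1
	}
	return 0
}

// dominates reports whether x is at least as good as y on every dimension
// and strictly better on at least one.
func dominates(x, y AccessPath) bool {
	dx := [3]int{x.CoveredCols, score(x.SingleScan), score(x.MatchesOrder)}
	dy := [3]int{y.CoveredCols, score(y.SingleScan), score(y.MatchesOrder)}
	better := false
	for i := range dx {
		if dx[i] < dy[i] {
			return false
		}
		if dx[i] > dy[i] {
			better = true
		}
	}
	return better
}

// Skyline keeps only the paths that no other path dominates.
func Skyline(paths []AccessPath) []AccessPath {
	var out []AccessPath
	for i, p := range paths {
		dominated := false
		for j, q := range paths {
			if i != j && dominates(q, p) {
				dominated = true
				break
			}
		}
		if !dominated {
			out = append(out, p)
		}
	}
	return out
}

// Candidates models the two options for: select a from t1 where ...
func Candidates() []AccessPath {
	return []AccessPath{
		{Name: "TableScan", CoveredCols: 0, SingleScan: true},
		{Name: "IndexScan(a)", CoveredCols: 1, SingleScan: true},
	}
}

func main() {
	for _, p := range Skyline(Candidates()) {
		fmt.Println(p.Name) // prints IndexScan(a)
	}
}
```

In this example the index path wins outright because it is never worse than the table scan. When several candidates survive, none dominates the others, and some further criterion (such as estimated cost) has to pick among them.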

The final execution plan looks like this:

 

 End;

Origin www.cnblogs.com/lijingshanxi/p/12077587.html