Calcite analysis - Rule

For an analysis of the Calcite source code, refer to:

http://matt33.com/2019/03/07/apache-calcite-process-flow/

https://matt33.com/2019/03/17/apache-calcite-planner/

 

Rules are the core of query optimization in Calcite.

Let's look at a few representative rules to see how they are implemented.

 

Start with the simplest case, join associativity: JoinAssociateRule.

First of all, every rule class extends RelOptRule.

/**
 * Planner rule that changes a join based on the associativity rule.
 *
 * <p>((a JOIN b) JOIN c) &rarr; (a JOIN (b JOIN c))</p>
 *
 * <p>We do not need a rule to convert (a JOIN (b JOIN c)) &rarr;
 * ((a JOIN b) JOIN c) because we have
 * {@link JoinCommuteRule}.
 *
 * @see JoinCommuteRule
 */
public class JoinAssociateRule extends RelOptRule {

RelOptRule is used to transform one expression into another.

It maintains an operand tree that describes the tree structure the rule can be applied to; see the concrete example below.

/**
 * A <code>RelOptRule</code> transforms an expression into another. It has a
 * list of {@link RelOptRuleOperand}s, which determine whether the rule can be
 * applied to a particular section of the tree.
 *
 * <p>The optimizer figures out which rules are applicable, then calls
 * {@link #onMatch} on each of them.</p>
 */
public abstract class RelOptRule {
 
  /**
   * Root of operand tree.
   */
  private final RelOptRuleOperand operand;

  /** Factory for a builder for relational expressions.
   *
   * <p>The actual builder is available via {@link RelOptRuleCall#builder()}. */
  public final RelBuilderFactory relBuilderFactory;

  /**
   * Flattened list of operands.
   */
  public final List<RelOptRuleOperand> operands;

  //~ Constructors -----------------------------------------------------------

  /**
   * Creates a rule with an explicit description.
   *
   * @param operand     root operand, must not be null
   * @param description Description, or null to guess description
   * @param relBuilderFactory Builder for relational expressions
   */
  public RelOptRule(RelOptRuleOperand operand,
      RelBuilderFactory relBuilderFactory, String description) {
    this.operand = Objects.requireNonNull(operand);
    this.relBuilderFactory = Objects.requireNonNull(relBuilderFactory);
    this.description = description;
    this.operands = flattenOperands(operand);
    assignSolveOrder();
  }

For example, the constructor of the join associativity rule calls super, i.e. the RelOptRule constructor:

  /**
   * Creates a JoinAssociateRule.
   */
  public JoinAssociateRule(RelBuilderFactory relBuilderFactory) {
    super(
        operand(Join.class,
            operand(Join.class, any()),
            operand(RelSubset.class, any())),
        relBuilderFactory, null);
  }

The operand is a tree: the top operand is of type Join and has two children, one of type Join and the other of type RelSubset.

The children of those two operands are any().

  /**
   * Creates a list of child operands that signifies that the operand matches
   * any number of child relational expressions.
   *
   * @return List of child operands that signifies that the operand matches
   *   any number of child relational expressions
   */
  public static RelOptRuleOperandChildren any() {
    return RelOptRuleOperandChildren.ANY_CHILDREN;
  }
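The operand tree described above can be sketched without Calcite on the classpath. This is a minimal sketch only: `ToyOperand`, `flatten` and `joinAssociateOperands` are hypothetical names invented for illustration, and a type name string stands in for the `Class` object a real operand holds. The pre-order flattening is an assumption about how the "flattened list of operands" is built; it is consistent with onMatch reading the top join, the bottom join and the subset as rel(0), rel(1) and rel(2).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for RelOptRuleOperand: a node type name plus child operands. */
final class ToyOperand {
  final String typeName;            // e.g. "Join"; a real operand holds a Class
  final List<ToyOperand> children;

  ToyOperand(String typeName, ToyOperand... children) {
    this.typeName = typeName;
    this.children = Arrays.asList(children);
  }

  /** Pre-order flattening: the root comes first, then each subtree in turn. */
  static List<ToyOperand> flatten(ToyOperand root) {
    List<ToyOperand> out = new ArrayList<>();
    collect(root, out);
    return out;
  }

  private static void collect(ToyOperand op, List<ToyOperand> out) {
    out.add(op);
    for (ToyOperand child : op.children) {
      collect(child, out);
    }
  }

  /** The shape declared by the JoinAssociateRule constructor shown above. */
  static ToyOperand joinAssociateOperands() {
    return new ToyOperand("Join",
        new ToyOperand("Join"),
        new ToyOperand("RelSubset"));
  }

  public static void main(String[] args) {
    for (ToyOperand op : flatten(joinAssociateOperands())) {
      System.out.println(op.typeName);
    }
  }
}
```

The position of each operand in the flattened list is what a rule later uses to pick out the matched nodes by ordinal.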

 

To understand how a rule is used, look at the code inside HepPlanner.

HepPlanner's logic is simple: it traverses all HepRelVertex nodes and checks whether a RelOptRule matches; if it does, the rule is applied.

So it calls applyRule, whose inputs include the rule and the vertex:

  private HepRelVertex applyRule(
      RelOptRule rule,
      HepRelVertex vertex,
      boolean forceConversions) {

    final List<RelNode> bindings = new ArrayList<>();
    final Map<RelNode, List<RelNode>> nodeChildren = new HashMap<>();
    boolean match =
        matchOperands(
            rule.getOperand(),
            vertex.getCurrentRel(),
            bindings,
            nodeChildren);

    if (!match) {
      return null;
    }

    HepRuleCall call =
        new HepRuleCall(
            this,
            rule.getOperand(),
            bindings.toArray(new RelNode[0]),
            nodeChildren,
            parents);

    // Allow the rule to apply its own side-conditions.
    if (!rule.matches(call)) {
      return null;
    }

    fireRule(call);

    if (!call.getResults().isEmpty()) {
      return applyTransformationResults(
          vertex,
          call,
          parentTrait);
    }
    return null;
  }

1. First, matchOperands is called to check whether the rule matches:

  private boolean matchOperands(
      RelOptRuleOperand operand,   // the rule's operand
      RelNode rel,   // the RelNode held by the vertex
      List<RelNode> bindings,
      Map<RelNode, List<RelNode>> nodeChildren) {
    if (!operand.matches(rel)) { // first check whether the operand matches the top node
      return false;
    }
    bindings.add(rel); // on a match, add the top RelNode to bindings
    // then check whether the children match
    // the child policy comes in several kinds:
    // ANY: anything goes, return true directly
    // UNORDERED: no ordering, each child operand must be matched by at least one child RelNode
    // default (SOME): operands and child RelNodes are matched strictly in order
    List<HepRelVertex> childRels = (List) rel.getInputs();
    switch (operand.childPolicy) {
    case ANY:
      return true;
    case UNORDERED:
      // For each operand, at least one child must match. If
      // matchAnyChildren, usually there's just one operand.
      for (RelOptRuleOperand childOperand : operand.getChildOperands()) {
        boolean match = false;
        for (HepRelVertex childRel : childRels) {
          match =
              matchOperands( // recursive call to matchOperands
                  childOperand,
                  childRel.getCurrentRel(),
                  bindings,
                  nodeChildren);
          if (match) {
            break;
          }
        }
        if (!match) {
          return false;
        }
      }
      final List<RelNode> children = new ArrayList<>(childRels.size());
      for (HepRelVertex childRel : childRels) {
        children.add(childRel.getCurrentRel());
      }
      nodeChildren.put(rel, children);
      return true;
    default:
      int n = operand.getChildOperands().size();
      if (childRels.size() < n) {
        return false;
      }
      // match one-to-one, in order
      for (Pair<HepRelVertex, RelOptRuleOperand> pair
          : Pair.zip(childRels, operand.getChildOperands())) {
        boolean match =
            matchOperands(
                pair.right,
                pair.left.getCurrentRel(),
                bindings,
                nodeChildren);
        if (!match) {
          return false;
        }
      }
      return true;
    }
  }

The logic is to compare the node itself, then recursively compare its children.
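To make the recursion easier to follow, here is a small sketch of the same idea on toy types. `ToyMatcher`, `Node`, `Operand` and `Policy` are hypothetical names, not Calcite classes, and a plain string comparison stands in for `operand.matches(rel)`. As in the excerpt, bindings are not rolled back when a later child fails, so a caller should discard the list when the result is false.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ToyMatcher {
  enum Policy { ANY, SOME, UNORDERED }

  /** Toy plan node: a type name and its inputs. */
  static final class Node {
    final String type;
    final List<Node> inputs;
    Node(String type, Node... inputs) {
      this.type = type;
      this.inputs = Arrays.asList(inputs);
    }
  }

  /** Toy operand: the type it accepts, a child policy, and child operands. */
  static final class Operand {
    final String type;
    final Policy policy;
    final List<Operand> children;
    Operand(String type, Policy policy, Operand... children) {
      this.type = type;
      this.policy = policy;
      this.children = Arrays.asList(children);
    }
  }

  /** Returns true on a match; matched nodes are appended to bindings in pre-order. */
  static boolean match(Operand op, Node node, List<Node> bindings) {
    if (!op.type.equals(node.type)) {
      return false;                       // the node itself must match first
    }
    bindings.add(node);
    switch (op.policy) {
    case ANY:
      return true;                        // children are not inspected
    case UNORDERED:
      for (Operand childOp : op.children) {
        boolean found = false;
        for (Node input : node.inputs) {
          if (match(childOp, input, bindings)) {
            found = true;
            break;
          }
        }
        if (!found) {
          return false;                   // no input satisfies this child operand
        }
      }
      return true;
    default:                              // SOME: pair children up in order
      if (node.inputs.size() < op.children.size()) {
        return false;
      }
      for (int i = 0; i < op.children.size(); i++) {
        if (!match(op.children.get(i), node.inputs.get(i), bindings)) {
          return false;
        }
      }
      return true;
    }
  }

  public static void main(String[] args) {
    Node plan = new Node("Join",
        new Node("Join", new Node("Scan"), new Node("Scan")),
        new Node("Scan"));
    Operand shape = new Operand("Join", Policy.SOME,
        new Operand("Join", Policy.ANY),
        new Operand("Scan", Policy.ANY));
    List<Node> bindings = new ArrayList<>();
    System.out.println(match(shape, plan, bindings) + " with " + bindings.size() + " bindings");
  }
}
```

Swapping the two child operands makes the SOME match fail, while an UNORDERED operand with a single "Scan" child still matches because one of the two inputs is a scan.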

The comparison function is shown below.

As it shows, the match is decided by the class, the trait, and the predicate.

 /**
   * Returns whether a relational expression matches this operand. It must be
   * of the right class and trait.
   */
  public boolean matches(RelNode rel) {
    if (!clazz.isInstance(rel)) {
      return false;
    }
    if ((trait != null) && !rel.getTraitSet().contains(trait)) {
      return false;
    }
    return predicate.test(rel);
  }

The child policy is one of the following:

/**
 * Policy by which operands will be matched by relational expressions with
 * any number of children.
 */
public enum RelOptRuleOperandChildPolicy {
  /**
   * Signifies that operand can have any number of children.
   */
  ANY,

  /**
   * Signifies that operand has no children. Therefore it matches a
   * leaf node, such as a table scan or VALUES operator.
   *
   * <p>{@code RelOptRuleOperand(Foo.class, NONE)} is equivalent to
   * {@code RelOptRuleOperand(Foo.class)} but we prefer the former because
   * it is more explicit.</p>
   */
  LEAF,

  /**
   * Signifies that the operand's children must precisely match its
   * child operands, in order.
   */
  SOME,

  /**
   * Signifies that the rule matches any one of its parents' children.
   * The parent may have one or more children.
   */
  UNORDERED,
}

Each policy is matched according to the rule stated in its comment.

2. If the operands match, the current match result is wrapped in a HepRuleCall, which extends RelOptRuleCall.

A RelOptRuleCall is one invocation of a RelOptRule; it records the context data used by the call and its results.

/**
 * A <code>RelOptRuleCall</code> is an invocation of a {@link RelOptRule} with a
 * set of {@link RelNode relational expression}s as arguments.
 */
public abstract class RelOptRuleCall {

  /**
   * Generator for {@link #id} values.
   */
  private static int nextId = 0;

  //~ Instance fields --------------------------------------------------------

  public final int id;
  protected final RelOptRuleOperand operand0; // root of the rule's operand tree
  protected Map<RelNode, List<RelNode>> nodeInputs; // each RelNode and its inputs
  public final RelOptRule rule;
  public final RelNode[] rels; // all the RelNodes that were matched
  private final RelOptPlanner planner;
  private final List<RelNode> parents; // the parents of the top RelNode

 

3. Call rule.matches(call)

The default implementation returns true, i.e. it checks nothing; this hook lets a rule add its own specific match conditions.

 

4. Call fireRule(call)

This mainly calls ruleCall.getRule().onMatch(ruleCall);

So whatever the rule actually does must be defined in its onMatch method.
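Steps 2 to 4 can be sketched as a tiny rule loop. The names `ToyRuleFlow`, `Call`, `Rule` and `apply` are hypothetical, strings stand in for RelNodes, and taking the first result is a simplification of whatever applyTransformationResults really does when a call produces several results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ToyRuleFlow {
  /** Toy stand-in for RelOptRuleCall: matched nodes in, transformed results out. */
  static final class Call {
    final List<String> rels;
    final List<String> results = new ArrayList<>();
    Call(List<String> rels) {
      this.rels = rels;
    }
    void transformTo(String newRel) {
      results.add(newRel);
    }
  }

  /** Toy stand-in for RelOptRule. */
  abstract static class Rule {
    boolean matches(Call call) {
      return true;                        // default: no extra side condition
    }
    abstract void onMatch(Call call);
  }

  /** Mirrors the tail of applyRule: side condition, fire, then collect a result. */
  static String apply(Rule rule, Call call) {
    if (!rule.matches(call)) {
      return null;
    }
    rule.onMatch(call);                   // what fireRule boils down to
    return call.results.isEmpty() ? null : call.results.get(0);
  }

  public static void main(String[] args) {
    Rule swap = new Rule() {
      @Override void onMatch(Call call) {
        call.transformTo(call.rels.get(1) + " JOIN " + call.rels.get(0));
      }
    };
    System.out.println(apply(swap, new Call(Arrays.asList("a", "b")))); // prints b JOIN a
  }
}
```

A rule that overrides matches to return false never reaches onMatch, and a rule whose onMatch returns without calling transformTo leaves the plan unchanged, which is the path JoinAssociateRule takes when the conventions differ.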

 

Here is how JoinAssociateRule implements it:

  public void onMatch(final RelOptRuleCall call) {
    final Join topJoin = call.rel(0);
    final Join bottomJoin = call.rel(1);
    final RelNode relA = bottomJoin.getLeft();
    final RelNode relB = bottomJoin.getRight();
    final RelSubset relC = call.rel(2);
    final RelOptCluster cluster = topJoin.getCluster();
    final RexBuilder rexBuilder = cluster.getRexBuilder();

    if (relC.getConvention() != relA.getConvention()) {
      // relC could have any trait-set. But if we're matching say
      // EnumerableConvention, we're only interested in enumerable subsets.
      return;
    }

    //        topJoin
    //        /     \
    //   bottomJoin  C
    //    /    \
    //   A      B

    final int aCount = relA.getRowType().getFieldCount();
    final int bCount = relB.getRowType().getFieldCount();
    final int cCount = relC.getRowType().getFieldCount();
    final ImmutableBitSet aBitSet = ImmutableBitSet.range(0, aCount);
    final ImmutableBitSet bBitSet =
        ImmutableBitSet.range(aCount, aCount + bCount);

The first part is initialization and is easy to understand. But what are the field counts computed at the end used for?

They are needed because of RexInputRef.

What RexNode and RelNode have in common is that both represent a tree.

The difference is in what they represent: a RelNode tree is made up of relational algebra operators, while a RexNode tree represents an expression.

For example, RexLiteral denotes a constant, RexVariable denotes a variable, and RexCall denotes an operation that connects variables and literals.

A RexVariable often represents an input field; for efficiency only the field's index is recorded, and that is RexInputRef.

/**
 * Variable which references a field of an input relational expression.
 *
 * <p>Fields of the input are 0-based. If there is more than one input, they are
 * numbered consecutively. For example, if the inputs to a join are</p>
 *
 * <ul>
 * <li>Input #0: EMP(EMPNO, ENAME, DEPTNO) and</li>
 * <li>Input #1: DEPT(DEPTNO AS DEPTNO2, DNAME)</li>
 * </ul>
 *
 * <p>then the fields are:</p>
 *
 * <ul>
 * <li>Field #0: EMPNO</li>
 * <li>Field #1: ENAME</li>
 * <li>Field #2: DEPTNO (from EMP)</li>
 * <li>Field #3: DEPTNO2 (from DEPT)</li>
 * <li>Field #4: DNAME</li>
 * </ul>
 *
 * <p>So <code>RexInputRef(3, Integer)</code> is the correct reference for the
 * field DEPTNO2.</p>
 */
public class RexInputRef extends RexSlot {

The index in a RexInputRef is not fixed; it is assigned consecutively across the inputs, in order.

Therefore, from the field counts computed above, the index range that belongs to each input can be worked out.
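As a small worked example of that numbering, the sketch below uses java.util.BitSet in place of ImmutableBitSet. `FieldRanges`, `range` and `globalIndex` are hypothetical helpers; the counts 3 and 2 come from the EMP and DEPT example in the Javadoc above.

```java
import java.util.BitSet;

final class FieldRanges {
  /** Sets bits [start, start + count), the index range owned by one input. */
  static BitSet range(int start, int count) {
    BitSet bits = new BitSet();
    bits.set(start, start + count);
    return bits;
  }

  /** Index of an input's local field once the fields of earlier inputs are counted. */
  static int globalIndex(int offset, int local) {
    return offset + local;
  }

  public static void main(String[] args) {
    int empCount = 3;                                   // EMPNO, ENAME, DEPTNO
    int deptCount = 2;                                  // DEPTNO2, DNAME
    System.out.println(range(0, empCount));             // prints {0, 1, 2}
    System.out.println(range(empCount, deptCount));     // prints {3, 4}
    System.out.println(globalIndex(empCount, 0));       // DEPTNO2, prints 3
  }
}
```

In the rule, aBitSet and bBitSet are built the same way: A owns the range starting at 0 and B owns the range starting at aCount.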

 

The code that follows moves the join conditions to their new positions and adjusts the corresponding indexes.

    // Goal is to transform to
    //
    //       newTopJoin
    //        /     \
    //       A   newBottomJoin
    //               /    \
    //              B      C

    // Split the condition of topJoin and bottomJoin into a conjunctions. A
    // condition can be pushed down if it does not use columns from A.
    final List<RexNode> top = new ArrayList<>();
    final List<RexNode> bottom = new ArrayList<>();
    JoinPushThroughJoinRule.split(topJoin.getCondition(), aBitSet, top, bottom);
    JoinPushThroughJoinRule.split(bottomJoin.getCondition(), aBitSet, top,
        bottom);

    // Mapping for moving conditions from topJoin or bottomJoin to
    // newBottomJoin.
    // target: | B | C      |
    // source: | A       | B | C      |
    final Mappings.TargetMapping bottomMapping =
        Mappings.createShiftMapping(
            aCount + bCount + cCount,
            0, aCount, bCount,
            bCount, aCount + bCount, cCount);
    final List<RexNode> newBottomList = new ArrayList<>();
    new RexPermuteInputsShuttle(bottomMapping, relB, relC)
        .visitList(bottom, newBottomList);
    RexNode newBottomCondition =
        RexUtil.composeConjunction(rexBuilder, newBottomList);

1. The conditions of the original topJoin and bottomJoin are divided into those that reference A and those that do not.

After the transformation, conditions that reference A must be placed on newTopJoin, while conditions that do not reference A can be pushed down to newBottomJoin.

aBitSet records the indexes of all of A's fields.

  /**
   * Splits a condition into conjunctions that do or do not intersect with
   * a given bit set.
   */
  static void split(
      RexNode condition,
      ImmutableBitSet bitSet,
      List<RexNode> intersecting,
      List<RexNode> nonIntersecting) {
    for (RexNode node : RelOptUtil.conjunctions(condition)) { // split the condition into its conjuncts
      ImmutableBitSet inputBitSet = RelOptUtil.InputFinder.bits(node); // find the fields the conjunct uses
      if (bitSet.intersects(inputBitSet)) {
        intersecting.add(node);
      } else {
        nonIntersecting.add(node);
      }
    }
  }
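The effect of split can be tried on toy data. In this sketch each conjunct is represented only by a label and the set of field indexes it uses; `ToySplit` and its `bits` helper are hypothetical, and java.util.BitSet again stands in for ImmutableBitSet. The example assumes A has fields 0 and 1, B has fields 2 and 3, and C has field 4.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ToySplit {
  static BitSet bits(int... indexes) {
    BitSet b = new BitSet();
    for (int i : indexes) {
      b.set(i);
    }
    return b;
  }

  /** Routes each conjunct by whether the fields it uses overlap bitSet. */
  static void split(Map<String, BitSet> conjuncts, BitSet bitSet,
      List<String> intersecting, List<String> nonIntersecting) {
    for (Map.Entry<String, BitSet> e : conjuncts.entrySet()) {
      if (e.getValue().intersects(bitSet)) {
        intersecting.add(e.getKey());
      } else {
        nonIntersecting.add(e.getKey());
      }
    }
  }

  public static void main(String[] args) {
    BitSet aBitSet = bits(0, 1);                    // fields of A
    Map<String, BitSet> conjuncts = new LinkedHashMap<>();
    conjuncts.put("$0 = $4", bits(0, 4));           // from topJoin, uses A and C
    conjuncts.put("$3 = $4", bits(3, 4));           // from topJoin, uses B and C
    conjuncts.put("$1 = $2", bits(1, 2));           // from bottomJoin, uses A and B
    List<String> top = new ArrayList<>();
    List<String> bottom = new ArrayList<>();
    split(conjuncts, aBitSet, top, bottom);
    System.out.println("top=" + top + " bottom=" + bottom);
  }
}
```

Only "$3 = $4" avoids every field of A, so it is the only conjunct that can move down to the new bottom join.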

How are the fields used by a condition found?

As shown below, every inputRef encountered during the recursive traversal is recorded.

    /**
     * Returns a bit set describing the inputs used by an expression.
     */
    public static ImmutableBitSet bits(RexNode node) {
      return analyze(node).inputBitSet.build();
    }

  /** Returns an input finder that has analyzed a given expression. */
    public static InputFinder analyze(RexNode node) {
      final InputFinder inputFinder = new InputFinder();
      node.accept(inputFinder); // visitor pattern
      return inputFinder;
    }

    // if the RexNode is an InputRef, record the index of that ref
    public Void visitInputRef(RexInputRef inputRef) {
      inputBitSet.set(inputRef.getIndex());
      return null;
    }

    // if the RexNode is a call, recursively visit each of its operands
    public R visitCall(RexCall call) {
      if (!deep) {
        return null;
      }

      R r = null;
      for (RexNode operand : call.operands) {
        r = operand.accept(this);
      }
      return r;
    }
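The same visitor idea can be shown end to end on a toy expression tree. Everything here (`ToyInputFinder`, `Expr`, `InputRef`, `Literal`, `Call`) is a hypothetical stand-in for the Rex classes, and the sketch always recurses into calls, so it has no equivalent of the deep flag.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

final class ToyInputFinder {
  interface Expr {
    void accept(ToyInputFinder finder);
  }

  static final class InputRef implements Expr {
    final int index;
    InputRef(int index) {
      this.index = index;
    }
    public void accept(ToyInputFinder finder) {
      finder.visitInputRef(this);
    }
  }

  static final class Literal implements Expr {
    final Object value;
    Literal(Object value) {
      this.value = value;
    }
    public void accept(ToyInputFinder finder) {
      // a constant references no input field
    }
  }

  static final class Call implements Expr {
    final String op;
    final List<Expr> operands;
    Call(String op, Expr... operands) {
      this.op = op;
      this.operands = Arrays.asList(operands);
    }
    public void accept(ToyInputFinder finder) {
      finder.visitCall(this);
    }
  }

  final BitSet inputBitSet = new BitSet();

  void visitInputRef(InputRef ref) {
    inputBitSet.set(ref.index);           // record the referenced field
  }

  void visitCall(Call call) {
    for (Expr operand : call.operands) {
      operand.accept(this);               // recurse into every operand
    }
  }

  /** Returns the set of input field indexes used anywhere in expr. */
  static BitSet bits(Expr expr) {
    ToyInputFinder finder = new ToyInputFinder();
    expr.accept(finder);
    return finder.inputBitSet;
  }

  public static void main(String[] args) {
    Expr condition = new Call("AND",
        new Call("=", new InputRef(0), new InputRef(4)),
        new Call(">", new InputRef(3), new Literal(10)));
    System.out.println(bits(condition));  // prints {0, 3, 4}
  }
}
```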

 

2. Since the tree structure has changed, the field indexes change as well.

In the original bottomJoin, the fields of A start at 0 and the fields of B start at aCount.
In the converted newBottomJoin, the fields of B start at 0 and the fields of C start at bCount.

So the input refs need to be corrected.

createShiftMapping produces a mapping from each original index to its new index.

The first parameter is the size of the source index range, aCount + bCount + cCount in total, so every source index must be less than this value.

Only the indexes of B and C need to be adjusted; the indexes of A do not.

Each following group of three arguments is (target start, source start, length): B originally started at aCount and now starts at 0, with bCount fields in total; C originally started at aCount + bCount and now starts at bCount, with cCount fields in total.
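The shift can be checked by hand with a plain int array. This is only a sketch: `ToyShiftMapping.create` is a hypothetical helper that reads its arguments as (target start, source start, length) triples, the same order as in the call above, and marks unmapped source indexes with -1. The example reuses aCount = 2, bCount = 2 and cCount = 1.

```java
import java.util.Arrays;

final class ToyShiftMapping {
  /**
   * Builds a source-to-target table. Each triple is (target start, source
   * start, length); source indexes outside every triple stay at -1.
   */
  static int[] create(int sourceCount, int... triples) {
    int[] targets = new int[sourceCount];
    Arrays.fill(targets, -1);
    for (int i = 0; i < triples.length; i += 3) {
      int target = triples[i];
      int source = triples[i + 1];
      int length = triples[i + 2];
      for (int j = 0; j < length; j++) {
        targets[source + j] = target + j;
      }
    }
    return targets;
  }

  public static void main(String[] args) {
    int aCount = 2;
    int bCount = 2;
    int cCount = 1;
    int[] mapping = create(aCount + bCount + cCount,
        0, aCount, bCount,                  // B: source 2..3 -> target 0..1
        bCount, aCount + bCount, cCount);   // C: source 4    -> target 2
    System.out.println(Arrays.toString(mapping)); // prints [-1, -1, 0, 1, 2]
  }
}
```

With this table a pushed-down condition such as $3 = $4 on the old field layout becomes $1 = $2 on the new bottom join, whose input is B followed by C.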

Then RexPermuteInputsShuttle performs the actual update:

/**
 * Shuttle which applies a permutation to its input fields.
 *
 * @see RexPermutationShuttle
 * @see RexUtil#apply(org.apache.calcite.util.mapping.Mappings.TargetMapping, RexNode)
 */
public class RexPermuteInputsShuttle extends RexShuttle {
  //~ Instance fields --------------------------------------------------------

  private final Mappings.TargetMapping mapping;
  private final ImmutableList<RelDataTypeField> fields;

  // for each InputRef, update the index according to the mapping
  @Override public RexNode visitInputRef(RexInputRef local) {
    final int index = local.getIndex();
    int target = mapping.getTarget(index);
    return new RexInputRef(
        target,
        local.getType());
  }

 

Finally, build the new joins:

    RexNode newBottomCondition =
        RexUtil.composeConjunction(rexBuilder, newBottomList);

    final Join newBottomJoin =
        bottomJoin.copy(bottomJoin.getTraitSet(), newBottomCondition, relB,
            relC, JoinRelType.INNER, false);

    // Condition for newTopJoin consists of pieces from bottomJoin and topJoin.
    // Field ordinals do not need to be changed.
    RexNode newTopCondition = RexUtil.composeConjunction(rexBuilder, top);
    @SuppressWarnings("SuspiciousNameCombination")
    final Join newTopJoin =
        topJoin.copy(topJoin.getTraitSet(), newTopCondition, relA,
            newBottomJoin, JoinRelType.INNER, false);

    call.transformTo(newTopJoin);

 

Origin www.cnblogs.com/fxjwind/p/11279080.html