Clickhouse Explain

一、Explain

1. 功能版本及语法

1.1 版本
20.6之后
1.2 语法
EXPLAIN [AST | SYNTAX | PLAN | PIPELINE] [setting = value, ...] SELECT ... [FORMAT ...]

2.Explain Types

2.1 Types
  1. AST — Abstract syntax tree (抽象语法树)

  2. SYNTAX — Query text after AST-level optimizations(在 AST 级优化后查询文本。)

  3. PLAN — Query execution plan(查询执行计划)

  4. PIPLINE — Query execution pipeline(查询执行管道)

2.2.AST
  1. simple query
    EXPLAIN AST SELECT 1;
    
    
    ┌─explain───────────────────────────┐
    │ SelectWithUnionQuery (children 1) │
    │  ExpressionList (children 1)      │
    │   SelectQuery (children 1)        │
    │    ExpressionList (children 1)    │
    │     Literal UInt64_1              │
    └───────────────────────────────────┘
    
  2. complex query
    EXPLAIN AST ALTER TABLE t1 DELETE WHERE date = today();
    
    
    ┌─explain──────────────────────────┐
    │ AlterQuery  t1 (children 1)      │
    │  ExpressionList (children 1)     │
    │   AlterCommand 32 (children 1)   │
    │    Function equals (children 1)  │
    │     ExpressionList (children 2)  │
    │      Identifier date             │
    │      Function today (children 1) │
    │       ExpressionList             │
    └──────────────────────────────────┘
    
2.3.SYNTAX

返回语法优化后的查询,这个是我们开发过程中经常用到的,可以看下面两个例子

  1. 多表联查会优化成 CROSS JOIN
    EXPLAIN SYNTAX SELECT * FROM system.numbers AS a, system.numbers AS b, system.numbers AS c;
    
    
    ┌─explain────────────────────────────┐
    │ SELECT                             │
    │     `--a.number` AS `a.number`,    │
    │     `--b.number` AS `b.number`,    │
    │     number AS `c.number`           │
    │ FROM                               │
    │ (                                  │
    │     SELECT                         │
    │         number AS `--a.number`,    │
    │         b.number AS `--b.number`   │
    │     FROM system.numbers AS a       │
    │     CROSS JOIN system.numbers AS b │
    │ ) AS `--.s`                        │
    │ CROSS JOIN system.numbers AS c     │
    └────────────────────────────────────┘
    
    
    
    
  2. 三元运算符优化
    #查看优化
    EXPLAIN SYNTAX SELECT number = 1 ? 'ck' : (number = 2 ? 'hbase' : 'hive') FROM numbers(10);
    
    ┌─explain──────────────────────────────────────────────────────┐
    │ SELECT if(number = 1, 'ck', if(number = 2, 'hbase', 'hive')) │
    │ FROM numbers(10)                                             │
    └──────────────────────────────────────────────────────────────┘
    
    
    #开启三元运算符优化
    SET optimize_if_chain_to_multiif = 1;
    #再次查看优化后效果
    EXPLAIN SYNTAX SELECT number = 1 ? 'ck' : (number = 2 ? 'hbase' : 'hive') FROM numbers(10);
    
    ┌─explain───────────────────────────────────────────────────────┐
    │ SELECT multiIf(number = 1, 'ck', number = 2, 'hbase', 'hive') │
    │ FROM numbers(10)                                              │
    └───────────────────────────────────────────────────────────────┘
    
    可以看到优化后的已经将 多个  `if()`函数替换为 `multiIf()` 函数
    
    
2.4.PLAN

查看执行计划,相对hive比较简洁

  1. simple query
    explain plan select arrayJoin([1,2,3,null,null]);
    
    ┌─explain───────────────────────────────────────────────────────────────────┐
    │ Expression ((Projection + Before ORDER BY))                               │
    │   SettingQuotaAndLimits (Set limits and quota after reading from storage) │
    │     ReadFromStorage (SystemOne)                                           │
    └───────────────────────────────────────────────────────────────────────────┘
    
    
  2. complex query
    explain select database,table,count(1) cnt from system.parts where database in ('datasets','system') group by database,table order by database,cnt desc limit 2 by database;
    
    
    ┌─explain─────────────────────────────────────────────────────────────────────────────────────┐
    │ Expression (Projection)                                                                     │
    │   LimitBy                                                                                   │
    │     Expression (Before LIMIT BY)                                                            │
    │       MergingSorted (Merge sorted streams for ORDER BY)                                     │
    │         MergeSorting (Merge sorted blocks for ORDER BY)                                     │
    │           PartialSorting (Sort each block for ORDER BY)                                     │
    │             Expression (Before ORDER BY)                                                    │
    │               Aggregating                                                                   │
    │                 Expression (Before GROUP BY)                                                │
    │                   Filter (WHERE)                                                            │
    │                     SettingQuotaAndLimits (Set limits and quota after reading from storage) │
    │                       ReadFromStorage (SystemParts)                                         │
    └─────────────────────────────────────────────────────────────────────────────────────────────┘
    

    大体的一个执行路线
    limit -> merge sort (order by 归并排序) -> agg(count() 聚合) -> filter(where 过滤 )

    ​ 另外, explain 默认就是explain plain

  3. open all param

    打开其他参数

    EXPLAIN header=1, actions=1,description=1 SELECT number from system.numbers limit 10;
    
    
    ┌─explain───────────────────────────────────────────────────────────────────┐
    │ Expression ((Projection + Before ORDER BY))                               │
    │ Header: number UInt64                                                     │
    │ Actions: INPUT :: 0 -> number UInt64 : 0                                  │
    │ Positions: 0                                                              │
    │   SettingQuotaAndLimits (Set limits and quota after reading from storage) │
    │   Header: number UInt64                                                   │
    │     Limit (preliminary LIMIT (without OFFSET))                            │
    │     Header: number UInt64                                                 │
    │     Limit 10                                                              │
    │     Offset 0                                                              │
    │       ReadFromStorage (SystemNumbers)                                     │
    │       Header: number UInt64                                               │
    └───────────────────────────────────────────────────────────────────────────┘
    
    
2.5.PIPLINE

查看执行管道

  1. 不带参数

    EXPLAIN PIPELINE SELECT sum(number) FROM numbers_mt(10000) GROUP BY number % 20;
    
    
    ┌─explain───────────────────────┐
    │ (Expression)                  │
    │ ExpressionTransform           │
    │   (Aggregating)               │
    │   AggregatingTransform        │
    │     (Expression)              │
    │     ExpressionTransform       │
    │       (SettingQuotaAndLimits) │
    │         (ReadFromStorage)     │
    │         Limit                 │
    │           Numbers 0 → 1       │
    └───────────────────────────────┘
    
  2. 其他参数

    EXPLAIN PIPELINE header=1,graph=1 SELECT sum(number) FROM numbers_mt(10000) GROUP BY number%20;
    
    
    ┌─explain─────────────────────────────────────┐
    │ digraph                                     │
    │ {                                           │
    │   rankdir="LR";                             │
    │   { node [shape = rect]                     │
    │         n2 [label="Limit"];                 │
    │         n1 [label="Numbers"];               │
    │     subgraph cluster_0 {                    │
    │       label ="Aggregating";                 │
    │       style=filled;                         │
    │       color=lightgrey;                      │
    │       node [style=filled,color=white];      │
    │       { rank = same;                        │
    │         n4 [label="AggregatingTransform"];  │
    │       }                                     │
    │     }                                       │
    │     subgraph cluster_1 {                    │
    │       label ="Expression";                  │
    │       style=filled;                         │
    │       color=lightgrey;                      │
    │       node [style=filled,color=white];      │
    │       { rank = same;                        │
    │         n3 [label="ExpressionTransform"];   │
    │       }                                     │
    │     }                                       │
    │     subgraph cluster_2 {                    │
    │       label ="Expression";                  │
    │       style=filled;                         │
    │       color=lightgrey;                      │
    │       node [style=filled,color=white];      │
    │       { rank = same;                        │
    │         n5 [label="ExpressionTransform"];   │
    │       }                                     │
    │     }                                       │
    │   }                                         │
    │   n2 -> n3 [label="                         │
    │ number UInt64 UInt64(size = 0)"];           │
    │   n1 -> n2 [label="                         │
    │ number UInt64 UInt64(size = 0)"];           │
    │   n4 -> n5 [label="                         │
    │ modulo(number, 20) UInt8 UInt8(size = 0)    │
    │ sum(number) UInt64 UInt64(size = 0)"];      │
    │   n3 -> n4 [label="                         │
    │ number UInt64 UInt64(size = 0)              │
    │ modulo(number, 20) UInt8 UInt8(size = 0)"]; │
    │ }                                           │
    └─────────────────────────────────────────────┘
    
    

二、Explain Estimate

显示在处理查询时要从表中读取的估计行数标记数和部分数适用于MergeTree family.

  1. 创建表

    CREATE TABLE ttt (i Int64) ENGINE = MergeTree() ORDER BY i SETTINGS index_granularity = 16, write_final_mark = 0;
    INSERT INTO ttt SELECT number FROM numbers(128);
    OPTIMIZE TABLE ttt;
    
  2. 查询

    EXPLAIN ESTIMATE SELECT * FROM ttt;
    
  3. 结果

    ┌─database─┬─table─┬─parts─┬─rows─┬─marks─┐
    │ default  │ ttt   │     1 │  128 │     8 │
    └──────────┴───────┴───────┴──────┴───────┘
    

三、老版本查看执行计划

  1. 执行计划

    clickhouse-client -h 主机名 --send_logs_level=trace <<< "sql" > /dev/null
    

    send_logs_level 设置日志等级,<<<将SQL语句重定向至clickhouse-client进行查询,> /dev/null将查询结果重定向到空设备吞掉,以便观察日志

    另外,日志级别除了trace可以,也可以为debug;此外,这种方式只能是在日志中查看,并且只能在SQL语句真正执行的时候,如果线上数据量比较大,建议添加 limit

猜你喜欢

转载自blog.csdn.net/SpringBoots/article/details/121104675