Hive Study Notes (15) - Hive Optimization


1 Hive Optimization

1.1 hive.fetch.task.conversion

<property>
  <name>hive.fetch.task.conversion</name>
  <value>more</value>
  <description>
    Expects one of [none, minimal, more].
    Some select queries can be converted to single FETCH task minimizing latency.
    Currently the query should be single sourced not having any subquery and should not have
    any aggregations or distincts (which incurs RS), lateral views and joins.
    0. none : disable hive.fetch.task.conversion
    1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only
    2. more    : SELECT, FILTER, LIMIT only (support TABLESAMPLE and virtual columns)
  </description>
</property>
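With the value more, simple single-table queries are answered by one FETCH task instead of launching a MapReduce job. A minimal sketch; the table emp and its columns are hypothetical:

set hive.fetch.task.conversion=more;

-- with 'more', a plain SELECT / FILTER / LIMIT query like this one
-- is served by a single FETCH task and launches no MapReduce job
select empno, ename from emp limit 10;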

1.2 Splitting Large Tables (Sub-tables)

create [temporary] [external] table [if not exists] [db_name.]table_name
[as select_statement];
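A sub-table carved out of a wide source table with CTAS lets frequent queries scan only the columns they actually need. A sketch; db_web.bf_log and the column names are hypothetical:

-- keep only the three columns the reporting queries touch
create table if not exists db_web.bf_log_comm
as select id, url, guid from db_web.bf_log;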

1.3 External Tables and Partitioned Tables

  • Use the two in combination
  • Multi-level partitioning (sketch below)
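A sketch of an external table with two-level month/day partitions; the table name, columns, and HDFS locations are hypothetical:

create external table if not exists tracklogs (
    id string,
    url string
)
partitioned by (month string, day string)      -- multi-level partitions
row format delimited fields terminated by '\t'
location '/user/hive/warehouse/tracklogs';     -- data stays under our own path

alter table tracklogs add partition (month='201901', day='22')
location '/user/hive/warehouse/tracklogs/201901/22';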

1.4 Data

  • Storage formats (textfile, orcfile, parquet)
  • Data compression (snappy), for example:
set parquet.compression=SNAPPY;
create table page_views_par_snappy
row format delimited fields terminated by '\t'
stored as parquet
as select * from page_views;
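For orcfile, the equivalent knob is the orc.compress table property; a sketch reusing the same hypothetical source table:

create table page_views_orc_snappy
stored as orc tblproperties ("orc.compress"="SNAPPY")
as select * from page_views;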

1.5 SQL Optimization

  • Optimize the SQL statements themselves

1.5.1 join

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+JoinOptimization

  • Common / Shuffle / Reduce join
* happens in the reduce task
* suited to large-table-to-large-table joins
* every table's data is read from files

  • map join (sketch after this list)
* happens in the map task
* suited to small-table-to-large-table joins
* the large table's data is read from files
* the small table's data is held in memory
* the small table is shipped to each node via DistributedCache

  • SMB join (sort-merge-bucket), sketch below
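Hive converts a common join into a map join automatically when the small side fits under a size threshold; the two settings below are standard Hive properties, while the tables emp and dept are hypothetical:

set hive.auto.convert.join=true;                 -- let the optimizer choose map joins
set hive.mapjoin.smalltable.filesize=25000000;   -- "small table" threshold in bytes

-- dept is loaded into memory on every mapper; emp is streamed from files
select e.empno, e.ename, d.dname
from emp e join dept d on e.deptno = d.deptno;

An SMB join additionally requires both sides to be bucketed and sorted on the join key with the same bucket count. A sketch of the bucketed DDL (hypothetical table) and the standard switches:

create table emp_bucket (empno int, ename string, deptno int)
clustered by (deptno) sorted by (deptno) into 4 buckets;

set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;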

1.6 HQL Execution Plan: explain

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain
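Prefixing a query with explain prints its plan without running it; explain extended adds file paths and serde detail. For example, against the hypothetical emp table from above:

explain select deptno, count(*) from emp group by deptno;
explain extended select * from emp where deptno = 10;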

1.7 Parallel Execution

  • hive.exec.parallel.thread.number
  • hive.exec.parallel
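Both are session-level switches; stages of one query that do not depend on each other (e.g. the two sides of a UNION ALL) can then run at the same time:

set hive.exec.parallel=true;              -- default false
set hive.exec.parallel.thread.number=8;   -- how many stages may run at once (default 8)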

1.8 JVM Reuse

  • mapreduce.job.jvm.numtasks
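Reusing one JVM for several tasks saves startup overhead on jobs with many short tasks, at the cost of holding the slot until the job ends:

set mapreduce.job.jvm.numtasks=10;   -- default 1; -1 removes the limit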

1.9 Number of Reducers

  • mapreduce.job.reduces
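Hive normally estimates the reducer count from the input size (hive.exec.reducers.bytes.per.reducer); it can also be pinned for a session:

set mapreduce.job.reduces=5;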

1.10 Speculative Execution

  • mapreduce.map.speculative
  • hive.mapred.reduce.tasks.speculative.execution
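Speculation launches backup attempts for straggling tasks; it helps on flaky nodes but wastes resources on long-running reducers:

set mapreduce.map.speculative=true;
set hive.mapred.reduce.tasks.speculative.execution=true;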

1.11 Number of Mappers

  • hive.merge.size.per.task
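Map count follows the number of input splits, so the practical lever is keeping small output files merged; hive.merge.size.per.task sets the target size of merged files:

set hive.merge.size.per.task=256000000;   -- target merged file size in bytes (default 256MB)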

1.12 Dynamic Partitioning
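Dynamic partitioning lets an INSERT derive partition values from the query output instead of hard-coding them. The two switches are standard Hive properties; the tables are hypothetical:

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;   -- allow every partition column to be dynamic

insert overwrite table page_views_part partition (dt)
select url, guid, dt from page_views;             -- dt values come from the data itself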

1.13 Strict Mode

  • When a partitioned table is queried without a partition filter in the WHERE clause, the job is refused (the default is nonstrict): hive.mapred.mode=strict
  • Strict mode blocks three kinds of queries:
* scans of a partitioned table with no partition filter in the WHERE clause
* ORDER BY without a LIMIT clause
* Cartesian-product joins (joins without an ON condition)
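Turning it on, and a query it would reject, assuming the hypothetical partitioned table page_views_part(dt) from above:

set hive.mapred.mode=strict;

-- rejected: no partition filter on dt
select * from page_views_part;
-- accepted
select * from page_views_part where dt = '2019-01-22';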
