Using Hive SQL (Part 1)

Copyright notice: reposting is welcome; please credit the source. https://blog.csdn.net/zxc995293774/article/details/81490898

hive sql

case when

CASE form 1 (simple CASE):
CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END
/* When a = b, returns c; when a = d, returns e; else returns f. */
CASE form 2 (searched CASE):
CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END
/* When a = true, returns b; when c = true, returns d; else returns e. */

Two examples:

select (case col1 when 0 then col2 else col1 end) as new_column from table_name;
select sum(case when col1 = 'val1' then col2 else 0 end) from table_name group by col3;
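
To make the two forms concrete, here is a minimal sketch that runs without an existing table. The inline rows and the names t, total_simple and zero_rows are made up for illustration, and the query assumes a Hive version that supports WITH and SELECT without FROM (0.13 or later).

```sql
-- Hypothetical inline rows standing in for table_name.
WITH t AS (
  SELECT 0 AS col1, 10 AS col2, 'a' AS col3
  UNION ALL
  SELECT 5 AS col1, 20 AS col2, 'a' AS col3
  UNION ALL
  SELECT 0 AS col1, 30 AS col2, 'b' AS col3
)
SELECT
  col3,
  -- Form 1: compare col1 against the literal 0.
  sum(CASE col1 WHEN 0 THEN col2 ELSE col1 END) AS total_simple,
  -- Form 2: evaluate a boolean condition per row.
  sum(CASE WHEN col1 = 0 THEN 1 ELSE 0 END) AS zero_rows
FROM t
GROUP BY col3;
-- Group 'a': total_simple = 10 + 5 = 15, zero_rows = 1
-- Group 'b': total_simple = 30,          zero_rows = 1
-- (row order is not guaranteed)
```

If the ELSE branch is left out and no WHEN matches, CASE returns NULL, which sum() then ignores.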

percentile, percentile_approx

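percentile(BIGINT col, p): returns the exact p-th percentile of column col. It is most often used to get the median (p = 0.5). The values in col must be integers, not floating-point numbers.
p must be between 0 and 1.
For example, with p = 0.1, roughly 10% of the values in col are smaller than the returned result.
percentile(BIGINT col, array(p1 [, p2]...)): pass an array to compute several percentiles at once.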

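percentile_approx(DOUBLE col, p [, B]): returns an approximate p-th percentile of column col. The optional B parameter trades memory for accuracy (larger values give better approximations).
percentile_approx(DOUBLE col, array(p1 [, p2]...) [, B]): pass an array to compute several percentiles at once.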

An example:
select percentile(summary, 0.1)
from (
  select imei, sum(download) as summary
  from (
    select imei, (case diffSize when 0 then apkSize else diffSize end) as download
    from table_name
    where date = 20180415
  ) as imeiselect
  group by imei
  order by summary desc
) as percselect;
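
A smaller sketch can make the return values easier to check by hand. The inline rows and the aliases t, v, median_exact, quartiles and median_approx are hypothetical, and the casts are only there to match the documented argument types.

```sql
-- Five hypothetical integer values: 1..5.
WITH t AS (
  SELECT 1 AS v
  UNION ALL SELECT 2 AS v
  UNION ALL SELECT 3 AS v
  UNION ALL SELECT 4 AS v
  UNION ALL SELECT 5 AS v
)
SELECT
  -- Exact median of an integer column: 3.0
  percentile(cast(v AS bigint), 0.5) AS median_exact,
  -- Several exact percentiles at once: [2.0, 3.0, 4.0]
  percentile(cast(v AS bigint), array(0.25, 0.5, 0.75)) AS quartiles,
  -- Approximate median; also accepts floating-point columns.
  percentile_approx(cast(v AS double), 0.5) AS median_approx
FROM t;
```

For a floating-point column, percentile_approx is the one to reach for, since percentile only accepts integer input.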

Subqueries

A subquery is a query nested inside another query; one query can contain several nested subqueries.

as

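as assigns an alias to a table, a column, or a subquery.
For example, in Hive SQL a subquery in the FROM clause must have an alias; without one the query fails with an error.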
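
The sketch below shows aliases on a column, on an inner subquery, and on an outer subquery. The inline rows and the names raw_rows, s and total are hypothetical.

```sql
-- Without "AS s" (or "AS raw_rows") Hive rejects the FROM subquery.
SELECT s.imei, s.total
FROM (
  SELECT imei, sum(download) AS total      -- column alias
  FROM (
    SELECT 'a' AS imei, 10 AS download
    UNION ALL
    SELECT 'a' AS imei, 5 AS download
    UNION ALL
    SELECT 'b' AS imei, 7 AS download
  ) AS raw_rows                            -- alias for the inner subquery
  GROUP BY imei
) AS s;                                    -- alias for the outer subquery
-- 'a' -> 15, 'b' -> 7 (row order is not guaranteed)
```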

explode

explode splits a JSON-style array into one row per element; reference 1 covers it in detail.

/* explode() takes in an array (or a map) as an input and outputs the elements of the array (map) as separate rows. */
select explode(col1) as content from tableName;

However, if other columns are added to the SELECT list alongside explode, the query fails with an error. The fix is to use LATERAL VIEW.
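
As a quick illustration, the following statements use literal arrays and maps, so no table is needed. This assumes a Hive version that allows SELECT without FROM; the id column in the commented-out query is hypothetical.

```sql
-- One output row per array element: 'a', 'b', 'c'.
SELECT explode(array('a', 'b', 'c')) AS content;

-- A map explodes into two columns, key and value.
SELECT explode(map('k1', 1, 'k2', 2)) AS (k, v);

-- Rejected by Hive: another column next to explode() in the SELECT list.
-- SELECT id, explode(col1) AS content FROM tableName;
```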

lateral view

lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias

SELECT qId, cId, vId FROM answer
LATERAL VIEW explode(vIds) visitor AS vId
WHERE cId = 2
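
To try the query above without the real answer table, here is a sketch that builds a stand-in with a CTE. The two rows and their values are invented for illustration.

```sql
-- Hypothetical stand-in for the answer table; vIds is an array column.
WITH answer AS (
  SELECT 1 AS qId, 2 AS cId, array(11, 12) AS vIds
  UNION ALL
  SELECT 2 AS qId, 3 AS cId, array(13) AS vIds
)
SELECT qId, cId, vId
FROM answer
LATERAL VIEW explode(vIds) visitor AS vId  -- joins each row to its exploded elements
WHERE cId = 2;
-- Returns (1, 2, 11) and (1, 2, 12); the row with cId = 3 is filtered out.
```

Rows whose array is empty or NULL produce no output with a plain LATERAL VIEW; LATERAL VIEW OUTER keeps them, with NULL in the exploded column.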

references

  1. CASE WHEN usage
  2. Table/column alias (AS) usage
  3. LATERAL VIEW examples
