7. Hive Series: Functions

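-- List the built-in functions
show functions;
-- Show a brief description of a built-in function
desc function upper;
-- Show the detailed description of a built-in function
desc function extended upper;
-- If an employee's comm is NULL, substitute -1
select comm, nvl(comm, -1) from emp;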
-- CASE WHEN THEN ELSE END: count male ('男') and female ('女') employees per department
select dept_id,
 sum(case sex when '男' then 1 else 0 end) male_count,
 sum(case sex when '女' then 1 else 0 end) female_count
from emp_sex group by dept_id;
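
The same conditional count can also be written with Hive's built-in if() function. A minimal sketch, assuming the same emp_sex table with dept_id and sex columns as in the query above:

```sql
-- if(condition, value_if_true, value_if_false) is a Hive built-in;
-- a NULL condition is treated as false, so NULL sex values count as 0.
select dept_id,
       sum(if(sex = '男', 1, 0)) as male_count,
       sum(if(sex = '女', 1, 0)) as female_count
from emp_sex
group by dept_id;
```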
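-- Rows to columns. CONCAT_WS(separator, str1, str2, ...) is a special form of CONCAT(): the first argument is the separator placed between the remaining arguments, and it can be a string just like them.
-- COLLECT_SET(col) aggregates the values of a group into a de-duplicated array, which CONCAT_WS then joins.
SELECT t1.c_b, CONCAT_WS("|", collect_set(t1.name))
FROM (SELECT name, CONCAT_WS(',', constellation, blood_type) c_b FROM person_info) t1 GROUP BY t1.c_b;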
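
To try the rows-to-columns query without creating a table, the input can be stubbed with a CTE. This is only a sketch: the three sample rows are invented for illustration, and it assumes Hive 0.13 or later (CTEs and SELECT without FROM).

```sql
-- Hypothetical sample rows standing in for the person_info table
WITH person_info AS (
  SELECT 'Ann' AS name, 'Aries' AS constellation, 'A' AS blood_type
  UNION ALL
  SELECT 'Bob' AS name, 'Aries' AS constellation, 'A' AS blood_type
  UNION ALL
  SELECT 'Cat' AS name, 'Leo' AS constellation, 'B' AS blood_type
)
SELECT t1.c_b,
       CONCAT_WS('|', collect_set(t1.name)) AS names
FROM (
  SELECT name, CONCAT_WS(',', constellation, blood_type) AS c_b
  FROM person_info
) t1
GROUP BY t1.c_b;
```

Each distinct c_b value produces one row: the 'Aries,A' group carries both Ann and Bob joined by '|', and 'Leo,B' carries only Cat. The order of names inside a group is not guaranteed. If duplicates should be kept, collect_list can be used in place of collect_set.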
-- Column to rows. EXPLODE(col) splits a complex Array or Map column into multiple rows.
-- LATERAL VIEW usage: LATERAL VIEW udtf(expression) tableAlias AS columnAlias
-- It is used together with UDTFs such as explode (often fed by split) to expand one column into multiple rows; the expanded rows can then be aggregated.
SELECT movie, category_name FROM
movie_info
LATERAL VIEW explode(split(category, ",")) movie_info_tmp AS category_name;
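
One thing to watch with LATERAL VIEW: when the UDTF produces no rows for an input row (for example, category is NULL), that input row disappears from the result. The OUTER keyword keeps it. A small sketch with invented sample rows, assuming Hive 0.13 or later:

```sql
-- Hypothetical sample rows standing in for the movie_info table
WITH movie_info AS (
  SELECT 'Movie A' AS movie, 'Drama,Action' AS category
  UNION ALL
  SELECT 'Movie B' AS movie, CAST(NULL AS STRING) AS category
)
SELECT movie, category_name
FROM movie_info
-- OUTER emits a row with NULL category_name when explode() yields nothing
LATERAL VIEW OUTER explode(split(category, ',')) movie_info_tmp AS category_name;
```

With OUTER, 'Movie A' yields one row per category and 'Movie B' is kept with a NULL category_name; without OUTER, 'Movie B' would be dropped.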

Custom functions

  • Hive ships with built-in functions such as max and min, but the set is limited; user-defined functions let you extend it.
  • When the built-in functions cannot meet your business processing needs, consider writing a user-defined function (UDF).
  • User-defined functions fall into three categories:
    • UDF (User-Defined Function)
      one row in, one row out
    • UDAF (User-Defined Aggregation Function)
      aggregate function: multiple rows in, one row out, similar to count/max/min
    • UDTF (User-Defined Table-Generating Function)
      one row in, multiple rows out, such as explode() used with lateral view
  • Implementation details are omitted here; look them up when you need to write one.
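
Whatever the implementation looks like, a compiled function is usually registered from HiveQL before it can be called. A minimal sketch of that step; the jar paths, the function name my_lower, and the class com.example.hive.udf.MyLower are hypothetical placeholders, not something from this article:

```sql
-- Make the jar containing the compiled class available to the session
-- (hypothetical path)
ADD JAR /path/to/my-udfs.jar;

-- Register the class under a function name for the current session only
-- (hypothetical function and class names)
CREATE TEMPORARY FUNCTION my_lower AS 'com.example.hive.udf.MyLower';

-- Call it like any built-in function
SELECT my_lower('HIVE');

-- Remove the session-level registration when done
DROP TEMPORARY FUNCTION IF EXISTS my_lower;

-- Alternatively (Hive 0.13+), register a permanent function stored in the metastore
-- (hypothetical database, function name and HDFS path)
CREATE FUNCTION mydb.my_lower AS 'com.example.hive.udf.MyLower'
  USING JAR 'hdfs:///path/to/my-udfs.jar';
```

A temporary function disappears when the session ends, which makes it convenient for trying things out; a permanent function is the better fit once several users or scheduled jobs need it.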

Origin blog.csdn.net/SJshenjian/article/details/131862032