Common Hive Functions (Date, Rounding, String, and Collection Functions)

Common Date Functions

unix_timestamp: returns the Unix timestamp (in seconds) of the current time or of a specified time

select unix_timestamp();
+-------------+
|     _c0     |
+-------------+
| 1650871673  |
+-------------+
select unix_timestamp('2022-04-25','yyyy-MM-dd');
+-------------+
|     _c0     |
+-------------+
| 1650844800  |
+-------------+

from_unixtime: converts a Unix timestamp to a date-time string

select from_unixtime(1650844800);
+----------------------+
|         _c0          |
+----------------------+
| 2022-04-25 00:00:00  |
+----------------------+
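
from_unixtime also accepts an optional second argument with an output pattern. A minimal sketch follows; the exact result depends on the session time zone and the Hive version, so no output is shown here:

```sql
-- Format the timestamp with an explicit pattern instead of the default
-- 'yyyy-MM-dd HH:mm:ss'.
select from_unixtime(1650844800, 'yyyy-MM-dd');

-- Parse a date string to a timestamp, then format it with another pattern.
select from_unixtime(unix_timestamp('2022-04-25', 'yyyy-MM-dd'), 'yyyy/MM/dd');
```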

current_date: the current date

select current_date;
+-------------+
|     _c0     |
+-------------+
| 2022-04-25  |
+-------------+

current_timestamp: the current date and time

select current_timestamp;
+-------------------------+
|           _c0           |
+-------------------------+
| 2022-04-25 15:32:29.09  |
+-------------------------+

to_date: extracts the date part

select to_date('2022-04-25 15:32:29.09');
+-------------+
|     _c0     |
+-------------+
| 2022-04-25  |
+-------------+

year: extracts the year

select year('2022-04-25 15:32:29.09');
+-------+
|  _c0  |
+-------+
| 2022  |
+-------+

month: extracts the month

select month('2022-04-25 15:32:29.09');
+------+
| _c0  |
+------+
| 4    |
+------+

day: extracts the day

select day('2022-04-25 15:32:29.09');
+------+
| _c0  |
+------+
| 25   |
+------+

hour: extracts the hour

select hour('2022-04-25 15:32:29.09');
+------+
| _c0  |
+------+
| 15   |
+------+

minute: extracts the minute

select minute('2022-04-25 15:32:29.09');
+------+
| _c0  |
+------+
| 32   |
+------+

second: extracts the second

select second('2022-04-25 15:32:29.09');
+------+
| _c0  |
+------+
| 29   |
+------+

weekofyear: the week of the year that the given date falls in

select weekofyear('2022-04-25 15:32:29.09');
+------+
| _c0  |
+------+
| 17   |
+------+

dayofmonth: the day of the month that the given date falls on

select dayofmonth('2022-04-25 15:32:29.09');
+------+
| _c0  |
+------+
| 25   |
+------+

months_between: the number of months between two dates (first argument minus second)

select months_between('2022-01-25','2022-04-25');
+-------+
|  _c0  |
+-------+
| -3.0  |
+-------+

add_months: adds months to a date, or subtracts them when the number is negative

select add_months('2022-01-25',+3);
+-------------+
|     _c0     |
+-------------+
| 2022-04-25  |
+-------------+
select add_months('2022-01-25',-3);
+-------------+
|     _c0     |
+-------------+
| 2021-10-25  |
+-------------+

datediff: the number of days between two dates (first argument minus second)

select datediff('2022-01-25','2022-04-25');
+------+
| _c0  |
+------+
| -90  |
+------+
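
The negative results above come from the argument order: both datediff and months_between compute the first date minus the second. A short sketch with the arguments swapped:

```sql
-- Later date first: the results are positive.
select datediff('2022-04-25', '2022-01-25');        -- 90
select months_between('2022-04-25', '2022-01-25');  -- 3.0
```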

date_add: adds days to a date

select date_add('2022-04-25',-3);
+-------------+
|     _c0     |
+-------------+
| 2022-04-22  |
+-------------+

date_sub: subtracts days from a date

Subtracting a negative number is the same as adding a positive one.

select date_sub('2022-04-25',-3);
+-------------+
|     _c0     |
+-------------+
| 2022-04-28  |
+-------------+

last_day: the last day of the month that the date falls in

select last_day('2022-04-25');
+-------------+
|     _c0     |
+-------------+
| 2022-04-30  |
+-------------+

date_format: formats a date according to a pattern

select date_format('2022-04-25 15:20:17','yyyy/MM/dd HH:mm:ss');
+----------------------+
|         _c0          |
+----------------------+
| 2022/04/25 15:20:17  |
+----------------------+

Common Rounding Functions

round: rounds to the nearest integer (half up)

select round(3.14);
+------+
| _c0  |
+------+
| 3    |
+------+
select round(3.54);
+------+
| _c0  |
+------+
| 4    |
+------+
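
round also takes an optional second argument giving the number of decimal places to keep. A minimal sketch:

```sql
-- Keep two decimal places instead of rounding to an integer.
select round(3.14159, 2);  -- 3.14
```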

ceil: rounds up

select ceil(3.14);
+------+
| _c0  |
+------+
| 4    |
+------+
select ceil(3.54);
+------+
| _c0  |
+------+
| 4    |
+------+

floor: rounds down

select floor(3.14);
+------+
| _c0  |
+------+
| 3    |
+------+
select floor(4.14);
+------+
| _c0  |
+------+
| 4    |
+------+

Common String Functions

upper: converts to uppercase

select upper('Hive');
+-------+
|  _c0  |
+-------+
| HIVE  |
+-------+

lower: converts to lowercase

select lower('Hive');
+-------+
|  _c0  |
+-------+
| hive  |
+-------+

length: the length of a string

select length('Hive');
+------+
| _c0  |
+------+
| 4    |
+------+

trim: removes leading and trailing spaces

select trim(' Hive ');
+-------+
|  _c0  |
+-------+
| Hive  |
+-------+

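lpad: pads on the left to the specified length (a string that is already longer is truncated to that length, as in the example here)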

select lpad('Hive',1,'l');
+------+
| _c0  |
+------+
| H    |
+------+
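
The query above only shows the truncation case, because the target length (1) is shorter than the string. A short sketch of actual left padding:

```sql
-- Target length 6 is longer than 'Hive', so two pad characters are added on the left.
select lpad('Hive', 6, 'l');  -- llHive

-- A common use: zero-padding a number stored as a string.
select lpad('7', 3, '0');     -- 007
```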

rpad: pads on the right to the specified length

select rpad('Hive',5,'l');
+--------+
|  _c0   |
+--------+
| Hivel  |
+--------+

regexp_replace: replaces every substring that matches a regular expression

select regexp_replace('2020/04/25', '/', '-');
+-------------+
|     _c0     |
+-------------+
| 2020-04-25  |
+-------------+
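
The pattern above is a plain character. Since the second argument is a Java regular expression, character classes and quantifiers work too. A minimal sketch with made-up input strings:

```sql
-- A character class replaces either separator in one pass.
select regexp_replace('2020/04.25', '[/.]', '-');        -- 2020-04-25

-- A quantifier replaces each run of digits with a single '#'.
select regexp_replace('order-123-456', '[0-9]+', '#');   -- order-#-#
```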

Collection Functions

size: the number of elements in a collection
The argument must be a map or an array; for any other type Hive reports the error:
“map” or “list” is expected at function SIZE

select name,deductions from employees;
+-------------------+----------------------------------------------------+
|       name        |                     deductions                     |
+-------------------+----------------------------------------------------+
| John Doe          | {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1} |
| Mary Smith        | {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1} |
| Todd Jones        | {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} |
| Bill King         | {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} |
| Boss Man          | {"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05} |
| Fred Finance      | {"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05} |
| Stacy Accountant  | {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} |
+-------------------+----------------------------------------------------+
select size(deductions) from employees;
+------+
| _c0  |
+------+
| 3    |
| 3    |
| 3    |
| 3    |
| 3    |
| 3    |
| 3    |
+------+
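
size works the same way on an array column. A short sketch using the subordinates column of the same employees table:

```sql
-- Number of direct subordinates per employee; an empty array gives 0.
select name, size(subordinates) from employees;
```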

map_keys: returns the keys of a map

select map_keys(deductions) from employees;
+----------------------------------------------+
|                     _c0                      |
+----------------------------------------------+
| ["Federal Taxes","State Taxes","Insurance"]  |
| ["Federal Taxes","State Taxes","Insurance"]  |
| ["Federal Taxes","State Taxes","Insurance"]  |
| ["Federal Taxes","State Taxes","Insurance"]  |
| ["Federal Taxes","State Taxes","Insurance"]  |
| ["Federal Taxes","State Taxes","Insurance"]  |
| ["Federal Taxes","State Taxes","Insurance"]  |
+----------------------------------------------+

map_values: returns the values of a map

select map_values(deductions) from employees;
+------------------+
|       _c0        |
+------------------+
| [0.2,0.05,0.1]   |
| [0.2,0.05,0.1]   |
| [0.15,0.03,0.1]  |
| [0.15,0.03,0.1]  |
| [0.3,0.07,0.05]  |
| [0.3,0.07,0.05]  |
| [0.15,0.03,0.1]  |
+------------------+
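
When only one entry is needed, a map value can also be read directly by key with bracket syntax instead of going through map_keys or map_values. A minimal sketch on the same table:

```sql
-- Look up a single key; a key that is missing from the map yields NULL.
select name, deductions['Federal Taxes'] from employees;
```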

array_contains: checks whether an array contains a given element

select name,subordinates from employees;
+-------------------+------------------------------+
|       name        |         subordinates         |
+-------------------+------------------------------+
| John Doe          | ["Mary Smith","Todd Jones"]  |
| Mary Smith        | ["Bill King"]                |
| Todd Jones        | []                           |
| Bill King         | []                           |
| Boss Man          | ["John Doe","Fred Finance"]  |
| Fred Finance      | ["Stacy Accountant"]         |
| Stacy Accountant  | []                           |
+-------------------+------------------------------+
select name,array_contains(subordinates,'Bill King') from employees;
+-------------------+--------+
|       name        |  _c1   |
+-------------------+--------+
| John Doe          | false  |
| Mary Smith        | true   |
| Todd Jones        | false  |
| Bill King         | false  |
| Boss Man          | false  |
| Fred Finance      | false  |
| Stacy Accountant  | false  |
+-------------------+--------+
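
Because array_contains returns a boolean, it can be used directly as a filter. A short sketch on the same data:

```sql
-- Keep only the employees who manage 'Bill King' (Mary Smith in the data above).
select name from employees where array_contains(subordinates, 'Bill King');
```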

sort_array: sorts the elements of an array in ascending order

select name,sort_array(subordinates) from employees;
+-------------------+------------------------------+
|       name        |             _c1              |
+-------------------+------------------------------+
| John Doe          | ["Mary Smith","Todd Jones"]  |
| Mary Smith        | ["Bill King"]                |
| Todd Jones        | []                           |
| Bill King         | []                           |
| Boss Man          | ["Fred Finance","John Doe"]  |
| Fred Finance      | ["Stacy Accountant"]         |
| Stacy Accountant  | []                           |
+-------------------+------------------------------+

Multidimensional Analysis

grouping sets: aggregates over several grouping combinations in a single query

For this topic, I recommend reading other authors' more detailed articles.
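
As a starting point, here is a minimal sketch of the syntax. The sales table and its columns (region, product, amount) are hypothetical and are not part of the examples in this post:

```sql
-- Hypothetical table: sales(region string, product string, amount double).
-- One query produces three levels of aggregation:
--   (region, product): total per region and product
--   (region):          subtotal per region (product is NULL in these rows)
--   ():                grand total (both columns are NULL)
select region, product, sum(amount) as total
from sales
group by region, product
grouping sets ((region, product), (region), ());
```

The result matches what a UNION ALL of three separate GROUP BY queries would return, but it is written as a single query.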


Reposted from blog.csdn.net/weixin_46322367/article/details/124406389