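Given a sales table T with the sample data below, write SQL that returns each employee's year-to-date sales, i.e. the running total within each calendar year.
Original table T (columns: 员工姓名 = employee name, 月份 = month, 销售额 = sales amount):
员工姓名 月份 销售额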
emi 201801 10000
emi 201802 11000
emi 201803 9000
emi 201901 10000
tommy 201801 12500
tommy 201802 10500
tommy 201803 8900
tommy 201901 9000
Required query result (年累计销售额 = year-to-date sales):
员工姓名 月份 销售额 年累计销售额
emi 201801 10000 10000
emi 201802 11000 21000
emi 201803 9000 30000
emi 201901 10000 10000
tommy 201801 12500 12500
tommy 201802 10500 23000
tommy 201803 8900 31900
tommy 201901 9000 9000
Create the Hive table
create table t(t_name string,t_date string,t_sale int)
row format delimited fields terminated by '\t';
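Create a local data file with vim /root/temp/data.csv (the fields on each line must be separated by a tab character, to match the '\t' delimiter declared for the table)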
emi 201801 10000
emi 201802 11000
emi 201803 9000
emi 201901 10000
tommy 201801 12500
tommy 201802 10500
tommy 201803 8900
tommy 201901 9000
Load the data
load data local inpath '/root/temp/data.csv' into table t;
Query the table to confirm that the data was loaded.
select * from t;
Output:
+--------+--------+--------+
| t_name | t_date | t_sale |
+--------+--------+--------+
| emi | 201801 | 10000 |
| emi | 201802 | 11000 |
| emi | 201803 | 9000 |
| emi | 201901 | 10000 |
| tommy | 201801 | 12500 |
| tommy | 201802 | 10500 |
| tommy | 201803 | 8900 |
| tommy | 201901 | 9000 |
+--------+--------+--------+
Query for annual sales
Using sum() as a window function
select t_name `员工姓名`
,t_date `月份`
,t_sale `销售额`
,sum(t_sale) over (partition by t_name,substr(t_date,1,4)) as `年累计销售额`
from t;
Result:
+----------+--------+--------+--------------+
| 员工姓名 | 月份 | 销售额 | 年累计销售额 |
+----------+--------+--------+--------------+
| emi | 201801 | 10000 | 30000 |
| emi | 201802 | 11000 | 30000 |
| emi | 201803 | 9000 | 30000 |
| emi | 201901 | 10000 | 10000 |
| tommy | 201801 | 12500 | 31900 |
| tommy | 201802 | 10500 | 31900 |
| tommy | 201803 | 8900 | 31900 |
| tommy | 201901 | 9000 | 9000 |
+----------+--------+--------+--------------+
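Without order by, every row carries the full-year total for its employee. This form is useful for computing each month's share of the annual total.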
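As a minimal sketch of that use, the query below divides each month's sales by the partition-wide total. The alias sale_ratio and the rounding to four decimal places are assumptions made for illustration and are not part of the original exercise.

```sql
-- Sketch: each month's share of the employee's annual total.
-- sale_ratio is a hypothetical alias; rounding to 4 places is an arbitrary choice.
select t_name
      ,t_date
      ,t_sale
      ,round(t_sale / sum(t_sale) over (partition by t_name, substr(t_date,1,4)), 4) as sale_ratio
from t;
```

For emi in 201801 this evaluates 10000 / 30000, which rounds to 0.3333.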
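Adding order by
With order by inside the window, the sum covers the rows from the start of the partition up to the current row, which gives the cumulative total to date.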
select t_name `员工姓名`
,t_date `月份`
,t_sale `销售额`
,sum(t_sale) over (partition by t_name,substr(t_date,1,4) order by t_date) as `年累计销售额`
from t;
Result, which matches the required output:
+----------+--------+--------+--------------+
| 员工姓名 | 月份 | 销售额 | 年累计销售额 |
+----------+--------+--------+--------------+
| emi | 201801 | 10000 | 10000 |
| emi | 201802 | 11000 | 21000 |
| emi | 201803 | 9000 | 30000 |
| emi | 201901 | 10000 | 10000 |
| tommy | 201801 | 12500 | 12500 |
| tommy | 201802 | 10500 | 23000 |
| tommy | 201803 | 8900 | 31900 |
| tommy | 201901 | 9000 | 9000 |
+----------+--------+--------+--------------+
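The running total above relies on the default window frame. In Hive, when order by is given without an explicit frame, the frame defaults to range between unbounded preceding and current row. The sketch below spells the frame out with rows instead, which only behaves differently if an employee has more than one row for the same month. The alias ytd_sale is a hypothetical name chosen for this example.

```sql
-- Sketch: same running total with an explicit ROWS frame.
-- ytd_sale is a hypothetical alias. With one row per employee per month,
-- this returns the same values as the query without an explicit frame.
select t_name
      ,t_date
      ,t_sale
      ,sum(t_sale) over (
           partition by t_name, substr(t_date,1,4)
           order by t_date
           rows between unbounded preceding and current row
       ) as ytd_sale
from t;
```

With the default range frame, rows that tie on t_date are peers and all receive the same cumulative value. With rows, each tied row gets its own running value.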
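Common mistake: using group by
Writing
select t_name,t_date,t_sale,sum(t_sale) over (partition by t_name,substr(t_date,1,4)) from t group by t_name,substr(t_date,1,4);
fails immediately, because t_date and t_sale are selected but are neither aggregated nor listed in the group by clause.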
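For comparison, here is a sketch of what a valid group by query over the same table can return. It collapses the data to one row per employee and year, so it yields annual totals rather than a per-month running total. The aliases t_year and year_sale are hypothetical names used only for this example.

```sql
-- Sketch: a valid GROUP BY query, one row per employee and year.
-- t_year and year_sale are hypothetical aliases.
select t_name
      ,substr(t_date,1,4) as t_year
      ,sum(t_sale)        as year_sale
from t
group by t_name, substr(t_date,1,4);
```

On the sample data this gives 30000 for emi in 2018, 10000 for emi in 2019, 31900 for tommy in 2018 and 9000 for tommy in 2019. The monthly detail rows are gone, which is why the exercise calls for a window function.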
Answer SQL, written against the original table T with its Chinese column names:
select
员工姓名
,月份
,销售额
,sum(销售额) over (partition by 员工姓名,substr(月份,1,4) order by 月份) as 年累计销售额 from t;
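If window functions are not available, the same running total can be approximated with a self join. The sketch below is an alternative technique, not the approach used above. It is written against the Hive table t, assumes at most one row per employee per month, and uses the hypothetical alias ytd_sale.

```sql
-- Sketch: year-to-date total via a self join instead of a window function.
-- Assumes one row per employee per month; ytd_sale is a hypothetical alias.
select a.t_name
      ,a.t_date
      ,a.t_sale
      ,sum(b.t_sale) as ytd_sale
from t a
join t b
  on a.t_name = b.t_name
 and substr(a.t_date,1,4) = substr(b.t_date,1,4)
where b.t_date <= a.t_date   -- keep only months up to and including the current one
group by a.t_name, a.t_date, a.t_sale;
```

The join pairs each row with every row of the same employee and year, so the work grows quadratically with the number of months per partition. The window function version is usually the better choice.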