用hive实现判断股票价格的波峰 波谷

需求 对与同一类别的股票在连续时间的三个价格中 

如果中间的价格比旁边的高为波峰 比方便的低为波谷

比如下图11,10为波谷 13为波峰

        15
         
    13    
12        
  11      
      10  

创建表

CREATE TABLE IF NOT EXISTS lp.study_2(
  `id` int commit "主键",
  `code` bigint commit "编号",
  `time` string commit "时间",
  `price` double commit "价格")
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\u0001'
  LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';


一、先从最简单的来 比如只有一个类别(code 都是1) id是连续自增,时间是连续递增的

  insert into table lp.study_2
  select 1,1,'2017-12-14 12:03:06',12 from a1 limit 1
  union all
  select 2,1,'2017-12-14 12:03:07',11 from a1 limit 1
  union all
  select 3,1,'2017-12-14 12:03:08',13 from a1 limit 1
  union all
  select 4,1,'2017-12-14 12:03:09',10 from a1 limit 1
  union all
  select 5,1,'2017-12-14 12:03:10',15 from a1 limit 1;


 实现sql

 select a.id,a.price,
 case when b.price is null then "未知"
      when c.price is null then "未知"
      when a.price>b.price and a.price>c.price then "波峰"
      when a.price<b.price and a.price<c.price then "波谷"
 else "未知" end as mark
 from lp.study_2 a 
 left join 
 lp.study_2 b on a.code=b.code and b.id=a.id-1
 left join
 lp.study_2 c on a.code=c.code and c.id=a.id+1

结果

1 12.0  未知
2 11.0 波谷
3 13.0 波峰
4 10.0 波谷
5 15.0 未知


二、如果有若干个类别的话  为了方便看 所以数据比较整齐

  insert into table lp.study_2
  select 1,1,'2017-12-14 12:03:06',12 from a1 limit 1
  union all
  select 2,1,'2017-12-14 12:03:07',11 from a1 limit 1
  union all
  select 3,1,'2017-12-14 12:03:08',13 from a1 limit 1
  union all
  select 4,1,'2017-12-14 12:03:09',10 from a1 limit 1
  union all
  select 5,1,'2017-12-14 12:03:10',15 from a1 limit 1
  union all 
  select 6,2,'2017-12-14 12:03:06',12 from a1 limit 1
  union all
  select 7,2,'2017-12-14 12:03:07',14 from a1 limit 1
  union all
  select 8,2,'2017-12-14 12:03:08',13 from a1 limit 1
  union all
  select 9,2,'2017-12-14 12:03:09',10 from a1 limit 1
  union all
  select 10,2,'2017-12-14 12:03:10',15 from a1 limit 1;

类别1  11 10为波谷 13为波峰

        15
         
    13    
12        
  11      
      10  
类别2 14为波峰 10为波谷

        15
  14      
    13    
12        
         
      10  

第一步:首先生成中间表 利用函数 根据code分组根据time排序

  create table lp.study_3 as 
  select code,time,price,row_number() over(partition by code order by time) rn from lp.study_2; 


第二步:计算

  select a.code,a.time,a.price,
 case when b.price is null then "未知"
      when c.price is null then "未知"
      when a.price>b.price and a.price>c.price then "波峰"
      when a.price<b.price and a.price<c.price then "波谷"
 else "中间" end as mark,a.rn
 from lp.study_3 a 
 left join 
 lp.study_3 b on a.code=b.code and b.rn=a.rn-1
 left join
 lp.study_3 c on a.code=c.code and c.rn=a.rn+1 order by a.code,a.rn;

 结果




猜你喜欢

转载自blog.csdn.net/oracle8090/article/details/78802079