Computing the maximum number of consecutive login days

I recently had a requirement: find the maximum number of consecutive days each user has logged in. For the login log table hive.traffic.access_user we only need two fields: uid and day. There is also a date auxiliary table, hive.ods.dim_date, which has a single field, day.
Here is the idea:

uid  day       rownumber  day - rownumber (days)
101  20190911  1          20190911 - 1 = 20190910
101  20190912  2          20190912 - 2 = 20190910
101  20190913  3          20190913 - 3 = 20190910
101  20190916  4          20190916 - 4 = 20190912
101  20190917  5          20190917 - 5 = 20190912

As can be seen, as long as the logins are consecutive, the difference day - rownumber stays the same. The catch is that this subtraction breaks when a run of logins crosses a month or year boundary: 20190930 and 20191001 are adjacent days, but as integers they differ by 71. So we first convert the dates into a sequence of consecutive numbers:
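To make the month-boundary problem concrete, here is a small Python sketch outside the database. The helper `build_daynum` and the chosen date range are assumptions for illustration only; the helper imitates what ROW_NUMBER() over the date table produces, one increasing number per calendar day.

```python
from datetime import date, timedelta

# Two logins on adjacent calendar days that straddle a month boundary.
d1, d2 = 20190930, 20191001

# Subtracting the raw yyyymmdd integers does not give 1.
raw_gap = d2 - d1


def build_daynum(start, end):
    """Map every yyyymmdd integer in [start, end] to 1, 2, 3, ...

    Hypothetical stand-in for numbering the rows of a date table by day.
    """
    daynum = {}
    current, n = start, 1
    while current <= end:
        daynum[int(current.strftime("%Y%m%d"))] = n
        current += timedelta(days=1)
        n += 1
    return daynum


daynum = build_daynum(date(2019, 9, 1), date(2019, 10, 31))

# With sequence numbers, adjacent days always differ by exactly 1.
seq_gap = daynum[d2] - daynum[d1]

print(raw_gap, seq_gap)  # prints: 71 1
```

Because the sequence number increases by exactly one per calendar day, subtracting the per-user row number from it gives a value that stays constant across a whole run of consecutive logins, even across month and year ends.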

select day,ROW_NUMBER() OVER(ORDER BY day) daynum from hive.ods.dim_date

Next, group the login log by uid and day (so each user has at most one row per day), order each user's rows by date, and compute the row number:

with a as (select uid, day from hive.traffic.access_user where day >= 20190801 and uid <> '')
select uid, day, ROW_NUMBER() OVER (PARTITION BY uid ORDER BY day) rownum from a group by uid, day

The last step is to compute the difference daynum - rownum. Rows of the same uid that share the same difference belong to the same run of consecutive login days. The complete SQL is as follows:

with a as (select uid, day from hive.traffic.access_user where day >= 20190801 and uid <> ''),
b as (select uid, day, ROW_NUMBER() OVER (PARTITION BY uid ORDER BY day) rownum from a group by uid, day),
c as (select day, ROW_NUMBER() OVER (ORDER BY day) daynum from hive.ods.dim_date),
d as (select uid, b.day, daynum, rownum, daynum - rownum days from b join c on b.day = c.day)
select uid, min(day) "streak start day", count(*) "consecutive login days" from d group by uid, days
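The same steps can be traced by hand in a few lines of Python. This is only a sketch under stated assumptions: the sample rows for uid 102, the duplicate row, and the helper names `day_numbers` and `streaks` are made up for illustration, and it is not the query itself. It deduplicates (uid, day), numbers each user's days, subtracts the row number from the calendar sequence number, and groups on the result.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical sample rows (uid, day): uid 101 follows the table in the
# article, uid 102 is invented to cross a month boundary.
logins = [
    (101, 20190911), (101, 20190912), (101, 20190913),
    (101, 20190912),  # duplicate login on the same day
    (101, 20190916), (101, 20190917),
    (102, 20190929), (102, 20190930), (102, 20191001), (102, 20191002),
]


def day_numbers(start, end):
    """Number every calendar day in [start, end] as 1, 2, 3, ..."""
    numbers = {}
    current, n = start, 1
    while current <= end:
        numbers[int(current.strftime("%Y%m%d"))] = n
        current += timedelta(days=1)
        n += 1
    return numbers


def streaks(rows, daynum):
    """Return {uid: [(start_day, length), ...]}, one tuple per streak."""
    days_by_uid = defaultdict(set)
    for uid, day in rows:
        days_by_uid[uid].add(day)  # the set removes duplicate (uid, day) rows
    result = {}
    for uid, days in days_by_uid.items():
        groups = defaultdict(list)
        for rownum, day in enumerate(sorted(days), start=1):
            # Same difference means same run of consecutive days.
            groups[daynum[day] - rownum].append(day)
        result[uid] = sorted((min(g), len(g)) for g in groups.values())
    return result


daynum = day_numbers(date(2019, 8, 1), date(2019, 12, 31))
all_streaks = streaks(logins, daynum)
# Longest streak per user.
longest = {uid: max(n for _, n in s) for uid, s in all_streaks.items()}

print(all_streaks)
print(longest)
```

Note that the query above returns one row per streak. To get the maximum number of consecutive days for each user, one more aggregation over that result (a max of the streak length grouped by uid) is needed, which is what `longest` does in the sketch.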

end


Origin blog.csdn.net/woloqun/article/details/101280577