SQL window functions: the offset functions LEAD and LAG

Window functions

When a query needs both row-level detail and aggregated values, a plain aggregate function is awkward because GROUP BY collapses each group into a single row. A window function instead computes over a group of rows while keeping every row, so the same result row can show both the original column values and the aggregated result.

Common application scenario: ranking the students in a class by grade
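
As a minimal sketch of that scenario, the query below ranks students within each class and shows the class average beside every row. The student_scores table, its columns, and its rows are hypothetical and exist only for this illustration; they are not part of the original article.

```sql
-- Hypothetical table and data, for illustration only
drop table if exists student_scores;
create table student_scores(
    class_id int,
    student_name varchar(50),
    score int
);
insert into student_scores
values (1,'Ann',90),
       (1,'Bob',85),
       (1,'Cid',90),
       (2,'Dee',70),
       (2,'Eve',95);

-- Every student row is kept; the rank and the class average appear beside it
select class_id, student_name, score,
rank() over(partition by class_id order by score desc) as score_rank,
avg(score) over(partition by class_id) as class_avg
from student_scores;
```

In class 1, Ann and Cid tie at rank 1 and Bob gets rank 3, because rank() leaves a gap after ties. A GROUP BY query could return class_avg, but only as one row per class.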

Common window functions
[Image: table of common window functions]

The basic form of a window function

func_name(<parameter>)
OVER(
    [PARTITION BY <part_by_condition>]
    [ORDER BY <order_by_list> [ASC|DESC]]
    [ROWS BETWEEN <frame_start> AND <frame_end>]
)
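
To make the optional frame clause concrete, here is a small sketch that averages each day's temperature with the previous day's. It assumes the weather table created at the end of this article; the alias avg_2day is chosen only for this example.

```sql
-- Assumes the weather table created at the end of this article
select recordDate, temperature,
avg(temperature) over(
    order by recordDate
    -- frame: the previous row and the current row, in recordDate order
    rows between 1 preceding and current row
) as avg_2day
from weather;
```

With the four sample rows (10, 25, 20, 30), avg_2day works out to 10, 17.5, 22.5, and 25. The first row has no previous row, so its frame contains only itself.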

For an explanation of each clause, see my previous article: Basic usage of SQL window function and aggregation function


Offset functions: LEAD and LAG

The LEAD and LAG functions return the value of a field from a row after or before the current row. If there is no row at that offset, the result is NULL.

  • LEAD: reads from a following row (for example, the next day)
  • LAG: reads from a preceding row (for example, the previous day)
LAG(<expression>, offset, default_value)
OVER(
     [PARTITION BY expr]
     ORDER BY expr [ASC|DESC]
)

Parameter explanation

  • expression: the field whose value is read from the offset row
  • offset: the number of rows to move from the current row (1 if omitted)
  • default_value: the value returned when the offset falls outside the window (for example 0; NULL if omitted)
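
The effect of the optional parameters can be seen in a short sketch. It assumes the weather table created at the end of this article; the column aliases are made up for this example.

```sql
-- Assumes the weather table created at the end of this article
select recordDate, temperature,
-- offset and default omitted: 1 row back, NULL on the first row
lag(temperature) over(order by recordDate) as prev_temp,
-- 2 rows back, 0 when no such row exists
lag(temperature, 2, 0) over(order by recordDate) as prev2_or_0,
-- 1 row ahead, -1 on the last row
lead(temperature, 1, -1) over(order by recordDate) as next_or_minus1
from weather;
```

For the sample temperatures 10, 25, 20, 30, this gives prev_temp = NULL, 10, 25, 20; prev2_or_0 = 0, 0, 10, 25; and next_or_minus1 = 25, 20, 30, -1.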

Application 1: comparing temperatures on adjacent dates, using the weather table

  • Get the temperature of the previous day and the temperature of the next day
select *,
lead(temperature, 1) over(order by recordDate) as lead_temp,
lag(temperature, 1) over(order by recordDate) as lag_temp
from weather


  • Get the dates whose temperature is higher than the previous day's
with a as (       
select *,
lead(temperature, 1) over(order by recordDate) as lead_temp,
lag(temperature, 1) over(order by recordDate) as lag_temp
from weather
)
select * from a 
where lag_temp < temperature
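
Note that lag() returns the previous row, which is the previous day only when no dates are missing. If the table could have gaps, one possible variation (an assumption of this sketch, not part of the original solution) is to also fetch the previous row's date and require it to be exactly one day earlier. The alias lag_date is introduced here for illustration.

```sql
-- Assumes the weather table created at the end of this article
with a as (
select *,
lag(temperature, 1) over(order by recordDate) as lag_temp,
lag(recordDate, 1) over(order by recordDate) as lag_date
from weather
)
select * from a
where lag_temp < temperature
-- keep the row only if the previous row really is the previous calendar day
and datediff(recordDate, lag_date) = 1;
```

The sample data has no gaps, so this returns the same two dates as the query above: 2015-01-02 and 2015-01-04.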



Application 2: find users who have logged in for 5 consecutive days (LeetCode 1454. Active Users)

User login table

Solution:

  1. A user may log in several times on the same day, but we need only one login date per day, so group by user_id, date(login_time) to deduplicate
  2. Use the lead() over() window function to get the login date four rows later
  3. Use datediff to check whether that login date is exactly 4 days after the current date, which means 5 consecutive days

Use lead to find the difference in days:

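select user_id, login_date,
lead(login_date, 4) over(partition by user_id order by login_date) as fifth_login_date,
datediff(lead(login_date, 4) over(partition by user_id order by login_date), login_date) as day_diff
from (
    -- one row per user per day; selecting the raw login_time alongside
    -- this GROUP BY is rejected when ONLY_FULL_GROUP_BY is enabled
    select user_id, date(login_time) as login_date
    from user_login
    group by user_id, date(login_time)
) as d;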

Then, from the result above, keep the users who have logged in for 5 consecutive days, that is, the rows where the day difference equals 4

-- Complete code
with d as (
    -- one row per user per calendar day
    select user_id, date(login_time) as login_date
    from user_login
    group by user_id, date(login_time)
),
a as (
    select user_id, login_date,
    datediff(lead(login_date, 4) over(partition by user_id order by login_date), login_date) as date_diff
    from d
)
select distinct user_id from a where date_diff = 4;
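
For comparison, the same question can also be answered without LEAD. The sketch below uses a different, commonly seen technique (often called "gaps and islands"), which is not the approach of the original article: within a run of consecutive days, the date minus its row number is constant, so that value can serve as a group key. The names d, g, and grp are chosen only for this example.

```sql
-- Alternative sketch; assumes the user_login table from this article
with d as (
    -- one row per user per calendar day
    select user_id, date(login_time) as login_date
    from user_login
    group by user_id, date(login_time)
),
g as (
    select user_id, login_date,
    -- consecutive dates share the same (date - row number) value
    date_sub(login_date, interval row_number() over(partition by user_id order by login_date) day) as grp
    from d
)
select distinct user_id
from g
group by user_id, grp
having count(*) >= 5;
```

With the sample data, user 2 has runs of 5 and 7 consecutive days and is returned, while user 1's longest run is 3 days. Changing the 5 in the HAVING clause adapts the query to any run length.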



The two sample tables used in the examples above are created as follows:

-- weather table
drop table if exists weather;
create table weather(
    id int,
    recordDate date,
    temperature int
);
insert into weather
values (1,'2015-01-01',10),
       (2,'2015-01-02',25),
       (3,'2015-01-03',20),
       (4,'2015-01-04',30);

-- user login table
drop table if exists user_login;
create table user_login
( 
user_id varchar(100), 
login_time datetime
); 

insert into user_login values 
(1,'2020-11-25 13:21:12'), 
(1,'2020-11-24 13:15:22'), 
(1,'2020-11-24 10:30:15'), 
(1,'2020-11-24 09:18:27'), 
(1,'2020-11-23 07:43:54'), 
(1,'2020-11-10 09:48:36'), 
(1,'2020-11-09 03:30:22'), 
(1,'2020-11-01 15:28:29'), 
(1,'2020-10-31 09:37:45'), 
(2,'2020-11-25 13:54:40'), 
(2,'2020-11-24 13:22:32'), 
(2,'2020-11-23 10:55:52'), 
(2,'2020-11-22 08:56:33'),
(2,'2020-11-22 06:30:09'), 
(2,'2020-11-21 08:33:15'), 
(2,'2020-11-20 05:38:18'), 
(2,'2020-11-19 09:21:42'), 
(2,'2020-11-02 00:19:38'), 
(2,'2020-11-01 09:03:11'), 
(2,'2020-10-31 07:44:55'), 
(2,'2020-10-30 08:56:33'), 
(2,'2020-10-29 09:30:28'); 

Reference source: Window function SQL practice questions in MySQL 8: active users who have logged in for 5 consecutive days

Origin blog.csdn.net/weixin_46599926/article/details/128276307