MySQL 8.0 window functions

1. Basic concepts

MySQL has supported window functions since version 8.0. Most commercial databases and some open-source databases have long offered this feature, sometimes under the name analytic functions.

Concept:

A window can be understood as a set of records, and a window function is a special function evaluated over the set of records that satisfies certain conditions. The function is evaluated once for every record, each time over that record's window. If the window is the same fixed set for every record, it is a static window; if different records have different windows, the changing window is called a sliding window.

Window functions versus aggregate functions:
  1. An aggregate function collapses multiple records into one;
  2. A window function is evaluated for every record, so the result has as many rows as the input;
  3. Aggregate functions can also be used as window functions.
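
The difference between points 1 and 2 can be sketched as follows. This is only an illustration under stated assumptions: it runs against SQLite (3.25 or later) through Python's built-in sqlite3 module instead of MySQL 8.0, and the `scores` table and its rows are hypothetical sample data, not the table used in this article.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL 8.0 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, class TEXT, popularity INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [("A", "c1", 90), ("B", "c1", 80), ("C", "c2", 70), ("D", "c2", 70)],
)

# Aggregate function: GROUP BY collapses each class into a single row.
grouped = conn.execute(
    "SELECT class, SUM(popularity) FROM scores GROUP BY class ORDER BY class"
).fetchall()

# Window function: the same SUM, but every input row is kept.
windowed = conn.execute(
    """
    SELECT name, class, SUM(popularity) OVER (PARTITION BY class)
    FROM scores
    ORDER BY name
    """
).fetchall()

print(grouped)   # one row per class
print(windowed)  # one row per input row, each carrying its class total
```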

2. Basic format

Basic syntax: <window function> over (clause)
  • <window function> can be a dedicated window function (rank(), percent_rank(), dense_rank(), etc.) or an aggregate function (sum(), avg(), max(), etc.).

  • Window functions operate on the result of the WHERE and GROUP BY clauses, so in principle they are written only in the SELECT clause (MySQL also permits them in ORDER BY).

  • over specifies the window over which the function is evaluated. If the clause is empty, the window contains all rows that satisfy the WHERE condition, and the window function is computed over all of them.

  • If the clause is not empty, the following four elements can be used to define the window:

    window_name: gives the window an alias. When a query refers to the same window several times, an alias makes the SQL easier to read.

    SELECT
    	`姓名`,
    	`班级`,
    	`人气`,
    	rank() over w1 AS rak
    FROM
    	`民工漫班级` window w1 AS ( PARTITION BY `班级` ORDER BY `人气` DESC );

    PARTITION BY clause: the columns by which the window is grouped; the window function is evaluated separately within each group.

    ORDER BY clause: the columns by which the rows of the window are sorted; the window function numbers the records in that sorted order.

    Frame clause: a frame is a subset of the current partition, and this clause defines the rule for choosing that subset. It is usually used to build a sliding window. (Not covered in this article.)

    The examples below query the newly created `民工漫班级` table, using its columns `姓名` (name), `班级` (class), `人气` (popularity), and `学号` (student number):

SELECT
	*,
	RANK() over ( PARTITION BY `班级` ORDER BY `人气` DESC ) AS ranking 
FROM
	`民工漫班级`;
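# PARTITION BY `班级`: group the rows by class (GROUP BY changes the number of rows, keeping one row per group; PARTITION BY does not reduce the row count)
# ORDER BY `人气` DESC: within each class, rank the rows by popularity in descending order; the rank is returned as the column ranking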
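
A minimal sketch of the same PARTITION BY plus ORDER BY pattern, runnable without a MySQL server. The assumptions: SQLite (3.25 or later) via Python's sqlite3 module stands in for MySQL, and the `scores` table with its five rows is hypothetical data made up for the illustration.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL 8.0 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, class TEXT, popularity INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [("A", "c1", 90), ("B", "c1", 80),
     ("C", "c2", 70), ("D", "c2", 60), ("E", "c2", 50)],
)

# Rank rows by popularity inside each class; no rows are dropped.
ranked = conn.execute(
    """
    SELECT name, class,
           RANK() OVER (PARTITION BY class ORDER BY popularity DESC) AS ranking
    FROM scores
    ORDER BY class, ranking
    """
).fetchall()

for row in ranked:
    print(row)  # the ranking restarts at 1 for each class
```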

3. MySQL window functions

Classification by purpose:

According to their purpose, the window functions supported by MySQL can be divided into the following categories:

  1. Sequence number functions: ROW_NUMBER(), RANK(), DENSE_RANK()
  2. Distribution functions: PERCENT_RANK(), CUME_DIST()
  3. Before-and-after (offset) functions: LAG(expr, n), LEAD(expr, n)
  4. Head and tail functions: FIRST_VALUE(expr), LAST_VALUE(expr)
  5. Other functions: NTH_VALUE(expr, n), NTILE(n)

Each category in turn:
Sequence number functions: ROW_NUMBER(), RANK(), DENSE_RANK()

For example, using the same `民工漫班级` table as above, rank the rows by popularity with all three functions:

SELECT
	*,
	RANK() over (  ORDER BY `人气` DESC ) AS `rank` ,
	DENSE_RANK() over (  ORDER BY `人气` DESC ) AS `dense_rank` ,
	ROW_NUMBER() over (  ORDER BY `人气` DESC ) AS `row_number` 
FROM
	`民工漫班级`;

There is no PARTITION BY this time, so the rows are not grouped by class and the whole table is ranked as a single window.

The three functions differ in how they treat ties:

  • RANK(): tied rows share a rank, and the ranks after a tie are skipped: 1, 1, 3
  • DENSE_RANK(): tied rows share a rank, and no ranks are skipped: 1, 1, 2
  • ROW_NUMBER(): consecutive numbering regardless of ties: 1, 2, 3; equivalent to a row number.
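
The tie behaviour described above can be reproduced with a small sketch. It assumes SQLite (3.25 or later) through Python's sqlite3 module in place of MySQL, and a hypothetical `scores` table in which two rows share the top popularity value.

```python
import sqlite3

# Hypothetical sample data with a tie at the top; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, popularity INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 95), ("B", 95), ("C", 90), ("D", 88)],
)

rows = conn.execute(
    """
    SELECT popularity,
           RANK()       OVER w AS rnk,
           DENSE_RANK() OVER w AS dense_rnk,
           ROW_NUMBER() OVER w AS row_num
    FROM scores
    WINDOW w AS (ORDER BY popularity DESC)
    ORDER BY row_num
    """
).fetchall()

for popularity, rnk, dense_rnk, row_num in rows:
    print(popularity, rnk, dense_rnk, row_num)
# rnk skips 2 after the tie, dense_rnk does not, row_num never repeats.
```

Which of the two tied rows receives row number 1 is not determined by the ORDER BY, so the sketch selects only the popularity value and not the name.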

Distribution functions: PERCENT_RANK(), CUME_DIST()

PERCENT_RANK()

Purpose: builds on the RANK() function above. Each row is computed as (rank - 1) / (rows - 1), where rank is the value produced by RANK() and rows is the total number of rows in the current window. This function can be used to compute quantiles.

Another example (I could not think of a real-life use case for this one):

SELECT
	*,
	RANK() OVER w AS rankNo,
	PERCENT_RANK() OVER w AS percent_rankNo 
FROM
	`民工漫班级` WINDOW w AS ( ORDER BY `人气` DESC );

For Luffy, who ranks 3rd among 10 rows, percent_rankNo = (rank - 1) / (rows - 1) = (3 - 1) / (10 - 1) ≈ 0.2222.
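
The formula can be checked against the built-in function with a short sketch. The assumptions: SQLite (3.25 or later) via Python's sqlite3 module instead of MySQL, and a hypothetical four-row `scores` table, so the numbers differ from the 10-row example above.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL 8.0 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, popularity INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 95), ("B", 95), ("C", 90), ("D", 88)],
)

rows = conn.execute(
    """
    SELECT name,
           RANK() OVER w         AS rnk,
           COUNT(*) OVER ()      AS total,
           PERCENT_RANK() OVER w AS pct
    FROM scores
    WINDOW w AS (ORDER BY popularity DESC)
    ORDER BY rnk, name
    """
).fetchall()

for name, rnk, total, pct in rows:
    manual = (rnk - 1) / (total - 1)  # the formula from the text
    print(name, rnk, round(pct, 4), round(manual, 4))
```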


CUME_DIST()

Purpose: the number of rows that come before or tie with the current row in the window's sort order, divided by the total number of rows in the partition

Example: for each row, the fraction of rows whose popularity is greater than or equal to its own (the window is sorted in descending order), in other words which top percentage someone falls into; this is very common in everyday life

SELECT
	*,
	CUME_DIST() OVER w AS cdt 
FROM
	`民工漫班级` WINDOW w AS ( ORDER BY `人气` DESC );
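
As a rough check of that definition, the sketch below recomputes the ratio by hand and prints it next to CUME_DIST(). It assumes SQLite (3.25 or later) through Python's sqlite3 module rather than MySQL, and a hypothetical `scores` table with one tie.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL 8.0 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, popularity INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 95), ("B", 95), ("C", 90), ("D", 88)],
)

rows = conn.execute(
    """
    SELECT name, popularity,
           CUME_DIST() OVER (ORDER BY popularity DESC) AS cdt
    FROM scores
    ORDER BY popularity DESC, name
    """
).fetchall()

all_pops = [pop for _, pop, _ in rows]
for name, pop, cdt in rows:
    # Rows at or before the current one in descending order: popularity >= pop.
    manual = sum(1 for p in all_pops if p >= pop) / len(all_pops)
    print(name, pop, cdt, manual)
```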

Before-and-after (offset) functions: LAG(expr, n), LEAD(expr, n)

Purpose: return the value of expr from the row n rows before (LAG(expr, n)) or n rows after (LEAD(expr, n)) the current row, counting from the current row

Example: this one is quite common in everyday life

SELECT
	*,
	`我的前面一名人气` - `人气` AS `我和前面一名的差距`, # gap to the row one place ahead
	`人气` - `后面一名人气` AS `我甩开后面一名多少差距` # lead over the row one place behind
FROM
	(
	SELECT
		*,
		LAG( `人气`, 1 ) OVER w AS `我的前面一名人气`, # popularity of the row one place ahead
		LEAD( `人气`, 1 ) OVER w AS `后面一名人气` # popularity of the row one place behind
		
	FROM
		`民工漫班级` WINDOW w AS ( ORDER BY `人气` DESC ) 
	) t;

In the result, the row before Aizen is Luffy, whose popularity is 90, a gap of 2; the row after Aizen is Sasuke, and the last column shows how far Aizen is ahead of him.
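
The gap calculation can be sketched as below. The assumptions: SQLite (3.25 or later) via Python's sqlite3 module instead of MySQL, a hypothetical `scores` table with made-up values, and the subtraction written directly in the select list instead of through a subquery.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL 8.0 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, popularity INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 95), ("B", 90), ("C", 88), ("D", 80)],
)

rows = conn.execute(
    """
    SELECT name, popularity,
           (LAG(popularity, 1) OVER w) - popularity  AS gap_to_prev,
           popularity - (LEAD(popularity, 1) OVER w) AS lead_over_next
    FROM scores
    WINDOW w AS (ORDER BY popularity DESC)
    ORDER BY popularity DESC
    """
).fetchall()

for row in rows:
    print(row)
# The first row has no previous row and the last row has no next row,
# so LAG / LEAD return NULL there and the difference is None in Python.
```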

Head and tail functions: FIRST_VALUE(expr), LAST_VALUE(expr)

Purpose: return the value of expr from the first row (FIRST_VALUE(expr)) or the last row (LAST_VALUE(expr)) of the window frame

Example: for each row, going down the popularity ranking as far as that row, what are the first and last popularity values? (With ORDER BY and no frame clause, the default frame ends at the current row, so in descending order the last value is always the row's own.)

SELECT
	*,
	FIRST_VALUE( `人气` ) OVER w AS `当前第一人气值`,
	LAST_VALUE( `人气` ) OVER w AS `当前倒数第一人气值` 
FROM
	`民工漫班级` WINDOW w AS ( ORDER BY `人气` DESC );
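
The effect of the default frame on LAST_VALUE() can be seen by comparing it with an explicit whole-partition frame. This is a sketch under assumptions: SQLite (3.25 or later) through Python's sqlite3 module stands in for MySQL, and the `scores` table is hypothetical. The frame clause itself is outside the scope of this article; it appears here only to show the contrast.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL 8.0 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, popularity INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 95), ("B", 90), ("C", 88), ("D", 80)],
)

rows = conn.execute(
    """
    SELECT name,
           FIRST_VALUE(popularity) OVER w AS first_val,
           LAST_VALUE(popularity)  OVER w AS last_default,
           LAST_VALUE(popularity)  OVER (
               ORDER BY popularity DESC
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS last_full
    FROM scores
    WINDOW w AS (ORDER BY popularity DESC)
    ORDER BY popularity DESC
    """
).fetchall()

for row in rows:
    print(row)
# last_default follows the current row; last_full is the overall lowest value.
```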

Other functions: NTH_VALUE(expr, n), NTILE(n)

NTH_VALUE(expr, n)

Purpose: return the value of expr from the n-th row of the window frame; expr can be an expression or a column name.

Example: for each row, going down the popularity ranking as far as that row, show the second, fourth, and sixth popularity values (the result is NULL until the frame contains that many rows)

SELECT
	*,
	NTH_VALUE( `人气`, 2 ) OVER w AS `第二人气值`,
	NTH_VALUE( `人气`, 4 ) OVER w AS `第四人气值`,
	NTH_VALUE( `人气`, 6 ) OVER w AS `第六人气值` 
FROM
	`民工漫班级` WINDOW w AS ( ORDER BY `人气` DESC );
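
A small sketch of how NTH_VALUE() stays NULL until the frame has grown to n rows. It assumes SQLite (3.25 or later) via Python's sqlite3 module in place of MySQL, and a hypothetical four-row `scores` table.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL 8.0 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, popularity INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 95), ("B", 90), ("C", 88), ("D", 80)],
)

rows = conn.execute(
    """
    SELECT name,
           NTH_VALUE(popularity, 2) OVER w AS second_val,
           NTH_VALUE(popularity, 3) OVER w AS third_val
    FROM scores
    WINDOW w AS (ORDER BY popularity DESC)
    ORDER BY popularity DESC
    """
).fetchall()

for row in rows:
    print(row)
# second_val is None on the first row; third_val is None on the first two.
```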

NTILE(n)

Purpose: divide the ordered rows of the partition into n buckets and return the bucket number of each row

Example:

SELECT
	*,
	ROW_NUMBER() OVER w AS `row_number`,
	NTILE( 2 ) OVER w AS `ntile2`,
	NTILE( 4 ) OVER w AS `ntile4` 
FROM
	`民工漫班级` WINDOW w AS ( ORDER BY `人气` );
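
When the row count is not divisible by n, the buckets cannot all be the same size. The sketch below shows how five rows are split, assuming SQLite (3.25 or later) through Python's sqlite3 module instead of MySQL and a hypothetical `scores` table.

```python
import sqlite3

# Hypothetical sample data (5 rows); SQLite stands in for MySQL 8.0 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, popularity INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 70), ("B", 80), ("C", 88), ("D", 90), ("E", 95)],
)

rows = conn.execute(
    """
    SELECT name,
           ROW_NUMBER() OVER w AS row_num,
           NTILE(2) OVER w     AS ntile2,
           NTILE(4) OVER w     AS ntile4
    FROM scores
    WINDOW w AS (ORDER BY popularity)
    ORDER BY row_num
    """
).fetchall()

for row in rows:
    print(row)
# The leftover rows go to the earliest buckets, so those are one row larger.
```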

4. Using aggregate functions as window functions

Purpose: apply an aggregate function (SUM(), AVG(), MAX(), MIN(), COUNT()) to the window of each record, so that the aggregate value is computed dynamically for every row within the specified window

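Example: ordinary use as plain aggregates, which collapses the table into a single row:

SELECT
	# no `*` here: mixing non-aggregated columns with aggregates is rejected under ONLY_FULL_GROUP_BY, the MySQL 8.0 default
	sum( `人气` )  AS current_sum,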
	avg( `人气` )  AS current_avg,
	count( `人气` )  AS current_count,
	max( `人气` )  AS current_max,
	min( `人气` ) AS current_min 
FROM
	`民工漫班级`;

Ordered by student number (`学号`), as window functions:

SELECT
	*,
	sum( `人气` ) over w AS current_sum,
	avg( `人气` ) over w AS current_avg,
	count( `人气` ) over w AS current_count,
	max( `人气` ) over w AS current_max,
	min( `人气` ) over w AS current_min 
FROM
	`民工漫班级` WINDOW w AS ( ORDER BY `学号` );

Taking current_sum as an example, its value in each row is the sum of `人气` over all rows up to and including the current row, in `学号` order.
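
The running behaviour can be sketched on a tiny table. The assumptions: SQLite (3.25 or later) via Python's sqlite3 module stands in for MySQL, and the `scores` table with its `student_no` column is hypothetical sample data.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL 8.0 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student_no INTEGER, popularity INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [(1, 90), (2, 80), (3, 70)],
)

rows = conn.execute(
    """
    SELECT student_no,
           SUM(popularity)   OVER w AS current_sum,
           AVG(popularity)   OVER w AS current_avg,
           COUNT(popularity) OVER w AS current_count,
           MAX(popularity)   OVER w AS current_max,
           MIN(popularity)   OVER w AS current_min
    FROM scores
    WINDOW w AS (ORDER BY student_no)
    ORDER BY student_no
    """
).fetchall()

for row in rows:
    print(row)
# Each aggregate covers the rows from the first one up to the current one.
```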

Main reference for this article:
"MySQL 8.0 Reference Manual"

Origin blog.csdn.net/weixin_45341339/article/details/111321116