PostgreSQL window function summary

Contents

Window function description
Differences between row_number, rank, and dense_rank
Window function syntax
1 Prepare data
1.1 Create the test table test1
1.2 Insert data into table test1
2 Using the rank() window function
2.1 Rank the rows of each partition
2.2 Rank the rows of each partition, sorted by wages
2.3 View the highest-paid rows in each department
3 Using the row_number() window function
3.1 Display row numbers
3.1.1 Display row numbers in sequence
3.1.2 Fetch a range of rows
3.2 Display row numbers within each partition
3.3 Display row numbers within each department, ordered by wages
3.4 View the highest-paid row in each department
4 Using the dense_rank() window function
4.1 The difference between rank and dense_rank
4.2 dense_rank example
4.3 rank example
5 Comparing rank / row_number / dense_rank
6 Using the percent_rank() window function
6.1 Compute the relative rank within each partition
7 Using grouping sets
7.1 Group by department, then by department and wages
8 Using aggregate functions as window functions
8.1 Count the rows in one department
8.2 Total wages for each department
8.3 Running total of wages for each department, in sort order
8.4 Statistics by grouping and sorting
8.5 Using the window clause
8.5.1 Description of the window clause
8.5.2 SQL statement
8.6 Sequence functions among the window functions
8.6.1 Description of sequence functions
8.6.2 SQL statement
9 Using first_value / last_value
9.1 Description of first_value and last_value
9.2 SQL statement

Window Function Description

1. SQL has a class of functions called aggregate functions, such as sum(), avg(), and max(). They collapse multiple rows into a single row according to some rule, so the result generally has fewer rows than the input. Sometimes, though, we want to show the original rows alongside the aggregated values, and that is what window functions are for.

2. Window functions are among the last steps in query processing: they run after FROM, WHERE, GROUP BY, and HAVING, and are followed only by steps such as the final ORDER BY.

3. The PARTITION BY clause, also called the query partition clause, is very similar to GROUP BY. It divides the rows into partitions by the values of the listed columns, and the function in front of OVER is evaluated within each partition. When the partition changes, the function starts over.

4. The ORDER BY clause inside OVER sorts the rows of each partition before the function is applied. It is essential for functions such as row_number(), lead(), and lag(), whose results make no sense when the rows are in no particular order. Note that adding ORDER BY also changes what aggregates such as count() and min() return: by default they are then computed over the rows from the start of the partition up to the current row (including its peers) rather than over the whole partition, which may not be the result you want.

5. If only PARTITION BY is given and ORDER BY is omitted, an aggregate is computed over the whole partition.

6. When a SELECT contains more than one window function, they do not affect each other.
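To make point 1 concrete, here is a minimal sketch of an aggregate used as a window function. The inline table emp and its values are made up for illustration and are not part of the test data used later. Unlike GROUP BY, which would return one row per department, the query keeps every input row and repeats the department total on each of them.

```sql
-- emp and its rows are hypothetical, for illustration only
WITH emp(department, wages) AS (
    VALUES ('dev', 6000), ('dev', 4000), ('sales', 5000)
)
SELECT department,
       wages,
       -- total of the whole partition, repeated on every row of the department
       sum(wages) OVER (PARTITION BY department) AS dept_total
FROM emp
ORDER BY department, wages;
-- department | wages | dept_total
-- dev        |  4000 |      10000
-- dev        |  6000 |      10000
-- sales      |  5000 |       5000
```

With GROUP BY department instead, the same data would produce only two rows (dev 10000, sales 5000).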

Differences between row_number, rank, and dense_rank

These three window functions are used very often. They differ as follows:

1. row_number() numbers the rows of each partition sequentially, starting from 1. The values never repeat. When rows tie on the sort value they still receive distinct numbers, and the order among the tied rows is not guaranteed unless the ORDER BY breaks the tie.

2. rank() ranks the rows of each partition. Tied rows share the same rank, and the ranks that follow skip ahead, leaving gaps.

3. dense_rank() ranks the rows of each partition. Tied rows share the same rank, and the ranks that follow continue without gaps.

Note:

The difference between rank and dense_rank is that dense_rank does not leave gaps after tied ranks.
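The three behaviours are easiest to see side by side on data that contains a tie. The sketch below uses a made-up inline table scores (not part of the article's test data). The extra name column in the row_number() ordering is an assumption added only to make the numbering of the tied rows deterministic.

```sql
-- scores and its rows are hypothetical, for illustration only
WITH scores(name, score) AS (
    VALUES ('a', 90), ('b', 80), ('c', 80), ('d', 70)
)
SELECT name,
       score,
       row_number() OVER (ORDER BY score DESC, name) AS row_num,   -- always unique
       rank()       OVER (ORDER BY score DESC)       AS rnk,       -- ties share a rank, gap afterwards
       dense_rank() OVER (ORDER BY score DESC)       AS dense_rnk  -- ties share a rank, no gap
FROM scores
ORDER BY score DESC, name;
-- name | score | row_num | rnk | dense_rnk
-- a    |    90 |       1 |   1 |         1
-- b    |    80 |       2 |   2 |         2
-- c    |    80 |       3 |   2 |         2
-- d    |    70 |       4 |   4 |         3
```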

Window function syntax

<window function>

OVER ([PARTITION BY <column list>]

[ORDER BY <sort column list>])

over: the keyword that introduces the window definition

partition by: divides the result set into groups (partitions)

order by: sorts the rows within each partition

Aggregate functions: the usual aggregates (SUM, AVG, COUNT, MAX, MIN) can be used as window functions

Built-in window functions: rank, dense_rank, row_number, percent_rank, ntile, first_value, last_value, nth_value, and others. grouping sets, covered in section 7, is a GROUP BY feature rather than a window function.

1 Prepare data

1.1 Create the test table test1

create table test1(
department varchar(50),
number numeric,
wages numeric
);

1.2 Insert data into table test1

insert into test1 values
('发展部','8','6000'),
('发展部','10','5200'),
('销售部','1','5000'),
('销售部','3','4800'),
('发展部','7','4200'),
('销售部','4','4800'),
('发展部','9','4500'),
('私立部','5','3500'),
('私立部','2','3900'),
('发展部','11','5200');

2 Using the rank() window function

rank(): returns the rank of each row. Rows that tie on the ordering value share the same rank and the following rank skips ahead, i.e. it returns 1, 2, 2, 4 ...

2.1 Rank the rows of each partition

select *,rank() over(partition by department) cn from test1;
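One thing to keep in mind with the query above: when OVER has no ORDER BY, all rows of a partition count as ties, so rank() returns 1 for every row. The sketch below shows this on a small made-up inline table t (its name and values are illustrative only), next to the ordered variant.

```sql
-- t and its rows are hypothetical, for illustration only
WITH t(department, wages) AS (
    VALUES ('dev', 6000), ('dev', 4000), ('sales', 5000)
)
SELECT department,
       wages,
       rank() OVER (PARTITION BY department)                     AS no_order,  -- every row is a peer
       rank() OVER (PARTITION BY department ORDER BY wages DESC) AS by_wages
FROM t
ORDER BY department, wages DESC;
-- department | wages | no_order | by_wages
-- dev        |  6000 |        1 |        1
-- dev        |  4000 |        1 |        2
-- sales      |  5000 |        1 |        1
```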

2.2 Rank the rows of each partition, sorted by wages

select *,rank() over(partition by department order by wages desc) cn from test1;

2.3 View the highest-paid rows in each department

select * from (
select *,rank() over(partition by department order by wages desc) cn from test1)
tn where cn=1;

3 Using the row_number() window function

row_number(): returns the row number. Numbers are never repeated and have no gaps, even when the ordering values tie, i.e. it returns 1, 2, 3, 4, 5 ... rather than 1, 2, 2, 4 ...

3.1 Display row numbers

3.1.1 Display row numbers in sequence

select *,row_number() over() cn from test1

3.1.2 Fetch a range of rows

select *,row_number() over() cn from test1 limit 4 OFFSET 2
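Without an ORDER BY, neither the numbering nor the rows returned by LIMIT/OFFSET are guaranteed to be stable between runs. A possible variant, sketched here on a made-up inline table t (names and values are illustrative only), numbers the rows in an explicit order and then filters on that number, which selects the same range as LIMIT 4 OFFSET 2.

```sql
-- t and its rows are hypothetical, for illustration only
WITH t(number, wages) AS (
    VALUES (1, 5000), (2, 3900), (3, 4800), (4, 4700), (5, 3500), (6, 6000)
)
SELECT number, wages, rn
FROM (
    SELECT *, row_number() OVER (ORDER BY number) AS rn
    FROM t
) numbered
WHERE rn BETWEEN 3 AND 6   -- rows 3 to 6, the same range as LIMIT 4 OFFSET 2
ORDER BY rn;
-- number | wages | rn
--      3 |  4800 |  3
--      4 |  4700 |  4
--      5 |  3500 |  5
--      6 |  6000 |  6
```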

3.2 Display row numbers within each partition

select *,row_number() over(partition by department) cn from test1

3.3 Display row numbers within each department, ordered by wages

select *,row_number() over(partition by department order by wages desc) cn from test1

3.4 View the highest-paid row in each department

select * from ( select *,row_number() over(partition by department order by wages desc) cn from test1 )
tn where cn =1;
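The subquery is needed because a window function cannot be used directly in WHERE. Filtering on rank() = 1 and on row_number() = 1 only differ when the top value is tied: rank() keeps every tied row, row_number() keeps exactly one. A minimal sketch on a made-up inline table t (illustrative values only; the number tie-breaker in the row_number() ordering is an assumption added for determinism):

```sql
-- t and its rows are hypothetical, for illustration only
WITH t(department, number, wages) AS (
    VALUES ('sales', 3, 4800), ('sales', 4, 4800), ('sales', 1, 4000)
)
SELECT department, number, wages
FROM (
    SELECT *,
           rank()       OVER (PARTITION BY department ORDER BY wages DESC)         AS rk,
           row_number() OVER (PARTITION BY department ORDER BY wages DESC, number) AS rn
    FROM t
) ranked
WHERE rk = 1          -- keeps both 4800 rows; WHERE rn = 1 would keep only number 3
ORDER BY number;
-- department | number | wages
-- sales      |      3 |  4800
-- sales      |      4 |  4800
```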

4 Using the dense_rank() window function

4.1 The difference between rank and dense_rank

rank(): returns the rank of each row. Tied rows share the same rank and the following rank skips ahead, i.e. it returns 1, 2, 2, 4 ...

dense_rank(): returns the rank of each row. Tied rows share the same rank and the following rank continues without a gap, i.e. it returns 1, 2, 2, 3 ...

Note the difference between the two.

4.2 dense_rank example

select *,dense_rank() over(partition by department order by wages desc) cn from test1;

4.3 rank example

select *,rank() over(partition by department order by wages desc) cn from test1;

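5 Comparing rank / row_number / dense_rank

rank(): returns the rank; tied values share a rank and the ranking has gaps, i.e. 1, 2, 2, 4 ...
row_number(): returns the row number; tied values get distinct numbers with no gaps, i.e. 1, 2, 3, 4, 5 ... rather than 1, 2, 2, 4 ...
dense_rank(): returns the rank; tied values share a rank and the ranking has no gaps, i.e. 1, 2, 2, 3 ...

select department,number,wages,
-- equal values share a rank and the ranks they occupy are skipped, so the ranking has gaps
rank() over(partition by department order by wages desc) as rn1,
-- equal values share a rank and the next rank follows immediately, so the ranking has no gaps
dense_rank() over(partition by department order by wages desc) as rn2,
-- numbers the rows strictly in sort order, more like a line number
row_number() over(partition by department order by wages desc) as rn3
from test1;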

6 Using the percent_rank() window function

percent_rank(): returns the relative rank of the current row within its partition, computed as (rank - 1) / (total number of rows in the partition - 1)

6.1 Compute the relative rank within each partition

select *,percent_rank() over(partition by department order by wages desc) cn from test1;
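As a worked example of the formula, take a made-up inline table t with five rows (illustrative values only). The ranks are 1, 2, 2, 4, 5 and the row count is 5, so percent_rank() is (rank - 1) / 4 for each row.

```sql
-- t and its rows are hypothetical, for illustration only
WITH t(wages) AS (
    VALUES (100), (80), (80), (60), (50)
)
SELECT wages,
       rank()         OVER (ORDER BY wages DESC) AS rnk,
       percent_rank() OVER (ORDER BY wages DESC) AS pct   -- (rnk - 1) / (5 - 1)
FROM t
ORDER BY wages DESC;
-- wages | rnk | pct
--   100 |   1 | 0
--    80 |   2 | 0.25
--    80 |   2 | 0.25
--    60 |   4 | 0.75
--    50 |   5 | 1
```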

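7 Using grouping sets

7.1 Group by department, then by department and wages

In the result below, the rows where wages is null are the department-level counts produced by the first grouping set. If you only want one row per distinct (department, wages) pair, remove that grouping set.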
select department,wages,count(1) from test1 group by grouping sets(department,(department,wages)) order by department;
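To tell the department-level rows apart from the detail rows without relying on null, the grouping() function (available since PostgreSQL 9.5) can be added. This is a sketch on a made-up inline table t; the table, its values, and the alias is_subtotal are illustrative only.

```sql
-- t and its rows are hypothetical, for illustration only
WITH t(department, wages) AS (
    VALUES ('dev', 5200), ('dev', 5200), ('dev', 6000), ('sales', 5000)
)
SELECT department,
       wages,
       count(*)        AS cnt,
       grouping(wages) AS is_subtotal   -- 1 when wages is not part of the grouping set
FROM t
GROUP BY GROUPING SETS ((department), (department, wages))
ORDER BY department, is_subtotal, wages;
-- department | wages | cnt | is_subtotal
-- dev        |  5200 |   2 |           0
-- dev        |  6000 |   1 |           0
-- dev        |       |   3 |           1
-- sales      |  5000 |   1 |           0
-- sales      |       |   1 |           1
```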

8 Using aggregate functions as window functions

8.1 Count the rows in one department

select department,number,wages,count(*) over() from test1 where department = '发展部';

8.2 Total wages for each department

select department,number,wages,sum(wages) over(partition by department) from test1;

8.3 Running total of wages for each department, in sort order

select department,number,wages,sum(wages) over(partition by department ORDER BY wages desc) from test1;
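With ORDER BY and no explicit frame, the sum covers the rows from the start of the partition up to the current row and all rows that tie with it, so tied rows show the same running total. The sketch below, on a made-up inline table t (illustrative values only), compares this default with an explicit ROWS frame that advances one row at a time.

```sql
-- t and its rows are hypothetical, for illustration only
WITH t(wages) AS (
    VALUES (6000), (5200), (5200), (4500)
)
SELECT wages,
       -- default frame: tied rows are summed together
       sum(wages) OVER (ORDER BY wages DESC) AS range_sum,
       -- explicit ROWS frame: strictly row by row
       sum(wages) OVER (ORDER BY wages DESC
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum
FROM t
ORDER BY wages DESC, rows_sum;
-- wages | range_sum | rows_sum
--  6000 |      6000 |     6000
--  5200 |     16400 |    11200
--  5200 |     16400 |    16400
--  4500 |     20900 |    20900
```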

8.4 Statistics by grouping and sorting

select department,number,wages,
sum(wages) over() sum1,
sum(wages) over (order by department) sum2,
sum(wages) over (partition by department) sum3,
sum(wages) over ( partition by department order by wages desc) sum4
from test1
order by department desc;

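8.5 Using the window clause

8.5.1 Description of the window clause

Above, we used the partition by clause to divide the data into groups. For a finer-grained division within each group, we need the window clause (the PostgreSQL documentation calls it the frame clause).

Window clause:

- preceding: rows before the current row

- following: rows after the current row

- current row: the current row

- unbounded: the boundary of the partition

- unbounded preceding: from the first row of the partition

- unbounded following: to the last row of the partition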

8.5.2 SQL statement

select department,number,wages,
-- sum over all rows
sum(wages) over() as sum1,
-- sum within each department
sum(wages) over(partition by department) as sum2,
-- running sum within each department, ordered by wages
sum(wages) over(partition by department order by wages) as sum3,
-- sum from the first row of the partition to the current row
sum(wages) over(partition by department order by wages rows between unbounded preceding and current row) as sum4,
-- sum of the previous row and the current row
sum(wages) over(partition by department order by wages rows between 1 preceding and current row) as sum5,
-- sum from the previous row to the next row
sum(wages) over(partition by department order by wages rows between 1 preceding and 1 following) as sum6,
-- sum from the current row to the last row of the partition
sum(wages) over(partition by department order by wages rows between current row and unbounded following) as sum7
from test1;
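The frames are easier to follow on a tiny data set where the sums can be checked by hand. The following sketch uses a made-up inline table t with four rows (names and values are illustrative only) and three of the frames from the statement above.

```sql
-- t and its rows are hypothetical, for illustration only
WITH t(seq, amount) AS (
    VALUES (1, 10), (2, 20), (3, 30), (4, 40)
)
SELECT seq,
       amount,
       sum(amount) OVER (ORDER BY seq ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)         AS prev_and_cur,
       sum(amount) OVER (ORDER BY seq ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)         AS neighbours,
       sum(amount) OVER (ORDER BY seq ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS remaining
FROM t
ORDER BY seq;
-- seq | amount | prev_and_cur | neighbours | remaining
--   1 |     10 |           10 |         30 |       100
--   2 |     20 |           30 |         60 |        90
--   3 |     30 |           50 |         90 |        70
--   4 |     40 |           70 |         70 |        40
```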

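8.6 Sequence functions among the window functions

8.6.1 Description of sequence functions

The commonly used sequence functions include the following:

ntile

ntile(n) splits the ordered rows of a partition into n slices (buckets) and returns the number of the slice the current row falls in.

ntile does not support rows between; it always divides the whole partition, so a frame clause such as the one below should not be used with it:

ntile(2) over(partition by cost order by name rows between 3 preceding and current row)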

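8.6.2 SQL statement

select department,number,wages,
-- split all rows into 3 buckets
ntile(3) over() as sample1,
-- split the rows of each department into 3 buckets
ntile(3) over(partition by department) sample2,
-- split all rows into 3 buckets, ordered by department
ntile(3) over(order by department) sample3,
-- split the rows of each department into 3 buckets, ordered by wages
ntile(3) over(partition by department order by wages) sample4
from test1 order by department desc;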
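When the row count is not a multiple of n, the buckets cannot all be the same size; the earlier buckets receive the extra rows. A small sketch on a made-up inline table t with five rows (illustrative values only):

```sql
-- t and its rows are hypothetical, for illustration only
WITH t(wages) AS (
    VALUES (3500), (3900), (4200), (4800), (5000)
)
SELECT wages,
       ntile(3) OVER (ORDER BY wages) AS bucket3,   -- bucket sizes 2, 2, 1
       ntile(2) OVER (ORDER BY wages) AS bucket2    -- bucket sizes 3, 2
FROM t
ORDER BY wages;
-- wages | bucket3 | bucket2
--  3500 |       1 |       1
--  3900 |       1 |       1
--  4200 |       2 |       1
--  4800 |       2 |       2
--  5000 |       3 |       2
```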

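9 Using first_value / last_value

9.1 Description of first_value and last_value

first_value: returns the first value in the partition after sorting, considering the rows up to the current row

last_value: returns the last value in the partition after sorting, considering the rows up to the current row; if the current row has ties, the value from the last tied row is returned

The following function is available in Greenplum, and PostgreSQL provides it as well:

nth_value: returns the value of a column at the specified row position within each partition of the result set (null if no such row exists)

9.2 SQL statement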

select department,number,wages,
first_value(number) over(partition by department order by wages desc)as f1,
last_value(number) over(partition by department order by wages desc) as f2
from test1;
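Because the default frame ends at the current row, last_value() in the statement above usually just returns the current row's own value. If the goal is the last value of the whole partition, one option is to widen the frame explicitly, as in this sketch on a made-up inline table t (illustrative values only; the column aliases are hypothetical). The same frame is used for nth_value() so that it can see the second row from every position.

```sql
-- t and its rows are hypothetical, for illustration only
WITH t(number, wages) AS (
    VALUES (8, 6000), (10, 5200), (9, 4500)
)
SELECT number,
       wages,
       first_value(number) OVER (ORDER BY wages DESC) AS first_num,
       -- default frame: stops at the current row
       last_value(number)  OVER (ORDER BY wages DESC) AS last_default,
       -- full-partition frame: really the last row
       last_value(number)  OVER (ORDER BY wages DESC
                                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_full,
       nth_value(number, 2) OVER (ORDER BY wages DESC
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS second_num
FROM t
ORDER BY wages DESC;
-- number | wages | first_num | last_default | last_full | second_num
--      8 |  6000 |         8 |            8 |         9 |         10
--     10 |  5200 |         8 |           10 |         9 |         10
--      9 |  4500 |         8 |            9 |         9 |         10
```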


Origin blog.csdn.net/xfg0218/article/details/104340898