Detailed explanation of the grouping functions in PostgreSQL

GROUPING SETS, CUBE and ROLLUP are grouping and aggregation features of the GROUP BY clause in PostgreSQL, and they are used together with the GROUPING function. With them, several reports at different levels or dimensions can be produced by a single query. Let's take a look at how to use them.

1. Build test data

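To make the results easy to show, a test table (fruit_sale) is built for the demonstration. The product values are 西瓜 (watermelon) and 苹果 (apple), and the region values are 华南 (South China), 华中 (Central China) and 华东 (East China). The construction code is as follows: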


DROP TABLE IF EXISTS "fruit_sale";
CREATE TABLE "fruit_sale" (
  "statistical_date" date,
  "product" varchar(255) COLLATE "pg_catalog"."default",
  "year" varchar(5) COLLATE "pg_catalog"."default",
  "qty" numeric(8),
  "amount" numeric(8),
  "region" varchar(50) COLLATE "pg_catalog"."default"
)
;


INSERT INTO "fruit_sale" VALUES ('2018-01-01', '西瓜', '2018', 1721, 253541, '华南');
INSERT INTO "fruit_sale" VALUES ('2019-03-01', '西瓜', '2019', 3437, 104221, '华南');
INSERT INTO "fruit_sale" VALUES ('2019-05-01', '西瓜', '2019', 8963, 122630, '华南');
INSERT INTO "fruit_sale" VALUES ('2019-06-01', '苹果', '2019', 1274, 150122, '华南');
INSERT INTO "fruit_sale" VALUES ('2019-05-01', '苹果', '2019', 6319, 282352, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-11-01', '苹果', '2018', 8614, 170263, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-02-01', '西瓜', '2018', 5530, 129644, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-07-01', '西瓜', '2018', 4711, 129644, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-08-01', '西瓜', '2018', 9187, 220605, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-05-01', '西瓜', '2018', 5678, 129644, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-09-01', '西瓜', '2018', 4029, 119187, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-10-01', '西瓜', '2018', 3129, 137928, '华南');
INSERT INTO "fruit_sale" VALUES ('2018-03-01', '西瓜', '2018', 4496, 203471, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-04-01', '西瓜', '2018', 7359, 206686, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-12-01', '西瓜', '2018', 8646, 267718, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-01-01', '苹果', '2018', 5559, 269419, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-04-01', '苹果', '2018', 5590, 182167, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-07-01', '苹果', '2018', 3852, 130764, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-06-01', '西瓜', '2018', 7434, 206686, '华中');
INSERT INTO "fruit_sale" VALUES ('2019-01-01', '苹果', '2019', 5558, 156995, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-08-01', '苹果', '2018', 8625, 235426, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-11-01', '西瓜', '2018', 2633, 175737, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-01-01', '西瓜', '2019', 1223, 113053, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-02-01', '西瓜', '2019', 9079, 200716, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-06-01', '西瓜', '2019', 1991, 167150, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-02-01', '苹果', '2018', 5832, 142631, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-05-01', '苹果', '2018', 1392, 249027, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-06-01', '苹果', '2018', 9694, 179832, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-09-01', '苹果', '2018', 7249, 286565, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-04-01', '西瓜', '2019', 6524, 206686, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-03-01', '苹果', '2019', 6545, 238608, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-12-01', '苹果', '2018', 2140, 139439, '华东');
INSERT INTO "fruit_sale" VALUES ('2018-10-01', '苹果', '2018', 3490, 125275, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-04-01', '苹果', '2019', 9992, 157696, '华中');
INSERT INTO "fruit_sale" VALUES ('2018-03-01', '苹果', '2018', 5276, 120441, '华东');
INSERT INTO "fruit_sale" VALUES ('2019-02-01', '苹果', '2019', 2246, 216573, '华东');

A screenshot of part of the data is as follows:
[Figure: sample rows of the fruit_sale table]

2. Perform data aggregation

2.1 Normal Aggregation

-- Plain aggregation
SELECT
	product,
	YEAR,
	SUM ( qty ) qty 
FROM
	fruit_sale 
GROUP BY
	product,
	YEAR;

The result is as follows:
[Figure 2.1: result of the plain aggregation]

2.2 GROUPING SETS

2.2.1 Multi-dimensional

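GROUPING SETS aggregates by several dimensions in one query. Each grouping set is aggregated separately, and columns that are not part of the current set are returned as NULL. The aggregation code is as follows:

-- GROUPING SETS: multiple dimensions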
SELECT
	product,
	YEAR,
	SUM ( qty ) qty 
FROM
	fruit_sale 
GROUP BY
	GROUPING SETS ( product, YEAR );

The result is shown in the figure below:
[Figure 2.2.1: result of GROUPING SETS ( product, YEAR )]
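As a rough illustration of what the query above computes, the same rows can be obtained from one ordinary GROUP BY per grouping set, combined with UNION ALL. This is only a sketch of the equivalence, not how PostgreSQL executes it, and the row order of the two forms may differ.

```sql
-- Sketch: GROUPING SETS ( product, YEAR ) written as two plain aggregations.
-- The column that is not grouped in each branch is filled with NULL.
SELECT product, NULL AS year, SUM ( qty ) AS qty
FROM fruit_sale
GROUP BY product
UNION ALL
SELECT NULL AS product, year, SUM ( qty ) AS qty
FROM fruit_sale
GROUP BY year;
```

The GROUPING SETS form reads the same rows with a single statement and avoids repeating the query once per dimension.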

2.2.2 Multi-dimensional, summary

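By changing the grouping sets listed after GROUPING SETS, you can control the dimensions and levels of aggregation. Adding the empty set ( ) adds a grand-total row, so the following code aggregates by each dimension and also produces an overall summary:

-- GROUPING SETS: multiple dimensions + summary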
SELECT
	product,
	YEAR,
	SUM ( qty ) qty 
FROM
	fruit_sale 
GROUP BY
	GROUPING SETS ( product, YEAR, ( ) );

The result of running the code is shown in the figure below:
[Figure 2.2.2: result of GROUPING SETS ( product, YEAR, ( ) )]
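To check the summary row on its own, the two queries below may help. The first computes the overall total directly; the second keeps only the row produced by the empty grouping set by filtering on GROUPING, which returns 3 when both product and year are aggregated away. This is a small sketch for verification, not part of the original example.

```sql
-- Overall total, computed directly.
SELECT SUM ( qty ) AS qty
FROM fruit_sale;

-- Only the grand-total row of the GROUPING SETS query:
-- GROUPING ( product, year ) = 3 means neither column is part of the grouping set.
SELECT product, year, SUM ( qty ) AS qty
FROM fruit_sale
GROUP BY GROUPING SETS ( product, year, ( ) )
HAVING GROUPING ( product, year ) = 3;
```

Both queries should report the same qty value.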

2.2.3 Multi-dimensional, different levels

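By changing the grouping sets listed after GROUPING SETS, you can control the dimensions and levels of aggregation. Adding the combined set ( product, YEAR ) aggregates at a finer level as well, so the following code returns per-product totals, per-year totals and per-product-per-year totals:

-- GROUPING SETS: multiple dimensions + different levels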
SELECT
	product,
	YEAR,
	SUM ( qty ) qty 
FROM
	fruit_sale 
GROUP BY
	GROUPING SETS ( product, YEAR, ( product, YEAR ) );

The result of running the code is shown in the figure below:
[Figure 2.2.3: result of GROUPING SETS ( product, YEAR, ( product, YEAR ) )]

2.3 cube

When CUBE is used, every possible grouping set over the specified fields is generated. If n fields are specified, there are 2 to the nth power combinations (grouping sets), including the empty set that yields the grand total.
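For two fields, the four combinations can be written out explicitly. The sketch below spells CUBE ( product, year ) as the equivalent GROUPING SETS list, following the expansion described in the PostgreSQL documentation.

```sql
-- Sketch: CUBE ( product, year ) expanded into its 2^2 = 4 grouping sets.
SELECT
	product,
	year,
	SUM ( qty ) AS qty
FROM
	fruit_sale
GROUP BY
	GROUPING SETS (
		( product, year ),  -- detail level
		( product ),        -- subtotal per product
		( year ),           -- subtotal per year
		( )                 -- grand total
	);
```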

2.3.1 Partial cube

The code is as follows:

-- Partial cube
SELECT
	GROUPING ( product ) category_id,
	product,
	YEAR,
	SUM ( qty ) qty 
FROM
	fruit_sale 
GROUP BY
	YEAR,
	CUBE ( product );

The result of running the code is as follows:
[Figure 2.3.1: result of the partial cube]
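In the partial cube, year is always grouped and only product is cubed, so the query is equivalent to the two grouping sets ( year, product ) and ( year ). GROUPING ( product ) returns 0 on rows where product is part of the grouping and 1 on the per-year subtotal rows. The sketch below makes this visible; the column name product_label and the text 'all products' are illustrative choices, not part of the original query.

```sql
-- Sketch: the partial cube written as explicit grouping sets,
-- with a readable label on the subtotal rows.
SELECT
	GROUPING ( product ) AS category_id,
	CASE
		WHEN GROUPING ( product ) = 1 THEN 'all products'  -- per-year subtotal row
		ELSE product
	END AS product_label,
	year,
	SUM ( qty ) AS qty
FROM
	fruit_sale
GROUP BY
	GROUPING SETS ( ( year, product ), ( year ) )
ORDER BY
	year,
	GROUPING ( product );
```

Using GROUPING rather than testing product IS NULL keeps the label correct even if the data itself contained NULL products.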

2.3.2 Overall cube

Cube over product and year. With 2 fields there are 2 to the power of 2 (4) grouping categories, distinguished here by the GROUPING value in the category_id column. The code is as follows:

-- Overall cube
SELECT
	GROUPING ( product, year ) category_id,
	product,
	YEAR,
	SUM ( qty ) qty 
FROM
	fruit_sale 
GROUP BY
	CUBE ( product, year ) 
ORDER BY
	GROUPING ( product, year );

Part of the result is shown in the figure below:
[Figure 2.3.2: result of the cube over product and year]
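With several arguments, GROUPING returns a bit mask in which the rightmost argument is the least significant bit, and a bit is 1 when that column is not part of the current grouping set. For GROUPING ( product, year ) this gives the values 0 to 3. The sketch below decodes them; the level_name column and its texts are illustrative additions.

```sql
-- Sketch: decoding the category_id produced by GROUPING ( product, year ).
SELECT
	GROUPING ( product, year ) AS category_id,
	CASE GROUPING ( product, year )
		WHEN 0 THEN 'product + year detail'  -- both columns grouped
		WHEN 1 THEN 'product subtotal'       -- year aggregated away
		WHEN 2 THEN 'year subtotal'          -- product aggregated away
		WHEN 3 THEN 'grand total'            -- both aggregated away
	END AS level_name,
	product,
	year,
	SUM ( qty ) AS qty
FROM
	fruit_sale
GROUP BY
	CUBE ( product, year )
ORDER BY
	GROUPING ( product, year ),
	product,
	year;
```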
Cube over product, year and region. With 3 fields there are 2 to the power of 3 (8) grouping categories. The code is as follows:

-- Cube over 3 fields
SELECT
	GROUPING ( product, year, region ) category_id,
	product,
	YEAR,
	region,
	SUM ( qty ) qty 
FROM
	fruit_sale 
GROUP BY
	CUBE ( product, year,region ) 
ORDER BY
	GROUPING ( product, year,region );

After the code is executed, part of the result is as follows:
[Figure 2.3.2.b: result of the cube over product, year and region]

2.4 rollup

When ROLLUP is used, hierarchical grouping sets are generated in the order of the specified fields. If fields A, B and C are specified, the grouping sets (A, B, C), (A, B), (A) and the grand total ( ) are generated.
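Applied to the test table, the hierarchy can be written out as an explicit GROUPING SETS list. The sketch below expands ROLLUP ( region, product, year ), following the expansion given in the PostgreSQL documentation; note that the empty set ( ) for the grand total is part of it.

```sql
-- Sketch: ROLLUP ( region, product, year ) expanded into its grouping sets.
SELECT
	region,
	product,
	year,
	SUM ( qty ) AS qty
FROM
	fruit_sale
GROUP BY
	GROUPING SETS (
		( region, product, year ),  -- detail level
		( region, product ),        -- subtotal per region and product
		( region ),                 -- subtotal per region
		( )                         -- grand total
	);
```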

2.4.1 Partial rollup

A partial rollup can also be performed. The corresponding code is as follows:

-- Partial rollup
SELECT
	GROUPING ( product, YEAR ) category_id,
	region,
	product,
	YEAR,
	SUM ( qty ) qty 
FROM
	fruit_sale 
GROUP BY
	region,
	ROLLUP ( product, YEAR );

The result is as follows:
[Figure 2.4.1: result of the partial rollup]
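Because region sits outside ROLLUP, it is included in every grouping set, so the partial rollup never produces a grand-total row. As a sketch, the same query can be written with explicit grouping sets:

```sql
-- Sketch: region, ROLLUP ( product, year ) written as explicit grouping sets.
-- region appears in every set, so there is no ( ) grand-total set.
SELECT
	GROUPING ( product, year ) AS category_id,
	region,
	product,
	year,
	SUM ( qty ) AS qty
FROM
	fruit_sale
GROUP BY
	GROUPING SETS (
		( region, product, year ),  -- category_id 0
		( region, product ),        -- category_id 1
		( region )                  -- category_id 3
	);
```

The value 2 does not occur here, since the rollup never drops product while keeping year.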

2.4.2 Overall rollup

With ROLLUP, the order of the fields determines which subtotal levels are produced, so different orders give different results.
First perform an overall rollup on the ( region, product, YEAR ) fields. The corresponding code is as follows:

SELECT
	GROUPING ( region, product, YEAR ) category_id,
	region,
	product,
	YEAR,
	SUM ( qty ) qty 
FROM
	fruit_sale 
GROUP BY
	ROLLUP ( region, product, YEAR ) 
ORDER BY
	GROUPING ( region, product, YEAR );

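After the above code is run, part of the result is shown in the figure below. The rows marked with a red box in the original screenshot differ from the result of 2.4.1; one difference that follows from the definition is that the overall rollup also contains the grand-total set ( ), which the partial rollup in 2.4.1 does not produce:
[Figure 2.4.2.a: result of ROLLUP ( region, product, YEAR )]
Then perform an overall rollup on the ( product, YEAR, region ) fields. The corresponding code is as follows: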

-- Rollup on the ( product, YEAR, region ) combination
SELECT
	GROUPING ( product, YEAR, region ) category_id,
	region,
	product,
	YEAR,
	SUM ( qty ) qty 
FROM
	fruit_sale 
GROUP BY
	ROLLUP ( product, YEAR, region ) 
ORDER BY
	GROUPING ( product, YEAR, region );

The result generated by the above code is shown in the figure below. The rows marked with a red box in the original screenshot are the difference between ROLLUP ( product, YEAR, region ) and ROLLUP ( region, product, YEAR ):
[Figure 2.4.2.b: result of ROLLUP ( product, YEAR, region )]
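Both rollups share the detail level and the grand total; they differ only in the intermediate subtotal levels. As a sketch based on the definition of ROLLUP, the two queries below return just the levels that are specific to each field order.

```sql
-- Subtotal levels produced only by ROLLUP ( region, product, year ).
SELECT region, product, SUM ( qty ) AS qty
FROM fruit_sale
GROUP BY GROUPING SETS ( ( region, product ), ( region ) );

-- Subtotal levels produced only by ROLLUP ( product, year, region ).
SELECT product, year, SUM ( qty ) AS qty
FROM fruit_sale
GROUP BY GROUPING SETS ( ( product, year ), ( product ) );
```

Choosing the field order is therefore a matter of which hierarchy the report should follow: region first for regional subtotals, product first for product subtotals.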

Origin blog.csdn.net/qq_41780234/article/details/126233330