Regional Sales Analysis (SQL in Practice, Part 3)

[Figure: SQL data analysis process]

Analysis requirements:

ESC, an e-commerce company, wants a regional sales analysis. The data analyst is asked to extract, for every year and every quarter, the province with the highest sales (the "sales champion") together with its sales data. The sales data should include total sales, the total number of completed orders, and the average sales per order. The analyst should also answer the following two questions for the business and submit the code:

  • Which province was the sales champion in the second quarter of 2018? (Text)
  • By how much did Shanxi Province's total sales in the first quarter of 2018 grow year on year, compared with the first quarter of 2017? (Text + mathematical formula)
  • Submit the full code

Once the requirements are known, map them straight away to the structure of the result table:

  • Header: sales champion province (first column) + year + quarter (the grouping keys) + total sales + total number of orders + average sales per order

Refining the analysis target:

  • Final result returned: the sales champion province and its sales data

  • Aggregation: for each year and each quarter, the province with the largest total sales

Locating the data:

  • order_info: the order table, which provides the order date, the sales amount, and the order id on each row, and from which the derived variable, average sales per order, is computed
  • customer_info: the customer information table, which links each order to its customer

Detailed solution steps:

① Get the year and quarter of each order (YEAR, MONTH)

② Join the tables to obtain each order's province (INNER JOIN)

③ Compute each province's sales for every year and quarter (aggregation)

Total sales (SUM)
Total number of completed orders (COUNT)
Average sales per order (AVG)

④ Get the sales champion province and its sales data for every year and quarter

Compute the highest total sales for each year and quarter (GROUP BY, MAX)
Join the tables to find the province with that highest total and its other sales data (INNER JOIN)
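
The four steps above can be sketched as a single query. This is only a sketch under assumptions: it needs MySQL 8.0 or later for the WITH clause, it reuses the table and column names from the queries in this post, and it assumes that create_time, payment_amount, and order_id belong to order_info while province belongs to customer_info. The names c and d mirror the temporary tables used in the rest of the post.

```sql
-- Sketch only: assumes MySQL 8.0+ (common table expressions).
-- Assumed column ownership: order_info has create_time, payment_amount, order_id;
-- customer_info has province; both have customer_id.
WITH c AS (
    -- Steps 1 to 3: year, quarter, province, and the three aggregates
    SELECT
        a.province,
        YEAR(b.create_time) AS `year`,
        CASE
            WHEN MONTH(b.create_time) >= 10 THEN 4
            WHEN MONTH(b.create_time) >= 7  THEN 3
            WHEN MONTH(b.create_time) >= 4  THEN 2
            ELSE 1
        END AS `quarter`,
        SUM(b.payment_amount) AS total_sales,
        COUNT(b.order_id)     AS order_count,
        AVG(b.payment_amount) AS avg_payment
    FROM customer_info AS a
    INNER JOIN order_info AS b
        ON a.customer_id = b.customer_id
    GROUP BY a.province, `year`, `quarter`
),
d AS (
    -- Step 4a: highest total sales per year and quarter
    SELECT `year`, `quarter`, MAX(total_sales) AS total_sales
    FROM c
    GROUP BY `year`, `quarter`
)
-- Step 4b: join back to recover the province and the other sales data
SELECT c.province, c.`year`, c.`quarter`, c.total_sales, c.order_count, c.avg_payment
FROM c
INNER JOIN d
    ON  c.`year`      = d.`year`
    AND c.`quarter`   = d.`quarter`
    AND c.total_sales = d.total_sales
ORDER BY c.`year`, c.`quarter`;
```

Defining c once avoids repeating the same subquery, which the nested-subquery form has to do.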

—— ———— ———— ————— ——— ———— ———— ———— ———— ———— ——— ——

Exploration phase: (does not meet the business requirements)


# Build table c

SELECT
		a.`province`,
		#b.`create_time`,
		YEAR(create_time) AS `year`,
		CASE
			WHEN MONTH(create_time) >=10 THEN 4
			WHEN MONTH(create_time) >=7 THEN 3
			WHEN MONTH(create_time) >=4 THEN 2
			ELSE 1
		END AS `quarter`,
		SUM(payment_amount) AS `total_sales`,
		COUNT(order_id) AS `order_count`,
		AVG(payment_amount) AS `avg_payment`
FROM customer_info AS a
INNER JOIN `order_info` AS b
ON a.customer_id =  b.customer_id
GROUP BY `province`, `year`, `quarter`;
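
As a side note, MySQL also has a built-in QUARTER() function that returns a value from 1 to 4 for a date, so the CASE expression above could probably be shortened. A minimal sketch, assuming create_time is a DATE or DATETIME column of order_info; it only illustrates the date functions and is not part of the original solution:

```sql
-- QUARTER() maps months 1-3 to 1, 4-6 to 2, 7-9 to 3, and 10-12 to 4.
SELECT QUARTER('2018-05-17');   -- returns 2

-- Applied to the order table (column ownership is an assumption):
SELECT
    create_time,
    YEAR(create_time)    AS `year`,
    QUARTER(create_time) AS `quarter`
FROM order_info
LIMIT 10;
```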


# Next, build table d from table c

SELECT
		`year`,
		`quarter`,
		MAX(`total_sales`) AS `total_sales`  # some field names may need adjusting
FROM
(
SELECT
		a.`province`,
		#b.`create_time`,
		YEAR(create_time) AS `year`,
		CASE
			WHEN MONTH(create_time) >=10 THEN 4
			WHEN MONTH(create_time) >=7 THEN 3
			WHEN MONTH(create_time) >=4 THEN 2
			ELSE 1
		END AS `quarter`,
		SUM(payment_amount) AS `total_sales`,
		COUNT(order_id) AS `order_count`,
		AVG(payment_amount) AS `avg_payment`
FROM customer_info AS a
INNER JOIN `order_info` AS b
ON a.customer_id =  b.customer_id
GROUP BY `province`, `year`,`quarter`
) AS c
GROUP BY `year`, `quarter`;

# Build the final table

SELECT
	c.`province`,
	d.`year`,
	d.`quarter`,
	d.`total_sales`,
	c.`order_count`,
	c.`avg_payment`
FROM
(
SELECT
		`year`,
		`quarter`,
		MAX(`total_sales`) AS `total_sales`  # some field names adjusted as appropriate
FROM
(
SELECT
		a.`province`,
		#b.`create_time`,
		YEAR(create_time) AS `year`,
		CASE
			WHEN MONTH(create_time) >=10 THEN 4
			WHEN MONTH(create_time) >=7 THEN 3
			WHEN MONTH(create_time) >=4 THEN 2
			ELSE 1
		END AS `quarter`,
		SUM(payment_amount) AS `total_sales`,
		COUNT(order_id) AS `order_count`,
		AVG(payment_amount) AS `avg_payment`
FROM customer_info AS a
INNER JOIN `order_info` AS b
ON a.customer_id =  b.customer_id
GROUP BY `province`, `year`,`quarter`
) AS c
GROUP BY `year`,`quarter`
) AS d

INNER JOIN
(
SELECT
		a.`province`,
		#b.`create_time`,
		YEAR(create_time) AS `year`,
		CASE
			WHEN MONTH(create_time) >=10 THEN 4
			WHEN MONTH(create_time) >=7 THEN 3
			WHEN MONTH(create_time) >=4 THEN 2
			ELSE 1
		END AS `quarter`,
		SUM(payment_amount) AS `total_sales`,
		COUNT(order_id) AS `order_count`,
		AVG(payment_amount) AS `avg_payment`
FROM customer_info AS a
INNER JOIN `order_info` AS b
ON a.customer_id =  b.customer_id
GROUP BY `province`, `year`,`quarter`
) AS c
ON c.`year` = d.`year` AND c.`quarter` = d.`quarter` AND c.`total_sales` = d.`total_sales`;

/* Note: the join conditions must be combined with AND. Separating them with commas, as in
ON c.`year` = d.`year` , c.`quarter` = d.`quarter`, c.`total_sales` = d.`total_sales`
fails with the following error:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '= d.`quarter`, c.`total_sales` = d.`total_sales`' */
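
An alternative to the MAX-then-join pattern above is a window function. The following is a sketch of that different technique, not the approach used in this post. It assumes MySQL 8.0 or later, the same table and column names as above, and that create_time, payment_amount, and order_id belong to order_info. RANK() numbers the provinces within each year and quarter by total sales; provinces tied for first place are all returned, as they would be by the join on total_sales.

```sql
-- Sketch only: assumes MySQL 8.0+ (window functions).
SELECT province, `year`, `quarter`, total_sales, order_count, avg_payment
FROM (
    SELECT
        c.*,
        RANK() OVER (
            PARTITION BY `year`, `quarter`
            ORDER BY total_sales DESC
        ) AS sales_rank              -- 1 = highest total sales in that quarter
    FROM (
        SELECT
            a.province,
            YEAR(b.create_time)    AS `year`,
            QUARTER(b.create_time) AS `quarter`,
            SUM(b.payment_amount)  AS total_sales,
            COUNT(b.order_id)      AS order_count,
            AVG(b.payment_amount)  AS avg_payment
        FROM customer_info AS a
        INNER JOIN order_info AS b
            ON a.customer_id = b.customer_id
        GROUP BY a.province, `year`, `quarter`
    ) AS c
) AS ranked
WHERE sales_rank = 1
ORDER BY `year`, `quarter`;
```

This form computes the per-province aggregates only once, at the cost of requiring a newer MySQL version.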


1. The final result:
[Figure: final result table]

2. Sales champion in the second quarter of 2018: Jilin Province

3. Year-on-year growth of Shanxi Province's total sales, first quarter of 2018 versus first quarter of 2017: (119001.49 - 46850.5) / 46850.5 ≈ 154%
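
Written out as a formula, the year-on-year growth rate is the increase in sales divided by the previous year's sales. The symbols g and S are introduced here only for illustration; the two sales figures are the ones quoted above.

```latex
\[
g = \frac{S_{2018\,\mathrm{Q1}} - S_{2017\,\mathrm{Q1}}}{S_{2017\,\mathrm{Q1}}}
  = \frac{119001.49 - 46850.5}{46850.5}
  = \frac{72150.99}{46850.5}
  \approx 1.54 = 154\%
\]
```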

____________________________________________________________________________________________________________-

Code logic:

  • Compute the year and quarter of each sales record from the order information table order_info (YEAR / CASE WHEN)
  • Compute the sales for each year and quarter from order_info (GROUP BY / SUM / COUNT)
  • Join the customer information table to build temporary table c with the fields province, year, quarter, total sales, number of orders, and average sales per order
  • Apply MAX, grouped by year and quarter, to build temporary table d with the fields year, quarter, and maximum total sales
  • Finally, INNER JOIN tables c and d to attach the province, total number of orders, average sales per order, and so on, giving the final table of champions for each year and quarter

Technical review:

Technical note: when the aggregation is grouped on several fields, the join between the tables must use all of those fields together as the key, combined with AND. This is worth remembering.
The last step is the harder one; it takes more practice writing queries and more thinking.
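
Because the key columns have the same names in both derived tables, the multi-column join condition could also be written with USING instead of a chain of AND. A minimal sketch with tiny hypothetical inline tables (the provinces and values are made up purely to show the syntax) standing in for the real c and d:

```sql
-- Hypothetical data: two provinces in one quarter; d holds that quarter's maximum.
SELECT province, `year`, `quarter`, total_sales
FROM (
    SELECT 'A' AS province, 2018 AS `year`, 1 AS `quarter`, 100.0 AS total_sales
    UNION ALL
    SELECT 'B', 2018, 1, 80.0
) AS c
INNER JOIN (
    SELECT 2018 AS `year`, 1 AS `quarter`, 100.0 AS total_sales
) AS d
USING (`year`, `quarter`, total_sales);   -- keeps only province A's row
```

USING (x, y, z) is equivalent to matching on x AND y AND z, so it cannot be mistyped with commas in the way the ON clause can.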

Upcoming updates:

sjk topic exercises + hands-on projects + xiaozao

Origin blog.csdn.net/weixin_44976611/article/details/104872787