[Interview question] There is a "student grades table" (学生成绩表) with three fields: student number (学号), course (课程), and grade (成绩).
Question: For each course, find the Class A and Class B students. Classes are assigned by cumulative proportion: 0~60% is Class A, and 60%~85% is Class B.
[Problem-solving approach]
What is the 80/20 rule?
The 80/20 rule says that in any set of things, the most important items make up only a small part, about 20%. For example, in a store, the best-selling products account for only about 20% of all products.
What is ABC classification?
ABC classification is a method derived from the 80/20 rule. Because it divides objects into three categories, A, B, and C, it is called ABC classification; it is also known as Pareto analysis.
ABC classification calculation steps:
1) Sort the objects being analyzed from largest to smallest
2) For each object, calculate the cumulative proportion: the sum of that object and all objects before it, divided by the overall total
3) Label a cumulative proportion of 0-60% as Class A, 60%-85% as Class B, and more than 85% as Class C
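The three steps above can be sketched in Python. This is only a minimal illustration, not part of the original solution: the function name `abc_classify` and the sample values are hypothetical, and the cut-offs 0.60 and 0.85 are taken from step 3.

```python
from itertools import accumulate

def abc_classify(values, a_cut=0.60, b_cut=0.85):
    """Return (value, cumulative_share, label) tuples, largest value first."""
    ordered = sorted(values, reverse=True)        # step 1: sort descending
    total = sum(ordered)
    result = []
    for value, running in zip(ordered, accumulate(ordered)):
        share = running / total                   # step 2: cumulative proportion
        if share <= a_cut:                        # step 3: assign the class
            label = "A"
        elif share <= b_cut:
            label = "B"
        else:
            label = "C"
        result.append((value, share, label))
    return result

# Hypothetical sample values, chosen only for illustration.
for row in abc_classify([50, 30, 10, 6, 4]):
    print(row)
```

With these sample values the cumulative shares are 0.5, 0.8, 0.9, 0.96 and 1.0, so the first object is Class A, the second is Class B, and the rest are Class C.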
1. Problem-solving ideas
Requirement: for each course, find the Class A and Class B students, where a cumulative proportion of 0~60% is Class A and 60%~85% is Class B.
Therefore, the core problem is to calculate the cumulative proportion.
So, what is the cumulative proportion?
Cumulative proportion for a course = cumulative course grade / total course grade
The "total course grade" is easy to understand: it is the sum of the grades of all students in that course.
The "cumulative course grade" is defined as follows:
1) Sort the students' grades within each course from highest to lowest;
2) For each student, add up that student's grade and the grades of all students ranked before them in the same course.
For example, for the math course in the sample data, the grades in descending order are 96, 65, 55. The cumulative math grade for student S002 is 96, the cumulative math grade for student S001 is 96+65=161, and so on.
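The running totals in this example can be checked with a few lines of Python. The three grades come from the example itself; the variable names are illustrative only.

```python
from itertools import accumulate

math_grades = [96, 65, 55]                  # already sorted from high to low
cumulative = list(accumulate(math_grades))  # running total per student
total = sum(math_grades)                    # total course grade

# Cumulative proportion = cumulative course grade / total course grade
proportions = [round(c / total, 3) for c in cumulative]

print(cumulative)    # [96, 161, 216]
print(proportions)   # [0.444, 0.745, 1.0]
```

Under the 60% and 85% thresholds, the first student (0.444) falls in Class A and the second (0.745) in Class B.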
2. Cumulative course grades
Running totals like this are best computed with a window function.
select *,
       sum(成绩) over (partition by 课程
                       order by 成绩 desc
                       rows between unbounded preceding and current row) as 课程累计成绩
from 学生成绩表;
Name the result of this SQL query subquery t1.
This uses the rows between ... and ... clause of the window function, which sums field 1 (字段1) over the rows from the "start row" (起始行) to the "end row" (终止行):
sum(字段1) over (partition by 字段2
                 order by 字段3
                 rows between 起始行 and 终止行)
This question requires "each student's grade plus the grades of the students ranked before them", so the "start row" is the first row of each window (unbounded preceding) and the "end row" is the current row (current row).
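The running-total query from this step can be tried out against an in-memory SQLite database through Python's sqlite3 module. This is a sketch under stated assumptions: the bundled SQLite must be version 3.25 or later (the first release with window functions), and the sample rows are hypothetical apart from the math grades 96, 65, 55 for which the text gives S002 and S001; student S003 and the "english" rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table 学生成绩表 (学号 text, 课程 text, 成绩 integer)")
# Hypothetical sample rows; only the math grades 96/65/55 come from the text.
conn.executemany(
    "insert into 学生成绩表 values (?, ?, ?)",
    [
        ("S001", "math", 65),
        ("S002", "math", 96),
        ("S003", "math", 55),
        ("S001", "english", 80),
        ("S002", "english", 70),
        ("S003", "english", 90),
    ],
)

running_total_sql = """
select 学号, 课程, 成绩,
       sum(成绩) over (partition by 课程
                       order by 成绩 desc
                       rows between unbounded preceding and current row) as 课程累计成绩
from 学生成绩表
order by 课程, 成绩 desc
"""
rows = conn.execute(running_total_sql).fetchall()
for row in rows:
    print(row)
```

The running total restarts for every course because of partition by 课程. If two students in the same course have the same grade, the order between them (and therefore their individual running totals) is not determined by this query.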
3. Total course grade
By definition, the cumulative proportion for a course = cumulative course grade / total course grade.
The previous step produced the numerator: the cumulative grade within each course.
We still need the denominator: the total grade of each course.
The word "each" suggests a summary (aggregate) query: group by course (group by) and sum the grades (sum).
select 课程,sum(成绩) as 课程总成绩
from 学生成绩表
group by 课程;
Name the result of this SQL query subquery t2.
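As a quick check of the grouped sum, the same statement can be run with Python's sqlite3 module. The sample rows are hypothetical (only the math grades 96, 65, 55 appear in the text), and the variable name `totals` is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table 学生成绩表 (学号 text, 课程 text, 成绩 integer)")
# Hypothetical sample rows; only the math grades 96/65/55 come from the text.
conn.executemany(
    "insert into 学生成绩表 values (?, ?, ?)",
    [
        ("S001", "math", 65), ("S002", "math", 96), ("S003", "math", 55),
        ("S001", "english", 80), ("S002", "english", 70), ("S003", "english", 90),
    ],
)

# One output row per course: (课程, 课程总成绩)
totals = dict(
    conn.execute(
        "select 课程, sum(成绩) as 课程总成绩 from 学生成绩表 group by 课程"
    )
)
print(totals)
```

For the math course the total is 96+65+55=216, which is the denominator used for every math student in the next step.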
4. Cumulative proportion
By definition, the cumulative proportion for a course = cumulative course grade / total course grade.
To make the calculation easy, the results of the two previous queries need to be combined into one table.
Treat the cumulative course grades query as table t1 and the total course grades query as table t2, and join the two tables on course:
select t1.学号,
       t1.课程,
       t1.成绩,
       t1.课程累计成绩,
       t2.课程总成绩,
       t1.课程累计成绩/t2.课程总成绩 as 累计成绩占比
from t1
left join t2
on t1.课程 = t2.课程;
Substituting the subqueries t1 and t2 into the SQL statement above gives:
select t1.学号,
       t1.课程,
       t1.成绩,
       t1.课程累计成绩,
       t2.课程总成绩,
       t1.课程累计成绩/t2.课程总成绩 as 累计成绩占比
from (
    select *,
           sum(成绩) over (partition by 课程
                           order by 成绩 desc
                           rows between unbounded preceding and current row) as 课程累计成绩
    from 学生成绩表
) as t1
left join (
    select 课程, sum(成绩) as 课程总成绩
    from 学生成绩表
    group by 课程
) as t2
on t1.课程 = t2.课程;
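The joined query can be tried out with Python's sqlite3 module. One caveat: the `/` operator behaves differently across databases. MySQL returns a decimal when dividing two integers, whereas SQLite (used in this sketch) and PostgreSQL perform integer division, so the sketch multiplies by 1.0 first. The sample rows are hypothetical apart from the math grades 96, 65, 55, and SQLite 3.25 or later is assumed for window-function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table 学生成绩表 (学号 text, 课程 text, 成绩 integer)")
# Hypothetical sample rows; only the math grades 96/65/55 come from the text.
conn.executemany(
    "insert into 学生成绩表 values (?, ?, ?)",
    [
        ("S001", "math", 65), ("S002", "math", 96), ("S003", "math", 55),
        ("S001", "english", 80), ("S002", "english", 70), ("S003", "english", 90),
    ],
)

ratio_sql = """
select t1.学号, t1.课程, t1.成绩, t1.课程累计成绩, t2.课程总成绩,
       -- 1.0 * forces real division in SQLite; not needed in MySQL
       1.0 * t1.课程累计成绩 / t2.课程总成绩 as 累计成绩占比
from (
    select *,
           sum(成绩) over (partition by 课程
                           order by 成绩 desc
                           rows between unbounded preceding and current row) as 课程累计成绩
    from 学生成绩表
) as t1
left join (
    select 课程, sum(成绩) as 课程总成绩
    from 学生成绩表
    group by 课程
) as t2
on t1.课程 = t2.课程
order by t1.课程, t1.成绩 desc
"""
rows = conn.execute(ratio_sql).fetchall()
for row in rows:
    print(row)
```

Without the 1.0 factor, SQLite would truncate every ratio below 1 to 0, which would silently put every such student into no class at all in the next step.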
Name the result of this SQL query subquery t3.
5. Classification
The question asks for the Class A and Class B students of each course: a cumulative proportion of 0~60% is Class A, and 60%~85% is Class B. A case expression assigns the class, and the where clause drops students above 85% (Class C):
select t3.学号,
       t3.课程,
       t3.成绩,
       case when t3.累计成绩占比 > 0 and t3.累计成绩占比 <= 0.6 then 'A'
            when t3.累计成绩占比 > 0.6 and t3.累计成绩占比 <= 0.85 then 'B'
       end as 类别
from t3
where t3.累计成绩占比 <= 0.85;
Substituting the subquery t3 into the SQL statement above gives the final query:
select t3.学号,
       t3.课程,
       t3.成绩,
       case when t3.累计成绩占比 > 0 and t3.累计成绩占比 <= 0.6 then 'A'
            when t3.累计成绩占比 > 0.6 and t3.累计成绩占比 <= 0.85 then 'B'
       end as 类别
from (
    select t1.学号,
           t1.课程,
           t1.成绩,
           t1.课程累计成绩,
           t2.课程总成绩,
           t1.课程累计成绩/t2.课程总成绩 as 累计成绩占比
    from (
        select *,
               sum(成绩) over (partition by 课程
                               order by 成绩 desc
                               rows between unbounded preceding and current row) as 课程累计成绩
        from 学生成绩表
    ) as t1
    left join (
        select 课程, sum(成绩) as 课程总成绩
        from 学生成绩表
        group by 课程
    ) as t2
    on t1.课程 = t2.课程
) as t3
where t3.累计成绩占比 <= 0.85;
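To see the whole pipeline end to end, the final query can be run with Python's sqlite3 module. This is a sketch, not the author's test setup: the sample rows are hypothetical apart from the math grades 96, 65, 55, SQLite 3.25 or later is assumed, and the ratio is multiplied by 1.0 because SQLite, unlike MySQL, performs integer division on integer operands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table 学生成绩表 (学号 text, 课程 text, 成绩 integer)")
# Hypothetical sample rows; only the math grades 96/65/55 come from the text.
conn.executemany(
    "insert into 学生成绩表 values (?, ?, ?)",
    [
        ("S001", "math", 65), ("S002", "math", 96), ("S003", "math", 55),
        ("S001", "english", 80), ("S002", "english", 70), ("S003", "english", 90),
    ],
)

classify_sql = """
select t3.学号, t3.课程, t3.成绩,
       case when t3.累计成绩占比 > 0 and t3.累计成绩占比 <= 0.6 then 'A'
            when t3.累计成绩占比 > 0.6 and t3.累计成绩占比 <= 0.85 then 'B'
       end as 类别
from (
    select t1.学号, t1.课程, t1.成绩,
           1.0 * t1.课程累计成绩 / t2.课程总成绩 as 累计成绩占比
    from (
        select *,
               sum(成绩) over (partition by 课程
                               order by 成绩 desc
                               rows between unbounded preceding and current row) as 课程累计成绩
        from 学生成绩表
    ) as t1
    left join (
        select 课程, sum(成绩) as 课程总成绩
        from 学生成绩表
        group by 课程
    ) as t2
    on t1.课程 = t2.课程
) as t3
where t3.累计成绩占比 <= 0.85
order by t3.课程, t3.成绩 desc
"""
rows = conn.execute(classify_sql).fetchall()
for row in rows:
    print(row)
```

With this sample data, each course returns two students: the top scorer as Class A and the second as Class B. The lowest scorer in each course has a cumulative proportion of 100% and is filtered out by the where clause.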
[Key points tested by this question]
1. Understanding of Pareto analysis;
2. Understanding of window functions and the ability to apply them flexibly to business problems;
3. Understanding of multi-table joins.