CASE WHEN function @sql learning

In MySQL, the CASE WHEN expression tests conditions on each row and returns the matching result, which makes it a common tool for classifying and grouping data.

It is related to the flow control statements covered under MySQL triggers, and it is one of the most commonly used conditional constructs in SQL programming.

 

Uses of CASE WHEN:

  • Processing of new data items

Purpose: apply logical tests to existing fields with a CASE WHEN expression to derive new fields, for example "age group" or "asset level".

  • Summary information processing

Purpose: combine CASE WHEN with an aggregate function (such as SUM) to build more flexible summaries.

Tip: this is convenient for computing statistics on a field after GROUP BY, such as counting each gender within every group.

  • Filter control

Purpose: use CASE WHEN inside a filter condition so that the condition can vary from row to row.

 

CASE WHEN has two syntaxes. They differ as follows.

1. Simple CASE: lists the possible values of one field and compares by equality

CASE [col_name]
    WHEN [value1] THEN [result1]
    WHEN [value2] THEN [result2]
    ELSE [default]
END

 

2. Searched CASE (conditional tests)

Each WHEN holds its own condition. The conditions are evaluated in order, and the expression returns the result of the first one that is true; the remaining branches are ignored.

CASE
    WHEN [expr1] THEN [result1]
    WHEN [expr2] THEN [result2]
    ELSE [default]
END
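To make the difference between the two syntaxes concrete, here is a minimal sketch that runs both forms side by side. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and the table `demo` and its rows are hypothetical, invented only for this illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER, age INTEGER)")  # hypothetical table
conn.executemany("INSERT INTO demo VALUES (?, ?)", [(1, 10), (2, 25), (3, 70)])

rows = conn.execute("""
    SELECT id,
           -- simple CASE: equality tests against a single expression
           CASE id WHEN 1 THEN 'one' WHEN 2 THEN 'two' ELSE 'other' END,
           -- searched CASE: the first true WHEN wins, so age 10 yields
           -- 'under 18' even though it also satisfies age < 30
           CASE WHEN age < 18 THEN 'under 18'
                WHEN age < 30 THEN 'under 30'
                ELSE '30 or over' END
    FROM demo
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

Because only the first matching branch is used, the order of the WHEN clauses matters in the searched form: putting `age < 30` first would label the 10-year-old 'under 30'.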

 

 

  • Processing of new data items

Example (simple CASE): match each hero's name to the equipment that belongs to them ("equipment" is the new data item).

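SELECT
    NAME AS '英雄',                        -- hero
    CASE NAME
        WHEN '德莱文' THEN '斧子'           -- Draven: axe
        WHEN '德玛西亚-盖伦' THEN '大宝剑'   -- Garen of Demacia: greatsword
        ELSE '无'                          -- none
    END AS '装备'                          -- equipment (the AS keyword is optional)
FROM    user_info;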

 

Example (searched CASE): derive a new data item, "age group", by bucketing the age field.

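-- Conditions in a WHEN expression can be joined with AND
SELECT
    NAME AS '英雄',    age AS '年龄',                  -- hero, age
    CASE
        WHEN age < 18 THEN   '少年'                    -- juvenile
        WHEN age < 30 THEN   '青年'                    -- young adult
        WHEN age >= 30  AND age < 50 THEN   '中年'     -- middle-aged
        ELSE   '老年'                                  -- senior
    END AS '年龄段'                                    -- age group (the AS keyword is optional)
FROM    user_info;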
  • Summary information processing

Example (with aggregate functions): count the male and female customers born before 1980 and those born in 1980 or later.

SELECT
	CASE WHEN birth_dt < '1980-01-01' THEN '80前' ELSE '80后' END AS 年龄段  -- age group: born before 1980 / in 1980 or later
	, SUM( CASE WHEN gender = '1' THEN 1 ELSE 0 END) AS 男性数量  -- number of men
	, SUM( CASE WHEN gender = '2' THEN 1 ELSE 0 END) AS 女性数量  -- number of women
FROM custom_info
GROUP BY 年龄段;

Question: the query groups by age group, but where does the aggregation happen? Think about how the query is evaluated: the CASE expression first assigns every row its new data item "age group", the rows are then grouped by that value, and each SUM(CASE ...) counts the men or the women within each group.
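The evaluation order described above can be checked on a tiny data set. The following sketch again uses Python's sqlite3 with an in-memory database as a stand-in for MySQL; the five rows and the English labels `pre-1980` and `1980 or later` are made up for illustration, while the column names `birth_dt` and `gender` follow the example query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("CREATE TABLE custom_info (birth_dt TEXT, gender TEXT)")
conn.executemany("INSERT INTO custom_info VALUES (?, ?)", [
    ("1975-03-02", "1"),  # hypothetical rows: ISO dates, gender '1' = male, '2' = female
    ("1978-11-20", "2"),
    ("1979-06-15", "2"),
    ("1985-01-09", "1"),
    ("1992-07-30", "1"),
])

rows = conn.execute("""
    SELECT CASE WHEN birth_dt < '1980-01-01' THEN 'pre-1980'
                ELSE '1980 or later' END                   AS age_group,
           SUM(CASE WHEN gender = '1' THEN 1 ELSE 0 END)   AS male_cnt,
           SUM(CASE WHEN gender = '2' THEN 1 ELSE 0 END)   AS female_cnt
    FROM custom_info
    GROUP BY age_group
    ORDER BY age_group
""").fetchall()

for row in rows:
    print(row)
# ('1980 or later', 2, 0)
# ('pre-1980', 1, 2)
```

Each SUM adds 1 for the rows that match its condition and 0 for the rest, so it behaves as a conditional count inside every group.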

 

Example: Count the number of boys and girls in each class by cls_id

use sc_sys;
-- View the student table (result 1)
SELECT * FROM tb_student;
-- Assign classes by cls_id, creating the new data item 班别 (class) (result 2)
SELECT *, 
	CASE cls_id
		WHEN 1 THEN '1班'
		WHEN 2 THEN '2班'
		WHEN 3 THEN '3班'
		WHEN 4 THEN '4班'
		WHEN 5 THEN '5班'
		WHEN 6 THEN '6班'
		WHEN 7 THEN '7班'
		WHEN 8 THEN '8班'
		ELSE '其他班级'
	END as 班别
FROM tb_student ORDER BY 班别, ssex DESC;

-- Create 班别 (class) from cls_id and count the boys and girls in each class (result 3)
SELECT 
	CASE cls_id
		WHEN 1 THEN '1班'
		WHEN 2 THEN '2班'
		WHEN 3 THEN '3班'
		WHEN 4 THEN '4班'
		WHEN 5 THEN '5班'
		WHEN 6 THEN '6班'
		WHEN 7 THEN '7班'
		WHEN 8 THEN '8班'
		ELSE '其他班级'
	END 班别
	, SUM( CASE WHEN ssex = '男' THEN 1 ELSE 0 END) as 男性数量
	, SUM( CASE WHEN ssex = '女' THEN 1 ELSE 0 END) as 女性数量
FROM tb_student
GROUP BY 班别;


 

-- Create 班别 (class) from cls_id and count the students per class and gender (result 4)
SELECT 
	CASE cls_id
		WHEN 1 THEN '1班'
		WHEN 2 THEN '2班'
		WHEN 3 THEN '3班'
		WHEN 4 THEN '4班'
		WHEN 5 THEN '5班'
		WHEN 6 THEN '6班'
		WHEN 7 THEN '7班'
		WHEN 8 THEN '8班'
		ELSE '其他班级'
	END 班别, ssex, COUNT(*) 数量
FROM tb_student
GROUP BY 班别, ssex
ORDER BY 班别, ssex DESC;
-- If this fails with the error below, rename 班别 to class; English field names are the safer choice
-- Unknown column '班別' in 'order clause'

 

  • filter control

    Example: select the list of target customers.

    In region 0200, target customers are those with assets of at least 1 million.

    In all other regions, target customers are those with assets of at least 500,000.

    Output the customer number, name, mobile phone number, area code and total assets of each target customer.

SELECT  Party_Id, Name, Mobile, Zone_Num, Total_Asset 
FROM   Custom_Info
WHERE  CASE  WHEN  Zone_Num='0200'  THEN   Total_Asset>=1000000
             ELSE   Total_Asset>=500000
       END;
-- That is, region 0200 uses the condition Total_Asset>=1000000,
-- while all other regions use the condition Total_Asset>=500000
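The filter above relies on a comparison being usable as the result of a CASE branch, which MySQL accepts but some other databases do not. As a hedged sketch, the snippet below runs that form next to an equivalent one in which CASE returns only the threshold. It uses Python's sqlite3 in place of MySQL, and the four customer rows (including the region code 0755) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE Custom_Info (Party_Id INTEGER, Zone_Num TEXT, Total_Asset INTEGER)"
)
conn.executemany("INSERT INTO Custom_Info VALUES (?, ?, ?)", [
    (1, "0200", 1500000),  # region 0200, meets the 1,000,000 threshold
    (2, "0200", 800000),   # region 0200, below its threshold
    (3, "0755", 800000),   # other region, meets the 500,000 threshold
    (4, "0755", 300000),   # other region, below its threshold
])

# Form used in the example: each CASE branch is itself a comparison.
boolean_form = [r[0] for r in conn.execute("""
    SELECT Party_Id FROM Custom_Info
    WHERE CASE WHEN Zone_Num = '0200' THEN Total_Asset >= 1000000
               ELSE Total_Asset >= 500000 END
    ORDER BY Party_Id
""")]

# Equivalent form: CASE returns the threshold and the comparison stays outside.
threshold_form = [r[0] for r in conn.execute("""
    SELECT Party_Id FROM Custom_Info
    WHERE Total_Asset >= CASE WHEN Zone_Num = '0200' THEN 1000000
                              ELSE 500000 END
    ORDER BY Party_Id
""")]

print(boolean_form, threshold_form)  # [1, 3] [1, 3]
```

Keeping the comparison outside the CASE is the more portable choice when the same query has to run on several database systems.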
  • Other uses: pivoting rows to columns

Example: the aggregate function SUM combined with a simple CASE pivots rows into columns, with an alias naming each new column.

-- The aggregate function SUM combined with a simple CASE pivots rows into columns
SELECT  st.stu_id '学号',   st.stu_name '姓名',    -- student ID, name
    sum( CASE co.course_name  WHEN '大学语文' THEN  sc.scores ELSE  0 END ) '大学语文',
    sum( CASE co.course_name  WHEN '新视野英语' THEN sc.scores  ELSE  0 END ) '新视野英语'
FROM    edu_student st
LEFT JOIN edu_score sc ON st.stu_id = sc.stu_id
LEFT JOIN edu_courses co ON co.course_no = sc.course_no
GROUP BY    st.stu_id
ORDER BY    NULL;
-- Note: up to MySQL 5.7, GROUP BY also sorts its result by default; ORDER BY NULL
-- skips that sort, so the query runs faster. MySQL 8.0 no longer sorts implicitly.
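The pivot technique can be tried on a small scale with the sketch below. It is not the author's schema: the single table `scores`, its columns, and the course names `math` and `english` are hypothetical simplifications, and Python's sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
# Hypothetical long-format table: one row per student and course.
conn.execute("CREATE TABLE scores (stu_id INTEGER, course TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", [
    (1, "math", 90),
    (1, "english", 80),
    (2, "math", 70),  # student 2 has no english row
])

# Each SUM(CASE ...) keeps the scores of one course and becomes one column.
rows = conn.execute("""
    SELECT stu_id,
           SUM(CASE course WHEN 'math'    THEN score ELSE 0 END) AS math,
           SUM(CASE course WHEN 'english' THEN score ELSE 0 END) AS english
    FROM scores
    GROUP BY stu_id
    ORDER BY stu_id
""").fetchall()

for row in rows:
    print(row)
# (1, 90, 80)
# (2, 70, 0)
```

Note that with `ELSE 0` a missing score and a score of zero look the same in the result, as the row for student 2 shows.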

This is similar to what an earlier post set out to do, "Data division processing (dataframe data structure in pandas based on python)": https://blog.csdn.net/Cameback_Tang/article/details/102876947

 

CASE WHEN can also be used directly in GROUP BY

-- Query the distribution of output
select 
case when output < -500 then '(, -500)'
when output < -250 then '[-500, -250)'
when output < -200 then '[-250, -200)'
when output < -128 then '[-200, -128)'
when output < -88  then '[-128, -88)'
when output < -18  then '[-88, -18)'
when output < 0    then '[-18, 0)'
when output = 0    then '0'
else 'else' end output_bin
, count(user_id) cnt
from data_table
group by case when output < -500 then '(, -500)'
when output < -250 then '[-500, -250)'
when output < -200 then '[-250, -200)'
when output < -128 then '[-200, -128)'
when output < -88  then '[-128, -88)'
when output < -18  then '[-88, -18)'
when output < 0    then '[-18, 0)'
when output = 0    then '0'
else 'else' end
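Repeating the whole CASE expression in GROUP BY works, but the two copies have to be kept in sync. One possible alternative, sketched here under assumptions, computes the bin once in a derived table and groups by its alias. The bins are shortened, the six rows of `data_table` are invented for the illustration, and Python's sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("CREATE TABLE data_table (user_id INTEGER, output INTEGER)")
conn.executemany("INSERT INTO data_table VALUES (?, ?)", [
    (1, -600), (2, -300), (3, -300), (4, -10), (5, 0), (6, 5),  # hypothetical rows
])

# The CASE expression appears once, inside the derived table "binned".
counts = dict(conn.execute("""
    SELECT output_bin, COUNT(user_id) AS cnt
    FROM (
        SELECT user_id,
               CASE WHEN output < -500 THEN '(, -500)'
                    WHEN output < -250 THEN '[-500, -250)'
                    WHEN output < 0    THEN '[-250, 0)'
                    WHEN output = 0    THEN '0'
                    ELSE 'else' END AS output_bin
        FROM data_table
    ) AS binned
    GROUP BY output_bin
""").fetchall())

print(counts)
```

Because the branches are checked in order, each WHEN only needs an upper bound: a value of -300 fails `output < -500` and is caught by `output < -250`.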

 

Origin blog.csdn.net/Cameback_Tang/article/details/108247243