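Quick Start for Non-Technical Users

1. Basic queries

(1) Basic queries

Selecting a single column: write the column name after the SELECT keyword; the FROM keyword names the table to read from.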

select device_id from user_profile

Selecting multiple columns: the only difference is that several column names follow SELECT, separated by commas.

select device_id,gender,age,university from user_profile

Selecting all columns: use the asterisk (*) wildcard in place of the column names.

select * from user_profile

(2) Simple processing of query results

Removing duplicates: put the DISTINCT keyword in front of the columns to deduplicate.

select distinct university from user_profile

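Limiting the number of returned rows: the LIMIT clause caps the number of rows returned. It usually goes at the end of the statement and takes a number, written limit n, where n is the maximum number of rows to return.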

select device_id from user_profile limit 2

Renaming a column in the result: use the AS keyword to give a column an alias.

select device_id as user_infos_example from user_profile limit 2

2. Conditional queries

(1) Basic operators

WHERE clause: the WHERE clause filters rows by a condition.

select device_id,university from user_profile where university = '北京大学'

Comparison operators: not equal is written <> or !=; other comparisons such as > follow the same pattern.

select device_id,gender,age,university from user_profile where age>24

select device_id,gender,age,university from user_profile 
where university !='复旦大学'

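Ranges: use between n1 and n2, where n1 and n2 are the bounds of the range. Two points to note:

First, the value before and must not be greater than the value after it; otherwise the query returns an empty result.
Second, both endpoints are included (in Hive SQL and in standard SQL alike): between 10 and 20 also matches rows whose value equals 10 or 20.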

select device_id,gender,age from user_profile where age between 20 and 23
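
The two notes on between can be checked with a small experiment. The sketch below is only an illustration: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the real engine, and the table contents are made-up sample data.

```python
import sqlite3

# In-memory SQLite database standing in for the real engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table user_profile (device_id integer, age integer)")
conn.executemany(
    "insert into user_profile values (?, ?)",
    [(1, 19), (2, 20), (3, 22), (4, 23), (5, 24)],  # hypothetical sample rows
)

# Both endpoints (20 and 23) are part of the result.
inclusive = [row[0] for row in conn.execute(
    "select age from user_profile where age between 20 and 23 order by age")]

# With the bounds swapped, no row can satisfy the condition.
reversed_bounds = conn.execute(
    "select age from user_profile where age between 23 and 20").fetchall()

print(inclusive)        # [20, 22, 23]
print(reversed_bounds)  # []
```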

NULL values: a column that contains no value at all is said to contain NULL.

Testing for NULL: use the special WHERE conditions IS NULL and IS NOT NULL.

select device_id,gender,age,university from user_profile where age is not null

(2) Advanced operators

AND operator: to keep only rows that satisfy several conditions at once, attach further conditions to the WHERE clause with AND.

select device_id,gender,age,university,gpa from user_profile 
where gender='male' and gpa>3.5

OR operator: to keep rows that satisfy at least one of several conditions, connect the conditions with OR.

select device_id,gender,age,university,gpa from user_profile 
where university='北京大学' or gpa>3.7

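IN and NOT IN: the IN operator specifies a set of values, any of which may match. IN takes a comma-separated list of valid values enclosed in parentheses. The NOT operator in a WHERE clause has a single function: it negates the condition that follows it.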

select device_id,gender,age,university,gpa from user_profile 
where university in ('北京大学','复旦大学','山东大学')

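Mixing operators: when AND and OR appear together, precedence matters: SQL evaluates AND before OR. Parentheses make the intended grouping explicit.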

select device_id,gender,age,university,gpa from user_profile 
where gpa>3.5 and university = '山东大学' or gpa>3.8 and university = '复旦大学'
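
To see the precedence rule in action, the following sketch compares the unparenthesized condition with two parenthesized variants. It is a minimal illustration using Python's sqlite3 module as a stand-in engine; the rows and the short university codes SDU and FDU are hypothetical sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.execute("create table user_profile (device_id integer, university text, gpa real)")
conn.executemany(
    "insert into user_profile values (?, ?, ?)",
    [(1, "SDU", 3.6), (2, "FDU", 3.9), (3, "FDU", 3.6), (4, "SDU", 3.2)],
)

def ids(where_clause):
    """Return the device_ids matching the given WHERE clause, in order."""
    sql = f"select device_id from user_profile where {where_clause} order by device_id"
    return [row[0] for row in conn.execute(sql)]

# AND binds tighter than OR, so these two conditions are equivalent.
implicit = ids("gpa>3.5 and university='SDU' or gpa>3.8 and university='FDU'")
explicit = ids("(gpa>3.5 and university='SDU') or (gpa>3.8 and university='FDU')")

# Moving the parentheses changes the meaning of the condition.
regrouped = ids("gpa>3.5 and (university='SDU' or gpa>3.8) and university='FDU'")

print(implicit, explicit, regrouped)  # [1, 2] [1, 2] [2]
```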

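LIKE operator: fuzzy matching

LIKE is used together with wildcards. The most common wildcard is %, which stands for any character occurring any number of times. With a wildcard on each side of the search term, %北京% matches any value that contains the text 北京 at any position.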

select device_id,age,university from user_profile 
where university like '%北京%'

(3) Basic sorting

Sorting by one column: the ORDER BY clause takes the names of one or more columns and sorts the output by them.

select device_id,age from user_profile order by age

Sorting by several columns: list the column names separated by commas; rows are sorted by the columns in the order given.

select device_id,gpa,age from user_profile
order by gpa,age

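Ascending and descending: by default, ORDER BY sorts in ascending order (asc). To sort in descending order, specify the desc keyword. desc applies only to the column name directly before it: order by age desc,gpa sorts by age descending first and then by gpa ascending.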

select device_id,gpa,age from user_profile
order by gpa desc,age desc

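3. Advanced queries

(1) Aggregate functions

AVG: AVG() returns the average of a column, either over all rows or over the rows selected by a condition or group.
COUNT: COUNT() counts rows, either all rows of a table or the rows that meet a condition. COUNT(*) counts every row, whether its columns contain NULL or not. COUNT(column) counts only the rows that have a value in that column; NULL values are ignored.

MAX: MAX() returns the largest value in the given column. The column name goes inside the parentheses.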

select max(gpa) as gpa from user_profile where university='复旦大学'

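MIN: MIN() returns the smallest value in the given column; it also requires a column name.

SUM: SUM() returns the sum (total) of the values in the given column.

Rounding: round(value,n), where value is the column or expression to round and n is the number of decimal places to keep.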

select 
  count(gender) as male_num, 
  round(avg(gpa),1) as avg_gpa
from user_profile where gender='male'
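
The difference between COUNT(*) and COUNT(column) described above only shows up when a column contains NULL. The sketch below illustrates it with Python's sqlite3 module as a stand-in engine and a made-up three-row table; it is an illustration, not part of the original exercises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.execute("create table user_profile (device_id integer, gpa real)")
conn.executemany(
    "insert into user_profile values (?, ?)",
    [(1, 3.0), (2, None), (3, 4.0)],  # the second row has a NULL gpa
)

# count(*) counts every row; count(gpa) and avg(gpa) skip the NULL.
total_rows, rows_with_gpa, avg_gpa = conn.execute(
    "select count(*), count(gpa), avg(gpa) from user_profile").fetchone()

print(total_rows, rows_with_gpa, avg_gpa)  # 3 2 3.5
```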

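(2) Grouped queries

Grouped calculation: points to note when using GROUP BY:

1. A GROUP BY clause can contain any number of columns, so groups can be nested for finer-grained grouping.

2. Apart from aggregate expressions, every column in the SELECT list must also appear in the GROUP BY clause.

3. If the grouping column contains rows with NULL values, NULL is returned as a group. If several rows are NULL, they are grouped together.

4. The GROUP BY clause must come after the WHERE clause and before the ORDER BY clause.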

select 
  gender,university,
  count(device_id) as user_num,
  avg(active_days_within_30) as avg_active_days,
  avg(question_cnt) as avg_question_cnt
from user_profile
group by gender,university
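
Point 3 above (NULL forms its own group) is easy to overlook. A minimal sketch, again assuming Python's sqlite3 module as a stand-in engine and hypothetical sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.execute("create table user_profile (device_id integer, university text)")
conn.executemany(
    "insert into user_profile values (?, ?)",
    [(1, "A"), (2, None), (3, None), (4, "B"), (5, "A")],
)

# All rows whose university is NULL end up in one group keyed by NULL.
group_sizes = dict(conn.execute(
    "select university, count(*) from user_profile group by university"))

print(group_sizes[None], group_sizes["A"], group_sizes["B"])  # 2 2 1
```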

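Filtering groups

Besides grouping data with GROUP BY, SQL also lets you filter on the grouped result. Conditions on grouped results cannot be written in the WHERE clause; they go in the dedicated HAVING clause.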

select 
  university,
  avg(question_cnt) as avg_question_cnt,
  avg(answer_cnt) as avg_answer_cnt
from user_profile
group by university
having avg_question_cnt<5 or avg_answer_cnt<20

Aggregate functions cannot be used in a WHERE clause. The following condition is an error:

where sum(item_price*quantity)>=1000

HAVING must come after GROUP BY

select distinct order_num,sum(item_price*quantity) as total_price
from OrderItems
group by order_num
having sum(item_price*quantity)>=1000
order by order_num
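
The sketch below shows both halves of this rule: an aggregate in WHERE is rejected, while the same condition in HAVING works. It assumes Python's sqlite3 module as a stand-in engine (the exact error message differs between databases) and uses hypothetical order rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.execute("create table OrderItems (order_num integer, item_price real, quantity integer)")
conn.executemany(
    "insert into OrderItems values (?, ?, ?)",
    [(1, 100.0, 5), (1, 300.0, 2), (2, 50.0, 3), (3, 1000.0, 1)],
)

# An aggregate inside WHERE is rejected when the statement is prepared.
try:
    conn.execute(
        "select order_num from OrderItems "
        "where sum(item_price*quantity)>=1000 group by order_num")
    where_failed = False
except sqlite3.OperationalError:
    where_failed = True

# The same condition is valid in HAVING, which filters after grouping.
big_orders = conn.execute(
    "select order_num, sum(item_price*quantity) as total_price "
    "from OrderItems group by order_num "
    "having sum(item_price*quantity)>=1000 order by order_num").fetchall()

print(where_failed)  # True
print(big_orders)    # [(1, 1100.0), (3, 1000.0)]
```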

Sorting groups: ORDER BY

select university,avg(question_cnt) as avg_question_cnt
from user_profile
group by university
order by avg_question_cnt

4. Multi-table queries

(1) Subqueries

A subquery is a query nested inside another query.

select device_id,question_id,result
from question_practice_detail
where device_id in (
    select device_id from user_profile where university='浙江大学'
)
order by question_id

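(2) Join queries

Left join (left outer join), written left join ... on

Definition: a left join takes the left table as the base and joins the two tables on the condition given after ON. The result lists every row of the left table, and from the right table only the rows that satisfy the ON condition; left-table rows without a match get NULL in the right-table columns. Points to note when using left join:

With left join, the table written first is the base table for matching, and on gives the join condition, which may consist of several predicates. (When joining tables you almost always need a join condition; omitting it produces a Cartesian product.)
When joining, the two subqueries being joined are usually each wrapped in parentheses and followed by a short name that serves as the table alias.
When a column name exists in both tables, qualify it as table.column in the on condition to say which table it comes from. The same applies to the final select list: a.device_id, for example, says that device_id, which exists in both tables, is taken from table a.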

select university,difficult_level,
  round(count(qpd.question_id) / count(distinct qpd.device_id),4) as avg_answer_cnt
from question_practice_detail as qpd

left join user_profile as up 
on qpd.device_id=up.device_id
left join question_detail as qd
on qpd.question_id=qd.question_id

group by university,difficult_level

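Right join

A right outer join is the mirror image of a left outer join: it takes the right table as the base and returns all of its rows. In practice, left join is usually enough.

Inner join

An inner join returns only the combinations of rows from table A and table B that satisfy the join condition (their intersection). The keyword is join (or inner join).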

select university,
  count(question_id) / count(distinct qpd.device_id) as avg_answer_cnt
from question_practice_detail as qpd
inner join user_profile as up
on qpd.device_id=up.device_id
group by university
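
The difference between left join and inner join is clearest when one table has a row without a match. The following sketch assumes Python's sqlite3 module as a stand-in engine; the rows are hypothetical, and device 9 deliberately has no entry in user_profile.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.execute("create table question_practice_detail (device_id integer, question_id integer)")
conn.execute("create table user_profile (device_id integer, university text)")
conn.executemany("insert into question_practice_detail values (?, ?)",
                 [(1, 101), (1, 102), (2, 101), (9, 103)])
conn.executemany("insert into user_profile values (?, ?)", [(1, "A"), (2, "B")])

query = """
select qpd.device_id, up.university
from question_practice_detail as qpd
{join_type} join user_profile as up
on qpd.device_id=up.device_id
order by qpd.device_id, qpd.question_id
"""

# left join keeps device 9 and fills the missing university with NULL (None).
left_rows = conn.execute(query.format(join_type="left")).fetchall()
# inner join drops the unmatched row.
inner_rows = conn.execute(query.format(join_type="inner")).fetchall()

print(left_rows)   # [(1, 'A'), (1, 'A'), (2, 'B'), (9, None)]
print(inner_rows)  # [(1, 'A'), (1, 'A'), (2, 'B')]
```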

select university,difficult_level,
  round(count(qpd.question_id) / count(distinct qpd.device_id) ,4) as avg_answer_cnt
from question_practice_detail as qpd

inner join user_profile as up 
on qpd.device_id=up.device_id and up.university='山东大学'
inner join question_detail as qd
on qpd.question_id=qd.question_id

group by university,difficult_level

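(3) Combined queries

SQL can also run several queries (several SELECT statements) and return their results as a single result set. Such combined queries are usually called unions or compound queries.

union: combines the results of several SELECT statements into one result set. By default, UNION removes duplicate rows from the result.

union all: returns all matching rows without removing duplicates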

select device_id,gender,age,gpa from user_profile
where university='山东大学'

union all

select device_id,gender,age,gpa from user_profile
where gender='male'
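
A row that satisfies both SELECT statements shows the difference between union and union all. The sketch below assumes Python's sqlite3 module as a stand-in engine; the rows and the codes SDU and FDU are hypothetical sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.execute("create table user_profile (device_id integer, gender text, university text)")
conn.executemany(
    "insert into user_profile values (?, ?, ?)",
    [(1, "male", "SDU"), (2, "female", "SDU"), (3, "male", "FDU")],
)

template = """
select device_id from user_profile where university='SDU'
{op}
select device_id from user_profile where gender='male'
"""

# Device 1 matches both queries: union keeps it once, union all keeps it twice.
union_ids = sorted(row[0] for row in conn.execute(template.format(op="union")))
union_all_ids = sorted(row[0] for row in conn.execute(template.format(op="union all")))

print(union_ids)      # [1, 2, 3]
print(union_all_ids)  # [1, 1, 2, 3]
```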

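5. Essential common functions

(1) Conditional functions

IF function

IF is the most commonly used conditional function. It is written if(x=n,a,b), where x=n is the condition: if it holds, the result is a; otherwise it is b.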

select if(age<25 or age is null,'25岁以下','25岁及以上') as age_cut,
  count(*) as number
from user_profile
group by age_cut

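Case when

case when does much the same job as if: it replaces column values according to conditions. The difference is that case when can handle several conditions at once. Note that it must be closed with end. The simple form compares one test expression against a list of values; the examples below use the searched form, case when <condition> then ..., which has no test expression after CASE.

CASE test_expression
WHEN simple_expression_1 THEN result_expression_1
WHEN simple_expression_2 THEN result_expression_2
 …
WHEN simple_expression_n THEN result_expression_n
[ ELSE result_expression_n+1 ]
END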

select case when age<25 or age is null then '25岁以下'
            else '25岁及以上'
            end as age_cut,count(*) as number
from user_profile
group by age_cut

select device_id,gender,
  case when age<20 then '20岁以下'
       when age>=20 and age<=24 then '20-24岁'
       when age>=25 then '25岁及以上'
       else '其他'
       end as age_cut
from user_profile
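
The branches of a case when are tried from top to bottom, and a NULL age fails every comparison, so it falls through to else. A minimal sketch of the last query, assuming Python's sqlite3 module as a stand-in engine, hypothetical rows, and English labels in place of the original ones:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.execute("create table user_profile (device_id integer, age integer)")
conn.executemany("insert into user_profile values (?, ?)",
                 [(1, 18), (2, 22), (3, 27), (4, None)])

sql = """
select device_id,
  case when age<20 then 'under 20'
       when age>=20 and age<=24 then '20-24'
       when age>=25 then '25 and over'
       else 'other'
       end as age_cut
from user_profile
order by device_id
"""
age_cuts = [row[1] for row in conn.execute(sql)]

print(age_cuts)  # ['under 20', '20-24', '25 and over', 'other']
```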

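(2) Date functions

Converting between timestamps and dates

from_unixtime converts a timestamp into a date (the 'yyyy-MM-dd' pattern is the Hive form; MySQL writes the same format as '%Y-%m-%d')

from_unixtime(time,'yyyy-MM-dd') as time

unix_timestamp converts a date back into a timestamp

unix_timestamp('2021-08-01','yyyy-MM-dd') as time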

Extracting year, month, and day

SQL provides the extraction functions year(), month(), and day() for this purpose.

year('2021-08-01'),month('2021-08-01'),day('2021-08-01')

Date differences

datediff

datediff returns the number of days between two dates. The syntax is datediff(date1,date2), which returns date1 minus date2 in days: positive when date1 is later than date2 and negative when date1 is earlier.

datediff('2021-08-09','2021-08-01')

date_sub

The syntax is date_sub(string startdate, interval int day); it returns the date that is the given number of days before startdate.

date_sub('2021-08-09',interval 8 day)

date_add

The syntax is date_add(string startdate, interval int day); it returns the date that is the given number of days after startdate.

date_add('2021-08-01',interval 8 day)

select
  day(date) as day,
  count(question_id) as question_cnt
from question_practice_detail
where month(date)=8 and year(date)=2021
group by date
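
The date arithmetic behind datediff, date_sub, and date_add can be mirrored with Python's standard datetime module. This is only an analogy to make the sign convention and the results concrete; it does not call the SQL functions themselves.

```python
from datetime import date, timedelta

d1 = date(2021, 8, 9)
d2 = date(2021, 8, 1)

# datediff(date1, date2) corresponds to date1 - date2 in days.
diff_forward = (d1 - d2).days   # positive: d1 is later than d2
diff_backward = (d2 - d1).days  # negative: d2 is earlier than d1

# date_sub / date_add with "interval 8 day".
eight_days_before = d1 - timedelta(days=8)
eight_days_after = d2 + timedelta(days=8)

# year(), month(), day()
parts = (d2.year, d2.month, d2.day)

print(diff_forward, diff_backward)          # 8 -8
print(eight_days_before, eight_days_after)  # 2021-08-01 2021-08-09
print(parts)                                # (2021, 8, 1)
```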

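(3) Text functions

Length: length

length returns the length of the value in a text column

Concatenation: concat

concat joins two or more strings into a single string

Splitting: substring_index

substring_index(str, delim, count) splits a string on the given delimiter and returns everything before the count-th occurrence of the delimiter. A negative count counts from the end and returns everything after that delimiter, so -1 returns the last piece.

select SUBSTRING_INDEX('180,78kg',',',1) as height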

select 
  substring_index(profile,',',-1) as gender,
  count(*) as number
from user_submit
group by gender

select 
  substring_index(substring_index(profile,',',3),',',-1) as age,
  count(*) as number
from user_submit
group by age
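
The nested call in the last query is easier to follow once the behavior of substring_index is spelled out. The helper below is a hypothetical Python re-implementation written for illustration (it is not a library function), and the sample profile string is made up.

```python
def substring_index(s, delim, count):
    """Hypothetical Python mimic of SQL substring_index(str, delim, count)."""
    parts = s.split(delim)
    if count > 0:
        # Everything before the count-th delimiter from the left.
        return delim.join(parts[:count])
    if count < 0:
        # Everything after the count-th delimiter from the right.
        return delim.join(parts[count:])
    return ""

profile = "180cm,75kg,27,male"  # made-up sample value

height = substring_index("180,78kg", ",", 1)
gender = substring_index(profile, ",", -1)
# Inner call keeps the first three fields; outer call takes the last of them.
age = substring_index(substring_index(profile, ",", 3), ",", -1)

print(height, gender, age)  # 180 male 27
```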

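Locating: instr

instr(str,substr): returns the position of the first occurrence of substr in str, counting from 1; returns 0 if it does not occur

Extracting: substring

substr(string A, int start, int len) returns the substring of A that starts at position start and is len characters long

substring(string A, int start) returns, when no length is given, the substring of A from position start to the end

select substring('bacda',2,2) 
select substring('bacda',2)

upper converts a string to upper case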

select cust_id,cust_name,
  upper(concat(substring(cust_contact,1,2),substring(cust_city,1,3))) as user_login
from Customers

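(4) Window functions

desc means descending order

Window functions

A window function groups first and then sorts: row_number() over (partition by col1 order by col2) partitions the rows by col1 and sorts by col2 within each partition

<window function>() over (partition by <grouping column>
                order by <sorting column>)

Window functions include rank, dense_rank, row_number, and others

rank: tied rows share a rank and take up the following positions: 1 1 3
dense_rank: tied rows share a rank without taking up the following positions: 1 1 2
row_number: ties are ignored and every row gets its own number: 1 2 3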

select device_id,university,gpa
from(
    select device_id,university,gpa,
    row_number() over (partition by university order by gpa) as rk
    from user_profile
   ) as up
where up.rk=1
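
The three ranking functions differ only in how they treat ties. The sketch below runs all three over the same ordering; it assumes Python's sqlite3 module as a stand-in engine (window functions need SQLite 3.25 or newer) and four hypothetical rows, two of which are tied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.execute("create table user_profile (device_id integer, gpa real)")
conn.executemany("insert into user_profile values (?, ?)",
                 [(1, 3.9), (2, 3.9), (3, 3.7), (4, 3.5)])

sql = """
select rank()       over (order by gpa desc) as rk,
       dense_rank() over (order by gpa desc) as dense_rk,
       row_number() over (order by gpa desc) as row_num
from user_profile
order by row_num
"""
rows = conn.execute(sql).fetchall()
ranks = [r[0] for r in rows]
dense_ranks = [r[1] for r in rows]
row_numbers = [r[2] for r in rows]

print(ranks)        # [1, 1, 3, 4]
print(dense_ranks)  # [1, 1, 2, 3]
print(row_numbers)  # [1, 2, 3, 4]
```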

6. Comprehensive exercises

select up.device_id,university,
  count(question_id) as question_cnt,
  sum(if(qpd.result='right',1,0)) as right_question_cnt
from user_profile as up

left join question_practice_detail as qpd
     on qpd.device_id=up.device_id and month(qpd.date)=8
where university='复旦大学'
group by up.device_id

select qd.difficult_level,
  round(sum(if(qpd.result='right',1,0))/ count(qpd.result),4) as correct_rate
from user_profile as up
inner join question_practice_detail as qpd
on up.device_id=qpd.device_id and up.university='浙江大学'
inner join question_detail as qd
on qd.question_id=qpd.question_id

group by qd.difficult_level
order by correct_rate

select
  count(distinct device_id) as did_cnt,
  count(question_id) as question_cnt
from question_practice_detail
where year(date)=2021 and month(date)=8

SQL必知必会 (SQL Essentials)

SQL35 Return the total amount of each order for each customer

select o.cust_id,oi.total_ordered
from (
    select order_num,sum(item_price*quantity) as total_ordered
    from OrderItems
    group by order_num
)oi,Orders as o
where oi.order_num=o.order_num
order by total_ordered desc

select o.cust_id,sum(oi.item_price*oi.quantity) as total_ordered
from Orders as o

left join OrderItems as oi
on oi.order_num=o.order_num

group by o.cust_id
order by total_ordered desc

SQL36 Retrieve every product name from the Products table together with its total quantity sold

select prod_name,
  (select sum(quantity)
   from  OrderItems
   where OrderItems.prod_id=Products.prod_id
  ) as quant_sold
from Products

select prod_name,sum(o.quantity) as quant_sold
from  OrderItems as o,Products as p
where o.prod_id=p.prod_id
group by prod_name

SQL38 Return the customer name, the related order numbers, and the total price of each order

Note that the grouping uses two columns

select c.cust_name,o.order_num,sum(oi.quantity*oi.item_price) as OrderTotal
from Customers as c 
inner join Orders as o 
on c.cust_id=o.cust_id
inner join OrderItems as oi
on oi.order_num=o.order_num

group by c.cust_name,o.order_num
order by c.cust_name,o.order_num

SQL39 Determine which orders purchased the product with prod_id BR01 (part 2)

select cust_id,order_date 
from Orders
where order_num in (
  select order_num from OrderItems where prod_id='BR01') 

order by order_date

select cust_id,order_date 
from Orders
inner join OrderItems
on Orders.order_num=OrderItems.order_num
where prod_id='BR01'

order by order_date

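Advanced SQL Challenges

1. Insert, delete, and update operations

2. Table and index operations

3. Aggregation and grouped queries

4. Multi-table queries

(1) Nested subqueries

SQL20 Favorite categories of users who complete at least 3 exams per month on average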

select tag,count(exam_id) as tag_cnt
from exam_record
left join examination_info using(exam_id)
where uid in(
    select uid
    from exam_record
    where submit_time is not null
    group by uid
    having count(exam_id)/count(distinct date_format(submit_time,'%Y%m'))>=3
  )
group by tag
order by tag_cnt desc

SQL21 Number of users and average score on each exam's release day

select exam_id,count(distinct uid) as uv,round(avg(score),1) as avg_score
from exam_record
where (exam_id,date(start_time)) # keep only the attempts made on the release day
  in(select exam_id,date(release_time) # exam ID and release date
     from examination_info where tag='SQL'
    )and uid in(select uid from user_info where level>5)
group by exam_id
order by uv desc,avg_score

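(2) Union queries

SQL23 Number of users and number of attempts for each question and each exam

Method 1: sort by the first character from the left of the tid column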

select exam_id as tid,count(distinct exam_record.uid) as uv,
       count(*) as pv
from exam_record
group by exam_id
union
select question_id as tid,count(distinct practice_record.uid) as uv,
       count(*) as pv
from practice_record
group by question_id

order by left(tid,1) desc,uv desc,pv desc;

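Method 2: ordering with union. An order by clause may appear only once, at the very end of a union. To sort each part separately before the union, wrap each part in a derived table like this (note that MySQL may ignore an order by inside a derived table that has no limit, so the final row order is not guaranteed):

select * from
( select * from t1  order by <column> ) newt1 # the derived table must be given an alias, otherwise an error is raised 
union
select * from
( select * from t2  order by <column> ) newt2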

select * from
( select exam_id as tid,count(distinct exam_record.uid) as uv,
       count(*) as pv
from exam_record
group by exam_id
order by uv desc,pv desc ) t1 # the derived table must be given an alias, otherwise an error is raised 
union
select * from
( select question_id as tid,count(distinct practice_record.uid) as uv,
       count(*) as pv
from practice_record
group by question_id
order by uv desc,pv desc ) t2

SQL24 Users who qualify for each of the two activities

select uid,'activity1' as activity
from exam_record
where year(submit_time)=2021
group by uid
having min(score)>=85

union all

select uid,'activity2' as activity
from exam_record
join examination_info using(exam_id)
where year(submit_time)=2021 and difficulty='hard' and score>80 
      and timestampdiff(minute,start_time,submit_time)<duration/ 2

order by uid

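(3) Join queries

5. Window functions

(1) Dedicated window functions

SQL27 Top 3 scorers for each exam category

Rank the users within each tag with a grouped window function. Key point: row_number() over (partition by ...). The ranking priority is each user's highest score, descending; then each user's lowest score, descending; and finally the user ID, descending. Key points: order by, min(), max().

Whenever a problem mentions "the most recent N", "consecutive", "top N per category", or "top N of each", think of window functions first. The most common ones are listed here.

All window functions share the same syntax:

<window function> OVER ( partition by <grouping column> order by <sorting column>)

1) Ranking window functions: there are three ways to rank

rank() over() 1 2 2 4 4 6 (ties share a rank and the following ranks are skipped, so there is no 3 and no 5)
row_number() over() 1 2 3 4 5 6 (every row gets a unique number)
dense_rank() over() 1 2 2 3 3 4 (ties share a rank and no rank is skipped; think of it as counting distinct values)

2) Aggregate functions: aggregate functions are usually the first thing that comes to mind when looking for a maximum or minimum.

a. Common companions of group by: it is usually combined with the following aggregate functions

avg() -- average
count() -- count
sum() -- sum
max() -- maximum
min() -- minimum

b. An advanced use of group by is to combine it with with rollup.

3) Joins and unions

Left join: table1 left join table2 on table1.col = table2.col (all rows of table1 are kept and table2 is matched against them)

Right join: table1 right join table2 on table1.col = table2.col (all rows of table2 are kept and table1 is matched against them)

Union: query1 union all query2 (both queries must return the same number of columns; union removes duplicate rows, union all keeps them)

Inner join: table1 inner join table2 (only the rows that match in both tables)

Full outer join: table1 full outer join table2 (all rows from both tables, with NULL where one side has no match; MySQL does not support full outer join directly)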

select tag,uid,ranking
from (
    select tag,e_r.uid,
    row_number() over (partition by tag order by tag,max(score) desc,min(score) desc,e_r.uid desc) as ranking
    from exam_record  e_r join examination_info e_i using(exam_id)
    group by tag,e_r.uid
)ranktable
where ranking <=3

SQL30 Completion statistics of users with zero unfinished exams in their most recent three months

select uid, count(score) as  exam_complete_cnt
from(
    select uid,start_time,score,
           dense_rank() over (partition by uid order by date_format(start_time,'%Y%m') desc) as recent_months
    from exam_record
)table1
where recent_months<=3
group by uid
having count(score) = count(uid)
order by exam_complete_cnt desc,uid desc

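(2) Aggregate window functions

Aggregate window functions are used much like aggregates with GROUP BY, except that the rows are not collapsed.

MIN() OVER(): computes the minimum while keeping every row of the table
MAX() OVER(): computes the maximum while keeping every row of the table
COUNT() OVER(): counts while keeping every row of the table
SUM() OVER(): sums while keeping every row of the table
AVG() OVER(): averages while keeping every row of the table

SQL33 Min-max normalization of exam scores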

select uid,exam_id,round(avg(max_min),0) as avg_new_score
from(
    select uid,exam_id,score,
    if(max_x = min_x, score, 100*(score-min_x)/ (max_x-min_x)) as max_min
    from(
        select uid,e_r.exam_id as exam_id,score,
               min(score) over (partition by exam_id) as min_x,
               max(score) over (partition by exam_id) as max_x
        from exam_record e_r  join examination_info e_i using(exam_id)
        where difficulty = 'hard' and score is not NULL
    )table1
)table2
group by exam_id,uid
order by exam_id,avg_new_score desc

SQL34 Monthly attempts per exam and cumulative attempts up to each month

select exam_id,date_format(start_time,'%Y%m') as start_month,
       count(start_time) as month_cnt,
       sum(count(start_time)) over (partition by exam_id order by date_format(start_time,'%Y%m')) as cum_exam_cnt
from exam_record
group by exam_id, start_month;
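
The cumulative column comes from sum() used as a window function: with an order by inside over, the sum runs from the first month of each exam up to the current one. The sketch below reproduces the idea with Python's sqlite3 module as a stand-in engine (SQLite 3.25 or newer). As a simplification, the month is stored directly in a hypothetical start_month column and the monthly counts are computed in a derived table first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.execute("create table exam_record (exam_id integer, start_month text)")
conn.executemany(
    "insert into exam_record values (?, ?)",
    [(9001, "202101"), (9001, "202101"), (9001, "202102"),
     (9001, "202103"), (9001, "202103"), (9001, "202103"),
     (9002, "202101")],
)

sql = """
select exam_id, start_month, month_cnt,
       sum(month_cnt) over (partition by exam_id order by start_month) as cum_exam_cnt
from (
    select exam_id, start_month, count(*) as month_cnt
    from exam_record
    group by exam_id, start_month
) as monthly
order by exam_id, start_month
"""
running_totals = conn.execute(sql).fetchall()

for row in running_totals:
    print(row)
# (9001, '202101', 2, 2)
# (9001, '202102', 1, 3)
# (9001, '202103', 3, 6)
# (9002, '202101', 1, 1)
```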

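6. Other common operations

(1) Handling NULL values

SQL37 Average time taken and average score of level-0 users on hard exams

Unfinished attempts are counted with the exam's maximum duration and a score of 0

There are two ways to handle NULL values

# Method 1: coalesce(A,B) returns B if A is NULL, otherwise A

# Method 2: if(A is NULL,B,C) returns B if A is NULL, otherwise C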

select uid,
       round(avg(new_score),0) as avg_score,
       round(avg(cost_time),1) avg_time_took
from (
    select e_r.uid as uid,
           if(score is NULL,0,score) as new_score,
           if(submit_time is NULL,duration,timestampdiff(minute,start_time,submit_time)) as cost_time
    from exam_record e_r join examination_info e_i using(exam_id)
    join user_info u_i using(uid)
    where level=0 and difficulty = 'hard'
)table1
group by uid
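
The two NULL-handling methods give the same result. The sketch below checks this with Python's sqlite3 module as a stand-in engine; because if() is not assumed to be available there, an equivalent case when expression takes its place. The two rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.execute("create table exam_record (uid integer, score integer)")
conn.executemany("insert into exam_record values (?, ?)", [(1, 80), (2, None)])

sql = """
select coalesce(score, 0) as by_coalesce,
       case when score is null then 0 else score end as by_case
from exam_record
order by uid
"""
# Each row holds the score as produced by the two methods.
filled_scores = conn.execute(sql).fetchall()

print(filled_scores)  # [(80, 80), (0, 0)]
```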

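(2) Advanced conditional statements

SQL39 Filter attempt records by nickname rule and exam rule

rlike can be followed by a regular expression. The regular expression "^[0-9]+$" breaks down as follows:

1. ^ : anchors the match at the start of the string.

For example, ^A does not match the 'A' in "an A", but matches the leading 'A' in "An A".

2. $ : like ^, but anchors the match at the end of the string.

For example, t$ does not match the 't' in "eater", but matches the 't' in "eat".

3. [0-9] : a character class that matches any one of the listed characters. A hyphen specifies a range of characters.

For example, [abc] is the same as [a-c]. Both match the 'b' in "brisket" and the 'a' and 'c' in "ache".

4. + : matches the preceding element one or more times. Equivalent to {1,}.

For example, a+ matches the 'a' in "candy" and all the 'a's in "caaaaaaandy".

Problem: for users whose nickname is either "牛客" + digits + "号" or digits only, find the IDs and average scores of the completed exams whose category starts with the letter c (such as C, C++, c#), sorted by user ID and then by average score, ascending.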

select uid, exam_id, round(avg(score),0) avg_score 
from exam_record
where uid in (select uid from user_info 
              where nick_name rlike "^牛客[0-9]+号$" or nick_name rlike "^[0-9]+$") 
              and exam_id in (select exam_id 
                              from examination_info
                              where tag rlike "^[cC]") 
        and score is not null
group by uid, exam_id
order by uid, avg_score;
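
The anchors and character classes used in the query can be tried out with Python's re module, whose syntax agrees with rlike for these simple patterns. This is an illustration of the patterns only; the sample strings are made up.

```python
import re

digits_only = re.compile(r"^[0-9]+$")  # the whole string consists of digits
starts_with_c = re.compile(r"^[cC]")   # first character is c or C

digit_checks = [bool(digits_only.search(s)) for s in ["12345", "123a", ""]]
tag_checks = [bool(starts_with_c.search(s)) for s in ["C++", "c#", "SQL", "Scala"]]

print(digit_checks)  # [True, False, False]
print(tag_checks)    # [True, True, False, False]
```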

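(3) Limiting results

SQL43 Page three of the list of users who completed an exam on their registration day

Problem: find the users whose job direction is algorithm engineer and who completed an algorithm exam on the day they registered, ranked by the highest score across all the exams they have taken. The ranking is long, so it is shown in pages of 3 rows each; return the users on page 3 (page numbers start at 1).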

select uid,level,register_time,max(score) as max_score
from exam_record e_r join examination_info e_i using(exam_id)
                     join user_info u_i using(uid)
where job = '算法' and date(register_time)=date(submit_time) and tag='算法'
group by uid
order by max_score desc limit 6,3
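
In limit 6,3 the first number is how many rows to skip and the second is how many to return; for page p with n rows per page the offset is (p-1)*n. A minimal sketch, assuming Python's sqlite3 module as a stand-in engine, a hypothetical ranking table, and the equivalent limit ... offset ... spelling:

```python
import sqlite3

def page_offset(page, page_size):
    """Number of rows to skip for a 1-based page number."""
    return (page - 1) * page_size

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
conn.execute("create table ranking (uid integer, max_score integer)")
# Ten hypothetical users; uid 1 has the highest score, uid 10 the lowest.
conn.executemany("insert into ranking values (?, ?)",
                 [(uid, 100 - uid) for uid in range(1, 11)])

page_size = 3
offset = page_offset(3, page_size)
page3 = [row[0] for row in conn.execute(
    "select uid from ranking order by max_score desc limit ? offset ?",
    (page_size, offset))]

print(offset)  # 6
print(page3)   # [7, 8, 9]
```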

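(4) Text conversion functions

SQL44 Repair records whose columns ran together

Problem: by mistake, the person entering the data put the exam category tag, the difficulty, and the duration of some records all into the tag column. Find these records and output them split into the correct column types.

Select the wrongly entered records: where tag like '%,%'
Extract the tag, the value before the first comma: substring_index(tag, ',', 1) as tag
Extract the difficulty, the value before the second comma and after the last comma of that prefix: substring_index(substring_index(tag, ',', 2), ',', -1) as difficulty
Extract the duration, the value after the last comma, and convert its type: cast( substring_index(tag, ',', -1) as decimal ) as duration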

select exam_id,
       substring_index(tag, ',', 1) as tag,
       substring_index(substring_index(tag, ',', 2), ',', -1) as difficulty,
       cast( substring_index(tag, ',', -1) as decimal ) as duration
from examination_info
where tag like '%,%';

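SQL45 Truncate nicknames that are too long

Problem: output the users whose nickname has more than 10 characters; for nicknames with more than 13 characters, output the first 10 characters followed by three dots: "...".

Condition for nicknames with more than 13 characters: char_length(nick_name)>13,

First 10 characters followed by three dots: concat(substr(nick_name, 1, 10),'...')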

select uid, 
       if(char_length(nick_name)>13, concat(substr(nick_name, 1, 10),'...'), nick_name) as nick_name
from user_info
where char_length(nick_name)>10;

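SQL46 Filtering and counting when letter case is inconsistent

Problem: the exam category tag may be written with inconsistent letter case. First select the tags with fewer than 3 attempts, then report, for each of them, the attempt count of the tag obtained by converting it to upper case.

If converting to upper case does not change the tag, do not output that row.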

with t_tag_count as(
    select tag, count(uid) as answer_cnt
    from exam_record left join examination_info using(exam_id)
    group by tag
)

select a.tag as tag, b.answer_cnt as answer_cnt
from t_tag_count as a join t_tag_count as b       # self-join of t_tag_count
                   on upper(a.tag)=b.tag and a.tag!=b.tag and a.answer_cnt < 3 ;
