Data Analyst ---- SQL Strengthening (1)

Preface

Recently, while job hunting, I found that most written exams for data analyst positions involve SQL, but the SQL in those exams is at a different level of difficulty from what we usually encounter in our studies. The exam questions are closer to real business scenarios, which is still quite difficult for fresh graduates (or maybe I'm just not good enough yet).
This SQL column records the more valuable questions I have encountered in interviews or while practicing problems. I hope it helps you, and I hope you will like and follow.

Problem

Using a single SQL statement, extract the behavior characteristics of all users for each product. The characteristics are: purchased, purchased but not favorited, favorited but not purchased, and favorited and purchased.

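Order table: orders (the queries below use its columns user_id, item_id, and pay_time)
Favorites table: favorites (the queries below use its columns user_id, item_id, and fav_time)
Final output: one row per user and product, with a 0/1 flag column for each of the four characteristics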

Analysis of the problem:
From the problem statement, it is clear that this is a multi-table join problem. After joining the two tables, we classify each row according to the values of its fields.
[Image: overview of the knowledge points involved in multi-table queries]

Step 1: Join the tables and concatenate the results

From the analysis above, this is a full outer join of two tables. Oracle supports FULL OUTER JOIN directly, but MySQL does not, so we can emulate it by concatenating the results of two queries with the UNION ALL keyword.

	select o.user_id, o.item_id,o.pay_time,f.fav_time
	from orders o left join favorites f 
	on o.user_id = f.user_id and o.item_id = f.item_id
	UNION ALL
	select f.user_id, f.item_id,o.pay_time,f.fav_time
	from orders o right join favorites f 
	on o.user_id = f.user_id and o.item_id = f.item_id
	where o.user_id is null

Note that the emulated full join uses a left outer join plus a right outer join that is filtered to favorites with no matching order (where o.user_id is null). Because the second query does not repeat any row returned by the first, the two results can be merged directly with UNION ALL. From an optimization point of view, UNION ALL is also more efficient than UNION, since UNION has to do extra work to remove duplicate rows.
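The reasoning above can be checked on a small example. The following is a minimal sketch, not the author's setup: it uses Python's built-in sqlite3 module instead of MySQL, the sample rows are made up, and the second branch is written as a LEFT JOIN starting from favorites (equivalent to the RIGHT JOIN with the IS NULL filter) because older SQLite versions do not support RIGHT JOIN. Only the table and column names come from the queries in this article.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders    (user_id INTEGER, item_id INTEGER, pay_time TEXT);
    CREATE TABLE favorites (user_id INTEGER, item_id INTEGER, fav_time TEXT);
    INSERT INTO orders VALUES
        (1, 101, '2023-01-01 10:00:00'),
        (1, 102, '2023-01-02 11:00:00');
    INSERT INTO favorites VALUES
        (1, 101, '2022-12-31 09:00:00'),
        (2, 103, '2023-01-03 12:00:00');
""")

# Branch 1: every order, with the matching favorite if there is one.
# Branch 2: favorites that have no matching order at all.
BRANCHES = """
    SELECT o.user_id, o.item_id, o.pay_time, f.fav_time
    FROM orders o LEFT JOIN favorites f
      ON o.user_id = f.user_id AND o.item_id = f.item_id
    {op}
    SELECT f.user_id, f.item_id, o.pay_time, f.fav_time
    FROM favorites f LEFT JOIN orders o
      ON o.user_id = f.user_id AND o.item_id = f.item_id
    WHERE o.user_id IS NULL
"""


def run(op):
    """Run both branches combined with `op` and sort by (user_id, item_id)."""
    rows = conn.execute(BRANCHES.format(op=op)).fetchall()
    return sorted(rows, key=lambda r: r[:2])


rows_union_all = run("UNION ALL")
rows_union = run("UNION")

for row in rows_union_all:
    print(row)

# The branches are disjoint, so UNION's deduplication changes nothing here.
print(rows_union_all == rows_union)
```

With this sample data the emulated full join returns three rows: one pair present in both tables, one only in orders (fav_time is NULL), and one only in favorites (pay_time is NULL). UNION and UNION ALL agree, which is why the cheaper UNION ALL is sufficient.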

Step 2: Create a new column and fill in the values

From the query result above, we can see that whether a user purchased or favorited a product can be determined from the payment time and the favorite time: a NULL value means the action did not happen.

Use CASE WHEN to differentiate the cases:

select distinct user_id,item_id,
case when pay_time is not null then 1 else 0 end '已购买',  -- purchased
case when pay_time is not null and fav_time is null then 1 else 0 end '购买未收藏',  -- purchased, not favorited
case when pay_time is null and fav_time is not null then 1 else 0 end '收藏未购买',  -- favorited, not purchased
case when pay_time is not null and fav_time is not null then 1 else 0 end '收藏且购买'  -- favorited and purchased
from (
	select o.user_id, o.item_id,o.pay_time,f.fav_time
	from orders o left join favorites f 
	on o.user_id = f.user_id and o.item_id = f.item_id
	UNION ALL
	select f.user_id, f.item_id,o.pay_time,f.fav_time
	from orders o right join favorites f 
	on o.user_id = f.user_id and o.item_id = f.item_id
	where o.user_id is null
) tmp
order by user_id, item_id;
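To see the flags on concrete rows, here is a minimal sketch under stated assumptions: it runs on Python's built-in sqlite3 module rather than MySQL, the sample rows are invented, the column aliases are English stand-ins for the Chinese ones, and the second branch uses a LEFT JOIN from favorites in place of the RIGHT JOIN. In this sketch the "favorited and purchased" flag requires both timestamps to be non-NULL.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders    (user_id INTEGER, item_id INTEGER, pay_time TEXT);
    CREATE TABLE favorites (user_id INTEGER, item_id INTEGER, fav_time TEXT);
    INSERT INTO orders VALUES
        (1, 101, '2023-01-01 10:00:00'),
        (1, 101, '2023-01-05 10:00:00'),  -- same item bought a second time
        (1, 102, '2023-01-02 11:00:00');
    INSERT INTO favorites VALUES
        (1, 101, '2022-12-31 09:00:00'),
        (2, 103, '2023-01-03 12:00:00');
""")

# English aliases are illustrative stand-ins for the article's column names.
QUERY = """
SELECT DISTINCT user_id, item_id,
    CASE WHEN pay_time IS NOT NULL THEN 1 ELSE 0 END AS purchased,
    CASE WHEN pay_time IS NOT NULL AND fav_time IS NULL
         THEN 1 ELSE 0 END AS purchased_not_favorited,
    CASE WHEN pay_time IS NULL AND fav_time IS NOT NULL
         THEN 1 ELSE 0 END AS favorited_not_purchased,
    CASE WHEN pay_time IS NOT NULL AND fav_time IS NOT NULL
         THEN 1 ELSE 0 END AS favorited_and_purchased
FROM (
    SELECT o.user_id, o.item_id, o.pay_time, f.fav_time
    FROM orders o LEFT JOIN favorites f
      ON o.user_id = f.user_id AND o.item_id = f.item_id
    UNION ALL
    SELECT f.user_id, f.item_id, o.pay_time, f.fav_time
    FROM favorites f LEFT JOIN orders o
      ON o.user_id = f.user_id AND o.item_id = f.item_id
    WHERE o.user_id IS NULL
) tmp
ORDER BY user_id, item_id
"""

flags = conn.execute(QUERY).fetchall()
for row in flags:
    print(row)
```

User 1 bought item 101 twice, so the joined data contains two rows for that pair. Both rows produce identical flags, and SELECT DISTINCT collapses them into one output row, which is one reason the DISTINCT keyword is useful in this query.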

The same checks can also be written with the IF function:

select distinct user_id,item_id,
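if(pay_time is not null,1,0) '已购买',  -- purchased
if(pay_time is not null and fav_time is null,1,0) '购买未收藏',  -- purchased, not favorited
if(pay_time is null and fav_time is not null,1,0) '收藏未购买',  -- favorited, not purchased
if(pay_time is not null and fav_time is not null,1,0) '收藏且购买'  -- favorited and purchased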
from (
	select o.user_id, o.item_id,o.pay_time,f.fav_time
	from orders o left join favorites f 
	on o.user_id = f.user_id and o.item_id = f.item_id
	UNION ALL
	select f.user_id, f.item_id,o.pay_time,f.fav_time
	from orders o right join favorites f 
	on o.user_id = f.user_id and o.item_id = f.item_id
	where o.user_id is null
) tmp
order by user_id, item_id;

Summary

This question mainly tests multi-table queries, full outer joins, UNION, UNION ALL, and CASE WHEN. The difficulty is knowing where to start: how to combine the data from the two tables and how to add the new columns, especially if you have only ever processed data in a single table.
When we encounter this kind of problem, we can break it down step by step: first join the two tables, then add columns according to the field conditions.

Origin blog.csdn.net/qq_52007481/article/details/130160211