1. Basic concepts
join is one of the most commonly used keywords in SQL, but it is not easy to understand. The examples below give a simple explanation using two tables, user1 and user2:
table1 : create table user1(id int, user_name varchar(10), over varchar(10));
insert into user1 values(1, 'tangseng', 'dtgdf');
insert into user1 values(2, 'sunwukong', 'dzsf');
insert into user1 values(3, 'zhubajie', 'jtsz');
insert into user1 values(4, 'shaseng', 'jslh');
table2 : create table user2(id int, user_name varchar(10), over varchar(10));
insert into user2 values(1, 'sunwukong', 'chengfo');
insert into user2 values(2, 'niumowang', 'chengyao');
insert into user2 values(3, 'jiaomowang', 'chengyao');
insert into user2 values(4, 'pengmowang', 'chengyao');
Types of Join in the SQL standard
- Inner join (inner join or join)
(1) Concept: An inner join combines the columns of two tables based on a join predicate to produce a new result table. Only pairs of rows that satisfy the predicate appear in the result.
(2) Inner join Venn diagram:
(3) SQL statement:
select a.id, a.user_name, b.over
from user1 a
inner join user2 b
on a.user_name=b.user_name;
result:
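The inner join above can be reproduced with a minimal sketch using Python's built-in sqlite3 module as a stand-in database. It assumes user1's ids run from 1 to 4, and it quotes the over column because OVER is also a window-function keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user1(id int, user_name varchar(10), "over" varchar(10));
insert into user1 values (1, 'tangseng', 'dtgdf'), (2, 'sunwukong', 'dzsf'),
                         (3, 'zhubajie', 'jtsz'), (4, 'shaseng', 'jslh');
create table user2(id int, user_name varchar(10), "over" varchar(10));
insert into user2 values (1, 'sunwukong', 'chengfo'), (2, 'niumowang', 'chengyao'),
                         (3, 'jiaomowang', 'chengyao'), (4, 'pengmowang', 'chengyao');
""")

# Only 'sunwukong' exists in both tables, so only that row survives the join.
inner_rows = conn.execute("""
    select a.id, a.user_name, b."over"
    from user1 a
    inner join user2 b on a.user_name = b.user_name
""").fetchall()
print(inner_rows)  # [(2, 'sunwukong', 'chengfo')]
```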
- outer join
Outer joins include left outer join, right outer join, and full outer join.
a. Left outer join: left join or left outer join
(1) Concept: The result set of a left outer join includes all rows of the left table specified in the LEFT OUTER clause, not just the rows matched by the join column. If a row in the left table has no matching row in the right table, all select list columns of the right table will be null in the associated result set row.
(2) Left outer join Venn diagram:
(3) SQL statement:
select a.id, a.user_name, b.over
from user1 a
left join user2 b
on a.user_name=b.user_name;
result:
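A minimal sketch of the left outer join, again run through Python's sqlite3 module under the assumption that user1's ids run from 1 to 4. The over column is quoted because OVER is also a keyword. Rows of user1 with no match in user2 come back with None (SQL NULL) in the right-table column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user1(id int, user_name varchar(10), "over" varchar(10));
insert into user1 values (1, 'tangseng', 'dtgdf'), (2, 'sunwukong', 'dzsf'),
                         (3, 'zhubajie', 'jtsz'), (4, 'shaseng', 'jslh');
create table user2(id int, user_name varchar(10), "over" varchar(10));
insert into user2 values (1, 'sunwukong', 'chengfo'), (2, 'niumowang', 'chengyao'),
                         (3, 'jiaomowang', 'chengyao'), (4, 'pengmowang', 'chengyao');
""")

# "order by" is added only to make the printed order deterministic.
left_rows = conn.execute("""
    select a.id, a.user_name, b."over"
    from user1 a
    left join user2 b on a.user_name = b.user_name
    order by a.id
""").fetchall()
for row in left_rows:
    print(row)
```

All four user1 rows are returned; only the sunwukong row carries a non-null value from user2.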
b. Right outer join: right join or right outer join
(1) Right outer join is the reverse join of left outer join. All rows from the right table will be returned. If a row in the right table has no matching row in the left table, NULL will be returned for the left table.
(2) Right outer join Venn diagram:
(3) SQL statement:
select b.user_name, b.over, a.over
from user1 a
right join user2 b
on a.user_name=b.user_name;
result:
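Because a right outer join is the mirror image of a left outer join, the same rows can be obtained by swapping the table order and using left join. The sketch below does exactly that in Python's sqlite3 module (SQLite only gained RIGHT JOIN in version 3.39, so the swapped form is the portable one here). The table data follows the article; the quoting of over is an addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user1(id int, user_name varchar(10), "over" varchar(10));
insert into user1 values (1, 'tangseng', 'dtgdf'), (2, 'sunwukong', 'dzsf'),
                         (3, 'zhubajie', 'jtsz'), (4, 'shaseng', 'jslh');
create table user2(id int, user_name varchar(10), "over" varchar(10));
insert into user2 values (1, 'sunwukong', 'chengfo'), (2, 'niumowang', 'chengyao'),
                         (3, 'jiaomowang', 'chengyao'), (4, 'pengmowang', 'chengyao');
""")

# Equivalent to "user1 a right join user2 b": every user2 row is kept,
# and user1 columns are NULL where no user_name matches.
right_rows = conn.execute("""
    select b.user_name, b."over", a."over"
    from user2 b
    left join user1 a on a.user_name = b.user_name
    order by b.id
""").fetchall()
for row in right_rows:
    print(row)
```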
c. Full outer join: full join or full outer join
(1) A full outer join returns all rows from both the left and right tables. When a row has no matching row in the other table, the select list columns of the other table contain null values. When rows match, the result row contains the data values from both base tables.
(2) Full outer join Venn diagram:
(3) SQL statement:
select a.id, a.user_name, b.over
from user1 a
full join user2 b
on a.user_name=b.user_name;
MySQL does not support full join, and running the statement above reports error 1064. Instead, combine a left join and a right join with union, which removes the matched rows that both halves return:
select a.user_name,a.over,b.over
from user1 a left join user2 b
on a.user_name = b.user_name
union select b.user_name,a.over,b.over
from user1 a
right join user2 b
on a.user_name = b.user_name;
result:
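The full join emulation can be checked with a small sketch in Python's sqlite3 module. As assumptions of this sketch, the right join half is written as a left join with the tables swapped (so it also runs on SQLite versions without RIGHT JOIN), the over column is quoted, and both halves list their columns in the same order. It also compares union with union all to show why the choice matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user1(id int, user_name varchar(10), "over" varchar(10));
insert into user1 values (1, 'tangseng', 'dtgdf'), (2, 'sunwukong', 'dzsf'),
                         (3, 'zhubajie', 'jtsz'), (4, 'shaseng', 'jslh');
create table user2(id int, user_name varchar(10), "over" varchar(10));
insert into user2 values (1, 'sunwukong', 'chengfo'), (2, 'niumowang', 'chengyao'),
                         (3, 'jiaomowang', 'chengyao'), (4, 'pengmowang', 'chengyao');
""")

# {op} is filled with "union" or "union all" below.
template = """
    select a.user_name, a."over", b."over"
    from user1 a left join user2 b on a.user_name = b.user_name
    {op}
    select b.user_name, a."over", b."over"
    from user2 b left join user1 a on a.user_name = b.user_name
"""

union_rows = conn.execute(template.format(op="union")).fetchall()
union_all_rows = conn.execute(template.format(op="union all")).fetchall()

print(len(union_rows))      # 7: 3 left-only + 1 matched + 3 right-only
print(len(union_all_rows))  # 8: the matched 'sunwukong' row appears twice
```

With union, the matched row is returned once, which is what a full outer join would produce for this data. Note that union also collapses any rows that are genuinely identical, so it is only a stand-in for full join when such duplicates do not occur.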
- Cartesian join (cross join)
1. Concept: A cross join without a WHERE clause produces the Cartesian product of the tables involved in the join. The size of the result set equals the number of rows in the first table multiplied by the number of rows in the second table. (The cross join of user1 and user2 produces 4*4=16 records.)
2. Syntax: cross join (with no ON condition)
3. SQL statement:
select a.user_name,b.user_name, a.over, b.over
from user1 a cross join user2 b;
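The 4*4=16 row count can be confirmed with a short sketch in Python's sqlite3 module. Only the user_name columns are created here, a simplification made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user1(user_name varchar(10));
insert into user1 values ('tangseng'), ('sunwukong'), ('zhubajie'), ('shaseng');
create table user2(user_name varchar(10));
insert into user2 values ('sunwukong'), ('niumowang'), ('jiaomowang'), ('pengmowang');
""")

# Every user1 row is paired with every user2 row.
cross_rows = conn.execute("""
    select a.user_name, b.user_name
    from user1 a cross join user2 b
""").fetchall()
print(len(cross_rows))  # 16
```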
2. Usage tips
- Using join to update the table
The following statement sets the over field to 'qtds' for the records in user1 whose user_name also appears in user2.
update user1 set over='qtds'
where user1.user_name in (
select b.user_name
from user1 a
inner join user2 b
on a.user_name = b.user_name);
This statement runs correctly in SQL Server and Oracle, but MySQL reports an error (1093) because it does not allow updating a table that the same statement also selects from in a subquery. In MySQL, use a join update instead:
update user1 a
join (
select b.user_name
from user1 a
join user2 b
on a.user_name = b.user_name) b
on a.user_name = b.user_name
set a.over = 'qtds';
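To see which rows the update is meant to touch, here is a minimal sketch in Python's sqlite3 module. SQLite does not support MySQL's update ... join syntax, so the sketch runs the subquery form, which SQLite accepts. The ids 1 to 4 and the quoting of over are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user1(id int, user_name varchar(10), "over" varchar(10));
insert into user1 values (1, 'tangseng', 'dtgdf'), (2, 'sunwukong', 'dzsf'),
                         (3, 'zhubajie', 'jtsz'), (4, 'shaseng', 'jslh');
create table user2(id int, user_name varchar(10), "over" varchar(10));
insert into user2 values (1, 'sunwukong', 'chengfo'), (2, 'niumowang', 'chengyao'),
                         (3, 'jiaomowang', 'chengyao'), (4, 'pengmowang', 'chengyao');
""")

# Update only the user1 rows whose user_name also exists in user2.
conn.execute("""
    update user1 set "over" = 'qtds'
    where user1.user_name in (
        select b.user_name
        from user1 a
        inner join user2 b on a.user_name = b.user_name)
""")

updated = conn.execute(
    'select user_name, "over" from user1 order by id').fetchall()
for row in updated:
    print(row)
```

Only the sunwukong row changes; the MySQL join update should leave the table in the same state.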
- Use join to optimize subqueries
A correlated subquery in the select list is often inefficient, because the subquery may be evaluated once for every outer row. Consider the following query:
select a.user_name, a.over,(
select over
from user2 b
where a.user_name=b.user_name)
as over2 from user1 a;
A left join returns the same result, provided user_name is unique in user2:
select a.user_name, a.over, b.over as over2
from user1 a
left join user2 b
on a.user_name = b.user_name;
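A quick way to gain confidence in such a rewrite is to run both forms and compare the results. The sketch below does this in Python's sqlite3 module, with the over column quoted and an order by added for a stable comparison (both are additions for this example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user1(id int, user_name varchar(10), "over" varchar(10));
insert into user1 values (1, 'tangseng', 'dtgdf'), (2, 'sunwukong', 'dzsf'),
                         (3, 'zhubajie', 'jtsz'), (4, 'shaseng', 'jslh');
create table user2(id int, user_name varchar(10), "over" varchar(10));
insert into user2 values (1, 'sunwukong', 'chengfo'), (2, 'niumowang', 'chengyao'),
                         (3, 'jiaomowang', 'chengyao'), (4, 'pengmowang', 'chengyao');
""")

# Form 1: scalar subquery evaluated per user1 row.
subquery_rows = conn.execute("""
    select a.user_name, a."over",
           (select "over" from user2 b where a.user_name = b.user_name) as over2
    from user1 a
    order by a.id
""").fetchall()

# Form 2: the left join rewrite.
join_rows = conn.execute("""
    select a.user_name, a."over", b."over" as over2
    from user1 a
    left join user2 b on a.user_name = b.user_name
    order by a.id
""").fetchall()

print(subquery_rows == join_rows)  # True
```

If user2 contained two rows with the same user_name, the two forms would diverge: the join would return one output row per match.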
- Use join to optimize aggregate subqueries
Introduce a new table:
user_kills
create table user_kills(user_id int, timestr varchar(20), kills int(10));
insert into user_kills values(2, '2015-5-12', 20);
insert into user_kills values(2, '2015-5-15', 18);
insert into user_kills values(3, '2015-5-11', 16);
insert into user_kills values(3, '2015-5-14', 13);
insert into user_kills values(3, '2015-5-16', 17);
insert into user_kills values(4, '2015-5-12', 16);
insert into user_kills values(4, '2015-5-10', 13);
To find, for each person in user1, the date with the most kills in the user_kills table, use an aggregate subquery:
select a.user_name,b.timestr, b.kills
from user1 a
join user_kills b
on a.id = b.user_id
where b.kills = (
select MAX(c.kills)
from user_kills c
where c.user_id = b.user_id);
Use a join to optimize the aggregate subquery (avoiding the subquery):
select a.user_name, b.timestr, b.kills
from user1 a
join user_kills b
on a.id = b.user_id
join user_kills c
on c.user_id = b.user_id
group by a.user_name, b.timestr, b.kills
having b.kills = max(c.kills);
result:
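Both forms of the query can be run side by side in a minimal sketch using Python's sqlite3 module. The sketch assumes user1's ids run from 1 to 4 so that they line up with user_kills.user_id, and it adds an order by so the comparison is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user1(id int, user_name varchar(10));
insert into user1 values (1, 'tangseng'), (2, 'sunwukong'),
                         (3, 'zhubajie'), (4, 'shaseng');
create table user_kills(user_id int, timestr varchar(20), kills int);
insert into user_kills values
    (2, '2015-5-12', 20), (2, '2015-5-15', 18),
    (3, '2015-5-11', 16), (3, '2015-5-14', 13), (3, '2015-5-16', 17),
    (4, '2015-5-12', 16), (4, '2015-5-10', 13);
""")

# Form 1: correlated aggregate subquery.
agg_subquery_rows = conn.execute("""
    select a.user_name, b.timestr, b.kills
    from user1 a
    join user_kills b on a.id = b.user_id
    where b.kills = (select max(c.kills) from user_kills c
                     where c.user_id = b.user_id)
    order by a.user_name
""").fetchall()

# Form 2: self-join on user_kills with group by / having.
agg_join_rows = conn.execute("""
    select a.user_name, b.timestr, b.kills
    from user1 a
    join user_kills b on a.id = b.user_id
    join user_kills c on c.user_id = b.user_id
    group by a.user_name, b.timestr, b.kills
    having b.kills = max(c.kills)
    order by a.user_name
""").fetchall()

for row in agg_join_rows:
    print(row)
```

tangseng has no rows in user_kills, so the inner join drops him from both results.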
- Implement group selection data
(This requirement gave me a lot of trouble at work. The cases vary, so I will put together a version covering more of them later.)
The requirement: for each person in user1, find the two days with the most kills.
First, the following statement finds the two days on which one person had the most kills:
select a.user_name, b.timestr, b.kills
from user1 a
join user_kills b
on a.id = b.user_id
where a.user_name ='sunwukong'
order by b.kills desc limit 2;
How can one statement return the two days with the most kills for every person? Consider the following statement:
WITH tmp AS (
select a.user_name, b.timestr, b.kills, ROW_NUMBER() over(
partition by a.user_name order by b.kills desc) cnt
from user1 a join user_kills b on a.id = b.user_id)
select * from tmp where cnt <= 2;
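The ranking query can be tried in Python's sqlite3 module, which supports window functions and WITH when the bundled SQLite is version 3.25 or later. This sketch assumes user1's ids run from 1 to 4 and ranks with order by b.kills desc, so that row number 1 is the day with the most kills; ranking in ascending order would select the two lowest days instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user1(id int, user_name varchar(10));
insert into user1 values (1, 'tangseng'), (2, 'sunwukong'),
                         (3, 'zhubajie'), (4, 'shaseng');
create table user_kills(user_id int, timestr varchar(20), kills int);
insert into user_kills values
    (2, '2015-5-12', 20), (2, '2015-5-15', 18),
    (3, '2015-5-11', 16), (3, '2015-5-14', 13), (3, '2015-5-16', 17),
    (4, '2015-5-12', 16), (4, '2015-5-10', 13);
""")

# Number each person's days from most to fewest kills, then keep the top two.
top2_rows = conn.execute("""
    with tmp as (
        select a.user_name, b.timestr, b.kills,
               row_number() over (
                   partition by a.user_name order by b.kills desc) as cnt
        from user1 a join user_kills b on a.id = b.user_id)
    select user_name, timestr, kills
    from tmp
    where cnt <= 2
    order by user_name, kills desc
""").fetchall()
for row in top2_rows:
    print(row)
```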
Both SQL Server and Oracle support the statement above, but MySQL 5.7 and earlier support neither the window function ROW_NUMBER() nor WITH for defining a temporary result set (both arrived in MySQL 8.0). An alternative for those versions is provided below:
select d.user_name,c.timestr, kills
from (select user_id, timestr, kills, (
select count(*)
from user_kills b
where b.user_id = a.user_id and a.kills <= b.kills) as cnt from user_kills a
group by user_id, timestr, kills) c
join user1 d
on c.user_id = d.id
where cnt <= 2;
Result:
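The counting approach works by assigning each row a rank equal to the number of rows for the same user whose kills are greater than or equal to its own. A minimal sketch in Python's sqlite3 module, assuming user1's ids run from 1 to 4 and adding an order by for a stable output:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table user1(id int, user_name varchar(10));
insert into user1 values (1, 'tangseng'), (2, 'sunwukong'),
                         (3, 'zhubajie'), (4, 'shaseng');
create table user_kills(user_id int, timestr varchar(20), kills int);
insert into user_kills values
    (2, '2015-5-12', 20), (2, '2015-5-15', 18),
    (3, '2015-5-11', 16), (3, '2015-5-14', 13), (3, '2015-5-16', 17),
    (4, '2015-5-12', 16), (4, '2015-5-10', 13);
""")

# cnt = how many of this user's rows have kills >= this row's kills,
# so cnt is 1 for the best day, 2 for the second best, and so on.
count_rows = conn.execute("""
    select d.user_name, c.timestr, c.kills
    from (select a.user_id, a.timestr, a.kills,
                 (select count(*)
                  from user_kills b
                  where b.user_id = a.user_id
                    and a.kills <= b.kills) as cnt
          from user_kills a) c
    join user1 d on c.user_id = d.id
    where c.cnt <= 2
    order by d.user_name, c.kills desc
""").fetchall()
for row in count_rows:
    print(row)
```

One difference from ROW_NUMBER() is tie handling: if a user had two days tied for second place, both would get cnt = 3 under this counting rule and neither would be returned.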
Reprinted from Script House