MySQL: Union

1. Union

    1) Function: Combine the result sets from two query into one single result set.

# Single result set 1
select * from stu where stu_score > 40;

# Single result set 2
select * from stu where stu_score < 35;

# Combine result sets without union
select * from stu where stu_score < 35 || stu_score > 40;

# Combine result sets with union
select * from stu where stu_score > 40 union select * from stu where stu_score < 35;

     2) Compliance of Union

# The column number of the result sets of the two queries should be the same
# But it's recommended that the structure of the two result sets be the same

     3) Usage scenario

# In the ECShop project
# Comments are stored in two tables: 1) comment[For common comment] 2) feedback[For problem argue]
# The structure of the two tables are not that same.
# But the compliance is that the result set of the two queries should be the same.
# But in the message panel we need to see all the two types' comment. So it's best to use union instead of using two SQL statements and then combine the data in java/php/c code.

2. An example for exmplaining the usage of union

#1. Table ta
select * from ta;
+----+-----+
| id | num |
+----+-----+
| a  |   5 |
| b  |  10 |
| c  |  15 |
| d  |  10 |
+----+-----+

#2. Table tb
select * from tb;
+----+-----+
| id | num |
+----+-----+
| b  |   5 |
| c  |  10 |
| d  |  20 |
| e  |  99 |
+----+-----+

#3. Union result of the two tables
select * from ta union select * from tb;
+----+-----+
| id | num |
+----+-----+
| a  |   5 |
| b  |  10 |
| c  |  15 |
| d  |  10 |
| b  |   5 |
| c  |  10 |
| d  |  20 |
| e  |  99 |
+----+-----+

#4. Combine the num whose id are the same
select id, sum(num) from (select * from ta union select * from tb) as temp group by id;
+----+----------+
| id | sum(num) |
+----+----------+
| a  | 5        |
| b  | 15       |
| c  | 25       |
| d  | 30       |
| e  | 99       |
+----+----------+

3. A pitfall when using union

# 1. Table ta
select * from ta;
+----+-----+
| id | num |
+----+-----+
| a  |   5 |
| b  |  10 |
| c  |  15 |
| d  |  10 |
+----+-----+

# 2. Table tb
select * from tb;
+----+-----+
| id | num |
+----+-----+
| b  |   5 |
| c  |  15 |
| d  |  20 |
| e  |  99 |
+----+-----+

# 3. Union result
select * from ta union select * from tb;
+----+-----+
| id | num |
+----+-----+
| a  |   5 |
| b  |  10 |
| c  |  15 |
| d  |  10 |
| b  |   5 |
| d  |  20 |
| e  |  99 |
+----+-----+

# Pay attention to the row whose id = c;
# When ta.id == tb.id && ta.num == tb.num <Every column on current row are the same value>
# Then the union result will combine the two rows together to reduce redundancy;
# We can use key word union all instead of union to suppress auto combination.

select * from ta union all select * from tb;
+----+-----+
| id | num |
+----+-----+
| a  |   5 |
| b  |  10 |
| c  |  15 |
| d  |  10 |
| b  |   5 |
| c  |  15 |
| d  |  20 |
| e  |  99 |
+----+-----+

4. Another pitfall when using union

# 1. Common pitfall
select * from ta order by id desc union all select * from tb order by id desc;
ERROR 1221 : Incorrect usage of UNION and ORDER BY

# 2. Reason
# Because union all don't know how to order the two result set.

# 3. Solution: Add () to separate two result set to eliminate ambiguity.
# But that wouldn't affect the order of the final result. Ordered result set a union order result b, the result wouldn't be ordered even in each sub-result-set
# It's recommanded using order by or limit at the end of the final result set.
## Not recommanded:
(select * from ta order by id desc) union all (select * from tb order by id desc);
## Recommanded:
(select * from ta) union all (select * from tb) order by id desc;

# Comment:
(select * from ta order by id desc limit 4) union all (select * from tb order by id desc limit 5);
# It works!

# Reason:
# It's pointless union two ordered result set who is ordered wihout any purpose. Then it would be optimized by compiler saying order is useless in each sub-result-set.
# It's better practice union two ordered result set who is ordered with specific purpose. Such as get the largest three id or smallest three id. It wouldn't be optimized by compiler.

Comments:

    1) The column name of the SQL result are not forced to be the same. But the type of the column should be the same. And the result of the union operation, the name of the column is set as the left table's column name.

    2) In the previous example, if table ta's id column is primary key for ta table and if table tb's id column is primary key for tb table. When we union the two table there would be no error occurs. Because the result is just a result set (virtual table) not a real table whose id mustn't be duplicated.

猜你喜欢

转载自davyjones2010.iteye.com/blog/1846696