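We all know that NULL cannot be used in equality tests, nor in greater-than or less-than comparisons. The rule is easy to recite, but do you actually account for it when writing real SQL?
For example, here is a table named tab with 5 rows, 2 of which have name = 'sam':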
mysql> desc tab;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     | YES  |     | NULL    |       |
| col1  | int(11)     | YES  |     | NULL    |       |
| name  | varchar(10) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
3 rows in set (0.00 sec)
mysql> select count(*) from tab;
+----------+
| count(*) |
+----------+
|        5 |
+----------+
1 row in set (0.00 sec)
mysql> select count(*) from tab where name='sam';
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)
I want to fetch every row whose name is not 'sam', so I expect to get 3 rows. Would you write the SQL like this?
select * from tab where name <> 'sam'
Let's actually run it:
mysql> select * from tab where name <> 'sam';
+------+------+------+
| id   | col1 | name |
+------+------+------+
|    1 |    1 | tom  |
|    3 |    1 | ken  |
+------+------+------+
2 rows in set (0.00 sec)
Only 2 rows come back, one fewer than expected. Why? Because the name column also contains a NULL value:
mysql> select name from tab;
+------+
| name |
+------+
| sam  |
| tom  |
| sam  |
| ken  |
| NULL |
+------+
5 rows in set (0.00 sec)
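A quick way to find out whether a column holds NULLs before filtering on it is to compare COUNT(*) with COUNT(name), because COUNT on a column skips NULL values. The following is a minimal sketch against the tab table above; the column aliases are names I made up for illustration.

```sql
-- COUNT(*) counts every row; COUNT(name) counts only rows where name is not NULL.
-- In MySQL, (name IS NULL) evaluates to 1 or 0, so SUM() tallies the NULL rows.
SELECT COUNT(*)          AS total_rows,
       COUNT(name)       AS non_null_names,
       SUM(name IS NULL) AS null_names
FROM tab;
```

With the five rows listed above, this should report 5, 4 and 1 respectively.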
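So why does select * from tab where name <> 'sam' return only two rows? When NULL is compared with 'sam', NULL is indeed not 'sam', but comparing NULL with any value yields NULL rather than true or false. So for the row whose name is NULL, where name <> 'sam' effectively becomes where NULL. WHERE keeps only the rows whose condition is true, so where NULL filters the row out just as where 0 would, and the row is not returned. The correct SQL is therefore: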
mysql> select * from tab where name <> 'sam' or name is null;
+------+------+------+
| id   | col1 | name |
+------+------+------+
|    1 |    1 | tom  |
|    3 |    1 | ken  |
|    4 |    1 | NULL |
+------+------+------+
3 rows in set (0.00 sec)
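The behaviour can be checked directly by selecting the comparison results themselves. The sketch below also shows an alternative spelling of the corrected filter that uses MySQL's NULL-safe equality operator <=>; this is one possible variant, not the only correct form, and <=> is MySQL-specific.

```sql
-- Ordinary comparisons involving NULL yield NULL; only IS NULL yields a definite answer.
-- The first three columns come back as NULL, the last one as 1.
SELECT NULL = 'sam', NULL <> 'sam', NULL = NULL, NULL IS NULL;

-- <=> never returns NULL: (NULL <=> 'sam') is 0, so NOT (...) is 1
-- and the row with a NULL name is kept.
SELECT * FROM tab WHERE NOT (name <=> 'sam');
```

The second query returns the same three rows (tom, ken and the NULL row) as the name <> 'sam' or name is null version above.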
As another example, take the following two tables, tab and tab1, where tab1 lacks the id=2 row that tab has:
mysql> select * from tab;
+------+------+------+
| id   | col1 | name |
+------+------+------+
|    1 |    1 | sam  |
|    1 |    1 | tom  |
|    2 |    1 | sam  |
|    3 |    1 | ken  |
|    4 |    1 | NULL |
+------+------+------+
5 rows in set (0.00 sec)
mysql> select * from tab1;
+------+------+------+
| id   | col1 | name |
+------+------+------+
|    1 |    1 | sam  |
|    1 |    1 | tom  |
|    3 |    1 | ken  |
|    4 |    1 | NULL |
+------+------+------+
4 rows in set (0.00 sec)
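To follow along, the two tables shown above can be recreated with something like the script below. It is a sketch reconstructed from the desc and select output in this post: the column types follow the desc output, and no keys or indexes are assumed because none are shown.

```sql
-- Column definitions follow the desc output; newer MySQL versions
-- display INT without the (11) display width.
CREATE TABLE tab (
    id   INT,
    col1 INT,
    name VARCHAR(10)
);

INSERT INTO tab (id, col1, name) VALUES
    (1, 1, 'sam'),
    (1, 1, 'tom'),
    (2, 1, 'sam'),
    (3, 1, 'ken'),
    (4, 1, NULL);

-- tab1 is tab without the id=2 row.
-- Filtering with <> is safe here only because id contains no NULL values.
CREATE TABLE tab1 LIKE tab;
INSERT INTO tab1 SELECT * FROM tab WHERE id <> 2;
```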
The result of a left join looks like this:
mysql> select * from tab a left join tab1 b on a.id=b.id;
+------+------+------+------+------+------+
| id   | col1 | name | id   | col1 | name |
+------+------+------+------+------+------+
|    1 |    1 | sam  |    1 |    1 | sam  |
|    1 |    1 | tom  |    1 |    1 | sam  |
|    1 |    1 | sam  |    1 |    1 | tom  |
|    1 |    1 | tom  |    1 |    1 | tom  |
|    3 |    1 | ken  |    3 |    1 | ken  |
|    4 |    1 | NULL |    4 |    1 | NULL |
|    2 |    1 | sam  | NULL | NULL | NULL |
+------+------+------+------+------+------+
7 rows in set (0.00 sec)
Now suppose you want to pick out, from this result, the rows where name is the same on both sides, and you write the following SQL:
select * from tab a left join tab1 b on a.id=b.id where a.name=b.name
Were you expecting to get the following data?
+------+------+------+------+------+------+
| id   | col1 | name | id   | col1 | name |
+------+------+------+------+------+------+
|    1 |    1 | sam  |    1 |    1 | sam  |
|    1 |    1 | tom  |    1 |    1 | tom  |
|    3 |    1 | ken  |    3 |    1 | ken  |
|    4 |    1 | NULL |    4 |    1 | NULL |
+------+------+------+------+------+------+
But the actual result is:
mysql> select * from tab a left join tab1 b on a.id=b.id where a.name=b.name;
+------+------+------+------+------+------+
| id   | col1 | name | id   | col1 | name |
+------+------+------+------+------+------+
|    1 |    1 | sam  |    1 |    1 | sam  |
|    1 |    1 | tom  |    1 |    1 | tom  |
|    3 |    1 | ken  |    3 |    1 | ken  |
+------+------+------+------+------+------+
3 rows in set (0.01 sec)
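Why? Because for id=4 the name is NULL on both sides, and null = null does not hold: it evaluates to NULL, not true.
In fact, in select * from tab a left join tab1 b on a.id=b.id where a.name=b.name; the left join ends up equivalent to an inner join. The rows of tab that have no match in tab1 (here id=2) get NULL in b.name, so the where a.name=b.name condition can never be true for them and they are filtered out as well.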
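If the intent really is to treat two NULL names as equal, one option in MySQL is the NULL-safe equality operator <=>. The queries below are a sketch of that approach using the tab and tab1 tables from this post; they are one possible fix rather than the only one, and <=> is MySQL-specific (standard SQL spells the same idea IS NOT DISTINCT FROM).

```sql
-- <=> returns 1 when both operands are NULL, and 0 (never NULL) when only one is.
-- This keeps the id=4 row, where name is NULL on both sides.
SELECT *
FROM tab a
LEFT JOIN tab1 b ON a.id = b.id
WHERE a.name <=> b.name;

-- To keep LEFT JOIN semantics (rows of tab without a match are preserved),
-- put the name condition in ON instead of WHERE.
SELECT *
FROM tab a
LEFT JOIN tab1 b ON a.id = b.id AND a.name <=> b.name;
```

With the data above, the first query returns the four rows that were originally expected, including id=4. The second returns five rows: the same four plus the id=2 row of tab padded with NULLs on the tab1 side.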