Pitfalls I have hit with join conditions in Hive's three kinds of join

    Recently I ran into some problems while using joins and picked up a new piece of knowledge, so I am writing it down here as a summary.

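The pitfall with where conditions:

Example: suppose a column holds the values 1, 2 and NULL (a real NULL, not the string 'NULL'). If the query is written like this: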

select id,date,name
  from aa
 where name <> 'us'

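then the row whose value is NULL will not be selected. The comparison NULL <> 'us' evaluates to NULL rather than true, and where keeps only the rows for which the condition is true.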

id date name
3914810511 2017-06-15 hk
3914851966 2017-06-15 hk

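The row whose name is NULL is not displayed.

In this situation the NULL has to be converted to a placeholder value first (one that never occurs in name) to get the intended result: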

select id,date,name
  from aa
 where coalesce(name,'') <> 'us'

Result:

id date name
3912115625 2017-06-15 NULL
3914810511 2017-06-15 hk
3914851966 2017-06-15 hk
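
The filter behaviour above can be reproduced outside Hive. The following is only a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in engine (SQLite applies the same three-valued logic to `<>`); the helper name `filter_names` is made up for this illustration, and the rows are the four sample rows of table aa.

```python
import sqlite3

def filter_names(condition):
    """Return the ids of rows in a sample table aa that pass the WHERE condition."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table aa (id integer, date text, name text)")
    conn.executemany(
        "insert into aa values (?, ?, ?)",
        [
            (3912115625, "2017-06-15", None),  # None is stored as SQL NULL
            (3914810511, "2017-06-15", "hk"),
            (3914851966, "2017-06-15", "hk"),
            (3920408638, "2017-06-15", "us"),
        ],
    )
    rows = conn.execute(
        f"select id from aa where {condition} order by id"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

# NULL <> 'us' evaluates to NULL, so the NULL row is filtered out.
plain = filter_names("name <> 'us'")
# After coalesce the NULL row compares as '' <> 'us', which is true.
fixed = filter_names("coalesce(name, '') <> 'us'")
print(plain)
print(fixed)
```

The first list holds only the two hk rows, while the second also contains 3912115625, the row whose name is NULL.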

Next, full join, where the same problem shows up. Here the table is full-joined with itself:

select a.id   as aid,
       a.date as adate,
       coalesce(a.name,b.name) as name,
       b.id   as bid,
       b.date as bdate
  from aa a
  full join aa b
    on a.name = b.name

Result:

aid adate name bid bdate
3912115625 2017-06-15 NULL NULL NULL
NULL NULL NULL 3912115625 2017-06-15
3914851966 2017-06-15 hk 3914851966 2017-06-15
3914851966 2017-06-15 hk 3914810511 2017-06-15
3914810511 2017-06-15 hk 3914851966 2017-06-15
3914810511 2017-06-15 hk 3914810511 2017-06-15
3920408638 2017-06-15 us 3920408638 2017-06-15

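Because NULL = NULL is not true, the NULL row never matches itself and appears twice, once from each side. So the join condition needs the same kind of handling: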

select a.id   as aid,
       a.date as adate,
       coalesce(a.name,b.name) as name,
       b.id   as bid,
       b.date as bdate
  from aa a
  full join aa b
    on coalesce(a.name,'') = coalesce(b.name,'')

Result:

aid adate name bid bdate
3912115625 2017-06-15 NULL 3912115625 2017-06-15
3914851966 2017-06-15 hk 3914851966 2017-06-15
3914851966 2017-06-15 hk 3914810511 2017-06-15
3914810511 2017-06-15 hk 3914851966 2017-06-15
3914810511 2017-06-15 hk 3914810511 2017-06-15
3920408638 2017-06-15 us 3920408638 2017-06-15
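
To check the row counts above without a Hive cluster, here is a rough sketch that again assumes Python's sqlite3 module as a stand-in. Older SQLite builds have no full join, so the sketch emulates it as a left join plus the right-side rows that found no partner. That emulation and the helper name `full_join_pairs` are assumptions of this example, not something Hive does.

```python
import sqlite3

ROWS = [
    (3912115625, "2017-06-15", None),  # None becomes SQL NULL
    (3914810511, "2017-06-15", "hk"),
    (3914851966, "2017-06-15", "hk"),
    (3920408638, "2017-06-15", "us"),
]

def full_join_pairs(cond):
    """Emulate `aa a full join aa b on <cond>` and return (a.id, b.id) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table aa (id integer, date text, name text)")
    conn.executemany("insert into aa values (?, ?, ?)", ROWS)
    sql = f"""
        select a.id, b.id
          from aa a
          left join aa b on {cond}
        union all
        select null, b.id
          from aa b
         where not exists (select 1 from aa a where {cond})
    """
    pairs = conn.execute(sql).fetchall()
    conn.close()
    return pairs

plain = full_join_pairs("a.name = b.name")
fixed = full_join_pairs("coalesce(a.name, '') = coalesce(b.name, '')")
print(len(plain), len(fixed))
```

With the plain condition the NULL row shows up twice, as (3912115625, None) and (None, 3912115625), giving 7 rows. With coalesce on both sides it pairs with itself and the total drops to 6, matching the two result tables.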

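With left join the NULL row is not lost, because every row of the left table is kept; it simply finds no match, so its b columns come back as NULL. I tested this as well: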

select a.id   as aid,
       a.date as adate,
       coalesce(a.name,b.name) as name,
       b.id   as bid,
       b.date as bdate
  from aa a
  left join aa b
    on a.name = b.name

Result:

aid adate name bid bdate
3912115625 2017-06-15 NULL NULL NULL
3914810511 2017-06-15 hk 3914851966 2017-06-15
3914810511 2017-06-15 hk 3914810511 2017-06-15
3914851966 2017-06-15 hk 3914851966 2017-06-15
3914851966 2017-06-15 hk 3914810511 2017-06-15
3920408638 2017-06-15 us 3920408638 2017-06-15
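
The left join result can be checked the same way. This is a small sketch, assuming Python's sqlite3 module in place of Hive; `left_join_pairs` is a hypothetical helper that returns only the two id columns.

```python
import sqlite3

def left_join_pairs(cond):
    """Run `aa a left join aa b on <cond>` on sample data; return (a.id, b.id) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table aa (id integer, date text, name text)")
    conn.executemany(
        "insert into aa values (?, ?, ?)",
        [
            (3912115625, "2017-06-15", None),  # the row with a NULL name
            (3914810511, "2017-06-15", "hk"),
            (3914851966, "2017-06-15", "hk"),
            (3920408638, "2017-06-15", "us"),
        ],
    )
    pairs = conn.execute(
        f"select a.id, b.id from aa a left join aa b on {cond}"
    ).fetchall()
    conn.close()
    return pairs

pairs = left_join_pairs("a.name = b.name")
# Left-side rows that found no partner carry None (NULL) in the b.id position.
unmatched = [a_id for a_id, b_id in pairs if b_id is None]
print(len(pairs), unmatched)
```

The NULL row is the only unmatched one: it is kept because it sits on the left side, not because NULL matched NULL.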

What about a plain join? I tested that too:

select a.id   as aid,
       a.date as adate,
       coalesce(a.name,b.name) as name,
       b.id   as bid,
       b.date as bdate
  from aa a
  join aa b
    on a.name = b.name

Result:

aid adate name bid bdate
3914810511 2017-06-15 hk 3914851966 2017-06-15
3914810511 2017-06-15 hk 3914810511 2017-06-15
3914851966 2017-06-15 hk 3914851966 2017-06-15
3914851966 2017-06-15 hk 3914810511 2017-06-15
3920408638 2017-06-15 us 3920408638 2017-06-15

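The same problem appears here: the row whose name is NULL is not shown. Applying coalesce to both sides of the join condition brings it back: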

select a.id   as aid,
       a.date as adate,
       coalesce(a.name,b.name) as name,
       b.id   as bid,
       b.date as bdate
  from aa a
  join aa b
    on coalesce(a.name,'') = coalesce(b.name,'')

Result:

aid adate name bid bdate
3912115625 2017-06-15 NULL 3912115625 2017-06-15
3914810511 2017-06-15 hk 3914851966 2017-06-15
3914810511 2017-06-15 hk 3914810511 2017-06-15
3914851966 2017-06-15 hk 3914851966 2017-06-15
3914851966 2017-06-15 hk 3914810511 2017-06-15
3920408638 2017-06-15 us 3920408638 2017-06-15
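
For the inner join, the effect of the condition can be compared side by side. The sketch below assumes Python's sqlite3 module as a stand-in for Hive, and `inner_join_count` is a made-up helper. The third condition uses SQLite's `is`, which treats two NULLs as equal. Hive documents a null-safe equality operator `<=>` with the same meaning; whether it fits as a join key in a given Hive version is something to verify on your own cluster.

```python
import sqlite3

def inner_join_count(cond):
    """Count the rows produced by `aa a join aa b on <cond>` on the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table aa (id integer, date text, name text)")
    conn.executemany(
        "insert into aa values (?, ?, ?)",
        [
            (3912115625, "2017-06-15", None),  # NULL name
            (3914810511, "2017-06-15", "hk"),
            (3914851966, "2017-06-15", "hk"),
            (3920408638, "2017-06-15", "us"),
        ],
    )
    (count,) = conn.execute(
        f"select count(*) from aa a join aa b on {cond}"
    ).fetchone()
    conn.close()
    return count

plain = inner_join_count("a.name = b.name")
with_coalesce = inner_join_count("coalesce(a.name, '') = coalesce(b.name, '')")
null_safe = inner_join_count("a.name is b.name")  # SQLite's null-safe comparison
print(plain, with_coalesce, null_safe)
```

The plain condition yields 5 rows and both NULL-aware conditions yield 6. A null-safe comparison avoids the main risk of the coalesce approach, namely a placeholder that collides with a real value in the column.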

That is all for this topic. If anything else comes up, I will add it in a later update.

Reposted from blog.csdn.net/Jarry_cm/article/details/88067631