A pitfall to watch for in Hive WHERE clauses

select b.keyword, count(distinct a.uid) as count
from
(select uid, keyword from mbadp.t_dw_star_interest where dt = 20180613) a
join
(select distinct imei, dt, keyword
 from mbadp.t_ods_news_user_behavior
 where dt = 20180613
 and ( keyword like '%奔驰GLA'
    or keyword like '%奔驰gla%'
 )
) b
on a.uid = b.imei
group by b.keyword

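Without the parentheses around the two LIKE conditions, AND binds more tightly than OR, so Hive parses the condition as: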

where (dt =20180613 and keyword like '%奔驰GLA') or keyword like '%奔驰gla%'                     
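
The grouping can be checked without touching any table, using constant boolean expressions. This is only a minimal sketch: it assumes a Hive version that allows SELECT without a FROM clause (0.13.0 or later), and the column aliases are made up for illustration.

```sql
-- AND has higher precedence than OR, so the first expression groups as
-- (false and false) or true, while the second keeps the OR inside the AND.
select
  (false and false or true)   as without_parens,  -- evaluates to true
  (false and (false or true)) as with_parens;     -- evaluates to false
```

The first `false` stands in for a partition filter that does not match. Without parentheses the row still passes, because the OR branch alone is enough.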

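The condition after OR is independent of the conditions before it, so it carries no partition filter at all! Hive therefore has to read every partition of the table, which amounts to a full table scan and is very slow.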
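
One way to check which partitions a query will read is Hive's EXPLAIN DEPENDENCY, sketched below for the inner query from this post. It assumes that dt is the partition column of mbadp.t_ods_news_user_behavior, as the discussion above implies. The exact output format depends on the Hive version.

```sql
-- Lists the tables and partitions the query depends on, without running it.
-- With the parentheses in place, only the dt = 20180613 partition should
-- be listed. Remove them and compare the partition list.
explain dependency
select distinct imei, dt, keyword
from mbadp.t_ods_news_user_behavior
where dt = 20180613
  and (keyword like '%奔驰GLA' or keyword like '%奔驰gla%');
```

Running this before the real query is a cheap way to catch a missing partition filter.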




Reposted from blog.csdn.net/weixin_38987362/article/details/80734504