Hive: losing data with left join

Recently, while writing HQL to pull data, I found that depending on where the filter conditions are written, a left join can end up behaving like a plain join, so the results are not what I expected.

Here are three HQL snippets for reference:

1. The first HQL snippet puts the filter conditions for both joined tables (or for all of them, if there are more) in the final where clause:

select a.cola1,a.cola2,b.colb1
from tablea a
left join  tableb b on a.id=b.id 
where
a.dt='20200826'
and b.dt='20200826'
;

Result: Written this way, the left join has no effect, and the result is the same as a plain join of the two tables. Rows of the main table that have no match in tableb are lost.
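The usual explanation is that where is applied after the join: for a main-table row with no match, every column of b comes back as NULL, and b.dt='20200826' is not true for NULL, so the row is filtered out. Below is a minimal sketch of that behaviour. It runs on SQLite through Python rather than on Hive, on the assumption that both follow the same standard outer join and NULL semantics here, and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: id 1 has a match in tableb, id 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table tablea (id integer, cola1 text, cola2 text, dt text);
    create table tableb (id integer, colb1 text, dt text);
    insert into tablea values (1, 'a1', 'a2', '20200826'),
                              (2, 'b1', 'b2', '20200826');
    insert into tableb values (1, 'x', '20200826');
""")

# The join result before any filter on b: the unmatched row carries NULLs.
rows_joined = conn.execute("""
    select a.cola1, b.colb1, b.dt
    from tablea a
    left join tableb b on a.id = b.id
    where a.dt = '20200826'
    order by a.id
""").fetchall()
print(rows_joined)  # [('a1', 'x', '20200826'), ('b1', None, None)]

# Same query with the b.dt filter in where: the NULL row is dropped.
rows_where = conn.execute("""
    select a.cola1, a.cola2, b.colb1
    from tablea a
    left join tableb b on a.id = b.id
    where a.dt = '20200826'
      and b.dt = '20200826'
    order by a.id
""").fetchall()
print(rows_where)  # [('a1', 'a2', 'x')]
```

The row with cola1 = 'b1' survives the join itself but disappears as soon as where tests b.dt.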

2. The second HQL snippet keeps only the main table's conditions in the where clause, and moves the conditions on the other tables into the on clause:

select a.cola1,a.cola2,b.colb1
from tablea a
left join  tableb b on a.id=b.id  and b.dt='20200826'
where
a.dt='20200826'
;

Result: Written this way, the left join works correctly and returns the expected result. No rows of the main table are lost.
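To see the difference on concrete rows, here is a small sketch of the second form. It again uses SQLite through Python as a stand-in for Hive (an assumption, not a Hive run), with made-up sample data in which id 2 exists in tableb only under a different dt.

```python
import sqlite3

# Hypothetical sample data: tableb has id 2, but only for another date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table tablea (id integer, cola1 text, cola2 text, dt text);
    create table tableb (id integer, colb1 text, dt text);
    insert into tablea values (1, 'a1', 'a2', '20200826'),
                              (2, 'b1', 'b2', '20200826');
    insert into tableb values (1, 'x', '20200826'),
                              (2, 'y', '20200825');
""")

# The condition on b sits in the on clause, so it only decides which
# b rows can match; it never removes rows of the main table.
rows_on = conn.execute("""
    select a.cola1, a.cola2, b.colb1
    from tablea a
    left join tableb b on a.id = b.id and b.dt = '20200826'
    where a.dt = '20200826'
    order by a.id
""").fetchall()
print(rows_on)  # [('a1', 'a2', 'x'), ('b1', 'b2', None)]
```

Both main-table rows are kept; the one without a matching b row for that date simply gets NULL for colb1.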

3. The third HQL snippet first uses subqueries with where to select the required rows from each table, and then left joins the two results:

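-- id has to be selected in both subqueries, because the on clause uses it
select a.cola1,a.cola2,b.colb1 from 
(select id,cola1,cola2 from tablea where dt='20200826')a
left join  
(select id,colb1 from tableb where dt='20200826')b 
on a.id=b.id 
;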

Result: Written this way, the left join also works correctly and returns the expected result. No rows of the main table are lost.

Conclusion:

The first form looks like a left join, but what actually gets executed is equivalent to a plain join, so the result may not be what you expect. Use it with caution. The second and third forms both return the expected result.
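The conclusion can be checked on a toy dataset. The sketch below runs the three forms plus a plain join side by side, using SQLite through Python instead of Hive (an assumption that the join semantics match), with hypothetical sample rows and a small hypothetical helper named run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table tablea (id integer, cola1 text, cola2 text, dt text);
    create table tableb (id integer, colb1 text, dt text);
    insert into tablea values (1, 'a1', 'a2', '20200826'),
                              (2, 'b1', 'b2', '20200826'),
                              (3, 'c1', 'c2', '20200825');
    insert into tableb values (1, 'x', '20200826'),
                              (2, 'y', '20200825'),
                              (3, 'z', '20200826');
""")

def run(sql):
    """Hypothetical helper: run a query and return all rows as a list."""
    return conn.execute(sql).fetchall()

cols = "select a.cola1, a.cola2, b.colb1 from "

# Form 1: conditions on both tables in where.
q1 = run(cols + """tablea a left join tableb b on a.id = b.id
                   where a.dt = '20200826' and b.dt = '20200826'
                   order by a.id""")

# Plain (inner) join with the same conditions, for comparison.
q_inner = run(cols + """tablea a join tableb b on a.id = b.id
                        where a.dt = '20200826' and b.dt = '20200826'
                        order by a.id""")

# Form 2: condition on b moved into the on clause.
q2 = run(cols + """tablea a left join tableb b
                   on a.id = b.id and b.dt = '20200826'
                   where a.dt = '20200826'
                   order by a.id""")

# Form 3: filter each table in a subquery, then left join.
q3 = run(cols + """(select id, cola1, cola2 from tablea
                    where dt = '20200826') a
                   left join
                   (select id, colb1 from tableb
                    where dt = '20200826') b
                   on a.id = b.id
                   order by a.id""")

print(q1 == q_inner)  # True
print(q2 == q3)       # True
print(q2)             # [('a1', 'a2', 'x'), ('b1', 'b2', None)]
```

On this data the first form returns only the 'a1' row, exactly like the plain join, while the second and third forms both keep the 'b1' row with a NULL colb1.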

As for why this happens at the deeper implementation level, my knowledge is limited and I don't have the energy to dig into it for now. I hope someone who understands it can enlighten me. Feel free to leave a message in the comment area!

Origin blog.csdn.net/qq_32103261/article/details/108304625