Hive join operation optimization

1. Put the small table on the left side of the join

In each map/reduce stage of a join, Hive buffers the rows of the tables on the left in memory and streams the last (rightmost) table through the reducers. Putting the small table on the left and the large table last therefore greatly reduces memory consumption.
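As a minimal sketch of this ordering, assume two hypothetical tables that do not come from the original post: dim_country (small) and fact_orders (large). Listing the small table first means it is the one buffered, while the large table, listed last, is streamed:

```sql
-- Hypothetical tables for illustration only:
--   dim_country  (small lookup table)
--   fact_orders  (large fact table)
-- The small table comes first so it is buffered in memory;
-- the large table comes last so it is streamed.
SELECT o.order_id, c.country_name
FROM dim_country c
JOIN fact_orders o ON (o.country_code = c.country_code);
```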

 

2. Use the STREAMTABLE hint

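Mark the large table with the STREAMTABLE hint so that Hive streams that table instead of buffering it, wherever it appears in the join order. This reduces memory consumption without reordering the query. If the hint is omitted, Hive streams the rightmost table.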

 

SELECT /*+ STREAMTABLE(a) */ a.val, b.val, c.val FROM a JOIN b ON (a.key = b.key1) JOIN c ON (c.key = b.key1)
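The same idea can be sketched for a two-table join. The table names fact_orders (large) and dim_country (small) are hypothetical and only serve as an illustration. Here the large table is listed first, which would normally cause it to be buffered, but the hint asks Hive to stream it instead:

```sql
-- Hypothetical tables for illustration only:
--   fact_orders  (large fact table, aliased as o)
--   dim_country  (small lookup table, aliased as c)
-- Without the hint, Hive would stream the rightmost table (c).
-- The hint names the alias o, so the large table is streamed instead.
SELECT /*+ STREAMTABLE(o) */ o.order_id, c.country_name
FROM fact_orders o
JOIN dim_country c ON (o.country_code = c.country_code);
```

Note that the hint takes the table alias as used in the query, as in the example above where the hint names a.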

  


Origin www.cnblogs.com/yanximin/p/11317457.html