Hive query syntax

1, Hive's query syntax is largely similar to MySQL's

2, Sorting in Hive

order by: global ordering. All data is sent to a single reducer, which can be very inefficient on large data sets, so use it with caution.
sort by: sorts within each reducer, so the output is locally ordered but not globally ordered.
distribute by: partitions rows by hash. A hash value is computed from the specified field, and that value decides which reducer each row is sent to:
(field.hashCode() & Integer.MAX_VALUE) % numReduceTasks
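
The three clauses above can be sketched as follows. This is only an illustration: the table employees(name, dept, salary), the choice of four reducers, and the use of Hive's built-in hash() function to mimic the formula are all assumptions, not part of the original notes.

```sql
-- Hypothetical table: employees(name STRING, dept STRING, salary DOUBLE).
-- Ask for four reducers (MapReduce engine) so that local vs. global order is visible.
SET mapreduce.job.reduces=4;

-- order by: one reducer produces a single, globally sorted result.
SELECT name, salary FROM employees ORDER BY salary DESC;

-- sort by: each reducer sorts only the rows it receives.
SELECT name, salary FROM employees SORT BY salary DESC;

-- distribute by: rows with the same dept go to the same reducer;
-- sort by then orders the rows inside each reducer.
SELECT name, dept, salary FROM employees DISTRIBUTE BY dept SORT BY salary DESC;

-- The formula above, written with Hive's hash() function for four reducers.
-- The hash Hive applies internally may differ by version, so treat
-- reducer_index as an approximation rather than the exact assignment.
SELECT dept, (hash(dept) & 2147483647) % 4 AS reducer_index FROM employees;
```

The mask 2147483647 is Integer.MAX_VALUE; it clears the sign bit so that the remainder is never negative.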

cluster by: does what distribute by does and also sorts by the same field. The sort direction cannot be specified as desc or asc; it is always asc.

If the sort by field and the distribute by field are the same, cluster by can be used instead.
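
A minimal sketch of that substitution, again assuming a hypothetical employees table with a dept column:

```sql
-- Both queries distribute and sort on the same column in ascending order,
-- so they should produce the same per-reducer output.
SELECT name, dept FROM employees DISTRIBUTE BY dept SORT BY dept;
SELECT name, dept FROM employees CLUSTER BY dept;

-- cluster by takes no sort direction; for descending order,
-- fall back to the explicit form.
SELECT name, dept FROM employees DISTRIBUTE BY dept SORT BY dept DESC;
```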

3, Differences between having and where

(1) WHERE acts on the columns of the table and filters rows as the data is queried; HAVING acts on the columns of the query result and filters the grouped data.

(2) Aggregate (grouping) functions cannot be used after WHERE, but they can be used after HAVING.

(3) HAVING is used only in statements that compute group statistics with GROUP BY.
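
The three points can be seen together in one query. The table employees(name, dept, salary) and the threshold values are assumptions made for illustration only.

```sql
-- WHERE filters individual rows before grouping;
-- HAVING filters the groups produced by GROUP BY.
SELECT dept, avg(salary) AS avg_salary
FROM employees
WHERE salary > 0            -- row-level condition on a table column
GROUP BY dept
HAVING avg(salary) > 5000;  -- group-level condition using an aggregate function

-- Not valid: an aggregate function cannot appear in WHERE.
-- SELECT dept FROM employees WHERE avg(salary) > 5000 GROUP BY dept;
```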

Origin www.cnblogs.com/nacyswiss/p/12611789.html