Hive tuning: bucketing

  • Bucketing rule

    • Hive hashes the value of the bucketing field and takes the remainder of the hash value divided by the number of buckets. The remainder determines which bucket a record goes into, so records with the same remainder land in the same bucket
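The rule above can be sketched in a few lines of Python. This is only an illustration: it assumes an integer bucketing field whose hash is the value itself, and the name `bucket_for` is made up for this example. The hash function Hive actually applies depends on the column type and the Hive version.

```python
def bucket_for(key: int, num_buckets: int) -> int:
    """Return the bucket index for an integer key (illustrative sketch)."""
    # Assumption: the hash of an integer key is the key itself.
    hash_value = key
    # Mask to a non-negative 31-bit value, then take the remainder.
    return (hash_value & 0x7FFFFFFF) % num_buckets


# Distribute ids 1..8 over 4 buckets.
buckets = {}
for key in range(1, 9):
    buckets.setdefault(bucket_for(key, 4), []).append(key)

for index in sorted(buckets):
    print(index, buckets[index])
```

Ids 1 and 5 leave the same remainder when divided by 4, so they end up in the same bucket.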

  • Advantages of bucketing

    1. More efficient join queries:

      Suppose tables A and B are joined on the id field, and:

      • Both tables are large

      • Both tables are bucketed on id

      • The number of buckets in table A is a multiple or a factor of the number of buckets in table B

      Then, at query time, each bucket of table A only has to be joined with the corresponding bucket(s) of table B instead of with the whole table, which makes the join more efficient
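Why the "multiple or factor" condition matters can be shown with a small simulation. The sketch below is not Hive code: the helper names (`bucketize`, `bucketed_join`) are hypothetical, ids are assumed to be non-negative integers that hash to themselves, and table A is assumed to have a bucket count that is a multiple of table B's. Under those assumptions, bucket i of A can only match rows in bucket i % b_n of B.

```python
from collections import defaultdict


def bucketize(rows, num_buckets):
    """Group (id, value) rows by id % num_buckets (assumes hash(id) == id)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[0] % num_buckets].append(row)
    return buckets


def bucketed_join(a_rows, a_n, b_rows, b_n):
    """Join A and B on id, bucket by bucket, when a_n is a multiple of b_n."""
    if a_n % b_n != 0:
        raise ValueError("a_n must be a multiple of b_n in this sketch")
    a_buckets = bucketize(a_rows, a_n)
    b_buckets = bucketize(b_rows, b_n)
    result = []
    for i in range(a_n):
        # Only one bucket of B can contain matches for bucket i of A.
        lookup = defaultdict(list)
        for b_id, b_val in b_buckets.get(i % b_n, []):
            lookup[b_id].append(b_val)
        for a_id, a_val in a_buckets.get(i, []):
            for b_val in lookup.get(a_id, []):
                result.append((a_id, a_val, b_val))
    return result


table_a = [(i, f"a{i}") for i in range(10)]
table_b = [(i, f"b{i}") for i in range(0, 10, 2)]
joined = bucketed_join(table_a, 4, table_b, 2)
print(sorted(joined))
```

Each bucket of A is compared with a single bucket of B, so the work per bucket stays small even when both tables are large. If the bucket counts were unrelated (say 4 and 3), matching ids could sit in any bucket of the other table and this shortcut would not apply.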

    2. More efficient sampling

      • When sampling a bucketed table, the field name may be omitted; sampling then defaults to the bucketing field. The field name can also be given explicitly. When sampling a table that is not bucketed, the field name must be given

      • When sampling by the bucketing field of a bucketed table, Hive reads the data directly from the corresponding bucket(s) instead of scanning the whole table, so sampling is noticeably more efficient when the table is large
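As a rough illustration of the point above, the sketch below computes which buckets a sample of the form `TABLESAMPLE(BUCKET x OUT OF y)` would need to read. The function name `buckets_to_read` is hypothetical, and the sketch only covers the simple case where y is a factor of the table's bucket count and sampling is on the bucketing field.

```python
def buckets_to_read(x, y, num_buckets):
    """Return the 1-based bucket numbers read for BUCKET x OUT OF y.

    Sketch only: assumes y divides num_buckets evenly.
    """
    if num_buckets % y != 0:
        raise ValueError("this sketch requires y to be a factor of num_buckets")
    if not 1 <= x <= y:
        raise ValueError("x must be between 1 and y")
    # Start at bucket x and take every y-th bucket after it.
    return list(range(x, num_buckets + 1, y))


# A table with 32 buckets, sampled as BUCKET 3 OUT OF 16.
print(buckets_to_read(3, 16, 32))  # [3, 19]
```

Only 2 of the 32 buckets are touched here, which is where the saving on a large table comes from. Without bucketing on the sampled field, every row would have to be hashed to decide whether it belongs to the sample.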

Origin www.cnblogs.com/xiangyuguan/p/11416043.html