Hive data organization

1, Hive organizes data into databases, tables, views, partitions, buckets, and the table data itself. Databases, tables, and partitions each correspond to a directory on HDFS. Buckets and table data correspond to files inside those directories.

2, All Hive data is stored in HDFS. Hive imposes no dedicated storage format, because it applies the schema when data is read (Schema On Read); it supports TextFile, SequenceFile, RCFile, and custom formats.
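
To illustrate schema on read, here is a minimal Python sketch (not Hive code). The sample lines, the comma delimiter, and the helper names `cast` and `read_with_schema` are all assumptions made up for this example. The stored bytes never change; only the schema applied at read time does, and a value that cannot be cast comes back as None, much as Hive returns NULL for a field it cannot convert.

```python
# Raw lines as they might sit in a text file; nothing here records
# column names or types. (Hypothetical sample data.)
RAW_LINES = ["1,alice,30", "2,bob,not_a_number"]


def cast(value, type_name):
    """Cast a raw string to the declared type; None stands in for NULL."""
    try:
        if type_name == "int":
            return int(value)
        if type_name == "string":
            return value
    except ValueError:
        # The value does not fit the declared type.
        return None
    raise ValueError(f"unsupported type: {type_name}")


def read_with_schema(lines, schema, delimiter=","):
    """Apply a list of (column name, type) pairs to raw lines at read time."""
    rows = []
    for line in lines:
        parts = line.split(delimiter)
        row = {}
        for i, (name, type_name) in enumerate(schema):
            # Missing trailing fields are treated as NULL.
            row[name] = cast(parts[i], type_name) if i < len(parts) else None
        rows.append(row)
    return rows


# Two different schemas applied to the same stored bytes.
typed_rows = read_with_schema(
    RAW_LINES, [("id", "int"), ("name", "string"), ("age", "int")]
)
string_rows = read_with_schema(
    RAW_LINES, [("id", "string"), ("name", "string"), ("age", "string")]
)
```

With the first schema the second row's age is None; with the second schema the same field is simply the string it was stored as.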

3, When creating a table, you only need to tell Hive the column delimiter and the row delimiter, and Hive can then parse the data.

  • Default separators
    \n For text files, each line is a record, so \n separates records.
    ^A (Ctrl+A) Separates fields (columns); can also be written as \001.
    ^B (Ctrl+B) Separates the elements of an ARRAY or STRUCT, or the key-value pairs of a MAP; can also be written as \002.
    ^C (Ctrl+C) Separates a key from its value within a MAP entry; can also be written as \003.
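
The separators listed above can be sketched in Python as follows. This is an illustration only, not Hive's own parser: the three-column layout (a string, an array, a map) and the sample record are assumptions invented for the example.

```python
FIELD_SEP = "\x01"       # ^A, also written \001: between columns
COLLECTION_SEP = "\x02"  # ^B, also written \002: between array items or map entries
MAP_KEY_SEP = "\x03"     # ^C, also written \003: between a map key and its value


def parse_array(raw):
    """Split an ARRAY column into its elements."""
    return raw.split(COLLECTION_SEP) if raw else []


def parse_map(raw):
    """Split a MAP column into key/value pairs."""
    result = {}
    if not raw:
        return result
    for entry in raw.split(COLLECTION_SEP):
        key, _, value = entry.partition(MAP_KEY_SEP)
        result[key] = value
    return result


def parse_record(line):
    """Parse one text record with the assumed layout: name, hobbies, scores."""
    name, hobbies, scores = line.rstrip("\n").split(FIELD_SEP)
    return {
        "name": name,
        "hobbies": parse_array(hobbies),
        "scores": parse_map(scores),
    }


# Hypothetical record: one string, one array of two items, one map of two entries.
sample = "alice\x01reading\x02chess\x01math\x0390\x02art\x0385\n"
record = parse_record(sample)
```

Values stay as strings here; casting them to the declared column types would be a separate step.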

4, The Hive data model contains the following:

  Database: appears in HDFS as a folder under the ${hive.metastore.warehouse.dir} directory

  Table: appears in HDFS as a folder under the directory of the database it belongs to

  External table: similar to a table, but its data can be stored at any specified HDFS path

  Partition: appears in HDFS as a subdirectory under the table directory

  Bucket: appears in HDFS as multiple files under the same table or partition directory, produced by hashing the value of a column

  View: similar to a view in a traditional database; read-only and created from base tables
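
The mapping from this model to HDFS paths can be sketched roughly as below. Several things here are assumptions rather than facts from the text above: the warehouse root `/user/hive/warehouse` (a common default for `hive.metastore.warehouse.dir`), the `<database>.db` folder naming, the `key=value` partition folder naming, the `000002_0` style of bucket file name, and the hypothetical names `sales`, `orders`, and `dt`. The bucket rule is simplified to an integer column, where the hash is taken to be the value itself.

```python
from pathlib import PurePosixPath

# Assumed value of hive.metastore.warehouse.dir.
WAREHOUSE_DIR = PurePosixPath("/user/hive/warehouse")


def table_dir(database, table):
    """Database folder under the warehouse root, table folder under the database."""
    return WAREHOUSE_DIR / f"{database}.db" / table


def partition_dir(database, table, partition_spec):
    """Each partition column adds one key=value subdirectory under the table."""
    path = table_dir(database, table)
    for key, value in partition_spec:
        path = path / f"{key}={value}"
    return path


def bucket_index(int_key, num_buckets):
    """Simplified bucket rule for an integer column: non-negative hash modulo bucket count."""
    return (int_key & 0x7FFFFFFF) % num_buckets


def bucket_file(directory, int_key, num_buckets):
    """A bucket is a file inside the table or partition directory."""
    return directory / f"{bucket_index(int_key, num_buckets):06d}_0"


# Hypothetical table sales.orders, partitioned by dt, bucketed 4 ways on an integer id.
part = partition_dir("sales", "orders", [("dt", "2019-06-27")])
path_for_id_10 = bucket_file(part, 10, 4)
```

An external table would not follow `table_dir`, since its location is whatever HDFS path was given when it was created.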

5, Hive metadata is stored in an RDBMS; all data other than the metadata is stored on HDFS.

6, Hive tables are divided into internal (managed) tables, external tables, partitioned tables, and bucketed tables.


Origin www.cnblogs.com/xiangyuguan/p/11099557.html