Hive architecture and basic knowledge


1. User interface (Client): the CLI (hive shell), JDBC/ODBC (access to Hive from Java and other programs), and the Web UI (access to Hive from a browser).

2. Metadata (Metastore): the metadata includes the table name, the database the table belongs to (the default database is named default), the table owner, the column and partition fields, the table type (for example, whether it is an external table), the directory where the table data is located, and so on. By default the metadata is stored in the embedded Derby database. Because embedded Derby supports only one session at a time, it is recommended to use MySQL to store the Metastore.

3. Hadoop: Hive uses HDFS for storage and MapReduce for computation.

4. Driver
(1) Parser (SQL Parser): Converts the SQL string into an abstract syntax tree (AST). This step is generally done with a third-party library such as ANTLR. The AST is then checked semantically, for example whether the table exists, whether the fields exist, and whether the SQL semantics are valid.
(2) Compiler (Logical Plan): Compiles the AST into a logical execution plan.
(3) Optimizer (Query Optimizer): Optimizes the logical execution plan.
(4) Executor (Execution): Converts the logical execution plan into a physical plan that can be run. For Hive, this means MapReduce or Spark jobs.
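
The four Driver steps can be sketched as a toy pipeline. This is only an illustration under strong assumptions, not Hive's actual implementation: the grammar covers a single SELECT ... FROM ... WHERE shape, and the names TableMeta, CATALOG, parse, analyze, compile_logical, optimize and to_physical are all hypothetical. CATALOG stands in for the Metastore, and the page_views table is made-up sample data.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TableMeta:
    """Hypothetical Metastore entry with a few of the fields a table record holds."""
    database: str
    columns: List[str]
    external: bool
    location: str


# Toy stand-in for the Metastore, keyed by table name (sample data, not real).
CATALOG: Dict[str, TableMeta] = {
    "page_views": TableMeta(
        database="default",
        columns=["user_id", "url", "referrer", "dt"],
        external=True,
        location="/data/page_views",
    )
}

# Only handles: SELECT a, b FROM t [WHERE c = 'v']
QUERY_RE = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<col>\w+)\s*=\s*'(?P<val>[^']*)')?\s*$",
    re.IGNORECASE,
)


def parse(sql: str) -> dict:
    """(1) Parser: SQL string -> tiny AST (a dict here)."""
    m = QUERY_RE.match(sql)
    if m is None:
        raise ValueError("syntax error")
    where = None
    if m.group("col") is not None:
        where = (m.group("col"), m.group("val"))
    return {
        "select": [c.strip() for c in m.group("cols").split(",")],
        "from": m.group("table"),
        "where": where,
    }


def analyze(ast: dict) -> None:
    """(1) Semantic check: the table and every referenced column must exist."""
    if ast["from"] not in CATALOG:
        raise ValueError(f"unknown table: {ast['from']}")
    known = CATALOG[ast["from"]].columns
    referenced = list(ast["select"])
    if ast["where"] is not None:
        referenced.append(ast["where"][0])
    for col in referenced:
        if col not in known:
            raise ValueError(f"unknown column: {col}")


def compile_logical(ast: dict) -> List[tuple]:
    """(2) Compiler: AST -> logical plan, a list of operators."""
    plan: List[tuple] = [("scan", ast["from"], list(CATALOG[ast["from"]].columns))]
    if ast["where"] is not None:
        plan.append(("filter",) + ast["where"])
    plan.append(("project", ast["select"]))
    return plan


def optimize(plan: List[tuple]) -> List[tuple]:
    """(3) Optimizer: column pruning, so the scan reads only columns used later."""
    used: List[str] = []
    for op in plan[1:]:
        cols = [op[1]] if op[0] == "filter" else op[1]
        used.extend(c for c in cols if c not in used)
    _, table, all_cols = plan[0]
    return [("scan", table, [c for c in all_cols if c in used])] + plan[1:]


def to_physical(plan: List[tuple]) -> List[dict]:
    """(4) Execution: scan/filter/project needs no shuffle, so one map-only stage."""
    return [{"stage": 1, "kind": "map-only", "operators": [op[0] for op in plan]}]


sql = "SELECT user_id, url FROM page_views WHERE dt = '2021-03-03'"
ast = parse(sql)
analyze(ast)
logical = compile_logical(ast)
optimized = optimize(logical)
physical = to_physical(optimized)
print(optimized[0])
print(physical)
```

The optimizer step here does only column pruning, chosen because it is easy to see in the output: the scan drops the unused referrer column. Real query optimizers apply many more rules.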

The operating mechanism of Hive

Hive receives user instructions (SQL) through the interactive interfaces it provides, uses its Driver together with the metadata in the Metastore to translate those instructions into MapReduce jobs, submits the jobs to Hadoop for execution, and finally returns the execution results to the user interface.
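
To make the translation into MapReduce more concrete, here is a rough sketch of how an aggregation such as SELECT url, COUNT(*) FROM page_views GROUP BY url could be expressed as map, shuffle and reduce phases. It runs in a single Python process on made-up rows. The function names and the page_views data are assumptions for illustration, and the jobs Hive actually generates are considerably more involved.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, str]


def map_phase(rows: Iterable[Row]) -> Iterator[Tuple[str, int]]:
    """Map: emit (group key, 1) for every input row."""
    for row in rows:
        yield row["url"], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: collect all values that share the same key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: COUNT(*) per key is the sum of the emitted ones."""
    return {key: sum(values) for key, values in groups.items()}


# Made-up sample rows standing in for table data stored on HDFS.
rows = [{"url": "/home"}, {"url": "/cart"}, {"url": "/home"}]
result = reduce_phase(shuffle(map_phase(rows)))
print(result)  # {'/home': 2, '/cart': 1}
```

On a cluster the map and reduce phases would run in parallel across many nodes, with the shuffle moving data between them over the network.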

Origin blog.csdn.net/weixin_46457946/article/details/114315915