I. Introduction
Impala was introduced by Cloudera and provides high-performance, low-latency interactive SQL queries over data in HDFS and HBase. It builds on Hive and uses in-memory computation, combining data warehouse capabilities with real-time queries, batch processing, and high concurrency. On the CDH platform it is the engine of choice for real-time query and analysis of PB-scale data.
II. Advantages
1. In-memory computation enables interactive, real-time queries and analysis over PB-scale data
2. No conversion to MapReduce (MR) jobs; data is read directly from HDFS
3. Written in C++, with LLVM used to compile and run queries in a unified way
4. Compatible with HiveSQL
5. Has data warehouse features; data in Hive can be analyzed directly
6. Supports data locality
7. Supports columnar storage
8. Supports remote connections over JDBC / ODBC
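To make the columnar storage point (item 7) concrete, here is a minimal sketch in plain Python. It is a toy model only, not Impala's actual storage layer: the table, the column names, and the "values read" counter are all hypothetical. It illustrates why an aggregate over a single column touches far fewer values in a column layout than in a row layout.

```python
from typing import Dict, List, Tuple

# Hypothetical sample table with columns (user_id, country, amount).
ROWS: List[Tuple[int, str, float]] = [
    (1, "CN", 10.0),
    (2, "US", 25.5),
    (3, "CN", 4.5),
    (4, "DE", 60.0),
]


def to_columns(rows: List[Tuple[int, str, float]]) -> Dict[str, list]:
    """Pivot row-oriented records into one list per column."""
    user_ids, countries, amounts = zip(*rows)
    return {
        "user_id": list(user_ids),
        "country": list(countries),
        "amount": list(amounts),
    }


def sum_amount_row_store(rows: List[Tuple[int, str, float]]) -> Tuple[float, int]:
    """Row layout: each whole row is read even though only 'amount' is needed."""
    total = 0.0
    values_read = 0
    for row in rows:
        values_read += len(row)  # every field of the row is touched
        total += row[2]
    return total, values_read


def sum_amount_column_store(columns: Dict[str, list]) -> Tuple[float, int]:
    """Column layout: only the 'amount' column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


if __name__ == "__main__":
    print(sum_amount_row_store(ROWS))
    print(sum_amount_column_store(to_columns(ROWS)))
```

Both functions return the same total, but the row-oriented scan touches three values per row while the column-oriented scan touches one. Real columnar formats add compression and encoding on top of this, which the sketch does not model.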
III. Disadvantages
1. Large memory demand
2. Written in C++ rather than Java, unlike most of the Hadoop stack (the code itself is open source under the Apache License)
3. Completely dependent on Hive
4. In practice, performance degrades severely once a table has more than 10,000 partitions
5. Less stable than Hive
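The partition limit in item 4 is easy to hit by accident, because the number of partitions grows as the product of the distinct values in each partition column. The sketch below is a rough planning aid under that assumption. The 10,000 threshold comes from the observation above; the function names and the example schemas are hypothetical.

```python
from math import prod
from typing import Dict

# Threshold taken from the text above; treat it as a rule of thumb, not a hard limit.
PARTITION_WARN_THRESHOLD = 10_000


def partition_count(cardinalities: Dict[str, int]) -> int:
    """Upper bound on partitions: product of distinct values per partition column."""
    return prod(cardinalities.values())


def exceeds_threshold(
    cardinalities: Dict[str, int],
    threshold: int = PARTITION_WARN_THRESHOLD,
) -> bool:
    """Return True when the estimated partition count is above the threshold."""
    return partition_count(cardinalities) > threshold


# Hypothetical schemas: three years of data split across 20 regions.
daily_by_region = {"day": 3 * 365, "region": 20}
monthly_by_region = {"month": 3 * 12, "region": 20}

if __name__ == "__main__":
    for name, scheme in [("daily", daily_by_region), ("monthly", monthly_by_region)]:
        print(name, partition_count(scheme), exceeds_threshold(scheme))
```

With these assumed numbers, daily partitioning yields 21,900 partitions and crosses the threshold, while monthly partitioning yields 720. Choosing a coarser partition key is one way to stay below the range where the slowdown was observed.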