Impala Quick Start

I. Introduction

  Impala was introduced by Cloudera to provide high-performance, low-latency interactive SQL queries over data stored in HDFS and HBase. It builds on Hive and uses memory-based computing, combining data warehouse features with the advantages of real-time queries, batch processing, and high concurrency. On the CDH platform it is the engine of choice for real-time query and analysis of PB-scale data.

II. Advantages

  1. Memory-based computing enables interactive, real-time query and analysis of PB-scale data

  2. No conversion to MapReduce (MR); data is read directly from HDFS

  3. Written in C++, with queries compiled and run through LLVM

  4. Compatible with HiveSQL

  5. Has data warehouse features; data in Hive can be analyzed directly

  6. Supports data locality

  7. Supports columnar storage

  8. Supports remote connections over JDBC / ODBC
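
  As an illustration of items 4 and 8, a remote client can send HiveSQL-style statements to an Impala daemon. The sketch below is only one possible approach: it assumes the third-party impyla package (not mentioned in the original post) and an Impala daemon listening on port 21050, the usual HiveServer2-protocol port. The host name and the run_query helper are hypothetical.

```python
def run_query(sql, host="localhost", port=21050, connect=None):
    """Run one SQL statement against an Impala daemon and return all rows.

    `connect` is any DB-API style connect function. When omitted, the
    third-party impyla package is assumed (pip install impyla).
    """
    if connect is None:
        # Imported lazily so the module loads even without impyla installed.
        from impala.dbapi import connect
    conn = connect(host=host, port=port)
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    finally:
        # Always release the connection, even if the query fails.
        conn.close()
```

  With a reachable daemon, a call such as run_query("SHOW TABLES", host="impala-host.example.com") would return the table names as a list of tuples. Passing connect explicitly makes it possible to swap in another DB-API driver.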

III. Disadvantages

  1. Large memory demand

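  2. Written in C++, so the source is harder to customize than Java-based Hadoop components (Impala itself is open source under the Apache License)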

  3. Completely dependent on Hive

  4. In practice, performance degrades severely when the number of partitions exceeds 10,000

  5. Less stable than Hive
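
  The partition limit in item 4 can be checked before a table is created. The sketch below multiplies the number of distinct values in each partition column to get an upper bound on the partition count and compares it with the 10,000 figure quoted above. The function names and the example columns are hypothetical, and the bound is pessimistic because not every combination of values necessarily occurs in the data.

```python
import math

# Threshold taken from the post: performance drops sharply beyond ~10,000 partitions.
PARTITION_LIMIT = 10_000

def max_partitions(cardinalities):
    """Upper bound on the partition count: the product of the number of
    distinct values in each partition column."""
    return math.prod(cardinalities.values())

def exceeds_limit(cardinalities, limit=PARTITION_LIMIT):
    """True if the partitioning scheme could produce more than `limit` partitions."""
    return max_partitions(cardinalities) > limit

# Hypothetical scheme: 3 years x 12 months x 31 days x 10 regions.
scheme = {"year": 3, "month": 12, "day": 31, "region": 10}
print(max_partitions(scheme))   # 11160
print(exceeds_limit(scheme))    # True
```

  Dropping the region column from this hypothetical scheme brings the bound down to 1,116, well under the limit.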

 

Origin www.cnblogs.com/yszd/p/11408441.html