Impala 2.12.0 released: a high-performance distributed SQL engine

  

Impala 2.12.0 has been released. No release notes are available at present; follow the project's update page for details as they appear.

Impala is a high-performance distributed MPP (massively parallel processing) SQL query engine for processing large volumes of data stored in Hadoop clusters. Written in C++ and Java, it offers higher performance and lower latency than other Hadoop SQL engines.

Impala combines the SQL support and multi-user performance of traditional analytical databases with the scalability and flexibility of Apache Hadoop by using standard components such as HDFS, HBase, Metastore, YARN, and Sentry.

  • With Impala, users can query data in HDFS or HBase with SQL faster than with other SQL engines such as Hive.

  • Impala can read almost all file formats used by Hadoop, such as Parquet, Avro, and RCFile.

Impala uses the same metadata, SQL syntax (Hive SQL), ODBC driver, and user interface (Hue Beeswax) as Apache Hive, providing a familiar and unified platform for batch or real-time queries.

Unlike Apache Hive, Impala is not based on MapReduce. Instead, it implements a distributed architecture built on daemon processes that run on the same machines as the rest of the Hadoop infrastructure and are responsible for all aspects of query execution.

It therefore avoids the latency that MapReduce introduces, which makes Impala faster than Apache Hive.
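The daemon-based, massively parallel execution described above can be illustrated with a toy model. The sketch below is not Impala code and does not use any Impala API. It only assumes that each node aggregates the rows it holds locally and that a coordinator merges the partial results, roughly what a query like SELECT country, SUM(n) ... GROUP BY country would require. The data and the names NODE_LOCAL_ROWS, local_partial_sum, and coordinator_sum are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands for the rows stored on one node.
NODE_LOCAL_ROWS = [
    [("us", 3), ("de", 1), ("us", 2)],
    [("de", 4), ("fr", 5)],
    [("us", 1), ("fr", 2), ("de", 2)],
]


def local_partial_sum(rows):
    """Work done by one daemon: aggregate only the rows it holds locally."""
    partial = Counter()
    for key, value in rows:
        partial[key] += value
    return partial


def coordinator_sum(node_rows):
    """Fan the work out to every node, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(node_rows)) as pool:
        partials = list(pool.map(local_partial_sum, node_rows))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts per key
    return dict(total)


result = coordinator_sum(NODE_LOCAL_ROWS)
print(sorted(result.items()))  # [('de', 7), ('fr', 7), ('us', 6)]
```

Threads stand in for long-running daemons here only to keep the sketch self-contained. The point is that every node works on its own data at once, with no separate job launched per query.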

