HBase compared with Hive
1、Hive
1.1, data warehouse
In essence, Hive records in MySQL (which holds Hive's metadata) a one-to-one mapping between tables and the files already stored in HDFS, so that HQL can be used to manage and query them (structured data is mapped to a table).
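The mapping idea can be illustrated with a toy model in Python. This is only a sketch, not Hive's actual metastore: the `metastore` dictionary, the table name `page_views`, the path `/warehouse/page_views`, and the `read_table` helper are all hypothetical. It shows that the table definition holds only metadata (location, column names, delimiter), while the rows stay in the files and are parsed when the table is read.

```python
# Toy model of a metastore: the table definition is metadata only.
# All names and paths here are made up for illustration.
metastore = {
    "page_views": {
        "location": "/warehouse/page_views",  # where the files live (hypothetical path)
        "columns": ["user_id", "url", "ts"],
        "delimiter": "\t",
    }
}

# Stand-in for file contents that are already stored in HDFS.
fake_hdfs = {
    "/warehouse/page_views": [
        "u1\t/home\t1700000000",
        "u2\t/cart\t1700000005",
    ]
}

def read_table(name):
    """Apply the table's schema to the raw lines at read time."""
    meta = metastore[name]
    rows = []
    for line in fake_hdfs[meta["location"]]:
        values = line.split(meta["delimiter"])
        rows.append(dict(zip(meta["columns"], values)))
    return rows

rows = read_table("page_views")
```

Dropping the entry from `metastore` in this model would remove the table definition but leave the files untouched, which is the sense in which the table is a mapping over existing data.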
1.2, for data analysis and cleaning
Hive is suited to offline data analysis and cleaning; its latency is high.
1.3, based on HDFS and MapReduce
Hive's data is still stored on DataNodes, and statements written in HQL are ultimately converted into MapReduce code for execution. (There is no need to get stuck on the details of how that MapReduce code is executed.)
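To give a feel for what "converted into MapReduce" means, here is a minimal Python sketch of how an aggregation such as `SELECT url, COUNT(*) ... GROUP BY url` could be expressed as a map phase, a shuffle, and a reduce phase. It runs in a single process on assumed sample data and is not the code Hive generates; the function names are hypothetical.

```python
from collections import defaultdict

# Sample input rows (assumed data): (user_id, url)
records = [("u1", "/home"), ("u2", "/cart"), ("u3", "/home")]

def map_phase(record):
    """Emit (group key, 1) for each input row."""
    _user_id, url = record
    yield url, 1

def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Aggregate all values that share one key."""
    return key, sum(values)

pairs = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

In a real cluster the map and reduce steps run as separate tasks on many machines, and the cost of scheduling them is one reason for the high latency mentioned above.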
2、HBase
2.1, database
HBase is a column-oriented, non-relational database.
2.2, for storing structured and unstructured data
It is suited to storing non-relational data in a single table, and is not suited to queries that relate tables to each other, such as JOIN operations.
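The data model can be pictured as a nested map: row key, then column family, then column qualifier, then value. The following Python sketch models one such table in memory. It is an illustration only and uses no HBase API; the table contents, the family names `info` and `stats`, and the `put`/`get` helpers are hypothetical.

```python
# Toy model: row key -> column family -> qualifier -> value.
# Rows are sparse: each row stores only the columns it actually has.
table = {}

def put(row_key, family, qualifier, value):
    """Write one cell, creating the row and family on first use."""
    table.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value

def get(row_key, family=None):
    """Look up one row by key, optionally restricted to one column family."""
    row = table.get(row_key, {})
    return row if family is None else row.get(family, {})

put("user#001", "info", "name", "Ann")
put("user#001", "info", "email", "ann@example.com")
put("user#002", "info", "name", "Bob")
put("user#002", "stats", "logins", "7")  # a column that user#001 does not have

info_001 = get("user#001", "info")
stats_001 = get("user#001", "stats")
```

Every access in this model goes through a row key, and there is no operation that combines two tables, which is why JOIN-style queries are a poor fit.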
2.3, based on HDFS
Data is persisted in the form of HFiles, which are stored on DataNodes and managed by RegionServers in units of regions.
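As a rough sketch of what "managed in units of regions" means: a table's row keys are kept in sorted order and split into contiguous ranges (regions), each served by one RegionServer. The Python below finds the region for a row key with a binary search over region start keys. The split points and server names are hypothetical, and real HBase resolves regions through its own metadata rather than code like this.

```python
from bisect import bisect_right

# Each region covers [its start key, the next region's start key).
# The empty string means "from the beginning of the key space".
region_start_keys = ["", "g", "n", "t"]
region_servers = ["rs1", "rs2", "rs3", "rs1"]  # server hosting each region (made up)

def locate(row_key):
    """Return (region index, server) for the region whose range contains row_key."""
    index = bisect_right(region_start_keys, row_key) - 1
    return index, region_servers[index]

located = [locate(k) for k in ("apple", "g", "melon", "zebra")]
```

Because each lookup touches only the one region that owns the key, adding regions and servers spreads the load without changing how a single row is found.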
2.4, low latency, used by online services
Faced with large volumes of enterprise data, HBase can scale a single table linearly to store very large amounts of data while still providing efficient data access.
3. Summary
Hive and HBase are two different Hadoop-based technologies: Hive is a SQL-like engine that runs MapReduce jobs, while HBase is a NoSQL key/value database on top of Hadoop. The two tools can be used together. Just as you use Google for search and Facebook for socializing, Hive can be used for statistical queries and HBase for real-time queries. Data can be written from Hive into HBase, and written back from HBase into Hive.