Basic knowledge of HBase (6): Integrating HBase with Hive

1. Comparison between HBase and Hive

1. Hive

(1) Data warehouse. Hive essentially keeps, in MySQL, a mapping (the metadata) for files already stored in HDFS, so that the data can be managed and queried with HQL.

(2) Used for data analysis and cleaning. Hive is suited to offline data analysis and cleaning, and its latency is high.

(3) Based on HDFS and MapReduce. The data Hive stores still lives on the DataNodes, and the HQL statements you write are eventually converted into MapReduce code for execution.
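One way to see point (3) for yourself is to ask Hive for the execution plan of a query. The statement below is only a sketch: it assumes a hypothetical table named emp with a deptno column, and it assumes Hive is running on the MapReduce execution engine.

```sql
-- Assumption: a table named emp with a deptno column exists (hypothetical).
-- EXPLAIN prints the plan instead of running the query; on the MapReduce
-- engine the plan is broken down into map and reduce stages.
EXPLAIN
SELECT deptno, count(*) AS n
FROM emp
GROUP BY deptno;
```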

2. HBase

(1) Database. HBase is a non-relational database built around column-family storage.

(2) Used to store structured and unstructured data. It suits single-table, non-relational data and is not suited to associated queries such as JOIN.

(3) Based on HDFS. Data is persisted as HFiles, which are stored on the DataNodes and managed as regions by the RegionServers.

(4) Low latency, for online business access. Faced with large volumes of enterprise data, HBase can hold a very large amount of data in a single table while still providing fast data access.

2. Integrated use of HBase and Hive

Loud warning: the latest versions of HBase and Hive are not compatible with each other out of the box, so we have no choice but to recompile hive-hbase-handler-1.2.2.jar ourselves, with tears in our eyes! So angry!

Then set the following ZooKeeper properties in Hive's hive-site.xml:

<property>
    <name>hive.zookeeper.quorum</name>
    <value>hadoop100,hadoop101,hadoop102</value>
    <description>The list of ZooKeeper servers to talk to. This is only needed for read/write locks.</description>
</property>

<property>
    <name>hive.zookeeper.client.port</name>
    <value>2181</value>
    <description>The port of ZooKeeper servers to talk to. This is only needed for read/write locks.</description>
</property>

After configuration, distribute the modified configuration file to the other servers.
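As a quick sanity check, the values Hive actually picked up can be printed from the Hive CLI. This is only a sketch and assumes the session was started after the configuration was changed.

```sql
-- SET with a property name and no value prints the current setting.
SET hive.zookeeper.quorum;
SET hive.zookeeper.client.port;
```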

1. Case 1. Goal: Create a Hive table associated with an HBase table, so that inserting data into the Hive table also updates the HBase table.

Step-by-step implementation:

(1) Create a table in Hive and associate it with HBase

CREATE TABLE hive_hbase_emp_table(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = 
":key,info:ename,info:job,info:mgr,info:hiredate,info:sal,info:comm,info:deptno")
TBLPROPERTIES ("hbase.table.name" = "hbase_emp_table");

Tip: Once the statement completes, you can check in both Hive and HBase that the corresponding table has been created on each side.
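On the Hive side, the check could look like the sketch below. On the HBase side, running list and describe 'hbase_emp_table' in the HBase shell should show the matching table.

```sql
-- Confirm that the Hive table exists.
SHOW TABLES LIKE 'hive_hbase_emp_table';

-- Show the table details, including the storage handler and the
-- hbase.table.name / hbase.columns.mapping properties set at creation.
DESCRIBE FORMATTED hive_hbase_emp_table;
```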

(2) Create a temporary intermediate table in Hive for loading data in the file

Tip: You cannot load a data file directly into the Hive table that is associated with HBase, which is why the intermediate table is needed.
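A minimal sketch of this step and the copy that follows it is given below. The table name emp, the tab delimiter, and the file path are all assumptions made for illustration; adjust them to match your own data file.

```sql
-- Intermediate table with the same columns as hive_hbase_emp_table.
-- Assumption: the table name emp is hypothetical and the source file is tab-delimited.
CREATE TABLE emp(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- Assumption: hypothetical local path; replace it with the real file location.
LOAD DATA LOCAL INPATH '/path/to/emp.txt' INTO TABLE emp;

-- Copy the rows into the HBase-backed table. empno maps to :key,
-- so it becomes the HBase row key.
INSERT INTO TABLE hive_hbase_emp_table SELECT * FROM emp;
```

After the insert finishes, the rows should be visible both through a SELECT on hive_hbase_emp_table in Hive and through scan 'hbase_emp_table' in the HBase shell.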

Origin blog.csdn.net/zuodingquan666/article/details/135228873