Network environment
Dedicated line: Add the VPC network information of the HBase cluster to the dedicated line configuration; clients can then connect directly to the HBase environment.
Public cloud virtual machine in a VPC: choose a VPC that can communicate with the HBase VPC.
Others: enable public network access for HBase.
Note: By default, data import into HBase uses the community packages for the hbase-common, hbase-client, hbase-server, and hbase-protocol dependencies. When connecting over the public network, use the corresponding packages released by cloud HBase instead.
Option 1: Map a Hive table to an HBase table
Applicable scenario: less than 4 TB of data, because the data is written through the HBase API.
Obtain the ZooKeeper connection address from the HBase page, then start the Hive client as follows:
hive --hiveconf hbase.zookeeper.quorum=xxxx
If the HBase table does not exist
Create the Hive table hive_hbase_table and map it to the HBase table hbase_table. The HBase table hbase_table is created automatically and is deleted when the Hive table is dropped. You must specify the mapping from the Hive schema to the HBase schema. For type mapping, see the Hive HBaseIntegration documentation.
CREATE TABLE hive_hbase_table(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "hbase_table", "hbase.mapred.output.outputtable" = "hbase_table");
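The mapping string pairs Hive columns with HBase columns by position: the first entry, :key, binds the first Hive column to the row key, and each family:qualifier entry binds the next Hive column. As a minimal sketch of a wider mapping, the script below writes a HiveQL file for a hypothetical three-column table and runs it only when a Hive client is available. The names hive_hbase_multi, hbase_multi, cf1, and cf2, the file path, and the quorum value are placeholders for illustration, not part of the original example.

```shell
# Hypothetical example: table names, column families, and quorum are placeholders.
ZK_QUORUM="${ZK_QUORUM:-xxxx}"
HQL_FILE="/tmp/hive_hbase_multi.hql"

# Three Hive columns map, in order, to the row key, cf1:name, and cf2:age.
cat > "$HQL_FILE" <<'EOF'
CREATE TABLE hive_hbase_multi(key int, name string, age int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf2:age")
TBLPROPERTIES ("hbase.table.name" = "hbase_multi");
EOF

# Run the statement only where a Hive client is installed.
if command -v hive >/dev/null 2>&1; then
  hive --hiveconf hbase.zookeeper.quorum="$ZK_QUORUM" -f "$HQL_FILE"
else
  echo "hive client not found; statement written to $HQL_FILE"
fi
```

The number of entries in the mapping must match the number of Hive columns, so adding a column to one side means adding it to the other.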
Create a source Hive table and insert some data:
create table hive_data (mykey int, myval string);
insert into hive_data values(1, "www.ymq.io");
Load the data from the Hive table hive_data into the HBase table hbase_table through the mapped Hive table hive_hbase_table:
insert into table hive_hbase_table select * from hive_data;
Check whether the HBase table hbase_table now contains the data.
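One way to run this check is a short scan from the HBase shell. The sketch below assumes the hbase client is on the PATH and already points at the cluster; the script file path is a placeholder. It writes the shell commands to a file and passes that file to the HBase shell.

```shell
# Placeholder path for the HBase shell command file.
CHECK_FILE="/tmp/check_hbase_table.txt"

# Show up to 10 rows, then the total row count, then leave the shell.
cat > "$CHECK_FILE" <<'EOF'
scan 'hbase_table', {LIMIT => 10}
count 'hbase_table'
exit
EOF

# Run the commands only where an HBase client is installed.
if command -v hbase >/dev/null 2>&1; then
  hbase shell "$CHECK_FILE"
else
  echo "hbase client not found; commands written to $CHECK_FILE"
fi
```

With the single row inserted above, the scan should show row key 1 with the column cf1:val.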
If the HBase table already exists
Create a Hive external table mapped to the existing HBase table, paying attention to the mapping between the Hive schema and the HBase schema. Dropping the external table does not delete the corresponding HBase table.
CREATE EXTERNAL TABLE hive_hbase_external_table(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "hbase_table", "hbase.mapred.output.outputtable" = "hbase_table");
The remaining data import steps are the same as in the case where the HBase table does not exist.
Option 2: Generate HFiles from a Hive table and import them into HBase through bulkload
Applicable scenario: large data volumes (more than 4 TB)
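To pick between the two options, it can help to measure the size of the source table on HDFS first. The following is a rough sketch under stated assumptions: the warehouse path of hive_data is a guess and must be adjusted, and a size of exactly 4 TB is treated as option 2 because the text does not specify that boundary.

```shell
# 4 TB threshold from the text, in bytes (4 * 1024^4).
THRESHOLD_BYTES=4398046511104

# Prints the option suggested by a size given in bytes.
choose_option() {
  if [ "$1" -lt "$THRESHOLD_BYTES" ]; then
    echo "option 1: Hive table mapped to HBase"
  else
    echo "option 2: HFile generation and bulkload"
  fi
}

# Assumed warehouse location of hive_data; adjust to the actual table path.
TABLE_PATH="/user/hive/warehouse/hive_data"

if command -v hdfs >/dev/null 2>&1; then
  # The first field of 'hdfs dfs -du -s' is the size in bytes.
  size=$(hdfs dfs -du -s "$TABLE_PATH" | awk '{print $1}')
  if [ -n "$size" ]; then
    choose_option "$size"
  else
    echo "could not read the size of $TABLE_PATH"
  fi
else
  choose_option 1073741824   # demo value: 1 GB
fi
```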
Convert Hive data to HFiles
Start Hive and add the required HBase jar packages:
add jar /usr/lib/hive-current/lib/hive-hbase-handler-2.3.3.jar;
add jar /usr/lib/hive-current/lib/hbase-common-1.1.1.jar;
add jar /usr/lib/hive-current/lib/hbase-client-1.1.1.jar;
add jar /usr/lib/hive-current/lib/hbase-protocol-1.1.1.jar;
add jar /usr/lib/hive-current/lib/hbase-server-1.1.1.jar;
Create a Hive table whose output format is HiveHFileOutputFormat
Here /tmp/hbase_table_hfile/cf_0 is the HDFS path where the HFiles are saved, and cf_0 is the name of the HBase column family.
create table hbase_hfile_table(key int, cf_0_c0 string)
stored as
INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.hbase.HiveHFileOutputFormat'
TBLPROPERTIES ('hfile.family.path' = '/tmp/hbase_table_hfile/cf_0');
Save the data from the source table as HFiles through the hbase_hfile_table table:
insert into table hbase_hfile_table select * from hive_data;
Check whether HFiles were generated under the corresponding HDFS path.
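A minimal sketch of this check, assuming the hdfs client is on the PATH and using the hfile.family.path value from the table definition above. The helper name report_hfiles is made up for illustration.

```shell
# Directory configured as hfile.family.path in the Hive table definition.
HFILE_DIR="/tmp/hbase_table_hfile/cf_0"

# Reports whether a file count indicates that the Hive job wrote output.
report_hfiles() {
  if [ "$1" -gt 0 ]; then
    echo "found $1 file(s) under $HFILE_DIR"
  else
    echo "no files under $HFILE_DIR; check the Hive job output"
  fi
}

if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -ls "$HFILE_DIR"
  # The second field of 'hdfs dfs -count' is the number of files.
  files=$(hdfs dfs -count "$HFILE_DIR" | awk '{print $2}')
  report_hfiles "${files:-0}"
else
  echo "hdfs client not found; run this on a cluster node"
fi
```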
Import the data into the HBase table through bulkload
Use the Alibaba Cloud HBase client to create an HBase table with the column family used above:
hbase(main):012:0> create 'hbase_hfile_load_table','cf_0'
Download the cloud HBase client, configure hbase-site.xml, and copy hdfs-site.xml and core-site.xml to the hbase/conf directory:
wget http://public-hbase.oss-cn-hangzhou.aliyuncs.com/installpackage/alihbase-1.1.4-bin.tar.gz
vi conf/hbase-site.xml
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>xxx</value>
</property>
Run bulkload to import the HFiles into the HBase table:
bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://master:port/tmp/hbase_table_hfile/ hbase_hfile_load_table
Check whether the data has been imported into the HBase table hbase_hfile_load_table.
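For a large bulkload, counting rows with a MapReduce job is usually more practical than scanning from the shell. The sketch below uses the RowCounter job that ships with HBase. It assumes it is run from the client directory set up earlier (so bin/hbase exists) and that the client can submit MapReduce jobs to the cluster; the HBASE_BIN variable is introduced here for illustration.

```shell
# Table loaded by the bulkload step.
TABLE="hbase_hfile_load_table"
# Path to the hbase launcher; defaults to the client directory layout used above.
HBASE_BIN="${HBASE_BIN:-bin/hbase}"

if [ -x "$HBASE_BIN" ]; then
  # Runs a MapReduce job; the row total appears in the job counters as ROWS.
  "$HBASE_BIN" org.apache.hadoop.hbase.mapreduce.RowCounter "$TABLE"
else
  echo "$HBASE_BIN not found; run this from the HBase client directory"
fi
```

Comparing the reported row count with the result of select count(*) from hive_data in Hive gives a quick consistency check.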