Core ideas of table rowkey design:
- Point lookups (get) on the full rowkey are the fastest queries
- Range scans operate on the rowkey
- Prefix matching on the rowkey
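The points above all follow from one property: HBase stores rows sorted by rowkey bytes, so a prefix scan is just a range scan that stops at the first non-matching key. A minimal Python sketch over an in-memory sorted key list (illustrative only, not the HBase client API):

```python
from bisect import bisect_left

def prefix_scan(sorted_keys, prefix):
    # Seek to the first key >= prefix, then read forward until the
    # prefix no longer matches -- sorted order lets us stop early.
    start = bisect_left(sorted_keys, prefix)
    results = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        results.append(key)
    return results

keys = sorted(["user_100", "user_101", "user_200", "order_001", "order_002"])
print(prefix_scan(keys, "user_1"))  # -> ['user_100', 'user_101']
```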
Three ways to pre-split a table when creating it:
create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
create 't1', 'f1', SPLITS_FILE => '/home/hadoop/data/splits.txt', OWNER => 'johndoe'
# Specify the split rowkeys in the splits.txt file:
10
20
30
40
50
# Create with column-family versions and custom table metadata:
create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
# Specify a Java pre-split algorithm class:
create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
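To see what `HexStringSplit` does, here is a sketch of how such boundaries can be computed: divide the 8-character hex keyspace (`00000000`..`ffffffff`) into N evenly sized regions, which yields N-1 split points. This is illustrative Python only; HBase's actual implementation is the Java class `org.apache.hadoop.hbase.util.RegionSplitter.HexStringSplit`.

```python
def hex_split_points(num_regions, width=8):
    # Evenly partition the space of `width`-character hex strings:
    # split point i sits at fraction i/num_regions of the keyspace.
    space = 16 ** width
    return ["{:0{w}x}".format(space * i // num_regions, w=width)
            for i in range(1, num_regions)]

print(hex_split_points(4))  # -> ['40000000', '80000000', 'c0000000']
```

With `NUMREGIONS => 15` this scheme would produce 14 split points.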
tail -f: run `tail -f <logfile>` in a terminal window to follow a file's output in real time.
Querying HBase with SQL syntax
Apache Phoenix implements SQL queries on top of HBase.
http://www.cnblogs.com/hbase-community/category/1181796.html
HBase secondary indexes
- Build an HBase secondary index with Solr
- Build an HBase secondary index with Phoenix
HBase table data compression
- Snappy (HBase also supports GZ, LZO, and LZ4)
HBase data read and write process
https://blog.csdn.net/u011490320/article/details/50814967
Data Management in HBase
Deletes in HBase are not physical deletes; a delete only writes a tombstone marker, and the data is actually removed during compaction. Data is removed when it meets one of these conditions:
1. Marked for deletion (tombstone)
2. Exceeds the column family's version limit
3. Expired past its TTL
Two kinds of compaction:
1. Minor compaction: merges a few small HFiles into larger ones
2. Major compaction: merges all HFiles in a store into one and physically removes deleted and expired data
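The three removal conditions above can be sketched as the cell filter a major compaction applies. This is a simplified Python illustration with hypothetical field names, not HBase's internal data structures (a real compaction also drops the older cells that a tombstone covers, which this sketch omits):

```python
import time

def major_compact(cells, max_versions, ttl_seconds, now=None):
    now = now if now is not None else time.time()
    # Newest-first within each (row, column), as HBase orders versions.
    cells = sorted(cells, key=lambda c: (c["row"], c["col"], -c["ts"]))
    kept, seen = [], {}
    for c in cells:
        if c.get("tombstone"):           # 1. marked for deletion
            continue
        if now - c["ts"] > ttl_seconds:  # 3. expired past TTL
            continue
        k = (c["row"], c["col"])
        seen[k] = seen.get(k, 0) + 1
        if seen[k] > max_versions:       # 2. over the version limit
            continue
        kept.append(c)
    return kept
```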
Hive and HBase integration
- The data itself is stored in HBase
- The Hive table's schema (description metadata) is stored in Hive's metastore
Corresponding elements:
- Hive table <-> HBase table
- Hive columns <-> HBase rowkey and column-family columns
- storage handler (org.apache.hadoop.hive.hbase.HBaseStorageHandler)
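The column correspondence is expressed by the `hbase.columns.mapping` string shown in the DDL examples below. A small Python sketch of how the string pairs Hive columns with HBase targets (the parsing here is illustrative; Hive's real parser lives inside the HBaseStorageHandler/HBaseSerDe classes):

```python
def map_columns(hive_columns, hbase_mapping):
    # Entries are comma-separated and positional: the i-th entry is the
    # HBase target for the i-th Hive column; ":key" denotes the rowkey.
    targets = hbase_mapping.split(",")
    assert len(hive_columns) == len(targets)
    return dict(zip(hive_columns, targets))

print(map_columns(["id", "name", "age"], ":key,info:name,info:age"))
# -> {'id': ':key', 'name': 'info:name', 'age': 'info:age'}
```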
In integration mode, if the required HBase jars are missing from the hive/lib directory, create symbolic links to them in that directory.
https://blog.csdn.net/victory0508/article/details/69258686
Managed table
When the Hive table is created, its data is stored in an HBase table.
CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
External table
An HBase table already exists, and its data needs to be analyzed with Hive.
CREATE EXTERNAL TABLE hbase_user(id int, name string,age int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:age")
TBLPROPERTIES ("hbase.table.name" = "user");
Essence
In this integration, Hive acts as an HBase client.
Sqoop imports relational database data into Hive.
HBase and Hue integration
Hue accesses HBase through the HBase Thrift server, which must be started for cross-language access (e.g. `hbase-daemon.sh start thrift`).