HBase Summary

(A) HBase basic introduction

1. HBase is built on top of HDFS and provides a highly reliable, high-performance, column-oriented, scalable database system with real-time reads and writes.

2. HBase features:
  Everything in HBase is stored as bytes.
  Rows are sorted by RowKey in byte order, and the RowKey serves as the index.
  HBase automatically splits a table into Regions by row range as the data grows, which helps with load balancing and redundancy.
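
To make the byte-order sorting of RowKeys concrete, here is a minimal Scala sketch that sorts a few keys with a byte-wise comparison. It does not use the HBase client: the object RowKeyOrder and its helper methods are hypothetical names for illustration, and the keys are assumed to be UTF-8 strings.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object RowKeyOrder {
  // Compare two byte arrays lexicographically, treating each byte as
  // an unsigned value (0..255); on a common prefix the shorter array sorts first.
  def compareBytes(a: Array[Byte], b: Array[Byte]): Int = {
    val n = math.min(a.length, b.length)
    var i = 0
    while (i < n) {
      val diff = (a(i) & 0xff) - (b(i) & 0xff)
      if (diff != 0) return diff
      i += 1
    }
    a.length - b.length
  }

  // Sort string keys by the byte order of their UTF-8 encoding.
  def sortKeys(keys: Seq[String]): Seq[String] =
    keys.sortWith((x, y) => compareBytes(x.getBytes(UTF_8), y.getBytes(UTF_8)) < 0)

  def main(args: Array[String]): Unit = {
    // Unpadded numbers do not sort numerically: prints "row1, row10, row2"
    println(sortKeys(Seq("row2", "row10", "row1")).mkString(", "))
    // Zero-padded numbers do: prints "row01, row02, row10"
    println(sortKeys(Seq("row02", "row10", "row01")).mkString(", "))
  }
}
```

Because the order is byte-wise rather than numeric, numeric parts of a RowKey are commonly zero-padded to a fixed width so that scans return rows in the expected order.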

3. HBase storage structure:
  RowKey: a byte array that acts as the "primary key" of each record in the table, which makes lookups fast; RowKey design is very important.
  Column Family: a named (string) group of one or more related columns; columns in the same column family share the same properties.
  Column: belongs to a column family and is written as familyName:columnName; columns can be added dynamically for each record.
  Cell: the value stored for a given RowKey and column, together with the timestamp that identifies its version.

  hbase(main):009:0> scan 'User'

  ROW                                       COLUMN+CELL

  id001 column=personInfo:name, timestamp=1502368030841, value=xiaoming
  id001 column=personInfo:age, timestamp=1502368069926, value=18
  id001 column=personInfo:sex, timestamp=1502368093636, value=man
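
One way to picture the scan output above is as nested sorted maps: RowKey, then family:column, then timestamp, then value. The following Scala sketch models that idea in memory only. It is not how HBase is implemented and does not use the HBase client; the object CellModel, its type aliases, and the second age value are hypothetical and exist only for illustration.

```scala
import scala.collection.immutable.TreeMap

object CellModel {
  // timestamp -> value, kept sorted so the newest version is last
  type Versions = TreeMap[Long, String]
  // "family:column" -> versions
  type Row = TreeMap[String, Versions]
  // rowKey -> row
  type Table = TreeMap[String, Row]

  val empty: Table = TreeMap.empty[String, Row]

  // Return a new table with one more cell version stored.
  def put(t: Table, rowKey: String, column: String, ts: Long, value: String): Table = {
    val row: Row = t.getOrElse(rowKey, TreeMap.empty[String, Versions])
    val versions: Versions = row.getOrElse(column, TreeMap.empty[Long, String])
    t.updated(rowKey, row.updated(column, versions.updated(ts, value)))
  }

  // Read the value with the highest timestamp, if the cell exists.
  def get(t: Table, rowKey: String, column: String): Option[String] =
    t.get(rowKey).flatMap(_.get(column)).flatMap(_.lastOption).map(_._2)

  def main(args: Array[String]): Unit = {
    val t1 = put(empty, "id001", "personInfo:name", 1502368030841L, "xiaoming")
    val t2 = put(t1, "id001", "personInfo:age", 1502368069926L, "18")
    // Hypothetical newer version of the same cell (not in the original data)
    val t3 = put(t2, "id001", "personInfo:age", 1502368069927L, "19")
    println(get(t3, "id001", "personInfo:age")) // Some(19): newest version wins
    println(get(t3, "id001", "personInfo:sex")) // None: column never written
  }
}
```

The sketch also shows why columns can be added per record: a row only holds the columns that were actually written to it.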

 

(B) HBase common commands

1. Enter the shell: hbase shell

[hadoop@indb-3-136-hzifc bin]$ echo $HBASE_HOME

/data/program/hbase

[hadoop@indb-3-136-hzifc bin]$ /data/program/hbase/bin/hbase shell


2. List all tables: list

hbase(main):003:0> list
TABLE
SYSTEM.CATALOG
SYSTEM.FUNCTION
SYSTEM.SEQUENCE
SYSTEM.STATS
TEST.USER
User

6 row(s) in 0.0340 seconds


3. View the details of a table: describe

hbase(main):004:0> describe 'User'
Table User is ENABLED
User
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

1 row(s) in 0.1410 seconds



4. Create a table: create

Syntax: create <table>, {NAME => <family>, VERSIONS => <VERSIONS>}
Create a User table with one column family, info (a table can have one or more column families)

hbase(main):002:0> create 'User', 'info'
0 row(s) in 1.5890 seconds


5. Delete a column family: alter with 'delete'

Syntax: alter <table>, 'delete' => <family>

hbase(main):002:0> alter 'User', 'delete' => 'info'
Updating all regions with the new schema...

1/1 regions updated.
Done.

0 row(s) in 2.5340 seconds



6. Insert data: put

Syntax: put <table>, <rowkey>, <family:column>, <value>

hbase(main):005:0> put 'User', 'row1', 'info:name', 'xiaoming'
0 row(s) in 0.1200 seconds

hbase(main):006:0> put 'User', 'row2', 'info:age', '18'
0 row(s) in 0.0170 seconds

hbase(main):007:0> put 'User', 'row3', 'info:sex', 'man'
0 row(s) in 0.0030 seconds




7. Query a record by RowKey: get

Syntax: get <table>, <rowkey>, [<family:column>, ...]

hbase(main):008:0> get 'User', 'row2'

COLUMN CELL

info:age timestamp=1502368069926, value=18
1 row(s) in 0.0280 seconds

 

hbase(main):028:0> get 'User', 'row3', 'info:sex'

COLUMN CELL

info:sex timestamp=1502368093636, value=man

 

hbase(main):036:0> get 'User', 'row1', {COLUMN => 'info:name'}

COLUMN CELL

info:name timestamp=1502368030841, value=xiaoming

1 row(s) in 0.0120 seconds



8. Query all records: scan

Syntax: scan <table>, {COLUMNS => [<family:column>, ...], LIMIT => num}

Scan all records
hbase(main):009:0> scan 'User'

ROW COLUMN+CELL

row1 column=info:name, timestamp=1502368030841, value=xiaoming

row2 column=info:age, timestamp=1502368069926, value=18
row3 column=info:sex, timestamp=1502368093636, value=man

3 row(s) in 0.0380 seconds

Scan the first 2 records
hbase(main):037:0> scan 'User', {LIMIT => 2}
ROW COLUMN+CELL

row1 column=info:name, timestamp=1502368030841, value=xiaoming

row2 column=info:age, timestamp=1502368069926, value=18
2 row(s) in 0.0170 seconds

Range query
hbase(main):011:0> scan 'User', {STARTROW => 'row2'}
ROW COLUMN+CELL

row2 column=info:age, timestamp=1502368069926, value=18
row3 column=info:sex, timestamp=1502368093636, value=man

2 row(s) in 0.0170 seconds

hbase(main):012:0> scan 'User', {STARTROW => 'row2', ENDROW => 'row2'}
ROW COLUMN+CELL

row2 column=info:age, timestamp=1502368069926, value=18
1 row(s) in 0.0110 seconds

 

hbase(main):013:0> scan 'User', {STARTROW => 'row2', ENDROW => 'row3'}
ROW COLUMN+CELL

row2 column=info:age, timestamp=1502368069926, value=18
1 row(s) in 0.0120 seconds

In addition, you can add advanced options such as TIMERANGE and FILTER.
STARTROW and ENDROW must be written in uppercase, otherwise an error is raised. The result set does not include the row whose key equals ENDROW (the case where STARTROW equals ENDROW, shown above, is an exception and returns that single row).
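
The half-open range behaviour of STARTROW and ENDROW can be sketched with an in-memory sorted map. This is only a model of the semantics, not the HBase client API: the object RangeScan and its scan method are hypothetical, and String ordering is assumed to match byte order, which holds for the ASCII keys used here.

```scala
import scala.collection.immutable.TreeMap

object RangeScan {
  // The three rows from the shell examples, keyed by RowKey.
  val rows: TreeMap[String, String] = TreeMap(
    "row1" -> "info:name=xiaoming",
    "row2" -> "info:age=18",
    "row3" -> "info:sex=man"
  )

  // Return the keys in [start, stop): start is included, stop is excluded.
  def scan(start: String, stop: String): Seq[String] =
    rows.range(start, stop).keys.toSeq

  def main(args: Array[String]): Unit = {
    println(scan("row2", "row3").mkString(", "))       // row2
    println(scan("row2", "row4").mkString(", "))       // row2, row3
    // Appending a zero byte to the stop key is one way to include row3 itself.
    println(scan("row2", "row3\u0000").mkString(", ")) // row2, row3
  }
}
```

The sketch does not model the special case where the start and stop keys are equal; in this model that range is simply empty.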
 

9. Count the number of records: count

Syntax: count <table>, {INTERVAL => intervalNum, CACHE => cacheNum}

INTERVAL sets how many rows are counted between progress lines, each of which shows the running count and the current RowKey (default 1000); CACHE is the number of rows fetched per scanner call (default 10). Increasing CACHE can speed up the count.
hbase(main):020:0> count 'User'
3 row(s) in 0.0360 seconds




10. Delete data: delete

Delete a column
hbase(main):008:0> delete 'User', 'row1', 'info:age'
0 row(s) in 0.0290 seconds

Delete a row
hbase(main):014:0> deleteall 'User', 'row2'
0 row(s) in 0.0090 seconds

Clear all data in a table
hbase(main):016:0> truncate 'User'
Truncating 'User' table (it may take a while):

- Disabling table...

- Truncating table...

0 row(s) in 3.6610 seconds


11. Check whether a table exists: exists

hbase(main):022:0> exists 'User'
Table User does exist

0 row(s) in 0.0150 seconds


12. Disable a table: disable

hbase(main):014:0> disable 'User'
0 row(s) in 2.2660 seconds



13. Enable a table: enable

hbase(main):017:0> enable 'User'
0 row(s) in 1.3470 seconds



14. Delete a table: drop

A table must be disabled before it can be dropped.

hbase(main):031:0> disable 'TEST.USER'
0 row(s) in 2.2640 seconds
hbase(main):033:0> drop 'TEST.USER'
0 row(s) in 1.2490 seconds

 

(C) Operating HBase with the Scala API

import org.apache.hadoop.hbase.{HTableDescriptor,HColumnDescriptor,HBaseConfiguration,TableName,CellUtil}
import org.apache.hadoop.hbase.client.{ConnectionFactory,Put,Get,Delete,Scan}
import org.apache.hadoop.hbase.util.Bytes
import scala.collection.JavaConversions._
import java.util



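val conf=HBaseConfiguration.create()
//Creating a Connection is heavyweight; it is thread-safe and is the entry point for all HBase operations
val conn=ConnectionFactory.createConnection(conf)
//Get an Admin object from the Connection (the equivalent of the old HBaseAdmin)
val admin=conn.getAdmin
//Name of the table used in this example
val userTable=TableName.valueOf("user_score_table")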


val cf1="scoreInfo"
val cf2="addressInfo"
val cn1="math"
val cn2="physics"
val cn3="Addr"


if(admin.tableExists(userTable)){
  println("Table exists!")
  //admin.disableTable(userTable)
  //admin.deleteTable(userTable)
  //exit()
}else{
  val tableDesc=new HTableDescriptor(userTable)
  tableDesc.addFamily(new HColumnDescriptor("scoreInfo".getBytes))
  tableDesc.addFamily(new HColumnDescriptor("addressInfo".getBytes))
  admin.createTable(tableDesc)
  println("Create table success!")
}



//Get a Table object from the Connection for reading and writing data
val table=conn.getTable(userTable)

//Insert a record whose rowkey is IromMan
val p=new Put("IromMan".getBytes())
//Specify the column and value for the put (the old put.add method is deprecated)
p.addColumn(cf1.getBytes,cn1.getBytes,"98".getBytes) // scoreInfo:math  98
p.addColumn(cf1.getBytes,cn2.getBytes,"87".getBytes) // scoreInfo:physics  87
p.addColumn(cf2.getBytes,cn3.getBytes,"Beijing".getBytes) // addressInfo:Addr  Beijing
table.put(p)


//Query data by rowkey
val listGet=new util.ArrayList[Get]
val get=new Get(Bytes.toBytes("id002_Thor"))
val get2=new Get(Bytes.toBytes("id003_jack"))
listGet.add(get)
listGet.add(get2)
//Skip rowkeys that were not found, then map each cell to (rowkey, (qualifier, value))
val resultArr=table.get(listGet).filterNot(_.isEmpty).flatMap(z=>{
  val cellArr=z.rawCells()
  val valueArr=cellArr.map(n=>(Bytes.toString(z.getRow()),(Bytes.toString(CellUtil.cloneQualifier(n)),Bytes.toString(CellUtil.cloneValue(n)))))
  valueArr
})


//Release resources: close the Table and Admin first, then the Connection
table.close()
admin.close()
conn.close()

 


Origin blog.csdn.net/u012761191/article/details/105311437