Getting Started with HBase: HBase Shell Operations and the Corresponding Java API

HBase is, at heart, a database, and like any database it supports insert, delete, update, and query operations. Below we walk through these operations side by side: first with the hbase shell, then with the corresponding Java API.

Start HBase:

[root@dev-02 bin]# ./start-hbase.sh 

Connect to HBase with the shell:

[zhangshk@fonova-hadoop1 ~]$ hbase shell
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
17/12/17 12:10:46 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.0.0-cdh5.5.2, rUnknown, Mon Jan 25 16:27:11 PST 2016
Then list all tables:
hbase(main):012:0> list

The Java API code to connect to HBase and list all tables:


package com.zhangshk;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class TestConnectionToHbase {
    // HBaseConfiguration.create() loads hbase-site.xml from the classpath
    public static Configuration conf = HBaseConfiguration.create();

    public static void main(String[] args) throws Exception {
        HBaseAdmin hBaseAdmin = new HBaseAdmin(conf);
        TableName[] tableNames = hBaseAdmin.listTableNames();
        for (TableName tableName : tableNames) {
            System.out.println(tableName.toString());
        }
    }
}
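
As an aside, from HBase 1.0 onwards the new HBaseAdmin(conf) constructor is deprecated in favor of ConnectionFactory and the Admin interface. A minimal sketch of the same listing with that API (the class name ListTablesWithConnection is my own, not part of the original code):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ListTablesWithConnection {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();  // still reads hbase-site.xml from the classpath
        // Connection and Admin are Closeable, so try-with-resources cleans them up
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            for (TableName tableName : admin.listTableNames()) {
                System.out.println(tableName.getNameAsString());
            }
        }
    }
}

The remaining examples keep the HBaseAdmin/HTable style, which still works on this CDH 5.5.2 (HBase 1.0.0) release.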

Creating a table with the hbase shell:

Let's first look at the syntax of create; just type create at the hbase shell prompt:
hbase(main):030:0> create

Examples:

Create a table with namespace=ns1 and table qualifier=t1
  hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}

Table configuration options can be put at the end.
Examples:

  hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}

You can also keep around a reference to the created table:

  hbase> t1 = create 't1', 'f1'

Which gives you a reference to the table named 't1', on which you can then
call methods.

The built-in help is quite detailed. Now let's create a table named zhangshk:tb9 with a column family named info (the zhangshk namespace must already exist; it can be created with create_namespace):



hbase(main):014:0* create 'zhangshk:tb9','info'
0 row(s) in 0.4140 seconds

=> Hbase::Table - zhangshk:tb9

Verify that the table exists:
hbase(main):019:0> exists 'zhangshk:tb9'
Table zhangshk:tb9 does exist                                                                         
0 row(s) in 0.0200 seconds

The corresponding Java API is shown below.

Note: the hbase shell requires at least one column family when creating a table, while the Java API does not force you to specify one and can create an empty table. A variant that does add a column family, matching the shell example, is sketched right after this snippet.
    public static void main(String[] args) throws Exception {
        HBaseAdmin hBaseAdmin = new HBaseAdmin(conf);
        createTable(hBaseAdmin, "zhangshk:tb10");
    }

    /**
     * Create an HBase table (if it does not already exist) and return its name.
     *
     * @param tableName table name, optionally prefixed with a namespace
     * @return the table name that was passed in
     */
    public static String createTable(HBaseAdmin hBaseAdmin, String tableName) throws Exception {
        if (!hBaseAdmin.tableExists(tableName)) {
            hBaseAdmin.createTable(new HTableDescriptor(tableName));
        }
        return tableName;
    }
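
If you want the Java-created table to look like the shell example, with an info column family, here is a minimal sketch (the method name createTableWithFamily is my own; the same addFamily call appears in the complete demo at the end of this post):

    public static String createTableWithFamily(HBaseAdmin hBaseAdmin, String tableName) throws Exception {
        if (!hBaseAdmin.tableExists(tableName)) {
            HTableDescriptor descriptor = new HTableDescriptor(TableName.valueOf(tableName));
            descriptor.addFamily(new HColumnDescriptor("info"));   // column family, like create '...','info' in the shell
            hBaseAdmin.createTable(descriptor);
        }
        return tableName;
    }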

Now let's put some data into the table we just created. As a quick aside, here is what you get if you run create with only a table name and no column family:

hbase(main):034:0> create 'zhangshk:tb11'
(The shell does not create the table; it just prints the same create help shown above, which begins: "Creates a table. Pass a table name, and a set of column family specifications (at least one), and, optionally, table configuration.")

Next, the usage of put:

hbase(main):035:0> put

ERROR: wrong number of arguments (0 for 4)

Here is some help for this command:
Put a cell 'value' at specified table/row/column and optionally
timestamp coordinates.  To put a cell value into table 'ns1:t1' or 't1'
at row 'r1' under column 'c1' marked with the time 'ts1', do:

  hbase> put 'ns1:t1', 'r1', 'c1', 'value'
  hbase> put 't1', 'r1', 'c1', 'value'
  hbase> put 't1', 'r1', 'c1', 'value', ts1
  hbase> put 't1', 'r1', 'c1', 'value', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> put 't1', 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> put 't1', 'r1', 'c1', 'value', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:

  hbase> t.put 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}

The help is detailed enough; now let's put a few cells into zhangshk:tb9:


hbase(main):038:0> put 'zhangshk:tb9','10002','info:age','22'
0 row(s) in 0.0170 seconds

hbase(main):039:0> put 'zhangshk:tb9','10003','info:name','zhangshk'
0 row(s) in 0.0130 seconds

hbase(main):040:0> put 'zhangshk:tb9','10002','info:name','zhangshk'
0 row(s) in 0.0120 seconds

hbase(main):041:0> put 'zhangshk:tb9','10002','info:sex','male'
0 row(s) in 0.0120 seconds

Check the result:
hbase(main):043:0> scan 'zhangshk:tb9'
ROW                        COLUMN+CELL                                                                 
 10001                     column=info:age, timestamp=1513486347992, value=10                          
 10002                     column=info:age, timestamp=1513487143038, value=22                          
 10002                     column=info:name, timestamp=1513487176361, value=zhangshk                   
 10002                     column=info:sex, timestamp=1513487191856, value=male                        
 10003                     column=info:name, timestamp=1513487165412, value=zhangshk                   
3 row(s) in 0.0130 seconds

Putting data with the Java API looks like this:

    public static void main(String[] args) throws Exception {
        HBaseAdmin hBaseAdmin = new HBaseAdmin(conf);
        //createTable(hBaseAdmin,"zhangshk:tb11");
        HTable hTable = new HTable(conf, "zhangshk:tb9");
        putData(hTable);
    }

    /**
     * Insert one cell into the HBase table.
     * @param htable
     * @throws Exception
     */
    public static void putData(HTable htable) throws Exception {
        Put put = new Put(Bytes.toBytes("10003"));                                        // row key
        put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("age"), Bytes.toBytes("18"));  // family, qualifier, value
        htable.put(put);
    }
Check the result:
hbase(main):047:0> scan 'zhangshk:tb9'
ROW                        COLUMN+CELL                                                                 
 10001                     column=info:age, timestamp=1513486347992, value=10                          
 10002                     column=info:age, timestamp=1513487143038, value=22                          
 10002                     column=info:name, timestamp=1513487176361, value=zhangshk                   
 10002                     column=info:sex, timestamp=1513487191856, value=male                        
 10003                     column=info:age, timestamp=1513491759194, value=18                          
 10003                     column=info:name, timestamp=1513491742964, value=zhangshk                   
3 row(s) in 0.0180 seconds
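
Incidentally, if you want to write several cells in one call from Java (like the series of shell puts above), HTable.put also accepts a list of Puts. A sketch (the method name putBatch is my own; it additionally needs the java.util.List and java.util.ArrayList imports):

    public static void putBatch(HTable htable) throws Exception {
        List<Put> puts = new ArrayList<Put>();

        Put p1 = new Put(Bytes.toBytes("10002"));
        p1.addColumn(Bytes.toBytes("info"), Bytes.toBytes("age"), Bytes.toBytes("22"));
        p1.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("zhangshk"));
        p1.addColumn(Bytes.toBytes("info"), Bytes.toBytes("sex"), Bytes.toBytes("male"));
        puts.add(p1);

        Put p2 = new Put(Bytes.toBytes("10003"));
        p2.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("zhangshk"));
        puts.add(p2);

        htable.put(puts);   // sends all the mutations in one client call
    }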

Both scan and get can read data. scan does a full-table or range scan, so it is usually not the way to look up a single record; get takes an explicit row key (and optionally specific columns), which makes the lookup much more targeted and efficient.

First, the hbase shell help for scan:

hbase(main):048:0> scan

Here is some help for this command:
Scan a table; pass table name and optionally a dictionary of scanner
specifications.  Scanner specifications may include one or more of:
TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH,
or COLUMNS, CACHE or RAW, VERSIONS

If no columns are specified, all columns will be scanned.
To scan all members of a column family, leave the qualifier empty as in
'col_family:'.

The filter can be specified in two ways:
1. Using a filterString - more information on this is available in the
Filter Language document attached to the HBASE-4176 JIRA
2. Using the entire package name of the filter.

Some examples:

  hbase> scan 'hbase:meta'
  hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
  hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
  hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
  hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
  hbase> scan 't1', {REVERSED => true}
  hbase> scan 't1', {FILTER => "(PrefixFilter ('row2') AND
    (QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))"}
  hbase> scan 't1', {FILTER =>
    org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
  hbase> scan 't1', {CONSISTENCY => 'TIMELINE'}
For setting the Operation Attributes 
  hbase> scan 't1', { COLUMNS => ['c1', 'c2'], ATTRIBUTES => {'mykey' => 'myvalue'}}
  hbase> scan 't1', { COLUMNS => ['c1', 'c2'], AUTHORIZATIONS => ['PRIVATE','SECRET']}
For experts, there is an additional option -- CACHE_BLOCKS -- which
switches block caching for the scanner on (true) or off (false).  By
default it is enabled.  Examples:

  hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}

Also for experts, there is an advanced option -- RAW -- which instructs the
scanner to return all cells (including delete markers and uncollected deleted
cells). This option cannot be combined with requesting specific COLUMNS.
Disabled by default.  Example:

  hbase> scan 't1', {RAW => true, VERSIONS => 10}

Besides the default 'toStringBinary' format, 'scan' supports custom formatting
by column.  A user can define a FORMATTER by adding it to the column name in
the scan specification.  The FORMATTER can be stipulated: 

 1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
 2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers: 
  hbase> scan 't1', {COLUMNS => ['cf:qualifier1:toInt',
    'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] } 

Note that you can specify a FORMATTER by column only (cf:qualifier).  You cannot
specify a FORMATTER for all columns of a column family.

Scan can also be used directly from a table, by first getting a reference to a
table, like such:

  hbase> t = get_table 't'
  hbase> t.scan

Note in the above situation, you can still provide all the filtering, columns,
options, etc as described above.

Trying it out:

hbase(main):051:0> scan 'zhangshk:tb9'
ROW                        COLUMN+CELL                                                                 
 10001                     column=info:age, timestamp=1513486347992, value=10                          
 10002                     column=info:age, timestamp=1513487143038, value=22                          
 10002                     column=info:name, timestamp=1513487176361, value=zhangshk                   
 10002                     column=info:sex, timestamp=1513487191856, value=male                        
 10003                     column=info:age, timestamp=1513491759194, value=18                          
 10003                     column=info:name, timestamp=1513491742964, value=zhangshk                   
3 row(s) in 0.0170 seconds

hbase(main):054:0> scan 'zhangshk:tb9',{LIMIT=>2}
ROW                        COLUMN+CELL                                                                 
 10001                     column=info:age, timestamp=1513486347992, value=10                          
 10002                     column=info:age, timestamp=1513487143038, value=22                          
 10002                     column=info:name, timestamp=1513487176361, value=zhangshk                   
 10002                     column=info:sex, timestamp=1513487191856, value=male                        
2 row(s) in 0.0140 seconds

hbase(main):055:0> scan 'zhangshk:tb9',{LIMIT=>2,STARTROW=>'10002'}
ROW                        COLUMN+CELL                                                                 
 10002                     column=info:age, timestamp=1513487143038, value=22                          
 10002                     column=info:name, timestamp=1513487176361, value=zhangshk                   
 10002                     column=info:sex, timestamp=1513487191856, value=male                        
 10003                     column=info:age, timestamp=1513491759194, value=18                          
 10003                     column=info:name, timestamp=1513491742964, value=zhangshk                   
2 row(s) in 0.0250 seconds

hbase(main):057:0> scan 'zhangshk:tb9',{LIMIT=>2,STARTROW=>'10002',COLUMNS=>'info:age'}
ROW                        COLUMN+CELL                                                                 
 10002                     column=info:age, timestamp=1513487143038, value=22                          
 10003                     column=info:age, timestamp=1513491759194, value=18                          
2 row(s) in 0.0300 seconds

hbase(main):058:0> scan 'zhangshk:tb9',{LIMIT=>2,STARTROW=>'10002',STOPROW=>'10003',COLUMNS=>'info:age'}
ROW                        COLUMN+CELL                                                                 
 10002                     column=info:age, timestamp=1513487143038, value=22                          
1 row(s) in 0.0200 seconds


The corresponding Java API:


    public static void main(String[] args) throws Exception {
        HBaseAdmin hBaseAdmin = new HBaseAdmin(conf);
        //createTable(hBaseAdmin,"zhangshk:tb11");
        HTable hTable = new HTable(conf, "zhangshk:tb9");
        //putData(hTable);
        scanTable(hTable);
    }

    /**
     * Scan a row range (start row inclusive, stop row exclusive) and print every cell.
     * @param hTable
     * @throws Exception
     */
    public static void scanTable(HTable hTable) throws Exception {
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes("10002"))
            .setStopRow(Bytes.toBytes("10003"))
            .addColumn(Bytes.toBytes("info"), Bytes.toBytes("age"));
        ResultScanner scanner = hTable.getScanner(scan);
        try {
            // iterate over every Result in the range, not just the first one
            for (Result result : scanner) {
                for (Cell cell : result.rawCells()) {
                    System.out.println(
                            Bytes.toString(CellUtil.cloneFamily(cell)) + "->" + Bytes.toString(CellUtil.cloneQualifier(cell))
                                    + "->" + Bytes.toString(CellUtil.cloneValue(cell)) + "->" + cell.getTimestamp());
                }
            }
        } finally {
            scanner.close();
        }
    }

The output is:
info->age->22->1513487143038

This matches the hbase shell result.
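
To approximate the shell's LIMIT option on the Java side, one common approach is a PageFilter. A sketch (the method name scanWithLimit is my own; note that a PageFilter limits rows per region server, so the client-side break is still needed when the table spans several regions):

    public static void scanWithLimit(HTable hTable, int limit) throws Exception {
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes("10002"));
        scan.setFilter(new org.apache.hadoop.hbase.filter.PageFilter(limit));
        ResultScanner scanner = hTable.getScanner(scan);
        try {
            int count = 0;
            for (Result result : scanner) {
                for (Cell cell : result.rawCells()) {
                    System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "->"
                            + Bytes.toString(CellUtil.cloneQualifier(cell)) + "->"
                            + Bytes.toString(CellUtil.cloneValue(cell)));
                }
                if (++count >= limit) {
                    break;   // stop once we have enough rows on the client side
                }
            }
        } finally {
            scanner.close();
        }
    }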

Now let's look at querying with get.
First, the help:

hbase(main):059:0> get

Here is some help for this command:
Get row or cell contents; pass table name, row, and optionally
a dictionary of column(s), timestamp, timerange and versions. Examples:

  hbase> get 'ns1:t1', 'r1'
  hbase> get 't1', 'r1'
  hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
  hbase> get 't1', 'r1', {COLUMN => 'c1'}
  hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
  hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
  hbase> get 't1', 'r1', 'c1'
  hbase> get 't1', 'r1', 'c1', 'c2'
  hbase> get 't1', 'r1', ['c1', 'c2']
  hbase> get 't1', 'r1', {COLUMN => 'c1', ATTRIBUTES => {'mykey'=>'myvalue'}}
  hbase> get 't1', 'r1', {COLUMN => 'c1', AUTHORIZATIONS => ['PRIVATE','SECRET']}
  hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE'}
  hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}

Besides the default 'toStringBinary' format, 'get' also supports custom formatting by
column.  A user can define a FORMATTER by adding it to the column name in the get
specification.  The FORMATTER can be stipulated: 

 1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
 2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers: 
  hbase> get 't1', 'r1' {COLUMN => ['cf:qualifier1:toInt',
    'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] } 

Note that you can specify a FORMATTER by column only (cf:qualifier).  You cannot specify
a FORMATTER for all columns of a column family.

The same commands also can be run on a reference to a table (obtained via get_table or
create_table). Suppose you had a reference t to table 't1', the corresponding commands
would be:

  hbase> t.get 'r1'
  hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
  hbase> t.get 'r1', {COLUMN => 'c1'}
  hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
  hbase> t.get 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
  hbase> t.get 'r1', 'c1'
  hbase> t.get 'r1', 'c1', 'c2'
  hbase> t.get 'r1', ['c1', 'c2']
  hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE'}
  hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}

In the hbase shell:

hbase(main):063:0> get 'zhangshk:tb9','10003',{COLUMN=>'info:age'}
COLUMN                     CELL                                                                        
 info:age                  timestamp=1513491759194, value=18                                           
1 row(s) in 0.0080 seconds

With the Java API:

    public static void main(String[] args) throws Exception {
        HBaseAdmin hBaseAdmin = new HBaseAdmin(conf);
        //createTable(hBaseAdmin,"zhangshk:tb11");
        HTable hTable = new HTable(conf, "zhangshk:tb9");
        //putData(hTable);
        //scanTable(hTable);
        getData(hTable);
    }

    /**
     * Read one cell with a get.
     * @param hTable
     * @throws Exception
     */
    public static void getData(HTable hTable) throws Exception {
        Get get = new Get(Bytes.toBytes("10003"));                    // row key
        get.addColumn(Bytes.toBytes("info"), Bytes.toBytes("age"));   // restrict the get to info:age
        Result result = hTable.get(get);
        for (Cell cell : result.rawCells()) {
            System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + "->" +
                    Bytes.toString(CellUtil.cloneFamily(cell)) + "->" + Bytes.toString(CellUtil.cloneQualifier(cell)) + "->" + Bytes.toString(CellUtil.cloneValue(cell))
            );
        }
    }
The output is:
10003->info->age->18

This matches the hbase shell result.

Update: in HBase an update is just another put. Writing to an existing row/column creates a newer version, and reads return the latest value:

hbase(main):064:0> put 'zhangshk:tb9','10003','info:name','zhangshk_update'
0 row(s) in 0.0160 seconds

hbase(main):065:0> get 'zhangshk:tb9','10003'
COLUMN                     CELL                                                                        
 info:age                  timestamp=1513491759194, value=18                                           
 info:name                 timestamp=1513493686439, value=zhangshk_update                              
2 row(s) in 0.0090 seconds
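
On the Java side an update is the same Put call shown earlier, just with the new value. A minimal sketch matching the shell command above (the method name updateData is my own):

    public static void updateData(HTable hTable) throws Exception {
        Put put = new Put(Bytes.toBytes("10003"));
        // same coordinates as before; the newer timestamp is what get/scan will return
        put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("zhangshk_update"));
        hTable.put(put);
    }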

DELETE: finally, the delete operations.

The shell has two commands, delete and deleteall. Let's look at delete first:

hbase(main):068:0> delete

Here is some help for this command:
Put a delete cell value at specified table/row/column and optionally
timestamp coordinates.  Deletes must match the deleted cell's
coordinates exactly.  When scanning, a delete cell suppresses older
versions. To delete a cell from  't1' at row 'r1' under column 'c1'
marked with the time 'ts1', do:

  hbase> delete 'ns1:t1', 'r1', 'c1', ts1
  hbase> delete 't1', 'r1', 'c1', ts1
  hbase> delete 't1', 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

The same command can also be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:

  hbase> t.delete 'r1', 'c1',  ts1
  hbase> t.delete 'r1', 'c1',  ts1, {VISIBILITY=>'PRIVATE|SECRET'}

As you can see, delete expects a timestamp, which we usually don't know in advance, so let's look at deleteall instead:
hbase(main):071:0> deleteall

Here is some help for this command:
Delete all cells in a given row; pass a table name, row, and optionally
a column and timestamp. Examples:

  hbase> deleteall 'ns1:t1', 'r1'
  hbase> deleteall 't1', 'r1'
  hbase> deleteall 't1', 'r1', 'c1'
  hbase> deleteall 't1', 'r1', 'c1', ts1
  hbase> deleteall 't1', 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:

  hbase> t.deleteall 'r1'
  hbase> t.deleteall 'r1', 'c1'
  hbase> t.deleteall 'r1', 'c1', ts1
  hbase> t.deleteall 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

deleteall can delete at several granularities: a whole row, a single column, or a column at a specific timestamp.

hbase(main):073:0> deleteall 'zhangshk:tb9','10003'
0 row(s) in 0.0320 seconds

hbase(main):074:0> scan 'zhangshk:tb9'
ROW                        COLUMN+CELL                                                                 
 10001                     column=info:age, timestamp=1513486347992, value=10                          
 10002                     column=info:age, timestamp=1513487143038, value=22                          
 10002                     column=info:name, timestamp=1513487176361, value=zhangshk                   
 10002                     column=info:sex, timestamp=1513487191856, value=male                        
2 row(s) in 0.0130 seconds

The Java API has no separate delete and deleteall; there is a single Delete class that covers both, from a whole row down to a single column. Deleting a whole row looks like this (a column-level variant is sketched right after it):

public static void deleteData(HTable hTable) throws Exception {
        Delete delete = new Delete(Bytes.toBytes("10002"));   // no column specified, so the whole row is deleted
        hTable.delete(delete);
    }
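
The same Delete object can be narrowed to a single column, which corresponds to deleteall 't1', 'r1', 'c1' in the shell. A sketch (the method name deleteColumn is my own):

    public static void deleteColumn(HTable hTable) throws Exception {
        Delete delete = new Delete(Bytes.toBytes("10003"));
        // addColumns removes all versions of info:name; addColumn would remove only the latest version
        delete.addColumns(Bytes.toBytes("info"), Bytes.toBytes("name"));
        hTable.delete(delete);
    }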

The result:
hbase(main):075:0> scan 'zhangshk:tb9'
ROW                        COLUMN+CELL                                                                 
 10001                     column=info:age, timestamp=1513486347992, value=10                          
1 row(s) in 0.0190 seconds

So far we have only deleted data. To leave no trace at all, the table itself has to go: in HBase you must first disable a table, and then remove it with drop, which deletes the entire table.

Here is the help for disable:

hbase(main):076:0> disable

Here is some help for this command:
Start disable of named table:
  hbase> disable 't1'
  hbase> disable 'ns1:t1'
hbase(main):077:0> disable 'zhangshk:tb9'
0 row(s) in 1.2640 seconds

hbase(main):080:0> drop 'zhangshk:tb9'
0 row(s) in 0.3400 seconds

hbase(main):081:0> describe 'zhangshk:tb9'

ERROR: Unknown table zhangshk:tb9!

The table no longer exists.

Deleting a table with the Java API:

/**
     * Drop a table: disable it first, then delete it.
     * @param hBaseAdmin
     * @throws Exception
     */
    public static void deleteTable(HBaseAdmin hBaseAdmin) throws Exception {
        if (hBaseAdmin.tableExists("zhangshk:tb9")) {
            hBaseAdmin.disableTable("zhangshk:tb9");
            hBaseAdmin.deleteTable("zhangshk:tb9");
        }
    }

Finally, exit the shell:

hbase(main):084:0* exit
[zhangshk@fonova-hadoop1 hbaseTest-1.0-SNAPSHOT]$ 

Stop the HBase processes:

[zhangshk@fonova-hadoop1 ~]$ sh stop-hbase.sh
stopping hbase........

Below is the complete demo code.
One note: this was run on a fully distributed cluster at work, so HBaseConfiguration.create() picks up the connection settings directly from the hbase-site.xml on the classpath. If you are running a standalone or pseudo-distributed HBase, you have to set the ZooKeeper address and other configuration by hand.
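
For that case, a minimal sketch of setting the ZooKeeper connection info manually (the host and port below are placeholders; use your own quorum):

        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "localhost");           // comma-separated list of ZooKeeper hosts
        conf.set("hbase.zookeeper.property.clientPort", "2181");   // ZooKeeper client port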

package com.zhangshk;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class TestConnectionToHbase {
    public static Configuration conf = HBaseConfiguration.create();

    public static void main(String[] args) throws Exception {
        HBaseAdmin hBaseAdmin = new HBaseAdmin(conf);
        //createTable(hBaseAdmin,"zhangshk:tb11");
        HTable hTable = new HTable(conf,"zhangshk:tb9");
        //putData(hTable);
        //scanTable(hTable);
        //getData(hTable);
        deleteData(hTable);

    }

    /**
     * Create an HBase table with an "info" column family (if it does not already exist) and return its name.
     *
     * @param tableName
     * @return
     */
    public static String createTable(HBaseAdmin hBaseAdmin,String tableName)throws Exception{

        if (!hBaseAdmin.tableExists(tableName)){
            hBaseAdmin.createTable(new HTableDescriptor(tableName).addFamily(new HColumnDescriptor("info")));

        }
        return tableName;
    }

    /**
     * Drop a table: disable it first, then delete it.
     * @param hBaseAdmin
     * @throws Exception
     */
    public  static void deleteTable(HBaseAdmin hBaseAdmin) throws Exception{
        if(hBaseAdmin.tableExists("zhangshk:tb9")){
            hBaseAdmin.disableTable("zhangshk:tb9");
            hBaseAdmin.deleteTable("zhangshk:tb9");
        }

    }

    /**
     * Insert one cell into the HBase table.
     * @param htable
     * @throws Exception
     */
    public static void putData(HTable htable) throws Exception{
        Put put = new Put(Bytes.toBytes("10003"));
        put.addColumn(Bytes.toBytes("info"),Bytes.toBytes("age"),Bytes.toBytes("18"));
        htable.put(put);
    }

    /**
     * Scan a row range (start row inclusive, stop row exclusive) and print every cell.
     * @param hTable
     * @throws Exception
     */
    public static void scanTable(HTable hTable) throws Exception {
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes("10002"))
            .setStopRow(Bytes.toBytes("10003"))
            .addColumn(Bytes.toBytes("info"), Bytes.toBytes("age"));
        ResultScanner scanner = hTable.getScanner(scan);
        try {
            // iterate over every Result in the range, not just the first one
            for (Result result : scanner) {
                for (Cell cell : result.rawCells()) {
                    System.out.println(
                            Bytes.toString(CellUtil.cloneFamily(cell)) + "->" + Bytes.toString(CellUtil.cloneQualifier(cell))
                                    + "->" + Bytes.toString(CellUtil.cloneValue(cell)) + "->" + cell.getTimestamp());
                }
            }
        } finally {
            scanner.close();
        }
    }

    /**
     * Read one cell with a get.
     * @param hTable
     * @throws Exception
     */
    public static void getData(HTable hTable) throws  Exception{
        Get get = new Get(Bytes.toBytes("10003"));
        get.addColumn(Bytes.toBytes("info"),Bytes.toBytes("age"));
        Result result = hTable.get(get);
        Cell[] cells = result.rawCells();
        for (Cell cell:
             cells) {
            System.out.println(Bytes.toString(CellUtil.cloneRow(cell))+"->"+
                    Bytes.toString(CellUtil.cloneFamily(cell))+"->"+Bytes.toString(CellUtil.cloneQualifier(cell))+"->"+Bytes.toString(CellUtil.cloneValue(cell))
            );
        }
    }

    /**
     * Delete a whole row.
     * @param hTable
     * @throws Exception
     */
    public static void deleteData(HTable hTable) throws Exception{
        Delete delete = new Delete(Bytes.toBytes("10002"));
        hTable.delete(delete);
    }
}


Reposted from blog.csdn.net/zhangshk_/article/details/78825314