Commonly used HBase shell commands

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/weixin_40294332/article/details/81237176
  1. Start the shell
# ./hbase shell
  2. Create a table
> create 'test', 'cf'
  3. Add data
> put 'test', 'row1','cf:a', 'value1'
  4. Query data
hbase(main):006:0> scan 'test'
ROW                                   COLUMN+CELL                                                                                               
 row1                                 column=cf:a, timestamp=1525675291901, value=value1                                                        
1 row(s)
Took 0.0475 seconds                                        
  5. List tables
hbase(main):005:0> list
TABLE                                                                                                                                           
test                                                                                                                                            
1 row(s)
Took 0.0114 seconds                                                                                                                             
=> ["test"]
  6. Table variables
hbase(main):007:0> tab = get_table 'test'
Took 0.0004 seconds                                                                                                                             
=> Hbase::Table - test
hbase(main):008:0> tab.scan
ROW                                   COLUMN+CELL                                                                                               
 row1                                 column=cf:a, timestamp=1525675291901, value=value1                                                        
1 row(s)
Took 0.0203 seconds

hbase(main):016 > tables = list('t.*')
TABLE
t
1 row(s) in 0.1040 seconds
  7. Date formatting
 LOG data to timestamp
To convert the date '08/08/16 20:56:29' from an hbase log into a timestamp, do:

hbase(main):021:0> import java.text.SimpleDateFormat
hbase(main):022:0> import java.text.ParsePosition
hbase(main):023:0> SimpleDateFormat.new("yy/MM/dd HH:mm:ss").parse("08/08/16 20:56:29", ParsePosition.new(0)).getTime()
=> 1218920189000
To go the other direction:

hbase(main):021:0> import java.util.Date
hbase(main):022:0> Date.new(1218920189000).toString()
=> "Sat Aug 16 20:56:29 UTC 2008"
Producing output in exactly the HBase log format takes a little more work with SimpleDateFormat.
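As a rough illustration of the round trip, the sketch below does the same conversion with Ruby's own Time class rather than SimpleDateFormat, so it also runs outside the shell. The helper names log_date_to_millis and millis_to_log_date are made up for this example, and it assumes the log timestamps are in UTC, as the output above suggests.

```ruby
require 'time'  # provides Time.strptime

# Format of the date in an HBase log line, e.g. "08/08/16 20:56:29"
# (the Ruby counterpart of the yy/MM/dd HH:mm:ss pattern used above).
LOG_FORMAT = "%y/%m/%d %H:%M:%S"

# Parse a log date into epoch milliseconds. Assumption: the log is in UTC.
def log_date_to_millis(text)
  Time.strptime("#{text} +0000", "#{LOG_FORMAT} %z").to_i * 1000
end

# Render epoch milliseconds back in the log format, in UTC.
def millis_to_log_date(millis)
  Time.at(millis / 1000).utc.strftime(LOG_FORMAT)
end

millis = log_date_to_millis("08/08/16 20:56:29")
puts millis
puts millis_to_log_date(millis)
```

Formatting with the same pattern that was used for parsing is what makes the output match the log line character for character.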
  8. Querying shell configuration
hbase(main):001:0> @shell.hbase.configuration.get("hbase.rpc.timeout")
=> "60000"
To set a config in the shell:

hbase(main):005:0> @shell.hbase.configuration.setInt("hbase.rpc.timeout", 61010)
hbase(main):006:0> @shell.hbase.configuration.get("hbase.rpc.timeout")
=> "61010"

  9. Pre-splitting tables via the shell

You can use a variety of options to pre-split tables when creating them via the HBase Shell create command.

The simplest approach is to specify an array of split points when creating the table. Note that string literals given as split points are interpreted by their underlying byte representation. A split point of '10' is therefore the byte sequence '\x31\x30'.

The split points will define n+1 regions where n is the number of split points. The lowest region will contain all keys from the lowest possible key up to but not including the first split point key. The next region will contain keys from the first split point up to, but not including the next split point key. This will continue for all split points up to the last. The last region will be defined from the last split point up to the maximum possible key.

hbase>create 't1','f',SPLITS => ['10','20','30']
In the above example, the table 't1' will be created with column family 'f', pre-split into four regions. Note that the first region will contain all keys that sort before '10' (bytes '\x31\x30'), which includes every key whose first byte lies between '\x00' and '\x30' ('\x31' is the ASCII code for '1').
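To make the byte-wise ordering concrete, here is a minimal plain-Ruby sketch that prints the bytes of a split point and works out which region a row key would land in for the split points '10', '20', '30'. The helpers hex_bytes and region_index are hypothetical names for this illustration and are not part of the HBase shell.

```ruby
# Split points from the example above; keys are compared by their bytes.
SPLITS = ['10', '20', '30']

# Show the underlying bytes of a key as hex escapes.
def hex_bytes(key)
  key.bytes.map { |b| format('\\x%02x', b) }.join
end

# Zero-based index of the region a row key falls into.
# Region i covers [splits[i-1], splits[i]); the first and last regions
# are open-ended, so n split points give n + 1 regions.
def region_index(key, splits)
  idx = splits.index { |split| (key.b <=> split.b) < 0 }
  idx.nil? ? splits.length : idx
end

puts hex_bytes('10')
%w[0 1 10 15 2 20 30 9].each do |key|
  puts "#{key.inspect} -> region #{region_index(key, SPLITS)}"
end
```

Note that '2' sorts after '10' and '9' sorts after '30', because the comparison goes byte by byte rather than by numeric value.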

You can also pass the split points in a file, using the following variation. In this example, the splits are read from the file at the given path on the local filesystem. Each line in the file specifies one split point key.

hbase>create 't14','f',SPLITS_FILE=>'splits.txt'
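A splits file in that one-key-per-line layout could be produced along these lines. This is only a sketch: the split points are made up, and the file is written to a temporary directory so the snippet is self-contained, whereas in practice it would go wherever the path passed to SPLITS_FILE points.

```ruby
require 'tmpdir'

# Hypothetical split points; each one becomes a line in the file.
split_points = ['10', '20', '30']

# Temporary location, used here only to keep the example self-contained.
path = File.join(Dir.mktmpdir, 'splits.txt')
File.write(path, split_points.join("\n") + "\n")

puts File.read(path)
```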
The other options are to automatically compute splits based on a desired number of regions and a splitting algorithm. HBase supplies algorithms for splitting the key range based on uniform splits or based on hexadecimal keys, but you can provide your own splitting algorithm to subdivide the key range.

# create table with four regions based on random bytes keys
hbase>create 't2','f1', { NUMREGIONS => 4 , SPLITALGO => 'UniformSplit' }

# create table with five regions based on hex keys
hbase>create 't3','f1', { NUMREGIONS => 5, SPLITALGO => 'HexStringSplit' }
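The idea behind splitting on hexadecimal keys can be sketched in a few lines of plain Ruby. This is an illustration only, not HBase's HexStringSplit implementation: it assumes an 8-digit key space from 00000000 to ffffffff and simply spaces the split points evenly, and hex_split_points is a hypothetical helper name.

```ruby
# Evenly spaced hex split points over an assumed key space of
# 16**digits values (00000000..ffffffff for the default of 8 digits).
# Illustrative only; HBase's own algorithm may differ in detail.
def hex_split_points(num_regions, digits = 8)
  key_space = 16**digits
  (1...num_regions).map do |i|
    format("%0#{digits}x", key_space * i / num_regions)
  end
end

puts hex_split_points(5).inspect
```

Five regions need four split points, which matches the n+1 rule described earlier.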
As the HBase Shell is effectively a Ruby environment, you can use simple Ruby scripts to compute splits algorithmically.

# generate splits for long (Ruby fixnum) key range from start to end key
hbase(main):070:0> def gen_splits(start_key,end_key,num_regions)
hbase(main):071:1>   results=[]
hbase(main):072:1>   range=end_key-start_key
hbase(main):073:1>   incr=(range/num_regions).floor
hbase(main):074:1>   for i in 1 .. num_regions-1
hbase(main):075:2>     results.push([i*incr+start_key].pack("N"))
hbase(main):076:2>   end
hbase(main):077:1>   return results
hbase(main):078:1> end
hbase(main):079:0>
hbase(main):080:0> splits=gen_splits(1,2000000,10)
=> ["\000\003\r@", "\000\006\032\177", "\000\t'\276", "\000\f4\375", "\000\017B<", "\000\022O{", "\000\025\\\272", "\000\030i\371", "\000\ew8"]
hbase(main):081:0> create 'test_splits','f',SPLITS=>splits
0 row(s) in 0.2670 seconds

=> Hbase::Table - test_splits
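The packed split points are hard to read as escaped strings. As a quick check, they can be unpacked back into integers. The sketch below restates gen_splits in condensed form so that it runs on its own in plain Ruby.

```ruby
# Condensed restatement of the helper from the session above.
def gen_splits(start_key, end_key, num_regions)
  incr = (end_key - start_key) / num_regions  # integer division
  (1..num_regions - 1).map { |i| [i * incr + start_key].pack("N") }
end

splits = gen_splits(1, 2_000_000, 10)

# unpack("N") reverses pack("N"): 4 bytes, big-endian, unsigned.
decoded = splits.map { |s| s.unpack("N").first }
puts decoded.inspect
```

Because pack("N") is big-endian, the byte order of the packed keys matches their numeric order, which is what makes them usable as split points.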
Note that the HBase Shell command truncate effectively drops and recreates the table with default options which will discard any pre-splitting. If you need to truncate a pre-split table, you must drop and recreate the table explicitly to re-specify custom split options.
  10. Debug mode
To enable DEBUG level logging in the shell, launch it with the -d option.

$ ./bin/hbase shell -d
  11. Counting rows
The count command returns the number of rows in a table. It is quite fast when configured with the right CACHE setting:

hbase> count '<tablename>', CACHE => 1000
The above count fetches 1000 rows at a time. Set CACHE lower if your rows are big. Default is to fetch one row at a time.
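As a back-of-the-envelope view of why CACHE matters, the sketch below estimates how many fetches a count would need. It assumes a hypothetical table of one million rows and that every fetch returns exactly CACHE rows, which ignores any other limits the server may apply; estimated_fetches is a made-up helper name.

```ruby
# Rough number of fetches needed to walk a table, assuming each fetch
# returns exactly `cache` rows except possibly the last one.
def estimated_fetches(total_rows, cache)
  (total_rows + cache - 1) / cache  # integer ceiling division
end

puts estimated_fetches(1_000_000, 1)
puts estimated_fetches(1_000_000, 1000)
```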
