HBase (Part 2): Common Operations and How Reads and Writes Work

HBase Commands

HBase stores all data as raw bytes.

1. Entering and exiting the client

# Enter the client:
	hbase shell
# Exit the client:
	quit
# Help
	help

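2. Namespace operations

Note: a namespace named default exists out of the box.

#1. List namespaces
  list_namespace

#2. Create a namespace
  create_namespace "namespace_name"

#3. Drop a namespace (it must not contain any tables)
  drop_namespace "namespace_name"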

3. Table operations

# 1. List all tables
hbase(main):024:0> list
TABLE
user_namespace:user # namespace:table
t_person # table in the default namespace; the "default:" prefix is omitted
2 row(s) in 0.1140 seconds

# 2. List all tables in a namespace
hbase(main):027:0> list_namespace_tables "user_namespace"
TABLE
user
1 row(s) in 0.3970 seconds

# 3. Create a table
Syntax: create "namespace:table","family1","family2"
hbase(main):023:0> create "user_namespace:user","info","edu"
0 row(s) in 9.9000 seconds

# 4. Describe a table
hbase(main):030:0> desc "user_namespace:user"
Table user_namespace:user is ENABLED
user_namespace:user
COLUMN FAMILIES DESCRIPTION
{NAME => 'edu', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
 DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
 BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
 DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
 BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 1.6400 seconds

# 5. Disable and drop a table (a table must be disabled before it can be dropped)
hbase(main):002:0> disable "namespace:table"
0 row(s) in 4.4790 seconds

hbase(main):002:0> drop "namespace:table"
0 row(s) in 4.4790 seconds
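
The list output earlier in this section shows user_namespace:user with its prefix and t_person without one. The rule behind that can be sketched in plain Java as follows. QualifiedTableName is a hypothetical helper written for this illustration, not an HBase class; in real client code TableName.valueOf does the parsing.

```java
// Hypothetical helper, not part of the HBase API: splits a shell-style table
// name into namespace and table, assuming "default" when no prefix is given.
class QualifiedTableName {
    final String namespace;
    final String table;

    private QualifiedTableName(String namespace, String table) {
        this.namespace = namespace;
        this.table = table;
    }

    static QualifiedTableName parse(String name) {
        int colon = name.indexOf(':');
        if (colon < 0) {
            // No prefix: the table lives in the default namespace.
            return new QualifiedTableName("default", name);
        }
        return new QualifiedTableName(name.substring(0, colon), name.substring(colon + 1));
    }

    public static void main(String[] args) {
        QualifiedTableName a = parse("user_namespace:user");
        QualifiedTableName b = parse("t_person");
        System.out.println(a.namespace + " / " + a.table);
        System.out.println(b.namespace + " / " + b.table);
    }
}
```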

4. Creating, reading, updating, and deleting data

# 1. Add data (each put writes a single column)
	put "namespace:table","rowkey","family1:column1","value"
# 2. Look up a row by rowkey
	get "namespace:table","rowkey"
# 3. Look up a single column by rowkey and family:column
	get "namespace:table","rowkey","family:column"
# 4. scan: read all data in a table
	hbase(main):019:0> scan "user_namespace:user"
    ROW                         COLUMN+CELL
    1001                       column=info:age, timestamp=1586790192297, value=18
    1001                       column=info:name, timestamp=1586790138031, value=zhangsan1
    1002                      column=info:age, timestamp=1586790893380, value=20
    1002                      column=info:name, timestamp=1586790884872, value=zhangsan2
# 5. scan: read the first 2 rows of a table
	hbase(main):022:0> scan "user_namespace:user",{LIMIT=>2}
    ROW                         COLUMN+CELL
    1001                       column=info:age, timestamp=1586790192297, value=18
    1001                       column=info:name, timestamp=1586790138031, value=zhangsan1
    1002                      column=info:age, timestamp=1586790893380, value=20
    1002                      column=info:name, timestamp=1586790884872, value=zhangsan2
    2 row(s) in 0.5400 seconds
# 6. Range scan with a start row and an end row (the end row is excluded)
	hbase(main):029:0> scan "user_namespace:user",{STARTROW=>"1001",ENDROW=>"1002"}
    ROW                         COLUMN+CELL
    1001                       column=info:age, timestamp=1586790192297, value=18
    1001                       column=info:name, timestamp=1586790138031, value=zhangsan1
    1 row(s) in 0.4420 seconds
# 7. Scan with a start row and a limit
	hbase(main):032:0> scan "user_namespace:user",{STARTROW=>"1001",LIMIT=>2}
    ROW                         COLUMN+CELL
    1001                       column=info:age, timestamp=1586790192297, value=18
    1001                       column=info:name, timestamp=1586790138031, value=zhangsan1
    1002                      column=info:age, timestamp=1586790893380, value=20
    1002                      column=info:name, timestamp=1586790884872, value=zhangsan2
# 8. Update data (in effect, a put that overwrites the old value)
	put "namespace:table","rowkey","family:column","value"
# 9. Delete data (a single cell)
	delete "namespace:table","rowkey","family:column"
# 10. Delete all data for a rowkey
	deleteall "namespace:table","rowkey"
# 11. Count the rows in a table
	count "namespace:table"
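
The range scan in step 6 returns row 1001 but not 1002, because the range is half-open: the start row is included and the end row is not. Row keys are also compared as byte strings, not as numbers. The sketch below simulates both points with a java.util.TreeMap standing in for the table; it does not talk to HBase, and the row 10010 is a hypothetical extra row added only to show the ordering. String comparison matches HBase's byte comparison here only because the keys are plain ASCII.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Simulation only: a sorted map stands in for an HBase table to show how a
// start row and an end row select the half-open range [start, end).
class RangeScanSketch {
    static List<String> scan(TreeMap<String, String> rows, String startRow, String endRow) {
        // Start row is inclusive, end row is exclusive.
        return new ArrayList<>(rows.subMap(startRow, true, endRow, false).keySet());
    }

    static TreeMap<String, String> sampleRows() {
        TreeMap<String, String> rows = new TreeMap<>();
        rows.put("1001", "zhangsan1");
        rows.put("1002", "zhangsan2");
        rows.put("10010", "hypothetical extra row");
        return rows;
    }

    public static void main(String[] args) {
        // "10010" sorts between "1001" and "1002", so it falls inside the range.
        System.out.println(scan(sampleRows(), "1001", "1002")); // prints [1001, 10010]
    }
}
```

One practical consequence: numeric row keys of different lengths do not sort numerically, which is why fixed-width or zero-padded keys are a common choice.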

5. Multiple versions

# 1. Create the table
hbase(main):013:0> create "user_namespace:user","info"
# 2. Change the number of versions
hbase(main):016:0> alter "user_namespace:user",{NAME=>'info',VERSIONS=>2}
# 3. Write to the same cell twice
hbase(main):014:0> put "user_namespace:user","1001","info:name","aaa"
0 row(s) in 0.2620 seconds

hbase(main):015:0> put "user_namespace:user","1001","info:name","bb"
0 row(s) in 0.0290 seconds
# 4. Read multiple versions
hbase(main):017:0> get "user_namespace:user","1001",{COLUMN=>'info:name',VERSIONS=>3}
COLUMN                      CELL
 info:name                  timestamp=1586795010367, value=bb
 info:name                  timestamp=1586795004085, value=aaa
# VERSIONS=>2 on a column family means each cell in that family keeps up to 2 versions. After 3 puts to the same cell, only the 2 most recent versions remain.
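
The retention rule can be modelled in a few lines of plain Java. This is a simulation under simplifying assumptions, not HBase code: VersionedCellSketch is a hypothetical class, and it discards old versions immediately, whereas a real region server hides them on read and removes them physically later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Simulation only: models one cell whose column family keeps at most
// maxVersions versions, ordered newest timestamp first.
class VersionedCellSketch {
    private final int maxVersions;
    private final TreeMap<Long, String> versions = new TreeMap<>(Collections.reverseOrder());

    VersionedCellSketch(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        // Drop the oldest versions once the limit is exceeded.
        while (versions.size() > maxVersions) {
            versions.pollLastEntry();
        }
    }

    // Returns up to requestedVersions values, newest first.
    List<String> get(int requestedVersions) {
        List<String> out = new ArrayList<>();
        for (String value : versions.values()) {
            if (out.size() == requestedVersions) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedCellSketch cell = new VersionedCellSketch(2);
        cell.put(1L, "aaa");
        cell.put(2L, "bb");
        cell.put(3L, "cc");
        System.out.println(cell.get(3)); // prints [cc, bb]
    }
}
```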

HBase API

Environment setup

  • Dependencies

    <properties>
        <hbase.version>1.5.0</hbase.version>
    </properties>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>${hbase.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-common</artifactId>
        <version>${hbase.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-protocol</artifactId>
        <version>${hbase.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-server</artifactId>
        <version>${hbase.version}</version>
    </dependency>
    <!-- Add this if a Jackson mapper exception occurs -->
    <dependency>
        <groupId>org.codehaus.jackson</groupId>
        <artifactId>jackson-mapper-asl</artifactId>
        <version>1.9.13</version>
    </dependency>
    
  • Initial configuration

    Copy hbase-site.xml from HBase's conf directory into the project's resources directory. HBaseConfiguration.create() picks up hbase-site.xml from the classpath on its own; to add it explicitly, pass the classpath-relative name:

    conf.addResource("hbase-site.xml");

  • On Windows, map the cluster hostnames to their IP addresses in the hosts file

API overview

API                  Meaning                                              How to create
Configuration        Configuration                                        HBaseConfiguration.create();
Connection           Connection, used to operate on data                  ConnectionFactory.createConnection(conf);
Admin                Client for metadata (namespaces and table schemas)   conn.getAdmin();
NamespaceDescriptor  Namespace, comparable to a database                  NamespaceDescriptor.create("user_namespace").build();
TableName            Table name                                           TableName.valueOf("user_namespace:user");
HTableDescriptor     Table descriptor                                     new HTableDescriptor(tablename);
HColumnDescriptor    Column family                                        new HColumnDescriptor("info");
Put                  Data to add                                          new Put(Bytes.toBytes("1001"));
Delete               Delete condition keyed by rowkey                     new Delete(Bytes.toBytes("1001"));
Get                  Single-row query                                     new Get(Bytes.toBytes("1001"));
Scan                 Multi-row query (scanner)                            new Scan();
Result               Query result (a single row)                          table.get(get);
ResultScanner        Query results (N rows)                               table.getScanner(scan);
Bytes                Type-conversion utility. HBase stores everything as bytes, so every value
                     is converted to bytes when written and converted back when read.
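
To make the byte conversion concrete, here is a JDK-only sketch of what such conversions produce. It uses java.nio.ByteBuffer rather than the HBase Bytes class, on the assumption that Bytes.toBytes(int) uses the same 4-byte big-endian layout and Bytes.toBytes(String) uses UTF-8; ByteEncodingSketch and its method names are made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// JDK-only illustration of int and String byte encodings.
// Assumption: this mirrors the layout produced by HBase's Bytes utility.
class ByteEncodingSketch {
    static byte[] intToBytes(int value) {
        // ByteBuffer is big-endian by default, so 18 becomes 00 00 00 12 (hex).
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    static int bytesToInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    static byte[] stringToBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(intToBytes(18)));      // prints [0, 0, 0, 18]
        System.out.println(Arrays.toString(stringToBytes("18"))); // prints [49, 56]
        System.out.println(bytesToInt(intToBytes(18)));           // prints 18
    }
}
```

The two encodings of "18" differ, so a value has to be read back with the same type it was written with. A value written as text (for example from the shell) cannot be decoded as a 4-byte int.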

HBase client connection

// Obtain the client
// 1. Load the configuration
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum","192.168.242.30");
// Print log output (log4j)
BasicConfigurator.configure();
// 2. Open the connection
Connection conn = ConnectionFactory.createConnection(conf);
// 3. Get the admin client
Admin admin = conn.getAdmin();
// 4. Release resources
admin.close();
conn.close();

Common API usage

1. Create a namespace

// 1. Build the namespace descriptor
NamespaceDescriptor namespace = NamespaceDescriptor.create("student_namespace").build();
// 2. Create the namespace
admin.createNamespace(namespace);

2. Table operations

Table schema operations go through Admin.

  • Create a table

    // 1. Build the table name
    TableName student = TableName.valueOf("student_namespace:student");
    // 2. Build the column family descriptors
    HColumnDescriptor info = new HColumnDescriptor("info");
    HColumnDescriptor edu = new HColumnDescriptor("edu");
    // 3. Bind the table name and attach the column families
    HTableDescriptor hTableDescriptor = new HTableDescriptor(student);
    hTableDescriptor.addFamily(info);
    hTableDescriptor.addFamily(edu);
    // 4. Create the table
    admin.createTable(hTableDescriptor);
    
  • Check whether a table exists

    // 1. Build the table name
    TableName tableName = TableName.valueOf("student_namespace:student");
    // 2. Check whether the table exists
    boolean b = admin.tableExists(tableName);
    System.out.println(b);
    

3. Insert

Data operations go through a Table obtained from the Connection.

// 1. Get the table to operate on
Table table = conn.getTable(TableName.valueOf("student_namespace:student"));
// 2. Build the data to add
Put put = new Put(Bytes.toBytes("1001")); // the rowkey
// Bytes is the HBase utility for converting between bytes and Java types
put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("张三"));
put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("age"), Bytes.toBytes(18)); // stored as a 4-byte int
// 3. Write the put to the table
table.put(put);
// 4. Release resources
table.close();

4. Update

// 1. Get the table to operate on
Table table = conn.getTable(TableName.valueOf("student_namespace:student"));
// 2. An update is just another put: the newer timestamp shadows the old value
Put put = new Put(Bytes.toBytes("1001")); // the rowkey
put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("zhangsan"));
// 3. Write the put to the table
table.put(put);
// 4. Close the table
table.close();

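5. Delete

// 1. Get the table to operate on
Table table = conn.getTable(TableName.valueOf("student_namespace:student"));
// 2. Build the delete condition, keyed by rowkey
Delete delete = new Delete(Bytes.toBytes("1001"));
// 3. Execute the delete
table.delete(delete);
// 4. Release resources
table.close();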

6. Query

  • Single-row query by rowkey

    // 1. Get the table to operate on
    Table table = conn.getTable(TableName.valueOf("student_namespace:student"));
    // 2. Use the rowkey as the query condition
    Get get = new Get(Bytes.toBytes("1001"));
    // 3. Execute the query
    Result result = table.get(get);
    // 4. Read the result with result.getValue()
    byte[] nameBytes = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
    byte[] ageBytes = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("age"));
    // Get the rowkey
    byte[] rowBytes = result.getRow();
    System.out.println(Bytes.toString(nameBytes));
    System.out.println(Bytes.toInt(ageBytes));
    System.out.println(Bytes.toString(rowBytes));
    // 5. Release resources
    table.close();
    
  • Multi-row query

    // 1. Get the table to operate on
    Table table = conn.getTable(TableName.valueOf("student_namespace:student"));
    // 2. Create a Scan for a multi-row query
    Scan scan = new Scan();
    // 3. Select the column family to return
    scan.addFamily(Bytes.toBytes("info"));
    // 4. Set the start row and the maximum number of rows
    scan.withStartRow(Bytes.toBytes("1001"));
    scan.setLimit(10);
    // 5. Execute the query
    ResultScanner result = table.getScanner(scan);
    // 6. Process the results; each res is one row
    for (Result res : result) {
    	byte[] nameBytes = res.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
    	byte[] ageBytes = res.getValue(Bytes.toBytes("info"), Bytes.toBytes("age"));
    	byte[] rowBytes = res.getRow();
    	String name = Bytes.toString(nameBytes);
    	int age = Bytes.toInt(ageBytes);
    	String rowKey = Bytes.toString(rowBytes);
    	System.out.println(rowKey + ":" + name + ":" + age);
    }
    // 7. Release resources
    result.close();
    table.close();
    

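How reads and writes work

Reading data

[Figure: HBase read path; image not available]

Writing data

[Figure: HBase write path; two images not available]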


Origin blog.csdn.net/weixin_44191814/article/details/121390258