Common namespace operations in HBase

Refer to
http://blog.csdn.net/opensure/article/details/46470969

1. Introduction

In HBase, a namespace is a logical grouping of tables, similar to a database in an RDBMS, which makes it convenient to divide tables by business area. Apache HBase has supported namespace-level authorization since versions 0.98.0 and 0.95.2. HBase global administrators can create, modify, and revoke namespace authorization.

2. Namespaces

HBase defines two namespaces by default:

    hbase: built-in system tables, including the namespace and meta tables.
    default: tables created without an explicit namespace are placed here.

Create a namespace

hbase> create_namespace 'ai_ns'

List all namespaces

hbase> list_namespace

Create a table in a namespace

hbase> create 'ai_ns:testtable', 'fm1'
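
As a rough illustration of the naming rule above, the following Python sketch splits a table name of the form namespace:table and falls back to the default namespace when no namespace is given. The function name split_table_name is hypothetical and is not part of any HBase client API; it only models the convention described here.

```python
from typing import Tuple

DEFAULT_NAMESPACE = "default"


def split_table_name(name: str) -> Tuple[str, str]:
    """Split 'namespace:table' into (namespace, table).

    Hypothetical helper that only models the naming convention;
    it does not talk to HBase.
    """
    namespace, sep, table = name.partition(":")
    if not sep:
        # No namespace prefix: the table belongs to the 'default' namespace.
        return DEFAULT_NAMESPACE, name
    return namespace, table


print(split_table_name("ai_ns:testtable"))  # ('ai_ns', 'testtable')
print(split_table_name("PageViews"))        # ('default', 'PageViews')
```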


http://blog.jobbole.com/83614/

HBase is not a relational database; it needs a different way of defining your data model. HBase actually defines a four-dimensional data model, and each dimension is defined as follows:

Row key: each row has a unique row key. The row key has no data type; internally it is treated as a byte array.
Column family: within a row, data is organized into column families. Every row has the same set of column families, but the same column family need not contain the same column qualifiers from one row to the next. Internally, HBase stores each column family in its own data files, so column families must be defined in advance and are not easy to change.
Column qualifier: a column family contains the actual columns, called column qualifiers; you can think of column qualifiers as the columns themselves.
Version: each column can have a configurable number of versions, and you can retrieve data by specifying the version of a column qualifier.
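
One way to picture the four dimensions is as nested maps: row key, then column family, then column qualifier (the column name within a family), then version. The Python sketch below is only an in-memory model under that assumption, not how HBase stores data on disk; the helper names put and get_cell, the value /mypage-v2, and the timestamps 1 and 2 are made up for illustration.

```python
# table[row_key][family][qualifier][timestamp] = value
# An in-memory model of the four dimensions, not the HBase storage format.


def put(table, row_key, family, qualifier, value, timestamp):
    """Store one version of one cell."""
    versions = (
        table.setdefault(row_key, {})
        .setdefault(family, {})
        .setdefault(qualifier, {})
    )
    versions[timestamp] = value


def get_cell(table, row_key, family, qualifier, timestamp=None):
    """Return the requested version, or the newest one if none is given."""
    versions = table.get(row_key, {}).get(family, {}).get(qualifier, {})
    if not versions:
        return None
    if timestamp is None:
        timestamp = max(versions)  # newest version wins
    return versions.get(timestamp)


table = {}
put(table, b"rowkey1", "info", "page", b"/mypage", timestamp=1)
put(table, b"rowkey1", "info", "page", b"/mypage-v2", timestamp=2)

print(get_cell(table, b"rowkey1", "info", "page"))     # b'/mypage-v2'
print(get_cell(table, b"rowkey1", "info", "page", 1))  # b'/mypage'
```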
A specified row is fetched by its row key. The row consists of one or more column families, each column family has one or more column qualifiers (the columns themselves, referred to as columns in Figure 1), and each column can have one or more versions. To get a specific piece of data, you need to know its row key, column family, column qualifier, and version. When designing an HBase data model, it helps to consider how the data will be retrieved. You can obtain HBase data in two ways:

By row key, or by a table scan over a range of row keys.
In batch, using MapReduce.
These two ways of fetching data make HBase very powerful. Storing data in Hadoop typically means it is well suited to offline or batch analysis, but not necessarily to real-time access. HBase supports real-time access through key/value storage and batch analysis through MapReduce. Let's look first at real-time access: data is stored as key/value pairs, where the key is the row key and the value is the collection of column families, as shown in Figure 2.
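
The two real-time access paths, a lookup by row key and a scan over a range of row keys, can be sketched with a small sorted key/value store. This is a simplified in-memory model in Python, not HBase code: the class SortedRows and the row keys row-a, row-b, and row-c are hypothetical. It relies on the fact that HBase keeps rows sorted by row key and that a scan includes the start row and excludes the stop row.

```python
import bisect


class SortedRows:
    """Toy key/value store that keeps row keys in sorted byte order."""

    def __init__(self):
        self._keys = []  # sorted list of row keys
        self._rows = {}  # row key -> row data

    def put(self, row_key, row):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
        self._rows[row_key] = row

    def get(self, row_key):
        """Point lookup by row key."""
        return self._rows.get(row_key)

    def scan(self, start_row, stop_row):
        """Yield (row_key, row) for start_row <= row_key < stop_row."""
        lo = bisect.bisect_left(self._keys, start_row)
        hi = bisect.bisect_left(self._keys, stop_row)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


store = SortedRows()
store.put(b"row-b", {"info:page": b"/b"})
store.put(b"row-a", {"info:page": b"/a"})
store.put(b"row-c", {"info:page": b"/c"})

print(store.get(b"row-b"))                               # {'info:page': b'/b'}
print([k for k, _ in store.scan(b"row-a", b"row-c")])    # [b'row-a', b'row-b']
```

Because the keys are kept sorted, a range scan only needs to locate the two boundaries and then read the rows in between, which is why row key design matters so much for scan performance.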

Create a table called PageViews with a column family called info:
hbase(main):002:0> create 'PageViews', 'info'

0 row(s) in 5.3160 seconds

=> Hbase::Table - PageViews
Every table needs at least one column family, so we created info. Now look at the tables by running the list command:

hbase(main):002:0> list

TABLE

PageViews

1 row(s) in 0.0350 seconds

=> ["PageViews"]

As you can see, the list command returns a table named PageViews. We can get more information about the table with the describe command:

hbase(main):003:0> describe 'PageViews'

DESCRIPTION                                                                      ENABLED

'PageViews', {NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',      true

REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE',

MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false',

BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

1 row(s) in 0.0480 seconds

We created only one column family, info. Now add some data to the table; the following command adds a new row with a column under info:
hbase(main):004:0> put 'PageViews', 'rowkey1', 'info:page', '/mypage'

0 row(s) in 0.0850 seconds

The put command inserts a new record with the row key rowkey1, writing the value /mypage to the page column under info. We can then retrieve this record by its row key, rowkey1, with the get command:

hbase(main):005:0> get 'PageViews', 'rowkey1'

COLUMN CELL

info:page timestamp=1410374788088, value=/mypage

1 row(s) in 0.0250 seconds
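
The timestamp in the get output above can be read as a date. The Python sketch below assumes the cell timestamp was assigned by HBase's default behavior, which uses milliseconds since the Unix epoch; a client that sets its own timestamps could use any long value, in which case this conversion would not apply. The function name cell_timestamp_to_datetime is hypothetical.

```python
from datetime import datetime, timezone


def cell_timestamp_to_datetime(ts_millis: int) -> datetime:
    """Convert an epoch-milliseconds cell timestamp to a UTC datetime.

    Assumes the timestamp is milliseconds since the Unix epoch.
    """
    seconds, millis = divmod(ts_millis, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=millis * 1000)


print(cell_timestamp_to_datetime(1410374788088).isoformat())
# 2014-09-10T18:46:28.088000+00:00
```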

You can see the column info:page with the value /mypage; the timestamp indicates when the record was inserted. Let's add one more row before doing the table scan:

hbase(main):006:0> put 'PageViews', 'rowkey2',

0 row(s) in 0.0050 seconds


