HBase standalone configuration (following the official website):
Download HBase from http://www.apache.org/dyn/closer.cgi/hbase/
Unpack the archive, then change into the extracted directory.
$ tar xfz hbase-××××.tar.gz
$ cd hbase-×××××
HBase is now ready to start. Before starting it, you may want to edit conf/hbase-site.xml and set hbase.rootdir, which determines the directory HBase writes its data to.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///DIRECTORY/hbase</value>
</property>
</configuration>
Replace DIRECTORY with the directory where you want HBase to write its files. By default, hbase.rootdir points to /tmp/hbase-${user.name}, which means you will lose your data after a reboot (the operating system cleans the /tmp directory on restart).
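As a small illustration of the configuration above, the following Python sketch generates the same hbase-site.xml content for a chosen data directory. The function name build_hbase_site and the path /home/alice/hbase-data are hypothetical and used only for the example; the output is not indented, which does not affect its meaning as XML.

```python
import xml.etree.ElementTree as ET

# The two header lines are copied from the configuration file shown above.
HEADER = (
    '<?xml version="1.0"?>\n'
    '<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>\n'
)

def build_hbase_site(root_dir):
    """Return hbase-site.xml text that points hbase.rootdir at root_dir.

    root_dir is assumed to be an absolute local path such as /home/alice/hbase-data.
    """
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "hbase.rootdir"
    # "file://" plus an absolute path yields the file:///... form used above.
    ET.SubElement(prop, "value").text = "file://" + root_dir
    return HEADER + ET.tostring(configuration, encoding="unicode") + "\n"

# Hypothetical directory, chosen only for this example.
print(build_hbase_site("/home/alice/hbase-data"))
```

The printed text could be saved as conf/hbase-site.xml; any directory outside /tmp avoids the data loss described above.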
1. Create a table
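Like other databases, HBase has the concept of a table. The commands below are typed into the HBase shell, which you open with bin/hbase shell after starting HBase with bin/start-hbase.sh.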
hbase(main):001:0> create 'test','cf'
0 row(s) in 1.5210 seconds
=> Hbase::Table - test
The table is named test and has one column family named cf.
(Note that all names in the shell must be enclosed in quotation marks. Unlike in traditional databases, you do not define an HBase table's columns (fields) up front, because columns can be added and removed dynamically. You do, however, have to define its column families. Each table has one or more column families, and each column belongs to exactly one column family. Column families group related columns together in storage, which reduces access to unrelated columns and improves performance.)
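The rule that column families are fixed while columns are free-form can be sketched with a toy Python class. ToyTable is a hypothetical stand-in written for this example, not part of any HBase client library, and it ignores versions and timestamps.

```python
class ToyTable:
    """Toy stand-in for an HBase table: fixed column families, free-form columns."""

    def __init__(self, name, families):
        self.name = name
        self.families = set(families)  # declared once, when the table is created
        self.rows = {}                 # row key -> {"family:column": value}

    def put(self, row, column, value):
        family, _, _qualifier = column.partition(":")
        if family not in self.families:
            # A column can only live in a family that was declared up front.
            raise KeyError("unknown column family: " + family)
        # The column itself needs no prior declaration; it appears on first write.
        self.rows.setdefault(row, {})[column] = value

table = ToyTable("test", ["cf"])
table.put("row1", "cf:a", "value1")    # column a did not exist before this call
table.put("row1", "cf:extra", "x")     # neither did column extra
try:
    table.put("row1", "other:a", "y")  # family 'other' was never declared
except KeyError as err:
    print("rejected:", err)
print(table.rows)
```

The sketch accepts any new column inside cf but rejects a write to a family that was not named when the table was created, which mirrors the distinction drawn above.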
Let's inspect the table with the describe command:
hbase(main):003:0> describe 'test'
Table test is ENABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false',
KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER',
COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.1610 seconds
You can see that HBase sets many default properties for the table's column family. Two of them deserve a note:
- VERSIONS: the number of versions of each cell that HBase keeps. The output above shows 1, so only the newest value is retained; HBase releases before 0.96 defaulted to 3. With VERSIONS set to 3, overwriting a value keeps the new value plus the two most recent previous values, unlike a traditional database.
- TTL: time to live, the length of time a piece of data is stored in HBase. If you set the TTL to two days, HBase automatically removes the data two days after it was written. The default shown above, FOREVER, keeps data indefinitely.
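To make the two properties concrete, here is a minimal Python sketch of a single cell that keeps a limited number of versions and hides values older than a TTL. ToyCell and its parameter names are hypothetical, timestamps are plain seconds chosen for the example, and the expiry rule is a simplification rather than HBase's actual implementation.

```python
class ToyCell:
    """One cell that keeps at most max_versions values and honours a TTL."""

    def __init__(self, max_versions=1, ttl_seconds=None):
        self.max_versions = max_versions
        self.ttl_seconds = ttl_seconds  # None plays the role of FOREVER
        self.versions = []              # [(timestamp, value), ...], newest first

    def put(self, value, timestamp):
        self.versions.append((timestamp, value))
        self.versions.sort(key=lambda item: item[0], reverse=True)
        del self.versions[self.max_versions:]  # drop versions beyond the limit

    def read(self, now):
        """Return the values still visible at time now, newest first."""
        return [
            value
            for timestamp, value in self.versions
            if self.ttl_seconds is None or now - timestamp < self.ttl_seconds
        ]

one = ToyCell(max_versions=1)
three = ToyCell(max_versions=3)
for timestamp, value in enumerate(["v1", "v2", "v3", "v4"]):
    one.put(value, timestamp)
    three.put(value, timestamp)
print(one.read(now=10))    # ['v4']
print(three.read(now=10))  # ['v4', 'v3', 'v2']

two_days = 2 * 24 * 60 * 60
expiring = ToyCell(ttl_seconds=two_days)
expiring.put("short-lived", timestamp=0)
print(expiring.read(now=two_days - 1))  # ['short-lived']
print(expiring.read(now=two_days))      # []
```

With max_versions=1 an overwrite discards the old value, while max_versions=3 keeps the new value and the two before it.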
2. Insert data
Insert some data into table test:
hbase(main):004:0> put 'test','row1','cf:a','value1'
0 row(s) in 0.1330 seconds
hbase(main):005:0> put 'test','row2','cf:b','value2'
0 row(s) in 0.0150 seconds
hbase(main):006:0> put 'test','row3','cf:c','value3'
0 row(s) in 0.0130 seconds
hbase(main):007:0> put 'test','row1','cf:c','value4'
0 row(s) in 0.0100 seconds
hbase(main):008:0> put 'test','row1','cf:c','value4'
0 row(s) in 0.0100 seconds
hbase(main):009:0> put 'test','row1','cf:c','value4'
0 row(s) in 0.0100 seconds
The commands above put three rows of data into the table test (the last three commands repeat the same write to cf:c of row1). The put command inserts or updates a single cell in the table. Each row in an HBase table is identified by a row key, so we use strings such as row1 and row2 to identify the corresponding rows. Each column is identified by the combination "column family:column name", so cf:a is the column named a in column family cf. The last argument to the command is the value to store in that column.
3. Read data
Read all the data in the table:
hbase(main):010:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1499600631906, value=value1
row1 column=cf:c, timestamp=1499600731612, value=value4
row2 column=cf:b, timestamp=1499600686798, value=value2
row3 column=cf:c, timestamp=1499600707212, value=value3
3 row(s) in 0.0480 seconds
You can see four cells in total, spread over three rows (the shell's summary line counts rows, not cells).
Each cell has a timestamp, which is the system time recorded by HBase when the cell was written.
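The timestamps in the scan output can be made readable with a short Python sketch. It assumes the default behaviour, in which the timestamp is the wall-clock time in milliseconds since the Unix epoch; the helper name hbase_timestamp_to_utc is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def hbase_timestamp_to_utc(millis):
    """Convert a millisecond epoch timestamp to a timezone-aware UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer milliseconds avoid floating-point rounding of the fraction.
    return epoch + timedelta(milliseconds=millis)

# Timestamp of row1, cf:a in the scan output above.
print(hbase_timestamp_to_utc(1499600631906).isoformat())
# 2017-07-09T11:43:51.906000+00:00
```

If a client supplies its own timestamp when writing, the value need not be a wall-clock time, and this conversion would not be meaningful.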
The same data drawn as a table:
row key | column family: cf
        | a      | b      | c
row1    | value1 |        | value4
row2    |        | value2 |
row3    |        |        | value3
An empty cell in this table does not mean that a cell exists there. In a traditional database, a blank cell means that the cell exists but its value is empty (traditional databases are always structured). In HBase, a cell that was never written is simply not stored. The two-dimensional table is drawn here only for ease of understanding; the underlying data is not laid out as a fixed grid.
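The difference between an empty cell and a missing cell can be sketched in Python by storing only the four cells that the scan returned and filling the gaps at display time. The function names to_rows and render are hypothetical, and nested dictionaries are only an illustration of sparseness, not HBase's storage format.

```python
# The four cells returned by the scan above: (row key, column, value).
cells = [
    ("row1", "cf:a", "value1"),
    ("row1", "cf:c", "value4"),
    ("row2", "cf:b", "value2"),
    ("row3", "cf:c", "value3"),
]

def to_rows(cells):
    """Group cells by row key; columns that were never written get no entry."""
    rows = {}
    for row_key, column, value in cells:
        rows.setdefault(row_key, {})[column] = value
    return rows

def render(rows):
    """Draw the sparse data as a grid for display."""
    columns = sorted({column for row in rows.values() for column in row})
    lines = ["row key | " + " | ".join(columns)]
    for row_key in sorted(rows):
        # .get() fills the gap for display only; nothing is stored for it.
        values = [rows[row_key].get(column, "") for column in columns]
        lines.append(row_key + " | " + " | ".join(values))
    return "\n".join(lines)

rows = to_rows(cells)
print(render(rows))
print("cf:b" in rows["row1"])  # False: row1 has no such cell at all
```

The grid shows nine positions, but only four entries exist in the underlying dictionaries.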
To read the data of a specific row, use get:
hbase(main):019:0> get 'test','row2'
COLUMN CELL
cf:b timestamp=1499600686798, value=value2
1 row(s) in 0.0400 seconds
4. Update data
Update a cell of test (set the cf:a column of row1 to value5):
hbase(main):020:0> put 'test' , 'row1' ,'cf:a' , 'value5'
0 row(s) in 0.0210 seconds
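Result (cf:c of row1 also reads value5 here; judging by its timestamp, it was presumably changed by an earlier put that is not shown in this transcript):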
hbase(main):021:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1499602903513, value=value5
row1 column=cf:c, timestamp=1499601334793, value=value5
row2 column=cf:b, timestamp=1499600686798, value=value2
row3 column=cf:c, timestamp=1499600707212, value=value3
3 row(s) in 0.0350 seconds
5. Delete data
Delete the whole table with the following commands:
hbase(main):022:0> disable 'test'
0 row(s) in 2.2800 seconds
hbase(main):023:0> drop 'test'
0 row(s) in 1.2600 seconds
To delete a table in HBase, you must disable it first; only then can it be dropped.
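The disable-then-drop ordering can be pictured as a tiny state machine. The Python sketch below uses a hypothetical ToyAdmin class that only tracks whether each table is enabled; it does not talk to HBase, and the error it raises is invented for the example.

```python
class ToyAdmin:
    """Tracks table state to illustrate the disable-before-drop rule."""

    def __init__(self):
        self.enabled = {}  # table name -> True (enabled) or False (disabled)

    def create(self, name):
        self.enabled[name] = True  # a new table starts out enabled

    def disable(self, name):
        self.enabled[name] = False

    def drop(self, name):
        if self.enabled[name]:
            raise RuntimeError("table " + name + " is enabled; disable it first")
        del self.enabled[name]

admin = ToyAdmin()
admin.create("test")
try:
    admin.drop("test")  # fails: the table is still enabled
except RuntimeError as err:
    print(err)
admin.disable("test")
admin.drop("test")      # succeeds now
print(admin.enabled)    # {}
```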
Run the following command to exit the shell:
hbase(main):024:0> exit