The relationship between Hadoop and HBase, plus HBase installation and verification

As you can see from the homepage of the official Hadoop website, the Hadoop project currently ships with five modules:

Hadoop Common
HDFS
YARN
MapReduce
Hadoop Ozone

As the name suggests, the first item is the module of basic shared functionality; HDFS is the distributed file storage system; YARN handles scheduling and cluster resource management; and MapReduce is the data computation and processing framework. These are the ones you inevitably touch when you start learning Hadoop.
The last one, Hadoop Ozone, is a distributed object store that complements HDFS. It is relatively new and comes up far less often in conversation, so many people, myself included, may not have heard of it at first.

It is also worth noting that the things people usually hear about alongside big data, such as HBase, Hive, and Spark, do not actually belong to the Hadoop project itself; they are related projects that build on top of Hadoop.
As for Hadoop's own modules, as long as the version of Hadoop you install supports them, they are available the moment Hadoop is installed. Related projects, by contrast, must be installed and deployed independently and then wired to Hadoop through configuration.

Hadoop can do many things, but the first thing it brings to mind is big data, and big data in turn brings to mind data processing, computation, and data storage.
Setting Hadoop Ozone aside for now, the data storage layer of every big data system I currently know of relies on HDFS and HBase. My personal understanding is that HBase can, in a sense, be regarded as a complement to HDFS: as covered earlier, HDFS does not support modifying file contents in place, and that design has disadvantages as well as advantages.
We know that databases ultimately store their data as files, whether MySQL, MongoDB, or HBase. The underlying file system of HBase in my setup is HDFS, which I have verified works without problems; the Internet says other file systems can also be used, but that needs further verification.
Based on the ideas above, after an initial pass at HDFS, the natural next step is basic literacy in HBase: installation, configuration, and basic operations.

HBase download

The HBase installation package download page can be found from the official website at http://hbase.apache.org/downloads.html. The page lists many versions. Checking the release notes, I saw that version 2.2.5 already supports Hadoop 3.2.x, and since my Hadoop is 3.1.3, I chose this version.
There are many ways to obtain the installation package; they were covered in the Redis and Hadoop installation articles, so if you are interested you can refer to:
hadoop installation environment preparation and related knowledge analysis
Redis installation in Linux and software installation related Linux knowledge points
Here I will use the fastest way directly:

wget https://downloads.apache.org/hbase/2.2.5/hbase-2.2.5-bin.tar.gz

Unzip

tar -zxvf hbase-2.2.5-bin.tar.gz

hbase-env.sh configuration

Configure it right after extraction. The first file to configure is hbase-env.sh, located in the conf directory of the installation directory. For example, since my HBase installation directory is /root/soft/bigdata/hbase/hbase-2.2.5, the file path is /root/soft/bigdata/hbase/hbase-2.2.5/conf/hbase-env.sh. The file needs the following settings:

export JAVA_HOME=/root/soft/jdk1.8.0_261                          # path to the JDK installation
export HBASE_CLASSPATH=/root/soft/bigdata/hbase/hbase-2.2.5/conf  # HBase configuration directory
export HBASE_MANAGES_ZK=true                                      # let HBase start/stop its own bundled ZooKeeper

Note that in actual operation you should replace these with your own JDK and HBase installation paths.
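As a quick sanity check before editing the file, you can confirm that both locations exist (the paths here are just my example layout):

ls /root/soft/jdk1.8.0_261/bin/java
ls /root/soft/bigdata/hbase/hbase-2.2.5/conf/hbase-env.sh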

hbase-site.xml configuration

The next file to configure is hbase-site.xml. Here you need to specify the HDFS directory where HBase stores its data, and since my Hadoop runs in distributed mode, the configuration also needs to turn on distributed mode. These properties go between <configuration> and </configuration>:

<property>
      <name>hbase.rootdir</name>
      <value>hdfs://192.168.139.9:9000/hbase</value>
</property>
<property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
</property>
<!--
<property>
      <name>hbase.tmp.dir</name>
      <value>./tmp</value>
</property>
-->
<property>
      <name>hbase.unsafe.stream.capability.enforce</name>
      <value>false</value>
</property>

Of the configurations above, the first points to the HDFS file system I built, where /hbase is a directory that does not exist yet and will be created automatically during subsequent use.
The second matches my Hadoop distributed-mode setup; HBase defaults this to false, that is, standalone mode.
The third appears to be a temporary directory. It comes with the HBase configuration file; I am not sure of its usefulness for the time being, so I have commented it out for now.
The fourth also comes with the HBase configuration file. According to explanations online, it avoids startup errors (it disables the check that the underlying filesystem supports stream capabilities such as hsync and hflush), so I will leave it unchanged for now.
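One thing worth checking is that the address in hbase.rootdir matches the fs.defaultFS of your Hadoop setup; hdfs getconf can print it (this assumes the Hadoop client commands are on your PATH):

hdfs getconf -confKey fs.defaultFS   # should match the host:port used in hbase.rootdir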

HBase environment variable configuration

This is a habitual step that makes things more convenient: it lets you execute HBase commands from any directory. You can skip it if you prefer:

export HBASE_HOME=/root/soft/bigdata/hbase/hbase-2.2.5
export PATH=$PATH:$HBASE_HOME/bin
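The two lines above typically go into a shell startup file; a minimal sketch, assuming /etc/profile (use ~/.bashrc if you prefer), then reload and check:

source /etc/profile   # reload after appending the two export lines above
which hbase           # should print $HBASE_HOME/bin/hbase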

ServerNotRunningYetException

According to the online tutorials, the configuration above should be enough to start HBase, provided Hadoop has already been started.
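For reference, the start command plus a quick process check (with HBASE_MANAGES_ZK=true, jps should show HMaster, HRegionServer, and HQuorumPeer once everything is up):

start-hbase.sh
jps   # expect HMaster, HRegionServer, HQuorumPeer alongside the Hadoop processes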
After I started Hadoop, I executed the HBase startup command start-hbase.sh and found that it produced an error:

ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet 

Searching the Internet suggested this happens when HDFS is in safe mode, but after using hdfs dfsadmin -safemode get to check the safe mode status, I found it was already off on my machine, so this error clearly has more than one cause. (Note: safe mode can be turned on manually with hdfs dfsadmin -safemode enter and turned off manually with hdfs dfsadmin -safemode leave.)
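For reference, the three safe mode commands mentioned above:

hdfs dfsadmin -safemode get    # check current safe mode status
hdfs dfsadmin -safemode enter  # turn safe mode on manually
hdfs dfsadmin -safemode leave  # turn safe mode off manually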
So I checked the HBase startup log and finally found that I had been careless: when configuring the HDFS address, 192.168.139.9 had been mistyped as 12.168.139.9. After correcting the IP, HBase started successfully.
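For anyone tracking down a similar problem: the master log usually lives under the logs directory of the installation, and the file name includes your user and host name, so the pattern below is only an assumption:

tail -n 100 $HBASE_HOME/logs/hbase-root-master-*.log   # user/host parts of the name will differ on your machine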

multiple SLF4J bindings

Although HBase started successfully, a warning appeared during startup:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/soft/bigdata/hadoop/hadoop-3.1.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]

The message is quite clear: the SLF4J log bindings shipped with HBase and Hadoop conflict. I renamed the conflicting log jar in HBase, and the warning went away.
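A minimal sketch of the rename, assuming the jar sits under lib/client-facing-thirdparty as in my 2.2.5 layout (the exact path and jar version may differ in yours):

cd /root/soft/bigdata/hbase/hbase-2.2.5/lib/client-facing-thirdparty
mv slf4j-log4j12-1.7.25.jar slf4j-log4j12-1.7.25.jar.bak   # rename so it is no longer picked up on the classpath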

Verification

Shell connection

After HBase starts successfully, like MySQL, MongoDB, and Redis, it has its own shell client tool for connecting to and operating on the database:

hbase shell

After executing the command above, you enter the HBase command-line interface. Perhaps it is just my machine, but entering takes quite a long time.
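Once inside the shell, a few quick commands confirm the cluster is really up:

status    # number of active/dead servers
version   # should print 2.2.5 for this installation
list      # list existing tables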

Create table

A simple table creation in the HBase shell looks like this:

create 'user','name','age','addr','phone','email'

The command above creates a table named user with five column families: name, age, addr, phone, and email.

Master is initializing

There was a small hiccup when creating the table above; the first attempt threw the following exception:

ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
        at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2811)
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:2018)
        at 

Taken literally, the master was still initializing, so I waited a while, ran the command again, and the table was created successfully.
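Once the table exists, a quick write/read check also works; a sketch where the row key and values are made up for illustration:

put 'user', 'row1', 'name:first', 'Tom'   # write one cell: row 'row1', family 'name', qualifier 'first'
get 'user', 'row1'                        # read the row back
scan 'user'                               # scan the whole table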

View table structure

After creating an HBase table, you can use the describe command to view its structure, for example:

describe 'user'

The output of the command above on my machine was as follows:

Table user is ENABLED
user
COLUMN FAMILIES DESCRIPTION
{NAME => 'addr', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'age', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'email', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'name', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'phone', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
5 row(s)
QUOTAS
0 row(s)
Took 4.6373 seconds

So far, this proves that the HBase installation and configuration are indeed working.
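If you want to remove the test table afterwards, note that a table must be disabled before it can be dropped:

disable 'user'   # take the table offline first
drop 'user'      # then delete it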

Stop HBase

Stopping HBase should be very simple: you start it with start-hbase.sh, so normally you stop it with stop-hbase.sh, and that is indeed the case.
But I ran into a small problem: the first time I stopped it, it hung in the stopping state.
The Internet said to first execute hbase-daemons.sh stop regionserver. I tried it and it really worked; the subsequent stop-hbase.sh finished immediately. However, this only happened once, and stop-hbase.sh stopped quickly every time afterwards.
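For reference, the workaround sequence when stop-hbase.sh hangs:

hbase-daemons.sh stop regionserver   # stop the region servers first
stop-hbase.sh                        # then the master stops promptly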

Parts of the above drew on the following article:
http://dblab.xmu.edu.cn/blog/2442-2/
