Getting Started with HBase and Pseudo-Distributed Cluster Setup

Overview

  • HBase is an open-source, top-level Apache project: a distributed, scalable big data store
  • Apache HBase is built on top of HDFS and is a non-relational, column-oriented database; it offers reliability, stability, automatic fault tolerance, and multi-versioned data
  • HBase is an open-source implementation of Google's Bigtable, which Google used, among other things, to store crawled web pages
  • When querying tables on the order of hundreds of millions of rows, HBase can respond within seconds; online real-time lookups can reach millisecond latency

Features

  • Large: a table can have billions of rows and millions of columns, which suits large-scale structured data storage.
  • Column-oriented: storage and access control are organized by column family, and each column family is retrieved independently.
  • Sparse: empty (NULL) columns take up no storage space, so tables can be designed to be very sparse.
  • Schema-less: each row has a sortable primary key (the row key) and any number of columns; columns can be added dynamically as needed, and different rows in the same table can have very different columns.
  • Multi-versioned data: each cell can hold multiple versions of its data; by default the version number is assigned automatically and is the timestamp at which the cell was inserted.
  • Single data type: at the storage layer HBase keeps all data as byte[]; there are no other types.
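
The sparse, multi-versioned, bytes-only cell model described above can be illustrated with a small in-memory sketch. This is a toy model, not the HBase client API: the class name `ToyTable`, its methods, and the `max_versions=3` default are all assumptions made for illustration.

```python
import time
from collections import defaultdict


class ToyTable:
    """Toy in-memory model of an HBase-style table (NOT the real HBase API)."""

    def __init__(self, max_versions=3):
        # max_versions is an illustrative choice, not a claim about HBase defaults
        self.max_versions = max_versions
        # row key -> "family:qualifier" -> list of (timestamp, value), newest first
        self.rows = defaultdict(dict)

    def put(self, row, column, value, ts=None):
        # Single data type: only raw bytes are accepted as cell values
        if not isinstance(value, bytes):
            raise TypeError("cell values are stored as raw bytes")
        # Default version number is the insertion time in milliseconds
        ts = int(time.time() * 1000) if ts is None else ts
        versions = self.rows[row].setdefault(column, [])
        versions.append((ts, value))
        versions.sort(key=lambda v: v[0], reverse=True)
        del versions[self.max_versions:]  # keep only the newest versions

    def get(self, row, column, versions=1):
        # Absent cells are simply not stored, so a miss returns an empty list
        return self.rows.get(row, {}).get(column, [])[:versions]


table = ToyTable()
table.put(b"row1", "cf1:name", b"zhangsan", ts=1)
table.put(b"row1", "cf1:name", b"lisi", ts=2)      # newer version of the same cell
table.put(b"row2", "cf1:age", b"18", ts=1)         # row2 has no cf1:name at all
print(table.get(b"row1", "cf1:name"))              # newest version only
print(table.get(b"row1", "cf1:name", versions=2))  # two newest versions
print(table.get(b"row2", "cf1:name"))              # missing cell
```

Note that `row2` never allocates anything for `cf1:name`; that is the sense in which a sparse table costs nothing for empty columns.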

NoSQL Features

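1. Some NoSQL stores are in-memory (e.g. Redis)
2. Schema-less: a weak schema or no fixed schema
3. No table joins
4. Weak or no transactions (Redis has transactions; MongoDB had no multi-document transactions before 4.0 and supports them from 4.0 on)
5. Clusters are easy to set up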

NoSQL classification

1. Key-value: Redis
2. Document: MongoDB
3. Column: HBase, Cassandra
4. Graph: Neo4j (finance, knowledge graphs)

HBase data model

[Figure: HBase data model]

Row-oriented vs. column-oriented storage

Most relational databases are row-oriented storage systems: data is laid out on the physical medium (disk) by appending one row after another. They are good at OLTP (online transaction processing), but when a query asks only for specific fields:

  • Whole rows are read, which may waste I/O on fields the query does not need
  • Row-oriented systems usually run on a single server, so their scale is strictly limited by the hardware
  • Query latency is higher

HBase is a column-oriented storage system: data is stored by column and distributed across a cluster. It is well suited to OLAP (online analytical processing) and to storing massive amounts of structured data.
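
To make the I/O argument concrete, here is a minimal sketch that lays the same records out row by row and column by column, then counts how many cells must be scanned to read a single field. It is a simplified model: the sample records, the function names, and the assumption that a row store always scans entire rows are illustrative, not a description of HBase internals.

```python
FIELDS = ["id", "name", "age"]
RECORDS = [
    {"id": 1, "name": "a", "age": 20},
    {"id": 2, "name": "b", "age": 30},
    {"id": 3, "name": "c", "age": 40},
]


def to_row_layout(records, fields):
    """Row-oriented: all values of one record sit next to each other."""
    return [rec[f] for rec in records for f in fields]


def to_column_layout(records, fields):
    """Column-oriented: all values of one field sit next to each other."""
    return {f: [rec[f] for rec in records] for f in fields}


def read_field_from_rows(flat, fields, field):
    """Pick one field out of a row layout; returns (values, cells_scanned)."""
    width = len(fields)
    idx = fields.index(field)
    values = [flat[i + idx] for i in range(0, len(flat), width)]
    # Modeling assumption: every cell of every row is scanned
    return values, len(flat)


def read_field_from_columns(columns, field):
    """Pick one field out of a column layout; returns (values, cells_scanned)."""
    values = list(columns[field])
    return values, len(values)


row_store = to_row_layout(RECORDS, FIELDS)
col_store = to_column_layout(RECORDS, FIELDS)
print(read_field_from_rows(row_store, FIELDS, "age"))
print(read_field_from_columns(col_store, "age"))
```

Both reads return the same values, but in this model the row layout scans every cell in the table while the column layout scans only the cells of the requested field.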

Setting up an HBase pseudo-distributed cluster

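  1. Prepare the Linux server: IP address, hostname, host mapping, firewall, SELinux, passwordless SSH, JDK
  2. Install Hadoop
     Extract the archive
     Edit the 6 configuration files
     Format the HDFS NameNode
     Start the processes
  3. Install ZooKeeper
     Extract the archive
     Configure conf/zoo.cfg
     Create the data directory (in a real cluster, add a myid file to it)
     Start the service
  4. Install HBase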

  • Upload the installation package
  • Extract it
 tar -zxf /root/hbase-1.2.4-bin.tar.gz -C /usr
  • Create the /hbase folder on HDFS and the data/tmp folder under the HBase home directory
  • Configure hbase-site.xml (the properties below go inside the <configuration> element)
[root@HadoopNode00 hbase-1.2.4]# vi conf/hbase-site.xml
<property>
    <name>hbase.rootdir</name>
    <value>hdfs://HadoopNode00:9000/hbase</value>
</property>
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>
<property>
    <name>hbase.zookeeper.quorum</name>
    <value>HadoopNode00</value>
</property>
<property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
</property>
  • Configure regionservers
[root@HadoopNode00 hbase-1.2.4]# vi conf/regionservers
HadoopNode00
  • Configure environment variables
[root@HadoopNode00 hbase-1.2.4]# vi ~/.bashrc
export HBASE_HOME=/usr/hbase-1.2.4
export JAVA_HOME=/home/java/jdk1.8.0_181
export HADOOP_HOME=/home/hadoop/hadoop-2.6.0
export PROTBUF_HOME=/home/protobuf/protobuf-2.5.0
export FINDBUGS_HOME=/home/findbugs/findbugs-3.0.1
export MAVEN_HOME=/home/maven/apache-maven-3.3.9
export M2_HOME=/home/maven/apache-maven-3.3.9
export PATH=$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin:$M2_HOME/bin:$FINDBUGS_HOME/bin:$PROTBUF_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$HBASE_HOME/bin
[root@HadoopNode00 hbase-1.2.4]# source ~/.bashrc
  • Start the service (HDFS and ZooKeeper should already be running)
[root@HadoopNode00 hbase-1.2.4]# start-hbase.sh
[root@HadoopNode00 hbase-1.2.4]# jps
2480 SecondaryNameNode
2113 NameNode
8082 HRegionServer  # worker (slave) service
8438 Jps
7927 HMaster        # master service
2231 DataNode
1851 QuorumPeerMain
  • Access the web UI: in HBase 1.x the Master UI listens on port 16010 by default, e.g. http://HadoopNode00:16010
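
If the HMaster or HRegionServer process does not show up in jps, a quick sanity check of hbase-site.xml can help. The sketch below parses the file content and checks the properties used in this walkthrough. The helper names `load_properties` and `check`, and the specific rules it enforces, are assumptions for illustration and not an official HBase tool; the sample text mirrors the properties configured above.

```python
import xml.etree.ElementTree as ET

# Sample content mirroring the properties configured in this walkthrough
SAMPLE = """<configuration>
  <property><name>hbase.rootdir</name><value>hdfs://HadoopNode00:9000/hbase</value></property>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
  <property><name>hbase.zookeeper.quorum</name><value>HadoopNode00</value></property>
  <property><name>hbase.zookeeper.property.clientPort</name><value>2181</value></property>
</configuration>"""

REQUIRED = ("hbase.rootdir", "hbase.cluster.distributed", "hbase.zookeeper.quorum")


def load_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style <property> elements."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check(props):
    """Return a list of human-readable problems (empty list means OK)."""
    problems = [f"missing {key}" for key in REQUIRED if key not in props]
    if props.get("hbase.cluster.distributed") != "true":
        problems.append("hbase.cluster.distributed should be true")
    if not props.get("hbase.rootdir", "").startswith("hdfs://"):
        problems.append("hbase.rootdir should point at HDFS")
    return problems


print(check(load_properties(SAMPLE)))
```

To check a real installation, read the text of conf/hbase-site.xml from the HBase home directory and pass it to `load_properties` instead of `SAMPLE`.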

Origin blog.csdn.net/Mr_YXX/article/details/105022615