Big Data: Hbase

  • What is Hbase

Hbase is a distributed, scalable NoSQL database that supports massive data storage; its physical storage structure is key-value (KV).

  • If there is no Hbase

How do you return results from hundreds of millions of rows within seconds in a big data scenario? (Query conditions: a single row, or a range of rows.)

hbase.apache.org

1 Hbase structure and data type

  • Logical structure

  • Physical structure

The whole table is split horizontally by Row Key into Regions, and then split vertically by ColumnFamily into Stores.

  • Name Space: Name space

    • Similar to the database concept in a relational database. Multiple tables can be placed under each namespace, and there are two namespaces by default: hbase and default. The hbase namespace stores Hbase's built-in tables; default is the namespace used for user tables when none is specified. (For example, if the order table is assigned to the namespace test, it can be written as test:order.)
  • Row: Row

    • Each row of data in Hbase consists of a RowKey and multiple columns.
  • Column: Column

    • Each column in Hbase is qualified by a ColumnFamily (column family) and a ColumnQualifier (column qualifier) (for example: personal_info:name, personal_info:city)
  • Cell: cell

    • The cell uniquely determined by {RowKey, ColumnFamily, ColumnQualifier, TimeStamp}. The data in a Cell has no type; everything is stored as raw bytes. (See the shell sketch after this list.)
  • Row Key: Row key

    • Row Key must be unique in the table and must exist.
    • Row keys are sorted in lexicographic (byte) order, where having a character ranks higher than having none. For example, row_key11 sorts between row_key1 and row_key2.
    • All access to the table must go through Row Key. (Single RowKey access, or RowKey range access, or full table scan)
  • ColumnFamily: column family

    • When creating an Hbase table, you only need to specify the CF. When inserting data, the columns (fields) can be dynamically increased as needed.
    • Each CF can have one or more column members (ColumnQualifier).
    • Different column families are stored in different folders in hdfs.
  • TimeStamp: timestamp

    • Used to identify different versions of data. If you do not specify a timestamp, Hbase will automatically add the current system timestamp to this field value when writing data.
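
To make these terms concrete, here is a minimal HBase shell sketch; the namespace test, table order, and column family personal_info are example names taken from the descriptions above, not a fixed schema:

# Create a namespace and a table with one column family
create_namespace 'test'
create 'test:order', 'personal_info'
# RowKey 'user_0001', column personal_info:name; HBase fills in the timestamp automatically
put 'test:order', 'user_0001', 'personal_info:name', 'Tom'
put 'test:order', 'user_0001', 'personal_info:city', 'Beijing'
# A cell is addressed by {RowKey, ColumnFamily, ColumnQualifier, TimeStamp}
get 'test:order', 'user_0001', 'personal_info:name'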

2 Hbase architecture

The following explains the functions of the components in the above diagram from small to large.

  • StoreFile

    • StoreFile is the file format in which HBase actually persists data; it is ultimately written to DataNodes through the HDFS client (that is, onto the Linux disks).
  • Store

    • It can be understood as the data of one column family within a sliced Region. (As shown above, a Region contains multiple Stores.)
    • A Store contains a MemStore (in-memory storage) and StoreFiles (data flushed from memory; more files are compacted together, and oversized data is split).
  • Region

    • Region can be understood as a slice of a table. Region is divided according to the data size threshold and Row key.
    • HBase automatically divides the table horizontally (by row) into multiple regions (regions), and each region will store a continuous piece of data in a table.
    • At the beginning, each table has only one region. As data is continuously inserted, the region keeps growing; when it reaches a threshold, the region is split into two new regions at a Row key boundary, and so on.
    • As the number of rows in the table increases, there will be more and more regions, and the data of one table will be saved in multiple regions.
  • WAL (HLog)

    • Hbase's write-ahead log, which prevents data loss in abnormal situations (for example, if a RegionServer fails before the MemStore is flushed).
  • RegionServer

    • Data operations (DML): get, put, delete
    • Region management: SplitRegion (splitting), CompactRegion (compaction/merging)
  • Master

    • Table level operations (DDL): create, delete, alter
    • Manage RegionServers: monitor the status of each RegionServer and assign Regions to RegionServers. (For example, with machines rs1, rs2, rs3: data is written to a Region on rs1 while rs2 and rs3 are idle. When rs1 receives so much data that the Region reaches its size limit, rs1 splits the Region and notifies the Master, which may then hand one of the new Regions to rs3 to manage.)

3 Command line operation

3.1 Connect to hbase

  • Connect to hbase
hbase shell
  • View help, or the detailed usage of a specific command
help
help 'command'

3.2 Namespace operation

3.2.1 Query the namespace

list_namespace

3.2.2 Query the table under the namespace

list_namespace_tables 'namespace_name'

3.2.3 Create Namespace

create_namespace 'namespace_name'

3.2.4 Delete namespace (requires namespace to be empty)

drop_namespace 'namespace_name'
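
A short end-to-end sketch of the namespace commands above (the namespace name test is only an example):

create_namespace 'test'
list_namespace
list_namespace_tables 'test'
# drop_namespace only succeeds while the namespace contains no tables
drop_namespace 'test'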

3.3 DDL operation

3.3.1 Query all user tables

list

3.3.2 Create Table

create 'namespace:table', 'column_family1', 'column_family2', 'column_family3', 'column_family4'...

As shown in the figure, a series of randomly named folders appears in HDFS; these folder names represent the Region numbers.

3.3.3 View table details

describe 'namespace:table'

It can be seen that VERSIONS is 1, which means that this table can only store one version of data.

3.3.4 Change table information

This is mainly used to modify how many versions a table keeps. It can also be specified when the table is created, but that shell syntax is more cumbersome, so the alter command is generally used.

alter 'namespace:table', {NAME => 'column_family', VERSIONS => 3}

3.3.5 Change the table status (a table must be disabled before it can be deleted)

  • Disable table
disable 'table'
  • Enable table
enable 'table'

3.3.6 Delete table

drop 'table'
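
A quick sketch of the full deletion flow (test:order is an example table name):

disable 'test:order'
drop 'test:order'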

3.4 DML operation

3.4.1 Inserting data

put 'namespace:table', 'RowKey', 'column_family:column', 'value'
put 'namespace:table', 'RowKey', 'column_family:column', 'value', timestamp (for version control)

As shown in the figure, no data file is generated yet, because the data is still in memory. You need to flush 'table' before you can see the data land on disk. (Each flush generates one StoreFile.)

3.4.2 Scanning table

# Full table scan
scan 'namespace:table'
# Range scan (start row inclusive, stop row exclusive)
scan 'namespace:table', {STARTROW => 'RowKey', STOPROW => 'RowKey'}
# Scan N versions of the data
scan 'namespace:table', {RAW => true, VERSIONS => 10}

3.4.3 Flush

flush 'namespace:table'
  • Data version retention mechanism

As noted above, each flush generates one StoreFile, so the data retained is the most recent data, according to the number of versions the table is configured to keep.

For example, if the number of retained versions is 2: insert three values v1, v2, v3, then flush, and only v2 and v3 remain. Insert three more values v4, v5, v6, then flush again, and four versions remain in total: v2, v3, v5, v6 (now spread across two StoreFile files). If a Region compaction or split occurs, the StoreFile files are merged and placed under the corresponding Region; at that point the data is trimmed again according to the number of retained versions, so v2, v3, v5, v6 become v5, v6. (Without a manual flush, or before the automatic flush interval triggers, data is not yet trimmed by version count.) (By default, a compaction is triggered when there are more than 3 StoreFile files.)

  • One column family corresponds to one MemStore
  • Each MemStore generates an independent StoreFile when flushed to HDFS
  • RegionServer-wide MemStore flush threshold (fraction of heap): hbase.regionserver.global.memstore.size

  • Single MemStore flush threshold (size): hbase.hregion.memstore.flush.size
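
A shell walk-through of the retention example above; test:ver_demo, the column family info, and VERSIONS => 2 are example values, and the exact trimming depends on when flushes and compactions actually run:

create 'test:ver_demo', {NAME => 'info', VERSIONS => 2}
put 'test:ver_demo', 'r1', 'info:v', 'v1'
put 'test:ver_demo', 'r1', 'info:v', 'v2'
put 'test:ver_demo', 'r1', 'info:v', 'v3'
flush 'test:ver_demo'
# Only the two newest versions (v2, v3) survive in the flushed StoreFile
scan 'test:ver_demo', {RAW => true, VERSIONS => 10}
put 'test:ver_demo', 'r1', 'info:v', 'v4'
put 'test:ver_demo', 'r1', 'info:v', 'v5'
put 'test:ver_demo', 'r1', 'info:v', 'v6'
flush 'test:ver_demo'
# Two StoreFiles now hold v2, v3, v5, v6; a major compaction trims them back to the two newest
major_compact 'test:ver_demo'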

3.4.4 Query data

get 'namespace:table', 'RowKey'
get 'namespace:table', 'RowKey', 'column_family'
get 'namespace:table', 'RowKey', 'column_family:column'
# Get N versions of the data
get 'namespace:table', 'RowKey', {COLUMN => 'column_family:column', VERSIONS => 10}

3.4.5 Empty the table

truncate 'namespace:table'

3.4.6 Delete data

# delete 'namespace:table', 'RowKey', 'column_family' (this form is problematic from the shell, but works through the API)
delete 'namespace:table', 'RowKey', 'column_family:column'
deleteall 'namespace:table', 'RowKey'

4 Reading and writing process

4.1 Writing process

  1. The client asks ZooKeeper (ZK) for the location of the RegionServer holding the metadata table (hbase:meta), and the location is returned.

  2. The client queries the metadata table and gets back the RegionServer that holds the target table's Region.

  3. The client caches this information for the next access.

  4. The client sends a PUT request to the RegionServer, which writes the operation log (WAL), then writes to the MemStore, and then syncs the WAL to HDFS; the write is then complete. (In this step, if the log or the memory write fails, the change is rolled back so that log and memory stay consistent.)

4.2 Reading process

When reading data, the MemStore and the StoreFiles are read together: the data from the StoreFiles is loaded into the BlockCache, then the in-memory data and the BlockCache data are merged by timestamp, and the newest version is returned.

5 Merging and splitting

  • Compaction

Because the MemStore generates a new HFile every time it is flushed, and different versions and different types of the same field may be spread across different HFiles, all HFiles have to be traversed during a query. To reduce the number of HFiles and to clean up expired and deleted data, StoreFiles are compacted (merged).

Compaction is divided into Minor Compaction and Major Compaction.

Minor Compaction will merge several adjacent smaller HFiles into one larger HFile, but will not clean up the expired and deleted data.

Major Compaction will merge all HFiles in a Store into one large HFile, and will clean up expired and deleted data.

parameter settings:

hbase.hregion.majorcompaction=0

hbase.hregion.majorcompaction.jitter=0

hbase.hstore.compactionThreshold=3
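
With hbase.hregion.majorcompaction set to 0 (time-based major compaction disabled, as above), compactions are usually triggered manually; a small shell sketch (test:order is an example table name):

# Minor-style compaction of a table
compact 'test:order'
# Major compaction: merges all StoreFiles of each Store and cleans up expired and deleted data
major_compact 'test:order'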

  • Splitting

By default, each table starts with only one Region. As data is continuously written, the Region splits automatically. When it splits, both child Regions stay on the current RegionServer, but for load-balancing reasons the HMaster may later move one of the Regions to another RegionServer.

parameter settings:

hbase.hregion.max.filesize = 5G (Max1 in the formula below; this value can be lowered to increase concurrency)

hbase.hregion.memstore.flush.size = 258M (Max2 in the formula below)

At each split, the smaller of two values is used as the threshold: min(Max1, Max2 × number of Regions × 2), where the number of Regions is the number of Regions of this table on the current RegionServer.
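
Splits can also be triggered manually from the shell; a small sketch (test:order and the split key 'row_5000' are example values):

# Ask HBase to split the table at a point it chooses
split 'test:order'
# Or split at an explicit row key
split 'test:order', 'row_5000'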

Since automatic splitting cannot avoid hotspots, in production we usually rely on pre-partitioning and careful RowKey design to avoid them.

6 Optimization

6.1 Try not to use multiple column families

This avoids generating multiple small files on every flush (each column family has its own MemStore, so one flush writes one StoreFile per column family).

6.2 Memory optimization

This memory is mainly used to cache table data, but flushing triggers GC, so it should not be too large. Allocate it according to cluster resources; generally around 70% of available memory is given to Hbase, and 16 to 48 GB is enough.

6.3 Allow file appends in HDFS

dfs.support.append=true (hdfs-site.xml, hbase-site.xml)

6.4 Optimize the maximum number of files a DataNode can serve concurrently

dfs.datanode.max.transfer.threads=4096 (HDFS configuration)

During a RegionServer-level compaction the RegionServer can become unavailable; this value can be raised according to cluster resources to increase concurrency.

6.5 Increase the number of RPC handlers

hbase.regionserver.handler.count=30

This value can be increased appropriately based on the cluster; it is mainly determined by the volume of client requests.

6.6 Optimize client cache

hbase.client.write.buffer = 100M (write buffer)

Increasing this value reduces the number of RPC calls, but it consumes more memory; set it according to cluster resources.

6.7 Optimization of merging and splitting

See Section 5, Merging and splitting.

6.8 Pre-partition

  • Add parameter SPLITS when creating table
create 'namespace:table', 'column_family1', 'column_family2', 'column_family3', 'column_family4'..., SPLITS => ['split_key', 'split_key', 'split_key', 'split_key']

The number of pre-partitions is chosen based on the data volume estimated for the next six months to a year and the maximum Region size.
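
Besides listing the split keys explicitly, the shell can also generate split points for you; a hedged sketch (table and column family names are examples):

# Explicit split keys: 3 keys produce 4 Regions
create 'test:order', 'info', SPLITS => ['100', '200', '300']
# Let HBase generate evenly spaced split points
create 'test:order2', 'info', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}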

6.9 RowKey

  • Hashability: evenly divided into different regions
  • Uniqueness: will not repeat
  • Length: 70-100

Option 1: random numbers or hash values. This distributes well, but range queries are impossible and related data is not stored together.

Option 2: string reversal, for example reversing the timestamp to get good distribution; data locality for reads is only slightly better than Option 1.

  • Recommended production plan:
# Design the pre-split keys (for example, 200 partitions). The ASCII code of | is 124; only } and ~ are greater, so whatever characters later RowKeys use, they sort below this character, which makes the RowKey ranges predictable.
000|
001|
......
199|


# 1 Design the RowKey prefix with _ (ASCII code 95)
000_
001_
......
199_
# 2 Take a unique business identifier (such as a user ID, phone number, or ID card number) and a time dimension (for example, by month: 202004), compute a value and take it modulo the number of partitions: (13408657784^202004) % 199 = partition number
# To query by time, put that time component forward in the key. For the data below, to scan one month of data the range is 000_13408657784_2020-04 -> 000_13408657784_2020-04|
000_13408657784_2020-04-01 12:12:12
......
199_13408657784_2020-04-01 24:12:12
