Explaining Core Big Data HBase Knowledge in Plain Language, Lao Liu Is Really Attentive (2)


Foreword: Lao Liu is currently preparing hard for next year's campus recruitment. The main purpose of writing these articles is to explain, in plain language, the big data knowledge points he has reviewed, refusing to parrot the rote wording of training materials and insisting on his own understanding!

01 HBase knowledge points

Point 6: HRegionServer architecture

(Figure: HRegionServer architecture diagram)
Why should we understand the architecture of HRegionServer? Because how data is stored in an HBase cluster is closely tied to HRegionServer; only by understanding its architecture can we make sense of the logic of data storage.

Now let Lao Liu walk through the HRegionServer architecture properly.

StoreFile

In the HRegionServer architecture diagram, a StoreFile is the physical file that holds the actual data. StoreFiles are stored on HDFS in the form of HFiles. Each Store has one or more StoreFiles, and the data inside each StoreFile is ordered. Why it is ordered will be explained in the MemStore part.

As for what HFile is, Lao Liu's understanding is simply that it is the storage format of a StoreFile on HDFS; in other words, StoreFiles are persisted on HDFS as HFiles.

MemStore

It is the write cache. Since HFile requires data to be ordered by rowkey, data cannot go straight to HDFS; it is first stored in the MemStore and sorted there, and once a threshold is reached it is flushed to a StoreFile. Each flush produces a new StoreFile.

How does MemStore sort the data? The teaching materials of many institutions do not mention this. After checking a lot of material, Lao Liu has only a fairly shallow understanding: the data structure behind the MemStore is a skip list (SkipList), which supports efficient query, insertion, and deletion. Because a skip list is essentially built from ordered linked lists, many KV databases use skip lists to implement ordered data collections. So after data arrives in the MemStore, a skip list is used to keep it ordered.
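To get a feel for why a skip list keeps data ordered, here is a minimal Java sketch using ConcurrentSkipListMap, the JDK's skip-list-backed map (the rowkeys and values are made up for illustration; this only demonstrates the data structure, not HBase's actual MemStore class):

import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListDemo {
    public static void main(String[] args) {
        // A skip-list-backed map keeps its entries sorted by key at all times,
        // which is exactly the property the MemStore needs for rowkeys.
        ConcurrentSkipListMap<String, String> memstore = new ConcurrentSkipListMap<>();

        // Insert rowkeys out of order (hypothetical example data).
        memstore.put("row_3000", "value-c");
        memstore.put("row_1000", "value-a");
        memstore.put("row_2000", "value-b");

        // Iteration already comes back in rowkey order: row_1000, row_2000, row_3000.
        for (Map.Entry<String, String> e : memstore.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}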

WAL

WAL, whose full name is Write Ahead Log, means writing the log ahead of the data. Since data must be sorted in the MemStore before it is flushed to an HFile, keeping the data only in memory carries a high risk of loss. To solve this problem, the data is first written to a file called the Write Ahead Log and only then written to the MemStore. So when the system fails, the data can be rebuilt from this log file, avoiding data loss.
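From the client's side, each write (Put) carries a durability setting that controls how the WAL is used for it. A minimal sketch with the HBase Java client (the rowkey, column family and values are made-up examples; whether SYNC_WAL is the effective default depends on the table settings):

import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class WalDurabilityDemo {
    public static void main(String[] args) {
        // Build a Put for a hypothetical rowkey and column.
        Put put = new Put(Bytes.toBytes("rowkey-0001"));
        put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("laoliu"));

        // SYNC_WAL asks the RegionServer to append this mutation to the WAL
        // before acknowledging it; SKIP_WAL trades that safety for speed.
        put.setDurability(Durability.SYNC_WAL);
        System.out.println("durability = " + put.getDurability());
    }
}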

BlockCache

It is the read cache. Data returned by each query is cached in the BlockCache, so that the next query for that data is faster.

Point 7: HBase read data process

The first thing to say is that there is only one meta table (metadata table) in HBase; this table has only one Region, and that Region's data is stored on one HRegionServer.

Remember ZooKeeper's role from the first HBase article? ZooKeeper stores the HBase metadata, that is, the location of the meta table. Now think about reading data: isn't it like visiting a teacher at a school? Don't you first have to register at the school gate to find out which department and office the teacher is in?

So, the HBase read data process is as follows:

1. To read data, HBase first connects to zk and finds from zk the region location of the meta table, that is, which HRegionServer the meta table is stored on. After receiving this information, the client immediately establishes a connection with that HRegionServer and then reads the data in the meta table. The information in the meta table includes: which tables exist in the HBase cluster, which Regions each table has, and which RegionServers those Regions are stored on; in other words, it holds the Region information of all user tables.

2. Based on the namespace (equivalent to the database name in a relational database), table name, and rowkey to be queried, find the Region information corresponding to the data being read.

3. Then find the RegionServer hosting this Region and send the request to it.

4. Now the corresponding Region can be found and located.

5. We first look for the data in the MemStore; if it is not there, we read from the BlockCache; if it is not in the BlockCache either, we finally read from the StoreFiles. After the data is read from a StoreFile, the result is not returned to the client directly: the data is first written into the BlockCache to speed up subsequent queries, and then the result is returned to the client.
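From the client's point of view, all of the steps above are hidden behind a single Get call. A minimal sketch with the HBase Java client (the ZooKeeper quorum, table name, rowkey and column names are assumptions for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseReadDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node01,node02,node03"); // hypothetical ZK quorum

        // The client library does the ZooKeeper lookup for the meta table and the
        // meta lookup for the target Region; our code only issues the Get.
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("person"))) {   // hypothetical table
            Get get = new Get(Bytes.toBytes("rowkey-0001"));               // hypothetical rowkey
            get.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"));
            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
            System.out.println(value == null ? "not found" : Bytes.toString(value));
        }
    }
}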

Point 8: HBase write data process

1. The client first finds the region location of the meta table from zk, and then reads the data in the meta table. The meta table stores the Region information of all user tables.

2. Based on the namespace (equivalent to the database name in a relational database), table name, and rowkey of the data to be written, find the Region information corresponding to that data.

3. Then find the RegionServer hosting this Region and send the request to it.

4. Now you can write (append) data sequentially to WAL.

5. Write the data to the corresponding MemStore, and the data will be sorted in the MemStore.

6. When the MemStore's flush condition is reached, the data is flushed to a StoreFile (HFile).
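On the client side, the whole write path above again boils down to one Put call; the WAL append, the MemStore write and the later flush all happen on the RegionServer. A minimal sketch (the ZooKeeper quorum, table name and data are assumptions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseWriteDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node01,node02,node03"); // hypothetical ZK quorum

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("person"))) {   // hypothetical table
            Put put = new Put(Bytes.toBytes("rowkey-0001"));               // hypothetical rowkey
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("laoliu"));
            // The RegionServer appends this mutation to the WAL, writes it into
            // the MemStore, and a later flush turns it into a StoreFile.
            table.put(put);
        }
    }
}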

But think about it, if you keep writing data, will there be many StoreFiles?

In an HBase cluster, the many StoreFiles that get generated are merged into one larger StoreFile (this is a minor compaction), and eventually all StoreFiles go through a major compaction. Meanwhile, as data keeps being written, the data in a Region grows larger and larger; at that point the Region is split (divided into two). The splitting process is fairly expensive, which is also why pre-partitioning exists.

Lao Liu has roughly explained why HBase has flush and compaction mechanisms, Region split and merge, and pre-partitioning. Let's go through these knowledge points in detail below.

Point 9: flush and compaction mechanisms in HBase

Flush mechanism

Data needs to be flushed to disk. There are several occasions that trigger a flush; it does not just happen arbitrarily.

The first is reaching the MemStore-level limit. When the size of any single MemStore in a Region reaches the upper limit (128 MB by default), a MemStore flush is triggered. The limit is configured as follows:

<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>134217728</value>
</property>

The second is reaching the Region-level limit. When the total size of all MemStores in a Region reaches the upper limit (2 × 128 MB = 256 MB by default), a MemStore flush is triggered. The limit is configured as follows:

<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>134217728</value>
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>2</value>
</property>

The third is reaching the RegionServer-level limit. When the total size of all MemStores on a RegionServer exceeds the low watermark threshold, the RegionServer starts to force flushes: the Region with the largest MemStore is flushed first, then the next largest, and so on.

If writes come in faster than flushes can write data out, the total MemStore size will eventually exceed the high watermark threshold (40% of the JVM heap by default). At that point the RegionServer blocks updates and forces flushes until the total MemStore size falls below the low watermark threshold. These thresholds are configured as follows:

<property>
  <name>hbase.regionserver.global.memstore.size.lower.limit</name>
  <value>0.95</value>
</property>
<property>
  <name>hbase.regionserver.global.memstore.size</name>
  <value>0.4</value>
</property>
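A quick worked example of these two thresholds (the 16 GB heap is just an assumed number): with a 16 GB RegionServer heap, the high watermark is 16 GB × 0.4 = 6.4 GB of total MemStore, and the low watermark is 6.4 GB × 0.95 ≈ 6.08 GB. Once the total crosses 6.4 GB, updates are blocked and flushing continues until the total drops back below 6.08 GB.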

The fourth is when the number of WAL files exceeds hbase.regionserver.max.logs: Regions are flushed in chronological order until the number of WAL files drops below hbase.regionserver.max.logs (this property has since been deprecated and no longer needs to be set manually; the maximum defaults to 32).

The fifth is flushing the MemStore periodically. The default period is 1 hour, which ensures that MemStore contents do not go unpersisted for too long. To avoid the problems caused by flushing all MemStores at the same time, the periodic flush adds a random delay of roughly 20,000 ms.
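Besides all of these automatic triggers, a flush can also be requested by hand through the Admin API. A minimal sketch (the connection details and table name are assumptions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ManualFlushDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node01,node02,node03"); // hypothetical ZK quorum

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // Ask the cluster to flush all MemStores of this table to StoreFiles.
            admin.flush(TableName.valueOf("person"));               // hypothetical table
        }
    }
}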

Compaction merge mechanism

Why is there a compaction mechanism?

To prevent too many small files from piling up and to keep queries efficient, HBase needs to merge these small store files into a relatively large store file when necessary. This process is called compaction.

There are mainly two types of compaction in HBase:

minor compaction

It merges multiple HFiles within a Store into one HFile. In this process, some small, adjacent StoreFiles are selected and merged into one larger StoreFile.

Data that has exceeded its TTL (time to live), superseded versions, and deleted data are only marked; there is no physical deletion. The result of a minor compaction is fewer but larger StoreFiles. This kind of compaction is triggered quite frequently.

The trigger conditions for minor compaction are determined by the following parameters:

<!-- Minor compaction starts only when at least 3 store files meet the conditions -->
<property>
  <name>hbase.hstore.compactionThreshold</name>
  <value>3</value>
</property>

<!-- At most 10 store files are selected in a single minor compaction -->
<property>
  <name>hbase.hstore.compaction.max</name>
  <value>10</value>
</property>

<!-- Default 128 MB: store files smaller than this value are always included in a minor compaction -->
<property>
  <name>hbase.hstore.compaction.min.size</name>
  <value>134217728</value>
</property>

<!-- Default Long.MAX_VALUE: store files larger than this value are always excluded from minor compaction -->
<property>
  <name>hbase.hstore.compaction.max.size</name>
  <value>9223372036854775807</value>
</property>

major compaction

It combines all the StoreFiles of a Store into one StoreFile, and in the process cleans up three kinds of meaningless data: deleted data, data whose TTL has expired, and data whose version count exceeds the configured maximum. Its merge frequency is relatively low: by default it runs once every 7 days, and the performance cost is very high. In production it is generally recommended to disable the automatic schedule and trigger it manually when the application is idle; controlling it manually keeps it from kicking in during business peaks.

Major compaction trigger time condition (7 days)

<!-- By default a major compaction runs once every 7 days -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>604800000</value>
</property>

Manual trigger

# Use the major_compact command
major_compact tableName
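The same can be done from the Java Admin API, where compact requests a minor compaction and majorCompact requests a major one. A minimal sketch (the connection details and table name are assumptions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ManualCompactionDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node01,node02,node03"); // hypothetical ZK quorum

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            TableName table = TableName.valueOf("person");          // hypothetical table
            admin.compact(table);        // request a minor compaction
            admin.majorCompact(table);   // request a major compaction
        }
    }
}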

Point 10: Region split

A Region stores a large amount of data keyed by rowkey. When a Region holds too many entries, query efficiency is directly affected, so when a Region grows too large, HBase splits it. This is also one of HBase's advantages.

HBase has several Region split policies; let's talk about three of them first.

1. ConstantSizeRegionSplitPolicy

It was the default split policy before version 0.94.

When the size of a Region exceeds a fixed threshold (hbase.hregion.max.filesize, 10 GB by default), a split is triggered and the Region is divided roughly evenly into two Regions.

However, this split policy has a considerable drawback: it does not distinguish between large tables and small tables. A larger threshold is friendly to large tables, but a small table may never trigger a split and in the extreme case stays as a single Region, which is not good for the business. If the threshold is set small, it is friendly to small tables, but a large table will then generate a huge number of Regions across the cluster, which is not good for cluster management, resource usage, or failover.

2. IncreasingToUpperBoundRegionSplitPolicy

It was the default split policy from version 0.94 up to version 2.0.

This split policy is a bit more complicated. Broadly, the idea is the same as ConstantSizeRegionSplitPolicy: a Region larger than a threshold triggers a split. But the threshold is not a fixed value as in ConstantSizeRegionSplitPolicy; it is adjusted continuously under certain conditions, and the adjustment rule is related to how many Regions of the same table currently live on the RegionServer hosting the Region.
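To make this concrete, the rule Lao Liu has seen cited (treat the exact formula as a hedged recollection rather than gospel) is: threshold = min(number of Regions of this table on the RegionServer cubed × 2 × flush size, hbase.hregion.max.filesize). With a 128 MB flush size and a 10 GB maximum, 1 Region gives a 256 MB threshold, 2 Regions give 2 GB, 3 Regions give about 6.75 GB, and from 4 Regions onward the 10 GB cap applies.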

3. SteppingSplitPolicy

It is the default split policy in version 2.0.

The split threshold changes again under this policy. Generally, a design has some unreasonable spots when it first comes out, and as things slowly evolve the designers keep improving it.

Compared with IncreasingToUpperBoundRegionSplitPolicy it is simpler, and the rule is still related to how many Regions of the table being split live on the current RegionServer: if that number equals 1, the split threshold is flush size × 2, otherwise it is MaxRegionFileSize. In a large cluster this policy is friendlier to both large and small tables than IncreasingToUpperBoundRegionSplitPolicy: small tables no longer generate a large number of tiny Regions, just as many as they actually need.
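If a particular table should use a specific policy rather than the cluster default, the policy class can be set on the table descriptor. A minimal sketch assuming the HBase 2.x TableDescriptorBuilder API (the table and column family names are made up; double-check the method name against your client version):

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

public class SplitPolicyDemo {
    public static void main(String[] args) {
        // Build a table descriptor that pins the split policy for this table only.
        TableDescriptor desc = TableDescriptorBuilder
                .newBuilder(TableName.valueOf("person"))                    // hypothetical table
                .setColumnFamily(ColumnFamilyDescriptorBuilder.of("info"))
                // Fully qualified class name of the split policy to use.
                .setRegionSplitPolicyClassName(
                        "org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy")
                .build();
        // The descriptor would then be passed to Admin.createTable(desc).
        System.out.println(desc);
    }
}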

There are still a few more policies, but Lao Liu can hardly remember them.

Point 11: Region merge

When do I need to merge Regions?

For example, after a Region is split it becomes two, but during use the data in these two Regions gets deleted and the data volume shrinks; to make management easier, they can be merged.

Another example is after a large amount of data is deleted: each Region becomes very small, and keeping so many Regions around is wasteful. The Regions can then be merged, which may even allow some RegionServer nodes to be reduced.

In short, Region merging is not for performance, but to facilitate maintenance and management.

Point 12: Pre-partitioning of HBase tables

When a table is first created, HBase assigns it only one Region by default. That means all read and write requests go to the same Region on the same RegionServer, so load balancing cannot take effect at all; the other RegionServers in the cluster may sit relatively idle, a case of one person doing the work while everyone else watches the show.

To solve this problem, pre-partitioning can be used: when the table is created, multiple Regions are generated according to the configuration.

Pre-partitioning principle

Each Region maintains a startRowKey and an endRowKey. If the rowkey of newly added data falls within the range maintained by a certain Region, that data is handed over to this Region to maintain.
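For example, with the split points used in method one below ('1000', '2000', '3000', '4000'), a newly written rowkey of '2500' sorts after '2000' and before '3000', so that row is handed to the Region whose range is [2000, 3000).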

How to manually specify pre-partitions

method one:

create 'person','info1','info2',SPLITS => ['1000','2000','3000','4000']

This creates five Regions, with 1000, 2000, 3000, and 4000 as the split points between them.
Method two:

Create partition rules in files

cd /kkb/install

vim split.txt

Add something like this inside:

aaa
bbb
ccc
ddd

Then execute this command:

create 'student','info',SPLITS_FILE => '/kkb/install/split.txt'

This creates five Regions, with aaa, bbb, ccc, and ddd as the split points between them.
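The same pre-splitting can also be done from the Java Admin API by passing split keys when the table is created. A minimal sketch (the connection details, table name and split keys are assumptions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitCreateDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node01,node02,node03"); // hypothetical ZK quorum

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            TableDescriptor desc = TableDescriptorBuilder
                    .newBuilder(TableName.valueOf("student"))               // hypothetical table
                    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("info"))
                    .build();
            // Four split keys produce five Regions, just like the SPLITS_FILE above.
            byte[][] splitKeys = {
                    Bytes.toBytes("aaa"), Bytes.toBytes("bbb"),
                    Bytes.toBytes("ccc"), Bytes.toBytes("ddd")
            };
            admin.createTable(desc, splitKeys);
        }
    }
}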

02 Summary

Well, that just about wraps up the second batch of big data HBase knowledge points. There is a lot of content, so take the time to understand it carefully and try to explain these knowledge points in your own words.

Finally, if you feel anything is wrong or off, you can contact the official account "Lao Liu who works hard" and discuss it! I hope this is helpful to students interested in big data development, and I hope to get their guidance as well.

Origin: blog.csdn.net/qq_36780184/article/details/110287761