HBase Advanced Tuning Configuration

Copyright notice: this is the blogger's original article and may not be reproduced without the blogger's permission. https://blog.csdn.net/bryce123phy/article/details/82015477

hbase.regionserver.optionalcacheflushinterval

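Default is 1 hour (3600000 ms). The memstore of every region on a regionserver is flushed once an hour; this is one of the conditions that trigger a flush. The code is in PeriodicMemstoreFlusher.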

hbase.regionserver.regionSplitLimit

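Maximum number of regions on a regionserver. Once this value is exceeded, regions on that regionserver no longer split. Default is 2147483647 (Integer.MAX_VALUE); see HBASE-13008, which discusses a lower default for later releases.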

hbase.master.logcleaner.ttl

Maximum time an HLog file is kept in the old-logs directory before it can be cleaned up. Default is 10 minutes (600000 ms).

hbase.ipc.server.callqueue.handler.factor

Number of HBase IPC call queues = factor * handler count. Default is 0.1.

hbase.ipc.server.callqueue.read.ratio

Number of HBase IPC read queues = number of IPC call queues * read.ratio. Default is 0, meaning reads and writes are not separated.

hbase.ipc.server.callqueue.scan.ratio

Number of HBase IPC scan queues = number of IPC read queues * scan.ratio. Default is 0, meaning scans are not separated from other reads.
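
The three formulas above can be chained to see how a handler pool is divided into queues. The following is a minimal sketch, not HBase's actual code: the function name `call_queue_layout`, the truncating `int()` rounding, and the "at least one queue" floor are assumptions made for illustration.

```python
def call_queue_layout(handler_count, handler_factor=0.1, read_ratio=0.0, scan_ratio=0.0):
    """Apply the three formulas from the text; rounding behaviour is assumed."""
    # total queues = factor * handler count (assumed: never fewer than one queue)
    total = max(1, int(handler_count * handler_factor))
    if read_ratio <= 0:
        # read.ratio = 0: reads and writes share every queue
        return {"total": total, "shared": total, "write": 0, "get": 0, "scan": 0}
    read = int(total * read_ratio)   # read queues = total * read.ratio
    scan = int(read * scan_ratio)    # scan queues = read queues * scan.ratio
    return {
        "total": total,
        "shared": 0,
        "write": total - read,       # the remaining queues serve writes
        "get": read - scan,          # read queues not reserved for scans
        "scan": scan,
    }


if __name__ == "__main__":
    # Illustrative values only: 100 handlers, half of the queues for reads,
    # 40% of the read queues for scans.
    print(call_queue_layout(100, 0.1, 0.5, 0.4))
```

With both ratios left at 0 every queue is shared, which matches the "not separated" default described above.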

hbase.regionserver.logroll.errors.tolerated

Number of log roll errors tolerated. Default is 2; once this is exceeded, the regionserver aborts itself.

hbase.normalizer.period

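Interval at which HBase runs region normalization. Default is 300000 ms. For each table, normalization lists the table's regions and computes the average region size; it then splits any region larger than 2 * average size, or merges two regions when region1 + region2 < average size.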
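
The split and merge rules can be sketched as a small planning function. This is an illustration of the two rules as stated, not the normalizer's real implementation; the name `normalization_plan`, the tuple format, and the choice to consider only adjacent regions for merging are assumptions.

```python
def normalization_plan(regions):
    """regions: list of (name, size) tuples in key order. Returns planned actions."""
    if not regions:
        return []
    avg = sum(size for _, size in regions) / len(regions)
    plan = []
    i = 0
    while i < len(regions):
        name, size = regions[i]
        if size > 2 * avg:
            # rule 1: a region larger than twice the average is split
            plan.append(("split", name))
        elif i + 1 < len(regions) and size + regions[i + 1][1] < avg:
            # rule 2: two regions whose combined size is below the average are merged
            # (assumption: only the next region in key order is considered)
            plan.append(("merge", name, regions[i + 1][0]))
            i += 1  # the neighbour is consumed by the merge
        i += 1
    return plan


if __name__ == "__main__":
    # Sizes in MB, purely illustrative; the average here is 15.
    print(normalization_plan([("a", 1), ("b", 2), ("c", 50), ("d", 10), ("e", 12)]))
```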

hbase.regions.slop

If any regionserver holds more than average + (average * slop) or fewer than average - (average * slop) regions, the regions of the cluster are rebalanced.
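
As a rough sketch of that check (the function name `needs_rebalance` and the slop value used in the example are hypothetical, and the real balancer may round the bounds differently):

```python
def needs_rebalance(regions_per_server, slop):
    """True if any server falls outside average +/- (average * slop)."""
    avg = sum(regions_per_server) / len(regions_per_server)
    upper = avg + avg * slop
    lower = avg - avg * slop
    return any(n > upper or n < lower for n in regions_per_server)


if __name__ == "__main__":
    # Illustrative slop of 0.2: with an average of 12 the bounds are 9.6 and 14.4.
    print(needs_rebalance([10, 10, 16], 0.2))
    print(needs_rebalance([10, 10, 10], 0.2))
```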

hbase.server.thread.wakefrequency

Sleep interval, in milliseconds, of service threads such as the log roller. Default is 10000.

hbase.hregion.percolumnfamilyflush.size.lower.bound

If FlushLargeStoresPolicy is used, then every time that we hit the total memstore limit, we find out all the column families whose memstores exceed this value, and only flush them, while retaining the others whose memstores are lower than this limit. If none of the families have their memstore size more than this, all the memstores will be flushed (just as usual). This value should be less than half of the total memstore threshold (hbase.hregion.memstore.flush.size).

Default: 16777216 (16 MB)
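
The selection rule in this description can be sketched as follows. It illustrates the stated behaviour only, not the FlushLargeStoresPolicy source; `families_to_flush` and the dictionary input are hypothetical.

```python
LOWER_BOUND = 16777216  # the default quoted above, in bytes


def families_to_flush(memstore_sizes, lower_bound=LOWER_BOUND):
    """memstore_sizes: dict mapping column family name -> memstore size in bytes."""
    large = [cf for cf, size in memstore_sizes.items() if size > lower_bound]
    # If no family exceeds the lower bound, fall back to flushing all of them.
    return large if large else list(memstore_sizes)


if __name__ == "__main__":
    mb = 1024 * 1024
    print(families_to_flush({"big": 100 * mb, "small": 1 * mb}))
    print(families_to_flush({"cf1": 1 * mb, "cf2": 2 * mb}))
```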

hbase.hregion.memstore.block.multiplier

Block updates if the memstore reaches hbase.hregion.memstore.block.multiplier times hbase.hregion.memstore.flush.size bytes. Useful for preventing runaway memstore growth during spikes in update traffic. Without an upper bound, the memstore fills such that when it flushes, the resultant flush files take a long time to compact or split, or worse, we OOME.

Default: 4

Note: when a memstore grows beyond hbase.hregion.memstore.block.multiplier * hbase.hregion.memstore.flush.size, writes to that region are blocked and a RegionTooBusyException is thrown. The log shows a message like:

Above memstore limit, regionName=XXX, server=XXX, memstoreSize=XXX, blockingMemStoreSize=XXX
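
To estimate the blockingMemStoreSize that appears in this message, multiply the two settings. A minimal sketch, assuming a flush size of 134217728 bytes (128 MB) purely as an example value; the function names are hypothetical.

```python
def blocking_memstore_size(flush_size, multiplier=4):
    """Threshold above which writes to the region are blocked."""
    return flush_size * multiplier


def is_write_blocked(memstore_size, flush_size, multiplier=4):
    """True when the region's memstore has grown past the blocking threshold."""
    return memstore_size > blocking_memstore_size(flush_size, multiplier)


if __name__ == "__main__":
    flush_size = 134217728  # example value for hbase.hregion.memstore.flush.size
    print(blocking_memstore_size(flush_size))
    print(is_write_blocked(600 * 1024 * 1024, flush_size))
```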

hbase.hregion.max.filesize

Maximum HFile size. If any HFile in a region grows larger than this value, the region is split in two. Default is 10737418240 (10 GB).

hbase.hstore.compactionThreshold

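Compaction threshold of a store: once the number of store files in the store exceeds this value, the store takes part in compaction. Default is 3.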

hbase.hstore.flusher.count

Number of flush threads on each regionserver. Default is 2.

hbase.hstore.blockingStoreFiles

Limit on the number of store files per store. When any store exceeds this value, writes to the region that owns it are blocked. Default is 10.

hbase.hstore.compaction.max & hbase.hstore.compaction.min

Maximum and minimum number of HFiles included in one minor compaction. Defaults are 10 and 3 respectively; the latter means a store is compacted once it has more than 3 store files.

hbase.hstore.compaction.max.size

Store files larger than this value are no longer included in compaction. Default is Long.MAX_VALUE (9223372036854775807), meaning no limit.

hbase.hstore.compaction.max.size.offpeak

During off-peak hours, store files larger than this value are no longer included in compaction; it is the off-peak counterpart of hbase.hstore.compaction.max.size.

hbase.hstore.compaction.ratio

For minor compaction, this ratio is used to determine whether a given StoreFile which is larger than `hbase.hstore.compaction.min.size` is eligible for compaction. Its effect is to limit compaction of large StoreFile. The value of `hbase.hstore.compaction.ratio` is expressed as a floating-point decimal.

Minor compaction uses the RatioBased algorithm to select a set of files to compact; every file in the set must satisfy the following condition:

FileSize(i) <= ( Sum(0,N,FileSize(_)) - FileSize(i) ) * Ratio

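The Ratio in the expression above is hbase.hstore.compaction.ratio. The larger the ratio, the larger the files that may take part in compaction.

Default: 1.2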
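
The condition can be checked for a candidate set with a few lines of code. This is a sketch of the inequality exactly as written above, not the compaction policy's source; `files_in_ratio` is a hypothetical name and the file sizes are made up.

```python
def files_in_ratio(file_sizes, ratio=1.2):
    """True if every file satisfies FileSize(i) <= (sum of the other files) * ratio."""
    total = sum(file_sizes)
    return all(size <= (total - size) * ratio for size in file_sizes)


if __name__ == "__main__":
    # 100 <= (50 + 40) * 1.2 = 108, so this set qualifies.
    print(files_in_ratio([100, 50, 40]))
    # 300 > (50 + 40) * 1.2, so the large file keeps the set from qualifying.
    print(files_in_ratio([300, 50, 40]))
    # A larger ratio (5.0 here, for illustration) admits the same set.
    print(files_in_ratio([300, 50, 40], ratio=5.0))
```

The last call shows why a larger ratio lets bigger files into a compaction.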

hbase.hstore.compaction.ratio.offpeak

Ratio used during off-peak hours; it may be higher than hbase.hstore.compaction.ratio.

hbase.server.compactchecker.interval.multiplier

The original English description says it well:

The number that determines how often we scan to see if compaction is necessary. Normally, compactions are done after some events (such as memstore flush), but if region didn't receive a lot of writes for some time, or due to different compaction policies, it may be necessary to check it periodically. The interval between checks is hbase.server.compactchecker.interval.multiplier multiplied by hbase.server.thread.wakefrequency.
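
As a quick worked example of that product (the function name and the two input values are illustrative assumptions, not settings taken from this article):

```python
def compaction_check_interval_ms(multiplier, wake_frequency_ms):
    """Interval between periodic compaction checks, in milliseconds."""
    return multiplier * wake_frequency_ms


if __name__ == "__main__":
    # Example: a multiplier of 1000 with a 10000 ms wake frequency.
    interval = compaction_check_interval_ms(1000, 10000)
    print(interval, "ms, about", round(interval / 3600000, 2), "hours")
```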

hbase.hstore.compaction.kv.max

Both flush and compaction use a scanner to iterate over cells; this parameter limits the maximum number of cells read by a single call to next. Default is 10.

How many KeyValues to read and then write in a batch when flushing or compacting

hbase.regionserver.handler.abort.on.error.percent

The percentage of region server RPC handler threads that must fail before the RS aborts.

    -1: Disable aborting;
    0: Abort if even a single handler has died;
    0.x: Abort only when this percent of handlers have died;
    1: Abort only when all of the handlers have died.

Default: 0.5
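
The four cases can be expressed as one decision function. This sketch encodes the semantics as described in the list above; it is not the regionserver's actual check, and `should_abort` and its parameters are hypothetical.

```python
def should_abort(failed_handlers, total_handlers, abort_percent=0.5):
    """Decide whether the RS should abort, following the described semantics."""
    if abort_percent < 0:
        return False  # -1: aborting is disabled
    if failed_handlers <= 0:
        return False  # nothing has failed yet
    if abort_percent == 0:
        return True   # 0: a single dead handler is enough
    # 0.x or 1: abort once the failed share reaches the configured percent
    return failed_handlers / total_handlers >= abort_percent


if __name__ == "__main__":
    print(should_abort(14, 30))        # below half of the handlers
    print(should_abort(15, 30))        # half of the handlers have died
    print(should_abort(1, 30, -1))     # aborting disabled
```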

hbase.regionserver.lease.period

Deprecated.

The new name is hbase.client.scanner.timeout.period, with a default of 60000 (ms). It is a client-side setting: the timeout applied to a client scan.
