HBase BlockCache

HBase implements two cache structures: MemStore and BlockCache. MemStore serves mainly as the write cache, and BlockCache as the read cache. MemStore is called the write cache because a write goes to MemStore (and is appended sequentially to the HLog) rather than straight to the data files; once the MemStore reaches the size threshold configured for the system, a flush is triggered and its data is written to disk. This design greatly improves HBase write performance. MemStore also matters for read performance: without it, reading data that was just written would require file IO, which is clearly expensive.
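The write path described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not HBase code: the class name WritePathSketch, the entry-count threshold, and the in-memory lists standing in for the HLog and for files on disk are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy write path: log the write, buffer it in sorted memory, flush at a threshold. */
final class WritePathSketch {
    private final List<String> hlog = new ArrayList<>();          // stand-in for the sequential HLog
    private TreeMap<String, String> memStore = new TreeMap<>();   // sorted in-memory write buffer
    private final List<TreeMap<String, String>> flushedFiles = new ArrayList<>(); // stand-in for disk files
    private final int flushThreshold;                             // hypothetical limit, counted in entries

    WritePathSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String rowKey, String value) {
        hlog.add(rowKey + "=" + value);   // sequential append, kept for recovery
        memStore.put(rowKey, value);      // the write can now be served from memory
        if (memStore.size() >= flushThreshold) {
            flush();
        }
    }

    private void flush() {
        flushedFiles.add(memStore);       // persist the sorted snapshot as one "file"
        memStore = new TreeMap<>();       // continue with a fresh, empty MemStore
    }

    int memStoreSize() { return memStore.size(); }
    int flushedFileCount() { return flushedFiles.size(); }
    int logLength() { return hlog.size(); }
}
```

In this sketch the third put with a threshold of 3 empties the MemStore and produces one flushed file, while the log keeps all three entries. Real HBase measures the threshold in bytes rather than entries.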

BlockCache is the read cache. A client read request first looks for the data in MemStore; if it is not found there, it checks BlockCache; if it is still not found, the data is read from disk and placed into BlockCache at the same time. Caches commonly use one of three eviction policies: FIFO (First In First Out), LRU (Least Recently Used) and LFU (Least Frequently Used). HBase's BlockCache uses the LRU policy: when the size of the BlockCache reaches its upper limit, the eviction mechanism is triggered and the least recently used batch of data is evicted.
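The lookup order and the LRU policy can be illustrated with a small Java sketch. It makes several simplifying assumptions: the names ReadPathSketch and LruCache are hypothetical, plain maps stand in for MemStore and disk, the cache holds single values rather than blocks, and it evicts one entry at a time rather than a batch.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class ReadPathSketch {
    /** LRU map: an access-ordered LinkedHashMap that drops its eldest entry past capacity. */
    static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true);   // true = access order, least recently used first
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity; // evict once the upper limit is exceeded
        }
    }

    final Map<String, String> memStore = new HashMap<>();  // stand-in for MemStore
    final LruCache<String, String> blockCache;             // stand-in for BlockCache
    final Map<String, String> disk = new HashMap<>();      // stand-in for files on disk
    int diskReads = 0;

    ReadPathSketch(int cacheCapacity) {
        blockCache = new LruCache<>(cacheCapacity);
    }

    String get(String key) {
        String v = memStore.get(key);          // 1. recently written data
        if (v != null) return v;
        v = blockCache.get(key);               // 2. read cache (a hit refreshes recency)
        if (v != null) return v;
        v = disk.get(key);                     // 3. fall back to disk
        diskReads++;
        if (v != null) blockCache.put(key, v); // cache what was read
        return v;
    }
}
```

With a capacity of 2, reading a, b, a, c in that order leaves a and c cached: b is the least recently used entry when c arrives, so b is the one evicted.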

One RegionServer has a single BlockCache and N MemStores.

The position of BlockCache within HBase is shown in the figure:

BlockCache is implemented on top of an on-heap ConcurrentHashMap. The map key is an object of type BlockCacheKey, whose member variables include offset and hfileName. The map value is an object of type LruCachedBlock, which represents one cached entity and defines a member variable accessTime that is compared during LRU eviction. The BlockCache has a fixed size, determined by the parameter hfile.block.cache.size, which defaults to 40% of the RegionServer heap memory.
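A minimal Java sketch of this structure follows. It is an assumption-laden illustration, not the HBase source: OnHeapCacheSketch, Key and CachedBlock are hypothetical stand-ins for the cache, BlockCacheKey and LruCachedBlock; a logical counter replaces real timestamps; and eviction scans for the single oldest block instead of using HBase's actual eviction logic.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class OnHeapCacheSketch {
    /** Key: which file the block belongs to, and where in the file it starts. */
    static final class Key {
        final String hfileName;
        final long offset;
        Key(String hfileName, long offset) { this.hfileName = hfileName; this.offset = offset; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).offset == offset
                    && ((Key) o).hfileName.equals(hfileName);
        }
        @Override public int hashCode() { return Objects.hash(hfileName, offset); }
    }

    /** Value: the block bytes plus the access time used to choose eviction victims. */
    static final class CachedBlock {
        final byte[] data;
        volatile long accessTime;
        CachedBlock(byte[] data, long accessTime) { this.data = data; this.accessTime = accessTime; }
    }

    private final ConcurrentHashMap<Key, CachedBlock> map = new ConcurrentHashMap<>();
    private final AtomicLong clock = new AtomicLong();  // logical clock, not wall time
    private final long maxBytes;
    private long usedBytes = 0;                          // guarded by synchronized cacheBlock

    /** cacheFraction plays the role of hfile.block.cache.size, e.g. 0.4 for 40% of the heap. */
    OnHeapCacheSketch(long heapBytes, double cacheFraction) {
        this.maxBytes = (long) (heapBytes * cacheFraction);
    }

    long maxBytes() { return maxBytes; }

    byte[] getBlock(Key key) {
        CachedBlock b = map.get(key);
        if (b == null) return null;
        b.accessTime = clock.incrementAndGet();          // record the access for LRU ordering
        return b.data;
    }

    synchronized void cacheBlock(Key key, byte[] data) {
        CachedBlock old = map.put(key, new CachedBlock(data, clock.incrementAndGet()));
        usedBytes += data.length - (old == null ? 0 : old.data.length);
        while (usedBytes > maxBytes && !map.isEmpty()) { // over the fixed size: evict oldest
            Key victim = null;
            long oldest = Long.MAX_VALUE;
            for (Map.Entry<Key, CachedBlock> e : map.entrySet()) {
                if (e.getValue().accessTime < oldest) {
                    oldest = e.getValue().accessTime;
                    victim = e.getKey();
                }
            }
            usedBytes -= map.remove(victim).data.length;
        }
    }
}
```

For example, a sketch cache built with a 1000-byte "heap" and a fraction of 0.4 accepts 400 bytes of blocks before it starts evicting the block with the smallest accessTime.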

Before introducing BlockCache in more detail, briefly review the concept of a Block in HBase. A Block is the smallest unit of data storage in HBase. The default size is 64K, and it can be specified with the BLOCKSIZE parameter in the table creation statement. HBase Blocks fall into four types: Data Block, Index Block, Bloom Block and Meta Block. A Data Block stores the actual data, and each Data Block can usually hold multiple KeyValue entries. Index Blocks and Bloom Blocks both optimize the lookup path for random reads: an Index Block locates data by storing index entries, while a Bloom Block uses a filtering algorithm to rule out files that certainly do not contain the requested KeyValue, which reduces unnecessary IO. A Meta Block mainly stores metadata for the entire HFile.
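To show why a Bloom Block can rule out files without reading them, here is a generic Bloom filter sketch in Java. It is not HBase's implementation: the class name BloomSketch, the bit-array size and the hashing scheme (double hashing derived from String.hashCode) are assumptions chosen for brevity.

```java
import java.util.BitSet;

/** Generic Bloom filter: "false" means definitely absent, "true" means possibly present. */
final class BloomSketch {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    BloomSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** i-th bit position for a key, using double hashing on the key's hash code. */
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1;                  // odd stride, so it is never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;                      // one clear bit proves the key was never added
            }
        }
        return true;                               // may be a false positive
    }
}
```

A reader would keep one such filter per file and skip any file whose filter returns false for the requested key; only files that return true need an index lookup and a Data Block read.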

 

BlockCache is a RegionServer-level structure: each RegionServer has exactly one BlockCache, which is initialized when the RegionServer starts. So far HBase has implemented three BlockCache schemes. LRUBlockCache was the original implementation and is also the default. HBase 0.92 implemented the second scheme, SlabCache (see HBASE-4027). HBase 0.96 and later provide another option, BucketCache (see HBASE-7404).

 

The three schemes differ in how they manage memory. LRUBlockCache keeps all data in the JVM heap and leaves its management to the JVM. The other two use different mechanisms to store part of the data off-heap, where HBase manages it itself. This evolution came about because the LRUBlockCache scheme often leads to long pauses during JVM garbage collection, whereas managing the data in off-heap memory avoids them.
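The on-heap versus off-heap distinction can be seen with the standard java.nio API. The sketch below is an illustration only and does not reflect how SlabCache or BucketCache are written; the class OffHeapSketch and its method names are hypothetical. It copies a block into a direct buffer, whose contents may reside outside the garbage-collected heap.

```java
import java.nio.ByteBuffer;

final class OffHeapSketch {
    /** Copies a block into a direct buffer and returns it ready for reading. */
    static ByteBuffer storeOffHeap(byte[] block) {
        ByteBuffer buf = ByteBuffer.allocateDirect(block.length); // direct (off-heap) allocation
        buf.put(block);
        buf.flip();                                               // switch from writing to reading
        return buf;
    }

    /** Copies the bytes back onto the heap without moving the buffer's position. */
    static byte[] readBack(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);   // duplicate shares content but has its own position
        return out;
    }
}
```

The trade-off is visible even here: the block's bytes are no longer an ordinary byte[] tracked by the collector, but reading them requires an explicit copy back onto the heap.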

 


For more on LruBlockCache and BucketCache, see the blog posts below.
LruBlockCache is the default and lives in heap memory (on-heap).
BucketCache lives outside the heap (off-heap).
https://www.cnblogs.com/zackstang/p/10061379.html
https://www.cnblogs.com/panfeng412/archive/2012/09/24/hbase-block-cache-mechanism.html
 


Origin blog.csdn.net/qq_31780525/article/details/100581239