hbase-memstore flush -1 overview

As the title says, HBase keeps in-memory buffers called 'memstores' that are filled on writes. This makes writes asynchronous (if we ignore the WAL, of course) and makes reads of recent data fast. Both the memstore and the block cache rely on a neat 'double buffer' trick.



[Figure: HBase component structure]



 

Of course, like some other components such as the WAL, the memstore offers two ways to flush: manually, or through periodic checks.
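The two flush styles can be illustrated with a small Java sketch. This is a toy model, not HBase code: the class `FlushPolicySketch`, its method names, and the injected clock are all hypothetical, chosen only to show a manual flush and a periodic check sharing the same flush path.

```java
import java.util.function.LongSupplier;

/** Toy model of the two flush styles; none of these names come from HBase. */
class FlushPolicySketch {
    private final long intervalMillis;   // assumed period between automatic flushes
    private final LongSupplier clock;    // injected so the sketch is deterministic
    private long lastFlushMillis;
    private int flushCount;

    FlushPolicySketch(long intervalMillis, LongSupplier clock) {
        this.intervalMillis = intervalMillis;
        this.clock = clock;
        this.lastFlushMillis = clock.getAsLong();
    }

    /** Manual style: the caller asks for a flush right now. */
    void flushNow() {
        flushCount++;
        lastFlushMillis = clock.getAsLong();
    }

    /** Periodic style: a background task calls this and flushes only when due. */
    boolean periodicCheck() {
        if (clock.getAsLong() - lastFlushMillis >= intervalMillis) {
            flushNow();
            return true;
        }
        return false;
    }

    int flushCount() {
        return flushCount;
    }

    public static void main(String[] args) {
        long[] now = {0L};
        FlushPolicySketch policy = new FlushPolicySketch(1000L, () -> now[0]);
        now[0] = 500L;
        System.out.println("due at 500ms?  " + policy.periodicCheck());
        now[0] = 1000L;
        System.out.println("due at 1000ms? " + policy.periodicCheck());
        policy.flushNow(); // a manual flush also resets the timer
        System.out.println("flushes so far: " + policy.flushCount());
    }
}
```

In a real server the periodic check would run on a background thread; here it is called explicitly so the behaviour is easy to follow.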

how to 

Memstore flushing is a bit complex, because it has to stay consistent with other components such as the WAL and MVCC. It is therefore very important to complete the operation as quickly as possible, to avoid blocking writes.

Here I will only consider the steps that concern the memstore, leaving out the details of the WAL and MVCC:

1. Take a snapshot of each memstore.

2. Flush all underlying mutations (data, meta, index, trailer, etc.) to an HFile.

3. Bring the newly flushed HFile online and clear the snapshot (i.e. switch the snapshot for the HFile).

4. Append a 'COMPLETE_CACHE_FLUSH' marker to the WAL, meaning that if a failure occurs later, replay of the HLog only has to go back as far as this point.

5. Notify the threads that are waiting on this region so they can continue with their mutations.

The 'snapshot' is what lets reads be served continuously, without interruption, while a flush is in progress.
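To make steps 1 to 3 concrete, here is a minimal Java sketch of the memstore side of a flush. It is a toy model and not HBase's actual code: the class `MemStoreFlushSketch`, its methods, and the use of an in-memory sorted map as a stand-in for an HFile are all assumptions made for illustration. The WAL marker (step 4) and the thread notification (step 5) are left out.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Toy double-buffered memstore; names and structure are illustrative only. */
class MemStoreFlushSketch {
    // Active buffer: receives all new writes.
    private volatile ConcurrentSkipListMap<String, String> active = new ConcurrentSkipListMap<>();
    // Snapshot buffer: frozen data being flushed, still visible to reads.
    private volatile ConcurrentSkipListMap<String, String> snapshot = new ConcurrentSkipListMap<>();
    // Stand-in for flushed HFiles, oldest first.
    private final List<Map<String, String>> files = new CopyOnWriteArrayList<>();
    private final Object flushLock = new Object();

    synchronized void put(String key, String value) {
        active.put(key, value);
    }

    /** Reads check the active buffer, then the snapshot, then files (newest first). */
    String get(String key) {
        String value = active.get(key);
        if (value == null) {
            value = snapshot.get(key);
        }
        for (int i = files.size() - 1; value == null && i >= 0; i--) {
            value = files.get(i).get(key);
        }
        return value;
    }

    void flush() {
        synchronized (flushLock) {          // only one flush at a time
            synchronized (this) {           // step 1: writers pause only for this swap
                snapshot = active;
                active = new ConcurrentSkipListMap<>();
            }
            if (snapshot.isEmpty()) {
                return;                     // nothing to flush
            }
            // Step 2: write the snapshot out (here: copy it into an immutable "file").
            Map<String, String> file = new TreeMap<>(snapshot);
            // Step 3: make the file visible first, then drop the snapshot,
            // so a reader always finds the data in at least one place.
            files.add(file);
            snapshot = new ConcurrentSkipListMap<>();
        }
    }

    int fileCount() {
        return files.size();
    }

    public static void main(String[] args) {
        MemStoreFlushSketch store = new MemStoreFlushSketch();
        store.put("row1", "v1");
        store.flush();
        store.put("row2", "v2");
        System.out.println(store.get("row1") + " " + store.get("row2")
                + " files=" + store.fileCount());
    }
}
```

The point of the design is that writers are held up only during the buffer swap in step 1, while the slower write-out in step 2 runs against the frozen snapshot.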

trigger conditions

no   case                                                 meaning
1    memstore size > hbase.hregion.memstore.flush.size    the total memstore size belonging to one region is bigger than flush.size
2    over global memstore lower water                     TODO
3    too many hlogs                                       TODO
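Case 1 amounts to a simple size comparison, sketched below in Java. The class and method names are hypothetical. The 128 MB constant is the commonly documented default for hbase.hregion.memstore.flush.size and is an assumption here, so check the value for your HBase version. Cases 2 and 3 are not covered.

```java
import java.util.List;

/** Toy check for trigger case 1; not HBase's real implementation. */
class FlushTriggerSketch {
    // Assumed default of hbase.hregion.memstore.flush.size (128 MB); verify for your version.
    static final long ASSUMED_FLUSH_SIZE_BYTES = 128L * 1024 * 1024;

    /**
     * Returns true when the memstores of one region, summed together,
     * are strictly bigger than the configured flush size.
     */
    static boolean shouldFlushRegion(List<Long> memstoreSizesBytes, long flushSizeBytes) {
        long total = 0L;
        for (long size : memstoreSizesBytes) {
            total += size;
        }
        return total > flushSizeBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        // Two memstores of 60 MB each: 120 MB in total, below the threshold.
        System.out.println(shouldFlushRegion(List.of(60 * mb, 60 * mb), ASSUMED_FLUSH_SIZE_BYTES));
        // 100 MB + 40 MB = 140 MB in total, above the threshold.
        System.out.println(shouldFlushRegion(List.of(100 * mb, 40 * mb), ASSUMED_FLUSH_SIZE_BYTES));
    }
}
```

Note that the comparison is made on the region's total, so a region with several memstores can trigger a flush even when no single memstore has reached the threshold on its own.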
       

After a memstore flush, I noticed that the space used changes in the following ratio:

  memstore : uncompressed file : compressed file = 4:2:1

for my page table.

TODO: the ratio of the first pair, memstore to uncompressed file, seems a bit unusual to me.

ref:

hbase-hfile format

hbase-hlog sync flow

hbase-mvcc principle

hbase guide


Reposted from leibnitz.iteye.com/blog/2100577