HBase region balancer

why

  as time goes on, poor table design, crashed nodes and similar problems leave some regionservers holding too many regions while others hold too few. a balancer is therefore needed to even out the load (region count) per regionserver, so that cluster resources are not used unevenly.

how to 

  hbase has a balancer that can be run manually or by a periodic checker. for the latter, the period is set by the property below:

  <property>
    <name>hbase.balancer.period</name>
    <value>300000</value>
    <description>Period at which the region balancer runs in the Master.
    </description>
  </property>

looking at the balance flow in HMaster, you will see roughly the following process:

1. generate region plans, that is, which regions to move and where to move them.

2. the master notifies the regionserver, which closes the region.

3. the master assigns the closed region(s) through the normal assignment procedure.

  1. generate region plans in master

  there is a slop setting, "hbase.regions.slop", that controls how precise the balance has to be. the smaller this value, the more precise the result.

  the slop-based ceiling and floor are computed as average load * (1 +/- slop).

  in the end, though, the calculation that decides which regionservers are underloaded or overloaded is crude (integer division in both cases):

    floor = region count / number of regionservers, and

    ceiling = (region count - 1 + number of servers) / number of servers
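  the two sets of bounds above can be sketched in a few lines of Java. this is only a minimal illustration, not the HBase source: the class and method names (BalanceBounds, slopFloor, slopCeiling, withinSlop) are made up here, it uses double where HBase may use float, and the idea that balancing is skipped when every load already lies inside the slop bounds is an assumption about how the two calculations fit together.

```java
/**
 * Hypothetical sketch of the bound calculations described above.
 * All names are illustrative and are not taken from HBase.
 */
public class BalanceBounds {

    /** Slop-based floor: floor(average load * (1 - slop)). */
    static int slopFloor(int regions, int servers, double slop) {
        double average = (double) regions / servers;
        return (int) Math.floor(average * (1 - slop));
    }

    /** Slop-based ceiling: ceil(average load * (1 + slop)). */
    static int slopCeiling(int regions, int servers, double slop) {
        double average = (double) regions / servers;
        return (int) Math.ceil(average * (1 + slop));
    }

    /** Crude floor: region count / number of servers (integer division). */
    static int min(int regions, int servers) {
        return regions / servers;
    }

    /** Crude ceiling: (region count - 1 + servers) / servers (integer division). */
    static int max(int regions, int servers) {
        return (regions - 1 + servers) / servers;
    }

    /**
     * Assumption: no balancing is needed when every server's load is
     * already inside the slop bounds. Expects a non-empty array.
     */
    static boolean withinSlop(int[] loads, double slop) {
        int total = 0;
        int lowest = Integer.MAX_VALUE;
        int highest = Integer.MIN_VALUE;
        for (int load : loads) {
            total += load;
            lowest = Math.min(lowest, load);
            highest = Math.max(highest, load);
        }
        return lowest >= slopFloor(total, loads.length, slop)
            && highest <= slopCeiling(total, loads.length, slop);
    }

    public static void main(String[] args) {
        int[] loads = {0, 2, 3, 5}; // region counts per regionserver
        System.out.println("within slop: " + withinSlop(loads, 0.3));
        System.out.println("min=" + min(10, 4) + " max=" + max(10, 4));
    }
}
```

  with 10 regions on 4 regionservers the crude bounds are min = 2 and max = 3, so the crude calculation treats a server with 4 or more regions as overloaded and a server with fewer than 2 as underloaded.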

  2. close the region by regionserver

  once all region plans are ready, a loop iterates over the plans and sends a close request for each one to the regionserver. after the regionserver has closed the region and updated the zk state, the master receives a zk event in AssignmentManager#nodeDataChanged(), which triggers the normal region assignment process.

  3. assign the closed region to a new regionserver

  after step 2 finishes, how does the master know which regionserver to assign the region to?

  in step 1, before sending the close request to the regionserver, the master saved the region plan (which contains the source and destination servers). so when the zk event arrives, the plan is still held in the master's memory and supplies the destination.
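  the way a remembered plan answers the "which regionserver?" question can be sketched as below. this is a simplified illustration under stated assumptions, not the AssignmentManager code: RegionPlanSketch, balance, sendClose and onRegionClosed are hypothetical names, the close request is just a print statement, and the fallback server used when no plan exists is an assumption added to keep the example complete.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: the master keeps region plans in memory. */
public class RegionPlanSketch {

    /** A plan records which region moves from which server to which. */
    static final class RegionPlan {
        final String region;
        final String source;
        final String destination;

        RegionPlan(String region, String source, String destination) {
            this.region = region;
            this.source = source;
            this.destination = destination;
        }
    }

    // Plans are kept by region name until the region has been closed.
    private final Map<String, RegionPlan> plans = new ConcurrentHashMap<>();

    /** Remember the plan first, then ask the source server to close. */
    void balance(RegionPlan plan) {
        plans.put(plan.region, plan);
        sendClose(plan.source, plan.region);
    }

    /** Placeholder for the real close request sent to a regionserver. */
    void sendClose(String server, String region) {
        System.out.println("close " + region + " on " + server);
    }

    /**
     * Called when the "region closed" event arrives. Returns the server
     * the region should be assigned to: the planned destination if a
     * plan was stored, otherwise the (assumed) fallback server.
     */
    String onRegionClosed(String region, String fallbackServer) {
        RegionPlan plan = plans.remove(region);
        return plan != null ? plan.destination : fallbackServer;
    }

    public static void main(String[] args) {
        RegionPlanSketch master = new RegionPlanSketch();
        master.balance(new RegionPlan("region-1", "rs-a", "rs-b"));
        System.out.println("assign to " + master.onRegionClosed("region-1", "rs-c"));
    }
}
```

  the plan is removed once it has been used, so a later close event for the same region does not reuse a stale destination.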

 FAQs:

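regionserver load is unbalanced. a colleague has already made some improvements so that regions of the same table are assigned to different regionservers as far as possible.

--lower the slop factor to 0.1 or less, or adjust the crude min and max calculation.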

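even distribution of hot regions. consider using the number of requests each region has served recently as the basis for balancing, so that the regions on each regionserver serve roughly the same number of requests.

--i think this is a design fault. in general, an algorithm that distributes load evenly will not produce this case.

  on the other hand, uneven region assignment will cause inconsistent resource usage.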

 

? ignore the zk event and let the regionserver notify the master directly to assign

--is this an improved feature in 0.96+?

? pin some regions to a regionserver, that is, prevent new regions from being assigned to specified regionservers

--see [3]

ref:

[1] 5. hbase advanced topics: compact/split/balance and other maintenance principles--the flow of creating table

[2] Improving HBase's balance strategy

[3] Support to drain RS nodes through ZK: suppress assignment of new regions

Reposted from leibnitz.iteye.com/blog/2110395