HBase troubleshooting ideas

HBCK - What does HBCK check?

(1) HBase Region consistency

  • Every region in the cluster is assigned, and each one is deployed on exactly one RegionServer

  • A region's state must be consistent in three places: in memory, in the hbase:meta table, and in ZooKeeper

(2) HBase table integrity

  • For any table in the cluster, every rowkey falls into exactly one region's key range
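
The table-integrity rule can be illustrated with a small sketch. It assumes a hypothetical text file with one region per line in the form START_KEY END_KEY, sorted by start key, and it compares keys as plain strings, which only approximates HBase's byte-wise key ordering. The function name check_region_chain and the file format are illustrative and are not part of hbck.

```shell
# check_region_chain FILE
# FILE (hypothetical format): one region per line, "START_KEY END_KEY",
# sorted by start key. The empty first/last keys are left out for simplicity.
check_region_chain() {
    awk '
        NR > 1 {
            # Appending "" forces a string comparison instead of a numeric one.
            if (($1 "") > (prev_end "")) {
                print "HOLE between " prev_end " and " $1
            } else if (($1 "") < (prev_end "")) {
                print "OVERLAP at " $1 " (previous range ends at " prev_end ")"
            }
        }
        # Remember the furthest end key seen so far.
        NR == 1 || ($2 "") > (prev_end "") { prev_end = $2 }
    ' "$1"
}
```

A file whose ranges chain together without gaps produces no output; any printed line marks a place where a rowkey would belong to no region or to more than one.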

 

HBCK - Common inspection order

  • ./bin/hbase hbck

  • ./bin/hbase hbck -details

  • ./bin/hbase hbck TableFoo TableBar
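
The inspection order above could be wrapped in a small helper, sketched here. HBASE_CMD and hbck_inspect are names invented for this example; the default path ./bin/hbase is taken from the commands listed above and may differ on your installation.

```shell
# HBASE_CMD is an assumption: the path to the hbase launcher script.
HBASE_CMD="${HBASE_CMD:-./bin/hbase}"

# hbck_inspect [TABLE...]
# Runs the read-only checks from the broadest to the most specific.
hbck_inspect() {
    "$HBASE_CMD" hbck                 # 1. cluster-wide summary
    "$HBASE_CMD" hbck -details        # 2. detailed report
    if [ "$#" -gt 0 ]; then
        "$HBASE_CMD" hbck "$@"        # 3. only the named tables
    fi
}
```

For example, hbck_inspect TableFoo TableBar runs all three checks in order, while hbck_inspect without arguments stops after the detailed report. Setting HBASE_CMD=echo gives a dry run that only prints the arguments.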

 

HBCK - Low-risk local repairs

  • -fixAssignments: repairs regions that are unassigned, incorrectly assigned, or assigned to multiple RegionServers at the same time.

  • -fixMeta: mainly repairs inconsistencies between the .regioninfo files and the hbase:meta table. The repair treats the files on HDFS as the source of truth: if a region exists on HDFS but not in the hbase:meta table, a record is added to hbase:meta. Conversely, if a region does not exist on HDFS but has a record in hbase:meta, that record is deleted from hbase:meta.
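
The "HDFS prevails" rule of -fixMeta can be modelled as a comparison of two lists. The sketch below only illustrates the rule and is not how hbck is implemented; plan_meta_fix and its two input files (one region name per line) are hypothetical.

```shell
# plan_meta_fix HDFS_LIST META_LIST
# Both arguments are hypothetical text files with one region name per line.
plan_meta_fix() {
    awk '
        FILENAME == ARGV[1] { on_hdfs[$0] = 1; next }
        { in_meta[$0] = 1 }
        END {
            # HDFS wins: regions missing from hbase:meta would be added ...
            for (r in on_hdfs) if (!(r in in_meta)) print "ADD " r
            # ... and meta records without a region on HDFS would be deleted.
            for (r in in_meta) if (!(r in on_hdfs)) print "DELETE " r
        }
    ' "$1" "$2" | sort
}
```

Each output line names one change that the rule implies for hbase:meta; regions present in both lists produce no output.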

 

HBCK - High-risk repairs

  • Fixing problems such as overlapping region ranges is a high-risk repair operation: such repairs usually modify files on HDFS and sometimes require manual intervention.

  • Before any high-risk repair, run hbck -details first to understand the problem in detail, and only then run the appropriate repair command

  • Using the -repair or -fix options in production is strongly discouraged
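
One way to enforce the last point is a thin wrapper that refuses the catch-all options, sketched below. safe_hbck and HBASE_CMD are names made up for this example, and the exit code 2 is an arbitrary choice.

```shell
# HBASE_CMD is an assumption: the path to the hbase launcher script.
HBASE_CMD="${HBASE_CMD:-./bin/hbase}"

# safe_hbck [HBCK_OPTIONS...]
# Passes its arguments to hbck unless they contain -repair or -fix.
safe_hbck() {
    for arg in "$@"; do
        case "$arg" in
            -repair|-fix)
                echo "refusing $arg: run hbck -details first, then pick a targeted option" >&2
                return 2
                ;;
        esac
    done
    "$HBASE_CMD" hbck "$@"
}
```

With this in place, safe_hbck -details behaves like the plain command, while safe_hbck -repair stops with an error before hbck is started.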

 

HBCK - Case 

 

Handling regions in transition (RIT)

  • Routine 1: a region stuck in the pending_open (or pending_close) state can usually be repaired with the hbck command

  • Routine 2: a region in the failed_open (or failed_close) state usually cannot be repaired with the hbck command

  • Routine 3: for a region in the failed_open (or failed_close) state, check the logs to confirm the specific reason the region could not be opened (or closed)

  • Routine 4: if a region is stuck in RIT but hbck reports everything as normal, delete that region's node under region-in-transition in ZooKeeper, then restart the Master
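
The four routines amount to a small decision table, which the following sketch writes down. rit_action is a hypothetical helper; its second argument ("clean" when hbck reports no problem) is an assumption of this example, not an hbck output format.

```shell
# rit_action STATE [HBCK_RESULT]
# Prints the suggested next step for a region stuck in the given RIT state.
# HBCK_RESULT is "clean" when hbck reports no problem (default: inconsistent).
rit_action() {
    state=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    hbck_result="${2:-inconsistent}"
    if [ "$hbck_result" = "clean" ]; then
        # Routine 4: hbck sees nothing wrong, yet the region stays in RIT.
        echo "delete the region's region-in-transition znode, then restart the Master"
        return 0
    fi
    case "$state" in
        pending_open|pending_close)
            echo "repair with hbck" ;;                     # Routine 1
        failed_open|failed_close)
            echo "check the logs for the specific cause" ;; # Routines 2 and 3
        *)
            echo "unknown state: $1" >&2
            return 1 ;;
    esac
}
```

For example, rit_action FAILED_OPEN points to the logs, while rit_action pending_open clean points to the ZooKeeper cleanup.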

 

HBase - Log analysis

  • Monitoring and its analysis can only tell you the possible, indirect causes

  • Log analysis tells you the exact, most direct cause of the problem.

       For most problems, the direct cause can be found in the logs, and the solution can then be worked out from that cause.

  • Log analysis lets you reconstruct the whole sequence of events; monitoring will not tell you that much
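
A first pass over the logs can be as simple as filtering for one region, as in this sketch. region_log_events is a hypothetical helper, and the keywords ERROR, WARN and Exception are an assumption about what the relevant log lines contain; adjust them to your log format.

```shell
# region_log_events REGION LOGFILE...
# Prints log lines that mention REGION and look like warnings or errors.
region_log_events() {
    region="$1"
    shift
    # -F treats the region name as a literal string, not as a pattern.
    cat "$@" | grep -F -- "$region" | grep -E 'ERROR|WARN|Exception'
}
```

Calling it with a region's encoded name and the Master and RegionServer log files gives a combined list of the suspicious lines for that region.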

Origin blog.csdn.net/u014156013/article/details/82628551