HBase Operations and Maintenance Basics: How Metadata Reverse Repair Works

Background

        Readers of the previous article, "Cloud HBase team successfully rescued a company's self-built HBase cluster and saved 30+T data", showed particular interest in HBase reverse engineering and asked how to use the corresponding tools for operations and maintenance. In general, they want a deeper understanding of HBase operational principles so that they can run HBase in production more reliably and handle the common failures. Because readers differ in how well they know HBase, this article does not focus on how to use any one tool. It explains the basic knowledge behind HBase operations instead. To help most readers improve their HBase operations skills, more articles will follow in an "HBase Operation and Maintenance Series". Follow the community official account for updates.

Introduction

       Many companies that run their own HBase clusters regularly hit operational problems. For example, after HBase has been taking writes for a while, RegionServer nodes start to go down. After a RegionServer is restarted, startup is very slow and many regions are stuck in transition (RIT), so any business request that reads or writes those regions hangs. In other cases, after repeated attempts to repair the cluster, HBase can no longer be started at all: the meta table reports errors when it comes online, and the business ultimately cannot run normally. This article starts from the basic principles of HBase operations. It focuses on data integrity and on the principles and methods of metadata "reverse engineering" used to restore that integrity, and it opens the follow-up series on HBase operations.

HBase directory structure

    This article covers HBase 1.x; other versions are broadly similar. HBase uses a dedicated directory on HDFS as its root directory, usually "/hbase". Under this root, the directories are organized as follows:

     /hbase/archive (1)

     /hbase/corrupt (2) 

     /hbase/data/default/TestTable/.tabledesc/.tableinfo.0000000001 (3)

     /hbase/data/default/TestTable/fc06f27a6c5bc2ff57ea38018b4dd399/info/2e58b3e274ba4d889408b05e526d4b7b (4)

     /hbase/data/default/TestTable/fc06f27a6c5bc2ff57ea38018b4dd399/recovered.edits/340.seqid (5)

     /hbase/data/default/TestTable/fc06f27a6c5bc2ff57ea38018b4dd399/.regioninfo (6)

     /hbase/data/default/TestTable/fc06f27a6c5bc2ff57ea38018b4dd399/.tmp (7)

     /hbase/data/default/TestTable/fc06f27a6c5bc2ff57ea38018b4dd399/.splits (8)

     /hbase/data/default/TestTable/fc06f27a6c5bc2ff57ea38018b4dd399/.merges (9)

     /hbase/data/hbase/acl (10)

     /hbase/data/hbase/meta (11)

     /hbase/hbase.id (12)

     /hbase/hbase.version (13)

     /hbase/MasterProcWALs (14)

     /hbase/oldWALs (15)

     /hbase/.tmp (16)

     /hbase/.trashtables/data (17)

     /hbase/WALs/tins-donot-rm-test-hb1-004.hbase.9b78df04-b.rds.aliyuncs.com,16020,1523502350378/tins-donot-rm-test-hb1-004.hbase.9b78df04-b.rds.aliyuncs.com%2C16020%2C1523502350378.default.1524538284034 (18)

 

     (1) The archive directory, used by snapshots and upgrades. When compaction removes hfiles, the old hfiles are also archived here.

     (2) The corrupt directory for splitlog, which also holds corrupt hfiles.

     (3) The tableinfo file, the metadata file that holds the table's basic attributes.

     (4) An hfile data file of the table.

     (5) When splitlog runs, the WAL of a RegionServer is split by region, and the resulting edits are written to the recovered.edits directory of each region. These recovered.edits logs are replayed when the region is opened again.

     (6) The regioninfo file.

     (7) Temporary directory used by compaction and similar operations.

     (8) Temporary directory used during a split. If the last split of the region was interrupted before it finished, this directory is cleaned up automatically when the region is opened again; manual intervention is generally not needed.

     (9) Temporary directory used during a merge. As with a split, if the merge was interrupted before it finished, the directory is cleaned up automatically the next time the region is opened; manual intervention is generally not needed.

     (10) acl: the system table that records permissions when HBase access control is enabled.

     (11) meta: the metadata table, which records region information.

     (12) hbase.id: the unique ID of the cluster, created when the cluster is initialized. It can be regenerated.

     (13) hbase.version: the HBase version file. The value is fixed in the code and is currently 8.

     (14) Stores the state of the procedures executed by the master, so that execution can resume after an interruption.

     (15) oldWALs: historical WALs. Once the data recorded in a WAL has been confirmed as persisted, the WAL is moved here. Logs that splitlog has finished processing are also placed here.

     (16) .tmp: temporary helper directory. For example, the hbase.id file is first written here and, once the write succeeds, renamed to /hbase/hbase.id.

     (17) /hbase/.trashtables/data: when a table is truncated or deleted, its data is placed here temporarily and is cleared within 1 hour by default.

     (18) The WAL files of one RegionServer. The directory name contains the RegionServer's start time, so the next time the RegionServer starts it stores its WALs in a new directory, and the old WAL directory is split and replayed by the splitlog process.
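
     The layout above is regular enough to be read mechanically. The following is a minimal Python sketch, not part of HBase, that pulls the table, region, and family out of an hfile path such as (4) and splits a WAL directory name such as (18) into host, port, and start time. The names HFileLocation, parse_hfile_path, and parse_wal_dir are hypothetical helpers introduced for illustration, and the sketch assumes the 1.x layout listed above with the root at /hbase.

```python
from typing import NamedTuple, Optional, Tuple


class HFileLocation(NamedTuple):
    namespace: str
    table: str
    region: str   # encoded region name, i.e. the region directory name
    family: str
    hfile: str


def parse_hfile_path(path: str, root: str = "/hbase") -> Optional[HFileLocation]:
    """Return the location encoded in an hfile path, or None for any other path."""
    prefix = root.rstrip("/") + "/data/"
    if not path.startswith(prefix):
        return None
    parts = path[len(prefix):].split("/")
    # namespace/table/region/family/hfile is exactly five components
    if len(parts) != 5:
        return None
    namespace, table, region, family, hfile = parts
    # Dot-prefixed entries (.tabledesc, .tmp, .splits, .merges) and
    # recovered.edits are bookkeeping directories, not column families.
    if region.startswith(".") or family.startswith(".") or family == "recovered.edits":
        return None
    return HFileLocation(namespace, table, region, family, hfile)


def parse_wal_dir(name: str) -> Tuple[str, int, int]:
    """Split a WAL directory name 'host,port,startcode' into its three parts."""
    host, port, startcode = name.rsplit(",", 2)
    return host, int(port), int(startcode)
```

     For path (4) above, parse_hfile_path returns the table TestTable, the region fc06f27a6c5bc2ff57ea38018b4dd399, and the family info; for paths (3), (5), and (6) it returns None because they are not hfiles.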

 

Main HBase files and their uses

HDFS static files, HBase data integrity on HDFS

    1. hfile: the data file. The highest format version is currently 3, which is also the default. For details of the hfile structure, see the official documentation at http://hbase.apache.org/book.html#_hfile_format_2. Reverse generation of metadata mainly uses the firstkey and lastkey stored in the hfile's fileinfo.

    2. hfilelink: used by HBase snapshots, and also by migration and upgrade. Operational problems with these files are rare, so they are not covered further here.

    3. Reference file: points to one half of an hfile. split/merge creates these files, and a region that still holds references cannot be split. After compaction writes a new hfile, the reference is deleted. A reference file is generally named hfile.parentEncRegion. For example, if the hfile is /hbase/data/default/table/region-one/family/hfilename, then the reference to it in region-two is /hbase/data/default/table/region-two/family/hfilename.region-one. A reference usually becomes invalid because the hfile it points to in region-one no longer exists. The usual repair is to remove the invalid reference.

    4. ".regioninfo" file: stores the endkey, offline flag, regionid, regionName, split flag, startkey, tablename, and so on.

    5. tableinfo file: stores the tableName, table attributes, table-level config, and family information. The family information consists of the familyName, family attributes, family-level config, and so on.

        Typical table attributes are REGION_MEMSTORE_REPLICATION, PRIORITY, IS_ROOT_KEY, and so on; they are usually left at the defaults from the configuration. Typical family attributes are BLOCKSIZE, TTL, REPLICATION_SCOPE, and so on; these also usually take the defaults from the configuration.

    6. hbase:meta table data content format

        regionname, info:regioninfo, the encoded value of regioninfo

        regionname, info:seqnumDuringOpen, sequence number

        regionname, info:server, the name of the server where the region is located

        regionname, info:serverstartcode, timestamp of regionserver startup
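
    The naming rule for reference files in item 3 can be checked with a few lines of code. The following Python sketch works on a plain set of path strings rather than on HDFS, and flags references whose target hfile is missing. The function names are hypothetical. The sketch also assumes that an hfile name contains no dot of its own and that the target is looked up only under the data directory, which is a simplification.

```python
from typing import List, Optional, Set, Tuple


def split_reference_name(file_name: str) -> Optional[Tuple[str, str]]:
    """Split 'hfile.parentEncRegion' into (hfile, parent region); None for a plain hfile."""
    hfile, sep, parent_region = file_name.partition(".")
    if not sep or not hfile or not parent_region:
        return None
    return hfile, parent_region


def referenced_path(reference_path: str) -> Optional[str]:
    """Map <table>/<region>/<family>/<hfile>.<parent> to <table>/<parent>/<family>/<hfile>."""
    table_dir, _region, family, file_name = reference_path.rsplit("/", 3)
    parsed = split_reference_name(file_name)
    if parsed is None:
        return None
    hfile, parent_region = parsed
    return f"{table_dir}/{parent_region}/{family}/{hfile}"


def find_invalid_references(all_paths: Set[str]) -> List[str]:
    """Return every reference whose target hfile is not among the known paths."""
    invalid = []
    for path in sorted(all_paths):
        target = referenced_path(path)
        if target is not None and target not in all_paths:
            invalid.append(path)
    return invalid
```

    With the example from item 3, the reference in region-two is valid while region-one/family/hfilename exists, and it is reported as invalid once that hfile is gone.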

 

The principle of reverse generation of metadata

        Among the files described above, the main HBase metadata consists of the meta table, tableinfo, and regioninfo. Reverse generation of metadata means regenerating regioninfo, tableinfo, and the meta table from the hfile data files.

    1. Reverse generate tableinfo file

        Case 1: the tableinfo file is restored in full from the TableDescriptor cache in the master process's memory. The restored tableinfo is complete and identical to the original.

        Case 2: the table's tableinfo has not been loaded into the cache. The repair process can then only rebuild tableinfo from the family names found in the table's directory listing, so only the column family names are recoverable. In the restored tableinfo file, the table name and column family names match the original and every other attribute takes its default value. If the operations staff know that some attributes were customized, those attributes must be added back manually.

    2. Reversely generate the regioninfo file

        The firstkey/lastkey values are read from the fileinfo of each hfile and sorted to obtain the minimum and maximum rowkeys across all hfiles in the region. Together with the table name from tableinfo, these are used to restore the regioninfo file. Only startkey and endkey can really be recovered this way; other attributes such as the offline flag, regionName, split flag, and hashcode all take values generated by the code or defaults from the configuration.

    3. Reverse filling of meta table rows

        The regioninfo is serialized and written to the info:regioninfo column of the meta table, and a default server is written at the same time. When the region is opened again, it is assigned to an actual RegionServer and the row is updated.
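
        The key-range part of step 2 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code of the repair tool: region_key_range is a hypothetical helper, its input is the (first row, last row) pair already extracted from each hfile, and rows are compared as raw bytes in lexicographic order. Because a region's endkey is exclusive, the sketch appends a zero byte to the largest row so that the row itself still falls inside the region; treat that detail as an assumption of the sketch.

```python
from typing import Iterable, Tuple


def region_key_range(hfile_rows: Iterable[Tuple[bytes, bytes]]) -> Tuple[bytes, bytes]:
    """Derive (startkey, endkey) for a region from its hfiles' (first row, last row) pairs."""
    pairs = list(hfile_rows)
    if not pairs:
        raise ValueError("region has no hfiles, so its key range cannot be derived")
    smallest = min(first for first, _last in pairs)
    largest = max(last for _first, last in pairs)
    # The endkey is exclusive: it must sort immediately after the largest row.
    return smallest, largest + b"\x00"
```

        For two hfiles covering row100..row250 and row050..row180, the derived range starts at row050 and ends just after row250. A region without any hfile has nothing to derive a range from, which is why the sketch raises an error in that case.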

        Beyond repairing files and data content directly, reverse engineering also has to repair other aspects of data integrity. The regions of a table must together cover the whole rowkey range, from the infinitely small rowkey to the infinitely large one, and problems such as region holes and region overlaps can break this. For example:

        If there is a hole between regions, the boundaries of the hole are used as the startkey/endkey of a new region, and a region directory with a regioninfo file in it is created to fill the hole. If regions overlap, the overlapping regions are merged, and the smallest and largest rowkeys across all of them become the startkey and endkey of the merged region.
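
        The hole and overlap checks described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the logic of hbck itself: a region is modelled as a (startkey, endkey) pair of bytes, an empty key stands for the infinitely small startkey or the infinitely large endkey, and check_region_chain and merge_overlapping are hypothetical names.

```python
from typing import List, Tuple

KeyRange = Tuple[bytes, bytes]  # (startkey, endkey); b"" means unbounded


def _end_order(end: bytes) -> Tuple[int, bytes]:
    # An empty endkey means "infinitely large", so it must sort after every real key.
    return (1, b"") if end == b"" else (0, end)


def check_region_chain(regions: List[KeyRange]):
    """Return (holes, overlaps) found in the regions of one table."""
    holes: List[KeyRange] = []
    overlaps: List[Tuple[KeyRange, KeyRange]] = []
    ordered = sorted(regions, key=lambda r: (r[0], _end_order(r[1])))
    furthest = None  # the region whose endkey reaches furthest so far
    for region in ordered:
        start, end = region
        if furthest is None:
            if start != b"":
                holes.append((b"", start))      # gap before the first region
            furthest = region
            continue
        reach = furthest[1]
        if reach == b"" or start < reach:
            overlaps.append((furthest, region))  # starts inside covered range
        elif start > reach:
            holes.append((reach, start))         # gap between two regions
        if _end_order(end) > _end_order(reach):
            furthest = region
    if furthest is not None and furthest[1] != b"":
        holes.append((furthest[1], b""))         # gap after the last region
    return holes, overlaps


def merge_overlapping(a: KeyRange, b: KeyRange) -> KeyRange:
    """Boundaries of the region that replaces two overlapping regions."""
    start = min(a[0], b[0])                      # b"" sorts first, as required
    end = max(a[1], b[1], key=_end_order)
    return start, end
```

        Each reported hole is exactly the startkey/endkey pair that a new, empty region would need in order to fill it, and merge_overlapping gives the boundaries of the region that replaces two overlapping ones.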

 

Repairing metadata with tools

        Missing or incomplete metadata affects the operation of the system and can even make the cluster unavailable. The most common symptoms are the meta table failing to come online and regions failing to open. Two tools are introduced here. Tool one, hbase hbck, repairs metadata integrity online. Tool two, OfflineMetaRepair, rebuilds the hbase:meta table offline.

 

Online hbck fix:

    Prerequisite: use HDFS fsck to make sure that the files in the HBase root directory are not damaged or lost. If any are, remove the corrupt blocks first.

        Step 1. hbase hbck: check the ERROR messages in the output. Each ERROR explains what is wrong.

        Step 2. hbase hbck -fixTableOrphans: fix missing tableinfo files first. The tableinfo file is regenerated from the in-memory cache or from the table's directory structure on HDFS.

        Step 3. hbase hbck -fixHdfsOrphans: fix missing regioninfo files. The regioninfo file is regenerated from the hfiles in the region directory.

        Step 4. hbase hbck -fixHdfsOverlaps: fix region overlaps. The overlapping regions are merged into one region directory and a new regioninfo is generated.

        Step 5. hbase hbck -fixHdfsHoles: fix missing regions. A new region directory and regioninfo are generated from the boundaries of the missing rowkey range to fill the hole.

        Step 6. hbase hbck -fixMeta: repair the meta table. The corresponding meta rows are regenerated from the regioninfo files and written to the meta table, with a default RegionServer filled in as the assignment.

        Step 7. hbase hbck -fixAssignments: bring the offline regions online. When a region is opened again, it is assigned to a real RegionServer and its row in the meta table is updated.
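
        The order of the steps matters, because each step relies on the files regenerated by the previous one. As a small aid, the following Python sketch only builds the command lines in that order so they can be reviewed before anything is run; it does not execute them. HBCK_STEPS and build_hbck_plan are hypothetical names, and the optional table argument relies on hbck accepting table names after its options. The fix options shown belong to the hbck shipped with HBase 1.x, the version this article covers.

```python
import shlex
from typing import List, Tuple

# (purpose, hbck options) in the order of the steps above.
HBCK_STEPS: List[Tuple[str, List[str]]] = [
    ("report inconsistencies only", []),
    ("regenerate missing tableinfo files", ["-fixTableOrphans"]),
    ("regenerate missing regioninfo files", ["-fixHdfsOrphans"]),
    ("merge overlapping regions", ["-fixHdfsOverlaps"]),
    ("fill region holes", ["-fixHdfsHoles"]),
    ("rebuild rows in hbase:meta", ["-fixMeta"]),
    ("bring offline regions online", ["-fixAssignments"]),
]


def build_hbck_plan(table: str = "") -> List[str]:
    """Return one shell command per step, optionally limited to a single table."""
    plan = []
    for _purpose, options in HBCK_STEPS:
        argv = ["hbase", "hbck", *options]
        if table:
            argv.append(table)
        plan.append(shlex.join(argv))
    return plan
```

        Printing the plan and running one command at a time, with a plain hbase hbck check in between, keeps every repair step visible and makes it easier to stop as soon as the reported errors change unexpectedly.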

 

Offline OfflineMetaRepair rebuild:

    Prerequisite: use HDFS fsck to make sure that the files in the HBase root directory are not damaged or lost. If any are, remove the corrupt blocks first.

        Steps: Execute hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair -fix

        Both tools print fairly detailed usage information, and with the background above it should be easy to follow, so the tools are not described in more detail here. For details, see the official documentation or the tools' help output. As an aside, some open source components write their own information into the HBase metadata files by design, but the HBase repair tools have not been updated to handle it, and the components provide no repair tool of their own. If such files are damaged or lost, those components stop working properly. Users of such components should record not only the basic structure of their tables but also the tables' attribute configuration, and should check and confirm these again whenever a repair is performed.

 

Summary

       This article covered data integrity and the principles of reverse metadata repair, which are part of the basics of HBase operations, and introduced two tools for reverse repair of metadata together with the practical steps for running them. More articles will follow to explain further HBase operations basics and operating principles. I hope this helps with running and using HBase.

 

I have something to say

       If you run into technical problems at work or while studying, you can visit the community forum at http://hbase.group and post questions or leave messages there. To learn more about HBase technology, follow the HBase community public account (WeChat: hbasegroup). Contributions are also welcome.

 


