[HBase] Problem troubleshooting and tuning practice (continuously updated...)

Issue title: CTBase manager page cannot be opened, HBase is unavailable

Problem description: The hbase shell operation reports the error "HMaster is initializing":
ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
Problem location: Checking the HMaster log shows that the HMaster timed out waiting for the namespace table to be assigned during startup, causing the active and standby HMasters to keep failing over.

Timeout: 300000ms waiting for namespace table to be assigned
The above log indicates that the HMaster startup timed out, causing the active and standby HMasters to keep switching.

Solution:
Increase the namespace initialization timeout. The hbase.master.namespace.init.timeout configuration item does not exist in the C50SPC202 version and needs to be added manually.
Steps:
1. Open the /opt/huawei/Bigdata/om-0.1.1/etc/components/FusionInsight_V100R002C50SPC202/HBase/configuration.xml configuration file, find hbase-site.xml under the HMaster tab, and manually add the configuration item hbase.master.namespace.init.timeout with the value 36000000.
2. Restart the controller:
   sh /opt/huawei/Bigdata/om-0.1.1/sbin/restart-controller.sh
3. Synchronize the configuration in the Manager web UI.
4. Restart the HBase service.
5. HBase returns to normal.


Issue title: HBase reads are slow

Problem description: After the application was deployed to the production big data cluster, reading data from HBase is very slow: it takes about 18 seconds to return data, which causes other applications to time out and become unusable. The amount of data is currently very small, no more than 500 records.
The same application deployed in the development environment takes no more than 3 seconds to access HBase, which is basically within an acceptable range.


Solution: DNS was configured on the client. During interaction with the cluster, the client went to the DNS server to resolve the cluster IP addresses, which introduced a delay.
Back up the client DNS configuration file and then clear it:
# cp /etc/resolv.conf /etc/resolv.conf-bak
# echo "" > /etc/resolv.conf


Failed to submit Loader task

Neither the admin user nor the hbase_bk_user user can submit Loader jobs. The error message is as follows: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1495015772433_0204 to YARN : Failed to submit application application_1495015772433_0204 submitted by user hbase_bk_user reason: No groups found for user hbase_bk_user (caused by: Failed to submit application_1495015772433_0204 to YARN : Failed to submit application application_1495015772433_0204 submitted by user hbase_bk_user reason: No groups found for user hbase_bk_user

The /var directory of one of the nodes had reached 100% usage; after the space was cleared, the job could be submitted normally.

How to configure the HBase cache

1. When the client reads HBase data, the read speed can be improved through configuration and through how the code uses the API (see the sketch below). The client-side scanner cache (hbase.client.scanner.caching) controls how many rows are fetched and cached per scan request: when Scanner.next is called, data is fetched from this cache first, and only when the cached data is used up is a new scan request sent to the RegionServer. Increasing this value speeds up client reads and greatly reduces the number of requests to the RegionServer. hbase.client.scanner.max.result.size is the maximum size of the result the RegionServer returns for each client request; increasing it increases the amount of data obtained per request, which also reduces the number of requests to the RegionServer.
2. On the RegionServer side, the size of the block cache can be configured to improve query efficiency to a certain extent. The block cache size mainly affects query performance, and it should be chosen according to the query pattern and the distribution of the queried records.
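A minimal sketch of these client-side settings (the table name, column family and values are hypothetical); larger scanner caching and result size mean fewer round trips to the RegionServer:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class ScanCacheExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        // Upper bound on the bytes returned per scan RPC from the RegionServer (example value: 4 MB).
        conf.setLong("hbase.client.scanner.max.result.size", 4 * 1024 * 1024L);

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("USER_TABLE"))) {   // hypothetical table
            Scan scan = new Scan();
            scan.setCaching(500);          // rows fetched per RPC and kept in the client-side cache
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    // next() is served from the client cache until it is exhausted
                }
            }
        }
    }
}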


Client access to Phoenix error

zookeeper.ClientCnxn: Session establishment complete on server hdgycn02/99.12.166.131:24002, sessionid = 0x1e04533cfe1a6416, negotiated timeout = 90000
2017-05-08 15:32:26,334 ERROR [main-SendThread(hdgycn02:24002)] client.ZooKeeperSaslClient: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)]) occurred when evaluating Zookeeper Quorum Member's  received SASL token. This may be caused by Java's being unable to resolve the Zookeeper Quorum Member's hostname correctly. You may want to try to adding '-Dsun.net.spi.nameservice.provider.1=dns,sun' to your client's JVMFLAGS environment.Zookeeper Client will go to AUTH_FAILED state.

The error message shows that when the client connects to ZooKeeper, the ticket has expired and the connection fails. Adjust the connection method in the client configuration file /opt/hadoop_client/Spark/adapter/client/controller/jaas.conf: by default the ticket cache is used, and it is recommended to switch to using a keytab to connect to ZooKeeper:

Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
keyTab="/opt/keytabFile/hbase.keytab"
principal="hbase/[email protected]"
useTicketCache=false
storeKey=true
debug=true;
};

Recommendations for HBase region pre-splitting on the cluster

By default only one region is created when an HBase table is created, and during data import all HBase clients write to this single region until it grows large enough to split. One way to speed up batch writes is to create some empty regions in advance, so that when data is written to HBase the load is balanced across the cluster according to the region boundaries.
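A minimal sketch of pre-splitting at table creation time (the table name, column family and split points are hypothetical); the empty regions exist before any data arrives, so writes are spread across RegionServers from the start:

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitExample {
    public static void main(String[] args) throws IOException {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("PRE_SPLIT_TABLE"));
            desc.addFamily(new HColumnDescriptor("cf"));
            // Empty regions are created with these boundaries before any data is written.
            byte[][] splitKeys = {
                Bytes.toBytes("2"), Bytes.toBytes("4"),
                Bytes.toBytes("6"), Bytes.toBytes("8")
            };
            admin.createTable(desc, splitKeys);
        }
    }
}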


Phoenix cannot connect to HBase

Description of the problem handling steps:
1. Phoenix in the data center cluster cannot access HBase tables, and the SYSTEM.CATALOG table cannot be accessed.
2. Executing is_enabled 'SYSTEM.CATALOG' returns false; executing is_disabled 'SYSTEM.CATALOG' also returns false.
3. After restarting the HBase service, is_enabled 'SYSTEM.CATALOG' returns true and is_disabled 'SYSTEM.CATALOG' returns false.
   Checking the HBase native page shows that one region of the SYSTEM.CATALOG table is offline.
4. Execute hdfs fsck hdfs://hacluster/hbase/WALs/hdsjzxdata3g03u08p,21302,1498474722136-splitting/hdsjzxdata3g03u08p%2C21302%2C1498474722136.default.1500233042596
   It returns: Fsck on path '/hbase/WALs/hdsjzxdata3g03u08p,21302,1498474722136-splitting/hdsjzxdata3g03u08p%2C21302%2C1498474722136.default.1500233042596' FAILED

1. Execute hdfs fsck -openforwrite path -files -blocks -locations, then move (mv) the file 'hdsjzxdata3g03u08p%2C21302%2C1498474722136.default.1500233042596' to another directory.
2. Switch the active and standby HMasters.
3. Everything returns to normal.

HBase table cannot be exported

The data export script for a large HBase table fails to execute, while the export scripts for other small HBase tables succeed. The script uses the Export API to export the table.

Checking the cluster shows that one RegionServer is down and cannot be pinged. It is suspected that the export script failed because some regions of the exported table were assigned to the downed server and had not yet been migrated after the outage. Checking the region assignment and state of the exported table confirms that every region is now healthy, which proves the region migration has completed; re-executing the previously failed script succeeds.


The alarm HDFS is unavailable, and viewing the log shows that HDFS cannot be written

In one sentence: the HDFS startup fails because the HDFS reserved space is exhausted by concurrent writes from the HBase/CTBase side.
Technical description: if a DataNode DN1 has n concurrent write threads, those n writers require reserved disk space of BlockSize * n. The on-site DataNode nodes have little disk space, with about 50 GB free, which with the default 128 MB block size supports at most roughly 50000/128 ≈ 390 concurrent writers.
 
1. The NameNode log contains a large number of errors reporting that files cannot be written because there is no space, yet the DataNodes still show about 50 GB of actual free space:
   2017-07-19 10:51:31,135 | WARN | Thread-7 | DataStreamer Exception | DataStreamer.java:694 org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hdfsServiceCheck-99-8-58-32-hm3._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and no node(s) are excluded in this operation.
2. Checking 20 minutes of NameNode logs shows more than 30,000 block allocations, which indicates highly concurrent writes initiated by hbase/ctbase. These exhaust the DataNode reserved space so that no new space can be allocated, triggering the following error:
   2017-07-19 09:01:16,892 | INFO | IPC Server handler 16 on 25000 | BLOCK* allocate blk_1081389716_1099521341712{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-42523040-cac6-4e51-bd03-ac0add9dde12:NORMAL:99.8.58.35:25009|RBW]]} for /hbase/data/default/RT_LA101_BASE/e90229296d66e1fe2132c2417de68eb1/recovered.edits/0000000000000000097.temp |
   The default block size is 128 MB, which makes the reserved space insufficient. The live network was instructed to adjust the block size parameter of the HDFS client (that is, dfs.blocksize of hbase/ctbase) to 16 MB and restart, which avoids the "reserved space full" problem; HDFS then started successfully.
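For illustration only, a hedged sketch of the client-side block size override described above; on site the value was changed in the HBase/CTBase client configuration files rather than in code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class BlockSizeExample {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // Each open writer reserves one full block on the DataNode, so a smaller block size
        // lowers the reserved space per writer: 16 MB instead of the default 128 MB.
        conf.setLong("dfs.blocksize", 16L * 1024 * 1024);
        System.out.println("dfs.blocksize = " + conf.getLong("dfs.blocksize", 0));
    }
}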
 

The amount of data in the cluster is too large

During the mid-year in-depth big data inspection, it was learned that the HBase data volume had reached 20.2 TB, the HDFS usage rate had reached 83%, and the daily growth rate was 2% (10% of the space is reserved and not available for storing actual user data).
It is understood that the channel big data cluster only needs to retain most of the HBase data for 3 months, and the TTL attribute has already been set to 3 months; that is, data whose timestamp is older than 3 months is expired data. The largest table with TTL set has reached 6.5 TB, while the actual business data volume is expected to be 3-4 TB.
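A hedged sketch of how a 3-month TTL like the one mentioned above might be set on a column family (the table and family names are hypothetical); cells older than the TTL are physically dropped at the next major compaction:

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class TtlExample {
    public static void main(String[] args) throws IOException {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            TableName table = TableName.valueOf("BIG_TABLE");     // hypothetical table
            HColumnDescriptor cf = new HColumnDescriptor("cf");
            cf.setTimeToLive(90 * 24 * 3600);                     // about 3 months, in seconds
            admin.modifyColumn(table, cf);                        // apply to the existing family
        }
    }
}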

Adjust the GC_OPTS parameter of the RegionServer, and adjust:
a) hbase.hstore.compaction.max.size from 2.5 GB to unlimited
b) hbase.regionserver.thread.compaction.large by 2x, from 5 to 10
c) hbase.hstore.compaction.kv.max from 10 to 100
During off-peak business hours, manually execute a major compaction:
a) Log in to the hbase shell as the admin user and execute major_compact 'table name' (see the sketch below)
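The shell step above could equally be done through the Admin API; a minimal sketch, assuming a hypothetical table name:

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class MajorCompactExample {
    public static void main(String[] args) throws IOException {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            // Expired (TTL-exceeded) cells are physically removed during major compaction.
            admin.majorCompact(TableName.valueOf("BIG_TABLE"));
        }
    }
}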


Client cannot access hbase table

The status of one HMaster instance on the FusionInsight Manager page is unknown, and the other HMaster instance is in standby state. Executing list in the hbase shell does not return the list of tables, and the business layer cannot read the Phoenix tables.

Apart from the HMaster exception/abort, there are no abnormal alarms for the other underlying dependent services. Execute deleteall /hbase/region-in-transition on the ZooKeeper client and modify hbase.master.initializationmonitor.haltontimeout to 5 times its original value (150000000 after the change). Also modify zookeeper.session.timeout to 2 times its original value (90000 after the change) and restart HBase. After the restart, the HBase service partly returns to normal: the hbase shell can write data and flush, and Phoenix can read data normally.
However, 421 regions stay in the RIT state for a long time. hbase hbck reports a large number of inconsistencies, and running it with -fixMeta -fixAssignments -fixReferenceFiles -fixHdfsHoles does not reduce the number of RIT regions.
Checking the records of the corresponding regions in the active HMaster log shows entries such as "Couldn't reach online server hdsjzxdata3g01u14p,21302,15000999163339" and "Skip assigning RS6000_CW:biz_sys_othernew_mi".


blu occasionally fails to query hbase

The exception stack shows that the ZooKeeper client authentication failed because the ZooKeeper ticket failed to re-login. Adjust the relevant value in krb5.conf in the application to no less than 10 to increase the authentication success rate.


The number of HBase large tables seen by the admin user in the hbase shell is inconsistent with the number shown by the HMaster; exporting data from the large table DC_BASE reports a permission error; executing hdfs dfs -ls /hbase as the admin user also returns a permission error

1. Modify the parameters.
The RegionServer GC_OPTS needs to be adjusted (the following is not a direct replacement, but a modification of the first part of the existing value):
 -server -Xms20G -Xmx20G -XX:NewSize=2G -XX:MaxNewSize=2G -XX:MetaspaceSize=128M -XX:MaxMetaspaceSize=512M -XX:MaxDirectMemorySize=4G
2. Restore id -Gn on some nodes and restart sssd:
service sssd restart
id -Gn admin


Some regions of hbase are unavailable

After a RegionServer failed and restarted, its regions are displayed as offline and client requests to the relevant regions get no response. The HMaster and RegionServer GC_OPTS were adjusted as follows.

HMaster old:
-server -Xms512M -Xmx1G -XX:NewSize=64M -XX:MaxNewSize=128M -XX:PermSize=128M -XX:MaxPermSize=128M -XX:CMSFullGCsBeforeCompaction=1 -XX:MaxDirectMemorySize=128M -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=65 -Xloggc:/var/log/Bigdata/hbase/hm/master-omm-gc.log -XX:+PrintGCDetails -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFE -Dsun.rmi.dgc.server.gcInterval=0x7FFFFFFFFFFFFFE -XX:-OmitStackTraceInFastThrow -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps
HMaster new:
-server -Xms4G -Xmx4G -XX:NewSize=256M -XX:MaxNewSize=256M -XX:PermSize=128M -XX:MaxPermSize=128M -XX:CMSFullGCsBeforeCompaction=1 -XX:MaxDirectMemorySize=2G -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=65 -Xloggc:/var/log/Bigdata/hbase/hm/master-omm-gc.log -XX:+PrintGCDetails -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFE -Dsun.rmi.dgc.server.gcInterval=0x7FFFFFFFFFFFFFE -XX:-OmitStackTraceInFastThrow -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps
RegionServer old:
-server -Xms4G -Xmx6G -XX:NewSize=256M -XX:MaxNewSize=512M -XX:PermSize=128M -XX:MaxPermSize=128M -XX:CMSFullGCsBeforeCompaction=1 -XX:MaxDirectMemorySize=512M -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=65 -Xloggc:/var/log/Bigdata/hbase/rs/regionserver-omm-gc.log -XX:+PrintGCDetails -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFE -Dsun.rmi.dgc.server.gcInterval=0x7FFFFFFFFFFFFFE -XX:-OmitStackTraceInFastThrow -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps
RegionServer new:
-server -Xms20G -Xmx20G -XX:NewSize=2G -XX:MaxNewSize=2G -XX:PermSize=128M -XX:MaxPermSize=128M -XX:CMSFullGCsBeforeCompaction=1 -XX:MaxDirectMemorySize=4G -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=65 -Xloggc:/var/log/Bigdata/hbase/rs/regionserver-omm-gc.log -XX:+PrintGCDetails -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFE -Dsun.rmi.dgc.server.gcInterval=0x7FFFFFFFFFFFFFE -XX:-OmitStackTraceInFastThrow -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps


Failed to import HBase data in batches

Checking the flush queue and the compaction queue shows that the queue lengths are more than 100. This problem is an open-source bug and is resolved after restarting the HBase service.


Inconsistent data between HBase and Phoenix

The data in HBase tables associated with Phoenix is inconsistent. Among them, cust_tag and cust_his were both created with create table and their statistical results differ, while cust_asset was created with create view and its count is consistent. For cust_tag, the count in HBase is more than 1.4 million records, while the count in Phoenix is only a little over 60,000.
For cust_his, nearly 10 million rows are inserted every day, but querying by the "date" column in Phoenix only finds records before 8.4 and none after 8.4.


Spark failed to import data into HBase

Checking the application log, the error is that a region cannot be found on a certain RegionServer. Checking that RegionServer on the native page, the region indeed cannot be seen, but it can be found by looking at the meta table. The HMaster native page later showed the region in the RIT state, and when hbck was executed about 10% of the regions were inconsistent. Checking the Manager alarms, the logs show that a disk on the RegionServer hosting the problematic region has raised slow-disk alarms many times, up to 5 times a day. After stopping that RegionServer the business recovered. However, the number of regions in the meta table and on the native page was still inconsistent: the native page showed about 37,000 regions while the meta table recorded about 39,000. Executing hbase hbck -repair on the related tables restored the counts to the same value.


Incremental export of HBase data is very slow

On 10.14, the export task of the LOG_BASE table was migrated to the standby cluster. The export job ExportUserTable_LOG_BASE_T_VINCIO_LOG runs more than 100 times a day, with task start times roughly between 02:00 and 16:00. Looking at the number of get operations of the two export tasks, ExportUserTable_PRM_BASE_T_PRMI_TRANSACTION and ExportUserTable_LOG_BASE_T_VINCIO_LOG both issue about 10 million gets, and since the VINCIO_LOG export task runs more than 100 times a day, this effectively increases the get read load on the cluster by more than 100 times, causing the corresponding server-side waits. Solution: it is recommended that the execution window of the PRM_BASE export task be completely staggered from that of the LOG_BASE export task.


During the inspection, it was found that the startup times of multiple regionservers were different

The RegionServers were restarted multiple times and not at the same time, which does not affect the business. Looking at the RegionServer logs around the restart times, a large number of GC pause records were found, so the issue is initially located as a GC parameter setting problem. Further analysis of the detailed logs is needed.


It is found that the exported HBase incremental data is duplicated

It is found that some HBase regions overlap: some regions share the same start key and some share the same end key. Execute hbase hbck -fixHdfsOverlaps -sidelineBigOverlaps -maxMerge 200 -maxOverlapsToSideline -fixAssignments -fixReferenceFiles -fixMeta table name; after the repair, execute hbase hbck -repair table name.


Too many zk connections are established on the Hbase native interface

This interface establishes a new ZooKeeper connection every time it is called, so it should be used with caution:
/**
* get the regions of a given table.
*
* @param tableName the name of the table
* @return Ordered list of {@link HRegionInfo}.
* @throws IOException
*/
@Override
public List<HRegionInfo> getTableRegions(final TableName tableName)
throws IOException {
ZooKeeperWatcher zookeeper =
 new ZooKeeperWatcher(conf, ZK_IDENTIFIER_PREFIX + connection.toString(),
   new ThrowableAbortable());
List<HRegionInfo> Regions = null;
try {
 Regions = MetaTableAccessor.getTableRegions(zookeeper, connection, tableName, true);
} finally {
 zookeeper.close();
}
return Regions;
}
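A hedged client-side alternative: region information can be fetched through the existing Connection's RegionLocator, which reuses the connection instead of opening a new ZooKeeper session on every call (the table name is hypothetical):

import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;

public class RegionListExample {
    public static void main(String[] args) throws IOException {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             RegionLocator locator = conn.getRegionLocator(TableName.valueOf("USER_TABLE"))) {
            // One lookup through the shared connection; no extra ZooKeeper session is created here.
            List<HRegionLocation> locations = locator.getAllRegionLocations();
            for (HRegionLocation loc : locations) {
                System.out.println(loc.getRegionInfo().getRegionNameAsString()
                        + " -> " + loc.getServerName());
            }
        }
    }
}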


Repeated attempts to export Hbase incremental data fail

1. A secondary index entry whose indexed field is empty cannot be skipped.
2. Check whether the primary index entries corresponding to the secondary index entries exist.
3. Splice the rowkey of the primary index: according to the rowkeys of the empty secondary index entries in the log, assemble the corresponding rowkeys in the primary/secondary index big table. Scanning these rowkeys in the hbase shell returns no value, which means the field that is empty in the secondary index is also empty in the HBase big table. The likely reason is that the application did not successfully write the data that day.
4. Split the 24-hour incremental export task into 8 tasks of 3 hours each; 2 fail and 6 succeed, indicating that there are many empty fields.
5. Install the HD client and the CTBase 0.76 client, source the environment, and re-execute the export script; the data export succeeds.
6. It is recommended that the customer add a retry mechanism for write failures in the application (see the sketch below).
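A minimal sketch of the retry recommendation in step 6: retry a failed Put a few times with a short backoff (the table, row key and values are hypothetical):

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class RetryWriteExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("BIG_TABLE"))) {
            Put put = new Put(Bytes.toBytes("rowkey-001"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));

            int maxRetries = 3;
            for (int attempt = 1; ; attempt++) {
                try {
                    table.put(put);
                    break;                                  // write succeeded
                } catch (IOException e) {
                    if (attempt >= maxRetries) {
                        throw e;                            // give up after the last attempt
                    }
                    Thread.sleep(1000L * attempt);          // simple linear backoff before retrying
                }
            }
        }
    }
}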


HBase scan timeout

A scan with a filter added times out. The filter condition is that a column (time) falls within a certain time range, and the error reported is that the lease cannot be found. The table is about 1.3 TB in size with more than 3,000 regions. Optimization plan:
1. Write the scan operation as a MapReduce program and submit it to YARN for execution (see the sketch below);
2. Create a secondary index on the filtered field;
3. Execute major compactions regularly to clean up redundant data.
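A hedged sketch of option 1 above: the filtered scan is packaged as a MapReduce job so the work is spread across the cluster instead of running through a single client scanner (the table, family, qualifier and time bounds are hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class FilteredScanJob {

    // The mapper only counts matching rows; a real job would emit the columns it needs.
    public static class RowCountMapper extends TableMapper<NullWritable, NullWritable> {
        @Override
        protected void map(ImmutableBytesWritable key, Result value, Context context) {
            context.getCounter("scan", "matched_rows").increment(1);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "filtered-scan");
        job.setJarByClass(FilteredScanJob.class);

        Scan scan = new Scan();
        scan.setCaching(100);
        scan.setCacheBlocks(false);   // commonly recommended for MapReduce scans over large tables
        // "time" column must fall between two bounds, so both conditions have to hold.
        FilterList range = new FilterList(FilterList.Operator.MUST_PASS_ALL);
        range.addFilter(new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("time"),
                CompareOp.GREATER_OR_EQUAL, Bytes.toBytes("20170701000000")));
        range.addFilter(new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("time"),
                CompareOp.LESS_OR_EQUAL, Bytes.toBytes("20170731235959")));
        scan.setFilter(range);

        TableMapReduceUtil.initTableMapperJob("BIG_TABLE", scan,
                RowCountMapper.class, NullWritable.class, NullWritable.class, job);
        job.setOutputFormatClass(NullOutputFormat.class);
        job.setNumReduceTasks(0);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}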


oldWALs cannot be automatically deleted

Checking the HMaster log shows that the thread responsible for periodically cleaning oldWALs has stopped abnormally. The thread can be restarted by switching the active and standby HMasters.
2018-01-05 10:04:15,954 | WARN | hdchannel-mgt3,21300,1514367333682_ChoreService_1 | A file cleanerLogsCleaner is stopped, won't delete any more files in:hdfs://hacluster/hbase/oldWALs | org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteFiles(CleanerChore.java:228)


Failed to rebuild data for secondary index

The fields of the secondary index being rebuilt are the same as those of the primary index, only in a different order. After adding a field to the secondary index so that it differs from the primary index, the data rebuild succeeds. The long-term solution is to upgrade the CTBase version.

Problem Description:

A script is called at 1:00 a.m. every day to export HBase incremental data outside the cluster for offline analysis. The script calls the Export interface provided by HBase. The program had been running continuously for 30 days and suddenly failed today.

Analysis: Looking at the task log, the step that exports HBase data to HDFS failed.

Viewing the export MapReduce job through YARN, the job has 83 map tasks, 82 of which have completed, while the remaining map task stays in the running state. Under normal circumstances the entire program completes in about half an hour, but this map task keeps running; it is this task that causes the entire program to hang.

The HBase service is in good condition: the Manager shows that the RegionServer node where the task runs is normal, and the HMaster shows that the RegionServers hosting the table whose incremental export failed are normal. From this it can basically be judged that the HBase service itself is fine.

Checking the task log through YARN, the error OutOfOrderScannerNextException is found, which indicates that the scan operation timed out while reading. Adjust the client parameter hbase.client.scanner.caching from 100 to 50, so that each scan request returns 50 rows of data; this reduces the time spent on each scan round trip and lowers the possibility of a timeout when calling next. In addition, adjust the client configuration hbase.rpc.timeout from 60000 to 600000 to increase the client's tolerance for slow RPC operations.
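A minimal sketch of the client-side adjustments described above, assuming they are applied programmatically; in the actual case they were changed in the client configuration files:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ExportClientTuning {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        conf.setInt("hbase.client.scanner.caching", 50);   // was 100: fewer rows per scan round trip
        conf.setInt("hbase.rpc.timeout", 600000);          // was 60000: 10-minute RPC timeout
        // This Configuration would then be used when creating the Connection for the export job.
        System.out.println(conf.get("hbase.client.scanner.caching"));
    }
}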

After changing the above configuration, the program was restarted manually and completed successfully.


Why did the program work fine for the previous 30 days but suddenly fail today?

I personally think there may be two reasons:

1. The region scanned by the hung map task contains a large amount of data, resulting in a large scan delay.

2. An increase in the cluster's business volume or a business peak leads to a heavy network I/O load (or the network itself is slow), which ultimately makes the scan take a long time.



