After more services were connected to Pinpoint monitoring, the data volume rose sharply: HBase data grows by about 25G per week on average. That is too much data, and it needs to be cleaned regularly; otherwise the availability of monitoring degrades.
Procedure
Find the large tables in HBase
[root@iZ28ovlz7ccZ worker]# du -sh hbase/data/default/*
2.2M    hbase/data/default/AgentEvent
348K    hbase/data/default/AgentInfo
2.6M    hbase/data/default/AgentLifeCycle
329M    hbase/data/default/AgentStatV2
34M     hbase/data/default/ApiMetaData
44K     hbase/data/default/ApplicationIndex
66M     hbase/data/default/ApplicationMapStatisticsCallee_Ver2
60M     hbase/data/default/ApplicationMapStatisticsCaller_Ver2
16M     hbase/data/default/ApplicationMapStatisticsSelf_Ver2
1.1M    hbase/data/default/ApplicationStatAggre
1.1G    hbase/data/default/ApplicationTraceIndex
976K    hbase/data/default/HostApplicationMap_Ver2
15M     hbase/data/default/SqlMetaData_Ver2
848K    hbase/data/default/StringMetaData
21G     hbase/data/default/TraceV2
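With this many tables, it can help to let the shell rank them instead of reading the listing by eye. The following is a minimal sketch, not part of the original procedure: the function name largest_tables is hypothetical, and it assumes the table directories sit directly under the path passed in, as in the listing above.

```shell
# List the largest table directories first.
# $1: directory that contains one subdirectory per table
# $2: how many entries to print (defaults to 5)
largest_tables() {
    dir=$1
    count=${2:-5}
    # du -sk reports sizes in KiB, so sort -n can compare them numerically
    du -sk "$dir"/* | sort -rn | head -n "$count"
}
```

For example, `largest_tables hbase/data/default 2` run from the same working directory would print the two biggest tables, with sizes in KiB rather than the human-readable units of `du -sh`.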
About 20G of data is generated in 24 hours. TraceV2 and ApplicationTraceIndex hold most of the data, so their TTLs are set to 7 days and 14 days respectively.
Enter the HBase shell and modify the table TTL
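HBase takes the TTL of a column family in seconds, so the day values have to be converted before they are used in an alter command. A small sketch of that arithmetic (the helper name days_to_seconds is hypothetical):

```shell
# Convert a retention period in days to the number of seconds used for TTL.
days_to_seconds() {
    echo $(( $1 * 24 * 60 * 60 ))
}

days_to_seconds 7     # prints 604800
days_to_seconds 14    # prints 1209600
```

The same calculation gives 5184000 for 60 days, which matches the TTL the tables show before they are altered.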
[root@iZ28ovlz7ccZ ~]# /usr/local/hbase-1.0.3/bin/hbase shell
2019-08-19 15:43:20,320 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hbase(main):002:0> list
TABLE
AgentEvent
AgentInfo
AgentLifeCycle
AgentStatV2
ApiMetaData
ApplicationIndex
ApplicationMapStatisticsCallee_Ver2
ApplicationMapStatisticsCaller_Ver2
ApplicationMapStatisticsSelf_Ver2
ApplicationStatAggre
ApplicationTraceIndex
HostApplicationMap_Ver2
SqlMetaData_Ver2
StringMetaData
TraceV2
15 row(s) in 0.0100 seconds
=> ["AgentEvent", "AgentInfo", "AgentLifeCycle", "AgentStatV2", "ApiMetaData", "ApplicationIndex", "ApplicationMapStatisticsCallee_Ver2", "ApplicationMapStatisticsCaller_Ver2", "ApplicationMapStatisticsSelf_Ver2", "ApplicationStatAggre", "ApplicationTraceIndex", "HostApplicationMap_Ver2", "SqlMetaData_Ver2", "StringMetaData", "TraceV2"]
hbase(main):004:0> describe 'TraceV2'
Table TraceV2 is ENABLED
TraceV2
COLUMN FAMILIES DESCRIPTION
{NAME => 'S', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'PREFIX', TTL => '5184000 SECONDS ( 60 DAYS)', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.1190 seconds
hbase(main):005:0> disable 'TraceV2'
0 row(s) in 4.2190 seconds
hbase(main):006:0> alter 'TraceV2' , {NAME=>'S',TTL=>'604800'}
Updating all regions with the new schema...
256/256 regions updated.
Done.
0 row(s) in 1.0980 seconds
hbase(main):009:0> enable 'TraceV2'
0 row(s) in 4.2370 seconds
hbase(main):010:0> describe 'TraceV2'
Table TraceV2 is ENABLED
TraceV2
COLUMN FAMILIES DESCRIPTION
{NAME => 'S', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'PREFIX', TTL => '604800 SECONDS (7 DAYS)', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0160 seconds

hbase(main):002:0> describe 'TraceV2'
Table TraceV2 is ENABLED
TraceV2
COLUMN FAMILIES DESCRIPTION
{NAME => 'S', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'PREFIX', TTL => '5184000 SECONDS (60 DAYS)', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.1000 seconds
Set the TTL of ApplicationTraceIndex to 14 days
hbase(main):011:0> describe 'ApplicationTraceIndex'
Table ApplicationTraceIndex is ENABLED
ApplicationTraceIndex
COLUMN FAMILIES DESCRIPTION
{NAME => 'I', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'PREFIX', TTL => '5184000 SECONDS ( 60 DAYS)', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0150 seconds
hbase(main):012:0> disable 'ApplicationTraceIndex'
0 row(s) in 1.1660 seconds
hbase(main):013:0> alter 'ApplicationTraceIndex' , {NAME=>'I',TTL=>'1209600'}
Updating all regions with the new schema...
16/16 regions updated.
Done.
0 row(s) in 1.0550 seconds
hbase(main):014:0> enable 'ApplicationTraceIndex'
0 row(s) in 0.3520 seconds
hbase(main):015:0> describe 'ApplicationTraceIndex'
Table ApplicationTraceIndex is ENABLED
ApplicationTraceIndex
COLUMN FAMILIES DESCRIPTION
{NAME => 'I', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'PREFIX', TTL => '1209600 SECONDS ( 14 DAYS)', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0200 seconds
hbase(main):016:0> major_compact 'ApplicationTraceIndex'
0 row(s) in 0.1660 seconds
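The describe output is the only confirmation that the new TTL took effect, and the value is easy to miss in the long attribute line. As a sketch, the seconds value can be pulled out with sed; the function name ttl_seconds is hypothetical, and the pattern assumes the `TTL => 'N SECONDS (...)'` form shown in the output above.

```shell
# Read describe output on stdin and print the TTL in seconds
# for every line that contains a TTL attribute in the form shown above.
ttl_seconds() {
    sed -n "s/.*TTL => '\([0-9][0-9]*\) SECONDS.*/\1/p"
}
```

Assuming the HBase shell accepts commands on standard input, something like `echo "describe 'ApplicationTraceIndex'" | /usr/local/hbase-1.0.3/bin/hbase shell | ttl_seconds` should print 1209600 after the change. Lines whose TTL is not written in this form produce no output.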
Remark
The purpose of the major_compact operation:
Merge files
Clear deleted, expired, and redundant versions of the data
Improve the efficiency of reading and writing data
604800 seconds = 7 days
describe 'TraceV2'
disable 'TraceV2'
alter 'TraceV2' , {NAME=>'S',TTL=>'604800'}
enable 'TraceV2'
describe 'TraceV2'
major_compact 'TraceV2'
1209600 seconds = 14 days
describe 'ApplicationTraceIndex'
disable 'ApplicationTraceIndex'
alter 'ApplicationTraceIndex' , {NAME=>'I',TTL=>'1209600'}
enable 'ApplicationTraceIndex'
describe 'ApplicationTraceIndex'
major_compact 'ApplicationTraceIndex'
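The two command sequences above differ only in table name, column family, and TTL value, so they can be generated rather than typed. This is a minimal sketch under that assumption; ttl_commands is a hypothetical helper that only prints the commands and does not run them.

```shell
# Print the HBase shell commands that set a TTL on one column family.
# $1: table name, $2: column family name, $3: retention in days
ttl_commands() {
    table=$1
    family=$2
    seconds=$(( $3 * 86400 ))   # HBase expects the TTL in seconds
    cat <<EOF
disable '$table'
alter '$table', {NAME=>'$family',TTL=>'$seconds'}
enable '$table'
describe '$table'
major_compact '$table'
EOF
}

ttl_commands TraceV2 S 7
ttl_commands ApplicationTraceIndex I 14
```

The printed commands can be reviewed first and then pasted into the HBase shell, which keeps the disable and alter steps under manual control.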
[root@iZ28ovlz7ccZ ~]# du -sh /worker/hbase/data/*
14G     /worker/hbase/data/default
348K    /worker/hbase/data/hbase