HBase tuning | HBase Compaction parameter tuning

Compaction serves three main purposes:

1. Merge multiple HFiles into one larger HFile to improve query performance.
2. Reduce the number of HFiles, which lessens the impact of small files on HDFS.
3. Speed up region initialization.

hbase.hstore.compaction.min

Once the number of HFiles in a column family reaches this value, a minor compaction is triggered. The default is 3, which is rather low; a value of 10-15 is recommended. If the value is too small, files are merged too often, which hurts especially with frequent bulkloads or large data volumes. If it is too large, many HFiles accumulate in the column family and query efficiency suffers.
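The trade-off above can be made concrete with a small sketch. This is a simplified model, not HBase's actual selection code: the function names are hypothetical, the threshold is assumed to be inclusive, and files that are already being compacted are ignored.

```python
def needs_minor_compaction(hfile_sizes, compaction_min=3):
    """Return True once the store holds at least `compaction_min` HFiles.

    Assumption: the threshold is inclusive and no file is already compacting.
    """
    return len(hfile_sizes) >= compaction_min


def flushes_until_compaction(existing_files, compaction_min):
    """Number of further flushes or bulkloads before a compaction is requested."""
    return max(0, compaction_min - existing_files)


store = [64, 70, 58]  # three small HFiles in one column family, sizes in MB
print(needs_minor_compaction(store, compaction_min=3))
print(needs_minor_compaction(store, compaction_min=10))
print(flushes_until_compaction(len(store), 10))
```

With the default of 3 the store is compacted immediately. With 10, seven more files can pile up first, which means fewer merges but more files to read per query.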

Advanced: the right value also depends on the characteristics of the business data. Take a cloud system for itemized usage records (detailed bills): a new table is created every month, the rowkey is reverse(mobile phone number) + timestamp, and data is imported every 3-5 minutes. Queries filter by mobile phone number plus a time period. Because a given phone number keeps generating traffic records, its data ends up spread across many HFiles. If hbase.hstore.compaction.min is set too large, each query has to read more HFiles, which hurts query efficiency. This kind of workload is not suited to a particularly large value.

Conversely, for a log-style workload that only queries a particular time window, the data being queried is concentrated: a query touches one HFile or two adjacent ones. Merging files then does little for query efficiency, so you can set a larger value to reduce the load that compaction puts on the system.
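A rough way to see the difference between the two workloads is to count how many HFiles a query has to open. The sketch below models each HFile only by its first and last rowkey. The data, the four-character keys, and the `files_touched` helper are all made up for illustration, and real HBase can skip additional files through Bloom filters and time-range checks.

```python
def files_touched(hfiles, query_start, query_end):
    """Count HFiles whose [first_key, last_key] range overlaps the query range."""
    return sum(
        1 for first, last in hfiles
        if first <= query_end and last >= query_start
    )


# Usage-record style table: every import batch contains rows for all phone
# numbers, so each HFile spans nearly the whole rowkey space.
detail_hfiles = [("0000", "9999")] * 12

# Log style table: rowkeys are time-ordered, so each HFile covers its own slice.
log_hfiles = [(f"{h:02d}00", f"{h:02d}59") for h in range(12)]

print(files_touched(detail_hfiles, "1380", "1380"))  # every file overlaps
print(files_touched(log_hfiles, "0530", "0610"))     # only adjacent slices overlap
```

In the first case the query touches all 12 files, so leaving many files unmerged is costly. In the second it touches 2, so a larger hbase.hstore.compaction.min costs little on reads.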

hbase.hstore.compaction.max

The maximum number of HFiles that can be merged in a single compaction; the default is 10. It caps how many files in a column family can be selected for one merge. Note that hbase.hstore.compaction.max > hbase.hstore.compaction.min must hold.
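As a minimal sketch of how the two bounds interact: the hypothetical helper below rejects configurations where max is not greater than min, and otherwise caps the selection at max files. It assumes the candidate list is ordered oldest first and that the oldest files are the ones kept when the cap applies. That ordering is an assumption of this sketch, so check it against your HBase version.

```python
def select_for_compaction(candidates, compaction_min=3, compaction_max=10):
    """Return the HFiles picked for one minor compaction (simplified model)."""
    if compaction_max <= compaction_min:
        raise ValueError(
            "hbase.hstore.compaction.max must be greater than "
            "hbase.hstore.compaction.min"
        )
    if len(candidates) < compaction_min:
        return []  # not enough files yet, nothing to merge
    # Assumption: keep the oldest files and leave the rest for a later round.
    return candidates[:compaction_max]


hfiles = [f"hfile-{i}" for i in range(14)]
print(len(select_for_compaction(hfiles)))      # capped at compaction_max
print(select_for_compaction(hfiles[:2]))       # below compaction_min
```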

hbase.hstore.compaction.max.size

The default is Long.MAX_VALUE. An HFile larger than this value is not selected during a minor compaction. Use it to keep oversized HFiles out of minor compactions, which reduces write amplification and speeds up the merge.

hbase.hstore.compaction.min.size

The default is the memstore flush size. An HFile smaller than this value is always selected for minor compaction, so this setting can be used to merge as many small files as possible.
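The two size limits can be read as a three-way classification of each HFile. The sketch below is illustrative only: `classify_hfile` is a hypothetical name, the 128 MB flush size is an assumed default, and files between the two limits are simply labelled "candidate" because the remaining checks of the compaction policy are not modelled here.

```python
LONG_MAX = 2**63 - 1                     # Java Long.MAX_VALUE
MEMSTORE_FLUSH_SIZE = 128 * 1024 * 1024  # assumed memstore flush size (128 MB)


def classify_hfile(size_bytes, min_size=MEMSTORE_FLUSH_SIZE, max_size=LONG_MAX):
    """Classify one HFile for minor compaction by size alone (simplified)."""
    if size_bytes > max_size:
        return "excluded"         # larger than hbase.hstore.compaction.max.size
    if size_bytes < min_size:
        return "always selected"  # smaller than hbase.hstore.compaction.min.size
    return "candidate"            # left to the policy's other checks


MB = 1024 * 1024
for size in (10 * MB, 500 * MB):
    print(size // MB, "MB:", classify_hfile(size))
print(500, "MB with a 256 MB cap:", classify_hfile(500 * MB, max_size=256 * MB))
```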

hbase.regionserver.thread.compaction.small

The default is 1. This is usually described as the number of minor compaction threads per RegionServer (RS), but that is not quite accurate: which thread pool handles a compaction depends mainly on the amount of HFile data involved in the merge, so a minor compaction with a large amount of data may be handled by the compaction.large pool. Increasing the thread count can improve HFile merge efficiency.

hbase.regionserver.thread.compaction.large

The default is 1. This is usually described as the number of major compaction threads per RS, but that is not quite accurate either: the pool is chosen mainly by the amount of HFile data involved in the merge, so a minor compaction with a large amount of data may also use the compaction.large pool. Increasing the thread count can improve HFile merge efficiency.
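The point that the pool is chosen by data volume rather than by compaction type can be sketched as follows. The threshold is an assumption of this sketch: to my knowledge HBase reads it from hbase.regionserver.thread.compaction.throttle, with a default derived as 2 x hbase.hstore.compaction.max x memstore flush size, but verify both against your version. The function names are hypothetical.

```python
MEMSTORE_FLUSH_SIZE = 128 * 1024 * 1024  # assumed memstore flush size (128 MB)


def default_throttle(compaction_max=10, flush_size=MEMSTORE_FLUSH_SIZE):
    """Assumed default threshold: 2 x compaction.max x memstore flush size."""
    return 2 * compaction_max * flush_size


def choose_pool(total_bytes, throttle_bytes=None):
    """Pick the thread pool from the total size of the files being merged."""
    if throttle_bytes is None:
        throttle_bytes = default_throttle()
    return "large" if total_bytes > throttle_bytes else "small"


GB = 1024 ** 3
print(default_throttle())   # threshold in bytes
print(choose_pool(1 * GB))  # a modest minor compaction
print(choose_pool(3 * GB))  # a minor compaction over a lot of data
```

Under these assumptions a 3 GB minor compaction lands in the large pool, which is why tuning only compaction.small may not help when minor compactions are big.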

hbase.hregion.majorcompaction

Default: 86400000 (in milliseconds, i.e. one day)

Disable automatic major compaction by setting this value to 0, and run major compaction manually when business load is low.
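For reference, here is a small sketch that renders the corresponding hbase-site.xml property with Python's standard library. The `property_xml` helper is hypothetical, and the table name in the comment is a placeholder.

```python
import xml.etree.ElementTree as ET


def property_xml(name, value):
    """Render one <property> element in the hbase-site.xml layout."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(prop, encoding="unicode")


# A value of 0 turns off time-based automatic major compaction.
snippet = property_xml("hbase.hregion.majorcompaction", 0)
print(snippet)

# During off-peak hours, a major compaction can then be started by hand from
# the HBase shell, for example:  major_compact 'my_table'   (placeholder name)
```

Paste the printed element inside the `<configuration>` section of hbase-site.xml and restart the RegionServers for the change to take effect.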





Origin blog.51cto.com/15060465/2676889