HBase Sharing | Compression Optimization of Cloud HBase's OpenTSDB Time Series Engine

Abstract: At the 10th MeetUp of the Chinese HBase Technology Community, "HBase Ecological Practice" (Hangzhou Station), Alibaba Cloud technical expert Guo Zehui introduced OpenTSDB, the time series engine of Cloud HBase, and its compression optimization, covering the problems commonly encountered when using OpenTSDB, the corresponding optimization solutions, and the usage modes of Cloud OpenTSDB.

This article is organized based on the speech video and PPT.

This article focuses on the following four aspects:

  1. Introduction to OpenTSDB

  2. Common Problems of OpenTSDB

  3. Compression optimization of OpenTSDB

  4. Cloud OpenTSDB Usage Modes

This article first gives a brief introduction to OpenTSDB and the common problems encountered when using it, then focuses on the improvements made relative to the community version of OpenTSDB, and finally introduces several usage modes of Cloud OpenTSDB.


1. Introduction to OpenTSDB

OpenTSDB is a time series database built on HBase. Time series data has the following characteristics:
(1) The data is numeric and is typically used for monitoring.
(2) Data usually arrives in chronological order.
(3) Data is essentially never updated.
(4) Writes far outnumber reads.

On Alibaba Cloud, users typically deploy a number of OpenTSDB nodes behind an SLB (Server Load Balancer), with HBase as the underlying storage.


OpenTSDB defines time series data as follows. When data is stored, a metric is created, which can be understood as a monitoring group, and tags are used to further identify the data. The combination of a metric and its tags identifies a specific timeline; each timeline stores multiple data points, where a data point is a two-tuple of time and value, and each timeline continuously generates and writes new data. For time precision, OpenTSDB supports both seconds and milliseconds: a 10-digit timestamp indicates second precision, while a 13-digit timestamp indicates millisecond precision.

Next is a brief introduction to OpenTSDB's storage structure, taking metric=host.cpu, timestamp=1552643656, tags={region:HangZhou, host:30.43.111.255}, value=100 as an example. When this data reaches the HBase storage layer, it undergoes the following conversion. The metric host.cpu is mapped to a unique UID for storage. To reduce storage space, each timeline stores one row per hour, so timestamp=1552643656 is converted to the first whole second of its hour. Likewise, the tag information is mapped to unique UIDs; together these form the RowKey. The Column part records the offset from the start of the hour along with the value.
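
To make the hour alignment concrete, below is a minimal sketch, written as our own illustration rather than OpenTSDB source code, of how a second-precision timestamp splits into the RowKey's base hour and the Column's offset:

```java
// Minimal sketch (our own illustration, not OpenTSDB source code): splitting a
// second-precision timestamp into the RowKey's base hour and the Column offset.
public class TimestampSplit {
    public static void main(String[] args) {
        long timestamp = 1552643656L;                    // 10 digits: second precision
        long baseHour  = timestamp - (timestamp % 3600); // first whole second of the hour (RowKey)
        long offset    = timestamp - baseHour;           // seconds into the hour (Column qualifier)
        System.out.println("baseHour=" + baseHour + " offset=" + offset);
        // Prints: baseHour=1552640400 offset=3256
    }
}
```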


The following is a more detailed look at the data formats of the RowKey and Column. To avoid the write hotspot caused by one metric having many timelines, a salt byte is prepended to the metric, scattering the timelines of the same metric across different buckets. By default, metric and tag UIDs are 3 bytes long, so at most 16,777,216 UIDs can be allocated. To avoid running out of UIDs, the corresponding parameters (tsd.storage.uid.width.metric, tsd.storage.uid.width.tagk, tsd.storage.uid.width.tagv) can be enlarged, but note that once the cluster has written data, these parameters must not be modified again. For the Column part, the second-level and millisecond-level data formats differ, with millisecond-level data requiring more storage space.
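
As an illustration of that size difference, the following sketch encodes the two qualifier layouts as they are commonly documented for OpenTSDB; this is a simplification, and the OpenTSDB source remains authoritative:

```java
// Sketch of the two column-qualifier layouts as commonly documented for
// OpenTSDB (simplified; the OpenTSDB source is authoritative). Second
// precision fits in 2 bytes; millisecond precision needs 4 bytes.
public class QualifierLayouts {
    // 2 bytes: 12-bit offset in seconds within the hour + 4 flag bits
    // (flags encode the value's type and length).
    static byte[] secondQualifier(int offsetSec, int flags) {
        int q = (offsetSec << 4) | (flags & 0x0F);
        return new byte[] { (byte) (q >>> 8), (byte) q };
    }

    // 4 bytes: a 0xF marker nibble, a 22-bit offset in milliseconds,
    // 2 reserved bits, and the same 4 flag bits.
    static byte[] millisQualifier(int offsetMs, int flags) {
        int q = 0xF0000000 | (offsetMs << 6) | (flags & 0x0F);
        return new byte[] { (byte) (q >>> 24), (byte) (q >>> 16),
                            (byte) (q >>> 8), (byte) q };
    }
}
```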



2. Common Problems of OpenTSDB

Next, the common problems of OpenTSDB. As mentioned above, to avoid the write hotspot caused by one metric having many timelines, a salt byte is prepended to the data to break up the hotspot. Once broken up, however, a metric is spread across multiple buckets, so a query must send as many concurrent requests to HBase as there are buckets. This creates a trade-off between write hotspots and query concurrency: too much concurrency can easily overwhelm HBase, so the parameter should be set according to the actual scale of the HBase business. Also note that this setting cannot be modified once data has been written.
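
The two sides of that trade-off can be sketched as follows; the bucket count and hash below are our own illustration (in OpenTSDB 2.2+ salting is configured via tsd.storage.salt.buckets and tsd.storage.salt.width):

```java
import java.util.Arrays;

// Hedged sketch of the salting trade-off; the bucket count and hash are ours.
public class SaltTradeoff {
    static final int BUCKETS = 20; // illustrative; must not change once data is written

    // Write side: a salt prefix spreads one hot metric's timelines
    // across BUCKETS different rowkey ranges (regions).
    static byte saltFor(byte[] timelineKey) {
        return (byte) Math.floorMod(Arrays.hashCode(timelineKey), BUCKETS);
    }

    // Read side: the same metric now lives in every bucket, so a single
    // query fans out into BUCKETS concurrent scans against HBase.
    static void query(byte[] timelineKey) {
        for (int bucket = 0; bucket < BUCKETS; bucket++) {
            // scan rows with prefix [bucket][metricUID][baseHour][tagUIDs...]
        }
    }
}
```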



The second problem is that the Cartesian product of tags hurts query efficiency. Suppose a metric host.cpu that monitors machine CPU status has three tags: the data center region, the host name host, and the CPU core number core, with 10, 100, and 8 distinct values respectively. The metric then contains 10 × 100 × 8 = 8,000 timelines. Even a query for a single timeline may have to scan all 8,000 timelines to get the desired result, and the root cause is that tags are not indexed. There are currently two common mitigations: one is to promote a tag to the front of the RowKey to build an index, and the other is to splice part of the tag information into the metric to shrink the Cartesian product. OpenTSDB also has other query problems: data is not processed in a streaming fashion, so all data points must be loaded into TSDB memory before aggregation starts, which easily causes OOM; a large query is handled by a single TSDB node and cannot be processed in a distributed fashion; and even a query for only 15 minutes of data must traverse a full hour's worth of data.
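
A quick illustration of both the timeline blow-up and the splicing mitigation; the metric naming scheme below is hypothetical:

```java
// Illustrative arithmetic for the tag Cartesian product, plus the
// "splice a tag into the metric" mitigation (names here are our own).
public class TagCardinality {
    public static void main(String[] args) {
        int regions = 10, hosts = 100, cores = 8;
        System.out.println("host.cpu timelines: " + regions * hosts * cores); // 8000

        // Splicing the region tag into the metric name means a query scoped
        // to one region only touches that metric's 100 * 8 = 800 timelines.
        String spliced = "host.cpu.hangzhou"; // hypothetical naming scheme
        System.out.println(spliced + " timelines: " + hosts * cores); // 800
    }
}
```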


The third problem is OpenTSDB's compaction, namely that the on-the-hour compaction puts traffic pressure on HBase. To save storage space, OpenTSDB merges a row's many discrete KVs into one large KV to eliminate repeated fields. This means that at the top of every hour a series of read, compact, rewrite, and delete operations is issued, generating a large number of IO requests against HBase and producing sustained traffic peaks several times the normal load, which can easily overwhelm HBase. As a result, a larger machine scale is needed just to absorb the peaks.
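
To see what that merge produces, here is a hedged sketch of the shape of a compacted row; the real merge rules (flag bytes, mixed precision) live in the OpenTSDB source:

```java
import java.io.ByteArrayOutputStream;
import java.util.List;

// Hedged sketch of the *shape* of an OpenTSDB-compacted row: many small KVs
// become one KV whose qualifier and value are concatenations (details omitted).
public class CompactedRowShape {
    static byte[][] merge(List<byte[]> qualifiers, List<byte[]> values) {
        ByteArrayOutputStream q = new ByteArrayOutputStream();
        ByteArrayOutputStream v = new ByteArrayOutputStream();
        for (byte[] b : qualifiers) q.write(b, 0, b.length); // e.g. many 2-byte qualifiers
        for (byte[] b : values)     v.write(b, 0, b.length);
        // Afterwards the original small columns are deleted, which is why each
        // hour boundary triggers a read + rewrite + delete burst against HBase.
        return new byte[][] { q.toByteArray(), v.toByteArray() };
    }
}
```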



3. Compression Optimization of OpenTSDB

If the traffic peaks can be eliminated, operating costs drop significantly and system stability improves. The traditional compaction procedure is: first read back from HBase all data written in the last hour, then compress it inside OpenTSDB, write the compacted data back to HBase, and finally delete the old data from HBase. Clearly, compaction has OpenTSDB issuing a large number of IO requests against HBase. To solve this problem, OpenTSDB's compaction is optimized by sinking it into the underlying HBase: HBase itself already reads and rewrites data when it compacts HFiles, so that traffic can be reused, and KVs can be merged according to OpenTSDB's data format while HFiles are merged. This prevents OpenTSDB from issuing a flood of IO requests to HBase and thereby avoids the traffic peaks.
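
One plausible shape for this sinking, sketched against the HBase 1.x coprocessor API and explicitly not Alibaba's actual implementation, is a RegionObserver that hooks the compaction scanner:

```java
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.regionserver.ScanType;
import org.apache.hadoop.hbase.regionserver.Store;

// Conceptual sketch only (HBase 1.x API assumed): sink OpenTSDB compaction
// into HBase by hooking the scanner HBase already uses for HFile compaction.
public class TsdbCompactionObserver extends BaseRegionObserver {
    @Override
    public InternalScanner preCompact(ObserverContext<RegionCoprocessorEnvironment> ctx,
                                      Store store, InternalScanner scanner,
                                      ScanType scanType) {
        // A real implementation would return a wrapping InternalScanner that
        // buffers each row's cells and emits one merged KV in OpenTSDB's
        // compacted format while HBase rewrites HFiles anyway, so the TSD
        // layer never issues its own read/rewrite/delete traffic.
        return scanner; // placeholder: pass-through keeps this sketch compilable
    }
}
```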



4. Cloud OpenTSDB Usage Modes

Finally, several usage modes of Cloud OpenTSDB. The first is the exclusive mode, a completely independently deployed time series database, which is suitable when the time series workload is heavy and needs to be deployed separately.



The other is the sharing mode, which reuses an already purchased Cloud HBase instance. It is suitable when the time series workload is small or the HBase cloud resources are limited, and it can save costs to a certain extent.




