【Introduction to OpenTSDB】

OpenTSDB is a time series database. It stores its data in HBase, takes full advantage of HBase's distributed, column-oriented storage, and can support millions of reads and writes per second.

 

OpenTSDB is an open source monitoring system that uses HBase to store all time series at full resolution (without downsampling), forming a distributed and scalable time series database. It supports second-level collection of all metrics and permanent storage, can be used for capacity planning, and integrates easily with an existing alerting system. OpenTSDB collects metrics from large clusters (including the network devices, operating systems, and applications in the cluster), then stores, indexes, and serves them, which makes the data easier to understand, for example through web pages and graphs.

 

How does OpenTSDB work?

OpenTSDB consists of a Time Series Daemon (TSD) as well as a set of command line utilities. Interaction with OpenTSDB is primarily achieved by running one or more TSDs. Each TSD is independent. There is no master and no shared state, so you can run as many TSDs as required to handle any load you throw at them. Each TSD uses the open source database HBase to store and retrieve time series data. The HBase schema is highly optimized for fast aggregations of similar time series and for minimal storage space. Users of the TSD never need to access HBase directly. You can communicate with the TSD via a simple telnet-style protocol, an HTTP API, or a simple built-in GUI. All communication happens on the same port (the TSD figures out the client's protocol by looking at the first few bytes it receives).



 In OpenTSDB, a time series data point consists of:

A metric name.

A UNIX timestamp (seconds or milliseconds since Epoch).

A value (64 bit integer or single-precision floating point value).

A set of tags (key-value pairs) that describe the time series the point belongs to.
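
As a minimal sketch of these four parts, the snippet below models a data point and renders it as a line for the telnet-style protocol. The `DataPoint` class and its `to_put_line` method are hypothetical helpers written for illustration, not part of OpenTSDB; the `put <metric> <timestamp> <value> <tagk=tagv ...>` line format is the one the TSD accepts over telnet.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class DataPoint:
    """Hypothetical container mirroring the four parts of a data point."""
    metric: str
    timestamp: int                  # UNIX seconds (or milliseconds) since Epoch
    value: Union[int, float]
    tags: Dict[str, str] = field(default_factory=dict)

    def to_put_line(self) -> str:
        """Render the point as a telnet-style `put` command line."""
        if not self.tags:
            raise ValueError("a data point needs at least one tag")
        # Sort tags only to make the rendered line deterministic.
        tag_text = " ".join(f"{k}={v}" for k, v in sorted(self.tags.items()))
        return f"put {self.metric} {self.timestamp} {self.value} {tag_text}"


point = DataPoint("sys.cpu.user", 1356998400, 42.5,
                  {"host": "webserver01", "cpu": "0"})
print(point.to_put_line())
```

The metric name, timestamp, and value here are made-up sample data.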

 

OpenTSDB supports several ways of writing data: telnet, HTTP, or tcollector (a collection agent). Each has its own characteristics. Telnet is best suited to testing, tcollector is an agent that runs local collectors and forwards their output to a TSD, and the HTTP interface is the most general. Before writing any data, we need to think about the characteristics of the data we want to store.
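
For the HTTP route, a write might look roughly like the sketch below, which builds (but does not send) a POST to the `/api/put` endpoint of the HTTP API. The address `http://localhost:4242`, the helper name `build_put_request`, and the sample values are assumptions for illustration.

```python
import json
import urllib.request


def build_put_request(base_url, points):
    """Build a POST request carrying data points as a JSON array."""
    body = json.dumps(points).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/put",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sample data point; the values are made up.
points = [{
    "metric": "sys.cpu.user",
    "timestamp": 1356998400,
    "value": 42.5,
    "tags": {"host": "webserver01", "cpu": "0"},
}]

# Assumes a TSD listening on localhost:4242.
request = build_put_request("http://localhost:4242", points)
print(request.get_method(), request.full_url)
# To actually send it to a running TSD:
# urllib.request.urlopen(request)
```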

 

 

Usually, when we design a system, we also have to design its metrics. When naming a metric, we tend to give it one long, self-describing name and attach the value to it. For example, webserver01.sys.cpu.0.user represents the user-mode CPU usage of CPU 0 on the server webserver01. But if we store this whole name directly as the key, will it be convenient to query later?

 

After all, data is stored so that it can be read back one day, or read back according to certain conditions. Storing data without ever reading it is no different from deleting it! The same applies to OpenTSDB: we have to consider whether the data will be convenient to query once it is stored.

In the previous example, if we use webserver01.sys.cpu.0.user as the key, we may need to store many such keys. A machine with 64 cores, for example, needs 64 separate series. To calculate the average CPU usage of that machine, you have to scan all of those series and then aggregate them. Now imagine many data centers with many machines. When a statistical query touches a large number of records that are scattered across the underlying storage, aggregation performance will undoubtedly be poor. This comes down to the design of the HBase table: the row key starts with the metric, followed by the timestamp and then the tag key-value pairs. Without a design like this, the records involved in an aggregate query would not be stored compactly and might span multiple regions, and performance would suffer accordingly.
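
The effect of this row key layout can be illustrated with a simplified sketch. It assumes the default 3-byte UIDs, a 4-byte base timestamp aligned to the hour, and no salting; the UID tables below are hypothetical, since a real TSD assigns UIDs itself.

```python
import struct

UID_WIDTH = 3            # bytes per UID (assumed default width)
ROW_SPAN_SECONDS = 3600  # each row covers one hour of data

# Hypothetical UID assignments, for illustration only.
METRIC_UIDS = {"sys.cpu.user": 1, "sys.mem.free": 2}
TAGK_UIDS = {"cpu": 1, "host": 2}
TAGV_UIDS = {"0": 1, "1": 2, "webserver01": 3, "webserver02": 4}


def uid(n: int) -> bytes:
    """Encode a UID as a fixed-width big-endian byte string."""
    return n.to_bytes(UID_WIDTH, "big")


def row_key(metric, timestamp, tags):
    """Build metric UID + hour-aligned timestamp + sorted tag UID pairs."""
    base_time = timestamp - (timestamp % ROW_SPAN_SECONDS)
    key = uid(METRIC_UIDS[metric]) + struct.pack(">I", base_time)
    for tagk, tagv in sorted(tags.items(), key=lambda kv: TAGK_UIDS[kv[0]]):
        key += uid(TAGK_UIDS[tagk]) + uid(TAGV_UIDS[tagv])
    return key


# Sorting mimics HBase's lexicographic row ordering.
keys = sorted([
    row_key("sys.mem.free", 1356998400, {"host": "webserver01"}),
    row_key("sys.cpu.user", 1356998400, {"host": "webserver02", "cpu": "1"}),
    row_key("sys.cpu.user", 1356998400, {"host": "webserver01", "cpu": "0"}),
])
for key in keys:
    print(key.hex())
```

Because every key begins with the metric UID, the two sys.cpu.user rows end up next to each other regardless of host, so a scan for that metric reads one contiguous range.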

OpenTSDB uses tags to help with this. When you store data, the key (the metric) is still there, but it should be kept as simple as possible. When a metric needs to be described along several dimensions, the remaining dimensions are recorded as tags. What difference does this make, what is the benefit, and why is it designed this way?

OpenTSDB requires every data point to have at least one tag. There can be several, although the number is capped (8 per data point by default). At the same time, a simplified metric is used as the key, and this metric can be shared across series. Taking the previous example, webserver01.sys.cpu.0.user becomes sys.cpu.user{host=webserver01,cpu=0}: the metric is now sys.cpu.user, and it carries two tag key-value pairs.
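
The transformation can be sketched as a small function. It assumes a hypothetical naming layout in which the first component is the host and the only purely numeric component is the CPU index; real naming schemes will differ, and `to_metric_and_tags` is an illustrative name.

```python
def to_metric_and_tags(legacy_name):
    """Split a dotted legacy name into a shared metric and its tags."""
    host, *rest = legacy_name.split(".")
    tags = {"host": host}
    metric_parts = []
    for part in rest:
        if part.isdigit():
            # Assumption: the only numeric component is the CPU index.
            tags["cpu"] = part
        else:
            metric_parts.append(part)
    return ".".join(metric_parts), tags


metric, tags = to_metric_and_tags("webserver01.sys.cpu.0.user")
print(metric, tags)
```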

In this way, no matter which data center or machine the data comes from, all CPU usage data is stored compactly together, and the corresponding aggregate queries perform better.
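
With the data laid out like this, the per-machine average from the earlier example becomes a single query against the shared metric. The sketch below only builds the URL for the HTTP API's `/api/query` endpoint; the address `http://localhost:4242`, the `1h-ago` time range, and the helper name `build_query_url` are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(base_url, metric, aggregator, tags, start="1h-ago"):
    """Build a query URL of the form m=<aggregator>:<metric>{tagk=tagv,...}."""
    tag_filter = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    m = f"{aggregator}:{metric}{{{tag_filter}}}"
    return f"{base_url}/api/query?" + urlencode({"start": start, "m": m})


# Average user-mode CPU usage across all cores of webserver01.
url = build_query_url("http://localhost:4242", "sys.cpu.user", "avg",
                      {"host": "webserver01"})
print(url)
# Decode the query string again to show the readable form of `m`.
print(parse_qs(urlsplit(url).query)["m"][0])
```

Leaving the cpu tag out of the filter asks the TSD to aggregate over all values of that tag for the given host.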
