Monitoring Indicators and Monitoring Types

Monitoring indicators are the foundation of a monitoring system. The whole system exists to collect, transmit, store, analyze, and visualize them.

Monitoring indicators are numeric monitoring data, such as the memory utilization of a machine, the current number of connections on a MySQL instance, or the maximum memory limit of a Redis instance. Different monitoring systems describe indicators in different ways, and three of them are typical.

  • A globally unique string. For example, the memory utilization of a machine can be written as host.10.2.3.4.mem_used_percent. The string contains both the machine information and the indicator name, so it uniquely identifies one indicator.
  • An indicator name plus a set of labels. Each data point starts with the indicator name, followed by the timestamp (in seconds), then the indicator value, and finally any number of labels (tags). Each label is in key=value format, and multiple labels are separated by spaces.
  • The compact and efficient Influx format: measurement,tag_set field_set timestamp. The tag_set is optional. When present, it is separated from the measurement by a comma, and the remaining parts are separated by spaces.
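The three description methods above can be compared side by side with a small Python sketch. The function names are hypothetical and exist only for illustration. The sketch assumes plain numeric values, skips the escaping and type suffixes that a real Influx writer needs, and assumes a nanosecond timestamp for the Influx line.

```python
def dotted_name(host: str, metric: str) -> str:
    """Style 1: one globally unique string built from machine and indicator name."""
    return f"host.{host}.{metric}"


def tagged_line(metric: str, ts_seconds: int, value: float, tags: dict) -> str:
    """Style 2: name, timestamp (seconds), value, then space-separated key=value tags."""
    tag_part = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"{metric} {ts_seconds} {value} {tag_part}".rstrip()


def influx_line(measurement: str, tags: dict, fields: dict, ts: int) -> str:
    """Style 3: measurement,tag_set field_set timestamp (tag_set is optional)."""
    # Each tag is attached to the measurement with a leading comma.
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    # Fields are comma-separated; only numeric fields are handled in this sketch.
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {ts}"


if __name__ == "__main__":
    print(dotted_name("10.2.3.4", "mem_used_percent"))
    print(tagged_line("mem_used_percent", 1690000000, 42.5, {"host": "10.2.3.4"}))
    print(influx_line("mem", {"host": "10.2.3.4"}, {"used_percent": 42.5},
                      1690000000000000000))
```

The first style packs everything into the name, while the other two keep the indicator name short and move dimensions such as the host into labels, which makes filtering and grouping by label easier.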

The Prometheus ecosystem also defines indicator types. There are four of them: Gauge, Counter, Histogram, and Summary. Here is a brief look at each.

  • Gauge: A measured value that can go up or down and can be positive or negative. For this type of data, we usually focus on the current value.
  • Counter: A monotonically increasing value, such as the total number of packets received by a network card since the operating system started.
  • Histogram: Describes the distribution of data. The most typical scenario is monitoring latency and calculating the 90th and 99th percentile values. A percentile is obtained by sorting a batch of samples from smallest to largest and taking the value at the X% position, so the 90th percentile is the value at the 90% position of the sorted samples.
  • Summary: Calculates the quantile values on the client side, and the server stores the calculated results. They can be queried directly for display, so the server does no heavy calculation at query time and queries are much cheaper.
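The percentile definition above, and the way a Counter is usually turned into a rate, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Prometheus itself computes these values: `percentile` uses the simple nearest-rank method on raw samples, and `counter_rate` assumes that a drop in the counter means it was reset to zero (for example after a reboot). Both function names are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: sort ascending, take the value at the pct% position."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based position in the sorted list
    return ordered[rank - 1]


def counter_rate(prev_value, prev_ts, curr_value, curr_ts):
    """Per-second increase of a counter between two samples."""
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("current sample must be newer than the previous one")
    if curr_value >= prev_value:
        increase = curr_value - prev_value
    else:
        # Assumption: the counter restarted from zero between the two samples.
        increase = curr_value
    return increase / elapsed


if __name__ == "__main__":
    latencies_ms = list(range(1, 101))  # 100 made-up latency samples: 1..100 ms
    print(percentile(latencies_ms, 90))
    print(percentile(latencies_ms, 99))
    print(counter_rate(1000, 0, 1600, 60))
```

A Gauge is read as is, whereas a Counter only becomes meaningful once two samples are compared, which is why the rate helper takes a previous and a current sample.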

A time series database is a database that specializes in processing time series data. Among common databases, MySQL is a relational database, Redis is a key-value database, MongoDB is a document database, and InfluxDB, VictoriaMetrics, M3DB, and others are time series databases. Prometheus also has a built-in time series storage module.

The defining feature of time series data is that every record carries a timestamp. Timestamps usually increase monotonically, so records do not arrive out of order. The data is sent to the server as a stream and is usually not modified afterwards. Indicator data and log data are typical examples of time series data.
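These properties (timestamped, in order, append-only) are what make time series data easy to store and query. The following Python sketch is a toy in-memory series, not the storage design of any real time series database. The `Series` class and its methods are hypothetical names, and rejecting out-of-order writes is a simplifying assumption.

```python
import bisect


class Series:
    """Append-only list of (timestamp, value) points for one indicator."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def append(self, ts, value):
        # Keep timestamps strictly increasing; existing points are never modified.
        if self._timestamps and ts <= self._timestamps[-1]:
            raise ValueError("out-of-order timestamp")
        self._timestamps.append(ts)
        self._values.append(value)

    def query(self, start, end):
        """Return all points with start <= timestamp <= end."""
        # Sorted timestamps allow binary search instead of a full scan.
        lo = bisect.bisect_left(self._timestamps, start)
        hi = bisect.bisect_right(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._values[lo:hi]))


if __name__ == "__main__":
    cpu = Series()
    for ts, value in [(10, 0.31), (20, 0.35), (30, 0.28)]:
        cpu.append(ts, value)
    print(cpu.query(15, 30))
```

Because new points only ever land at the end, writes are cheap appends and a time range query is a binary search over sorted timestamps.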

This article is a study note for Day28 in July. The content comes from Geek Time "Operation and Maintenance Monitoring System Practical Notes". This course is recommended.

Origin blog.csdn.net/key_3_feng/article/details/131987637