"InfluxDB principle and actual combat" reading notes

Overview

The preface serves as a first impression (although many people, myself included, often skip it). Here it is forceful, and its conclusions are to the point:

The real challenges of massive monitoring data lie in the following points:
1. Can it be real-time? Real-time capability is a qualitative change: it can upgrade an offline monitoring platform into a real-time decision-making system. The difficulty lies in whether a high-performance architecture can be designed and whether horizontal scaling can be achieved.
2. After clustering, the traffic volume of a single business and the number of tag sets become the key factors. High traffic is relatively easy to handle; it mainly concerns system performance and horizontal scaling. With many tag sets, there are many tags and a very large number of time series, so optimizing queries is a challenge. For example, the monitoring data reported by the author carries tags in dozens of dimensions, with QQ numbers and URLs used as tag values, which produces a very large number of time series.
3. How can an efficient storage engine be designed around the characteristics of monitoring data, which is write-heavy, read-light, and cost-sensitive? It must make full use of hardware performance while compressing storage efficiently and still ensuring query efficiency.
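
Point 2 above can be made concrete with a little arithmetic. The sketch below is only an illustration: the tag names and distinct-value counts are made up, and it computes the worst-case number of series as the product of the distinct values per tag, which shows why high-cardinality tag values such as QQ numbers or URLs blow up the series count.

```python
from math import prod


def max_series_cardinality(tag_value_counts):
    """Upper bound on distinct series: product of distinct values per tag."""
    return prod(tag_value_counts.values())


# Hypothetical tags and distinct-value counts, for illustration only.
tags = {"region": 10, "host": 200, "url": 5000}
print(max_series_cardinality(tags))  # 10 * 200 * 5000 = 10000000
```

In practice not every combination occurs, so the real count is usually lower, but a single tag with millions of values is enough to dominate it.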


Technical philosophy: use technology to reduce costs, and firmly oppose piling up open source software.


What needs to be done is to face the problem with strong technical and engineering capabilities and solve it at the architecture and source code level, instead of introducing and stacking ever more open source software.

Chapter 1 First Look at InfluxDB

Time series data

Time series data can be defined by several characteristics:

  1. Arriving data is almost always recorded as a new entry; there are no update operations.
  2. Data usually arrives in chronological order.
  3. Time is the primary axis.

InfluxData and TICK

TICK (Telegraf + InfluxDB + Chronograf + Kapacitor) is InfluxData's open source, high-performance time series platform, and InfluxDB was designed and developed as its storage system. TICK targets scenarios such as DevOps monitoring, IoT monitoring, and real-time analysis, and integrates collection, storage, analysis, and visualization capabilities. Its four components, Telegraf, InfluxDB, Chronograf, and Kapacitor, are closely coordinated and complement one another.
[Figure: overall TICK system architecture]
After the company open-sourced InfluxDB-Relay, its high-availability suite, it announced that the cluster feature would become closed source and be distributed in the paid commercial versions (InfluxDB Enterprise and InfluxDB Cloud).

Use cases

Time series data storage, analysis and monitoring

Advantages

Features and introduction

Chapter 2 Introduction to InfluxDB

Installation

Introduction to the command line

Configuration file

Chapter 3 Writing and Query

Write

InfluxDB write operations support the concise Line Protocol (a text-based protocol), and third-party protocols such as CollectD, Graphite, OpenTSDB, Prometheus, and UDP.

Line protocol

A single line of text represents one time series data point and consists of four parts: measurement (table), tag set, field set (indicators), and timestamp.
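
To make the four parts concrete, here is a minimal sketch that assembles one line of line protocol. The helper names (`to_line`, `_escape_tag`, `_format_field`) and the sample values are assumptions for illustration, not an official client API, and the escaping rules are simplified.

```python
def _escape_tag(text):
    # Commas, equals signs and spaces are backslash-escaped in keys and tag values.
    return text.replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def _format_field(value):
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integer fields carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # string fields are double-quoted


def to_line(measurement, tags, fields, timestamp_ns):
    """Build: measurement,tag_set field_set timestamp"""
    name = measurement.replace(",", r"\,").replace(" ", r"\ ")
    tag_part = "".join(
        f",{_escape_tag(k)}={_escape_tag(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape_tag(k)}={_format_field(v)}" for k, v in fields.items()
    )
    return f"{name}{tag_part} {field_part} {timestamp_ns}"


print(to_line("cpu", {"host": "server01", "region": "us-west"},
              {"usage": 0.64, "cores": 4}, 1434055562000000000))
# cpu,host=server01,region=us-west usage=0.64,cores=4i 1434055562000000000
```

Tags are sorted by key here only to keep the output deterministic.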

Chapter 4 Continuous Query and Retention Strategy

Continuous query

Retention policy

Chapter 5 Authentication and Authorization

Through authentication and authorization, different accounts of InfluxDB have completely independent data space and permission space.

Authentication

Authorization

Chapter 6 Cluster and High Availability

Chapter 7 Backup Management and Node Management

Backup management

InfluxDB Enterprise Edition provides two toolsets:

  1. Backup and restore tool set
    A general-purpose tool used in most scenarios; it lets you select the data to back up or restore along three dimensions: database, retention policy, and shard.

  2. Export and import data tool set
    A supplementary backup tool designed for massive data sets (over 100 GB).

    • Export: influx_inspect export, which exports data in line protocol format. Options:
      • -compress: compress the data with gzip; not compressed by default
      • -database <db_name>: the name of the database whose data is to be exported
      • -datadir <data_dir>: the storage directory of the DATA node's data; the default value is $HOME/.influxdb/data
      • -end <timestamp>: the timestamp of the end of the time range, in RFC 3339 format
      • -out <export_dir>: the storage directory of the exported data; the default value is $HOME/.influxdb/export
      • -retention <rp_name>: the name of the retention policy whose data is to be exported
      • -start <timestamp>: the timestamp of the beginning of the time range, in RFC 3339 format
      • -waldir <wal_dir>: the storage directory of the DATA node's WAL files; the default value is $HOME/.influxdb/wal
    • Import: influx -import. Options:
      • -path: the storage directory of the data file to be imported
      • -compressed: set to true if the imported file is compressed; .gz compression is supported
      • -pps: the allowed import rate in points per second. The default is 0, meaning unlimited.
      • -precision 'h|m|s|ms|u|ns': the timestamp precision of the imported data. Supported values are h (hour), m (minute), s (second), ms (millisecond), u (microsecond), and ns (nanosecond); the default is ns.
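
As a rough illustration of how the export options listed above combine, the sketch below only assembles the command line as an argument list; it does not run anything. The function name `build_export_args` and the sample database, retention policy, and path are hypothetical.

```python
import shlex


def build_export_args(database, retention=None, start=None, end=None,
                      out=None, compress=False):
    """Assemble an influx_inspect export command from the options in the notes."""
    args = ["influx_inspect", "export", "-database", database]
    if retention:
        args += ["-retention", retention]
    if start:
        args += ["-start", start]  # RFC 3339 timestamp
    if end:
        args += ["-end", end]      # RFC 3339 timestamp
    if out:
        args += ["-out", out]
    if compress:
        args.append("-compress")   # gzip the exported data
    return args


# Hypothetical values, for illustration only.
args = build_export_args("mydb", retention="autogen",
                         start="2020-01-01T00:00:00Z",
                         end="2020-02-01T00:00:00Z",
                         out="/tmp/export", compress=True)
print(shlex.join(args))
```

Building the list first and passing it to something like `subprocess.run(args)` avoids shell quoting problems, assuming the tool is installed on the node.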

Node management

Chapter 8 Third-Party Protocols

The open source world today is thriving, and a system that shuts itself off and tries to dominate alone will most likely die out. It is therefore necessary to support extensibility, either through native integration with other systems or through plug-ins.

UDP

CollectD

[Figure: CollectD architecture diagram]
The multi-value plugins setting specifies how multi-metric data is handled. There are two handling modes: split and join.
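
The difference between the two modes can be sketched as follows. This is a simplified illustration, not InfluxDB's actual implementation: the function names are hypothetical, and the `<plugin>_<value name>` naming in split mode is an assumption based on my understanding of how the modes behave.

```python
def split_multivalue(plugin, values, tags):
    """split: one measurement per value, each with a single 'value' field."""
    return [(f"{plugin}_{name}", tags, {"value": v}) for name, v in values.items()]


def join_multivalue(plugin, values, tags):
    """join: a single measurement that keeps every value as its own field."""
    return [(plugin, tags, dict(values))]


# Hypothetical multi-value sample from a disk-usage plugin.
sample = {"free": 5000, "used": 1000}
tags = {"host": "server01"}

print(split_multivalue("df", sample, tags))
# [('df_free', {'host': 'server01'}, {'value': 5000}), ('df_used', {'host': 'server01'}, {'value': 1000})]
print(join_multivalue("df", sample, tags))
# [('df', {'host': 'server01'}, {'free': 5000, 'used': 1000})]
```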

Graphite

OpenTSDB

OpenTSDB is a distributed, scalable time series database built on HBase, consisting of the Time Series Daemon (TSD) and a set of command line utilities. The OpenTSDB service is provided by running one or more TSDs. Each TSD is independent, with no master node and no shared state, so users can run as many TSDs as needed to support their business. Each TSD uses HBase or the hosted Google Bigtable service to store and retrieve time series data. The data schema is highly optimized for fast aggregation of similar time series and for minimizing storage space. Users do not need to access the underlying storage directly; they can communicate with a TSD through the telnet protocol, the HTTP API, or the built-in GUI to fetch data or perform preset operations. All communication takes place on the same designated port, and the TSD determines the client's protocol by looking at the first few bytes received.
[Figure: OpenTSDB architecture]
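
The idea of serving telnet and HTTP on one port by inspecting the first bytes can be sketched like this. The heuristic below (check for an uppercase HTTP method prefix) and the function name `detect_protocol` are assumptions made for illustration; this is not OpenTSDB's actual code.

```python
# HTTP request lines start with an uppercase method followed by a space.
HTTP_METHODS = (b"GET ", b"POST ", b"PUT ", b"DELETE ", b"HEAD ", b"OPTIONS ")


def detect_protocol(first_bytes):
    """Guess the client protocol from the first bytes of a connection."""
    if first_bytes.startswith(HTTP_METHODS):
        return "http"
    # Anything else is treated as a telnet-style command such as "put ...".
    return "telnet"


print(detect_protocol(b"GET /api/version HTTP/1.1\r\n"))                   # http
print(detect_protocol(b"put sys.cpu.user 1356998400 42.5 host=web01\n"))  # telnet
```

The check is case-sensitive, so the lowercase telnet command `put` is not confused with the HTTP method `PUT`.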

Prometheus

Prometheus is inspired by Google's Borgmon monitoring system. Its main modules include Prometheus Server, Exporters, Push Gateway, PromQL, Alertmanager, and a graphical interface.
[Figure: Prometheus system architecture]

Chapter 9 DevOps Monitoring in Practice with TICK

TICK, namely Telegraf, InfluxDB, Chronograf, and Kapacitor, is a monitoring system that integrates collection capabilities, computing and storage capabilities, alarm capabilities, and visualization capabilities, and supports hundreds of third-party systems and software. It focuses on DevOps monitoring, IoT monitoring and real-time analysis.

[Figure: TICK architecture diagram]
Telegraf collects the monitoring data specified in the configuration file and reports it to the InfluxDB server through the InfluxDB API interface. After the InfluxDB server receives the reported time series data, it executes preset continuous queries, aggregation operations and other operations and compresses them for storage. Through Chronograf, visual information such as Dashboard can be viewed, and through Kapacitor, preset alarm strategies can be executed on the received time series data.
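
The reporting step, in which a collector posts data to the InfluxDB API, can be sketched as below. The sketch assumes the InfluxDB 1.x `/write` HTTP endpoint with `db` and `precision` query parameters; the host, port, database name, and sample point are hypothetical, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_write_request(base_url, database, lines, precision="ns"):
    """Build (but do not send) a POST carrying line protocol to /write."""
    query = urlencode({"db": database, "precision": precision})
    body = "\n".join(lines).encode("utf-8")  # one point per line
    return Request(f"{base_url}/write?{query}", data=body, method="POST")


# Hypothetical server address, database and data point.
req = build_write_request(
    "http://localhost:8086", "telegraf",
    ["cpu,host=server01 usage_idle=98.2 1434055562000000000"],
)
print(req.get_method(), req.full_url)
# POST http://localhost:8086/write?db=telegraf&precision=ns
```

Passing the request to `urllib.request.urlopen(req)` would send it, assuming a server is listening at that address.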

Telegraf

InfluxDB

Chronograf

Kapacitor

Chapter 10 DevOps Monitoring in Practice with InfluxDB, Prometheus, and Grafana

In the open source world, openness and ecosystem are the future. InfluxDB supports integration with Prometheus and Grafana. Does Prometheus really lack the ability to store data persistently?

Prometheus

Install, skip.

Grafana

Grafana features:

  • Visualization: from heat maps to histograms, from charts to geographic maps, Grafana supports a large number of visualization options that help users understand the data in a well-presented way.
  • Alarms: define thresholds visually, configure alarm policies, process metric data and generate the corresponding alarm information, with support for notification systems such as Slack, PagerDuty, VictorOps, and OpsGenie.
  • Unified display: integrate data from multiple data sources into a unified dashboard, with support for more than 30 open source and commercial data sources.
  • Open: completely open source, developed and run by a vibrant community; easy to install, supports all mainstream operating systems, and provides Docker images as well as a cloud-hosted Grafana service.
  • Extensibility: the official Grafana library currently offers hundreds of dashboards and plugins, and the active Grafana community releases new dashboards or plugins every week.
  • Collaboration: Data and dashboards can be shared between teams.

Install, skip.

Integration

Chapter 11 Analysis of InfluxDB Source Code Architecture

Origin blog.csdn.net/lonelymanontheway/article/details/107444961