Getting Started with Prometheus

What is TSDB?

A TSDB (time series database) can be understood simply as software optimized for handling time series data, in which the data points are indexed by time.

Features of time series databases

  • Most operations are writes.
  • Writes are almost always sequential appends; most of the time the data arrives sorted by time.
  • Writes rarely target data from long ago, and data is rarely updated. In most cases data is written to the database within seconds or minutes of being collected.
  • Deletion is generally done in blocks: a starting point in history is chosen and the blocks that follow it are removed. Data for a single point in time, or for scattered random times, is rarely deleted on its own.
  • The data set is large and generally exceeds the memory size. Usually only a small part of it is selected, with no regular pattern, so caching has almost no effect.
  • Reads are typically sequential, in ascending or descending time order.
  • Highly concurrent reads are very common.
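
To make these access patterns concrete, here is a minimal in-memory sketch in Python. It is only a toy under stated assumptions, not how Prometheus or any real TSDB stores data: the class name TinySeries and its methods are hypothetical. It shows time-ordered appends, sequential range reads, and deletion of a whole block of old data.

```python
from bisect import bisect_left, bisect_right


class TinySeries:
    """Toy series: (timestamp, value) samples kept in time order."""

    def __init__(self):
        self.timestamps = []
        self.values = []

    def append(self, ts, value):
        # Writes are expected to arrive in time order; reject anything older.
        if self.timestamps and ts <= self.timestamps[-1]:
            raise ValueError("out-of-order sample")
        self.timestamps.append(ts)
        self.values.append(value)

    def read_range(self, start, end):
        # Sequential read of every sample with start <= ts <= end.
        lo = bisect_left(self.timestamps, start)
        hi = bisect_right(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))

    def drop_before(self, cutoff):
        # Block deletion: discard the whole prefix older than cutoff.
        cut = bisect_left(self.timestamps, cutoff)
        del self.timestamps[:cut]
        del self.values[:cut]
        return cut
```

Because samples stay sorted by time, both the range read and the block deletion reduce to a binary search followed by a slice, which is why single-point updates and deletes are the unusual case.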

Common time series databases

TSDB project    Official website
InfluxDB        https://influxdata.com/
RRDtool         http://oss.oetiker.ch/rrdtool/
Graphite        http://graphiteapp.org/
OpenTSDB        http://opentsdb.net/
Kdb+            http://kx.com/
Druid           http://druid.io/
KairosDB        http://kairosdb.github.io/
Prometheus      https://prometheus.io/

What is Prometheus?

Prometheus is an open source monitoring and alerting system and time series database (TSDB) originally developed at SoundCloud. It is written in Go and was inspired by Google's Borgmon monitoring system, so it is often described as an open source counterpart to Borgmon.

In 2016, Prometheus joined the Cloud Native Computing Foundation (CNCF), which operates under the Linux Foundation, as its second hosted project after Kubernetes. Prometheus is currently quite active in the open source community.

Compared with Heapster (a Kubernetes sub-project used to collect cluster performance data), Prometheus offers more complete and comprehensive functionality. Its performance is also sufficient to support clusters of tens of thousands of machines.

Features of Prometheus

  • Multidimensional data model.
  • Flexible query language.
  • No reliance on distributed storage; each single server node is autonomous.
  • Time series data is collected through an HTTP-based pull model.
  • Time series data can also be pushed through an intermediate gateway.
  • Targets are discovered through service discovery or static configuration.
  • Supports a variety of graphing and dashboard front ends, such as Grafana.
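
The multidimensional data model in the list above can be illustrated with a short Python sketch. A time series is identified by a metric name plus a set of key/value labels, and queries pick series by matching labels. The helper names series_id and select, and the sample data, are hypothetical and exist only for illustration.

```python
def series_id(name, labels):
    """Render a series identity as name{label="value",...} with sorted labels."""
    pairs = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{name}{{{pairs}}}"


def select(samples, name, **matchers):
    """Return samples of the given metric whose labels equal every matcher."""
    return [
        s for s in samples
        if s["name"] == name
        and all(s["labels"].get(k) == v for k, v in matchers.items())
    ]


# Hypothetical samples: one metric name, several label combinations.
SAMPLES = [
    {"name": "http_requests_total", "labels": {"method": "GET", "code": "200"}, "value": 1027.0},
    {"name": "http_requests_total", "labels": {"method": "POST", "code": "200"}, "value": 3.0},
    {"name": "http_requests_total", "labels": {"method": "GET", "code": "500"}, "value": 7.0},
]
```

Calling select(SAMPLES, "http_requests_total", method="GET") keeps the two GET series, which is roughly what a label matcher in the query language does.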

Prometheus related components

The Prometheus ecosystem consists of multiple components, some of which are optional. Most Prometheus components are written in Go, which makes these components easy to compile and deploy.

  • Prometheus Server

It is mainly responsible for data collection and storage, and provides support for the PromQL query language.

  • Client SDK

Official client libraries are provided for Go, Java, Scala, Python, and Ruby, and there are many third-party libraries that support Node.js, PHP, Erlang, and other languages.

  • Push Gateway

An intermediate gateway that lets short-lived jobs actively push their metrics.

  • PromDash

A visual dashboard, developed with Rails, for displaying metric data.

  • Exporter

Exporter is the general term for a class of Prometheus data collection components. An exporter collects data from its target and converts it into a format that Prometheus supports. Unlike traditional data collection components, it does not send data to a central server; instead, it waits for the central server to scrape it.

Prometheus provides various types of Exporters for collecting the running status of various services. Currently supported are databases, hardware, message middleware, storage systems, HTTP servers, JMX, etc.

  • Alertmanager

The alert manager, which handles alerting.

  • prometheus_cli

A command-line tool.

  • Other auxiliary tools

A variety of export tools that convert data stored by Prometheus into the formats required by tools such as HAProxy, StatsD, and Graphite.
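
As a rough illustration of the Push Gateway component described above, the following Python sketch builds the HTTP request a short-lived job might use to push one metric. It assumes the gateway accepts text-format metrics at /metrics/job/<job name>; the gateway address, job name, and metric name are hypothetical, and the request is only constructed, not sent.

```python
import urllib.request


def build_push_request(gateway, job, metrics):
    """Build (but do not send) a request that pushes metrics for one job."""
    lines = [f"{name} {value}" for name, value in sorted(metrics.items())]
    body = ("\n".join(lines) + "\n").encode("utf-8")
    return urllib.request.Request(
        url=f"{gateway}/metrics/job/{job}",
        data=body,
        method="POST",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


# Hypothetical gateway address, job name, and metric.
request = build_push_request(
    "http://localhost:9091", "nightly_backup", {"backup_duration_seconds": 42.5}
)
# urllib.request.urlopen(request) would send it to a running gateway.
```

Prometheus itself never receives this push; it scrapes the gateway on its normal schedule and picks up whatever the job left there.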

Architecture of Prometheus

The following picture illustrates the overall architecture of Prometheus and the roles of some components in the ecosystem:

The basic principle of Prometheus is to periodically scrape the status of monitored components over HTTP. Any component can be monitored as long as it provides a suitable HTTP interface; no SDK or other integration work is required. This makes Prometheus well suited to monitoring virtualized environments such as VMs, Docker, and Kubernetes. The HTTP interface that exposes a monitored component's information is called an exporter. Exporters are already available for most of the components commonly used by Internet companies, such as Varnish, HAProxy, Nginx, MySQL, and Linux system information (including disk, memory, CPU, and network).
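
To show how little is needed to expose such an HTTP interface, here is a minimal exporter sketch using only the Python standard library. It is an illustration under assumptions rather than a production exporter: the metric name demo_uptime_seconds and port 8000 are made up, and real exporters would normally use an official client library.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def render_metrics():
    """Render one gauge in the Prometheus text exposition format."""
    uptime = time.time() - START_TIME
    return (
        "# HELP demo_uptime_seconds Seconds since this process started.\n"
        "# TYPE demo_uptime_seconds gauge\n"
        f"demo_uptime_seconds {uptime:.3f}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve /metrics until interrupted; the port is an arbitrary choice."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Running serve() and adding localhost:8000 as a scrape target would let Prometheus pull this gauge on its regular schedule; the process never pushes anything by itself.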

The Prometheus workflow is roughly as follows:

  • The Prometheus daemon periodically scrapes metrics from its targets, and each target must expose an HTTP endpoint for it to scrape. Scrape targets can be specified through configuration files, text files, ZooKeeper, Consul, DNS SRV lookup, and so on. Prometheus monitors by PULL: the server pulls data directly from each target, or pulls it indirectly from an intermediate gateway to which the data has been pushed.

  • Prometheus stores all scraped data locally, cleans and aggregates it according to configured rules, and records the results as new time series.

  • Prometheus exposes the collected data through PromQL and other APIs for visualization. Charts can be produced in several ways, for example with Grafana, with Prometheus's own PromDash, or with its built-in template engine. Prometheus also provides an HTTP query API so that you can build whatever custom output you need.

  • PushGateway lets clients actively push metrics to it, while Prometheus simply scrapes the gateway periodically.

  • Alertmanager is a component independent of Prometheus. It works with alerts defined as Prometheus query expressions and provides very flexible alerting methods.
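
The HTTP query API mentioned in the workflow above can be sketched in Python as follows. The example assumes the v1 instant-query endpoint /api/v1/query and its JSON response shape; the function names and the sample response body are illustrative, not captured from a real server.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, expr):
    """Build the URL for an instant query against the v1 HTTP API."""
    params = urlencode({"query": expr})
    return f"{base_url}/api/v1/query?{params}"


def parse_instant_vector(payload):
    """Extract (labels, value) pairs from an instant-vector JSON response."""
    doc = json.loads(payload)
    if doc.get("status") != "success":
        raise RuntimeError(f"query failed: {doc.get('error')}")
    return [(item["metric"], float(item["value"][1])) for item in doc["data"]["result"]]


# Illustrative response body shaped like an instant-vector result.
SAMPLE_RESPONSE = """
{"status": "success",
 "data": {"resultType": "vector",
          "result": [{"metric": {"__name__": "up", "job": "prometheus",
                                 "instance": "localhost:9090"},
                      "value": [1495422816.0, "1"]}]}}
"""
```

Fetching instant_query_url("http://localhost:9090", "up") with any HTTP client and passing the body to parse_instant_vector would give one (labels, value) pair per matching series.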

Scenarios for Prometheus

Prometheus is very good at recording purely numeric time series. It is suitable both for monitoring hardware-level metrics such as those of servers and for monitoring highly dynamic service-oriented architectures. For the now-popular microservices, its multi-dimensional data collection and its query language for filtering data are also very powerful. Prometheus is designed for reliability, allowing you to quickly locate and diagnose problems when a service fails. Setting it up has no strong dependencies on particular hardware or services.

Scenarios where Prometheus does not apply

The value of Prometheus lies in its reliability: even in a very harsh environment, you can access it at any time and view statistics on the various metrics of your systems and services. It is not a good fit if you need 100% accurate statistics; for example, it is not suitable for a real-time billing system.

Prometheus official website: https://prometheus.io/

Install Prometheus

Prometheus officially provides multiple deployment solutions, such as: Docker container, Ansible, Chef, Puppet, Saltstack, etc.

Prometheus is implemented in Go, so it is naturally portable (it supports Linux, Windows, macOS, and FreeBSD). Here we deploy it directly from the precompiled binaries, which work out of the box.

  • Prometheus installation

Here is an example of a Linux system:

$ wget https://github.com/prometheus/prometheus/releases/download/v1.6.3/prometheus-1.6.3.linux-amd64.tar.gz
$ tar xzvf prometheus-1.6.3.linux-amd64.tar.gz
$ mv prometheus-1.6.3.linux-amd64 /usr/local/prometheus

Other system versions can be downloaded here: https://prometheus.io/download/

  • Verify installation
$ cd /usr/local/prometheus
$ ./prometheus --version
prometheus, version 1.6.3 (branch: master, revision: c580b60c67f2c5f6b638c3322161bcdf6d68d7fc)
build user: root@e54b06e0b22f
build date: 20170519-08:00:43
go version: go1.8.1
  • Configure Prometheus

In the prometheus directory there is a main configuration file called prometheus.yml. It contains most of the standard configuration as well as the configuration Prometheus uses to monitor itself. The default configuration file is as follows:

$ cat /usr/local/prometheus/prometheus.yml

# Global configuration
global:
  scrape_interval: 15s # Default scrape interval: scrape targets every 15 seconds.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to every time series when communicating with external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# Configuration of the targets to scrape.
# Here Prometheus is configured to scrape itself.
scrape_configs:
  # The job name is added as a label job=<job_name> to every time series scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    # Override the global scrape interval of 15 seconds with 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090']
  • Create a user

It is good practice to create a dedicated user for running Prometheus rather than running it as root. The user's home directory, /var/lib/prometheus, is used as the Prometheus data directory.

$ groupadd prometheus
$ useradd -g prometheus -m -d /var/lib/prometheus -s /sbin/nologin prometheus
  • Create Systemd service
$ vim /etc/systemd/system/prometheus.service

[Unit]
Description=prometheus
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/prometheus/prometheus -config.file=/usr/local/prometheus/prometheus.yml -storage.local.path=/var/lib/prometheus
Restart=on-failure
[Install]
WantedBy=multi-user.target
  • Start Prometheus
$ systemctl start prometheus
  • Verify that Prometheus started successfully
$ systemctl status prometheus
● prometheus.service - prometheus
Loaded: loaded (/etc/systemd/system/prometheus.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2017-05-22 11:13:36 CST; 18s ago
Main PID: 9175 (prometheus)
Tasks: 9
Memory: 15.8M
CPU: 207ms
CGroup: /system.slice/prometheus.service
└─9175 /usr/local/prometheus/prometheus -config.file=/usr/local/prometheus/prometheus.yml -storage.local.path=/var/lib/prometheus
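  • Access the built-in Web UI

Prometheus ships with a fairly simple Web UI where you can view expression query results, alerting configuration, the Prometheus configuration, exporter status, and so on. By default the built-in Web UI is available at http://ip:9090

Prometheus also exposes its own metrics in the same way an exporter does. By requesting http://ip:9090/metrics we can see exactly which data can be scraped from it.

Here we take Prometheus's own data as an example to briefly demonstrate querying an expression in the Web UI and displaying the result as a graph.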
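
The /metrics page is plain text, so it is easy to inspect programmatically. The following Python sketch parses a small excerpt in the text exposition format. It is a simplified reader that assumes no trailing timestamps on sample lines; the function name and the excerpt values are illustrative (demo_requests_total is a made-up metric).

```python
def parse_exposition(text):
    """Map each series (name plus labels) to its float value.

    Assumes sample lines carry no trailing timestamp, which the format
    allows but does not require.
    """
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        samples[series.strip()] = float(value)
    return samples


# Illustrative excerpt; real output is much longer and the values differ.
SAMPLE = """\
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 42
# TYPE demo_requests_total counter
demo_requests_total{handler="query",code="200"} 17
"""
```

Feeding the body of http://ip:9090/metrics to parse_exposition would produce one dictionary entry per exposed series.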

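Monitoring a server with Prometheus

Above we used Prometheus's own data to briefly demonstrate querying monitoring data. Here we use the example of monitoring a server's status to illustrate it more intuitively.

To monitor the server's CPU, memory, disk, I/O, and similar information, we first need to install node_exporter, whose job is to collect machine-level system data.

  • Install node_exporter

node_exporter is also implemented in Go. We deploy it directly from the precompiled binary, which works out of the box.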

$ wget https://github.com/prometheus/node_exporter/releases/download/v0.14.0/node_exporter-0.14.0.linux-amd64.tar.gz
$ tar -zxvf node_exporter-0.14.0.linux-amd64.tar.gz
$ mv node_exporter-0.14.0.linux-amd64 /usr/local/prometheus/node_exporter
  • Create Systemd service
$ vim /etc/systemd/system/node_exporter.service

[Unit]
Description=node_exporter
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/prometheus/node_exporter/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
  • Start Node exporter
$ systemctl start node_exporter
  • Verify that Node exporter started successfully
$ systemctl status node_exporter
● node_exporter.service - node_exporter
Loaded: loaded (/etc/systemd/system/node_exporter.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2017-05-22 12:13:43 CST; 6s ago
Main PID: 11776 (node_exporter)
Tasks: 4
Memory: 1.5M
CPU: 24ms
CGroup: /system.slice/node_exporter.service
└─11776 /usr/local/prometheus/node_exporter/node_exporter
  • Edit prometheus.yml and add the following scrape target:

By default, Node Exporter is scraped at http://IP:9100/metrics

$ vim /usr/local/prometheus/prometheus.yml

  - job_name: 'linux'
    static_configs:
      - targets: ['localhost:9100']
        labels:
          instance: node1

prometheus.yml now defines two scrape jobs: one monitors the Prometheus service itself and the other monitors the Linux server. Here is a complete example:

scrape_configs:

  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'linux'
    static_configs:
      - targets: ['localhost:9100']
        labels:
          instance: node1
  • Restart Prometheus
$ systemctl restart prometheus
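  • View the monitored targets in the Prometheus Web UI

Open the Prometheus Web UI. On the Status -> Targets page we can see the two targets we configured, both with State UP.

Use the Prometheus Web UI to verify that the Node Exporter data is being collected correctly.

a) View the CPU usage of the current host

b) View the CPU load of the current host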
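
The original screenshots for these two queries are not reproduced here. As a hedged sketch, the Python snippet below records two expressions of the kind that could be entered for them and works through the arithmetic behind the CPU one. The metric names node_cpu and node_load1 are assumptions based on node_exporter releases of this era, and the counter samples are invented.

```python
# Expressions of the kind that could be typed into the expression box
# (metric names are assumptions, see the note above).
CPU_USAGE_EXPR = '100 - avg(irate(node_cpu{mode="idle"}[5m])) * 100'
LOAD_EXPR = "node_load1"


def per_second_rate(samples):
    """Per-second increase between the first and last (timestamp, value) sample."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def cpu_usage_percent(idle_by_cpu):
    """Average busy percentage across CPUs from idle-time counter samples."""
    idle_rates = [per_second_rate(s) for s in idle_by_cpu.values()]
    return 100.0 * (1.0 - sum(idle_rates) / len(idle_rates))


# Hypothetical idle-time counters (seconds) sampled 60 seconds apart.
IDLE = {
    "cpu0": [(0, 1000.0), (60, 1054.0)],
    "cpu1": [(0, 2000.0), (60, 2048.0)],
}
```

With these invented samples the idle rates are 0.9 and 0.8 seconds per second, so cpu_usage_percent(IDLE) comes out at about 15 percent. With only two samples per series, the first-to-last rate used here coincides with a rate over the last two samples.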

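The graphs built into the Prometheus Web UI are very basic and are best suited to testing. To build a powerful dashboard you need a more specialized tool. Next we will use Grafana to visualize the data collected by Prometheus.

Adding a powerful dashboard to Prometheus

Grafana is an open source program for visualizing large volumes of measurement data. It provides a powerful and elegant way to create, share, and explore data. A dashboard displays data from your various metric data sources.

Grafana is most commonly used for Internet infrastructure and application analytics, but it is also used in other fields such as industrial sensors, home automation, and process control. Grafana supports hot-pluggable panels and extensible data sources; it currently supports Graphite, InfluxDB, OpenTSDB, Elasticsearch, Prometheus, and others.

  • Install Grafana

The package repositories carry the older 2.6 release, which also needs a separate patch before it can use Prometheus as a data source. Here we download the 4.2 package directly and install it.

Taking Ubuntu as an example: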

$ wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana_4.2.0_amd64.deb
$ dpkg -i grafana_4.2.0_amd64.deb

Packages for other systems can be downloaded here: https://grafana.com/grafana/download

  • Start Grafana
$ systemctl start grafana-server
  • Check that Grafana started successfully
$ systemctl status grafana-server
● grafana-server.service - Grafana instance
Loaded: loaded (/usr/lib/systemd/system/grafana-server.service; masked; vendor preset: enabled)
Active: active (running) since Mon 2017-05-22 14:57:29 CST; 49min ago
Docs: http://docs.grafana.org
Main PID: 21735 (grafana-server)
CGroup: /system.slice/grafana-server.service
└─21735 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile= cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var/lib/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins
  • Access Grafana

Open the Grafana Web UI at http://ip:3000 (the default username/password is admin/admin).

  • Add a Prometheus data source in Grafana
Name: Prometheus
Type: Prometheus
Url: http://localhost:9090/
Access: proxy
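
The same data source can probably also be registered without the Web form. The Python sketch below builds a request for Grafana's HTTP API, assuming the /api/datasources endpoint with basic authentication; the function name is hypothetical, the field values mirror the settings above, and the request is only constructed, not sent.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, user, password):
    """Build (but do not send) a request that registers Prometheus as a data source."""
    payload = {
        "name": "Prometheus",
        "type": "prometheus",
        "url": "http://localhost:9090/",
        "access": "proxy",
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "Authorization": f"Basic {token}"},
    )
```

Passing the result of build_datasource_request("http://localhost:3000", "admin", "admin") to urllib.request.urlopen would submit it to a running Grafana instance; check the API documentation of your Grafana version before relying on this.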

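On the Dashboards page, import the bundled Prometheus Status template.

  • Import the Node Exporter Server Metrics template

Go to https://grafana.com/dashboards/405 and download the JSON file for the Node Exporter Server Metrics template.

In Grafana, import this file under Dashboards and select Prometheus as the data source.

  • View the dashboards

Under Dashboards, select the Node Exporter Server Metrics template to see CPU, memory, disk, and other statistics for the monitored server.

You can also drill down into any individual metric.

Under Dashboards, select the Prometheus Status template to view the metrics of Prometheus itself.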

Reference article: https://www.hi-linux.com/posts/25047.html
