cAdvisor + Prometheus + Grafana: build a Docker container monitoring platform in 10 minutes
cAdvisor (Container Advisor)
cAdvisor is Google's open source container monitoring tool for tracking the resource usage and performance of containers. It collects, aggregates, processes, and exports information about running containers. Specifically, for each container it records resource isolation parameters, historical resource usage, histograms of the complete historical resource usage, and network statistics. cAdvisor natively supports Docker containers and also tries to support other container types as far as possible, aiming to be compatible with all of them.
As the introduction above shows, cAdvisor is used to monitor the container engine. Because its monitoring is so practical, Kubernetes has integrated it into the Kubelet. For cloud-native clusters, therefore, you can simply use the metrics endpoint provided by the Kubelet component.
cAdvisor deployment
1. Use the following command to install and start the cAdvisor component:
docker run \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:ro \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--volume=/dev/disk/:/dev/disk:ro \
--publish=8080:8080 \
--detach=true \
--name=cadvisor \
--privileged \
--device=/dev/kmsg \
google/cadvisor:latest
2. The cAdvisor component is now running. Open http://<your-IP-address>:8080 in a browser to access the cAdvisor Web UI.
3. With multiple hosts, it is obviously inconvenient to run cAdvisor on every node and then view the monitoring information through each node's own Web UI. In addition, cAdvisor keeps only 2 minutes of monitoring data by default. The good news is that cAdvisor has built-in support for Prometheus. Metrics in the standard Prometheus sample format can be obtained by visiting:
http://<your-IP-address>:8080/metrics
4. The following table lists some typical metrics exposed by cAdvisor:
Metric name | Type | Meaning |
---|---|---|
container_cpu_load_average_10s | gauge | Average CPU load of the container over the past 10 seconds |
container_cpu_usage_seconds_total | counter | Cumulative CPU time consumed by the container on each CPU core (unit: seconds) |
container_cpu_system_seconds_total | counter | Cumulative system CPU time consumed (unit: seconds) |
container_cpu_user_seconds_total | counter | Cumulative user CPU time consumed (unit: seconds) |
container_fs_usage_bytes | gauge | File system space used by the container (unit: bytes) |
container_fs_limit_bytes | gauge | Total file system space available to the container (unit: bytes) |
container_fs_reads_bytes_total | counter | Cumulative amount of data read by the container (unit: bytes) |
container_fs_writes_bytes_total | counter | Cumulative amount of data written by the container (unit: bytes) |
container_memory_max_usage_bytes | gauge | Maximum memory usage of the container (unit: bytes) |
container_memory_usage_bytes | gauge | Current memory usage of the container (unit: bytes) |
container_spec_memory_limit_bytes | gauge | Memory limit of the container (unit: bytes) |
machine_memory_bytes | gauge | Total memory of the current host (unit: bytes) |
container_network_receive_bytes_total | counter | Cumulative amount of data received over the container network (unit: bytes) |
container_network_transmit_bytes_total | counter | Cumulative amount of data transmitted over the container network (unit: bytes) |
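The /metrics endpoint from step 3 returns plain text, one sample per line, with `# TYPE` comments that carry the gauge/counter type listed in the table. As a rough illustration of what Prometheus reads from cAdvisor, the sketch below parses a few such lines in Python. The sample text, its label values, and the helper name `parse_sample` are invented for illustration. Real cAdvisor output has many more labels and may include timestamps, and label values containing escaped quotes would need a proper parser.

```python
import re

# Hypothetical excerpt of /metrics output; the values are invented.
SAMPLE = '''\
# HELP container_memory_usage_bytes Current memory usage in bytes.
# TYPE container_memory_usage_bytes gauge
container_memory_usage_bytes{id="/docker/abc",name="cadvisor"} 52428800
# TYPE machine_memory_bytes gauge
machine_memory_bytes 8589934592
'''

# metric name, optional {labels}, then the sample value
LINE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_sample(line):
    """Return (name, labels, value) for a sample line, or None for comments and blank lines."""
    line = line.strip()
    if not line or line.startswith('#'):
        return None
    match = LINE_RE.match(line)
    if match is None:
        return None
    name, raw_labels, value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ''))
    return name, labels, float(value)


samples = [s for s in map(parse_sample, SAMPLE.splitlines()) if s is not None]
for name, labels, value in samples:
    print(name, labels, value)
```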
Prometheus deployment
1. Create a directory on the host for the Prometheus data so that the data is not lost when the container restarts:
mkdir -p /disk/docker-monitor/prometheus/data
chmod 777 /disk/docker-monitor/prometheus/data
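2. Keep the Prometheus configuration file outside the container so that it is easy to modify. Create it with vi /disk/docker-monitor/prometheus/conf/prometheus.yml (the conf directory is the one mounted into the container in step 3; create it first with mkdir -p if it does not exist):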
global:
  scrape_interval: 15s
  evaluation_interval: 15s
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
rule_files:
  - rule/record/*.yml
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "cadvisor"
    static_configs:
      - targets: ["124.222.45.207:8080"]
job_name: "prometheus" scrapes Prometheus's own metrics;
job_name: "cadvisor" scrapes the metrics of the cAdvisor component deployed earlier.
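When several hosts each run cAdvisor (the multi-host case mentioned in the cAdvisor section), every host can be listed under the targets of the cadvisor job. The following is a minimal sketch that renders such a job entry as text to place under scrape_configs. The helper name `render_cadvisor_job` and the host addresses are hypothetical, and port 8080 is assumed because that is the port published by the cAdvisor container above.

```python
def render_cadvisor_job(hosts, port=8080):
    """Render a scrape job entry (for scrape_configs) with one cAdvisor target per host."""
    targets = ", ".join(f'"{host}:{port}"' for host in hosts)
    return (
        '  - job_name: "cadvisor"\n'
        '    static_configs:\n'
        f'      - targets: [{targets}]\n'
    )


# Hypothetical host addresses (documentation-range IPs).
print(render_cadvisor_job(["192.0.2.10", "192.0.2.11"]))
```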
3. Deploy Prometheus with Docker:
docker run -d -p 9090:9090 --name prometheus \
-v /disk/docker-monitor/prometheus/conf:/opt/bitnami/prometheus/conf \
-v /disk/docker-monitor/prometheus/data:/opt/bitnami/prometheus/data \
bitnami/prometheus:2.42.0 \
--web.enable-lifecycle --web.enable-admin-api \
--config.file=/opt/bitnami/prometheus/conf/prometheus.yml \
--storage.tsdb.path=/opt/bitnami/prometheus/data
--web.enable-lifecycle and --web.enable-admin-api expose HTTP endpoints for managing Prometheus. For example, the configuration can be hot-reloaded with: curl -XPOST http://localhost:9090/-/reload
Note: The Prometheus configuration file and storage directory are mounted from the host here to avoid data loss after the container restarts.
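The reload call above can also be issued from a script. Here is a small sketch using only the Python standard library. It assumes Prometheus is reachable at http://localhost:9090 and was started with --web.enable-lifecycle; the helper names are illustrative.

```python
import urllib.request


def build_reload_request(base_url="http://localhost:9090"):
    """Build (without sending) the POST request that asks Prometheus to reload its configuration."""
    return urllib.request.Request(base_url.rstrip("/") + "/-/reload", data=b"", method="POST")


def reload_prometheus(base_url="http://localhost:9090", timeout=5):
    """Send the reload request and return the HTTP status code; raises on HTTP or connection errors."""
    with urllib.request.urlopen(build_reload_request(base_url), timeout=timeout) as resp:
        return resp.status


# Against a live Prometheus instance:
# reload_prometheus()
```

Splitting the request construction from the sending makes it possible to inspect the URL and method without a running server.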
4. After Prometheus has started, open it in a browser on port 9090 (the port published above). On the Status -> Targets page, the two configured scrape jobs are listed; if their State shows a green UP, the integration is working.
5. Once the cAdvisor sample data is being collected normally, the CPU usage of each container can be calculated with the following expression:
sum(irate(container_cpu_usage_seconds_total{image!=""}[1m])) without (cpu)
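To make the expression concrete: container_cpu_usage_seconds_total is a counter of CPU seconds, so its per-second rate of increase is the number of CPU cores in use (0.5 means half a core). irate derives that rate from the last two samples inside the [1m] window. The sketch below reproduces the arithmetic on made-up samples. The function and the sample values are illustrative, and it is a simplification of what Prometheus does internally.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp_seconds, counter_value) samples.

    Returns None when fewer than two samples are available. If the counter
    went backwards (a reset, e.g. after a container restart), the last value
    is taken as the increase.
    """
    if len(samples) < 2:
        return None
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    if t_last <= t_prev:
        return None
    increase = v_last if v_last < v_prev else v_last - v_prev
    return increase / (t_last - t_prev)


# Invented samples scraped every 15 s: cumulative CPU seconds of one container.
window = [(0, 100.0), (15, 103.0), (30, 104.5), (45, 112.0)]
cpu_cores = irate(window)
print(f"{cpu_cores * 100:.1f}% of one core")  # prints "50.0% of one core"
```

Only the last two samples matter here: (112.0 - 104.5) / (45 - 30) = 0.5. This is why irate reacts quickly to spikes but is noisier than an average over the whole window.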
Grafana deployment
1. Deploy Grafana:
docker run -d --name=grafana -p 3000:3000 -v grafana:/var/lib/grafana grafana/grafana
The /var/lib/grafana path is mounted on a volume. This directory stores Grafana's plug-ins and data, so mounting it avoids data loss when the Docker container is restarted.
2. Visit http://<your-IP-address>:3000/login and log in with the account admin/admin.
3. Create a data source of type Prometheus, pointing to the Prometheus instance just deployed.
4. Import a Docker container monitoring dashboard; dashboard 11277 is used here.
5. The dashboard now shows the running status of the Docker containers. As shown in the figure, 4 containers are currently running, the total memory usage is 319MB, and the CPU usage is about 1.84%. The CPU usage, memory usage, network IO, disk IO, and so on of each container are displayed as curves.
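Creating the Prometheus data source (step 3 above) can also be scripted against Grafana's HTTP API instead of the web UI. The sketch below only builds the request and does not send it. It assumes Grafana at http://localhost:3000 with the default admin/admin login; the Prometheus address, the data source name, and the helper name are hypothetical. Because Grafana runs in its own container, the Prometheus URL should be an address reachable from that container, such as the host IP, rather than localhost.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, prometheus_url, user="admin", password="admin"):
    """Build (without sending) a POST /api/datasources request that registers Prometheus in Grafana."""
    payload = {
        "name": "Prometheus",   # hypothetical display name
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",      # Grafana's backend queries Prometheus on the browser's behalf
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "Authorization": f"Basic {token}"},
        method="POST",
    )


# Hypothetical Prometheus address (documentation-range IP).
req = build_datasource_request("http://localhost:3000", "http://192.0.2.10:9090")
print(req.get_method(), req.full_url)
# To send it against a live Grafana: urllib.request.urlopen(req, timeout=5)
```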