Install Prometheus (it monitors itself by default)
1. Download the release tarball
https://prometheus.io/download/
2. Extract it
tar -xvf prometheus-2.33.1.linux-amd64.tar.gz
3. Create the installation directory
mkdir -p /usr/local/prometheus
4. Move the extracted files into it
mv prometheus-2.33.1.linux-amd64/* /usr/local/prometheus/
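Steps 1 to 4 can also be scripted. The sketch below is one possible way to do it, not the only one: the helper names are hypothetical, and the download URL pattern is an assumption based on the project's GitHub releases page, so check it against the download page before relying on it.

```shell
#!/bin/sh
# Sketch: fetch and unpack a Prometheus release non-interactively.
# prom_release_url and install_prometheus are hypothetical helper names.

prom_release_url() {
  # $1 = version without the leading "v", $2 = platform such as linux-amd64
  version="$1"
  platform="$2"
  echo "https://github.com/prometheus/prometheus/releases/download/v${version}/prometheus-${version}.${platform}.tar.gz"
}

install_prometheus() {
  # Downloads into the current directory, then unpacks into $3
  # (default: /usr/local/prometheus, the directory used in this guide).
  url="$(prom_release_url "$1" "$2")"
  dest="${3:-/usr/local/prometheus}"
  wget -q "$url" -O prometheus.tar.gz || return 1
  mkdir -p "$dest" || return 1
  # --strip-components=1 drops the top-level prometheus-<version>.<platform>/ directory
  tar -xzf prometheus.tar.gz -C "$dest" --strip-components=1
}

# Example (not run automatically):
# install_prometheus 2.33.1 linux-amd64
```

Keeping the URL builder separate from the download makes it easy to print and inspect the URL first.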
5. Monitor the local machine by default (prometheus.yml)
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    scrape_interval: 5s
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]

  # node_exporter job; the exporter itself is installed in a later section.
  - job_name: 'node'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:8085']
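Because YAML indentation errors are easy to introduce in this file, it can help to validate it before starting the server. The release tarball ships a promtool binary whose "check config" subcommand does this. The wrapper below is a small sketch; the PROMTOOL variable and the check_config name are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: validate a Prometheus config before (re)starting the server.
# PROMTOOL is an assumed variable pointing at the promtool binary from the tarball.
PROMTOOL="${PROMTOOL:-/usr/local/prometheus/promtool}"

check_config() {
  # $1 = path to the configuration file
  config_file="$1"
  if "$PROMTOOL" check config "$config_file"; then
    echo "config OK: $config_file"
  else
    echo "config invalid: $config_file" >&2
    return 1
  fi
}

# Example (not run automatically):
# check_config /usr/local/prometheus/prometheus.yml
```

The function returns a non-zero status on failure, so it can gate a restart, for example `check_config prometheus.yml && systemctl restart prometheus`.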
6. Start the server
./prometheus --config.file=prometheus.yml
Startup log
ts=2022-02-11T15:28:43.910Z caller=main.go:475 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2022-02-11T15:28:43.911Z caller=main.go:512 level=info msg="Starting Prometheus" version="(version=2.33.1, branch=HEAD, revision=4e08110891fd5177f9174c4179bc38d789985a13)"
ts=2022-02-11T15:28:43.911Z caller=main.go:517 level=info build_context="(go=go1.17.6, user=root@37fc1ebac798, date=20220202-15:23:18)"
ts=2022-02-11T15:28:43.911Z caller=main.go:518 level=info host_details="(Linux 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 06:57:35 UTC 2020 x86_64 jessy (none))"
ts=2022-02-11T15:28:43.911Z caller=main.go:519 level=info fd_limits="(soft=65535, hard=65535)"
ts=2022-02-11T15:28:43.911Z caller=main.go:520 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2022-02-11T15:28:43.913Z caller=web.go:570 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
ts=2022-02-11T15:28:43.913Z caller=main.go:923 level=info msg="Starting TSDB ..."
ts=2022-02-11T15:28:43.920Z caller=head.go:493 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
ts=2022-02-11T15:28:43.920Z caller=head.go:527 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=3.474µs
ts=2022-02-11T15:28:43.920Z caller=head.go:533 level=info component=tsdb msg="Replaying WAL, this may take a while"
ts=2022-02-11T15:28:43.922Z caller=tls_config.go:195 level=info component=web msg="TLS is disabled." http2=false
ts=2022-02-11T15:28:43.922Z caller=head.go:604 level=info component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
ts=2022-02-11T15:28:43.923Z caller=head.go:610 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=36.242µs wal_replay_duration=2.130318ms total_replay_duration=2.305903ms
ts=2022-02-11T15:28:43.926Z caller=main.go:944 level=info fs_type=EXT4_SUPER_MAGIC
ts=2022-02-11T15:28:43.926Z caller=main.go:947 level=info msg="TSDB started"
ts=2022-02-11T15:28:43.926Z caller=main.go:1128 level=info msg="Loading configuration file" filename=prometheus.yml
ts=2022-02-11T15:28:43.927Z caller=main.go:1165 level=info msg="Completed loading of configuration file" filename=prometheus.yml totalDuration=793.597µs db_storage=912ns remote_storage=1.85µs web_handler=434ns query_engine=1.057µs scrape=284.374µs scrape_sd=41.261µs notify=37.028µs notify_sd=15.439µs rules=1.847µs
ts=2022-02-11T15:28:43.927Z caller=main.go:896 level=info msg="Server is ready to receive web requests."
7. Check the targets page
http://ip:9090/targets
- Labels: the label set that identifies each monitored target
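The same target information is available as JSON from the HTTP API at /api/v1/targets, which is handy on a machine without a browser. The sketch below counts healthy targets with plain grep. It assumes the server emits compact JSON (no spaces around the colon); a JSON-aware tool such as jq would be more robust. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: count targets reported as healthy in a /api/v1/targets response read from stdin.
# Assumes compact JSON, i.e. the literal text "health":"up" with no spaces.
count_up_targets() {
  grep -o '"health":"up"' | wc -l | tr -d ' '
}

# Example against a running server (replace "ip" as in the URL above):
# curl -s http://ip:9090/api/v1/targets | count_up_targets
```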
8. View the metrics the Prometheus server exposes about itself
http://ip:9090/metrics
# HELP go_gc_cycles_automatic_gc_cycles_total Count of completed GC cycles generated by the Go runtime.
# TYPE go_gc_cycles_automatic_gc_cycles_total counter
go_gc_cycles_automatic_gc_cycles_total 7
# HELP go_gc_cycles_forced_gc_cycles_total Count of completed GC cycles forced by the application.
# TYPE go_gc_cycles_forced_gc_cycles_total counter
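The output above is plain text: "# HELP" and "# TYPE" comment lines followed by "name value" samples. A minimal sketch for pulling one unlabelled metric out of that text with awk follows; metric_value is a hypothetical helper name, and metrics that carry labels would need a different match.

```shell
#!/bin/sh
# Sketch: print the value of an unlabelled metric from the text exposition format on stdin.
# HELP/TYPE lines never match because their first field is "#".
metric_value() {
  # $1 = exact metric name
  awk -v name="$1" '$1 == name { print $2 }'
}

# Example against a running server (replace "ip" as in the URL above):
# curl -s http://ip:9090/metrics | metric_value go_gc_cycles_automatic_gc_cycles_total
```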
9. Run Prometheus as a systemd service
sudo vim /etc/systemd/system/prometheus.service
[Unit]
Description=prometheus
After=network-online.target
[Service]
User=root
Restart=on-failure
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/usr/local/prometheus/data
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
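10. Enable and start the service
systemctl daemon-reload      # run again after every change to prometheus.service so systemd reloads it
systemctl enable prometheus  # start automatically at boot
systemctl start prometheus
systemctl status prometheus  # "active (running)" means the service started successfully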
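Besides reading the systemctl status output, the server can be probed over HTTP: Prometheus serves a /-/ready endpoint that answers with success once it can handle requests. The polling loop below is a sketch under the assumption that curl is installed; the function name and its defaults are made up for illustration.

```shell
#!/bin/sh
# Sketch: poll the readiness endpoint until it answers or the attempts run out.
wait_until_ready() {
  # $1 = readiness URL (assumed default: the local server from this guide)
  # $2 = maximum number of attempts, one per second
  url="${1:-http://localhost:9090/-/ready}"
  attempts="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP error codes, -s keeps it quiet
    if curl -sf "$url" >/dev/null; then
      echo "ready"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "not ready after $attempts attempts" >&2
  return 1
}

# Example (not run automatically):
# systemctl start prometheus && wait_until_ready
```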
Install node_exporter to monitor the host machine's metrics
1. Download and extract the exporter
https://prometheus.io/download/#node_exporter
tar -xvf node_exporter-1.3.1.linux-amd64.tar.gz
2. Start the exporter
./node_exporter --web.listen-address localhost:8085
3. Add the exporter to the Prometheus configuration (under scrape_configs)
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:8085']
4. Start it automatically at boot
Unit file (saved as /etc/systemd/system/node_export.service, the unit name the commands below use)
[Unit]
Description=node_exporter
Documentation=https://github.com/prometheus/node_exporter
After=network.target
[Service]
ExecStart=/usr/local/prometheus/export/node_exporter-1.3.1.linux-amd64/node_exporter --web.listen-address localhost:8085
[Install]
WantedBy=multi-user.target
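Start the service
With the unit file in place, manage the service as follows:
systemctl start node_export       # start the service
systemctl stop node_export        # stop the service
systemctl restart node_export     # restart the service
systemctl status node_export      # show the service status
systemctl enable node_export      # start the service at boot
systemctl disable node_export     # do not start the service at boot
systemctl is-enabled node_export  # check whether the service starts at boot
systemctl list-unit-files | grep enabled  # list the services enabled at boot
systemctl --failed                # list the units that failed to start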
Testing
Check the machine's active memory (metric names are case-sensitive)
node_memory_Active_bytes
Check the machine's 1-minute load average
node_load1
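These expressions can also be run from the command line through the HTTP API's /api/v1/query endpoint instead of the web UI. The sketch below assumes curl is installed; PROM_URL, promql and bytes_to_gib are hypothetical names, and the byte conversion is only there to make memory values easier to read.

```shell
#!/bin/sh
# Sketch: run an instant PromQL query and print the raw JSON response.
# PROM_URL is an assumed variable; it defaults to the local server from this guide.
PROM_URL="${PROM_URL:-http://localhost:9090}"

promql() {
  # $1 = PromQL expression; -G sends the url-encoded query as a GET parameter
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Convert a byte count to GiB with two decimals, for reading memory metrics.
bytes_to_gib() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# Examples (not run automatically):
# promql 'node_load1'
# bytes_to_gib 8589934592
```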
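Start a Spring Boot project
Start the application
java -jar demo-0.0.1-SNAPSHOT.jar &
Add the application to the Prometheus configuration (under scrape_configs). The /actuator/prometheus path assumes the application includes Spring Boot Actuator with the Micrometer Prometheus registry and exposes the prometheus endpoint.
  # Spring Boot application
  - job_name: 'springbootPrometheusGrafana'
    scrape_interval: 5s
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['192.168.8.1:8080']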
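Install Grafana
1. Install Grafana following the instructions for your operating system
https://grafana.com/docs/grafana/latest/installation/
- Reload systemd units: sudo systemctl daemon-reload
- Start: sudo systemctl start grafana-server
- Stop: sudo systemctl stop grafana-server
- Status: sudo systemctl status grafana-server
Download a suitable dashboard
https://grafana.com/grafana/dashboards/
Configure Prometheus as the Grafana data source (http://ip:9090/)
Spring Boot monitoring
System monitoring
Alerting configuration
Configure the mail service (the [smtp] section of the Grafana configuration file)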
[smtp]
enabled = true
host = smtp.qq.com:465
user = [email protected]
password = yyzplvajtlgmbedc
cert_file =
key_file =
skip_verify = false
from_address = [email protected]
from_name = Grafana
ehlo_identity =

[emails]
welcome_email_on_sign_up = false
templates_pattern = emails/*.html