Prometheus: Hands-On Learning Notes

Installing Prometheus (by default it monitors itself)

1. Download the release package

https://prometheus.io/download/

2. Extract the archive

tar -xvf prometheus-2.33.1.linux-amd64.tar.gz

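3. Choose the installation directory

The steps below install to /usr/local/prometheus. Do not create that directory beforehand: step 4 renames the extracted directory to this path, and if the directory already existed, mv would nest the release inside it.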

4. Move the release to /usr/local/prometheus

sudo mv prometheus-2.33.1.linux-amd64 /usr/local/prometheus

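5. Edit prometheus.yml (by default Prometheus scrapes itself; the 'node' job is for the node_exporter installed later)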

global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"   
        
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself. 
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    scrape_interval: 5s
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs: 
      - targets: ["localhost:9090"]
  - job_name: 'node'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:8085']


6. Start the server

./prometheus --config.file=prometheus.yml

Startup log:
 ./prometheus --config.file=prometheus.yml
ts=2022-02-11T15:28:43.910Z caller=main.go:475 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2022-02-11T15:28:43.911Z caller=main.go:512 level=info msg="Starting Prometheus" version="(version=2.33.1, branch=HEAD, revision=4e08110891fd5177f9174c4179bc38d789985a13)"
ts=2022-02-11T15:28:43.911Z caller=main.go:517 level=info build_context="(go=go1.17.6, user=root@37fc1ebac798, date=20220202-15:23:18)"
ts=2022-02-11T15:28:43.911Z caller=main.go:518 level=info host_details="(Linux 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 06:57:35 UTC 2020 x86_64 jessy (none))"
ts=2022-02-11T15:28:43.911Z caller=main.go:519 level=info fd_limits="(soft=65535, hard=65535)"
ts=2022-02-11T15:28:43.911Z caller=main.go:520 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2022-02-11T15:28:43.913Z caller=web.go:570 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
ts=2022-02-11T15:28:43.913Z caller=main.go:923 level=info msg="Starting TSDB ..."
ts=2022-02-11T15:28:43.920Z caller=head.go:493 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
ts=2022-02-11T15:28:43.920Z caller=head.go:527 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=3.474µs
ts=2022-02-11T15:28:43.920Z caller=head.go:533 level=info component=tsdb msg="Replaying WAL, this may take a while"
ts=2022-02-11T15:28:43.922Z caller=tls_config.go:195 level=info component=web msg="TLS is disabled." http2=false
ts=2022-02-11T15:28:43.922Z caller=head.go:604 level=info component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
ts=2022-02-11T15:28:43.923Z caller=head.go:610 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=36.242µs wal_replay_duration=2.130318ms total_replay_duration=2.305903ms
ts=2022-02-11T15:28:43.926Z caller=main.go:944 level=info fs_type=EXT4_SUPER_MAGIC
ts=2022-02-11T15:28:43.926Z caller=main.go:947 level=info msg="TSDB started"
ts=2022-02-11T15:28:43.926Z caller=main.go:1128 level=info msg="Loading configuration file" filename=prometheus.yml
ts=2022-02-11T15:28:43.927Z caller=main.go:1165 level=info msg="Completed loading of configuration file" filename=prometheus.yml totalDuration=793.597µs db_storage=912ns remote_storage=1.85µs web_handler=434ns query_engine=1.057µs scrape=284.374µs scrape_sd=41.261µs notify=37.028µs notify_sd=15.439µs rules=1.847µs
ts=2022-02-11T15:28:43.927Z caller=main.go:896 level=info msg="Server is ready to receive web requests."

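Instead of watching the log for "Server is ready to receive web requests.", the same condition can be polled over HTTP through the /-/ready management endpoint. A minimal Python sketch, assuming the server listens on the address you pass in; the function names http_status and wait_until_ready are illustrative and not part of Prometheus:

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but not with 2xx (for example 503 while starting).
        return exc.code
    except (urllib.error.URLError, OSError):
        return None


def wait_until_ready(base_url, attempts=10, delay=1.0, fetch=http_status):
    """Poll <base_url>/-/ready until it returns 200.

    Returns True on success and False after `attempts` unsuccessful tries.
    `fetch` is injectable so the loop can be exercised without a server.
    """
    url = base_url.rstrip("/") + "/-/ready"
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_until_ready("http://localhost:9090") should return True once the final log line above has appeared.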

7. Verify access

http://ip:9090/targets

[Screenshot: Prometheus Targets page]

  • Labels: the label set that identifies each monitored target (for example job and instance)
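
The information on the Targets page is also available as JSON from the HTTP API at /api/v1/targets. A minimal Python sketch that reduces the response to job, scrape URL and health; summarize_targets and fetch_targets are illustrative names, and the base URL is whatever address your server uses:

```python
import json
import urllib.request


def summarize_targets(payload):
    """Return sorted (job, scrape_url, health) tuples from a /api/v1/targets response."""
    rows = []
    for target in payload.get("data", {}).get("activeTargets", []):
        labels = target.get("labels", {})
        rows.append((
            labels.get("job", ""),
            target.get("scrapeUrl", ""),
            target.get("health", "unknown"),
        ))
    return sorted(rows)


def fetch_targets(base_url, timeout=5.0):
    """Download and decode the target listing from a running Prometheus server."""
    url = base_url.rstrip("/") + "/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the configuration from step 5, summarize_targets(fetch_targets("http://localhost:9090")) should list one row for the "prometheus" job and one for the "node" job; the latter reports "down" until the exporter is running.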

8. View the metrics Prometheus exposes about itself

http://ip:9090/metrics
# HELP go_gc_cycles_automatic_gc_cycles_total Count of completed GC cycles generated by the Go runtime.
# TYPE go_gc_cycles_automatic_gc_cycles_total counter
go_gc_cycles_automatic_gc_cycles_total 7
# HELP go_gc_cycles_forced_gc_cycles_total Count of completed GC cycles forced by the application.
# TYPE go_gc_cycles_forced_gc_cycles_total counter

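The /metrics output is plain text: "# HELP" and "# TYPE" comment lines describe a metric, and each remaining line is a sample. A minimal Python sketch of reading that format, under stated simplifications: the value is taken to be the last whitespace-separated field, so optional timestamps and label values containing spaces are not handled. parse_metrics is an illustrative name, not an official client API:

```python
def parse_metrics(text):
    """Parse simple exposition text into {name: {"help", "type", "samples"}}."""
    metrics = {}

    def entry(name):
        return metrics.setdefault(name, {"help": None, "type": None, "samples": []})

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Fields: "#", keyword, metric name, remainder of the line.
            parts = line.split(None, 3)
            if len(parts) >= 3 and parts[1] == "HELP":
                entry(parts[2])["help"] = parts[3] if len(parts) > 3 else ""
            elif len(parts) >= 4 and parts[1] == "TYPE":
                entry(parts[2])["type"] = parts[3]
            continue  # any other comment line is ignored
        series, _, value = line.rpartition(" ")
        if not series:
            continue  # malformed sample line without a value
        name = series.split("{", 1)[0]  # strip the label set, if present
        entry(name)["samples"].append((series, float(value)))
    return metrics
```

Applied to the excerpt above, the first metric is returned with type "counter" and one sample of 7.0, while the second has a type but no samples because the excerpt is cut off before its value.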

9. Run Prometheus as a systemd service

sudo vim /etc/systemd/system/prometheus.service
[Unit]
Description=prometheus
After=network-online.target

[Service]
User=root
Restart=on-failure
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/usr/local/prometheus/data
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target

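A common reason for the service failing to start is an ExecStart path that does not match where the files were actually installed. A minimal Python sketch that extracts the binary and the --flag=value pairs from a unit file so they can be checked before starting the service; exec_start_args is an illustrative helper, and it assumes a unit with a single ExecStart line and no repeated keys:

```python
import configparser
import shlex


def exec_start_args(unit_text):
    """Return (binary, {flag: value}) parsed from the unit's ExecStart line."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep systemd's case-sensitive key names
    parser.read_string(unit_text)
    argv = shlex.split(parser["Service"]["ExecStart"])
    flags = {}
    for arg in argv[1:]:
        # Only the --name=value form is collected; other arguments are skipped.
        if arg.startswith("--") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            flags[name] = value
    return argv[0], flags
```

Passing the returned binary and the config.file value to os.path.exists is then enough to catch a wrong install path.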

10. Enable and start the service

systemctl daemon-reload      # rerun after every change to prometheus.service
systemctl enable prometheus  # start automatically at boot
systemctl start prometheus
systemctl status prometheus  # "active (running)" means the service started successfully


Installing node_exporter to monitor the host machine's metrics

1. Download and extract the exporter

https://prometheus.io/download/#node_exporter

tar -xvf node_exporter-1.3.1.linux-amd64.tar.gz


2. Start the exporter

./node_exporter --web.listen-address localhost:8085



3. Add the exporter under scrape_configs in prometheus.yml

  - job_name: 'node'
    static_configs:
      - targets: ['localhost:8085']
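4. Run the exporter as a systemd service
Unit file (named node_export.service to match the commands below; placing it in /etc/systemd/system/ like prometheus.service is an assumption):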

[Unit]
Description=node_exporter
Documentation=https://github.com/prometheus/node_exporter
After=network.target

[Service]
ExecStart=/usr/local/prometheus/export/node_exporter-1.3.1.linux-amd64/node_exporter --web.listen-address localhost:8085

[Install]
WantedBy=multi-user.target

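Managing the service
Common systemctl commands for the node_export service:
systemctl start node_export        # start the service
systemctl stop node_export         # stop the service
systemctl restart node_export      # restart the service
systemctl status node_export       # show the service status
systemctl enable node_export       # start the service automatically at boot
systemctl disable node_export      # do not start the service at boot
systemctl is-enabled node_export   # check whether the service starts at boot
systemctl list-unit-files | grep enabled   # list the services enabled at boot
systemctl --failed                 # list the services that failed to start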

Testing
Check the machine's memory:
node_memory_Active_bytes

[Screenshot: result of the memory query]

Check the machine's load:
node_load1

[Screenshot: result of the node_load1 query]
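
Expressions such as node_load1 can also be evaluated without the web UI, through the instant-query endpoint /api/v1/query. A minimal Python sketch, assuming the server is reachable at the base URL you supply; query_url, vector_values and instant_query are illustrative names:

```python
import json
import urllib.parse
import urllib.request


def query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    return base_url.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def vector_values(payload):
    """Map each series' instance label to its float value for a vector result."""
    data = payload.get("data", {})
    if payload.get("status") != "success" or data.get("resultType") != "vector":
        raise ValueError("expected a successful instant-vector response")
    # Each sample's "value" is a [timestamp, "value as string"] pair.
    return {s["metric"].get("instance", ""): float(s["value"][1]) for s in data["result"]}


def instant_query(base_url, promql, timeout=5.0):
    """Run the query against a live server and return {instance: value}."""
    with urllib.request.urlopen(query_url(base_url, promql), timeout=timeout) as resp:
        return vector_values(json.load(resp))
```

With the exporter from this section running, instant_query("http://localhost:9090", "node_load1") should return a single entry keyed by "localhost:8085".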

Monitoring a Spring Boot project

Start the application:

java -jar demo-0.0.1-SNAPSHOT.jar &  

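Add the application to the Prometheus configuration (the application must expose /actuator/prometheus, which requires Spring Boot Actuator with the Micrometer Prometheus registry):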

  # Scrape job for the Spring Boot application
  - job_name: 'springbootPrometheusGrafana'
    scrape_interval: 5s
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['192.168.8.1:8080']


Installing Grafana

1. Install Grafana by following the official instructions for your operating system

https://grafana.com/docs/grafana/latest/installation/
  • Reload the systemd configuration: sudo systemctl daemon-reload

  • Start: sudo systemctl start grafana-server

  • Stop: sudo systemctl stop grafana-server

  • Status: sudo systemctl status grafana-server

Download suitable dashboards

https://grafana.com/grafana/dashboards/

Configure Prometheus (http://ip:9090/) as the Grafana data source

Spring Boot monitoring

[Screenshot: Spring Boot dashboard in Grafana]

System monitoring

[Screenshot: system dashboard in Grafana]


Alerting configuration

Configure the mail service in Grafana's configuration file (grafana.ini)

[smtp]
enabled = true
host = smtp.qq.com:465
user = [email protected]
password = yyzplvajtlgmbedc
cert_file =
key_file =
skip_verify = false
from_address = [email protected]
from_name = Grafana
ehlo_identity =

[emails]
welcome_email_on_sign_up = false
templates_pattern = emails/*.html
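
Before relying on Grafana to deliver alert mail, it can help to confirm that the SMTP host, user and password work on their own. A minimal Python sketch, assuming that port 465 means implicit TLS (the usual convention) and that you supply your own addresses and credentials; build_test_message and send_via_implicit_tls are illustrative names and the recipient address is hypothetical:

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_test_message(from_address, to_address, from_name="Grafana"):
    """Create a short plain-text message using the same sender fields as [smtp]."""
    msg = EmailMessage()
    msg["From"] = f"{from_name} <{from_address}>"
    msg["To"] = to_address
    msg["Subject"] = "SMTP settings check"
    msg.set_content("If this arrives, the SMTP host, user and password are usable.")
    return msg


def send_via_implicit_tls(host, port, user, password, msg, timeout=10.0):
    """Log in over a TLS-wrapped connection and send msg.

    The default SSL context verifies the server certificate, which
    corresponds to skip_verify = false in the settings above.
    """
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL(host, port, context=context, timeout=timeout) as server:
        server.login(user, password)
        server.send_message(msg)
```

If send_via_implicit_tls raises smtplib.SMTPAuthenticationError, the user or password is wrong; a connection or certificate error points at the host, port or TLS settings instead.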


Reposted from juejin.im/post/7066079435379703822