Prometheus, Part 1

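I recently started working with Prometheus for my job; these are my notes.
What is Prometheus?
Prometheus is an open-source monitoring and alerting system with a built-in time series database (TSDB), originally developed at SoundCloud. It is written in Go and was inspired by Google's Borgmon monitoring system.
In 2016 Prometheus joined the Cloud Native Computing Foundation (CNCF), a Linux Foundation project launched with Google's backing, as its second hosted project after Kubernetes.
Prometheus is currently very active in the open-source community.
Compared with Heapster (a Kubernetes subproject for collecting cluster performance data), Prometheus is more complete and more comprehensive. Its performance is also sufficient for clusters on the order of ten thousand machines.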
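Prometheus features
• A multi-dimensional data model.
• A flexible query language (PromQL).
• No dependency on distributed storage; single server nodes are autonomous.
• Time series are collected with a pull model over HTTP.
• Pushing time series is supported through an intermediary gateway.
• Targets are discovered via service discovery or static configuration.
• Multiple kinds of graphing and dashboarding are supported, for example Grafana.
Official site: https://prometheus.io/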
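To make the data model and the pull-over-HTTP idea a bit more concrete: an exporter serves plain text, one sample per line, with the dimensions written as labels in braces. Below is a small sketch that filters such output by metric name and one label. The helper name filter_metric is my own invention for illustration; it is not part of Prometheus.

```shell
# filter_metric NAME LABEL VALUE
# Reads Prometheus text-format samples on stdin and prints the samples of
# metric NAME whose label set contains LABEL="VALUE".
# Hypothetical helper for illustration only.
filter_metric() {
    local name="$1" label="$2" value="$3"
    # First keep lines of the wanted metric, then match the label pair as a
    # fixed string (-F) so quotes and dots are not treated as regex syntax.
    grep "^${name}{" | grep -F "${label}=\"${value}\""
}
```

Against a running node_exporter this would be used roughly as: curl -s http://192.168.xxx.xxx:9100/metrics | filter_metric node_cpu_seconds_total mode idle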
Architecture diagram
[Figure: Prometheus architecture diagram]

Deploying the Prometheus server on the monitoring host
Choose a suitable version, then download and install it: https://github.com/prometheus/prometheus/blob/v2.23.0/RELEASE.md

Make the prometheus binary executable (755 is enough; 777 would also make it world-writable)
chmod 755 prometheus

prometheus.yml: the scrape configuration file
scrape_interval: how often targets are scraped
evaluation_interval: how often rules (including alerting rules) are evaluated
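As a rough illustration of where these two settings live, the sketch below writes a minimal config to a scratch file. The 15s values, the single self-scrape job and the file name prometheus.example.yml are assumptions for the example, not the values from this deployment.

```shell
# write_example_config [FILE]
# Writes a minimal Prometheus config that shows the two global timing
# settings. FILE defaults to prometheus.example.yml (an assumed name, chosen
# so that a real prometheus.yml is not overwritten).
write_example_config() {
    local file="${1:-prometheus.example.yml}"
    cat > "$file" <<'EOF'
global:
  scrape_interval: 15s      # how often targets are scraped
  evaluation_interval: 15s  # how often rules are evaluated

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
EOF
}
```

The result can be checked with promtool, which ships in the Prometheus tarball: promtool check config prometheus.example.yml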
Add the IP addresses of the hosts to scrape (these jobs go under scrape_configs)
prometheus.yml

  - job_name: 'process'
    static_configs:
    - targets: ['192.168.xxx.xxx:9256']
  - job_name: 'node'
    static_configs:
    - targets: ['192.168.xxx.xxx:9100']
  - job_name: 'jmx'
    static_configs:
    - targets: ['192.168.xxx.xxx:8099']
  - job_name: 'mysql'
    static_configs:
    - targets: ['192.168.xxx.xxx:9104']
  - job_name: 'oracle'
    static_configs:
    - targets: ['192.168.xxx.xxx:9161']
  - job_name: 'redis'
    static_configs:
    - targets: ['192.168.xxx.xxx:9121']
  - job_name: 'kafka'
    static_configs:
    - targets: ['192.168.xxx.xxx:9308']
  - job_name: 'zookeeper'
    static_configs:
    - targets: ['192.168.xxx.xxx:9141']
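Since every job above follows the same three-line pattern and YAML is picky about indentation, a tiny generator can keep the stanzas consistent. This is only a sketch; scrape_job is a made-up helper name and the address in the usage line is a placeholder.

```shell
# scrape_job NAME TARGET
# Prints one static scrape job, indented so it can sit directly under
# the scrape_configs: key of prometheus.yml.
# Hypothetical helper for illustration only.
scrape_job() {
    local name="$1" target="$2"
    printf "  - job_name: '%s'\n" "$name"
    printf "    static_configs:\n"
    printf "    - targets: ['%s']\n" "$target"
}
```

For example: scrape_job redis 192.168.xxx.xxx:9121 >> prometheus.yml, followed by promtool check config prometheus.yml to confirm the file still parses.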

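Reload the configuration file (the HTTP endpoint only responds if Prometheus was started with --web.enable-lifecycle; otherwise send SIGHUP to the process)
reload.sh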

curl -XPOST http://localhost:9090/-/reload
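A slightly more defensive reload.sh could look like the sketch below: try the HTTP endpoint and, if it does not answer with 200, fall back to SIGHUP. The function name and the default URL are assumptions for the example; the process name prometheus matches the binary started in this article.

```shell
# reload_prometheus [BASE_URL]
# Asks Prometheus to re-read its configuration. Tries the HTTP endpoint
# first (it needs --web.enable-lifecycle) and falls back to SIGHUP.
# Hypothetical helper for illustration only.
reload_prometheus() {
    local base="${1:-http://localhost:9090}"
    local code
    # -w '%{http_code}' prints only the status code; the body is discarded.
    code=$(curl -s -o /dev/null -w '%{http_code}' -XPOST "$base/-/reload")
    if [ "$code" = "200" ]; then
        echo "reloaded via HTTP"
        return 0
    fi
    echo "HTTP reload returned '$code', sending SIGHUP instead"
    # -x matches the process name exactly, so exporters are not signalled.
    pkill -HUP -x prometheus
}
```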

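start.sh: start scraping (the logs directory must already exist; --web.enable-lifecycle is what allows the curl-based reload)

nohup ./prometheus --config.file=prometheus.yml --storage.tsdb.retention.time=3d --web.enable-lifecycle --log.level=debug > logs/prometheus.log 2>&1 &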
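Because the server is started in the background, the script returns before Prometheus is actually able to serve requests. A small wait loop against the readiness endpoint (/-/ready) can be appended to the start script. This is a sketch; the function name, the default URL and the retry count are my own choices.

```shell
# wait_until_ready [BASE_URL] [TRIES]
# Polls the Prometheus readiness endpoint once per second until it answers
# successfully or TRIES attempts have been used up.
# Hypothetical helper for illustration only.
wait_until_ready() {
    local base="${1:-http://localhost:9090}" tries="${2:-10}"
    local i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl exit non-zero on HTTP errors such as 503.
        if curl -sf -o /dev/null "$base/-/ready"; then
            echo "ready"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "not ready after $tries tries" >&2
    return 1
}
```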

Deploying exporters on the monitored hosts
https://github.com/prometheus/mysqld_exporter
https://github.com/prometheus/node_exporter
https://github.com/prometheus/jmx_exporter
https://github.com/ncabatoff/process-exporter
mysql
A database user has to be created first:

CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'exporter' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
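The exporter is then pointed at this account through the DATA_SOURCE_NAME environment variable, in the user:password@(host:port)/ form that the start.sh in this article uses. The sketch below only assembles that string; mysql_dsn is a made-up helper name, and whether a given mysqld_exporter release still reads this variable is worth checking in its README.

```shell
# mysql_dsn USER PASSWORD HOST PORT
# Prints a data source name in the form user:password@(host:port)/
# Hypothetical helper for illustration only.
mysql_dsn() {
    printf '%s:%s@(%s:%s)/' "$1" "$2" "$3" "$4"
}
```

Usage: export DATA_SOURCE_NAME="$(mysql_dsn exporter exporter localhost 3306)"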
 

Grafana dashboard ID: 7362

oracle

https://github.com/iamseth/oracledb_exporter
https://github.91chifun.workers.dev//https://github.com/iamseth/oracledb_exporter/releases/download/0.2.9/oracledb_exporter.0.2.9-ora18.5.linux-amd64.tar.gz

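This exporter needs the Oracle client libraries.
For now I run it on the database host, under the OS user that owns the database.
Some environment variables have to be set at startup, so I wrote a startup script: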

   cat start.sh 
    export ORACLE_HOME='/oravl01/oracle/oracle/12CR2'
    export TNS_ADMIN=$ORACLE_HOME/network/admin
    export NLS_LANG='simplified chinese_china'.ZHS16GBK
    export LD_LIBRARY_PATH=$ORACLE_HOME/lib
    export PATH=$ORACLE_HOME/bin:$PATH
    export DATA_SOURCE_NAME='ai_sel/ai_sel$@CMIUAT'
At startup it reports this error:
./oracledb_exporter: error while loading shared libraries: libclntsh.so.18.1: cannot open shared object file: No such file or directory

The workaround is to create a symlink under $ORACLE_HOME/lib:

ln -s libclntsh.so.12.1 libclntsh.so.18.1
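The same workaround can be written so that it is safe to run more than once and does not hard-code the installed client version. This is a sketch under the assumption that some libclntsh.so.<version> already exists in the directory; link_libclntsh is a made-up helper name.

```shell
# link_libclntsh LIB_DIR WANTED
# Creates LIB_DIR/WANTED as a symlink to an existing libclntsh.so.<version>
# in LIB_DIR, unless WANTED is already there.
# Hypothetical helper for illustration only.
link_libclntsh() {
    local dir="$1" wanted="$2" existing
    if [ -e "$dir/$wanted" ]; then
        echo "$wanted already present"
        return 0
    fi
    # Pick the first versioned client library found, e.g. libclntsh.so.12.1.
    existing=$(ls "$dir" | grep '^libclntsh\.so\.[0-9]' | head -n 1)
    if [ -z "$existing" ]; then
        echo "no libclntsh.so.* in $dir" >&2
        return 1
    fi
    # Relative link, same as running ln -s inside the lib directory.
    ln -s "$existing" "$dir/$wanted"
    echo "$wanted -> $existing"
}
```

Usage: link_libclntsh "$ORACLE_HOME/lib" libclntsh.so.18.1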
Grafana dashboard ID: 3333
redis
https://github.com/oliver006/redis_exporter
https://github.91chifun.workers.dev//https://github.com/oliver006/redis_exporter/releases/download/v1.11.1/redis_exporter-v1.11.1.linux-amd64.tar.gz
Startup script
cat start.sh 
nohup ./redis_exporter --redis.addr redis://192.168.xxx.xxx:11000 --redis.password 'RM12u3a4' &
Grafana dashboard ID: 763
kafka
https://github.com/danielqsj/kafka_exporter
https://github.91chifun.workers.dev//https://github.com/danielqsj/kafka_exporter/releases/download/v1.2.0/kafka_exporter-1.2.0.linux-amd64.tar.gz
Startup script
cat start.sh 
nohup ./kafka_exporter --kafka.server=192.168.xxx.xxx:12001 &
Grafana dashboard ID: 7589

zookeeper

https://github.com/dabealu/zookeeper-exporter
https://github.91chifun.workers.dev//https://github.com/dabealu/zookeeper-exporter/releases/download/v0.1.10/zookeeper-exporter-v0.1.10-linux.tar.gz

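Startup script (the command pasted below is the kafka_exporter one again; for ZooKeeper, run the zookeeper-exporter binary with its own flags instead, see the project README)

cat start.sh 
nohup ./kafka_exporter --kafka.server=192.168.xxx.xxx:12001 &

Grafana dashboard ID: 11442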

Only four exporters are installed here.

start.sh: startup script

echo -e "\033[31m  Failed to start, try again several times--\033[0m"

PWD=$(pwd)
echo ${PWD}
# Start node_exporter only if it is not already running.
if [ "$(ps -ef | grep node_exporter | grep -v grep | wc -l)" -eq 0 ]
then
 #  NODE_PATH=/data/uatBuser01/prometheus/node_exporter-1.0.0-rc.1.linux-amd64
   nohup $(pwd)/node_exporter-1.0.0-rc.1.linux-amd64/node_exporter &
fi
if [ "$(ps -ef | grep process-exporter | grep -v grep | wc -l)" -eq 0 ]
then
#   PROCESS_PATH=/data/uatBuser01/prometheus/process
   nohup $(pwd)/process/process-exporter -network -threads=false -gather-smaps=false  -config.path $(pwd)/process/app.yaml &
fi
if [ "$(ps -ef | grep localjmx_httpserver | grep -v grep | wc -l)" -eq 0 ]
then
 #   /opt/jdk1.8.0_151/bin/java -cp /opt/jdk1.8.0_151/lib/tools.jar:$(pwd)/jmx/localjmx_httpserver-0.10.0-jar-with-dependencies.jar io.prometheus.jmxagent.StartJmxLocal 0.0.0.0:8099 $(pwd)/jmx/jvm.yml &

   nohup /opt/jdk1.8.0_151/bin/java -cp /opt/jdk1.8.0_151/lib/tools.jar:${PWD}/jmx/localjmx_httpserver-0.11.0-jar-with-dependencies.jar io.prometheus.jmxagent.StartJmxLocal 0.0.0.0:8099 ${PWD}/jmx/jvm.yml &
fi

if [ "$(ps -ef | grep mysqld_exporter | grep -v grep | wc -l)" -eq 0 ]
then
   export DATA_SOURCE_NAME='exporter:exporter@(localhost:3306)/'
   nohup $(pwd)/mysqld_exporter/mysqld_exporter &
fi
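The four blocks in this script all repeat one pattern: check whether a process is running and start it in the background if not. That pattern can be pulled into a single function, sketched below. start_if_stopped is a made-up name and the log file argument is my addition; pgrep -f matches against the full command line, so the pattern must not appear in the name of the calling script itself.

```shell
# start_if_stopped PATTERN LOGFILE COMMAND [ARGS...]
# Runs COMMAND in the background with nohup unless some process whose
# command line matches PATTERN already exists.
# Hypothetical helper for illustration only.
start_if_stopped() {
    local pattern="$1" log="$2"
    shift 2
    if pgrep -f "$pattern" > /dev/null; then
        echo "$pattern already running"
        return 0
    fi
    # Append stdout and stderr to the log instead of nohup.out.
    nohup "$@" >> "$log" 2>&1 &
    echo "$pattern started (pid $!)"
}
```

For example: start_if_stopped node_exporter node.log $(pwd)/node_exporter-1.0.0-rc.1.linux-amd64/node_exporter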

stop.sh: stop script


node_pid=`ps -ef|grep node_exporter|grep -v grep|awk '{print $2}'`
if [ -z "$node_pid" ];
then
   echo "[not find node_exporter pid]"
else
   echo "find result:$node_pid"
   kill $node_pid
   echo "[ node_exporter pid has been kill]"
fi
process_pid=`ps -ef|grep process-exporter|grep -v grep|awk '{print $2}'`
if [ -z "$process_pid" ];
then
   echo "[not find process-exporter pid]"
else
   echo "find result:$process_pid"
   kill $process_pid
   echo "[ process-exporter pid has been kill]"
fi
localjmx_pid=`ps -ef|grep localjmx_httpserver|grep -v grep|awk '{print $2}'`
if [ -z "$localjmx_pid" ];
then
   echo "[not find localjmx_httpserver pid]"
else
   echo "find result:$localjmx_pid"
   kill $localjmx_pid
   echo "[ localjmx_httpserver pid has been kill]"
fi
mysql_pid=`ps -ef |grep mysqld_exporter |grep -v grep |awk '{print $2}'`
if [ -z "$mysql_pid" ];
then
   echo "[not find mysqld_exporter pid]"
else
   echo "find result:$mysql_pid"
   kill $mysql_pid
   echo "[mysqld_exporter pid has been kill]"
fi
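The stop script repeats the same find-then-kill block four times, so a loop over the process names is a natural shortening. The sketch below relies on pkill's exit status (0 when at least one process was signalled, 1 when nothing matched); stop_all is a made-up helper name.

```shell
# stop_all NAME...
# Sends SIGTERM to every process whose command line matches each NAME and
# reports whether anything was found.
# Hypothetical helper for illustration only.
stop_all() {
    local name
    for name in "$@"; do
        # pkill exits 0 if at least one process matched, 1 if none did.
        if pkill -f "$name"; then
            echo "[$name stopped]"
        else
            echo "[$name not running]"
        fi
    done
}
```

Usage: stop_all node_exporter process-exporter localjmx_httpserver mysqld_exporter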

1. Check in the web UI
http://172.22.xxx.xxx:9090/targets
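The same information is available without a browser from the HTTP API at /api/v1/targets, where each active target carries a health field. The sketch below simply counts the targets reported as down. It assumes the compact JSON that the server returns (no spaces around the colon) and avoids any JSON tooling; count_down_targets is a made-up helper name.

```shell
# count_down_targets
# Reads the JSON body of /api/v1/targets on stdin and prints how many
# targets have "health":"down".
# Hypothetical helper for illustration only.
count_down_targets() {
    # grep -o prints one line per match; tr strips the padding that some
    # wc implementations add.
    grep -o '"health":"down"' | wc -l | tr -d ' '
}
```

Usage: curl -s http://172.22.xxx.xxx:9090/api/v1/targets | count_down_targets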

Learn, think, innovate; act, improve, succeed
And the hair keeps falling out


Reposted from blog.csdn.net/qq_41072487/article/details/110441449