Small shell project: an alarm system

One: Requirements analysis

  • Requirements: Use shell to build a set of customized alarm tools that are still managed in a unified, standardized way.
    In Linux operations and maintenance, much of the work revolves around monitoring. Mature solutions such as zabbix exist, but zabbix does not always meet our needs. For relatively uncommon monitoring items, zabbix requires you to write custom monitoring scripts, arrange data transfer, and so on. And when the network fails and the client cannot reach the server, zabbix cannot report data to the server. In that situation a shell script can take over the monitoring temporarily. The shell script here is distributed: the monitoring script is placed on every server, and each machine monitors itself without relying on any other machine. The most important part is the mail system, which sends the alarm emails.
  • Idea: Define a script package containing the main program, subprograms, configuration file, mail engine, output logs, etc.
  • Main program: The entry point of the whole package; everything else is started from it.
  • Configuration file: The control center, used to switch each subprogram on or off and to specify the log file associated with each one.
  • Subprograms: The actual monitoring scripts, which monitor the various indicators.
  • Mail engine: Implemented as a python program, in which the mail server, the sender, and the sender's password can be defined.
  • Output log: The whole monitoring system must produce log output.

Requirements: Our machines play various roles, but the same monitoring system must be deployed on all of them. Whatever a machine's role, the program framework is identical; the only difference is that the configuration file is customized for each role.
Program structure:
bin holds the main program
conf holds the configuration file
shares holds the monitoring scripts
mail holds the mail engine
log holds the logs

Two: The main script

First, create the home directory and all the script directories under it:

[root@lijie-01 ~]# cd /usr/local/sbin
[root@lijie-01 sbin]# mkdir mon
[root@lijie-01 sbin]# cd mon
[root@lijie-01 mon]# ls
[root@lijie-01 mon]# mkdir bin conf shares log mail
[root@lijie-01 mon]# ls
bin  conf  log  mail  shares
[root@lijie-01 mon]# cd bin
[root@lijie-01 bin]# vim main.sh
[root@lijie-01 bin]#

Then we write the main script as follows:

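#!/bin/bash
# Master switch for email. send=1 means every monitoring item below may send mail; set it to 0 during maintenance.
export send=1
# IP address reported in alarms. Each machine reports independently, so it extracts its own address (ens33 is this host's interface name).
export addr=$(/sbin/ifconfig | grep -A1 "ens33: " | awk '/inet/ {print $2}')
# Current directory
dir=$(pwd)
# Only the last path component is needed
last_dir=$(echo "$dir" | awk -F'/' '{print $NF}')
# Make sure the script is run from inside bin; otherwise the monitoring scripts, mail engine and logs may not be found. In other words, cd into this directory before running main.sh.
if [ "$last_dir" == "bin" ] || [ "$last_dir" == "bin/" ]; then
    conf_file="../conf/mon.conf"
else
    echo "you should cd to the bin dir"
    exit 1
fi
# Append normal output and error output to separate logs
exec 1>>../log/mon.log 2>>../log/err.log
# Timestamp the load-average entry
echo "$(date +"%F %T") load average"
# Run the sub-script
/bin/bash ../shares/load.sh
# Check whether the configuration file enables 502 monitoring
if grep -q '^to_mon_502=1' "$conf_file"; then
    # Take the value after "=", dropping any trailing comment and spaces
    export log=$(grep '^logfile=' "$conf_file" | awk -F '=' '{print $2}' | sed 's/#.*//; s/ //g')
    /bin/bash ../shares/502.sh
fi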
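
The main script refuses to run unless it is started from inside bin. As a possible alternative, here is a minimal sketch, assuming bash, that changes into the script's own directory so the relative paths resolve from any starting point. The function name `enter_script_dir` is hypothetical and not part of the original package.

```shell
#!/bin/bash
# Hypothetical helper: change into the directory that contains the given
# script, so that relative paths such as ../conf/mon.conf resolve no matter
# where the caller started.
enter_script_dir() {
    local script_path=$1
    cd "$(dirname "$script_path")" || return 1
}

# Possible use at the top of main.sh (commented out so the sketch has no side effects):
# enter_script_dir "${BASH_SOURCE[0]}" || exit 1
# conf_file="../conf/mon.conf"
```

With something like this in place the directory-name check would no longer be needed, although the cron job shown later changes into bin anyway.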

Three: The configuration file

The configuration file mainly defines switches and log paths. It must be placed at ../conf/mon.conf, the path already set in the main script. The content of mon.conf is:

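## Options that control what to monitor
## MySQL server address, port, user and password
## Whether to monitor the cdb database: 0 = do not monitor (default), 1 = monitor
to_mon_cdb=0
db_ip=10.20.3.13
db_port=3315
db_user=username
db_pass=passwd
## httpd: 1 = monitor, 0 = do not monitor
to_mon_httpd=0
## php: 1 = monitor, 0 = do not monitor
to_mon_php_socket=0
## http_code_502: the access log path must be defined; the status codes are read from this log
to_mon_502=1
logfile=/data/log/xxx.xxx.com/access.log
## request_count: define the log path and the domain name
## Whether to monitor the request count
to_mon_request_count=0
req_log=/data/log/www.discuz.net/access.log
## With only a few machines, the resources a sub-script needs can be defined inside the sub-script itself,
## but with many machines, defining them there is not general enough
domainname=www.discuz.net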
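
Since every sub-script is switched on and off through `key=value` lines, reading them could be centralized. The following is a minimal sketch, assuming bash and a file in the mon.conf format above. The helper names `conf_get` and `conf_enabled` are hypothetical and do not appear in the original scripts.

```shell
#!/bin/bash
# Hypothetical helper: print the value of KEY from a KEY=VALUE config file,
# dropping any trailing "#" comment and trailing whitespace.
conf_get() {
    local key=$1 file=$2
    sed -n "s/^${key}=\([^#]*\).*/\1/p" "$file" | sed 's/[[:space:]]*$//' | head -n 1
}

# Hypothetical helper: succeed only if the switch KEY is set to 1.
conf_enabled() {
    [ "$(conf_get "$1" "$2")" = "1" ]
}

# Possible use in main.sh (commented out so the sketch has no side effects):
# if conf_enabled to_mon_502 ../conf/mon.conf; then
#     export log=$(conf_get logfile ../conf/mon.conf)
#     /bin/bash ../shares/502.sh
# fi
```

Stripping the trailing comment matters because a value followed by `##...` on the same line would otherwise end up inside the log path.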

Four: Monitoring items (sub-scripts)

load.sh content

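#!/bin/bash
# Assign the integer part of the current 1-minute load to load
load=$(uptime | awk -F 'average:' '{print $2}' | cut -d',' -f1 | sed 's/ //g' | cut -d. -f1)
# If the load is greater than 10 and mail is enabled, call the mail script
if [ "$load" -gt 10 ] && [ "$send" -eq 1 ]
then
    echo "$addr $(date +%T) load is $load" >../log/load.tmp
    # [email protected] is how the recipient address appears in the source; use the real address here
    /bin/bash ../mail/mail.sh "[email protected]" "${addr}_load:$load" "$(cat ../log/load.tmp)"
fi
# Always record the load value in the log
echo "$(date +%T) load is $load"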
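
To make the load pipeline easier to follow, here is a small sketch that applies the same steps to a fixed sample line instead of live `uptime` output. The function name `parse_load` and the sample line are illustrative assumptions; the format matches what Linux `uptime` typically prints.

```shell
#!/bin/bash
# Hypothetical helper: read an uptime-style line on stdin and print the
# integer part of the 1-minute load average.
parse_load() {
    awk -F'average:' '{print $2}' | cut -d',' -f1 | sed 's/ //g' | cut -d. -f1
}

# Made-up sample line in the usual Linux uptime format
sample=' 10:15:01 up 3 days,  2:11,  1 user,  load average: 12.34, 8.10, 4.05'
echo "$sample" | parse_load    # prints 12
```

Truncating to an integer is what lets the script compare with `-gt`, which only accepts whole numbers.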

502.sh content

#!/bin/bash
# Use the time one minute ago: monitoring runs once a minute, so we inspect the log lines from the previous minute
d=$(date -d "-1 min" +%H:%M)
c_502=$(grep ":$d:" "$log" | grep -c ' 502 ')
if [ "$c_502" -gt 10 ] && [ "$send" == 1 ]; then
    echo "$addr $d 502 count is $c_502" >../log/502.tmp
    /bin/bash ../mail/mail.sh "${addr}_502" "$c_502" ../log/502.tmp
fi
echo "$(date +%T) 502 $c_502"
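
The counting step can be tried in isolation. The sketch below, assuming an access log whose timestamps contain `:HH:MM:` and whose status code is surrounded by spaces, wraps the two greps in a function and runs it on a few made-up log lines. The function name `count_502` and the sample data are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count the lines in LOGFILE that were logged in minute
# HH:MM and contain the status code 502 surrounded by spaces.
count_502() {
    local minute=$1 logfile=$2
    grep ":$minute:" "$logfile" | grep -c ' 502 '
}

# Made-up sample log, for illustration only
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
1.2.3.4 - - [10/Oct/2023:13:55:01 +0800] "GET / HTTP/1.1" 502 166
1.2.3.4 - - [10/Oct/2023:13:55:30 +0800] "GET /a HTTP/1.1" 200 512
1.2.3.4 - - [10/Oct/2023:13:56:02 +0800] "GET / HTTP/1.1" 502 166
EOF

count_502 13:55 "$sample_log"    # prints 1
```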

disk.sh content

  

Five: The mail engine

mail.sh content

#!/bin/bash
# This script implements alarm convergence
# $1 is the first argument passed by the monitoring sub-script that sends the alarm email; it also names the state files
log=$1
# Directory of this script, so mail.py is found even though the caller's working directory is bin
mail_dir=$(dirname "$0")
# Current timestamp
t_s=$(date +%s)
# Timestamp two hours ago, so that the first exception after the script first runs raises an alarm
t_s2=$(date -d "2 hours ago" +%s)
# If the state file does not exist, write the two-hours-ago timestamp into it
if [ ! -f "/tmp/$log" ]
then
    echo "$t_s2" > "/tmp/$log"
fi
# Read the most recent timestamp, then append the current one
t_s2=$(tail -1 "/tmp/$log" | awk '{print $1}')
echo "$t_s" >> "/tmp/$log"
# Compare the two timestamps
v=$((t_s - t_s2))
echo "$v"
# If the interval between the two timestamps is greater than 1 hour, send the alarm
if [ "$v" -gt 3600 ]
then
    "$mail_dir/mail.py" "$1" "$2" "$3"
    # $log.txt is the counter file; restart the count
    echo "0" > "/tmp/$log.txt"
else
    if [ ! -f "/tmp/$log.txt" ]
    then
        echo "0" > "/tmp/$log.txt"
    fi
    nu=$(cat "/tmp/$log.txt")
    nu2=$((nu + 1))
    echo "$nu2" > "/tmp/$log.txt"
    # After more than 10 consecutive suppressed alarms, send a reminder and restart the count
    if [ "$nu2" -gt 10 ]
    then
        "$mail_dir/mail.py" "$1" "trouble continue 10 min $2" "$3"
        echo "0" > "/tmp/$log.txt"
    fi
fi
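
The convergence rule in mail.sh is easier to check when the decision is separated from the sending. The following is a minimal sketch of that rule, assuming bash; it takes the current time as a parameter instead of calling `date`, and prints the decision instead of invoking mail.py. The function name `should_alert`, its arguments, and the three result words are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: decide whether the alarm called NAME should be mailed
# at time NOW (epoch seconds). State files live under STATE_DIR.
# Prints "first" (mail now), "repeat" (mail a reminder) or "skip".
should_alert() {
    local name=$1 now=$2 state_dir=$3
    local stamp_file="$state_dir/$name" count_file="$state_dir/$name.txt"
    local last count

    # No history yet: pretend the previous alarm was two hours ago
    [ -f "$stamp_file" ] || echo $((now - 7200)) > "$stamp_file"
    last=$(tail -n 1 "$stamp_file")
    echo "$now" >> "$stamp_file"

    # More than one hour since the previous alarm: mail and restart the count
    if [ $((now - last)) -gt 3600 ]; then
        echo 0 > "$count_file"
        echo first
        return
    fi

    # Otherwise count the suppressed alarm; mail a reminder after more than 10
    [ -f "$count_file" ] || echo 0 > "$count_file"
    count=$(( $(cat "$count_file") + 1 ))
    if [ "$count" -gt 10 ]; then
        echo 0 > "$count_file"
        echo repeat
    else
        echo "$count" > "$count_file"
        echo skip
    fi
}
```

Called once a minute for a persistent fault, this yields one immediate mail, then a reminder after every eleventh suppressed call, which corresponds to the "trouble continue 10 min" message in mail.sh.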


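Six: Running the monitoring project

First, edit the scheduled tasks:

crontab -e

The cron job is as follows. It runs once a minute, first changing into /usr/local/sbin/mon/bin and then executing the main script. Changing into bin is required; otherwise the script cannot run properly.
* * * * * cd /usr/local/sbin/mon/bin; bash main.sh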






