Requirements: we rely on a variety of custom shell alert scripts, and they need unified, standardized management.
Our machines play many different roles, but every machine is deployed with the same monitoring system. In other words, the framework is identical on every machine regardless of role; the only difference is the configuration file, which is customized for each role.
1. The idea: define a script package containing the main program, subroutines, a configuration file, a mail engine, and output logs.
Main program: the entry point of the whole package, and the lifeblood of the system.
Configuration file: the control center. It defines an on/off switch for each monitoring subroutine and specifies the application log file associated with each one.
Subroutines: the actual monitoring scripts, each of which monitors a particular indicator.
Mail engine: implemented as a Python program; it lets you define the mail server, the sender, and the sender's password.
Output logs: the monitoring system must write logs, which makes debugging and troubleshooting easier.
2. The program layout
- bin holds the main program
- conf holds the configuration file
- shares holds the monitoring sub-scripts
- mail holds the mail engine
- log holds the logs
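The layout above can be created in one step. The following is a minimal sketch, assuming nothing beyond the five directory names listed; the function name `make_skeleton` and the temporary demo root are illustrative and not part of the original package.

```shell
#!/bin/bash
# Create the five-directory layout under a given root.
# "make_skeleton" is a hypothetical helper name, not from the original scripts.
make_skeleton() {
    local root="$1"
    local d
    for d in bin conf shares mail log; do
        mkdir -p "$root/$d" || return 1
    done
}

# Demo in a throwaway directory so nothing outside it is touched.
demo_root="$(mktemp -d)"
make_skeleton "$demo_root/mon"
ls "$demo_root/mon"
```

On a real machine the root would be wherever the package is deployed, with the scripts below copied into the matching directories.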
3. The scripts
1) bin/main.sh
    #!/bin/bash
    # Switch: whether to send alert mail. 1 = send
    export send=1
    # Extract this machine's IP address from eth0
    export addr=`/sbin/ifconfig | grep -A1 "eth0 " | awk -F '[ :]+' '/inet/{print $4}'`
    dir=`pwd`
    # Keep only the last component of the directory name
    last_dir=`echo $dir | awk -F '/' '{print $NF}'`
    # Make sure the script is run from inside the bin directory; otherwise the
    # monitoring scripts, mail engine, and logs will probably not be found
    if [ "$last_dir" == "bin" ] || [ "$last_dir" == "bin/" ]; then
        conf_file="../conf/mon.conf"
    else
        echo "you should cd into the bin dir"
        exit 1
    fi
    exec 1>>../log/mon.log 2>>../log/err.log
    # Run the load-monitoring sub-script
    echo "`date +"%F %T"` load average"
    /bin/bash ../shares/load.sh
    # Run the 502-monitoring sub-script for the web server
    ## First check whether the configuration file asks for 502 monitoring
    if grep -q 'to_mon_502=1' $conf_file; then
        ## Export the log path from mon.conf as a global variable so the 502 sub-script can use it
        export log=`grep 'logfile=' $conf_file | awk -F '=' '{print $2}' | sed 's/ //g'`
        /bin/bash ../shares/502.sh
    fi
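The working-directory check in main.sh compares the last path component against both "bin" and "bin/". As a hedged alternative, `basename` already drops a trailing slash, so a single comparison is enough. The helper name `in_bin_dir` below is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed only when the last component of the given path is "bin".
# basename strips a trailing slash, so "bin" and "bin/" need only one comparison.
in_bin_dir() {
    [ "$(basename "$1")" = "bin" ]
}

if in_bin_dir "$PWD"; then
    echo "running from bin"
else
    echo "not in bin: cd into the bin directory first"
fi
```

The check still only looks at the directory name, exactly like the original; it does not verify that ../conf/mon.conf actually exists.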
2) conf/mon.conf
    ## Define the mysql server address, port, user, and password
    to_mon_cdb=0   ## 0 or 1, default 0; 0 = do not monitor, 1 = monitor
    db_ip=10.20.3.13
    db_port=3315
    db_user=username
    db_pass=passwd
    ## httpd: 1 = monitor, 0 = do not monitor
    to_mon_httpd=0
    ## php
    to_mon_php_socket=0
    ## http_code_502: requires the access log path
    to_mon_502=0
    logfile=/data/log/xxx.xxx.com/access.log
    ## request_count: requires the log path and the domain name
    to_mon_request_count=0
    req_log=/data/log/www.discuz.net/access.log
    domainname=www.discuz.net
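main.sh tests a switch with `grep -q 'to_mon_502=1'`, which only matches when there are no spaces around the equals sign. The sketch below shows one way to read a key more tolerantly. The function name `conf_get` and the sample file are assumptions made for illustration; the original package does not include such a helper.

```shell
#!/bin/bash
# Hypothetical helper: print the value of KEY from a key=value file.
# Spaces around "=" and trailing "#" comments are tolerated; the last match wins.
# Caveat: a value that itself contains "#" would be cut short.
conf_get() {
    local key="$1" file="$2"
    awk -F'=' -v k="$key" '
        /^[ \t]*#/ { next }                  # skip comment lines
        index($0, "=") == 0 { next }         # skip lines without "="
        {
            name = $1
            gsub(/[ \t]/, "", name)
            if (name != k) next
            val = substr($0, index($0, "=") + 1)
            sub(/#.*/, "", val)              # drop a trailing comment
            gsub(/^[ \t]+|[ \t]+$/, "", val) # trim surrounding blanks
            found = val
        }
        END { print found }
    ' "$file"
}

# Demo against a throwaway file written in the style of mon.conf.
demo_conf="$(mktemp)"
cat > "$demo_conf" <<'EOF'
## http_code_502
to_mon_502 = 1
logfile=/data/log/xxx.xxx.com/access.log   # access log
EOF

if [ "$(conf_get to_mon_502 "$demo_conf")" = "1" ]; then
    echo "502 monitoring enabled, log: $(conf_get logfile "$demo_conf")"
fi
```

With a helper like this, a stray space in the configuration file would no longer silently disable a monitor.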
3) mail/mail.sh
    log=$1                 # first argument passed to mail.sh
    t_s=`date +%s`
    # Timestamp from two hours ago: on the first call this makes $v -gt 3600 true,
    # so mail.sh is able to send mail immediately
    t_s2=`date -d "2 hours ago" +%s`
    if [ ! -f "/tmp/$log" ]
    then
        echo $t_s2 > "/tmp/$log"
    fi
    # The first call reads the two-hours-ago timestamp; later calls read the
    # timestamp recorded by the previous run of mail.sh
    t_s2=`tail -1 "/tmp/$log" | awk '{print $1}'`
    echo $t_s >> "/tmp/$log"
    v=$((t_s - t_s2))
    echo $v
    # Within one hour the else branch runs: once more than 10 alerts have
    # accumulated, send mail and reset the counter
    if [ $v -gt 3600 ]
    then
        # mail.py lives in ../mail relative to bin, the directory main.sh runs from
        ../mail/mail.py "$1" "$2" "$3"
        # If the counter file exists, overwrite it with the initial value 0
        echo "0" > "/tmp/$log.txt"
    else
        if [ ! -f "/tmp/$log.txt" ]
        then
            echo "0" > "/tmp/$log.txt"
        fi
        nu=`cat "/tmp/$log.txt"`
        nu2=$((nu + 1))
        echo $nu2 > "/tmp/$log.txt"
        if [ $nu2 -gt 10 ]
        then
            ../mail/mail.py "$1" "trouble continue 10 min $2" "$3"
            echo "0" > "/tmp/$log.txt"
        fi
    fi
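The throttling rule in mail.sh is easier to follow when separated from the file handling. The sketch below restates it as a pure function; the name `throttle_decision` and the action words are hypothetical. Note that the gap is measured from the previous call of mail.sh, not from the last mail actually sent, so the "10 min" wording in the subject assumes the monitor runs about once a minute.

```shell
#!/bin/bash
# Hypothetical restatement of the throttle in mail.sh as a pure function.
# Arguments: seconds since the previous alert, and the current suppressed-alert count.
# Prints "<action> <new_count>", where action is send, escalate, or skip.
throttle_decision() {
    local gap="$1" count="$2"
    if [ "$gap" -gt 3600 ]; then
        echo "send 0"                 # quiet for over an hour: mail at once, reset
    elif [ $((count + 1)) -gt 10 ]; then
        echo "escalate 0"             # more than 10 alerts within the hour: mail, reset
    else
        echo "skip $((count + 1))"    # suppress this alert and keep counting
    fi
}

throttle_decision 7200 0    # first alert after a long quiet period
throttle_decision 60 3      # fourth alert in a burst
throttle_decision 60 10     # eleventh alert in a burst
```

The three demo calls print `send 0`, `skip 4`, and `escalate 0` respectively.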
4) shares/load.sh
    #!/bin/bash
    # Integer part of the 1-minute load average
    load=`uptime | awk -F 'average:' '{print $2}' | cut -d',' -f1 | sed 's/ //g' | cut -d. -f1`
    if [ "$load" -gt 10 ] && [ "$send" -eq 1 ]
    then
        echo "$addr `date +%T` load is $load" > ../log/load.tmp
        # ${addr}_load needs braces: inside double quotes, \_ would keep the backslash
        /bin/bash ../mail/mail.sh [email protected] "${addr}_load:$load" "`cat ../log/load.tmp`"
    fi
    echo "`date +%T` load is $load"
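If the load value cannot be parsed, a test such as `[ $load -gt 10 ]` fails with a shell error instead of a clean result. The sketch below separates parsing from the threshold check and validates the number first. The function names and the sample uptime line are illustrative assumptions; the exact format of `uptime` output can differ between systems.

```shell
#!/bin/bash
# Hypothetical helper: integer part of the 1-minute load average,
# taken from a line of uptime output passed as the first argument.
load_int() {
    printf '%s\n' "$1" | awk -F'load average: ' '{ split($2, parts, /[.,]/); print parts[1] }'
}

# Hypothetical helper: succeed when LOAD exceeds THRESHOLD (default 10).
# Returns 2 when LOAD is empty or not a non-negative integer.
is_overloaded() {
    local load="$1" threshold="${2:-10}"
    case "$load" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$load" -gt "$threshold" ]
}

# Demo with a made-up uptime line.
sample=' 10:15:01 up 12 days,  3:04,  2 users,  load average: 12.07, 8.00, 5.10'
load="$(load_int "$sample")"
if is_overloaded "$load"; then
    echo "load is $load, above threshold"
fi
```

In the real script the argument would be `"$(uptime)"` rather than a fixed sample string.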
5) shares/502.sh
    #!/bin/bash
    # The previous minute, as HH:MM
    d=`date -d "-1 min" +%H:%M`
    # Count the lines logged in that minute that contain 502
    c_502=`grep ":$d:" "$log" | grep -c '502'`
    if [ "$c_502" -gt 10 ] && [ "$send" == 1 ]; then
        echo "$addr $d 502 count is $c_502" > ../log/502.tmp
        /bin/bash ../mail/mail.sh "${addr}_502" "$c_502" ../log/502.tmp
    fi
    echo "`date +%T` 502 $c_502"
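Matching the text 502 anywhere on a line also counts lines where 502 appears in the URL or as the response size. The sketch below counts by the status field instead. It assumes the access log uses the common "combined" format, where the timestamp is the fourth whitespace-separated field and the status code is the ninth, and that request URLs contain no spaces; the source does not state the log format, and the function name `count_502` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count requests in a given minute (HH:MM) whose status is 502.
# Assumes combined log format: $4 holds the timestamp, $9 the status code.
count_502() {
    local minute="$1" logfile="$2"
    awk -v m=":$minute:" 'index($4, m) && $9 == "502" { n++ } END { print n + 0 }' "$logfile"
}

# Demo against a made-up three-line log.
demo_log="$(mktemp)"
cat > "$demo_log" <<'EOF'
1.2.3.4 - - [10/Oct/2023:13:55:36 +0800] "GET /a HTTP/1.1" 502 166 "-" "curl/7.0"
1.2.3.4 - - [10/Oct/2023:13:55:40 +0800] "GET /502.html HTTP/1.1" 200 502 "-" "curl/7.0"
1.2.3.4 - - [10/Oct/2023:13:56:01 +0800] "GET /b HTTP/1.1" 502 166 "-" "curl/7.0"
EOF

echo "502 count at 13:55: $(count_502 13:55 "$demo_log")"
```

Only the first demo line is counted for 13:55: the second has status 200 and merely mentions 502 in the URL and size, which the plain text match would have counted as well.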