Author: JackTian
Source: Public Account "Jake's IT Journey"
ID: Jake_Internet
Link: I wrote a script to monitor ElasticSearch process anomalies!
Environment preparation: configuring passwordless SSH on the server:
Before configuring passwordless login, map each target hostname to its IP address in the server's hosts file.
vim /etc/hosts
IP1 hostname1
IP2 hostname2
......
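Before moving on, it can help to confirm that every target hostname really is present in the hosts file. The following is a minimal sketch, not part of the original scripts; the helper name host_in_file is hypothetical, and it only assumes the standard hosts format (an IP followed by one or more names, with # starting a comment).

```shell
#!/bin/bash
# Hypothetical helper (not from the original article): succeed if hostname $2
# appears as a name on a non-comment line of the hosts-format file $1.
host_in_file() {
    local file=$1 name=$2
    awk -v h="$name" '
        /^[[:space:]]*#/ { next }              # skip lines that are entirely comments
        {
            for (i = 2; i <= NF; i++) {        # field 1 is the IP, the rest are names
                if ($i ~ /^#/) break           # stop at a trailing comment
                if ($i == h) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Example use (left commented out so that sourcing this file has no side effects):
# for h in hostname1 hostname2; do
#     host_in_file /etc/hosts "$h" || echo "missing from /etc/hosts: $h"
# done
```

The match is on whole names, so hostname1 does not accidentally match hostname10.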
Unzip the mianmiyaojiaoben.zip package in the /usr/local/jiaoben directory:
cd /usr/local/jiaoben
unzip mianmiyaojiaoben.zip
Edit the mianmiyao_config configuration file and add the target hostnames and their passwords. The passwordless-login script reads this file.
vim mianmiyao_config
AllHosts=hostname1,hostname2
Passwd='test23!\@Test^&*','test23!\@Test^&*'
Notes on the configuration file:
- AllHosts: the hostnames of the target hosts that the current host should reach without a password. The current host itself may be included, and there is no limit on the number of hosts. Separate multiple hosts with commas.
- Passwd: the password for each host, listed in the same order as the hosts in AllHosts.
- Special characters in a password can be escaped with the \ character. For example, the original password test23!@Test^&* is written as test23!\@Test^&* in the file.
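Because hosts and passwords are paired purely by position, a mismatch in the number of entries silently pairs the wrong password with a host. A small sketch of a pre-flight check follows; count_fields and lists_match are hypothetical helper names, not part of the original package.

```shell
#!/bin/bash
# Hypothetical helpers (not from the original article).

# Print the number of comma-separated entries in $1.
count_fields() {
    printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# Succeed only if the host list $1 and the password list $2
# contain the same number of comma-separated entries.
lists_match() {
    [ "$(count_fields "$1")" -eq "$(count_fields "$2")" ]
}

# Example use after loading the configuration (commented out on purpose):
# source mianmiyao_config
# lists_match "$AllHosts" "$Passwd" || echo "host and password counts differ"
```

Note that a password containing a comma would be counted as two entries, which is also how the comma-separated format itself would interpret it.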
Contents of mianmiyao.sh script file:
vim mianmiyao.sh
#!/bin/bash -x
source mianmiyao_config
yum -y install expect expect-devel
#rm -rf /root/.ssh/*
/usr/bin/expect -d <<-EOF
set timeout 100
spawn ssh-keygen -t rsa
expect {
"*id_rsa):" { send "\r"; exp_continue }
"*(y/n)?" { send "y\r"; exp_continue }
"*passphrase)*" { send "\r"; exp_continue }
"*again:" { send "\r"; exp_continue }
"*-------+" { send "\r"}
}
expect eof
EOF
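# Split the comma-separated lists from mianmiyao_config (AllHosts, Passwd) into arrays
IFS=',' read -r -a hostsarr <<< "$AllHosts"
IFS=',' read -r -a passwdarr <<< "$Passwd"
num=${#hostsarr[@]}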
for((i=0;i<num;i++));
do
/usr/bin/expect <<-EOF
set timeout 100
spawn ssh-copy-id ${hostsarr[i]}
expect {
"*(yes/no)?" { send "yes\r"; exp_continue }
"*password:" { send "${passwdarr[i]}\r"; exp_continue }
"*authorized_keys*" { send "\r"}
}
expect eof
exit
EOF
done
Add execute permission to mianmiyao.sh file and execute this script
chmod +x mianmiyao.sh
./mianmiyao.sh
After the script finishes, run the following command manually. If you are logged in to the target server without being asked for a password, the setup succeeded.
ssh hostname2
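The manual check above can also be done non-interactively for every host at once. The sketch below is an illustration rather than part of the original scripts: check_passwordless is a hypothetical helper, and it relies on the OpenSSH options BatchMode=yes (fail instead of prompting for a password) and ConnectTimeout.

```shell
#!/bin/bash
# Hypothetical helper (not from the original article): for each host in the
# comma-separated list $1, report whether key-based login works.
# Returns non-zero if at least one host still needs a password or is unreachable.
check_passwordless() {
    local host failed=0
    for host in $(printf '%s' "$1" | tr ',' ' '); do
        # BatchMode=yes disables password prompts, so a missing key fails fast.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null 2>/dev/null; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example use (commented out on purpose):
# check_passwordless "hostname1,hostname2"
```

A FAIL line does not distinguish between a missing key and an unreachable host, so it is only a first hint about where to look.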
Environment preparation: deploying ElasticSearch monitoring on the server:
Add the ES cluster hostnames, the ES port, and the hostname of the ES master node to the cpufreedisk_config configuration file.
vim cpufreedisk_config
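# Hostnames of all ES cluster nodes, separated by commas; the script must run on the machine configured for passwordless SSH
esHosts=hostname1,hostname2
# ES port
esport=9200
# Hostname of the ES master node server
esMaster=hostname1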
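A typo in this file only shows up later as a failed curl call, so a quick sanity check of the three values can save time. The following sketch is an assumption-laden illustration: validate_es_config is a hypothetical helper, and the rule that the master hostname should also appear in the cluster host list is an assumption, not something the original script enforces.

```shell
#!/bin/bash
# Hypothetical helper (not from the original article).
# $1 = comma-separated cluster hostnames, $2 = ES port, $3 = master hostname.
validate_es_config() {
    local hosts=$1 port=$2 master=$3
    case $port in
        ''|*[!0-9]*)
            echo "port must be numeric: '$port'"
            return 1 ;;
    esac
    # Assumption: the master node is one of the listed cluster hosts.
    case ",$hosts," in
        *",$master,"*) ;;
        *)
            echo "master '$master' is not in the host list"
            return 1 ;;
    esac
    echo "config looks consistent"
}

# Example use (commented out on purpose):
# validate_es_config "hostname1,hostname2" 9200 hostname1
```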
Put the cpufreedisk.sh script file into the /usr/local/jiaoben/ directory of the ElasticSearch server
#!/bin/bash
# @Time : 2023/02/01
# @Author : JackTian
# @File : cpufreedisk.sh
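# @Desc : Monitors the ES process for hangs, crashes, and other abnormal states, as well as server network loss and outages; once a server is back, the script decides how to recover and records the server's CPU, memory, and disk status.
# Prerequisite: passwordless SSH is configured for the ES cluster servers
# Usage: place cpufreedisk.sh in /usr/local/jiaoben/ on the ES server, and set the ES cluster hostnames, the port, and the ES master node hostname in cpufreedisk_config
# Set up a scheduled task (you can run the script manually first)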
# 0 6 * * * source /etc/profile && cd /usr/local/jiaoben && ./cpufreedisk.sh
source /usr/local/jiaoben/cpufreedisk_config
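# Query the node's HTTP endpoint; prints the cluster_name line if ES answers
function esStatus
{
curl --connect-timeout 30 -m 60 "$1:$esport" > resultEsCurl.log
echo "`cat resultEsCurl.log | grep cluster_name`"
}
# Look up the node's IP in /etc/hosts and check whether the master lists it in _cat/nodes
function esLost
{
iptemp=`cat /etc/hosts | grep -w $1 | grep '^[^#]' | awk '{print $1}'`
curl --connect-timeout 30 -m 60 "$esMaster:$esport/_cat/nodes?v" | grep $iptemp > resultEsCurl1.log
echo "`cat resultEsCurl1.log`"
}
# Print the PIDs of Elasticsearch JVMs on the remote host (empty if none is running)
function esDie
{
ssh $1 "source /etc/profile && jps | grep Elasticsearch | awk '{print \$1}' | xargs"
}
# Placeholder: only prints a reminder on the remote host; it does not start ES itself
function restart
{
ssh $1 <<EOF
echo "Please start the ES process manually"
exit
EOF
}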
today=$(date +"%Y-%m-%d")
todaytime=`date`
# Log and handle ES hangs, server outages, and crashed processes
serverroothostname=(${esHosts//,/ })
for rootHost in ${serverroothostname[*]}
do
esStatusResult=`esStatus $rootHost`
echo "Status of $rootHost: $esStatusResult"
if [ -n "$esStatusResult" ];then
esLostResult=`esLost $rootHost`
echo "Status of $rootHost: $esLostResult"
if [ -n "$esLostResult" ];then
echo "ES is running normally."
else
echo "$rootHost has left the cluster."
echo "${todaytime} ES node ${rootHost} has left the cluster. Please investigate manually." >> /usr/local/jiaoben/ESmanager.log
restart $rootHost
fi
else
echo "${todaytime} xxx system: the ES process on $rootHost is in an abnormal state, restarting..." >> /usr/local/jiaoben/ESmanager.log
echo "${todaytime} xxx system: restarting $rootHost" >> /usr/local/jiaoben/ESmanager.log
ssh $rootHost <<EOF >>/usr/local/jiaoben/ESmanager.log
mkdir -p /usr/local/jiaoben/
cd /usr/local/jiaoben/
echo "-------------------------------------- server divider -------------------------------------------"
echo "$rootHost disk information"
df -h
echo "$rootHost memory information (human-readable units)"
free -h
echo "$rootHost CPU information"
vmstat
exit
EOF
if [ $? -eq 0 ];then
esDieResult=`esDie $rootHost`
if [ -n "$esDieResult" ];then
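echo "${todaytime} xxx system: ES appears to be hung; a restart was triggered as a temporary fix, see the log for details" >> /usr/local/jiaoben/ESmanager.log
else
echo "${todaytime} xxx system: ES is not running; a restart was triggered as a temporary fix, see the log for details" >> /usr/local/jiaoben/ESmanager.log
fi
else
echo "${todaytime} xxx system: the ES server appears to be down: ssh login failed" >> /usr/local/jiaoben/ESmanager.log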
fi
restart $rootHost
fi
done
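One detail of the esLost function is worth a closer look: it filters the node list with a plain grep on the IP, which is a substring match, so 10.0.0.1 would also match a row for 10.0.0.11. The sketch below shows an exact-match alternative. It is not the author's code; ip_in_nodes is a hypothetical helper, and it assumes the IP address is the first column of the `_cat/nodes?v` output and that the first line is the header.

```shell
#!/bin/bash
# Hypothetical helper (not from the original article): read `_cat/nodes?v`
# output on stdin and succeed only if a data row's first column equals $1.
# Assumption: the IP address is the first column and line 1 is the header.
ip_in_nodes() {
    awk -v ip="$1" '
        NR > 1 && $1 == ip { found = 1 }   # exact comparison, not a substring match
        END { exit (found ? 0 : 1) }
    '
}

# Example use inside a script that has esMaster, esport, and iptemp set
# (commented out on purpose):
# curl -s --connect-timeout 30 -m 60 "$esMaster:$esport/_cat/nodes?v" | ip_in_nodes "$iptemp" \
#     || echo "$iptemp is not listed by the master"
```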
Add execute permission to the cpufreedisk.sh script file and run it:
chmod +x cpufreedisk.sh
./cpufreedisk.sh
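When reading ESmanager.log after a run, it helps to keep the script's decision tree in mind: HTTP response first, then cluster membership, then ssh reachability, then the presence of an Elasticsearch process. The sketch below restates that logic as a pure function so it can be tried without a cluster. The function name classify_node and the state labels are hypothetical, chosen only for illustration.

```shell
#!/bin/bash
# Hypothetical restatement of the script's decision tree (not the author's code).
# $1 = output of the HTTP status check (empty if ES did not answer)
# $2 = output of the cluster membership check (empty if the node is not listed)
# $3 = exit status of the ssh session to the host (0 means ssh worked)
# $4 = PIDs of Elasticsearch processes on the host (empty if none)
classify_node() {
    local http=$1 member=$2 ssh_status=$3 pids=$4
    if [ -n "$http" ]; then
        if [ -n "$member" ]; then
            echo "healthy"
        else
            echo "left-cluster"
        fi
    elif [ "$ssh_status" -ne 0 ]; then
        echo "server-down"       # cannot even log in via ssh
    elif [ -n "$pids" ]; then
        echo "hung"              # process exists but does not answer HTTP
    else
        echo "not-running"       # no Elasticsearch process at all
    fi
}

# Example use (commented out on purpose):
# classify_node "" "" 0 "1234"
```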
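Set up a cron job so that the script runs every day at 06:00:
crontab -e
# Monitors the ES process for hangs, crashes, and other abnormal states, as well as server network loss and outages; once a server is back, the script decides how to recover and records the server's CPU, memory, and disk status.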
0 6 * * * source /etc/profile && cd /usr/local/jiaoben && ./cpufreedisk.sh