CDH operations and maintenance: monitoring Impala service failures with DingTalk alerts
Preface
Some of the company's early-morning scheduled jobs are Impala tasks. Because Impala does not release the connections opened by DDL operations, Impala is restarted on a schedule at 23:30. The restart sometimes fails, so we add a monitoring script with DingTalk notifications to detect and fix problems promptly.
Table of Contents
- Get DingTalk information
- Python script to send DingTalk notifications
- Set up a scheduled task
Get DingTalk information
Obtain the robot's webhook and signing secret.
webhook:
https://oapi.dingtalk.com/robot/send?access_token=[AAAA] (here [AAAA] stands in for the real value)
Signing secret:
[BBBB] (here [BBBB] stands in for the real value)
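The webhook and the secret are combined into a signed URL before each request. As a minimal sketch of that step, assuming the token and secret are supplied through environment variables rather than hard-coded (the names `DINGTALK_TOKEN`, `DINGTALK_SECRET`, and `build_signed_url` are hypothetical and do not appear in the original script):

```python
import base64
import hashlib
import hmac
import os
import time
import urllib.parse


def build_signed_url(access_token, secret, timestamp_ms=None):
    """Return the robot webhook URL with timestamp and sign parameters appended."""
    if timestamp_ms is None:
        # Current time in milliseconds
        timestamp_ms = round(time.time() * 1000)
    # The string to sign is "<timestamp>\n<secret>", keyed with the secret itself
    string_to_sign = "{}\n{}".format(timestamp_ms, secret)
    digest = hmac.new(
        secret.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        digestmod=hashlib.sha256,
    ).digest()
    # Base64-encode the HMAC-SHA256 digest, then URL-encode it
    sign = urllib.parse.quote_plus(base64.b64encode(digest))
    query = "access_token={}&timestamp={}&sign={}".format(
        access_token, timestamp_ms, sign
    )
    return "https://oapi.dingtalk.com/robot/send?" + query


if __name__ == "__main__":
    # Hypothetical environment variable names; the fallbacks are the article's placeholders
    token = os.environ.get("DINGTALK_TOKEN", "AAAA")
    secret = os.environ.get("DINGTALK_SECRET", "BBBB")
    print(build_signed_url(token, secret))
```

Keeping the real values out of the script file means the script can be shared or committed without exposing the robot's credentials.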
Python script to send DingTalk notifications (Python 3.x)
#!/home/user/nlpconda/bin/python
# -*- coding: UTF-8 -*-
# desc: monitor Impala ports and send DingTalk alert messages
# created by DBC
# time 20201118
# modify
from datetime import datetime
import json
import urllib.request
import time
import hmac
import hashlib
import base64
import urllib.parse
import telnetlib
import sys
# Send a DingTalk message
def send_request(url, datas):
    # Send a POST request with the given URL and payload
    # Build the request headers
    header = {
        "Content-Type": "application/json",
        "Charset": "UTF-8"
    }
    sendData = json.dumps(datas)  # serialize the dict to a JSON string
    sendDatas = sendData.encode("utf-8")  # Request in Python 3 requires data as bytes
    # Build the request
    request = urllib.request.Request(url=url, data=sendDatas, headers=header)
    # Send it; urlopen returns a file-like response object
    opener = urllib.request.urlopen(request)
    # Print the response body
    print(opener.read())
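# Get the DingTalk message template
# Replace 13000000000 with the real mobile number of a group member, add [@mobile number]
# to the content, and set isAtAll to False to @ that user
def get_ddmodel_datas(type):
    # Return the DingTalk message template. 1: text; 2: markdown to everyone;
    # 3: markdown with image, @ recipients; 4: link.
    # Only types 1 and 3 are implemented here.
    if type == 1:
        my_data = {
            "msgtype": "text",
            "text": {
                "content": "[test] I am what I am, a firework unlike any other"
            },
            "at": {
                "atMobiles": [
                    "13000000000"
                ],
                "isAtAll": True
            }
        }
    elif type == 3:
        my_data = {
            "msgtype": "markdown",
            "markdown": {
                "title": " ",
                "text": " "
            },
            "at": {
                "atMobiles": [
                    "13000000000"
                ],
                "isAtAll": True
            }
        }
    else:
        # Avoid returning an unbound variable for unimplemented types
        raise ValueError("unsupported message type: {}".format(type))
    return my_data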
# Compute the signature for the DingTalk notification URL
# A simple wrapper based on the DingTalk Open Platform docs: https://ding-doc.dingtalk.com/doc#/serverapi2/qf2nxq
def get_sign(timestamp, secret):
    secret_enc = secret.encode('utf-8')
    string_to_sign = '{}\n{}'.format(timestamp, secret)
    string_to_sign_enc = string_to_sign.encode('utf-8')
    hmac_code = hmac.new(secret_enc, string_to_sign_enc, digestmod=hashlib.sha256).digest()
    sign = urllib.parse.quote_plus(base64.b64encode(hmac_code))
    return sign
# Build the DingTalk markdown string, in the following style
#| Header | Header |
#| :--- | :--- |
#| Cell | Cell |
#| Cell | Cell |
def sg_md_deal(head, data):
    sig = '|'
    cols_list = head
    sig_cols = sig + sig.join(('**' + i + '**' for i in cols_list)) + sig + ' \n '
    sig_tab = sig + sig.join([':---' for i in range(len(cols_list))]) + sig + ' \n '
    sig_md_str = '### {} Impala service abnormal \n'.format(datetime.now().strftime("%Y-%m-%d %H:%M"))
    sig_md_str = sig_md_str + sig_cols + sig_tab
    for value in data:
        sig_val = sig + sig.join([' ' + str(i) for i in value]) + sig + ' \n '
        sig_md_str = sig_md_str + sig_val
    return sig_md_str
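# Check the port status
def get_err_host_port(list_host, list_port):
    list_err_host_port = []
    for host in list_host:
        for port in list_port:
            try:
                # A timeout keeps an unresponsive host from blocking the script
                conn = telnetlib.Telnet(host=host, port=port, timeout=5)
                conn.close()
                # Only abnormal ports are reported; normal ones are ignored
                # list_err_host_port.append((host, port, "normal"))
            except Exception:
                list_err_host_port.append((host, port, "abnormal"))
    return list_err_host_port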
if __name__ == "__main__":
    # Webhook URL
    my_url = "https://oapi.dingtalk.com/robot/send?" \
             "access_token=[AAAA]"
    # Signing secret
    secret = '[BBBB]'
    timestamp = str(round(time.time() * 1000))
    sign = get_sign(timestamp=timestamp, secret=secret)
    # Hosts to monitor
    list_host = ['172.16.20.140', '172.16.20.141', '172.16.20.142', '172.16.20.143']
    # Ports to monitor
    # list_port = ['21000', '21050', '22000', '23000', '25000', '24000', '26000', '25010', '25020']
    list_port = ['21050']
    # Table header
    head = [' Host', ' Port', ' Status']
    # print('Main! The time is: %s' % datetime.now())
    # 3. Markdown (with image, @ recipients)
    my_data = get_ddmodel_datas(3)
    my_data["markdown"]["title"] = "Impala alert"
    err_host_port = get_err_host_port(list_host=list_host, list_port=list_port)
    # If nothing is abnormal, exit without sending a message. This avoids spamming the group;
    # otherwise users (ops) may mute the robot and the monitoring stops being useful.
    if not err_host_port:
        sys.exit()
    my_data["markdown"]["text"] = sg_md_deal(head, err_host_port)
    my_url = my_url + '&timestamp=' + timestamp + '&sign=' + sign
    print(my_url)
    send_request(my_url, my_data)
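The port check in the script relies on `telnetlib`, which was deprecated in Python 3.11 and removed from the standard library in Python 3.13. On newer interpreters the same reachability test can be sketched with the `socket` module. The function names `is_port_open` and `find_unreachable` and the 3-second default timeout are illustrative assumptions, not part of the original script:

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unresolvable host)
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False


def find_unreachable(list_host, list_port, timeout=3.0):
    """Return (host, port, status) tuples for every port that cannot be reached."""
    return [
        (host, port, "abnormal")
        for host in list_host
        for port in list_port
        if not is_port_open(host, port, timeout)
    ]


if __name__ == "__main__":
    # Same host and port values as in the script above
    hosts = ['172.16.20.140', '172.16.20.141', '172.16.20.142', '172.16.20.143']
    print(find_unreachable(hosts, ['21050']))
```

The returned list has the same shape as the result of get_err_host_port, so it could be passed to sg_md_deal unchanged.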
Set up a scheduled task
The monitoring script is run by crontab every 5 minutes
# Edit with crontab -e
# Run every 5 minutes; Python_home is the Python 3 path, cron_task is the directory containing the script above
*/5 * * * * ${Python_home}/python ${cron_task}/impala_restart_report_to_dingding.py
# Save, then verify with crontab -l
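To make the schedule concrete, the minute field `*/5` can be expanded into the minutes of each hour at which the job fires. This is only an illustrative sketch that handles the `*` and `*/N` forms; `expand_step_field` is a hypothetical helper, not part of cron or of the original script:

```python
def expand_step_field(field, low, high):
    """Expand a cron field of the form '*' or '*/N' into the values it matches."""
    if field == "*":
        return list(range(low, high + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        # '*/N' matches every N-th value starting from the lowest allowed value
        return list(range(low, high + 1, step))
    raise ValueError("only '*' and '*/N' are handled in this sketch")


if __name__ == "__main__":
    # The minute field of the crontab entry above; minutes range from 0 to 59
    print(expand_step_field("*/5", 0, 59))
```

With `*/5` the job fires at minutes 0, 5, 10 and so on up to 55, which is 12 runs per hour.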
Reference documents:
DingTalk Open Platform
https://ding-doc.dingtalk.com/doc#/serverapi2/qf2nxq
Implementing a Telnet connection in Python
https://www.cnblogs.com/jieliu8080/p/10511128.html
Ports used by Impala
https://www.cnblogs.com/qiumingcheng/p/8045746.html
DingTalk alerts based on the markdown message type, for more readable messages
https://blog.51cto.com/bensonzy/2293957
Implementing Telnet functionality in Python 3 (same topic as "Implementing a Telnet connection in Python")
https://www.cnblogs.com/fyly/p/10823539.html