How to implement custom multi-function alerting with Nagios

Nagios is a plugin-based monitoring system. It can monitor system operating status, network information, and services on specified local or remote hosts, and it sends notifications when something goes wrong. Nagios collects data through client plugins: by writing a plugin you can gather many kinds of monitoring data, and a web management interface is provided for querying it. The product focuses on monitoring service availability and raising alerts according to configured thresholds, but most of the alerting logic is implemented by the monitoring plugins.

 

Nagios currently supports alert notification by SMS and email. With only these two channels, important alerts are easily overlooked and problems are not resolved in time, and during an alert storm they cannot satisfy users' varied notification needs. How to deliver alerts through a variety of notification channels according to each user's needs, while effectively preventing alert storms, has therefore become a pressing problem.

 

Cloud Alert, the intelligent alert management platform developed in-house by Ruixiangyun, is a good solution to this problem. Users can specify multiple notification channels for different alerting needs; the channels currently supported include phone calls, SMS, WeChat, email, the mobile app, and DingTalk. When an alert storm hits, users can enable intelligent noise-reduction algorithms or define alert compression rules so that alerts of the same type are compressed, which effectively prevents alert storms. Users can also connect different monitoring platforms to Cloud Alert for unified management.

 

Next, let's look at how to set it up.

 

The setup has three parts. First, integrate Nagios into the platform. Second, set an assignment policy, which specifies (by user-defined criteria) who is notified when an alert occurs. Finally, set a notification policy, which configures the notification methods according to the user's own needs.

 

To begin, go to the official website www.aiops.com and log in to your account.

 

 One. Nagios Integration

 

1. Create a Nagios application in Cloud Alert: click Integration - Monitoring tools - Nagios

 

2. Fill in the "Application Name", click "Save and get application key"

 

3. Download Agent installation package

 

On the Nagios server, download the package as the root or nagios user

 

wget https://download.aiops.com/ca_agent/nagios/ca_agent-4.1.3.1-linux-x64.tar.gz

 

4. Install the Agent

 

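Note: the steps below use the default Nagios installation path /usr/local/nagios/ as an example. If your Nagios server is not installed in that directory, substitute your own path.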

 

tar -xzf ca_agent-4.1.3.1-linux-x64.tar.gz

cp -R ca_agent /usr/local/nagios/libexec/

cp ca_agent/plugin/nagios-plugin/nagios /usr/local/nagios/libexec/

chmod +x /usr/local/nagios/libexec/nagios

cp ca_agent/plugin/nagios-plugin/cloudalert.cfg /usr/local/nagios/etc/objects/
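If Nagios is not installed under /usr/local/nagios, the same copy steps can be wrapped in a small function that takes the install prefix as a parameter. This is only a sketch: the function name install_ca_agent and its arguments are illustrative, and it assumes the archive has already been extracted and contains a ca_agent directory with the layout used in the commands above.

```shell
# Hypothetical helper: copy the extracted agent into a Nagios installation.
# $1 = path to the extracted ca_agent directory
# $2 = Nagios install prefix (defaults to /usr/local/nagios)
install_ca_agent() {
    agent_dir="$1"
    nagios_home="${2:-/usr/local/nagios}"
    plugin_dir="$agent_dir/plugin/nagios-plugin"

    # Stop early if the expected files are missing.
    [ -f "$plugin_dir/nagios" ] || { echo "missing $plugin_dir/nagios" >&2; return 1; }
    [ -f "$plugin_dir/cloudalert.cfg" ] || { echo "missing $plugin_dir/cloudalert.cfg" >&2; return 1; }

    # Same four steps as the manual commands, with the prefix substituted.
    cp -R "$agent_dir" "$nagios_home/libexec/" &&
    cp "$plugin_dir/nagios" "$nagios_home/libexec/" &&
    chmod +x "$nagios_home/libexec/nagios" &&
    cp "$plugin_dir/cloudalert.cfg" "$nagios_home/etc/objects/"
}
```

For example, with Nagios installed under a hypothetical /opt/nagios, the call would be install_ca_agent ./ca_agent /opt/nagios.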

 

5. Modify the configuration

 

① Edit /usr/local/nagios/etc/objects/cloudalert.cfg and set pager to the appkey you obtained when you clicked save

 

vi /usr/local/nagios/etc/objects/cloudalert.cfg 

 

define contact{

contact_name                    cloudalert              ; The name of this contact

alias                           ca                 ;

service_notification_period     24x7                    ; service notifications can be sent anytime

host_notification_period        24x7                    ; host notifications can be sent anytime

service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events

host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events

service_notification_commands   notify-service-by-cloudalert ; send service notifications via Cloud Alert

host_notification_commands      notify-host-by-cloudalert    ; send host notifications via Cloud Alert

pager                           --                           ; replace -- with the appkey generated when you created the application

}

 

② Edit /usr/local/nagios/etc/objects/contacts.cfg and add cloudalert to the default contact group

 

vi /usr/local/nagios/etc/objects/contacts.cfg

 

define contactgroup{

contactgroup_name       admins

alias                   Nagios Administrators

members                 nagiosadmin,cloudalert

}

 

③ Edit /usr/local/nagios/etc/nagios.cfg and add cloudalert.cfg to it

 

vi /usr/local/nagios/etc/nagios.cfg

 

cfg_file=/usr/local/nagios/etc/objects/cloudalert.cfg
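If this step is scripted, it helps to make it idempotent so that running it twice does not add the cfg_file line twice. A minimal sketch, with the hypothetical function name add_cfg_file:

```shell
# Hypothetical helper: add a cfg_file entry to nagios.cfg only if it is absent.
# $1 = path to nagios.cfg
# $2 = path of the object config file to register
add_cfg_file() {
    main_cfg="$1"
    line="cfg_file=$2"

    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$main_cfg" || printf '%s\n' "$line" >> "$main_cfg"
}
```

For the default layout this would be called as add_cfg_file /usr/local/nagios/etc/nagios.cfg /usr/local/nagios/etc/objects/cloudalert.cfg.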

 

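④ Optional: to make alert information display in a friendlier format, we recommend changing the date_format setting in nagios.cfg from us to iso8601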

 

vi /usr/local/nagios/etc/nagios.cfg

 

6. Restart Nagios

Before restarting, check that the configuration is correct

 

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

 

Restart Nagios using the root account

 

service nagios restart
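The check and the restart can be chained so that Nagios is only restarted when the configuration verifies cleanly. This is a sketch under the assumption that the nagios -v check exits with a non-zero status when it finds errors; the function name restart_if_valid and the RESTART_CMD override are illustrative only.

```shell
# Hypothetical wrapper: restart Nagios only if the configuration check passes.
# $1 = nagios binary, $2 = main config file (both default to the paths above)
# RESTART_CMD may be set to override the restart command.
restart_if_valid() {
    nagios_bin="${1:-/usr/local/nagios/bin/nagios}"
    nagios_cfg="${2:-/usr/local/nagios/etc/nagios.cfg}"
    restart_cmd="${RESTART_CMD:-service nagios restart}"

    if "$nagios_bin" -v "$nagios_cfg" >/dev/null; then
        # Intentionally unquoted so the command string splits into words.
        $restart_cmd
    else
        echo "configuration check failed; Nagios was not restarted" >&2
        return 1
    fi
}
```

Run it as root with no arguments to use the default paths.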

 

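7. Verify the integration

Log in to the Nagios web console and send a notification

Note: make sure notifications_enabled is set to 1 for the corresponding service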

 

define service{

use                             local-service         ; Name of service template to use

host_name                       localhost

service_description             Tomcat18080

check_command                   check_http18080

notifications_enabled           1

}

 

 

8. Check the agent log; the word success indicates that the integration works

tail -f /usr/local/nagios/libexec/ca_agent/log/agent.log 

 

 

10-05-2015 15:48:53,056 CST INFO  [main] [com.upyoo.agent.NagiosClient@45] start to call alert ...

10-05-2015 15:48:53,063 CST INFO  [main] [com.upyoo.agent.CommandClient@82] alarmName:PROBLEM Service Alert: 127.0.0.1/Tomcat18080 is CRITICAL

 

10-05-2015 15:48:53,064 CST INFO  [main] [com.upyoo.agent.CommandClient@82] alarmContent:localhost/127.0.0.1/Tomcat18080 connect to address 127.0.0.1 and port 18080: Connection refused Date/Time: 2015-05-10 15:48:52

 

10-05-2015 15:48:53,064 CST INFO  [main] [com.upyoo.agent.CommandClient@82] entityName:127.0.0.1/Tomcat18080

 

10-05-2015 15:48:53,066 CST INFO  [main] [com.upyoo.agent.CommandClient@82] priority:CRITICAL

10-05-2015 15:48:53,066 CST INFO  [main] [com.upyoo.agent.CommandClient@82] app:9c4bc722-6677-9fc9-fbdc-003d8977d17e

 

10-05-2015 15:48:53,067 CST INFO  [main] [com.upyoo.agent.CommandClient@82]

10-05-2015 15:48:53,068 CST INFO  [main] [com.upyoo.agent.CommandClient@82]

10-05-2015 15:48:53,068 CST INFO  [main] [com.upyoo.agent.CommandClient@82]

10-05-2015 15:48:53,069 CST INFO  [main] [com.upyoo.agent.CommandClient@82]

10-05-2015 15:48:53,105 CST INFO  [main] [com.upyoo.agent.CommandClient@58] start to post url:http://api.aiops.com/alert/api/event

 

10-05-2015 15:48:53,180 CST INFO  [main] [com.upyoo.agent.CommandClient@65] body:{"app":"9c4bc722-6677-9fc9-fbdc-003d8977d17e","alarmContent":"localhost/127.0.0.1/Tomcat18080 connect to address 127.0.0.1 and port 18080: Connection refused Date/Time: 2015-05-10 15:48:52","eventId":"8G8OGOYUCOOLOENYOGGENOOOOONYNOLU","priority":"3","alarmName":"PROBLEM Service Alert: 127.0.0.1/Tomcat18080 is CRITICAL","eventType":"trigger","entityName":"127.0.0.1/Tomcat18080"}

 

10-05-2015 15:48:53,775 CST INFO  [main] [com.upyoo.agent.CommandClient@68] result:{"result":"success","message":null,"data":"3516","totalCount":0,"code":"200"} 
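Rather than reading the log by eye, the final result line can be checked with a small function. This sketch relies only on the result:{"result":"success",...} format visible in the log above; the source does not show what a failed response looks like, so the function simply treats anything other than that success string as a failure. The name last_alert_succeeded is hypothetical.

```shell
# Hypothetical check: did the most recent API response in the agent log report success?
# $1 = path to agent.log (defaults to the location used above)
last_alert_succeeded() {
    log="${1:-/usr/local/nagios/libexec/ca_agent/log/agent.log}"

    # Take the last line that carries an API response and test it for success.
    grep 'result:{' "$log" | tail -n 1 | grep -q '"result":"success"'
}
```

It returns exit status 0 on success, so it can be used as last_alert_succeeded && echo "integration OK".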

 

Two. Set the assignment policy

 

1. Click Configuration - Assignment Policy - New Assignment

 

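2. Enter the assignment policy name, select the application, set the assignees (the people notified when an alert occurs), and click Save

This step offers many options. Users can add conditions based on [alert level], [alert content], [host], [service], [alert object], [hostgroups], [servicegroups], and so on, so that alerts matching the specified conditions are assigned and notified accordingly.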

 

Three. Set the notification policy

 

1. Click [Configuration] - [Notification Policy] - [New Notification]

 

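2. The notification policy is also highly configurable. The options are alert status, alert level, notification method, time setting, delay policy, and recipients, with the following meanings:

Alert status: the alert status that triggers a notification. There are 4 choices: when triggered, when claimed, when closed, and all.

Alert level: the alert level to notify on. There are 4 choices: info, warning, critical, and all.

Notification method: how the notification is sent. There are 5 choices: phone, SMS, email, WeChat, and APP.

Time setting: when notifications are sent. There are 3 choices: any time, working hours, and non-working hours.

Delay policy: whether the notification is delayed.

Recipients: who receives the notification.

Example: at any time, when a critical alert is triggered, immediately notify everyone by WeChat.

Alert status - when triggered; alert level - critical; notification method - WeChat; time setting - any time; delay policy - immediately; recipients - all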

 

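Mapping between Nagios and CA alert levels

With the settings above, different alerting needs are met, and the variety of notification methods helps ensure that every alert reaches its recipient.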


Origin www.cnblogs.com/ruixiangyun/p/12030306.html