Nagios: principles, installation, deployment, and configuration


Software projects depend on operation and maintenance tools. I previously studied Zabbix and practiced building and using it, and I recently came across Nagios. Compared with Zabbix, Nagios is simpler and easier to use while still covering the basic operation and maintenance functions. After studying material shared by more experienced engineers online, I built, deployed, and configured it myself and successfully used Nagios to monitor the relevant system services. This article summarizes the underlying principles and records the actual build, deployment, and configuration process, both for my own further study and as a reference for others. Omissions are inevitable, and corrections from readers are welcome. Thank you!

1. Introduction

1.1 Basic introduction

Nagios is a free, open source monitoring tool. It can monitor the status of Windows, Linux, and Unix hosts, network devices such as switches and routers, printers, and more, whether local or remote. When a host or service enters an abnormal state, Nagios immediately notifies the operations staff by e-mail or SMS, and it sends a recovery notification once the state returns to normal.

Nagios runs on Linux/Unix platforms and provides an optional browser-based web interface where system administrators can view network status, system problems, logs, and so on.

1.2 Main functions

The main functions of Nagios are:
1. Monitoring network services (SMTP, POP3, HTTP, NNTP, PING, etc.);
2. Monitoring host resources (processor load, disk utilization, etc.);
3. A simple plug-in design that lets users easily develop their own service checks;
4. Parallel service checks;
5. The ability to define a network hierarchy using "parent" hosts, which makes it possible to distinguish hosts that are down from hosts that are unreachable;
6. Notifying contacts when a service or host problem occurs and when it is resolved (via e-mail, SMS, or user-defined methods);
7. Event handlers that can be defined to run when service or host events occur;
8. Automatic log rotation;
9. Support for redundant monitoring hosts;
10. An optional web interface for viewing the current network status, notification and fault history, log files, etc.;
11. Viewing system monitoring information from a mobile phone;
12. Support for specifying custom event handlers.

2. Monitoring principle

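2.1 Nagios monitoring principle
Nagios monitors services and hosts, but the core itself does not contain any checking logic. All monitoring and detection functions are carried out by plug-ins.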
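2.2 Nagios monitoring architecture
Nagios consists mainly of Nagios Core, Nagios Plugins, and a number of optional add-ons (NRPE, NSCA, NDOUtils, and so on).
In production, the monitoring functions provided by Nagios Core alone fall far short of what is needed. To build a complete IT monitoring and management system, add-ons with the required functions must be installed on both the Nagios monitoring end and the monitored end.
They can be downloaded from http://www.nagios.org/, and you can also write your own plug-ins as needed.
The Nagios monitoring end is usually deployed on a dedicated server (Linux or Unix) and includes at least Nagios Core and Nagios Plugins, plus optional add-ons such as NRPE and NSCA.
On the monitored end, Linux systems need Nagios Plugins and optional add-ons such as NRPE and NSCA; on Windows, installing NSClient++ is sufficient.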
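2.3 Introduction to common plug-ins
Several commonly used add-ons are:
1. NRPE: allows plug-ins to be executed on monitored remote Linux/Unix hosts in order to monitor their local resources or attributes;
2. NSCA: sends passive check results from remote Linux/Unix hosts to the Nagios daemon running on the monitoring end;
3. NSClient++: a monitoring agent/daemon for Windows systems, and a replacement for NSClient and NRPE_NT;
4. NDOUtils: stores all status information from Nagios in a MySQL database.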
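2.4 Four kinds of monitoring return results
Nagios recognizes four return states:
0 (OK): the state is normal, shown in green;
1 (WARNING): a warning, shown in yellow;
2 (CRITICAL): a very serious error, shown in red;
3 (UNKNOWN): an unknown error, shown in dark yellow.
Nagios determines the state of a monitored object from the value the plug-in returns and displays it on the web interface so that administrators can spot faults promptly.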
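To make the four return values concrete, here is a minimal sketch of how a plug-in might map a measured value to a state. The function name check_threshold and its argument order are made up for illustration; real plug-ins such as check_disk have their own options.

```shell
# check_threshold (hypothetical helper): prints a status line and returns
# the Nagios state code for VALUE given WARN and CRIT thresholds.
check_threshold() {
    value="$1"
    warn="$2"
    crit="$3"
    # Anything that is not a non-negative integer cannot be evaluated.
    case "$value" in
        ''|*[!0-9]*)
            echo "UNKNOWN - value '$value' is not a non-negative integer"
            return 3
            ;;
    esac
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"
        return 1
    fi
    echo "OK - value is $value"
    return 0
}
```

For example, `check_threshold 85 80 90` prints a WARNING line and returns 1. A standalone plug-in script would pass that return code to exit, which is the value Nagios reads.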
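2.5 Monitoring process
After Nagios starts, it periodically calls plug-ins to check server status. Nagios also maintains a queue: all status information returned by plug-ins enters the queue, and Nagios reads from the head of the queue each time, processes the entry, and displays the resulting state on the web interface.
Nagios provides many plug-ins that make it easy to monitor a wide range of services. After installation, all the bundled plug-ins are placed in the libexec directory under the Nagios home directory; for example, check_disk checks disk space and check_load checks the system load. Run ./check_xxx -h to see the usage and function of any plug-in.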
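2.6 Process of monitoring remote hosts through NRPE
1. Nagios executes the check_nrpe plug-in installed on the monitoring end and tells it which service to check.
2. check_nrpe connects over SSL to the NRPE daemon on the remote machine.
3. NRPE runs the local plug-ins (check_disk, etc.) to check local services and status.
4. NRPE passes the result back to check_nrpe on the monitoring end, which puts it into the Nagios status queue.
5. Nagios reads the entries in the queue in turn and displays the results.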
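The same chain can be triggered by hand from the monitoring end. The sketch below wraps check_nrpe in a small function; the name run_nrpe_check and the NRPE_PLUGIN variable are assumptions for illustration, and the default path follows the /usr/local/nagios prefix used in this article.

```shell
# Path to check_nrpe; adjust if Nagios was installed under another prefix.
NRPE_PLUGIN="${NRPE_PLUGIN:-/usr/local/nagios/libexec/check_nrpe}"

# run_nrpe_check (hypothetical wrapper, not part of Nagios or NRPE).
run_nrpe_check() {
    host="$1"
    remote_cmd="$2"
    if [ ! -x "$NRPE_PLUGIN" ]; then
        echo "UNKNOWN - $NRPE_PLUGIN not found or not executable"
        return 3
    fi
    # check_nrpe contacts the NRPE daemon on $host; the daemon runs the local
    # plug-in mapped to $remote_cmd in its nrpe.cfg and sends back the result.
    "$NRPE_PLUGIN" -H "$host" -c "$remote_cmd"
}
```

A call such as `run_nrpe_check 192.168.109.138 check_load` would ask the remote daemon to run whatever its nrpe.cfg maps to check_load; the function's return code is the state code described in 2.4.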


3. Installation and deployment on the monitoring end

On the Nagios monitoring end, install Nagios, LAP, nagios-plugins, NRPE, and so on. On the monitored end, install nagios-plugins and NRPE.

3.1 Installation environment and versions

Software needed to deploy Nagios:

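LAP (Linux + Apache + PHP)
nagios-3.5.1.tar                // Nagios core and service files; choose a stable release, since many plug-ins do not work with the newest version
nagios-plugins-2.1.1.tar        // Nagios plug-ins, used by the various check scripts and commands
nrpe-2.15.tar                   // Agent service, used to collect local, private information from servers other than the Nagios server
NSCP-0.4.3.143-x64              // NSClient++, used to monitor Windows; available in 32-bit and 64-bit builds
pnp4nagios-0.6.25.tar           // Optional; used with Nagios to draw graphs
vautour_style                   // Nagios theme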

Nagios Core, nagios-plugins, and related software are installed on the monitoring machine, which processes the monitoring data and provides a web interface for viewing and management. The monitoring machine can of course monitor itself as well.
nagios-plugins and NRPE are installed on each monitored machine, which runs checks at the request of the monitoring machine and sends the results back to it.

3.2 Installation steps

3.2.1 Install apache and php

1) Install dependencies

yum install -y gcc gcc-c++ glibc glibc-common php gd gd-devel libpng libmng libjpeg zlib

2) Install Apache and PHP with yum

yum install -y httpd php

Start the httpd service and test whether PHP is available

service httpd start
elinks 192.168.109.140 --dump >> php_out.txt
echo $?
less php_out.txt

If the PHP information is displayed, PHP is working.
Enable httpd at boot:

chkconfig httpd on
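The dump in this step fetches whatever page Apache serves at the site root, which by itself does not prove that PHP code is executed. As a minimal sketch, a temporary phpinfo page can be used instead. The DOCROOT default of /var/www/html and the helper name write_php_probe are assumptions for illustration, not part of the original steps.

```shell
# Assumed default Apache document root on CentOS; override DOCROOT if yours differs.
DOCROOT="${DOCROOT:-/var/www/html}"

# write_php_probe (hypothetical helper) creates a page that calls phpinfo().
write_php_probe() {
    printf '%s\n' '<?php phpinfo(); ?>' > "$DOCROOT/info.php"
}

# Typical use on the monitoring server (uncomment to run):
#   write_php_probe
#   elinks --dump http://192.168.109.140/info.php | head -n 20
#   rm -f "$DOCROOT/info.php"   # remove the probe afterwards; it exposes server details
```

If the dump shows the PHP version table rather than the raw `<?php ... ?>` text, Apache is passing the file to PHP.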

3.2.2 Install nagios

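1) Add the user and group that Nagios runs as

groupadd nagcmd
useradd -G nagcmd nagios
# -a appends nagcmd to the groups apache already belongs to instead of replacing them
usermod -a -G nagcmd apache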

2) Download the installation packages

Nagios Core is available from:
https://www.nagios.org/downloads/nagios-core/thanks/?skip=1&product_download=nagioscore-source

Packages to download:
nagios
pnp4nagios
nagios-plugins
vautour_style.zip
nrpe

3) Unzip:

tar -xf nagios-4.4.7.tar.gz
tar -xf nagios-plugins-2.3.3.tar.gz
tar -xf nrpe-4.0.2.tar.gz
unzip -d vautour_style vautour_style.zip

4) Install Nagios
Enter the Nagios source directory, then configure, build, and install:

cd nagios-4.4.7
./configure --with-command-group=nagcmd
make all
make install              # installs to /usr/local/nagios/; /usr/local/nagios/share is the site directory of the Nagios web interface
make install-init         # installs the Nagios startup script (/etc/rc.d/init.d/nagios)
make install-commandmode  # sets permissions on the Nagios working directories
make install-config       # installs the Nagios configuration files under /usr/local/nagios/etc
make install-webconf      # installs the Nagios web configuration file into Apache's conf.d directory

Enable Nagios at boot:

chkconfig nagios on

At this point, the installation of Nagios itself is complete.

5) Directory description

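ls /usr/local/nagios/
bin       // directory containing the Nagios executables
etc       // directory containing the Nagios configuration files; only a few .cfg files exist after the initial installation
libexec   // check commands used for monitoring; empty until nagios-plugins is installed
sbin      // directory containing the Nagios CGI files and the files needed for external commands
share     // Nagios web front-end pages
var       // log files, PID (process) file, and so on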

6) Create a web login user
Create a user for logging in to the Nagios web interface:

htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
service httpd restart  # restart httpd, then test whether the new user can log in

Open the web interface, enter the user name and password, and log in.

3.2.3 Install nagios-plugins

1) Install dependencies and create the user

yum install -y gcc gcc-c++ glibc glibc-common php gd gd-devel libpng libmng libjpeg
# The nagios user and /usr/local/nagios already exist if the steps in 3.2.2 were followed
id nagios >/dev/null 2>&1 || useradd nagios
mkdir -p /usr/local/nagios
chown nagios:nagios /usr/local/nagios

2) Compile and install nagios-plugins

cd nagios-plugins-2.3.3
./configure --with-nagios-user=nagios --with-nagios-group=nagcmd --with-openssl --prefix=/usr/local/nagios
make
make install

3) Restart the nagios and httpd services

systemctl restart nagios
systemctl restart httpd

3.2.4 Install NRPE

1) Compile and install NRPE
NRPE is installed so that remote servers can be monitored:

yum -y install gcc openssl openssl-devel xinetd
tar -xf nrpe-4.0.2.tar.gz
cd nrpe-4.0.2
./configure
make all
make install
make install-plugin
make install-daemon
make install-config
make install-inetd
make install-init

2) Modify the configuration file

vi /usr/local/nagios/etc/nrpe.cfg

Find the following line, which allows access from the local host:

allowed_hosts=127.0.0.1

Start the service:

systemctl start nrpe

Enable it at boot:

systemctl enable nrpe

Test with the check_nrpe plug-in. check_nrpe and the other Nagios plug-ins are installed in /usr/local/nagios/libexec; change to that directory and run:

./check_nrpe -H 127.0.0.1
NRPE v3.0.1

If the NRPE version number is displayed (it reflects the NRPE release that is running), the installation succeeded.

4. Installation on the monitored end

4.1 Install nagios-plugins

1) Install dependencies and create the user

yum install -y gcc gcc-c++ glibc glibc-common php gd gd-devel libpng libmng libjpeg
# The nagcmd group is referenced by ./configure below, so it must exist on this host too
groupadd nagcmd
useradd -G nagcmd nagios
mkdir -p /usr/local/nagios
chown nagios:nagios /usr/local/nagios

2) Compile and install nagios-plugins

cd nagios-plugins-2.3.3
./configure --with-nagios-user=nagios --with-nagios-group=nagcmd --with-openssl --prefix=/usr/local/nagios
make
make install

4.2 Install NRPE

1) Compile and install
Unpack the downloaded NRPE 3.0.1 source package, then build and install it:

tar -xvf nrpe-3.0.1.tar
cd nrpe-3.0.1
./configure
make all
make install
make install-plugin
make install-daemon
make install-config
make install-inetd
make install-init

2) Adjust the configuration
allowed_hosts must list the address from which check_nrpe connects, which is normally the Nagios monitoring server:

vi /usr/local/nagios/etc/nrpe.cfg
allowed_hosts=192.168.109.138

Start the service:

systemctl start nrpe

Enable it at boot:

systemctl enable nrpe

On the monitoring server, test with the check_nrpe plug-in. check_nrpe and the other Nagios plug-ins are installed in /usr/local/nagios/libexec; change to that directory and run:

./check_nrpe -H 192.168.109.138
NRPE v3.0.1

If the NRPE version number is displayed, the installation succeeded.

5. Basic configuration

5.1 Server configuration

5.1.1 Configuration file description

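nagios.cfg: the main Nagios configuration file
resource.cfg: the variable definition file, also called the resource file; variables such as $USER1$ are defined here so that other configuration files can reference them (in effect, global variables)
cgi.cfg: the configuration file that controls CGI access; any newly added CGI configuration must be added here

objects: a directory containing many configuration file templates used to define Nagios objects
objects/commands.cfg: command definitions, which other configuration files can reference
objects/contacts.cfg: definitions of contacts and contact groups
objects/localhost.cfg: definitions for monitoring the local host
objects/printer.cfg: a template for monitoring printers; not enabled by default
objects/switch.cfg: a template for monitoring routers; not enabled by default
objects/templates.cfg: host and service templates, which other configuration files can reference
objects/timeperiods.cfg: definitions of the Nagios monitoring time periods
objects/windows.cfg: a template for monitoring Windows hosts; not enabled by default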

5.1.2 Define monitoring commands

To monitor remote Linux hosts through NRPE, the check_nrpe plug-in is used. Its syntax is as follows:

check_nrpe -H <host> [-n] [-u] [-p <port>] [-t <timeout>] [-c <command>] [-a <arglist...>]

Add the following definition to commands.cfg:

define command {
     command_name check_nrpe
     command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

5.1.3 Add host monitoring configuration

1) In the etc directory, create a servers directory and a new host configuration file in it:

mkdir servers
cd servers
vi host1.cfg

Host monitoring configuration:
define host {
        use                             linux-server
        host_name                       138
        alias                           My first Apache server
        address                         192.168.109.138
        check_interval                  3
        max_check_attempts              3
        check_period                    24x7
        notification_interval           10
        notification_period             24x7
        contact_groups                  admins
}

2) Enable the directory in nagios.cfg:

cfg_dir=/usr/local/nagios/etc/servers
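With the host defined, a service can be attached to it through the check_nrpe command from 5.1.2. The following is only a sketch: the file name host1_services.cfg and the two helper functions are made up for illustration, generic-service is the template shipped in the sample templates.cfg, and check_load assumes that the monitored host's nrpe.cfg still contains the sample command[check_load] entry.

```shell
# Configuration directory; the default follows the install prefix used in this article.
NAGIOS_ETC="${NAGIOS_ETC:-/usr/local/nagios/etc}"

# write_load_service (hypothetical helper): writes a service definition that
# runs the remote check_load command on host 138 through NRPE.
write_load_service() {
    mkdir -p "$NAGIOS_ETC/servers"
    cat > "$NAGIOS_ETC/servers/host1_services.cfg" <<'EOF'
define service {
        use                     generic-service
        host_name               138
        service_description     CPU Load
        check_command           check_nrpe!check_load
}
EOF
}

# verify_nagios_config (hypothetical helper): asks Nagios to validate the
# whole configuration; run it before restarting the service.
verify_nagios_config() {
    /usr/local/nagios/bin/nagios -v "$NAGIOS_ETC/nagios.cfg"
}
```

Running `write_load_service && verify_nagios_config` on the monitoring server reports configuration errors before they can prevent a restart with `systemctl restart nagios`.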

5.2 Configure NRPE on the client

Modify the configuration file:

vi /usr/local/nagios/etc/nrpe.cfg

Find the following line in the configuration file:

allowed_hosts=127.0.0.1

This line allows access only from the local host. Because this machine is the client, the Nagios server must also be allowed. allowed_hosts takes a comma-separated list, so add the server's IP address to it:

allowed_hosts=127.0.0.1,192.168.1.8

Here 192.168.1.8 is the IP address of the Nagios server. Restart NRPE afterwards so that the change takes effect.
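Because allowed_hosts accepts a comma-separated list of addresses, the edit can also be scripted. The helper below is a sketch for illustration only (add_allowed_host is not an NRPE tool); it appends an address to the existing line and leaves the file unchanged if the address is already present.

```shell
# add_allowed_host CFG IP (hypothetical helper)
add_allowed_host() {
    cfg="$1"
    ip="$2"
    # Escape dots so the address is matched literally by grep.
    ip_re=$(printf '%s' "$ip" | sed 's/\./\\./g')
    if grep -Eq "^allowed_hosts=(.*,)?${ip_re}(,.*)?\$" "$cfg"; then
        return 0
    fi
    tmp=$(mktemp)
    # "&" re-inserts the matched line, so the new address is appended to it.
    sed "s/^allowed_hosts=.*/&,${ip}/" "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}
```

On the client this might be used as `add_allowed_host /usr/local/nagios/etc/nrpe.cfg 192.168.1.8`, followed by `systemctl restart nrpe`.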

6. Send email configuration

6.1 Install and configure sendmail

1) Install sendmail

yum install -y 'sendmail*' mailx

2) Configure the sender information

vi /etc/mail.rc
###################################################
# Sender address (QQ mailbox)
set from=*********@qq.com
set smtp=smtp.qq.com
# SMTP login account (QQ mailbox)
set smtp-auth-user=*********@qq.com
# QQ SMTP authorization code
set smtp-auth-password=*********
set smtp-auth=login
##################################################

3) Restart sendmail

systemctl restart sendmail

6.2 Configure the host notification e-mail command

Adjust the command definition in commands.cfg:

define command{
     command_name    notify-host-by-email
     command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
}
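Before relying on Nagios to send alerts, it can help to render the same message by hand and pipe it to mail, which exercises the /etc/mail.rc settings from 6.1. The sketch below mirrors the printf format of notify-host-by-email in shortened form; the function name and the sample values are made up for illustration.

```shell
# render_host_notification (hypothetical helper): the positional arguments stand
# in for $NOTIFICATIONTYPE$, $HOSTNAME$, $HOSTSTATE$, $HOSTADDRESS$ and $HOSTOUTPUT$.
render_host_notification() {
    printf "%b" "***** Nagios *****\n\nNotification Type: $1\nHost: $2\nState: $3\nAddress: $4\nInfo: $5\n"
}

# Example (uncomment and replace the recipient to send a test message):
#   render_host_notification PROBLEM 138 DOWN 192.168.109.138 "PING CRITICAL" \
#       | mail -s "** PROBLEM Host Alert: 138 is DOWN **" you@example.com
```

If the test message arrives, the mail setup works independently of Nagios, which narrows down any later notification problems to the Nagios configuration itself.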

6.3 Configure contacts

Set the recipient e-mail address in contacts.cfg:

define contact {
    contact_name            nagiosadmin             ; Short name of user
    use                     generic-contact         ; Inherit default values from generic-contact template (defined above)
    alias                   Nagios Admin            ; Full name of user
    email                   *******@qq.com        ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
}

Make sure the contact belongs to a contact group such as admins.

Then reference that group in the host and service definitions in the configuration files under servers:

contact_groups                  admins

Restart Nagios:

systemctl restart nagios


7. References

http://www.nagioschina.com/
https://blog.51cto.com/u_437549/2316512

Origin blog.csdn.net/shy871/article/details/125696179