Zabbix: Detailed monitoring process

# 1. Change the password and switch to the Chinese version

As an ops engineer whose English is not great, I quietly switched the interface to Chinese. If your English is good, just use the English version; if not, you can switch. After all, the Chinese version lets beginners learn faster ~


Remember to click Update at the bottom of the page after making the changes.

# 2. Create a host and a host group

  • First define a host group.

  • Then add the host.

  • After filling in the settings, click Add. The host now appears in the list.

# 3. Monitoring items (items)
### 1. First create three applications

  • The other two applications are created the same way, so those steps are skipped here.

### 2. Define monitoring items:


For an item to be monitored, the zabbix-server side must be able to connect to the zabbix-agent side and request the value, or the agent side must be configured to let the server fetch it. These are generally built-in checks, and each one has a name, which we call its key.

The following monitors the number of CPU interrupts per second.

in: the number of CPU interrupts per second, including timer interrupts

  • The value behind a key can be collected from the web page (run automatically by the server) or fetched manually from the command line:
[root@zabbix-server zabbix]# zabbix_get -s 192.168.19.130 -p 10050 -k "system.cpu.intr"
1101429
  • On the agent side, you can also use a command to watch how the intr rate changes.
    Zabbix stores the collected values as history (all data becomes the past, O (∩_∩) O ha!), and it also stores hourly averages as trends. Because trend data is written only once per hour, trends take up very few resources.
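
The number returned by zabbix_get above looks like a running counter rather than a rate, so a per-second figure has to be derived from two samples. A minimal sketch of that arithmetic, assuming the counter only grows between samples; the function name rate_per_sec and the second sample value are made up for illustration:

```shell
# rate_per_sec PREV CURR SECONDS
# Prints the integer per-second increase between two counter samples.
rate_per_sec() {
    prev=$1; curr=$2; secs=$3
    # Reject a non-positive interval or a counter that went backwards.
    if [ "$secs" -le 0 ] || [ "$curr" -lt "$prev" ]; then
        echo "invalid samples" >&2
        return 1
    fi
    echo $(( (curr - prev) / secs ))
}

# First value taken from the text, second value is a hypothetical
# sample read 5 seconds later.
rate_per_sec 1101429 1101929 5   # prints 100
```

In practice the two samples would come from two zabbix_get calls a few seconds apart.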

##### 2.1 Define a monitoring item without parameters


  • After filling in the settings, click Update. The page jumps automatically.

  • Once the definition is complete, go back to the list of all hosts and wait for 5 seconds. The indicators next to our node1 node turn green; if they don't, remember to refresh.

  • Back on the dashboard, we can also see that one of our monitoring items is enabled.

  • So, where is our data? Click Latest data, select our node1 node in the host filter, and apply it to see the current values.

  • There is also a graph page; click it to see the values plotted over time.

  • In practice there are many metrics we care about, and we can add them one by one.

##### 2.2 Define a monitoring item with parameters

The monitoring item we just defined is very simple: specifying a key is enough. Some monitoring items, however, take parameters, which makes them more flexible. Next, let's briefly walk through a monitoring item that requires parameters.
In the key syntax, [] encloses the parameters. Parameters written without <> are required, while those written inside <> are optional and can be omitted. We use net.if.in[if,<mode>] to illustrate:
if is the interface name; <mode> selects what is counted, including but not limited to packets, bytes, errors, and dropped (these values can be seen with ifconfig).
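
The key syntax described above can be sketched as a small helper that assembles a key from its parts. This is only an illustration: the function name net_if_key is hypothetical, and it simply leaves out the optional mode when none is given, in which case the agent uses its own default.

```shell
# net_if_key DIRECTION INTERFACE [MODE]
# Builds a net.if.* item key; MODE is optional, matching the <mode> notation.
net_if_key() {
    direction=$1; iface=$2; mode=${3:-}
    if [ -n "$mode" ]; then
        printf 'net.if.%s[%s,%s]\n' "$direction" "$iface" "$mode"
    else
        printf 'net.if.%s[%s]\n' "$direction" "$iface"
    fi
}

net_if_key in ens33 packets   # prints net.if.in[ens33,packets]
net_if_key in ens33           # prints net.if.in[ens33]
```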


  • Similarly, we can also view through the command line:
[root@zabbix-server zabbix]# zabbix_get -s 192.168.19.130 -p 10050 -k "net.if.in[ens33,packets]"
36836
  • Let's look at how it appears on the web page.

### 3. Quickly define similar indicators

  • To define a similar item, we can clone an existing one and change just a few parameters.
  • Taking the net.if.in[ens33,packets] item we just defined as an example, to define the matching out item, clone it and adjust the key.

  • To define an item that counts bytes, do the same.

  • If needed, the bytes item can be cloned for out as well. We won't demonstrate each one ~

  • Now we can look at the items we have defined so far.

  • Go to Monitoring -> Latest data, and you can see that the monitoring items we defined already have values.
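
The cloned items differ only in direction and mode, so the full set of keys can be listed with two nested loops. A small sketch, assuming the same ens33 interface as above; the function name list_if_keys is made up for illustration:

```shell
# list_if_keys INTERFACE
# Prints one item key per direction/mode combination covered by the clones.
list_if_keys() {
    iface=$1
    for direction in in out; do
        for mode in packets bytes; do
            printf 'net.if.%s[%s,%s]\n' "$direction" "$iface" "$mode"
        done
    done
}

list_if_keys ens33

# To spot-check every key against the agent used in this article:
#   list_if_keys ens33 | while read -r key; do
#       zabbix_get -s 192.168.19.130 -p 10050 -k "$key"
#   done
```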

### 4. Delete items

  • If a monitoring item is no longer needed, we can delete it. But if it is deleted directly, its data is left behind by default, so we clear the data first and then delete the item. The steps are as follows:
  • Configuration -> Hosts -> Items -> select the monitoring item that is no longer needed

# 4. Triggers
### 1. Introduction

Once the items that collect our values are defined, we can define triggers.
A trigger defines the unreasonable range or unreasonable state of the data collected by a specific item. It is usually a logical expression.

In general, the more reliable way to judge whether sampled values are within a reasonable range is to look at the average of the last N samples. "The last N" is usually defined in one of two ways:

  1. The average of the results obtained in the last N minutes
  2. The average of the last N results
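
The two definitions can give different answers on the same data, which a short sketch makes concrete. It assumes samples arrive as lines of "timestamp value" on standard input; the function names and the sample data are hypothetical.

```shell
# avg_last_count N
# Average of the last N values read from stdin ("timestamp value" lines).
avg_last_count() {
    tail -n "$1" | awk '{ sum += $2; n++ } END { if (n) printf "%.2f\n", sum / n }'
}

# avg_last_seconds NOW SECONDS
# Average of the values whose timestamp falls within SECONDS before NOW.
avg_last_seconds() {
    awk -v now="$1" -v secs="$2" \
        '$1 > now - secs { sum += $2; n++ } END { if (n) printf "%.2f\n", sum / n }'
}

samples='100 4
160 5
220 7
280 4'

printf '%s\n' "$samples" | avg_last_count 2          # prints 5.50
printf '%s\n' "$samples" | avg_last_seconds 300 180  # prints 5.33
```

The count-based form always uses the same number of samples, while the time-based form uses however many samples happened to arrive in the window.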

Note: if a value can be stored as a number, do not store it as a character string.

### 2. Trigger expression

The basic trigger expression format is as follows:

{<server>:<key>.<function>(<parameter>)}<operator><constant>
  • server: the host name;
  • key: the key of the relevant monitoring item on that host;
  • function: the function used to evaluate whether the collected data is within a reasonable range. The functions currently supported by triggers include avg (average), count (count), change (change), date (date), dayofweek (day of the week), delta (difference), diff, iregexp, last (most recent value), max (maximum value), min (minimum value), nodata (no data), now (current time), sum (sum), etc.
  • parameter: the function parameter. Most numeric functions accept a number of seconds as their parameter; if the number is prefixed with "#", it means a number of most recent values instead. For example, sum(300) is the sum of all values within the last 300 seconds, and sum(#10) is the sum of the last 10 values;
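
To make the format above concrete, here is a sketch that assembles an expression from its six parts. The function name trigger_expr is hypothetical; node1 and the net.if.in key are the host and item used earlier in this article.

```shell
# trigger_expr SERVER KEY FUNCTION PARAMETER OPERATOR CONSTANT
# Assembles {<server>:<key>.<function>(<parameter>)}<operator><constant>.
trigger_expr() {
    printf '{%s:%s.%s(%s)}%s%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Average over the last 300 seconds is above 5:
trigger_expr node1 'net.if.in[ens33,packets]' avg 300 '>' 5
# prints {node1:net.if.in[ens33,packets].avg(300)}>5

# Average over the last 10 values is above 5:
trigger_expr node1 'net.if.in[ens33,packets]' avg '#10' '>' 5
# prints {node1:net.if.in[ens33,packets].avg(#10)}>5
```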

### 3. Define a trigger

We can look at the values of rate of packets(in) and use them as the baseline for deciding what counts as abnormal. In the graph we can see that the maximum value is 7, the minimum value is 4, and the average value is 4.59. In this case, we can define anything above 5 as an abnormal value.
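
The threshold decision itself is simple to sketch outside Zabbix. The example below averages a set of samples and reports a problem when the average is above the limit of 5 chosen above; the function name is_problem and the sample values are made up for illustration, and this is not how Zabbix evaluates triggers internally.

```shell
# is_problem LIMIT
# Reads one value per line; succeeds (exit 0) when the average exceeds LIMIT.
is_problem() {
    awk -v limit="$1" '{ sum += $1; n++ } END { exit !(n > 0 && sum / n > limit) }'
}

# Average of these samples is 4.8, so the limit of 5 is not exceeded.
if printf '%s\n' 4 4 5 4 7 | is_problem 5; then
    echo PROBLEM
else
    echo OK
fi
```

Averaging first means a single spike to 7 does not raise a problem on its own.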

  • Let's define a trigger. Click Create trigger in the upper right corner. After filling it in, click Add at the bottom of the page; the trigger is now defined and the page jumps automatically.
  • Then let's look at the item for which we just defined the trigger.

We can see a line in the graph, which is the threshold we just defined; the part above the line is the abnormal state, which is very intuitive.
However, exceeding this line currently only generates a trigger event and nothing else happens. Therefore, we need to define an action.

# 5. Define actions (action)
### 1. Introduction

We need to specify what to do in response to an event, which is generally to run a remote command or send an alert.

First, we must define a media type in advance; second, we must define the endpoint at which the user receives messages (on the user side, this is called the user's media).

  • Let's look at the media types built into the system.

These are the broad media types, and each can be subdivided further; take Email as an example. We can also define several media of the same type. For Email, for example, we can define one for a Tencent server, one for a NetEase server, one for an Ali server, and so on.

### 2. Define a media type

We again take Email as the example and define a simple media type.

  • The media type is defined, so how do users receive the emails? For example, to let our Admin user receive emails, go to Administration -> Users -> Admin -> Media and add an entry.
    Then click Update.
    A user can have multiple media entries.

### 3. Define an action

Actions run under certain conditions; for example, when a trigger fires, it sets off our action.

  • We define an action based on redis. First, install redis on the agent side with yum:
[root@zabbix-client ~]# yum -y install epel-release; yum -y install redis

Modify the configuration file:

[root@zabbix-client ~]# vim /etc/redis.conf
bind 0.0.0.0        #listen on all interfaces; no authentication is configured

After the modification is complete, we start the service and check the port:

[root@zabbix-client ~]# systemctl start redis
[root@zabbix-client ~]# netstat -lntp|grep redis
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      2434/redis-server 0 
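
The same check can be scripted instead of read by eye. A minimal sketch that scans netstat-style output for a listening socket on a given port; the function name port_is_listening is hypothetical, and it assumes the column layout shown above (local address in column 4, state in column 6).

```shell
# port_is_listening PORT
# Reads `netstat -lnt` style lines on stdin; succeeds if PORT is in LISTEN state.
port_is_listening() {
    awk -v port="$1" \
        '$6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Example using the output line shown above.
if printf 'tcp 0 0 0.0.0.0:6379 0.0.0.0:* LISTEN 2434/redis-server 0\n' | port_is_listening 6379; then
    echo "redis is listening"
fi

# On a live host: netstat -lnt | port_is_listening 6379
```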

Then we can go to the web interface to define the related settings:

  • Create a redis application.

##### 3.1 Define monitoring items

The monitoring item has been added successfully.

  • We can then check its value.

##### 3.2 Define trigger

After defining the monitoring items, we can also define a trigger. When there is a problem with the service, we can know in time:

  • Configuration -> Hosts -> zabbix-client -> Triggers -> Create trigger

  • The trigger has been added successfully.

  • Let's manually shut down the redis service to check:

[root@zabbix-client ~]# systemctl stop redis

  • As you can see, the problem is now displayed. Once the service is started again, after a while it changes to the resolved state:
[root@zabbix-client ~]# systemctl start redis


##### 3.3 Define action


  • Two changes are needed on the virtual machine. One is to edit the sudo configuration file so that the zabbix user can run commands with administrator rights; the other is to edit the zabbix agent configuration file so that it accepts remote commands. We proceed as follows:
[root@zabbix-client ~]# vim /etc/sudoers
## Allow root to run any commands anywhere 
root    ALL=(ALL)       ALL
zabbix  ALL=(ALL)       NOPASSWD:ALL     #add this line

[root@zabbix-client ~]# vim /etc/zabbix/zabbix_agentd.conf
EnableRemoteCommands=1     #allow remote commands to be received
LogRemoteCommands=1     #log the remote commands that are received

[root@zabbix-client ~]# systemctl restart zabbix-agent
  • The first step we added is to restart the service. What if the restart does not succeed? We add a second step for that case.
  • After adding it, we can review the operations.
  • Once the operations are added, we can also send a message when the service recovers automatically.
  • After adding, the page jumps automatically.
  • Now we can manually stop the service for testing:
[root@zabbix-client ~]# systemctl stop redis
  • Wait a while, then check the Problems page: the problem was indeed raised, and it has already been resolved.
    You can also check on the agent side whether the port is open:
[root@zabbix-client ~]# netstat -lntp|grep redis
tcp        0      0 0.0.0.0:6379            0.0.0.0:*               LISTEN      2744/redis-server 0 
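
If the port does not come back after the test, one thing worth checking is whether both agent settings from the earlier step are really active, since a commented-out line looks similar at a glance. A minimal sketch; the function name remote_commands_enabled is hypothetical and the file path in the usage comment is the one edited above.

```shell
# remote_commands_enabled FILE
# Succeeds only if FILE has uncommented EnableRemoteCommands=1 and
# LogRemoteCommands=1 lines (a trailing "#..." comment is tolerated).
remote_commands_enabled() {
    conf=$1
    grep -Eq '^[[:space:]]*EnableRemoteCommands[[:space:]]*=[[:space:]]*1[[:space:]]*(#.*)?$' "$conf" &&
    grep -Eq '^[[:space:]]*LogRemoteCommands[[:space:]]*=[[:space:]]*1[[:space:]]*(#.*)?$' "$conf"
}

# Usage on the agent host:
#   remote_commands_enabled /etc/zabbix/zabbix_agentd.conf && echo "remote commands enabled"
```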

#check whether the email was sent successfully
[root@zabbix-server ~]# yum -y install mailx
[root@zabbix-server ~]# mail
Heirloom Mail version 12.5 7/5/10.  Type ? for help.
"/var/spool/mail/root": 1 message 1 new
>N  1 [email protected]  Tue Mar 24 17:16  20/867   "Resolved: redis service down"
&

As we can see, the port is open as normal, so the action set off by our trigger has completed.
Supplement: we can also use scripts to send alerts. The directory where these scripts are stored is set in the configuration file as: AlertScriptsPath=/usr/lib/zabbix/alertscripts
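
As a rough idea of what such a script might contain, the sketch below formats an alert into a single log line. It assumes the media type is configured to pass the recipient, subject, and message as the first three arguments; the function name format_alert and the log path in the usage comment are made up for illustration.

```shell
# format_alert SENDTO SUBJECT MESSAGE
# Prints one timestamped line describing the alert.
format_alert() {
    printf '%s to=%s subject=%s message=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" "$3"
}

# Inside a script stored in the alert scripts directory, the three
# arguments could be appended to a log file (hypothetical path):
#   format_alert "$1" "$2" "$3" >> /tmp/zabbix_alert.log
```

A real script would usually hand the message to a mail or chat client instead of a log file.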

  • Then we disable this action to prepare for the mail alerting covered later.

Origin blog.csdn.net/Forgetfanhua/article/details/105249849