Stick to it, Day 303: making up for yesterday's post and writing today's along the way (yesterday I worked on Zabbix at the office until 1:00 in the morning and forgot to write a blog post). Mostly Zabbix monitoring.

Since the 51 database can't scp files to 51Web, I tried wget over FTP instead. The following worked:

51 database (exporting the CPU data):

su - root

yum install -y vsftpd

/etc/init.d/vsftpd start

exit

Redirect the monitoring output into a file under /var/ftp/pub
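
The export step on the database side might look like the following. This is a minimal sketch, not the script actually used here: the helper name idle_percent, the one-second sampling interval, and reading the value from /proc/stat are all assumptions. Only the target file /var/ftp/pub/test1.txt comes from the steps in this post.

```shell
#!/bin/sh
# Sketch: compute the CPU idle percentage from two samples of the aggregate
# "cpu" line in /proc/stat. idle_percent is a hypothetical helper name.
idle_percent() {
    # $1, $2: "cpu user nice system idle iowait irq softirq steal ..." lines
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= NF && i <= 9; i++) total += $i   # user..steal
            if (NR == 1) { t0 = total; i0 = $5 }               # $5 is idle
            else         { t1 = total; i1 = $5 }
        }
        END {
            if (t1 > t0) printf "%.1f\n", 100 * (i1 - i0) / (t1 - t0)
            else print "0.0"
        }'
}

# Usage on the database host (as a user allowed to write to /var/ftp/pub):
#   a=$(head -n 1 /proc/stat); sleep 1; b=$(head -n 1 /proc/stat)
#   idle_percent "$a" "$b" > /var/ftp/pub/test1.txt
```

Writing a single bare number into the file keeps the later steps simple, because whoever reads the file does not have to parse anything.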

51Web:

su - root

yum install -y lftp

exit

wget ftp://<51 database ip>/pub/test1.txt

(Note: an lftp session lands directly in /var/ftp on the database server, so you still have to cd into pub. With wget, pub is simply part of the URL.)

Now 51Web has the database's CPU data. 51Web can then scp the file to 233 (233 can't ping 51Web, but 51Web can ping 233, so the copy has to be pushed from 51Web). 233 then monitors itself by reading the value out of the file it received. The same works for the other machines: each one sends its data to 233 and 233 monitors itself.

It's very cool and beautiful. Everything can be monitored on one host, so all the monitored items show up in a single graph, with no switching between graphs or host lists, and alerting becomes simpler to set up too. Why are there tears in my eyes? Because I'm that excited. Once this is done, the documentation only needs the IP of the monitoring center.

Oops, just now I saw the 51 database with only 0.1% of its CPU left idle, which scared me to death. It recovers on its own, though. As long as the CPU is not completely full it comes back, and even when it is full it may recover after a while. If it does fill up, an alert will definitely fire.
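
The relay on 51Web could be wrapped up roughly like this. It is only a sketch under stated assumptions: the host names db51.example and mon233.example, the zabbix account, and the /tmp paths are hypothetical stand-ins for the 51 database and 233, and it assumes scp can log in without a password prompt.

```shell
#!/bin/sh
# Sketch of the relay on 51Web: pull the file from the database over FTP,
# then push it to the monitoring host. Host names and paths are hypothetical.
DB_FTP_URL="ftp://db51.example/pub/test1.txt"            # stands in for the 51 database
MONITOR_DEST="zabbix@mon233.example:/tmp/db51_cpu.txt"   # stands in for 233
LOCAL_COPY="/tmp/db51_cpu.txt"

relay_cpu_file() {
    # -q: quiet; -O: always write to the same local file, overwriting it
    wget -q -O "$LOCAL_COPY" "$DB_FTP_URL" || return 1
    # 51Web can reach 233 but not the reverse, so the copy is pushed from here
    scp -q "$LOCAL_COPY" "$MONITOR_DEST"
}
```

Calling relay_cpu_file from a scheduled task on 51Web would keep the copy on 233 fresh. If the download fails, the function stops before scp so that a stale or empty file is not pushed.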

Small takeaway: scp and wget are really powerful, and between them they solve the file transfer problem. wget -P /tmp/litaoDir ftp... chooses the directory to download into. wget -N turns on timestamping: it does not create a renamed copy, and it only downloads again when the remote file's timestamp is newer than the local one (it looks at the time, not the content). What if I just want to overwrite the existing file of the same name every time? Skip -N and -P and use wget -O path/name, which overwrites the previous file. Like scp, wget can fetch more than one file at a time.
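
Side by side, the three download styles look like this. A small sketch only: the URL is a hypothetical placeholder and the function names are made up for illustration, while -P, -N and -O are standard wget options.

```shell
#!/bin/sh
# Sketch of the three download styles. The URL is a hypothetical placeholder.
URL="ftp://db51.example/pub/test1.txt"

# -P DIR: save under DIR, keeping the remote file name.
fetch_into_dir()  { wget -P "$1" "$URL"; }

# -N: timestamping; download again only if the remote copy is newer.
fetch_if_newer()  { wget -N -P "$1" "$URL"; }

# -O FILE: write to exactly FILE, replacing whatever was there.
fetch_overwrite() { wget -O "$1" "$URL"; }
```

For a monitoring file that is rewritten constantly, fetch_overwrite /tmp/litaoDir/test1.txt is the variant that always leaves exactly one current copy.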

Worth monitoring: free disk space and CPU usage on the 51 and 229 databases, CPU usage on 51Web and 229Web, and the SYSAUX tablespace.

When Zabbix polls an item and gets back an empty value, the item is disabled and shown as unsupported, because a floating-point item cannot hold an empty value. How to solve it?

1. Wait. After 10 minutes Zabbix rechecks the Supported status of the item.

2. Delete the item and recreate it.

3. Shorten the recheck interval, which is 10 minutes here: Administration ---> General ---> select "Other" in the drop-down on the right ---> change "Refresh unsupported items (in sec)" to 1 (the unit is seconds) ---> Update. (If you can't find it, switch the Zabbix interface from Chinese to English.)
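
A different angle, which is not one of the three fixes listed here, is to stop the check script from ever returning an empty value. The sketch below is an assumption about how such a guard could look: numeric_or_default is a hypothetical helper, and it only accepts non-negative decimal numbers, which is enough for percentages.

```shell
#!/bin/sh
# Sketch: print the value if it looks like a non-negative number,
# otherwise print a fallback, so the item never receives an empty string.
numeric_or_default() {
    # $1: raw value read from the file or command, $2: fallback value
    case "$1" in
        ''|*[!0-9.]*|*.*.*|.) printf '%s\n' "$2" ;;   # empty or not a number
        *)                    printf '%s\n' "$1" ;;
    esac
}
```

For example, numeric_or_default "$(cat /tmp/db51_cpu.txt 2>/dev/null)" 0 would report 0 when the file is missing or empty (the path is hypothetical). The fallback has to be chosen carefully, because a made-up value can hide a real outage.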

A CPU alert may fire during a short spike, and the load then comes back down on its own. When it comes down, a "problem solved" notification should be sent so that I don't keep worrying. The solution: in the action there is a recovery operation right after the operations. The developers are really considerate.

Brother Hai asked me to check the specific log

Wei Ying told me to mark it in the form

Daily routine: write a script that cleans up the SYSAUX tablespace and let it run at night. The script was written, but at first I didn't dare execute it. There were some problems, and after fixing them it is fine now. The lesson: a script that looks perfect in my head can still fail when it is executed. It is only truly perfect once it has been fixed and then run without any problem.

Next, a Zabbix item that monitors the tablespace size. So annoying: why does the script that queries the tablespace run fine by hand but fail as a scheduled task? One solution I thought of is to have a script on 51Web ssh to 51oracle and run the script there. The other method took two hours of trying before it worked: trial + inspiration = miracle. The problem is that sqlplus cannot run on its own from /etc/crontab, because it needs the oracle user's environment variables. I didn't dare add ". /etc/bashrc" to the script, so I tried every other way and finally succeeded: in /etc/crontab, set the user field to root and run the script through su - oracle -c followed by the script name. Then the sqlplus script executes successfully. I dare say not many people would work this out. Also, a sqlplus script prints what looks like an interactive SQL session. Don't be fooled by it; it is best to redirect that output and discard it.
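
Put together, the scheduled check might look like the sketch below. Everything specific in it is an assumption rather than the setup actually used: the script name check_sysaux.sh, the ten-minute schedule, the /home/oracle path, the "/ as sysdba" OS-authenticated login, and the choice of the dba_tablespace_usage_metrics view for the query.

```shell
#!/bin/sh
# check_sysaux.sh (hypothetical name): print SYSAUX usage as a bare number.
# Meant to run as the oracle user so that ORACLE_HOME and friends are set.
sysaux_used_percent() {
    # -S: silent mode, which suppresses the SQL*Plus banner and prompts
    sqlplus -S "/ as sysdba" <<'SQL'
set heading off feedback off pagesize 0
select round(used_percent, 1)
  from dba_tablespace_usage_metrics
 where tablespace_name = 'SYSAUX';
exit
SQL
}

# The real script would end by storing the number somewhere, for example:
#   sysaux_used_percent | tr -d ' ' > /tmp/sysaux_used.txt
#
# /etc/crontab entry (schedule and path are assumptions). The user field is
# root, and "su - oracle" gives the script the oracle login environment:
#   */10 * * * * root su - oracle -c /home/oracle/check_sysaux.sh >/dev/null 2>&1
```

The combination of -S and the set commands removes most of the fake interactive output, and the redirect in the crontab line discards whatever is left.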

Brother Long asked me to deal with the SYSAUX tablespace anomaly. 229 was cleaned up manually; for 51 I will write a cleanup script myself and try it out. Zabbix monitors the tablespace size and alerts my phone when usage reaches 92%. Later an action can be added as well: when the tablespace reaches 92%, run a custom script that cleans it up. The SYSAUX cleanup can be automated first (leave the other tablespaces for later), but it is best to run the script by hand first and only hook it up once there is no problem. After the automatic cleanup finishes, send a "problem solved" message to the phone.

To keep the query output easy to pick apart, write SQL that returns only what is needed, for example: select dbid from dba_hist_snapshot; which returns only the dbid.

As for the cleanup action and the alert: ssh to 51Web and run a script there. That script in turn sshes to the oracle user on 51 and runs the script on 51oracle, and that script is the one that cleans up the tablespace. The ssh part involves expect, host key fingerprint confirmation, and script permissions. First do the cleanup script, then the Zabbix tablespace monitoring, and finally combine the two. Awesome!
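
The threshold check and the two-hop ssh call could be sketched as follows. All names are hypothetical (web51.example, db51.example, clean_sysaux.sh), and the sketch assumes key-based ssh logins rather than the expect-driven password entry mentioned here, which is a swapped-in simplification.

```shell
#!/bin/sh
# Sketch: trigger a remote cleanup when SYSAUX usage reaches a threshold.
# Host names, user names and script paths below are hypothetical.
THRESHOLD=92

reached_threshold() {
    # Float-safe comparison: exit status 0 when $1 >= $2.
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 >= t + 0) }'
}

cleanup_if_needed() {
    # $1: current usage percentage, e.g. 93.4
    if reached_threshold "$1" "$THRESHOLD"; then
        # Two hops: monitoring host -> 51Web -> oracle user on the database.
        # Assumes key-based ssh logins instead of expect-driven passwords.
        ssh web51.example "ssh oracle@db51.example /home/oracle/clean_sysaux.sh"
    else
        echo "SYSAUX at $1%, below ${THRESHOLD}%: nothing to do"
    fi
}
```

The comparison goes through awk because the shell's own test only handles integers, and the usage figure is usually a decimal such as 91.9.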

The solution is written up in the earlier server health monitoring document, under: physical size of the database.
