Zabbix 3.0.2: notes on resolving odd problems with the Percona MySQL plugin

--> Preface

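These are the problems I ran into while using the Percona Zabbix MySQL template plugin. If I hit more of them later I will add them here as well. A good memory is no match for written notes, and that really is true.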

1, Error: ERROR: Can't connect to local MySQL

Error seen when running the script by hand:

[root@db_master_2 zabbix_agentd.d]# /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg

ERROR: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)[root@db_master_2 zabbix_agentd.d]#

[root@db_master_2 zabbix_agentd.d]#

Solution:

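The script looks for the socket at /var/lib/mysql/mysql.sock, while this MySQL installation keeps it at /usr/local/mysql/mysql.sock. Create a symlink with ln -s /usr/local/mysql/mysql.sock /var/lib/mysql/mysql.sock; zabbix_agentd has to be restarted afterwards for the change to take effect.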

[root@db_m2_slave1 ~]# mkdir -p /var/lib/mysql/

[root@db_m2_slave1 ~]# ln -s /usr/local/mysql/mysql.sock /var/lib/mysql/mysql.sock

[root@db_m2_slave1 ~]#

[root@db_m2_slave1 ~]# killall zabbix_agentd

[root@db_m2_slave1 ~]# /usr/sbin/zabbix_agentd -c /etc/zabbix/zabbix_agentd.conf

[root@db_m2_slave1 ~]#
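The manual steps above can be wrapped in a small idempotent helper. This is only a sketch: the function name ensure_socket_link is made up for illustration, and it assumes the real socket path is already known (here, the /usr/local/mysql/mysql.sock path from this section).

```shell
#!/bin/sh
# ensure_socket_link REAL_SOCK EXPECTED_SOCK   (hypothetical helper)
# Makes EXPECTED_SOCK a symlink to REAL_SOCK unless something already
# exists at EXPECTED_SOCK.
ensure_socket_link() {
    real_sock=$1
    expected_sock=$2

    # Nothing to link to: MySQL is not running or uses another path.
    if [ ! -e "$real_sock" ]; then
        echo "missing: $real_sock" >&2
        return 1
    fi

    # Something already answers at the expected path: leave it alone.
    if [ -e "$expected_sock" ]; then
        echo "exists"
        return 0
    fi

    # Create the parent directory, then the link (-f replaces a dangling link).
    mkdir -p "$(dirname "$expected_sock")" || return 1
    ln -sf "$real_sock" "$expected_sock" || return 1
    echo "linked"
}

# Example with the paths from this section:
# ensure_socket_link /usr/local/mysql/mysql.sock /var/lib/mysql/mysql.sock
```

Running it a second time prints "exists" and changes nothing, so it is safe to repeat before restarting zabbix_agentd.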

2, Error: the server cannot fetch data

The symptom is that the agent can get the data but the server cannot:

1 On the agentd MySQL server the data can be fetched:

[root@db_m2_slave1 ~]# /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg

gg:6

[root@db_m2_slave1 ~]#

2 The zabbix-server side cannot fetch the data:

[root@zabbix_serv_121_12 scripts]# /usr/local/zabbix/bin/zabbix_get -s 192.161.3.72 -p10050 -k "MySQL.Threads-connected"

ERROR: run the command manually to investigate the problem: /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg

[root@zabbix_serv_121_12 scripts]#

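So where is the problem? To answer that, start from how zabbix-server and zabbix-agentd work together: the server asks the agent for the key MySQL.Threads-connected, and the agent resolves that key through the UserParameter entry in /etc/zabbix/zabbix_agentd.d/userparameter_percona_mysql.conf. So the next step is to look up how userparameter_percona_mysql.conf obtains the value for this key.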

[root@db_m2_slave1 ~]# more /etc/zabbix/zabbix_agentd.d/userparameter_percona_mysql.conf |grep MySQL.Threads-connected

UserParameter=MySQL.Threads-connected,/var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh  iu

[root@db_m2_slave1 ~]#
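The lookup done with more and grep above can be sketched as a function that prints the exact command behind a key. The name userparameter_command is hypothetical; the only assumption about the file is the UserParameter=<key>,<command> line format shown in the output above.

```shell
#!/bin/sh
# userparameter_command CONF_FILE KEY   (hypothetical helper)
# Prints the command configured for KEY, or returns 1 if KEY is absent.
userparameter_command() {
    conf_file=$1
    key=$2
    # Lines look like: UserParameter=<key>,<command>
    # Match the key exactly, then print everything after the first comma.
    awk -v want="UserParameter=$key" '
        {
            comma = index($0, ",")
            if (comma > 0 && substr($0, 1, comma - 1) == want) {
                print substr($0, comma + 1)
                found = 1
                exit
            }
        }
        END { exit found ? 0 : 1 }
    ' "$conf_file"
}

# Example:
# userparameter_command /etc/zabbix/zabbix_agentd.d/userparameter_percona_mysql.conf MySQL.Threads-connected
```

Unlike a plain grep, the exact match on the key avoids picking up other keys that merely contain the same text. The printed command is what the agent executes, so that is the command worth reproducing by hand, ideally as the account the agent runs under.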

Then run the wrapper script that this parameter points to and look at the result. Sure enough it fails and returns no value:

[root@db_m2_slave1 ~]# sh /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh  kt

ERROR: run the command manually to investigate the problem: /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg

Tracing the script with bash -x shows that the error is raised right after the second "+ '[' -e /tmp/localhost-mysql_zabbix_stats.txt ']'" test, as shown below:

[root@db_m2_slave1 ~]# bash -x /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh  kt

+ echo ''

+ ITEM=kt

+ HOST=localhost

++ dirname /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh

+ DIR=/var/lib/zabbix/percona/scripts

+ CMD='/usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg'

+ CACHEFILE=/tmp/localhost-mysql_zabbix_stats.txt

+ '[' kt = running-slave ']'

+ '[' -e /tmp/localhost-mysql_zabbix_stats.txt ']'

+ /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg

+ '[' -e /tmp/localhost-mysql_zabbix_stats.txt ']'

+ echo 'ERROR: run the command manually to investigate the problem: /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg'

ERROR: run the command manually to investigate the problem: /usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg

So check whether this file exists. Sure enough it does not, so create an empty file and give ownership of it to the zabbix account:

[root@db_m2_slave1 ~]# vim /tmp/localhost-mysql_zabbix_stats.txt

[root@db_m2_slave1 ~]# chown -R zabbix:zabbix /tmp/localhost-mysql_zabbix_stats.txt

[root@db_m2_slave1 ~]#
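The check and the empty-file creation can also be done without opening an editor. The following is a minimal sketch; cache_status and make_cache are hypothetical names, and the result of cache_status applies to whichever user runs it, so it only says something about the agent when it is run as the zabbix account.

```shell
#!/bin/sh
# cache_status FILE   (hypothetical helper)
# Prints "missing", "unwritable" or "ok" from the current user's point of view.
cache_status() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "missing"
    elif [ ! -w "$file" ]; then
        echo "unwritable"
    else
        echo "ok"
    fi
}

# make_cache FILE   (hypothetical helper)
# Creates an empty FILE; an existing file is left untouched.
make_cache() {
    [ -e "$1" ] || : > "$1"
}

# Example with the path from the trace above:
# cache_status /tmp/localhost-mysql_zabbix_stats.txt
```

After make_cache, the chown to zabbix:zabbix shown above is still needed when the file was created as root.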

Then trace the command with bash again:

[root@db_m2_slave1 ~]# bash -x /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh  kt

+ echo ''

+ ITEM=kt

+ HOST=localhost

++ dirname /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh

+ DIR=/var/lib/zabbix/percona/scripts

+ CMD='/usr/bin/php -q /var/lib/zabbix/percona/scripts/ss_get_mysql_stats.php --host localhost --items gg'

+ CACHEFILE=/tmp/localhost-mysql_zabbix_stats.txt

+ '[' kt = running-slave ']'

+ '[' -e /tmp/localhost-mysql_zabbix_stats.txt ']'

++ stat -c %Y /tmp/localhost-mysql_zabbix_stats.txt

+ TIMEFLM=1463491777

++ date +%s

+ TIMENOW=1463491788

++ expr 1463491788 - 1463491777

+ '[' 11 -gt 300 ']'

+ '[' -e /tmp/localhost-mysql_zabbix_stats.txt ']'

+ cat /tmp/localhost-mysql_zabbix_stats.txt

+ sed 's/ /\n/g; s/-1/0/g'

+ grep kt

+ awk -F: '{print $2}'

[root@db_m2_slave1 ~]#
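The last four trace lines are the parsing step: the cache file holds space-separated key:value pairs, and the wrapper splits them onto separate lines and prints the value of the requested item. Below is a sketch of the same idea, not the wrapper's own code: parse_item is a hypothetical name, the sample values in the usage comment are invented, and this variant matches the key exactly and maps only a value of exactly -1 to 0, whereas the sed expression in the trace rewrites every "-1" substring.

```shell
#!/bin/sh
# parse_item CACHE_FILE ITEM   (hypothetical helper)
# Prints the value stored for ITEM; prints nothing if ITEM is not in the file.
parse_item() {
    cache_file=$1
    item=$2
    # Put one "key:value" pair on each line, then select the exact key.
    tr ' ' '\n' < "$cache_file" |
        awk -F: -v want="$item" '$1 == want { print ($2 == "-1" ? 0 : $2) }'
}

# Example with invented cache content:
# printf 'gg:6 gh:-1 kt:9\n' > /tmp/sample_stats.txt
# parse_item /tmp/sample_stats.txt kt
```

With an empty cache file, as in the trace above, the function prints nothing, which matches the empty output after the awk line.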

Then test and verify from zabbix-server:

[root@zabbix_serv_121_12 scripts]# /usr/local/zabbix/bin/zabbix_get -s 192.161.3.72 -p10050 -k "MySQL.Threads-connected"

9

[root@zabbix_serv_121_12 scripts]#

3, The monitoring graph has an unexplained gap

The graph on the monitoring page suddenly broke off and showed nothing. Checking from zabbix-server gives an error:

[root@zabbix_serv_121_12 scripts]# /usr/local/zabbix/bin/zabbix_get -s 192.161.3.72 -p10050 -k "MySQL.Threads-connected"

rm: cannot remove `/tmp/localhost-mysql_cacti_stats.txt': Operation not permitted

8

[root@zabbix_serv_121_12 scripts]#
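The output above does not say why rm is refused. One plausible cause, stated here as an assumption rather than something these notes confirm: /tmp normally carries the sticky bit, so a non-root user can remove only files it owns there, and a cache file created by root during a manual test run could then not be removed by the account the agent runs as. The sketch below (removal_facts is a hypothetical name) prints the two facts needed to check this.

```shell
#!/bin/sh
# removal_facts DIR FILE   (hypothetical helper)
# Prints "<sticky|plain> owner=<user>" so the result can be compared
# with the account the agent runs as.
removal_facts() {
    dir=$1
    file=$2
    # -k is true when the sticky bit is set on DIR.
    if [ -k "$dir" ]; then
        mode=sticky
    else
        mode=plain
    fi
    # stat -c is the GNU coreutils form, the same one the wrapper script uses.
    owner=$(stat -c %U "$file") || return 1
    echo "$mode owner=$owner"
}

# Example with the file named in the error message:
# removal_facts /tmp /tmp/localhost-mysql_cacti_stats.txt
```

If this prints "sticky" together with an owner other than the agent's account, changing the file's owner is enough for the rm to succeed.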

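Went to the agent to fix the ownership and found that the file does not exist (note that the chown below is given a relative file name while in the zabbix_agentd.d directory, not the /tmp path):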

[root@db_m2_slave1 zabbix_agentd.d]# chown -R zabbix:zabbix localhost-mysql_cacti_stats.txt

chown: cannot access `localhost-mysql_cacti_stats.txt': No such file or directory

[root@db_m2_slave1 zabbix_agentd.d]#

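Why would the file go missing? Reading the sh wrapper script shows rm -f $CACHEFILE; so there is a delete operation, and CACHEFILE is defined as CACHEFILE="/tmp/$HOST-mysql_cacti_stats.txt". In other words the file gets removed here, so I can try replacing the rm with a command that empties the file instead, echo "" > $CACHEFILE;, and see what happens. The modified script is as follows: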

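# NOTE: HOST is only assigned further down, so unless it is inherited from the environment this line appends to /tmp/-mysql_cacti_stats.txt, not to the cache file.

echo "" >> /tmp/$HOST-mysql_cacti_stats.txt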

ITEM=$1

HOST=localhost

DIR=`dirname $0`

CMD="/usr/bin/php -q $DIR/ss_get_mysql_stats.php --host $HOST --items gg"

#CACHEFILE="/tmp/zabbix/$HOST-mysql_cacti_stats.txt:3317"

CACHEFILE="/tmp/$HOST-mysql_cacti_stats.txt"

if [ "$ITEM" = "running-slave" ]; then

    # Check for running slave

    #RES=`HOME=~zabbix mysql -e 'SHOW SLAVE STATUS\G' | egrep '(Slave_IO_Running|Slave_SQL_Running):' | awk -F: '{print $2}' | tr '\n' ','`

    RES=`/usr/local/mysql/bin/mysql -e 'SHOW SLAVE STATUS\G' | egrep '(Slave_IO_Running|Slave_SQL_Running):' | awk -F: '{print $2}' | tr '\n' ','`

    if [ "$RES" = " Yes, Yes," ]; then

        echo 1

    else

        echo 0

    fi

    exit

elif [ -e $CACHEFILE ]; then

    # Check and run the script

    #TIMEFLM=`stat -c %Y /tmp/zabbix/$HOST-mysql_cacti_stats.txt:3317`

    TIMEFLM=`stat -c %Y /tmp/$HOST-mysql_cacti_stats.txt`

    TIMENOW=`date +%s`

    if [ `expr $TIMENOW - $TIMEFLM` -gt 300 ]; then

        #rm -f $CACHEFILE  # alternatively, just comment this line out and leave out the echo "" > $CACHEFILE below

        echo "" > $CACHEFILE

        $CMD 2>&1 > /dev/null

    fi

else

    $CMD 2>&1 > /dev/null

fi

# Parse cache file

if [ -e $CACHEFILE ]; then

    cat $CACHEFILE | sed 's/ /\n/g; s/-1/0/g'| grep $ITEM | awk -F: '{print $2}'

else

    echo "ERROR: run the command manually to investigate the problem: $CMD"

fi
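The part of the script that matters for this fix is the age check followed by emptying the file instead of deleting it. Here is a standalone sketch of just those two steps, with hypothetical function names; it relies on GNU stat, as the wrapper itself does, and uses : > FILE, which leaves a zero-byte file where echo "" > FILE leaves a single newline.

```shell
#!/bin/sh
# cache_is_stale FILE MAX_AGE_SECONDS   (hypothetical helper)
# Succeeds when FILE was last modified more than MAX_AGE_SECONDS ago.
cache_is_stale() {
    mtime=$(stat -c %Y "$1") || return 2
    now=$(date +%s)
    [ $((now - mtime)) -gt "$2" ]
}

# reset_cache FILE   (hypothetical helper)
# Empties FILE in place; the file itself, its owner and its mode are kept.
reset_cache() {
    : > "$1"
}

# Example mirroring the 300-second threshold used by the wrapper:
# if cache_is_stale /tmp/localhost-mysql_cacti_stats.txt 300; then
#     reset_cache /tmp/localhost-mysql_cacti_stats.txt
# fi
```

Because the file is never unlinked, no delete permission on /tmp is involved. Note that emptying the file also updates its modification time, so the age check treats it as fresh until the stats command rewrites it.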

Then restart agentd and check again from zabbix-server. The values are back, as shown below:

[root@zabbix_serv_121_12 scripts]# /usr/local/zabbix/bin/zabbix_get -s 192.161.3.72 -p10050 -k "MySQL.innodb-transactions"

1131684198

[root@zabbix_serv_121_12 scripts]# /usr/local/zabbix/bin/zabbix_get -s 192.161.3.72 -p10050 -k "MySQL.Threads-connected"

4

[root@zabbix_serv_121_12 scripts]#


Reposted from blog.csdn.net/u010735147/article/details/81017528