php-fpm process disconnects abnormally [notes from handling a production issue]

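I recently ran into a production issue. Some background first: the code is distributed, a single request hops across several domains, and the flow updates and inserts data in close to ten tables. It also uses Redis as the NoSQL store and RabbitMQ as the message queue, so the whole flow is fairly long. This post summarizes how to track down the error in a situation like this.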

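1. First, of course, use the logs to locate the failure: pin down which domain and which interface is at fault, then check whether the code logic itself is wrong.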

2. The problem turned out to be intermittent: even with the same volume of data it did not always occur.

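3. Next, check the php-fpm slow log. If it contains entries, pull them out and analyze where the request stalled. The slow log is controlled by settings such as: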

request_slowlog_timeout = 2
slowlog = /home/wwwlogs/php_fpm_slow.log

 

The slow log contained the following entry:

[03-Feb-2021 12:11:11]  [pool www] pid 47
script_filename = /home/data/webroot/pay-svc/www/index.php
[0x00007f46edaeb7c0] stream_select() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php:336
[0x00007f46edaeb660] do_select() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Wire/IO/AbstractIO.php:86
[0x00007f46edaeb500] select() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Wire/AMQPReader.php:134
[0x00007f46edaeb3a0] wait() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Wire/AMQPReader.php:163
[0x00007f46edaeb298] rawread() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Wire/AMQPReader.php:107
[0x00007f46edaeb0f0] read() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Connection/AbstractConnection.php:553
[0x00007f46edaeaf58] wait_frame() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Connection/AbstractConnection.php:608
[0x00007f46edaeae28] wait_channel() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Channel/AbstractChannel.php:231
[0x00007f46edaeac90] next_frame() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Channel/AbstractChannel.php:349
[0x00007f46edaeab28] wait() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Connection/AbstractConnection.php:775
[0x00007f46edaea9b0] x_open() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Connection/AbstractConnection.php:246
[0x00007f46edaea7e0] connect() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Connection/AbstractConnection.php:203
[0x00007f46edaea588] __construct() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Connection/AMQPStreamConnection.php:70
[0x00007f46edaea408] __construct() /home/data/webroot/pay-svc/protect/Component/RabbitMq/RabbitMQConnection.php:44
[0x00007f46edaea2e0] __construct() /home/data/webroot/pay-svc/protect/Component/RabbitMq/RabbitMQProducer.php:21
[0x00007f46edaea1b8] __construct() /home/data/webroot/pay-svc/protect/Component/RabbitMq/RabbitMQProducer.php:26
[0x00007f46edaea060] createRabbitMQProducer() /home/data/webroot/pay-svc/protect/Logic/Queue.php:19
[0x00007f46edae9ef0] writeRabbitMQ() /home/data/webroot/pay-svc/protect/Logic/CoreBNotify.php:69
[0x00007f46edae9da8] writeRabbitMq() /home/data/webroot/pay-svc/protect/Logic/CoreBNotify.php:40
[0x00007f46edae9c48] invokeBNotify() /home/data/webroot/pay-svc/protect/Ctrller/BatchSetOffline.php:19
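
Reading a trace like this by eye gets tedious once the slow log has many entries. Frames are listed innermost first, so the first frame is where the worker was sitting when the slow-log timer fired. Below is a minimal sketch of pulling that out of an entry. `parseSlowLogEntry()` is a hypothetical helper of my own, not something php-fpm ships, and it assumes that any line not starting with `[` or `script_filename` is a wrapped continuation of the previous line.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not part of php-fpm): split one slow-log entry into
// the script name and its stack frames, innermost frame first.
function parseSlowLogEntry(string $entry): array
{
    $script = '';
    $frames = [];
    $last = null; // what a wrapped line would continue: 'script' or 'frame'

    foreach (preg_split('/\R/', $entry) as $rawLine) {
        $line = trim($rawLine);
        if ($line === '') {
            continue;
        }
        if (preg_match('/^script_filename\s*=\s*(.*)$/', $line, $m)) {
            $script = $m[1];
            $last = 'script';
        } elseif (preg_match('/^\[0x[0-9a-fA-F]+\]\s+(\S+)\(\)\s+(.*)$/', $line, $m)) {
            $frames[] = ['function' => $m[1], 'location' => $m[2]];
            $last = 'frame';
        } elseif ($line[0] === '[') {
            // Header line such as "[03-Feb-2021 12:11:11]  [pool www] pid 47".
            $last = null;
        } elseif ($last === 'script') {
            // Assumption: the line is the wrapped tail of the script path.
            $script .= $line;
        } elseif ($last === 'frame') {
            // Assumption: the line is the wrapped tail of the frame location.
            $frames[count($frames) - 1]['location'] .= $line;
        }
    }

    return ['script' => $script, 'frames' => $frames];
}

// A short excerpt of the entry above; the first frame is deliberately wrapped.
$entry = <<<'LOG'
[03-Feb-2021 12:11:11]  [pool www] pid 47
script_filename = /home/data/webroot/pay-svc/www/index.php
[0x00007f46edaeb7c0] stream_select() /home/data/webroot/pay-svc
/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php:336
[0x00007f46edaea7e0] connect() /home/data/webroot/pay-svc/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Connection/AbstractConnection.php:203
LOG;

$parsed = parseSlowLogEntry($entry);
$top = $parsed['frames'][0];
echo $top['function'], ' at ', $top['location'], PHP_EOL;
```

For this entry the innermost frame is the `stream_select()` call in `StreamIO.php`, reached through `connect()`: the worker was waiting on the socket while the RabbitMQ connection was still being opened.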

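I then asked ops to check the php-fpm timeout configuration. The slow-log threshold (request_slowlog_timeout) was set to only 3 seconds, while request_terminate_timeout, the limit after which a worker is killed, was 300 seconds: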

www.conf: |-
    [www]
    listen = 127.0.0.1:9000
    user = wwwuser
    group = wwwuser
    pm = static
    pm.max_children = 20
    pm.start_servers = 5
    pm.min_spare_servers = 5
    pm.max_spare_servers = 35
    pm.max_requests = 1024
    request_terminate_timeout = 300
    request_slowlog_timeout = 3
    rlimit_files = 65535
    rlimit_core = 0
    php_flag[display_errors] = off
    php_admin_flag[log_errors] = on
    php_value[session.save_handler] = files
    slowlog = /home/data/logs/www-slow.log
    php_admin_value[error_log] = /home/data/logs/www-error.log
    php_flag[display_errors] = off
    php_admin_flag[log_errors] = on
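
The two timeouts in this config do different things, and they are easy to mix up: `request_slowlog_timeout` only dumps a backtrace to the slow log, whereas `request_terminate_timeout` kills the worker. The sketch below just spells out that distinction with the values above. `fpmActionFor()` is a hypothetical function for illustration; php-fpm enforces these limits itself, and the exact boundary behaviour here is a simplification.

```php
<?php
declare(strict_types=1);

// Hypothetical illustration only: php-fpm applies these limits internally.
// A timeout of 0 means the corresponding feature is switched off.
function fpmActionFor(float $elapsedSeconds, int $slowlogTimeout, int $terminateTimeout): string
{
    if ($terminateTimeout > 0 && $elapsedSeconds >= $terminateTimeout) {
        return 'worker killed';
    }
    if ($slowlogTimeout > 0 && $elapsedSeconds >= $slowlogTimeout) {
        return 'backtrace written to slowlog';
    }
    return 'no action';
}

// Values from the config above: slowlog after 3 s, terminate after 300 s.
foreach ([1.0, 5.0, 301.0] as $elapsed) {
    echo $elapsed, 's: ', fpmActionFor($elapsed, 3, 300), PHP_EOL;
}
```

So a request that shows up in the slow log has only crossed the 3-second mark; it is cut off only if it keeps running past 300 seconds.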

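This interface loops over a large amount of data, updating the DB for each item and then writing to MQ, so this is most likely where things went wrong: a single worker process ran past its timeout and exited.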
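
One thing that might help, going by the trace: the stall sits inside the `AMQPStreamConnection` constructor, reached through `createRabbitMQProducer()` from `writeRabbitMQ()`. If that call runs once per item in the loop (an assumption, I have not confirmed it in the code), every iteration pays for a fresh connection handshake. A minimal sketch of creating the producer once and reusing it is below. `LazyProducer` is a hypothetical name, and the factory uses a stand-in object so the example runs without RabbitMQ.

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper: builds the expensive object (for example an AMQP
// connection plus producer) on first use and hands back the same one after.
final class LazyProducer
{
    /** @var callable */
    private $factory;
    private ?object $producer = null;

    public function __construct(callable $factory)
    {
        $this->factory = $factory;
    }

    public function get(): object
    {
        if ($this->producer === null) {
            $this->producer = ($this->factory)();
        }
        return $this->producer;
    }
}

// Stand-in factory: counts how many times a "connection" is opened.
$connections = 0;
$lazy = new LazyProducer(function () use (&$connections): object {
    $connections++;           // real code would open the AMQP connection here
    return new ArrayObject(); // real code would return the producer
});

foreach (['row-1', 'row-2', 'row-3'] as $row) {
    $lazy->get()->append($row); // real code would publish the message here
}

echo "connections opened: {$connections}", PHP_EOL;
```

With this shape the loop opens one connection for the whole request instead of one per row. Whether that removes the stall depends on why the broker was slow to answer in the first place, so treat it as something to try, not a confirmed fix.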

If you know a better way to handle this, please let me know.

This is an original article by dahai. Feel free to repost it without contacting me, but please credit 大海技术博客 https://www.ypyunedu.com

Origin blog.csdn.net/qq_27229113/article/details/114321435