Pitfalls I stepped into while using the error collection platform Sentry

Foreword

Introduction to Sentry

Sentry is a professional enterprise-grade error tracking and log analysis tool designed to help developers, administrators and product managers track, analyze and resolve application errors and performance issues.

Key features and benefits of Sentry include:

Error tracking: Sentry can track errors in the application and log them so that developers can quickly locate and fix problems (see the minimal SDK sketch after this overview).

Log analysis: Sentry can analyze application logs and provide detailed information such as error levels, call stacks, and database access, helping developers pinpoint the root cause quickly.

Notifications and alerts: Sentry can notify developers of errors and performance issues via email, Slack, PagerDuty, and other channels, so they can respond and resolve them in time.

Extensibility: Sentry supports custom error messages, extended error-tracking functions, and more; developers can customize and extend it according to their needs.

Team collaboration: Sentry supports team workflows, making it easy to share error and log information, with multiple people editing and commenting at the same time.

Overall, Sentry is a powerful, easy-to-use, enterprise-level error tracking and log analysis tool that can help developers and administrators better manage and resolve errors and performance issues in applications.
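
To make the error-tracking feature concrete, here is a minimal sketch of reporting an exception from a Python application with the official sentry-sdk package. The DSN below is a placeholder; you would replace it with the one from your own project settings:

import sentry_sdk

# Initialize the SDK once at application startup.
# The DSN is a placeholder; use the one from your Sentry project settings.
sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",
    traces_sample_rate=0.1,  # sample 10% of transactions for performance monitoring
)

try:
    1 / 0
except ZeroDivisionError as exc:
    # Send the exception, including its stack trace, to Sentry.
    sentry_sdk.capture_exception(exc)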

This article mainly covers some problems I encountered while running Sentry.

Problem collection

Question 1: uWSGI listen queue of socket “127.0.0.1:42563” (fd: 3) full !!! (101/100)

This happens because uWSGI's listen queue is full. The default listen queue length is 100, so the fix is simply to increase it.

Specifically, modify /onpremise/sentry/sentry.conf.py:

SENTRY_WEB_OPTIONS = {
    ....
    "listen": 10240,
    ....
}

However, after adjusting this and restarting, there is a good chance you will then see:

Listen queue size is greater than the system max net.core.somaxconn (128)

At this point you need to modify a system parameter. If Sentry is deployed directly on the host machine, edit /etc/sysctl.conf and add the following:

# Maximum length of the pending-connection queue for each listening socket,
# i.e. connections waiting to be accept()ed; the default is 128.
net.core.somaxconn = 10240

Then run sysctl -p to reload the parameters. However, if Sentry is deployed via docker-compose, changing this on the host has no effect inside the containers; you have to configure it in docker-compose.yml instead.

Example:

version: '3'
services:
  ...:
    image: ...
    container_name: ...
    privileged: true
    sysctls:
      net.core.somaxconn: '10240'

For background, see the following discussion:
https://github.com/docker/compose/issues/3765#issuecomment-402929969
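
To verify that the setting actually took effect inside the container (the host value does not apply to the container's network namespace), you can read /proc/sys/net/core/somaxconn from within it. A minimal sketch, assuming a hypothetical container name sentry-web:

import subprocess

# "sentry-web" is a placeholder; substitute your actual container name.
CONTAINER = "sentry-web"

# net.core.somaxconn is per network namespace, so it must be read
# inside the container, not on the host.
out = subprocess.run(
    ["docker", "exec", CONTAINER, "cat", "/proc/sys/net/core/somaxconn"],
    capture_output=True, text=True, check=True,
)
somaxconn = int(out.stdout.strip())
print(f"effective somaxconn in {CONTAINER}: {somaxconn}")
assert somaxconn >= 10240, "sysctls setting did not take effect"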

Question 2: a client request body is buffered to a temporary file

After running for about a week, uWSGI reported again:

uWSGI listen queue of socket "127.0.0.1:43523" (fd: 3) full !!! (10241/10240)

This means the queue was full again. Checking nginx, I found the warning from Question 2. It occurs because the client request body buffer is too small, so nginx writes the request body to a temporary file on disk.

To address this, configure the following parameters in nginx:

client_max_body_size 100m;
client_body_buffer_size 10m;

This increases the size of the request body buffer. But tuning it alone did not stop the queue warnings:

uWSGI listen queue of socket "127.0.0.1:43523" (fd: 3)

What finally helped was increasing the number of uWSGI workers and threads (the defaults are 4 workers and 3 threads) and the keepalive duration (default 30s). Modify /onpremise/sentry/sentry.conf.py and adjust the worker and thread counts to suit the number of CPU cores on your system. A sample configuration:

SENTRY_WEB_OPTIONS = {
    ....
    "so-keepalive": True,
    # Keep this between 15s-75s as that's what Relay supports
    "http-keepalive": 60,
    # the number of web workers
    "workers": 8,
    "threads": 8,
}

Question 3: worker 3 lifetime reached, it was running for 86401 second(s)

After running for about another week, there were no more reports of

uWSGI listen queue of socket "127.0.0.1:43523" (fd: 3) full 

Instead, it reported the error from Question 3. I later found the solution through https://stackoverflow.com/questions/66489889/uwsgi-resets-worker-on-lifetime-reached-causes-downtime, which is to change max-worker-lifetime to max-worker-lifetime-delta. The specific reasons can be found at
https://github.com/unbit/uwsgi/issues/2020
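
My reading of the issue above: with only max-worker-lifetime, all workers hit their lifetime at the same moment and respawn together, briefly leaving nothing to drain the listen queue; max-worker-lifetime-delta staggers the respawns by adding worker_id * delta seconds to each worker's lifetime. A small sketch of that arithmetic, assuming a base lifetime of 86400s (suggested by the "running for 86401 second(s)" message):

# Assumed semantics: each worker's effective lifetime is
#   max-worker-lifetime + worker_id * max-worker-lifetime-delta
BASE_LIFETIME = 86400   # seconds; the assumed base max-worker-lifetime
DELTA = 86400           # the max-worker-lifetime-delta from the config below
WORKERS = 8

for worker_id in range(1, WORKERS + 1):
    lifetime = BASE_LIFETIME + worker_id * DELTA
    print(f"worker {worker_id} respawns after {lifetime} s ({lifetime / 86400:.0f} days)")
# Workers now respawn one day apart instead of all at once.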

Sample configuration:

Modify /onpremise/sentry/sentry.conf.py:

SENTRY_WEB_OPTIONS = {
    ....
    "max-worker-lifetime-delta": 86400,
    ....
}
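
Putting the three fixes together, here is a sketch of the combined SENTRY_WEB_OPTIONS from this article; treat the values as a starting point and tune workers and threads to your CPU count:

SENTRY_WEB_OPTIONS = {
    # Question 1: enlarge the uWSGI listen queue
    # (requires net.core.somaxconn >= this value, see above)
    "listen": 10240,
    # Question 2: more workers/threads and longer keepalive
    "so-keepalive": True,
    # Keep this between 15s-75s as that's what Relay supports
    "http-keepalive": 60,
    "workers": 8,
    "threads": 8,
    # Question 3: stagger worker respawns instead of restarting all at once
    "max-worker-lifetime-delta": 86400,
}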

Since then, Sentry has been running for more than half a year without any of the above problems.

Summary

This article mainly records the problems I encountered while using Sentry. Why record them? Because during troubleshooting I first went to the official GitHub repository to look for a solution, and what I found was either pure theory or a recommendation to upgrade the version. After searching through search engines and trying many suggestions, the problem was either not solved, or seemed solved only to reappear later. So I wrote this article: first, so I can review it myself later, and second, in the hope that it helps anyone facing similar problems.

Origin: blog.csdn.net/kingwinstar/article/details/130336360