Connection reset by peer (understand it in seconds)



1 Error message

Connection reset by peer

It appears in the nginx error log, for example:

Connection reset by peer) while reading response header from upstream, client: 1.1.1.1, server: 102.local, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000

How to view the log

This error is found in the nginx error log. To capture nginx anomalies more completely, it is strongly recommended to add the following directive to the global (main) configuration of nginx:

error_log   logs/error.log notice;

With this setting, nginx records messages at the notice level and above, so its errors and warnings are logged in detail.

2 Reason 1: The upstream has closed the connection

The server really did close the connection: the upstream sent an RST to reset it.

errno 104 (ECONNRESET) means that the peer reset the connection. The typical sequence is: the peer socket has already been closed, the local side still calls write or send on the connection, and the peer answers with an RST segment. If the local side then continues to read or write on that socket, the call fails with errno 104, whose description is "connection reset by peer".

In other words, if the peer socket has performed close and the local socket keeps sending and receiving data on the connection, the peer is triggered to send an RST packet. According to TCP's four-way close (the FIN/ACK exchange), the local socket should start its own close procedure at this point instead of sending and receiving more data.
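The reset described above can be reproduced on the loopback interface. The following is a minimal sketch, not taken from the original article: the function name provoke_reset is made up for illustration, and the "upstream" side forces an RST by closing with SO_LINGER set to a zero timeout, which is one common way to abort a TCP connection.

```python
import errno
import socket
import struct


def provoke_reset():
    """Return the errno a client sees when its peer aborts the connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    upstream, _ = listener.accept()
    # l_onoff=1, l_linger=0: close() aborts the connection with RST, not FIN.
    upstream.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack("ii", 1, 0))
    upstream.close()
    try:
        client.recv(1024)  # reading after the RST fails with ECONNRESET
        return None
    except ConnectionResetError as exc:
        return exc.errno
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    code = provoke_reset()
    print(code, errno.errorcode.get(code))
```

On Linux, ECONNRESET has the value 104, which is the number nginx prints in its log message.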

  • For example, when the backend is a PHP program: request_terminate_timeout in php-fpm.conf sets the maximum time a PHP script may run. If a PHP script runs slowly and exceeds that number of seconds, the php-fpm process manager forcibly terminates it and closes the FastCGI connection to nginx, and nginx then logs a Connection reset by peer error.

In other words, the cause of this error is that the running time of the PHP program exceeded the value of request_terminate_timeout.

In a php-fpm environment this option lives in etc/php-fpm.conf under the PHP installation directory and accepts 0 or a larger value, where 0 disables the limit. Setting request_terminate_timeout to a larger value, or to 0, reduces the Connection reset by peer errors that nginx reports because a PHP script runs for too long.

  • For example, when the backend is a Java program:

Java is similar: the Java side should not actively close a connection that nginx is still using. If the upstream Tomcat or Netty has already closed the connection, nginx will report Connection reset by peer.

3 Reason 2: Inconsistent data length

The sender and the receiver disagree about the data length: the length announced to the receiver is smaller than the length of the data the sender actually sends.

4 Reason 3: FastCGI buffers are too small and the timeout is too short

The nginx buffers are too small and the timeout is too short. This mainly concerns PHP environments: for nginx to serve PHP scripts, PHP support must be provided by configuring the fastcgi module.

**Summary of the problem:** The data stream produced by the base64-encoded image is too large, so generating the QR code image for the mini program's sharing popup fails.

Add the following parameters to the http block of the nginx configuration:

fastcgi_buffer_size 128k;

fastcgi_buffers 4 128k;

fastcgi_busy_buffers_size 128k;

fastcgi_temp_file_write_size 128k;

Backend error:

[screenshot not preserved]

Troubleshooting:

Client ------> nginx ------> h5 ------> nginx ------> client

The client opens the h5 page through nginx, and nginx reverse-proxies the request to h5 [no exception]

h5 calls the corresponding interface on behalf of the client request [no exception]

The data returned by the interface is sent to the client through nginx [exception]

PS: The image is generated as base64 and returned to the client, so the response data is very long.

Solution:

Adjust the parameters in the nginx configuration file. The modified values:

fastcgi_buffer_size 256k;
fastcgi_buffers 4 256k;
fastcgi_busy_buffers_size 256k;
fastcgi_temp_file_write_size 256k;

A brief word about the buffer mechanism of Nginx. Nginx buffers the response from the FastCGI server in memory and then sends it on to the client browser. The size of this buffer is controlled by two values, fastcgi_buffers and fastcgi_buffer_size.

For example, the following configuration:

fastcgi_buffers      8 4K;
fastcgi_buffer_size  4K;

fastcgi_buffers lets nginx create up to 8 buffers of 4K each, while fastcgi_buffer_size is the size of the first buffer used when processing the response and is not counted among the former. So the maximum in-memory buffer that can be created in total is 8 × 4K + 4K = 36K. These buffers are allocated on demand according to the actual response size, not all at once. For example, for an 8K page, Nginx creates 2 × 4K, that is, 2 buffers.

When the response is 36K or smaller, all of the data is handled in memory. What if the response is larger than 36K? That is what the fastcgi_temp directory (set with fastcgi_temp_path) is for: the extra data is written temporarily to a file under that directory, and error.log shows a warning that the upstream response is buffered to a temporary file.

Obviously, if the buffers are set too small, Nginx reads and writes the hard disk frequently, which hurts performance badly. But bigger is not always better either: buffers far larger than the responses are pointless, ha ha!
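The arithmetic above can be checked with a few lines of Python. This is only a sketch of the 8 × 4K + 4K rule as the article states it, not of nginx's actual allocator; the helper names parse_size, max_memory_buffer and spills_to_temp_file are invented for this example.

```python
def parse_size(text):
    """Convert an nginx-style size such as '4K' or '1m' to bytes."""
    units = {"k": 1024, "m": 1024 * 1024}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def max_memory_buffer(buffers_count, buffers_size, first_buffer_size):
    """Upper bound on in-memory buffering: count x size + first buffer."""
    return buffers_count * parse_size(buffers_size) + parse_size(first_buffer_size)


def spills_to_temp_file(response_bytes, buffers_count, buffers_size,
                        first_buffer_size):
    """True when the response would not fit in memory under that rule."""
    limit = max_memory_buffer(buffers_count, buffers_size, first_buffer_size)
    return response_bytes > limit


if __name__ == "__main__":
    # fastcgi_buffers 8 4K; fastcgi_buffer_size 4K;
    print(max_memory_buffer(8, "4K", "4K") // 1024, "K")
    print(spills_to_temp_file(40 * 1024, 8, "4K", "4K"))
```

With the example configuration, a 36K response still fits in memory, while a 40K response would be partly written to the temporary directory.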

Main FastCGI buffer and timeout parameters

fastcgi_buffers 4 64k;

Specifies how many local buffers, and of what size, are used to read the response from the FastCGI process. If a page generated by a PHP or Java script is 256 KB, 4 buffers of 64 KB are allocated to hold it; if the page is larger than 256 KB, the part beyond 256 KB is written to the path specified by fastcgi_temp. That is not ideal, because data in memory is processed faster than data on disk. In general, the total should match the median size of the pages generated by the PHP or Java scripts on the site. If most generated pages are about 256 KB, the value can be set to 16 16k, 4 64k, and so on.

fastcgi_buffer_size 64k;

How large a buffer is needed to read the first part of the FastCGI response? This value means that a 64 KB buffer is used to read the first part of the response (the response header). It can be set to the same size as one buffer in fastcgi_buffers.

fastcgi_connect_timeout 300;

The timeout for connecting to the FastCGI backend, in seconds (the same unit applies below).

fastcgi_send_timeout 300;

The timeout for sending a request to FastCGI. It applies once the connection has been established, and is measured between two successive write operations rather than over the whole request.

fastcgi_read_timeout 300;

The timeout for receiving the FastCGI response. It likewise applies once the connection has been established, and is measured between two successive read operations.
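The sizing rule for fastcgi_buffers described above (cover the median page size) can be turned into a small helper. This is a rough sketch under that rule only: the function name suggest_fastcgi_buffers and the sample sizes are hypothetical, and real traffic should be measured before changing a production configuration.

```python
import math
import statistics


def suggest_fastcgi_buffers(page_sizes_bytes, buffer_kb=16):
    """Suggest a fastcgi_buffers line whose total covers the median page size."""
    median = statistics.median(page_sizes_bytes)
    count = max(1, math.ceil(median / (buffer_kb * 1024)))
    return f"fastcgi_buffers {count} {buffer_kb}k;"


if __name__ == "__main__":
    # Hypothetical page sizes; the median is 256 KB as in the text.
    sizes = [100 * 1024, 256 * 1024, 300 * 1024]
    print(suggest_fastcgi_buffers(sizes, buffer_kb=16))
    print(suggest_fastcgi_buffers(sizes, buffer_kb=64))
```

For a median of 256 KB this yields the two settings mentioned in the text, 16 buffers of 16k or 4 buffers of 64k.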

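5 Reason 4: The proxy buffers are too small

The cause is that the headers are too large, which results in a 502 error.

The solution is to enlarge the header buffers:

http {
    # buffer for reading the client request header
    client_header_buffer_size 5m;

    # a location block must sit inside a server block
    server {
        location / {
            # buffer for the first part (header) of the upstream response
            proxy_buffer_size 128k;
            proxy_busy_buffers_size 192k;
            proxy_buffers 4 192k;
        }
    }
}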


6 Reason 5: Keepalive in the health check module

When the ngx_http_upstream_check_module module uses tcp to check the back-end status, it only performs the TCP three-way handshake and does not close the connection itself; it waits for the server to close it. When the backend is nginx or Tomcat (on Linux), the backend sends a FIN packet to close the connection after the timeout. The error log recv() failed (104: Connection reset by peer) appeared when the backend was IIS. A packet capture showed that IIS does not send a FIN packet to close the connection; after the timeout it sends an RST packet that resets the connection, which causes this problem.

This problem also shows that the detection mechanism of ngx_http_upstream_check_module still needs improvement. If the module actively closed the connection once the back-end status had been checked, the connection reset problem should not occur.

The problem was solved by modifying the source code:

static ngx_check_conf_t ngx_check_types[] = {
    { NGX_HTTP_CHECK_TCP,
      ngx_string("tcp"),
      ngx_null_string,
      0,
      ngx_http_upstream_check_peek_handler,
      ngx_http_upstream_check_peek_handler,
      NULL,
      NULL,
      NULL,
      0,
      1 },

Change the 1 in the last line to 0. Judging from the data structure, this 1 means that keepalive is enabled, so the client does not close the connection itself. Since this is only a TCP port connectivity check, keepalive is not needed; changing the value to 0 disables it.

The modified code is as follows:

static ngx_check_conf_t ngx_check_types[] = {
    { NGX_HTTP_CHECK_TCP,
      ngx_string("tcp"),
      ngx_null_string,
      0,
      ngx_http_upstream_check_peek_handler,
      ngx_http_upstream_check_peek_handler,
      NULL,
      NULL,
      NULL,
      0,
      0 },
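The idea behind the patch, a connectivity check that closes the connection itself instead of waiting for the backend to time out, can be illustrated outside nginx. The following Python sketch is not the module's code; tcp_check is a hypothetical helper that connects and then closes from the checker side.

```python
import socket


def tcp_check(host, port, timeout=1.0):
    """Return True if a TCP connection can be opened, then close it at once."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, or timed out: treat the backend as down.
        return False
    # Closing here sends FIN from the checker side, so the backend never
    # has to time out (and possibly reset) an idle check connection.
    conn.close()
    return True


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    host, port = listener.getsockname()
    print(tcp_check(host, port))
    listener.close()
```

Because the checker closes first, the backend's idle-timeout behavior (FIN on nginx or Tomcat, RST on IIS according to the text) no longer matters for the check.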

7 Reason 6: The lingering_close setting

Even if you disable HTTP keepalive, nginx still tries to process HTTP/1.1 pipelined requests. You can set lingering_close off to disable this behavior, but this is not recommended because it violates the HTTP protocol. See:

http://nginx.org/en/docs/http/ngx_http_core_module.html#lingering_close

Quickly locating Nginx exceptions

| Error message | Error description |
| --- | --- |
| "upstream prematurely closed connection" | Occurs while requesting a URI when the user disconnects before the upstream has returned a response. It has no effect on the system and can be ignored. |
| "recv() failed (104: Connection reset by peer)" | (1) The number of concurrent connections exceeds the server's capacity, and the server drops some of them; (2) the client closed the browser while the server was still sending data; (3) the user pressed Stop in the browser. |
| "(111: Connection refused) while connecting to upstream" | The backend upstream is down or has failed at the time of connecting. |
| "(111: Connection refused) while reading response header from upstream" | The backend upstream went down or failed while the response was being read after a successful connection. |
| "(111: Connection refused) while sending request to upstream" | The backend upstream went down or failed while Nginx was sending data after a successful connection. |
| "(110: Connection timed out) while connecting to upstream" | Nginx timed out while connecting to the upstream. |
| "(110: Connection timed out) while reading upstream" | Nginx timed out while reading the response from the upstream. |
| "(110: Connection timed out) while reading response header from upstream" | Nginx timed out while reading the response header from the upstream. |
| "(104: Connection reset by peer) while connecting to upstream" | The upstream sent an RST to reset the connection. |
| "upstream sent invalid header while reading response header from upstream" | The response header sent by the upstream is invalid. |
| "upstream sent no valid HTTP/1.0 header while reading response header from upstream" | The response header sent by the upstream is invalid. |
| "client intended to send too large body" | The body sent by the client exceeds the maximum allowed request body size, client_max_body_size (default 1M). |
| "reopening logs" | The user sent the kill -USR1 command. |
| "gracefully shutting down" | The user sent the kill -WINCH command. |
| "no servers are inside upstream" | No server is configured under the upstream. |
| "no live upstreams while connecting to upstream" | All servers under the upstream are down. |
| "SSL_do_handshake() failed" | The SSL handshake failed. |
| "ngx_slab_alloc() failed: no memory in SSL session shared cache" | Caused by an insufficient ssl_session_cache size, among other things. |
| "could not add new SSL session to the session cache while SSL handshaking" | Caused by an insufficient ssl_session_cache size, among other things. |
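To use the table above on a real log, it helps to group error lines by errno and by the phase after "while". The sketch below is one possible way to do that; the regular expression, the function names classify and summarize, and the sample lines (modeled on the messages in the table, not copied from a real log) are all assumptions for illustration.

```python
import re
from collections import Counter

# Matches "(104: Connection reset by peer) while reading ... upstream,"
PATTERN = re.compile(
    r"\((?P<errno>\d+): (?P<text>[^)]+)\)(?: while (?P<phase>[^,]+))?"
)


def classify(line):
    """Return (errno, description, phase) for an error line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return int(match.group("errno")), match.group("text"), match.group("phase")


def summarize(lines):
    """Count how often each (errno, description, phase) triple occurs."""
    counts = Counter()
    for line in lines:
        key = classify(line)
        if key is not None:
            counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "recv() failed (104: Connection reset by peer) while reading "
        "response header from upstream, client: 1.1.1.1, server: 102.local",
        "connect() failed (111: Connection refused) while connecting to "
        "upstream, client: 1.1.1.1, server: 102.local",
    ]
    for key, count in summarize(sample).items():
        print(count, key)
```

Sorting the resulting counts shows at a glance whether resets, refusals, or timeouts dominate, and in which phase of the upstream exchange they occur.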

References:

https://github.com/alibaba/tengine/issues/901

https://my.oschina.net/u/1024107/blog/1838968

https://blog.csdn.net/zjk2752/article/details/21236725

http://nginx.org/en/docs/http/ngx_http_core_module.html#lingering_close

https://blog.csdn.net/crj121624/article/details/79956283


Origin blog.csdn.net/crazymakercircle/article/details/109777163