Solving the problem of a large number of TIME_WAIT connections on an HTTP server

 

Recently, a large number of connections in the TIME_WAIT state appeared on a user's Tomcat server. As a result, new connections could not be established and the service stopped responding.

First, use the following command to count the connections in each TCP state:

netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'    

After execution, the output looks roughly like this:

    TIME_WAIT 8

    CLOSE_WAIT 323

    SYN_SENT 1

    ESTABLISHED 6171

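The three states seen most often are:

ESTABLISHED means the connection is open and communicating

TIME_WAIT means this end closed the connection first (active close) and is waiting before releasing the socket

CLOSE_WAIT means the peer closed the connection first (passive close) and the local application has not yet closed its socket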
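
The awk one-liner above can also be mirrored in a few lines of Python, which may be convenient when the counts feed a monitoring script. This is only a sketch: the function name `count_tcp_states` and the sample addresses in `SAMPLE` are made up for illustration, and it assumes the state is the last column of each `tcp` line, as in the netstat output.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Tally the last column of every line that starts with 'tcp'.

    Same idea as: awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
    """
    states = Counter()
    for line in netstat_output.splitlines():
        if line.startswith("tcp"):       # matches tcp and tcp6, like /^tcp/
            fields = line.split()
            states[fields[-1]] += 1      # $NF in awk is the last field
    return states


# Hypothetical sample in the layout printed by `netstat -n`.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:8080           10.0.0.9:51512          ESTABLISHED
tcp        0      0 10.0.0.5:8080           10.0.0.9:51513          TIME_WAIT
tcp        0      0 10.0.0.5:8080           10.0.0.7:40022          TIME_WAIT
tcp6       0      0 ::1:8080                ::1:40100               CLOSE_WAIT
"""

if __name__ == "__main__":
    for state, count in count_tcp_states(SAMPLE).most_common():
        print(state, count)
```

To use it on a live machine, the text captured from `netstat -n` would be passed in place of `SAMPLE`.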

 

1. Reasons for TIME_WAIT

To establish a connection, TCP needs to exchange at least three packets, which is known as the TCP three-way handshake. When TCP terminates a connection, however, each side must send a FIN segment and have it acknowledged by the peer, so terminating a connection generally requires the exchange of four segments. Specifically:
1) The application process on one end (the active close) calls close first, which causes TCP to send a FIN segment, indicating that it has finished sending data and requests to close the socket.
2) The other end (the passive close) receives the FIN, and its TCP acknowledges it by sending an ACK segment to the peer. The receipt of the FIN is also passed to the application process as an end-of-file. This end-of-file is not a character in the data: in a TCP byte stream, end-of-file is signalled by sending and receiving a FIN segment.
3) After receiving the end-of-file, the application process on the passive close end calls close on its own socket, which causes its TCP to send a FIN segment as well.
4) When the active close end receives this FIN, its TCP acknowledges it by sending an ACK segment. Note that the active close end enters the TIME_WAIT state only after it has received this FIN and sent the ACK; before that it is in FIN_WAIT_2.
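
The four steps above can be traced with a small transition table. The state names are the standard TCP ones; the event names (`app_close`, `recv_fin`, `recv_ack`, `timeout_2msl`) and the helper `run` are invented for this sketch, and special cases such as a simultaneous close are deliberately left out.

```python
# Simplified TCP close transitions: (current state, event) -> next state.
# Event names are hypothetical labels chosen for this illustration.
TRANSITIONS = {
    # Active close side
    ("ESTABLISHED", "app_close"): "FIN_WAIT_1",   # step 1: send FIN
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",     # step 2: peer ACKs our FIN
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",      # step 4: receive FIN, send ACK
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",      # wait period expires
    # Passive close side
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",    # step 2: receive FIN, send ACK
    ("CLOSE_WAIT", "app_close"): "LAST_ACK",      # step 3: send own FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",           # final ACK arrives
}


def run(events, state="ESTABLISHED"):
    """Return the list of states visited while applying the events in order."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


if __name__ == "__main__":
    print(" -> ".join(run(["app_close", "recv_ack", "recv_fin", "timeout_2msl"])))
    print(" -> ".join(run(["recv_fin", "app_close", "recv_ack"])))
```

The first trace is the side that called close first and ends up passing through TIME_WAIT; the second is the peer, which sits in CLOSE_WAIT until its application calls close.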

Why the TIME_WAIT state exists: as the figure shows, the TIME_WAIT state occurs on the active close side, starting after the ACK K+1 segment is sent. The reason is to guard against that ACK segment being lost in the network. At this point the passive close side is in the LAST_ACK state, waiting for the ACK. If the ACK really is lost (the LAST_ACK on the passive close side times out), the passive close side sends the FIN K segment again, and the active close side must still be in TIME_WAIT to acknowledge it. This is why the FIN segment appears twice in the figure.

For HTTP, which runs over TCP, it is often the server side that closes the TCP connection, so the server side enters the TIME_WAIT state. A web server with a large number of visits can therefore accumulate a large number of TIME_WAIT connections. If the server receives 1000 requests per second and each connection then stays in TIME_WAIT for 240 seconds (2 × MSL, with an MSL of 2 minutes), there will be a backlog of 240 × 1000 = 240,000 TIME_WAIT records, and maintaining these states puts a burden on the server. Of course, modern operating systems use fast lookup algorithms to manage TIME_WAIT entries, so checking whether a new TCP connection request hits an existing TIME_WAIT entry does not take much time, but maintaining so many states is still undesirable.
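
The 240*1000 figure is a steady-state estimate: connections closed per second multiplied by how long each one lingers in TIME_WAIT. A minimal sketch of that arithmetic follows; the function name `time_wait_backlog` is made up here, and it assumes every request uses its own connection that the server closes. The 60-second case is included only for comparison, as the TIME_WAIT length usually cited for Linux.

```python
def time_wait_backlog(closes_per_second: int, time_wait_seconds: int) -> int:
    """Estimate how many sockets sit in TIME_WAIT at steady state.

    Assumes each request uses one connection and the server closes it,
    so every request leaves one TIME_WAIT entry behind for the full wait.
    """
    return closes_per_second * time_wait_seconds


if __name__ == "__main__":
    # The article's example: 1000 requests/s with a 240 s wait (2 x 120 s MSL).
    print(time_wait_backlog(1000, 240))
    # Same load with a 60 s wait, for comparison.
    print(time_wait_backlog(1000, 60))
```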

 

2. Solution for a large number of TIME_WAIT connections

Modify the /etc/sysctl.conf file:

# Number of times the kernel retransmits the initial SYN for a new outgoing connection before giving up. It should not be greater than 255; the default value is 5, which corresponds to about 180 seconds (limits and defaults differ between kernel versions)
net.ipv4.tcp_syn_retries=2
#net.ipv4.tcp_synack_retries=2
# Idle time before TCP starts sending keepalive probes when keepalive is enabled. The default is 2 hours (7200 seconds); here it is reduced to 1200 seconds
net.ipv4.tcp_keepalive_time=1200
net.ipv4.tcp_orphan_retries=3
# If the socket is closed by the local end, this parameter determines how long it may remain in the FIN-WAIT-2 state
net.ipv4.tcp_fin_timeout=30
# Length of the SYN queue. The default is 1024; it is raised to 4096 here so that more connections can wait to be accepted
net.ipv4.tcp_max_syn_backlog = 4096
# Enable SYN cookies. When the SYN queue overflows, cookies are used to handle new requests, which helps against small-scale SYN flood attacks. A value of 0 means disabled
net.ipv4.tcp_syncookies = 1

# Enable reuse: allow TIME-WAIT sockets to be reused for new outgoing TCP connections. A value of 0 means disabled
net.ipv4.tcp_tw_reuse = 1
# Enable fast recycling of TIME-WAIT sockets. A value of 0 means disabled. Caution: this option is known to drop connections from clients behind NAT, and it was removed in Linux 4.12, so omit this line on newer kernels
net.ipv4.tcp_tw_recycle = 1

## Reduce the number of keepalive probes sent before the connection is considered dead
net.ipv4.tcp_keepalive_probes=5
## Enlarge the network device receive queue
net.core.netdev_max_backlog=3000

After modifying the file, execute:

/sbin/sysctl -p

The new parameters then take effect.
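
One way to check that the settings were applied is to compare the file with what the kernel reports under /proc/sys, where each dotted key maps to a path. The sketch below assumes a Linux host; the helper names `parse_sysctl_conf`, `proc_path` and `read_live_value` are invented for this example, and the parser handles only simple `key = value` lines with `#` or `;` comments.

```python
from pathlib import Path


def parse_sysctl_conf(text: str) -> dict:
    """Parse simple 'key = value' lines, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:                                  # ignore lines without '='
            settings[key.strip()] = value.strip()
    return settings


def proc_path(key: str) -> str:
    """Map a dotted sysctl key to its file under /proc/sys."""
    return "/proc/sys/" + key.replace(".", "/")


def read_live_value(key: str):
    """Return the kernel's current value, or None if the key does not exist."""
    path = Path(proc_path(key))
    return path.read_text().strip() if path.exists() else None


# A few lines in the same format as the fragment above.
SAMPLE_CONF = """\
# comment line
net.ipv4.tcp_fin_timeout=30
net.ipv4.tcp_tw_reuse = 1
#net.ipv4.tcp_synack_retries=2
"""

if __name__ == "__main__":
    for key, wanted in parse_sysctl_conf(SAMPLE_CONF).items():
        print(key, "wanted:", wanted, "live:", read_live_value(key))
```

A key that prints `live: None` does not exist on the running kernel, which is a quick way to spot an option the kernel no longer supports.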

Among the parameters modified above, the four most important are:

net.ipv4.tcp_tw_reuse

net.ipv4.tcp_tw_recycle

net.ipv4.tcp_fin_timeout

net.ipv4.tcp_keepalive_*

In general, modifying these is enough.

 

 

