Resolving a large number of CLOSE_WAIT connections caused by improper use of HttpClient

Case 1:

Recently a very strange problem appeared in our Docker environment running on k8s + Rancher: with nobody touching it, our web service would run for a while and then suddenly hang. As a result, all of our automated test cases failed. So I began the long investigation described below.

First we checked the web service's logs and found no errors thrown. With the automated cases running, we sent requests to the server again, but the response returned to the client was abnormal, which indicated that the requests never reached the application at all. We then checked the Nginx configuration and found nothing wrong there either. So why was this happening? At that point we ran netstat:

netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}' 

We found a large number of sockets on the server in the CLOSE_WAIT state. That was the cause of the problem: the pile of CLOSE_WAIT connections left Tomcat hung (alive but unresponsive), so the web service could not serve requests.
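For readers who want to do the same per-state count from Java, here is a minimal sketch of what the awk one-liner does: keep lines that start with `tcp` and tally the last whitespace-separated field. The class name `TcpStateCounter` and the sample lines in `main` are hypothetical and only illustrate the input format.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TcpStateCounter {
    /** Counts TCP connections by state, using the last field of every line that starts with "tcp". */
    static Map<String, Integer> countStates(List<String> netstatLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            if (!line.startsWith("tcp")) {
                continue; // same filter as awk's /^tcp/ (also matches tcp6)
            }
            String[] fields = line.trim().split("\\s+");
            String state = fields[fields.length - 1]; // awk's $NF
            counts.merge(state, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical netstat -n output, for illustration only.
        List<String> sample = Arrays.asList(
                "Proto Recv-Q Send-Q Local Address    Foreign Address   State",
                "tcp        0      0 10.0.0.5:8080    10.0.0.9:51234    ESTABLISHED",
                "tcp        1      0 10.0.0.5:8080    10.0.0.9:51240    CLOSE_WAIT",
                "tcp        1      0 10.0.0.5:8080    10.0.0.9:51241    CLOSE_WAIT",
                "udp        0      0 0.0.0.0:68       0.0.0.0:*");
        System.out.println(countStates(sample));
    }
}
```

A CLOSE_WAIT count that keeps growing over successive runs is the symptom to watch for.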

 

[Figure: image.png]

As shown below:

[Figure: TCP four-way close (connection termination) diagram]

 

As the TCP four-way close sequence shows, this state arises when the client initiates the close and sends a FIN packet to the server. The server's kernel acknowledges the FIN and the connection enters CLOSE_WAIT on the server side, but the server program never closes its own end of the socket, so its own FIN is never sent and the connection keeps occupying resources. We then checked the Nginx logs and found a large number of 499 status codes. Nginx defines 499 as "client has closed connection", which means the client grew tired of waiting and actively closed the connection.
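The CLOSE_WAIT mechanics can be reproduced with two plain sockets on the loopback interface. The sketch below is illustrative only (the class and method names are made up): one side closes actively, the other side sees end-of-stream, and from that moment until it calls `close()` itself, its socket is in CLOSE_WAIT. Java cannot report the TCP state directly, so use netstat alongside it if you want to observe the state.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class CloseWaitDemo {
    /** Returns what the passive side reads after the peer has closed; -1 means the FIN arrived. */
    static int readAfterPeerClose() {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket active = new Socket(listener.getInetAddress(), listener.getLocalPort());
            try (Socket passive = listener.accept()) {
                active.close(); // active close: sends FIN to the peer
                // Blocks until the FIN arrives, then reports end-of-stream.
                int result = passive.getInputStream().read();
                // Here 'passive' is in CLOSE_WAIT: the kernel has ACKed the FIN,
                // but the application has not closed its end yet.
                return result;
            } // try-with-resources closes 'passive', which sends our FIN and leaves CLOSE_WAIT
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read() after peer close returned " + readAfterPeerClose());
    }
}
```

If the passive side never reached its `close()` call, the socket would stay in CLOSE_WAIT for as long as the process holds it, which is the leak described above.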

 

[Figure: Nginx log]

We then found that during this period only our cookie-fetching microservice was continuously issuing HTTP requests. We stopped that microservice, observed for a while, and saw that the CLOSE_WAIT count stopped growing, so the source of the problem had been found. Looking at the microservice's code, we found:

            try {
                response = HttpUtil.processJsonPost(client
                        , context
                        , gotestUrl
                        , headers,bodyData);
                String jsonBody = EntityUtils.toString(response.getEntity(), "UTF-8");
                JSONObject responseObj = JSONObject.parseObject(jsonBody);
                if (responseObj.getInteger("code")== 200){
                    logger.log(Level.INFO,"success! ");
                }else {
                    logger.log(Level.WARNING,"something going wrong! ");
                }
            }catch (Exception ex){
                logger.log(Level.SEVERE, ex.getMessage(), ex);
            }
            // Problem: neither response nor client is ever closed here.

The code never closes the HTTP connection, which is what produced the large number of CLOSE_WAIT sockets. The modified code:

            try {
                response = HttpUtil.processJsonPost(client
                        , context
                        , gotestUrl
                        , headers,bodyData);
                String jsonBody = EntityUtils.toString(response.getEntity(), "UTF-8");
                JSONObject responseObj = JSONObject.parseObject(jsonBody);
                if (responseObj.getInteger("code")== 200){
                    logger.log(Level.INFO,"success! already send cookie to gotest");
                }else {
                    logger.log(Level.WARNING,"something going wrong! cannot successfully send cookie to gotest");
                }
            }catch (Exception ex){
                logger.log(Level.SEVERE, ex.getMessage(), ex);
            }finally {
                // Close the response first, then the client that produced it.
                if (null != response) {
                    try {
                        response.close();
                    }catch (IOException e) {
                        logger.log(Level.SEVERE, e.getMessage(), e);
                    }
                }
                if (null != client){
                    try {
                        client.close();
                    }catch (IOException e) {
                        logger.log(Level.SEVERE, e.getMessage(), e);
                    }
                }
            }

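After restarting the microservice, the CLOSE_WAIT count indeed stopped growing, and the problem was solved.
This investigation also left me with two reminders:
1. Code must follow good practice, especially where it acquires resources. Write a comment before you write the code so that you do not forget to release the resource.
2. When troubleshooting, analyze step by step and rule out contributing factors one at a time. That is how you locate a problem faster and more accurately.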
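One way to make the first reminder hard to forget is Java's try-with-resources, which closes every declared resource automatically, in reverse order, even when the body throws. The sketch below uses a hypothetical `Tracked` stand-in instead of real HTTP objects so that it runs without any library. With HttpClient 4.3 or later, `CloseableHttpClient` and `CloseableHttpResponse` are closeable and can be declared in the same way, assuming that is the version in use.

```java
import java.util.ArrayList;
import java.util.List;

class TryWithResourcesSketch {
    /** Hypothetical stand-in for a closeable client or response; records when it is closed. */
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add("closed " + name);
        }
    }

    /** Opens two resources, optionally fails while using them, and returns the event log. */
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try (Tracked client = new Tracked("client", log);
             Tracked response = new Tracked("response", log)) {
            if (fail) {
                throw new IllegalStateException("simulated failure while reading the body");
            }
            log.add("body processed");
        } catch (IllegalStateException ex) {
            // Both resources are already closed by the time this handler runs.
            log.add("caught " + ex.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

In both runs the response is closed before the client, and no explicit finally block is needed.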

 

Case 2:

 

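ESTABLISHED: the number of connections that are currently established
TIME_WAIT: the number of connections that this side closed actively and that are waiting for the system to finish closing them
CLOSE_WAIT: the number of connections closed by the peer that are passively waiting for the local program to close them

The previous article gave ways to deal with too many TIME_WAIT connections. This article uses HttpClient as an example to show how to deal with a large number of CLOSE_WAIT connections.

HttpClient is a widely used package for HTTP connections. Note first that the API differs considerably between HttpClient 3.x and 4.x, and the 4.x line is strongly recommended. There are also some differences between individual 4.x releases (deprecated classes, newly added classes, and so on). This article uses version 4.2.3.

HttpClient connects using HTTP 1.1, which improves on HTTP 1.0 by adding persistent connections. To take full advantage of persistent connections, HttpClient does not necessarily close the underlying connection when a request finishes, even if the HttpResponse is closed with close and releaseConnection is called on the HttpGet or HttpPost. Sample code: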

 HttpGet method = null;
 HttpResponse response = null;
 try {
     method = new HttpGet(url);
     response = client.execute(method);
 } catch(Exception e) {
     // handle or log the exception here instead of swallowing it
 } finally {
     if(response != null) {
         // drain the entity so the connection can be returned to the pool
         EntityUtils.consumeQuietly(response.getEntity());
     }
     if(method != null) {
         method.releaseConnection();
     }
 }

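Even so, you will still find connections in the CLOSE_WAIT state. This is because when HttpClient performs the close, it does not close the connection if the Connection header of the response is Keep-Alive, so that the connection can be reused the next time the same site is requested. If the server later closes that idle connection from its side, the socket held by the client sits in CLOSE_WAIT. This is where the CLOSE_WAIT connections come from.

The simplest fix is to add a Connection: close header before calling execute. The HTTP specification defines this option as follows: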

HTTP/1.1 defines the "close" connection option for the sender to signal that the connection will be closed after completion of the response. For example:
	Connection: close 

Sample code:

 HttpGet method = null;
 HttpResponse response = null;
 try {
     method = new HttpGet(url);
     // ask for the connection to be closed after this response
     method.setHeader(HttpHeaders.CONNECTION, "close");
     response = client.execute(method);
 } catch(Exception e) {
     // handle or log the exception here instead of swallowing it
 } finally {
     if(response != null) {
         EntityUtils.consumeQuietly(response.getEntity());
     }
     if(method != null) {
         method.releaseConnection();
     }
 }
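To see why the header changes the outcome, it helps to look at the keep-alive decision itself. The following is a simplified sketch of the HTTP persistence rule, not HttpClient's actual reuse strategy: an explicit `close` token always wins, an explicit `keep-alive` token requests persistence, and otherwise HTTP/1.1 connections are persistent by default while HTTP/1.0 connections are not. The class and method names are hypothetical.

```java
import java.util.Locale;

class KeepAliveSketch {
    /**
     * Simplified persistence decision.
     * protocolVersion is e.g. "HTTP/1.1"; connectionHeader is the Connection header value or null.
     */
    static boolean keepAlive(String protocolVersion, String connectionHeader) {
        boolean sawKeepAlive = false;
        if (connectionHeader != null) {
            for (String token : connectionHeader.split(",")) {
                String t = token.trim().toLowerCase(Locale.ROOT);
                if (t.equals("close")) {
                    return false; // "close" always wins: the connection must not be reused
                }
                if (t.equals("keep-alive")) {
                    sawKeepAlive = true;
                }
            }
        }
        // Without an explicit token, HTTP/1.1 is persistent by default and HTTP/1.0 is not.
        return sawKeepAlive || "HTTP/1.1".equals(protocolVersion);
    }

    public static void main(String[] args) {
        System.out.println(keepAlive("HTTP/1.1", null));
        System.out.println(keepAlive("HTTP/1.1", "close"));
        System.out.println(keepAlive("HTTP/1.0", "Keep-Alive"));
    }
}
```

With `Connection: close` in play the decision is "do not reuse", so the socket is closed right away instead of being parked in the pool.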

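Of course, some people suggest closing the client after every request, but that goes against HttpClient's design principle, which is reuse. Closing the connection after every request is too inefficient. Instead, use PoolingClientConnectionManager and set maxTotal (the maximum number of connections in the whole pool, 20 by default) and defaultMaxPerRoute (the maximum number of connections per host, 2 by default). The client also has a ClientPNames.CONN_MANAGER_TIMEOUT parameter, which sets how long to wait for a connection when none is available in the pool. By default it is the same as CoreConnectionPNames.CONNECTION_TIMEOUT. Tune PoolingClientConnectionManager to your actual workload to get the best efficiency.

One more situation can also produce a large number of CLOSE_WAIT connections: when the status code of the HttpResponse is not 200, you need to call method.abort() promptly to release the connection. For details, see the references below.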
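To make the three settings concrete, here is a small stdlib-only model of a pool with a total limit, a per-route limit, and a lease timeout. It is a conceptual sketch and not how PoolingClientConnectionManager is implemented. The class `RouteLimitedPoolSketch` and its methods are hypothetical, and the numbers in `main` are simply the defaults quoted above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class RouteLimitedPoolSketch {
    private final Semaphore total;          // models maxTotal
    private final int maxPerRoute;          // models defaultMaxPerRoute
    private final ConcurrentHashMap<String, Semaphore> perRoute = new ConcurrentHashMap<>();

    RouteLimitedPoolSketch(int maxTotal, int maxPerRoute) {
        this.total = new Semaphore(maxTotal);
        this.maxPerRoute = maxPerRoute;
    }

    /** Tries to lease a slot for the route; returns false if none frees up within timeoutMs. */
    boolean tryLease(String route, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        Semaphore routeSlots = perRoute.computeIfAbsent(route, r -> new Semaphore(maxPerRoute));
        boolean gotRoute = false;
        try {
            gotRoute = routeSlots.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
            if (gotRoute && total.tryAcquire(deadline - System.nanoTime(), TimeUnit.NANOSECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (gotRoute) {
            routeSlots.release(); // give the route slot back when the total limit blocked us
        }
        return false;
    }

    /** Returns a previously leased slot for the route. */
    void release(String route) {
        Semaphore routeSlots = perRoute.get(route);
        if (routeSlots != null) {
            routeSlots.release();
            total.release();
        }
    }

    public static void main(String[] args) {
        RouteLimitedPoolSketch pool = new RouteLimitedPoolSketch(20, 2);
        System.out.println(pool.tryLease("example.com", 50)); // first lease for this route
        System.out.println(pool.tryLease("example.com", 50)); // second lease for this route
        System.out.println(pool.tryLease("example.com", 50)); // third lease exceeds the per-route limit
    }
}
```

The third lease fails after the timeout because the per-route limit is reached. In the real library, the analogous situation is the "Timeout waiting for connection" error named in the references, which is why leaked connections eventually starve the pool.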

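References:
Parameter settings, coding practices, and risks you must know when using HttpClient
Solved: HttpClient causing too many CLOSE_WAIT connections in an application
Using HttpClient Properly to Avoid CLOSE_WAIT TCP Connections
close_wait troubleshooting
Using HttpClient properly to avoid CLOSE_WAIT TCP connections
Troubleshooting a flood of "ConnectionPoolTimeoutException: Timeout waiting for connection" errors from the HttpClient connection pool
HttpClient 4.2.1 connection pool
A simple crawler example implemented with HttpClient 4.2.1 (repost)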

 

 

 


 


Origin blog.csdn.net/truelove12358/article/details/103122167