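While writing a crawler with HttpClient recently, I ran into java.net.BindException: Address already in use: connect
After a closer look, I found that one of my static helper methods called new HttpClient() on every invocation and closed the client once it was done.
As a result, every HTTP request created a new HttpClient and opened a new TCP connection, and each connection takes up a local port. A closed connection keeps its port for a while (TIME_WAIT), so calling that method at high frequency from multiple threads uses up ports faster than the system frees them, which produces the following error: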
java.net.BindException: Address already in use: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
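To make the port cost concrete, here is a minimal sketch using only the JDK, not HttpClient. It assumes a throwaway local server on the loopback address; the class name EphemeralPortDemo and the method localPortsOf are made up for this illustration. Every simultaneously open client connection to the same target gets its own local port, which is the resource that runs out.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

public class EphemeralPortDemo {

    /** Opens count client connections to a local server and returns each one's local port. */
    static List<Integer> localPortsOf(int count) throws IOException {
        List<Integer> ports = new ArrayList<>();
        List<Socket> open = new ArrayList<>();
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS to pick any free listening port.
        try (ServerSocket server = new ServerSocket(0, count, loopback)) {
            try {
                for (int i = 0; i < count; i++) {
                    Socket client = new Socket(loopback, server.getLocalPort());
                    open.add(client);
                    // Accept the server side so the listen backlog never fills up.
                    open.add(server.accept());
                    ports.add(client.getLocalPort());
                }
            } finally {
                for (Socket s : open) {
                    s.close();
                }
            }
        }
        return ports;
    }

    public static void main(String[] args) throws IOException {
        List<Integer> ports = localPortsOf(5);
        System.out.println("local ports: " + ports);
        System.out.println("distinct ports used: " + new HashSet<>(ports).size());
    }
}
```

The actual port numbers vary from run to run, but the count of distinct ports matches the number of connections that are open at the same time.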
The fix is to build a connection pool and share a single client across all requests, an application of the flyweight pattern:
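import org.apache.http.HttpResponse;
import org.apache.http.conn.ConnectionKeepAliveStrategy;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.DefaultConnectionKeepAliveStrategy;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
import org.apache.http.protocol.HttpContext;

/**
 * HttpClient connection pool. Fixes the high TCP connection count caused by
 * creating a new HttpClient for every request.
 */
public class HttpPool
{
    private static CloseableHttpClient httpClient = null;

    // synchronized so that concurrent callers create the shared client only once
    public static synchronized CloseableHttpClient getHttpClient()
    {
        if (httpClient == null)
        {
            PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
            // Maximum number of connections in the whole pool
            cm.setMaxTotal(200);
            // Maximum number of connections per route (a route is essentially one IP plus one port)
            cm.setDefaultMaxPerRoute(100);
            // A limit for one specific route could be set with cm.setMaxPerRoute(route, max); none is set here
            // Keep-alive strategy: use the server's Keep-Alive timeout, or 60 seconds if it gives none
            ConnectionKeepAliveStrategy kaStrategy = new DefaultConnectionKeepAliveStrategy()
            {
                @Override
                public long getKeepAliveDuration(HttpResponse response, HttpContext context)
                {
                    long keepAlive = super.getKeepAliveDuration(response, context);
                    if (keepAlive == -1)
                    {
                        // The server sent no timeout; keep idle connections for 60000 ms
                        keepAlive = 60000;
                    }
                    return keepAlive;
                }
            };
            httpClient = HttpClients.custom().setConnectionManager(cm).setKeepAliveStrategy(kaStrategy).build();
        }
        return httpClient;
    }
}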
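
The keep-alive rule in the class above can also be shown without HttpClient. The sketch below is a simplified stand-in, not the library's code: the class KeepAliveFallback and its header parsing are made up for this illustration. It takes the timeout value from a Keep-Alive response header (in seconds) when one is present and otherwise falls back to 60000 ms, mirroring the -1 check in getKeepAliveDuration.

```java
public class KeepAliveFallback {

    // Same fallback value as in HttpPool: 60 seconds.
    static final long DEFAULT_KEEP_ALIVE_MS = 60_000L;

    /**
     * Returns how long an idle connection may be kept, in milliseconds.
     * keepAliveHeader is the raw value of a Keep-Alive header, e.g. "timeout=5, max=100",
     * or null if the server sent none.
     */
    static long keepAliveMillis(String keepAliveHeader) {
        if (keepAliveHeader != null) {
            for (String part : keepAliveHeader.split(",")) {
                String[] kv = part.trim().split("=", 2);
                if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("timeout")) {
                    try {
                        // The header gives seconds; convert to milliseconds.
                        return Long.parseLong(kv[1].trim()) * 1000L;
                    } catch (NumberFormatException ignored) {
                        // Malformed value: treat it as if no timeout was given.
                    }
                }
            }
        }
        return DEFAULT_KEEP_ALIVE_MS;
    }

    public static void main(String[] args) {
        System.out.println(keepAliveMillis("timeout=5, max=100")); // prints 5000
        System.out.println(keepAliveMillis(null));                 // prints 60000
    }
}
```

Without a fallback like this, a connection whose server sends no timeout would be treated as reusable indefinitely, so a finite default lets the pool drop connections that the server may already have closed.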