NIO connection pool hangs

Today I found that Jetty had stopped responding. Restarting it fixed the problem, but before restarting I took a dump and analyzed the thread stacks in it. All of the Jetty worker threads were blocked waiting for the same lock:

 

"qtp598461443-127" prio=5 tid=127 WAITING
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
Local Variable: java.util.concurrent.locks.AbstractQueuedSynchronizer$Node#286
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
at org.apache.http.nio.pool.AbstractNIOConnPool.lease(AbstractNIOConnPool.java:271)
Local Variable: org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager$InternalPoolEntryCallback#35
Local Variable: org.apache.http.concurrent.BasicFuture#187
at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.requestConnection(PoolingNHttpClientConnectionManager.java:265)
Local Variable: org.apache.http.concurrent.BasicFuture#98
Local Variable: org.apache.http.impl.nio.client.AbstractClientExchangeHandler$1#34
Local Variable: org.apache.http.nio.conn.NoopIOSessionStrategy#1
at org.apache.http.impl.nio.client.AbstractClientExchangeHandler.requestConnection(AbstractClientExchangeHandler.java:358)
Local Variable: org.apache.http.client.config.RequestConfig#96
Local Variable: org.apache.http.conn.routing.HttpRoute#215
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.start(DefaultClientExchangeHandlerImpl.java:125)
at org.apache.http.impl.nio.client.InternalHttpAsyncClient.execute(InternalHttpAsyncClient.java:141)
Local Variable: org.apache.http.nio.client.methods.HttpAsyncMethods$RequestProducerImpl#51
Local Variable: org.apache.http.concurrent.BasicFuture#99
Local Variable: org.apache.http.nio.protocol.BasicAsyncResponseConsumer#40
Local Variable: org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl#42
at org.apache.http.impl.nio.client.CloseableHttpAsyncClient.execute(CloseableHttpAsyncClient.java:74)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClient.execute(CloseableHttpAsyncClient.java:107)
Local Variable: org.apache.http.HttpHost#127
Local Variable: org.apache.http.client.protocol.HttpClientContext#49
******
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1322)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:473)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:479)
Local Variable: org.eclipse.jetty.security.ConstraintSecurityHandler#1
Local Variable: org.eclipse.jetty.security.authentication.BasicAuthenticator#1
Local Variable: org.eclipse.jetty.security.authentication.DeferredAuthentication#1
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:929)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:403)
Local Variable: org.eclipse.jetty.server.DispatcherType#1
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:184)
Local Variable: org.eclipse.jetty.server.session.SessionHandler#1
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:864)
Local Variable: java.lang.String#241755
Local Variable: sun.misc.Launcher$AppClassLoader#1
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:247)
Local Variable: org.eclipse.jetty.util.SingletonList#3
Local Variable: org.eclipse.jetty.server.handler.ContextHandlerCollection#1
Local Variable: org.eclipse.jetty.webapp.WebAppContext#1
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:151)
Local Variable: org.eclipse.jetty.server.Response#610
Local Variable: org.eclipse.jetty.server.Request#498
Local Variable: org.eclipse.jetty.server.handler.HandlerCollection#1
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:114)
at org.eclipse.jetty.server.Server.handle(Server.java:352)
at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:596)
Local Variable: org.eclipse.jetty.server.Server#1
Local Variable: java.lang.String#204075
at org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1051)
Local Variable: org.eclipse.jetty.server.HttpConnection$RequestHandler#416
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:590)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:212)
Local Variable: org.eclipse.jetty.http.HttpParser#411
at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:426)
Local Variable: org.eclipse.jetty.server.nio.SelectChannelConnector$3#616
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:508)
Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint#585
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.access$000(SelectChannelEndPoint.java:34)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:40)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:451)
Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint$1#414
at java.lang.Thread.run(Thread.java:662)

 

 

The key frames are the AbstractNIOConnPool.lease and ReentrantLock.lock calls near the top of the stack. When the Jetty worker thread requested a connection from the asynchronous connection pool PoolingNHttpClientConnectionManager, it could not acquire the lock, and the thread parked there.

Looking at the code of Apache's NIO connection pool, I found that it contains one very important lock:

 

/**
 * Abstract non-blocking connection pool.
 *
 * @param <T> route
 * @param <C> connection object
 * @param <E> pool entry
 *
 * @since 4.2
 */
@ThreadSafe
public abstract class AbstractNIOConnPool<T, C, E extends PoolEntry<T, C>>
                                                  implements ConnPool<T, E>, ConnPoolControl<T> {

    private final ConnectingIOReactor ioreactor;
    private final NIOConnFactory<T, C> connFactory;
    private final SocketAddressResolver<T> addressResolver;
    private final SessionRequestCallback sessionRequestCallback;
    private final Map<T, RouteSpecificPool<T, C, E>> routeToPool;
    private final LinkedList<LeaseRequest<T, C, E>> leasingRequests;
    private final Set<SessionRequest> pending;
    private final Set<E> leased;
    private final LinkedList<E> available;
    private final ConcurrentLinkedQueue<LeaseRequest<T, C, E>> completedRequests;
    private final Map<T, Integer> maxPerRoute;
    private final Lock lock;
    private final AtomicBoolean isShutDown;

    private volatile int defaultMaxPerRoute;
    private volatile int maxTotal;
.

 

 

This lock guards everything. Every modification of the pool's connection state has to acquire it, for example when leasing a connection:

 

/**
 * @since 4.3
 */
public Future<E> lease(
        final T route, final Object state,
        final long connectTimeout, final long leaseTimeout, final TimeUnit tunit,
        final FutureCallback<E> callback) {
    Args.notNull(route, "Route");
    Args.notNull(tunit, "Time unit");
    Asserts.check(!this.isShutDown.get(), "Connection pool shut down");
    final BasicFuture<E> future = new BasicFuture<E>(callback);
    this.lock.lock();
    try {
        final long timeout = connectTimeout > 0 ? tunit.toMillis(connectTimeout) : 0;
        final LeaseRequest<T, C, E> request = new LeaseRequest<T, C, E>(route, state, timeout, leaseTimeout, future);
        final boolean completed = processPendingRequest(request);
        if (!request.isDone() && !completed) {
            this.leasingRequests.add(request);
        }
        if (request.isDone()) {
            this.completedRequests.add(request);
        }
    } finally {
        this.lock.unlock();
    }
    fireCallbacks();
    return future;
}

 

 

Or when a pending connection request completes (release(), which returns a connection to the pool, takes the same lock):

 

protected void requestCompleted(final SessionRequest request) {
    if (this.isShutDown.get()) {
        return;
    }
    @SuppressWarnings("unchecked")
    final T route = (T) request.getAttachment();
    this.lock.lock();
    try {
        this.pending.remove(request);
        final RouteSpecificPool<T, C, E> pool = getPool(route);
        final IOSession session = request.getSession();
        try {
            final C conn = this.connFactory.create(route, session);
            final E entry = pool.createEntry(request, conn);
            this.leased.add(entry);
            pool.completed(request, entry);
            onLease(entry);
        } catch (final IOException ex) {
            pool.failed(request, ex);
        }
    } finally {
        this.lock.unlock();
    }
    fireCallbacks();
}

 

 

Consequently, if any thread blocks while it is holding this lock, the entire connection pool becomes unusable: worker threads that want to lease a connection park on the lock, and threads that want to return a connection park on it as well. Once every worker thread is parked, the whole service stops responding.
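The failure mode described above can be reproduced in miniature with nothing but the JDK. The following is a sketch, not HttpCore code: the class name PoolLockStallDemo, the method name and the thread name are made up for illustration, and a latch stands in for the blocking close(). One thread takes the "pool lock" and blocks, and a second caller tries to take the same lock with a bounded wait.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class PoolLockStallDemo {

    /**
     * Returns true if a "worker" could take the pool lock within waitMillis
     * while another thread is stuck inside the critical section.
     */
    public static boolean canLeaseWhileHolderIsStuck(long waitMillis) {
        // Stand-in for the single lock that guards all pool state.
        final ReentrantLock poolLock = new ReentrantLock();
        final CountDownLatch holderHasLock = new CountDownLatch(1);
        final CountDownLatch letHolderFinish = new CountDownLatch(1);

        // Simulates the I/O dispatcher: takes the lock, then blocks
        // (on a latch here, inside a socket close() in the real dump).
        Thread holder = new Thread(() -> {
            poolLock.lock();
            try {
                holderHasLock.countDown();
                letHolderFinish.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                poolLock.unlock();
            }
        }, "io-dispatcher-sim");
        holder.start();

        try {
            holderHasLock.await();
            // Simulates a Jetty worker calling lease(). A plain lock() would
            // park indefinitely; tryLock with a timeout gives up instead.
            boolean acquired = poolLock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                poolLock.unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            letHolderFinish.countDown(); // let the simulated dispatcher exit
        }
    }

    public static void main(String[] args) {
        System.out.println("lease possible while holder is stuck: "
                + canLeaseWhileHolderIsStuck(200));
    }
}
```

The worker cannot get the lock for as long as the holder is blocked, so the method returns false. The real pool uses lock() rather than tryLock(), which is why the Jetty workers in the dump wait without any time limit.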

 

I then went through the rest of the thread dump, and there was indeed a thread that held the lock and was stuck closing a connection while releasing it:

"I/O dispatcher 48" prio=5 tid=83 RUNNABLE
at sun.nio.ch.FileDispatcher.preClose0(Native Method)
at sun.nio.ch.SocketDispatcher.preClose(SocketDispatcher.java:41)
Local Variable: java.io.FileDescriptor#261
Local Variable: sun.nio.ch.SocketDispatcher#1
at sun.nio.ch.SocketChannelImpl.implCloseSelectableChannel(SocketChannelImpl.java:677)
Local Variable: java.lang.Object#2674
at java.nio.channels.spi.AbstractSelectableChannel.implCloseChannel(AbstractSelectableChannel.java:201)
at java.nio.channels.spi.AbstractInterruptibleChannel.close(AbstractInterruptibleChannel.java:97)
Local Variable: java.lang.Object#2669
Local Variable: sun.nio.ch.SocketChannelImpl#342
at org.apache.http.impl.nio.reactor.IOSessionImpl.close(IOSessionImpl.java:226)
at org.apache.http.impl.nio.NHttpConnectionBase.close(NHttpConnectionBase.java:513)
at org.apache.http.impl.nio.conn.CPoolEntry.closeConnection(CPoolEntry.java:74)
at org.apache.http.impl.nio.conn.CPoolEntry.close(CPoolEntry.java:100)
at org.apache.http.nio.pool.AbstractNIOConnPool.processPendingRequest(AbstractNIOConnPool.java:375)
Local Variable: org.apache.http.conn.routing.HttpRoute#189
at org.apache.http.nio.pool.AbstractNIOConnPool.processNextPendingRequest(AbstractNIOConnPool.java:343)
Local Variable: java.util.LinkedList$ListItr#1
Local Variable: org.apache.http.nio.pool.LeaseRequest#1
at org.apache.http.nio.pool.AbstractNIOConnPool.release(AbstractNIOConnPool.java:317)
Local Variable: org.apache.http.nio.pool.AbstractNIOConnPool$2#25
at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.releaseConnection(PoolingNHttpClientConnectionManager.java:302)
Local Variable: org.apache.http.impl.nio.conn.CPoolEntry#49
at org.apache.http.impl.nio.client.AbstractClientExchangeHandler.releaseConnection(AbstractClientExchangeHandler.java:238)
Local Variable: org.apache.http.impl.nio.conn.CPoolProxy#12
at org.apache.http.impl.nio.client.MainClientExec.responseCompleted(MainClientExec.java:387)
Local Variable: org.apache.http.nio.protocol.BasicAsyncResponseConsumer#37
Local Variable: org.apache.http.impl.nio.client.InternalState#95
Local Variable: org.apache.http.client.protocol.HttpClientContext#16
Local Variable: org.apache.http.message.BasicHttpResponse#3
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:168)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:412)
Local Variable: org.apache.http.nio.protocol.HttpAsyncRequestExecutor$State#8
Local Variable: org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl#40
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:305)
at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:267)
Local Variable: org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionImpl#42
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:116)
at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:164)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:339)
Local Variable: sun.nio.ch.SelectionKeyImpl#94
Local Variable: org.apache.http.impl.nio.reactor.IOSessionImpl#6
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:317)
Local Variable: java.util.HashMap$KeyIterator#18
Local Variable: sun.nio.ch.Util$H2#3
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:278)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:106)
Local Variable: org.apache.http.impl.nio.reactor.BaseIOReactor#2
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:590)
Local Variable: org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker#2
at java.lang.Thread.run(Thread.java:662)
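Searching a large dump by hand for the thread that owns the lock is tedious. On a live JVM the same question can be put to the standard java.lang.management API. The sketch below is only an illustration: the class LockOwnerFinder, its methods and the thread names are hypothetical, and the two threads merely imitate the dispatcher and a Jetty worker.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class LockOwnerFinder {

    /** Name of the thread that owns the lock the given thread waits for, or null. */
    public static String lockOwnerOf(Thread waiter) {
        ThreadInfo info = ManagementFactory.getThreadMXBean().getThreadInfo(waiter.getId());
        return info == null ? null : info.getLockOwnerName();
    }

    /** Recreates the situation with two threads and returns the reported owner. */
    public static String demo() {
        final ReentrantLock poolLock = new ReentrantLock();
        final CountDownLatch holderHasLock = new CountDownLatch(1);
        final CountDownLatch letHolderFinish = new CountDownLatch(1);

        // Imitates the I/O dispatcher that is stuck while holding the lock.
        Thread holder = new Thread(() -> {
            poolLock.lock();
            try {
                holderHasLock.countDown();
                letHolderFinish.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                poolLock.unlock();
            }
        }, "holder-sim");

        // Imitates a Jetty worker parked in ReentrantLock.lock().
        Thread waiter = new Thread(() -> {
            poolLock.lock();
            poolLock.unlock();
        }, "waiter-sim");

        try {
            holder.start();
            holderHasLock.await();
            waiter.start();
            // Poll for at most about 5 seconds until the waiter is parked.
            for (int i = 0; i < 500 && waiter.getState() != Thread.State.WAITING; i++) {
                Thread.sleep(10);
            }
            return lockOwnerOf(waiter);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            letHolderFinish.countDown(); // unblock both simulated threads
        }
    }

    public static void main(String[] args) {
        System.out.println("lock owner: " + demo());
    }
}
```

Applied to every parked worker thread, lockOwnerOf should point at the same owner each time, which is how the I/O dispatcher thread stands out in the dump above.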

 

 

If one thread hangs while holding the lock, every other thread hangs too, and there is no protection mechanism. Does Apache's connection pool need to be improved here?
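One conceivable improvement is to do only bookkeeping under the lock and to run the potentially blocking close() after the lock has been released. The sketch below shows the idea on a made-up miniature pool. DeferredClosePool and all of its methods are hypothetical and are not how HttpCore is implemented.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical miniature pool for illustration; not the HttpCore implementation. */
public class DeferredClosePool<C extends Closeable> {

    private final ReentrantLock lock = new ReentrantLock();
    private final Deque<C> available = new ArrayDeque<>();

    /** Returns a connection to the pool, or closes it if it cannot be reused. */
    public void release(C conn, boolean reusable) {
        C toClose = null;
        lock.lock();
        try {
            if (reusable) {
                available.addFirst(conn);
            } else {
                toClose = conn; // only bookkeeping happens under the lock
            }
        } finally {
            lock.unlock();
        }
        if (toClose != null) {
            try {
                toClose.close(); // may block, but other pool users are not held up
            } catch (IOException ignored) {
                // a failed close leaves nothing to recover in this sketch
            }
        }
    }

    /** Takes an idle connection, or returns null if none is available. */
    public C lease() {
        lock.lock();
        try {
            return available.pollFirst();
        } finally {
            lock.unlock();
        }
    }

    /** True while any thread holds the pool lock. */
    public boolean isLocked() {
        return lock.isLocked();
    }

    /** Returns true if close() was called and the pool lock was free at that moment. */
    public static boolean closeRunsOutsideLock() {
        final DeferredClosePool<Closeable> pool = new DeferredClosePool<>();
        final AtomicBoolean closed = new AtomicBoolean(false);
        final AtomicBoolean lockHeldDuringClose = new AtomicBoolean(true);
        Closeable conn = () -> {
            closed.set(true);
            lockHeldDuringClose.set(pool.isLocked());
        };
        pool.release(conn, false);
        return closed.get() && !lockHeldDuringClose.get();
    }
}
```

With this arrangement a close() that never returns still ties up the thread that called it, but lease() and release() on other threads keep working.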

 

As for why the thread got stuck while closing the connection: after a lot of searching, the explanation most developers give is that it may be a bug triggered by an improper SO_LINGER setting on the TCP connection. SO_LINGER controls how long close() waits for data that is still in the socket send buffer to be transmitted. When the option is not set, close() returns right away and the kernel finishes sending any buffered data and the FIN in the background. When the option is set with a positive wait time, close() can wait up to that time before cleaning up. My program sets this option to 100, and the close may hang if it runs into a system bug or some other problem while waiting. If the option is not set, the hang does not occur.
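For reference, this is what the option looks like at the plain JDK socket level. It is a minimal sketch with a hypothetical class name (SoLingerDemo), and it uses unconnected sockets so that it only sets and reads the option. The linger value passed to Socket.setSoLinger is in seconds, and getSoLinger returns -1 when the option is disabled. The JDK documentation for StandardSocketOptions.SO_LINGER also notes that the option is intended for blocking-mode sockets and that the behavior of close() on a non-blocking socket with the option enabled is not defined, which is relevant because an NIO reactor works with non-blocking channels. In HttpAsyncClient the value is normally passed through the I/O reactor configuration (IOReactorConfig has a setSoLinger builder method in the 4.x line, as far as I know) rather than set on the socket directly.

```java
import java.io.IOException;
import java.net.Socket;

public class SoLingerDemo {

    /** SO_LINGER of a fresh socket; -1 means the option is disabled. */
    public static int defaultLinger() {
        try (Socket s = new Socket()) {
            return s.getSoLinger();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Enables SO_LINGER with the given number of seconds and reads it back. */
    public static int lingerAfterEnabling(int seconds) {
        try (Socket s = new Socket()) {
            s.setSoLinger(true, seconds);
            return s.getSoLinger();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Enables SO_LINGER, disables it again, and reads it back. */
    public static int lingerAfterDisabling() {
        try (Socket s = new Socket()) {
            s.setSoLinger(true, 100);
            s.setSoLinger(false, 0); // the value is ignored when 'on' is false
            return s.getSoLinger();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("default:  " + defaultLinger());
        System.out.println("enabled:  " + lingerAfterEnabling(5));
        System.out.println("disabled: " + lingerAfterDisabling());
    }
}
```

Leaving the option at its default (disabled) corresponds to the configuration that, according to the observation above, does not hang.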

 

Origin http://10.200.1.11:23101/article/api/json?id=326931338&siteId=291194637