Druid 1.1.23 source code analysis: how getConnection obtains a connection

Motivation:

A few pages in one of our projects are very slow to open. They are rarely used and only run simple queries, yet the first visit is routinely slow (>10s), while every visit after that is fast (<100ms). This is a multi-data-source project, several pages behave this way, and they all use the same database. Our unit runs a network access management system, so I suspected it closes connections that sit idle too long; the pool's first getConnection() then has to create a new physical connection before it can return a usable one.

I tried many Druid settings: testWhileIdle, timeBetweenEvictionRunsMillis, removeAbandoned, removeAbandonedTimeoutMillis, initialSize, minIdle, maxActive, and so on, but the problem persisted.

Through Druid's monitoring we found:

After the data source had been unused for a long time, its stats read: physical open count 1, physical close count 0, logical open count 1, logical close count 1, active connection count 0, pool idle count 1. By that point the lone idle connection had been idle too long and was no longer usable. One oddity worth flagging: my minIdle is clearly set to 10, so why is the pool idle count below 10?

When the business code then calls the data source's getConnection() method, it first receives that idle connection (which is in fact already dead). Because testWhileIdle is configured, the validation detects that it is unusable and a new connection is created, which is why the first page visit is slow.

With the cause identified, the next step was to find a fix.

I searched Baidu for Druid configuration tips and tried everything I found, with no luck. Suspecting a version problem, I upgraded to the latest release, 1.1.23, which did not help either. So I finally decided to analyze the Druid source code.

Debugging through the source:

1. Call getConnection to get the connection.

dataSource.getConnection()

2. Enter com.alibaba.druid.pool.DruidDataSource.getConnection.

3. init() has already run at this point, so we can skip it. Because both Druid monitoring and the wall (firewall) filter are configured, filters.size() is 2.

4. Enter com.alibaba.druid.filter.FilterChainImpl.dataSource_connect. At this point this.pos = 0 and filterSize = 2.

5. Look at the nextFilter() method: it returns the filter at the current position (first the StatFilter) and then increments pos by 1.

6. Enter StatFilter.dataSource_getConnection.

7. Its call to chain.dataSource_connect re-enters step 4, but now pos is 1, so we land in WallFilter.dataSource_getConnection. WallFilter extends com.alibaba.druid.filter.FilterAdapter and does not override the dataSource_getConnection method, so the call actually resolves to FilterAdapter.dataSource_getConnection.

8. Its call to chain.dataSource_connect(dataSource, maxWaitMillis) re-enters step 4 once more, now with pos = 2. Since this.pos < filterSize is false, the if branch is skipped.
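Steps 4-8 form a small recursive dispatch. A hypothetical, stripped-down model of this FilterChainImpl pattern (all names below are mine, not Druid's):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model: each filter delegates back into the chain, the chain
// bumps `pos`, and once pos == filterSize the call falls through to the
// data source itself (Druid's getConnectionDirect).
public class MiniFilterChain {
    interface Filter { String getConnection(MiniFilterChain chain); }

    private final List<Filter> filters;
    private int pos = 0;
    final List<String> trace = new ArrayList<>();

    MiniFilterChain(List<Filter> filters) { this.filters = filters; }

    String dataSourceConnect() {
        if (pos < filters.size()) {
            // nextFilter(): return the filter at the current position, then pos++
            return filters.get(pos++).getConnection(this);
        }
        trace.add("getConnectionDirect"); // chain exhausted: hit the pool directly
        return "connection";
    }

    public static String demo(List<String> traceOut) {
        Filter stat = chain -> { chain.trace.add("StatFilter"); return chain.dataSourceConnect(); };
        Filter wall = chain -> { chain.trace.add("WallFilter"); return chain.dataSourceConnect(); };
        MiniFilterChain c = new MiniFilterChain(List.of(stat, wall));
        String conn = c.dataSourceConnect();
        traceOut.addAll(c.trace);
        return conn;
    }
}
```

Running demo shows the two filters firing in order before the pool is reached, which matches the pos = 0, 1, 2 progression observed in the debugger.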

9. Execute dataSource.getConnectionDirect(maxWaitMillis). The method is a bit long, so here it is in full:

    public DruidPooledConnection getConnectionDirect(long maxWaitMillis) throws SQLException {
        int notFullTimeoutRetryCnt = 0;
        for (;;) {
            // handle notFullTimeoutRetry
            DruidPooledConnection poolableConnection;
            try {
                poolableConnection = getConnectionInternal(maxWaitMillis);
            } catch (GetConnectionTimeoutException ex) {
                if (notFullTimeoutRetryCnt <= this.notFullTimeoutRetryCount && !isFull()) {
                    notFullTimeoutRetryCnt++;
                    if (LOG.isWarnEnabled()) {
                        LOG.warn("get connection timeout retry : " + notFullTimeoutRetryCnt);
                    }
                    continue;
                }
                throw ex;
            }

            if (testOnBorrow) {
                boolean validate = testConnectionInternal(poolableConnection.holder, poolableConnection.conn);
                if (!validate) {
                    if (LOG.isDebugEnabled()) {
                        LOG.debug("skip not validate connection.");
                    }

                    discardConnection(poolableConnection.holder);
                    continue;
                }
            } else {
                if (poolableConnection.conn.isClosed()) {
                    discardConnection(poolableConnection.holder); // pass in null to avoid a duplicate close
                    continue;
                }

                if (testWhileIdle) {
                    final DruidConnectionHolder holder = poolableConnection.holder;
                    long currentTimeMillis             = System.currentTimeMillis();
                    long lastActiveTimeMillis          = holder.lastActiveTimeMillis;
                    long lastExecTimeMillis            = holder.lastExecTimeMillis;
                    long lastKeepTimeMillis            = holder.lastKeepTimeMillis;

                    if (checkExecuteTime
                            && lastExecTimeMillis != lastActiveTimeMillis) {
                        lastActiveTimeMillis = lastExecTimeMillis;
                    }

                    if (lastKeepTimeMillis > lastActiveTimeMillis) {
                        lastActiveTimeMillis = lastKeepTimeMillis;
                    }

                    long idleMillis                    = currentTimeMillis - lastActiveTimeMillis;

                    long timeBetweenEvictionRunsMillis = this.timeBetweenEvictionRunsMillis;

                    if (timeBetweenEvictionRunsMillis <= 0) {
                        timeBetweenEvictionRunsMillis = DEFAULT_TIME_BETWEEN_EVICTION_RUNS_MILLIS;
                    }

                    if (idleMillis >= timeBetweenEvictionRunsMillis
                            || idleMillis < 0 // unexcepted branch
                            ) {
                        boolean validate = testConnectionInternal(poolableConnection.holder, poolableConnection.conn);
                        if (!validate) {
                            if (LOG.isDebugEnabled()) {
                                LOG.debug("skip not validate connection.");
                            }

                            discardConnection(poolableConnection.holder);
                            continue;
                        }
                    }
                }
            }

            if (removeAbandoned) {
                StackTraceElement[] stackTrace = Thread.currentThread().getStackTrace();
                poolableConnection.connectStackTrace = stackTrace;
                poolableConnection.setConnectedTimeNano();
                poolableConnection.traceEnable = true;

                activeConnectionLock.lock();
                try {
                    activeConnections.put(poolableConnection, PRESENT);
                } finally {
                    activeConnectionLock.unlock();
                }
            }

            if (!this.defaultAutoCommit) {
                poolableConnection.setAutoCommit(false);
            }

            return poolableConnection;
        }
    }
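The testWhileIdle branch in the middle of this method can be condensed into a small sketch (names and structure are mine, not Druid's): a borrowed connection is only validated when it has been idle for at least timeBetweenEvictionRunsMillis; fresher connections skip the round trip to the database.

```java
// Condensed model of the borrow-time idle check in getConnectionDirect.
public class IdleCheck {
    static final long DEFAULT_TIME_BETWEEN_EVICTION_RUNS_MILLIS = 60_000L;

    static boolean needsValidation(long nowMillis, long lastActiveTimeMillis,
                                   long timeBetweenEvictionRunsMillis) {
        if (timeBetweenEvictionRunsMillis <= 0) {
            timeBetweenEvictionRunsMillis = DEFAULT_TIME_BETWEEN_EVICTION_RUNS_MILLIS;
        }
        long idleMillis = nowMillis - lastActiveTimeMillis;
        // idleMillis < 0 guards against clock adjustments (the "unexcepted branch" above)
        return idleMillis >= timeBetweenEvictionRunsMillis || idleMillis < 0;
    }
}
```

This is why the slow first visit happens: after a long quiet period every idle connection trips this check, fails validation, and has to be replaced.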

10. Call the getConnectionInternal method to get the connection.

    private DruidPooledConnection getConnectionInternal(long maxWait) throws SQLException {
        if (closed) {
            connectErrorCountUpdater.incrementAndGet(this);
            throw new DataSourceClosedException("dataSource already closed at " + new Date(closeTimeMillis));
        }

        if (!enable) {
            connectErrorCountUpdater.incrementAndGet(this);

            if (disableException != null) {
                throw disableException;
            }

            throw new DataSourceDisableException();
        }

        final long nanos = TimeUnit.MILLISECONDS.toNanos(maxWait);
        final int maxWaitThreadCount = this.maxWaitThreadCount;

        DruidConnectionHolder holder;

        for (boolean createDirect = false;;) {
            if (createDirect) {
                createStartNanosUpdater.set(this, System.nanoTime());
                if (creatingCountUpdater.compareAndSet(this, 0, 1)) {
                    PhysicalConnectionInfo pyConnInfo = DruidDataSource.this.createPhysicalConnection();
                    holder = new DruidConnectionHolder(this, pyConnInfo);
                    holder.lastActiveTimeMillis = System.currentTimeMillis();

                    creatingCountUpdater.decrementAndGet(this);
                    directCreateCountUpdater.incrementAndGet(this);

                    if (LOG.isDebugEnabled()) {
                        LOG.debug("conn-direct_create ");
                    }

                    boolean discard = false;
                    lock.lock();
                    try {
                        if (activeCount < maxActive) {
                            activeCount++;
                            holder.active = true;
                            if (activeCount > activePeak) {
                                activePeak = activeCount;
                                activePeakTime = System.currentTimeMillis();
                            }
                            break;
                        } else {
                            discard = true;
                        }
                    } finally {
                        lock.unlock();
                    }

                    if (discard) {
                        JdbcUtils.close(pyConnInfo.getPhysicalConnection());
                    }
                }
            }

            try {
                lock.lockInterruptibly();
            } catch (InterruptedException e) {
                connectErrorCountUpdater.incrementAndGet(this);
                throw new SQLException("interrupt", e);
            }

            try {
                if (maxWaitThreadCount > 0
                        && notEmptyWaitThreadCount >= maxWaitThreadCount) {
                    connectErrorCountUpdater.incrementAndGet(this);
                    throw new SQLException("maxWaitThreadCount " + maxWaitThreadCount + ", current wait Thread count "
                            + lock.getQueueLength());
                }

                if (onFatalError
                        && onFatalErrorMaxActive > 0
                        && activeCount >= onFatalErrorMaxActive) {
                    connectErrorCountUpdater.incrementAndGet(this);

                    StringBuilder errorMsg = new StringBuilder();
                    errorMsg.append("onFatalError, activeCount ")
                            .append(activeCount)
                            .append(", onFatalErrorMaxActive ")
                            .append(onFatalErrorMaxActive);

                    if (lastFatalErrorTimeMillis > 0) {
                        errorMsg.append(", time '")
                                .append(StringUtils.formatDateTime19(
                                        lastFatalErrorTimeMillis, TimeZone.getDefault()))
                                .append("'");
                    }

                    if (lastFatalErrorSql != null) {
                        errorMsg.append(", sql \n")
                                .append(lastFatalErrorSql);
                    }

                    throw new SQLException(
                            errorMsg.toString(), lastFatalError);
                }

                connectCount++;

                if (createScheduler != null
                        && poolingCount == 0
                        && activeCount < maxActive
                        && creatingCountUpdater.get(this) == 0
                        && createScheduler instanceof ScheduledThreadPoolExecutor) {
                    ScheduledThreadPoolExecutor executor = (ScheduledThreadPoolExecutor) createScheduler;
                    if (executor.getQueue().size() > 0) {
                        createDirect = true;
                        continue;
                    }
                }

                if (maxWait > 0) {
                    holder = pollLast(nanos);
                } else {
                    holder = takeLast();
                }

                if (holder != null) {
                    if (holder.discard) {
                        continue;
                    }

                    activeCount++;
                    holder.active = true;
                    if (activeCount > activePeak) {
                        activePeak = activeCount;
                        activePeakTime = System.currentTimeMillis();
                    }
                }
            } catch (InterruptedException e) {
                connectErrorCountUpdater.incrementAndGet(this);
                throw new SQLException(e.getMessage(), e);
            } catch (SQLException e) {
                connectErrorCountUpdater.incrementAndGet(this);
                throw e;
            } finally {
                lock.unlock();
            }

            break;
        }

        if (holder == null) {
            long waitNanos = waitNanosLocal.get();

            StringBuilder buf = new StringBuilder(128);
            buf.append("wait millis ")//
               .append(waitNanos / (1000 * 1000))//
               .append(", active ").append(activeCount)//
               .append(", maxActive ").append(maxActive)//
               .append(", creating ").append(creatingCount)//
            ;
            if (creatingCount > 0 && createStartNanos > 0) {
                long createElapseMillis = (System.nanoTime() - createStartNanos) / (1000 * 1000);
                if (createElapseMillis > 0) {
                    buf.append(", createElapseMillis ").append(createElapseMillis);
                }
            }

            if (createErrorCount > 0) {
                buf.append(", createErrorCount ").append(createErrorCount);
            }

            List<JdbcSqlStatValue> sqlList = this.getDataSourceStat().getRuningSqlList();
            for (int i = 0; i < sqlList.size(); ++i) {
                if (i != 0) {
                    buf.append('\n');
                } else {
                    buf.append(", ");
                }
                JdbcSqlStatValue sql = sqlList.get(i);
                buf.append("runningSqlCount ").append(sql.getRunningCount());
                buf.append(" : ");
                buf.append(sql.getSql());
            }

            String errorMessage = buf.toString();

            if (this.createError != null) {
                throw new GetConnectionTimeoutException(errorMessage, createError);
            } else {
                throw new GetConnectionTimeoutException(errorMessage);
            }
        }

        holder.incrementUseCount();

        DruidPooledConnection poolalbeConnection = new DruidPooledConnection(holder);
        return poolalbeConnection;
    }

11. Stepping further, we find where the logical open count is incremented (connectCount++).

12. Next we find where the active (open) connection count is tracked (activeCount++). The holder wraps the underlying JDBC connection object conn inside Druid's bookkeeping.

13. Stepping on, execution reaches the break statement on line 1692 and exits the for loop.

14. The holder is wrapped in a DruidPooledConnection and returned as poolableConnection in getConnectionDirect (the code in step 9). Continuing: testOnBorrow defaults to false, since enabling it validates on every borrow and hurts performance.

15. Next come the closed-connection check and the idle check governed by testWhileIdle.

16. Enter DruidAbstractDataSource.testConnectionInternal:

    protected boolean testConnectionInternal(DruidConnectionHolder holder, Connection conn) {
        String sqlFile = JdbcSqlStat.getContextSqlFile();
        String sqlName = JdbcSqlStat.getContextSqlName();

        if (sqlFile != null) {
            JdbcSqlStat.setContextSqlFile(null);
        }
        if (sqlName != null) {
            JdbcSqlStat.setContextSqlName(null);
        }
        try {
            if (validConnectionChecker != null) {
                boolean valid = validConnectionChecker.isValidConnection(conn, validationQuery, validationQueryTimeout);
                long currentTimeMillis = System.currentTimeMillis();
                if (holder != null) {
                    holder.lastValidTimeMillis = currentTimeMillis;
                    holder.lastExecTimeMillis = currentTimeMillis;
                }

                if (valid && isMySql) { // unexcepted branch
                    long lastPacketReceivedTimeMs = MySqlUtils.getLastPacketReceivedTimeMs(conn);
                    if (lastPacketReceivedTimeMs > 0) {
                        long mysqlIdleMillis = currentTimeMillis - lastPacketReceivedTimeMs;
                        if (lastPacketReceivedTimeMs > 0 //
                                && mysqlIdleMillis >= timeBetweenEvictionRunsMillis) {
                            discardConnection(holder);
                            String errorMsg = "discard long time none received connection. "
                                    + ", jdbcUrl : " + jdbcUrl
                                    + ", lastPacketReceivedIdleMillis : " + mysqlIdleMillis;
                            LOG.error(errorMsg);
                            return false;
                        }
                    }
                }

                if (valid && onFatalError) {
                    lock.lock();
                    try {
                        if (onFatalError) {
                            onFatalError = false;
                        }
                    } finally {
                        lock.unlock();
                    }
                }

                return valid;
            }

            if (conn.isClosed()) {
                return false;
            }

            if (null == validationQuery) {
                return true;
            }

            Statement stmt = null;
            ResultSet rset = null;
            try {
                stmt = conn.createStatement();
                if (getValidationQueryTimeout() > 0) {
                    stmt.setQueryTimeout(validationQueryTimeout);
                }
                rset = stmt.executeQuery(validationQuery);
                if (!rset.next()) {
                    return false;
                }
            } finally {
                JdbcUtils.close(rset);
                JdbcUtils.close(stmt);
            }

            if (onFatalError) {
                lock.lock();
                try {
                    if (onFatalError) {
                        onFatalError = false;
                    }
                } finally {
                    lock.unlock();
                }
            }

            return true;
        } catch (Throwable ex) {
            // skip
            return false;
        } finally {
            if (sqlFile != null) {
                JdbcSqlStat.setContextSqlFile(sqlFile);
            }
            if (sqlName != null) {
                JdbcSqlStat.setContextSqlName(sqlName);
            }
        }
    }

17. It sends a SQL statement to test the connection; I did not dig into exactly how. If the test passes it returns true; otherwise it returns false or throws an exception.
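The shape of that validation logic can be sketched with the JDBC call abstracted behind a stub (everything below is a hypothetical model, not Druid's API): either a configured validation query returns a row, or any exception / empty result marks the connection as dead.

```java
// Minimal model of the validation decision in testConnectionInternal.
public class ValidateSketch {
    interface QueryRunner { boolean hasResult(String sql) throws Exception; }

    static boolean validate(QueryRunner runner, String validationQuery) {
        if (validationQuery == null) {
            return true; // nothing configured: assume the connection is alive
        }
        try {
            return runner.hasResult(validationQuery); // e.g. "SELECT 1" must return a row
        } catch (Exception broken) {
            return false; // a dropped connection typically surfaces as an exception here
        }
    }
}
```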

18. What follows focuses on the exception path observed while debugging, because our network access management system drops connections that stay idle past its limit. (Note: had the database server itself actively closed the connection, the conn.isClosed() check that runs before testWhileIdle might have caught it, and execution would never reach this point.) The validation therefore fails, testConnectionInternal (step 16) returns false, i.e. validate = false back in getConnectionDirect, and we go on:

19. discardConnection discards the dead connection:

    public void discardConnection(DruidConnectionHolder holder) {
        if (holder == null) {
            return;
        }
        //obtain the raw JDBC conn here (not the Druid-wrapped one)
        Connection conn = holder.getConnection();
        if (conn != null) {
            JdbcUtils.close(conn);//physically close the connection
        }
        }

        lock.lock();
        try {
            if (holder.discard) {
                return;
            }

            if (holder.active) {
                activeCount--;//active connection count - 1
                holder.active = false;
            }
            discardCount++;//discarded connection count + 1

            holder.discard = true;

            if (activeCount <= minIdle) {//active count is at or below minIdle
                emptySignal();//wake up the CreateConnectionThread
            }
        } finally {
            lock.unlock();
        }
    }
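The bookkeeping above boils down to a few counter updates plus a signal. A minimal standalone model (field and method names are mine, not Druid's):

```java
// Stripped-down model of discardConnection's accounting: decrement the
// active count, bump the discard count, mark the holder, and signal the
// creator thread once the pool has fallen to or below minIdle.
public class DiscardSketch {
    int activeCount = 1;
    int discardCount = 0;
    int minIdle = 10;
    boolean holderActive = true;
    boolean holderDiscarded = false;
    boolean creatorSignalled = false;

    void discard() {
        if (holderDiscarded) {
            return; // already discarded: do not double-count
        }
        if (holderActive) {
            activeCount--;
            holderActive = false;
        }
        discardCount++;
        holderDiscarded = true;
        if (activeCount <= minIdle) {
            creatorSignalled = true; // stands in for emptySignal()
        }
    }
}
```

The holderDiscarded guard mirrors the early return in the real method, which keeps a double discard from corrupting the counters.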

 

20. emptySignal() wakes the CreateConnectionThread, which creates a new connection for the pool. What that thread does deserves its own analysis; I may share it another time.

21. After the dead connection is discarded, continue jumps back to the top of the for loop in getConnectionDirect, i.e. back to step 10.

22. getConnectionInternal then obtains a connection via takeLast(). My reading of the logic: if the pool holds an idle connection it is returned directly; if not, the caller waits until the CreateConnectionThread has created a new connection before an available one can be obtained.

    DruidConnectionHolder takeLast() throws InterruptedException, SQLException {
        try {
            while (poolingCount == 0) {
                emptySignal(); // send signal to CreateThread create connection

                if (failFast && isFailContinuous()) {
                    throw new DataSourceNotAvailableException(createError);
                }

                notEmptyWaitThreadCount++;
                if (notEmptyWaitThreadCount > notEmptyWaitThreadPeak) {
                    notEmptyWaitThreadPeak = notEmptyWaitThreadCount;
                }
                try {
                    notEmpty.await(); // signal by recycle or creator
                } finally {
                    notEmptyWaitThreadCount--;
                }
                notEmptyWaitCount++;

                if (!enable) {
                    connectErrorCountUpdater.incrementAndGet(this);
                    if (disableException != null) {
                        throw disableException;
                    }

                    throw new DataSourceDisableException();
                }
            }
        } catch (InterruptedException ie) {
            notEmpty.signal(); // propagate to non-interrupted thread
            notEmptySignalCount++;
            throw ie;
        }

        decrementPoolingCount();
        DruidConnectionHolder last = connections[poolingCount];
        connections[poolingCount] = null;

        return last;
    }
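The tail of takeLast() is a plain LIFO hand-out from an array. A self-contained sketch of that idea (simplified: strings stand in for connection holders, and where the real method awaits notEmpty this one just returns null):

```java
// Model of Druid's idle-connection array: the most recently returned
// connection is handed out first (connections[--poolingCount]).
public class LifoPool {
    private final String[] connections = new String[8];
    private int poolingCount = 0;

    void put(String holder) {
        connections[poolingCount++] = holder;
    }

    String takeLast() {
        if (poolingCount == 0) {
            return null; // real takeLast() signals the creator and awaits notEmpty here
        }
        poolingCount--;
        String last = connections[poolingCount];
        connections[poolingCount] = null; // clear the slot, as the real code does
        return last;
    }
}
```

The LIFO order also explains the minIdle puzzle from the monitoring data: connections at the front of the array can sit untouched while the last-returned one is reused over and over.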

23. Finally, getConnectionDirect obtains a usable connection and returns it to the dataSource.getConnection() caller.

To sum up

After this analysis, I see three options:

1. Keep digging into Druid and make the pool periodically validate its idle connections, so dead ones are discarded and replaced ahead of time. The removeAbandoned configuration looks like it should help, but it seemed to have no effect when I tried it, so that needs more study, as does the question of why the pool's idle count can drop below minIdle.

Advantages: faster page loads and a better user experience.

Disadvantages: 1. A performance cost: connections that go unused are endlessly created and then destroyed. 2. It takes real effort to study Druid.
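One lead worth checking for option 1: Druid versions from roughly 1.0.28 onward expose a keepAlive switch, whose background task periodically validates connections that have idled too long and refills the pool up to minIdle, which targets exactly this symptom. A configuration sketch, assuming druid-spring-boot-starter property names (verify against your version's documentation):

```properties
# Sketch only - property names assume druid-spring-boot-starter.
spring.datasource.druid.min-idle=10
spring.datasource.druid.test-while-idle=true
spring.datasource.druid.validation-query=SELECT 1
# keepAlive: periodically validate idle connections down to minIdle and
# replace dead ones, instead of discovering them at borrow time.
spring.datasource.druid.keep-alive=true
```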

2. Reconfigure the network access management system to extend the allowed database connection idle time, for example to 3 or 7 days.

Advantages: 1. Faster page loads and a better user experience. 2. No changes to the project.

Disadvantages: 1. A performance cost: keeping connections open that long puts load on the access system. 2. It requires asking the access system's administrator to change the configuration.

3. Shelve the problem for now.

Advantages: my brain gets a rest; lately this investigation has been costing me hair.

Disadvantages: the first visit to these pages stays a bit slow, which is a poor experience.

In the end I chose option 3. These pages see little use, and everything after the first visit is fast, so the impact is small. If the data source features are expanded later, I will revisit whether the situation improves.


Origin blog.csdn.net/xiaozaq/article/details/107242154