The operating principle of a database connection pool:

1) When the pool is initialized, it creates initialSize connections. When a database operation arrives, a connection is taken from the pool. If the number of connections currently in use equals maxActive, the caller waits for another operation to release a connection; if the wait exceeds maxWait, an error is reported.
  If the number of connections in use has not reached maxActive, the pool checks whether an idle connection exists. If one does, it is used directly; if not, a new connection is established.
  After a connection has been used, the physical connection is not closed. It is put back into the pool to be reused by later operations.
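
The borrow and return flow in point 1 can be sketched in plain Java. This is a minimal illustration, not the code of any real pool: the class name SimplePool, its methods, and the use of a Semaphore to enforce maxActive are all assumptions made for the example, and the type parameter T stands in for a physical connection.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical minimal pool; T stands in for a physical database connection. */
class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Semaphore permits;     // one permit per connection that may be in use
    private final Supplier<T> factory;   // opens a new physical connection
    private final long maxWaitMillis;

    SimplePool(int initialSize, int maxActive, long maxWaitMillis, Supplier<T> factory) {
        this.permits = new Semaphore(maxActive, true);
        this.factory = factory;
        this.maxWaitMillis = maxWaitMillis;
        for (int i = 0; i < initialSize; i++) {
            idle.push(factory.get());    // pre-create initialSize connections
        }
    }

    /** Waits up to maxWait for a free slot, then reuses an idle connection or opens a new one. */
    T borrow() {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
        if (!acquired) {
            throw new IllegalStateException("no connection released within maxWait");
        }
        synchronized (idle) {
            if (!idle.isEmpty()) {
                return idle.pop();       // an idle connection exists: reuse it
            }
        }
        try {
            return factory.get();        // none idle and below maxActive: create one
        } catch (RuntimeException e) {
            permits.release();           // do not leak the slot if creation fails
            throw e;
        }
    }

    /** Puts the connection back into the pool; the physical connection stays open. */
    void release(T conn) {
        synchronized (idle) {
            idle.push(conn);
        }
        permits.release();
    }
}
```

With initialSize = 1 and maxActive = 2, the first borrow() reuses the pre-created connection, the second creates a new one, and a third fails once maxWait has elapsed unless one of the two is released first.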

2) The pool also has an internal mechanism that checks whether the current total number of connections is less than minIdle. If it is, new idle connections are established until the count reaches minIdle.
  A background check runs every timeBetweenEvictionRunsMillis; a connection that has sat idle in the pool for too long (in DBCP and Druid the threshold is minEvictableIdleTimeMillis) is physically closed.
  Some databases limit how long a connection may stay open (MySQL, by default, disconnects a connection that has been idle for 8 hours), and network interruptions and similar problems can also invalidate connections in the pool. Setting testWhileIdle to true makes the pool check the availability of idle connections periodically; unavailable connections are discarded or rebuilt, so that, as far as possible, the Connection object obtained from the pool is usable. For a stronger guarantee you can also set testOnBorrow to true, which checks the Connection object every time it is borrowed, but this hurts performance.
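
The idle maintenance in point 2 can be illustrated with one pass of a periodic sweep. The sketch below is an assumption-laden simplification: IdleEntry, IdleSweeper, and the validator predicate (a stand-in for a real validation query) are hypothetical names, and real pools handle locking and scheduling that are left out here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical record of an idle connection and the time it was last used. */
class IdleEntry<T> {
    final T conn;
    final long lastUsedMillis;

    IdleEntry(T conn, long lastUsedMillis) {
        this.conn = conn;
        this.lastUsedMillis = lastUsedMillis;
    }
}

class IdleSweeper<T> {
    private final int minIdle;
    private final long maxIdleMillis;      // idle time after which a connection may be closed
    private final Predicate<T> validator;  // stand-in for a validation query; true means usable

    IdleSweeper(int minIdle, long maxIdleMillis, Predicate<T> validator) {
        this.minIdle = minIdle;
        this.maxIdleMillis = maxIdleMillis;
        this.validator = validator;
    }

    /** One sweep over the idle list; returns the connections that should be physically closed. */
    List<T> sweep(List<IdleEntry<T>> idle, long nowMillis) {
        List<T> dropped = new ArrayList<>();
        Iterator<IdleEntry<T>> it = idle.iterator();
        while (it.hasNext()) {
            IdleEntry<T> entry = it.next();
            // Close connections idle for too long, but never shrink below minIdle.
            boolean expired = nowMillis - entry.lastUsedMillis > maxIdleMillis
                    && idle.size() > minIdle;
            // testWhileIdle-style check: validate the connections that are kept.
            boolean broken = !expired && !validator.test(entry.conn);
            if (expired || broken) {
                it.remove();
                dropped.add(entry.conn);
            }
        }
        return dropped;
    }
}
```

A testOnBorrow-style check would instead call the validator inside the borrow path, which is why it costs more: every borrow pays for a validation round trip.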

Points to note about database connection pools

1. Concurrency issues
  To make the connection management service as general as possible, a multithreaded environment must be considered, that is, concurrency. This problem is relatively easy to solve, because languages such as Java and C# have built-in support for concurrency management: the synchronized keyword in Java and the lock keyword in C# ensure that threads are synchronized. See the relevant literature for usage.

2. Transaction processing
  Transactions are atomic: database operations must follow the "all-or-nothing" principle, that is, a group of SQL statements is either executed completely or not at all.
  When two threads share one Connection object and each has its own transaction to process, the connection pool has a real problem. Even though the Connection class provides transaction support, we cannot tell which database operation belongs to which transaction, because two threads are issuing operations on the same connection. The solution is to give each transaction exclusive use of a connection. This wastes some connection pool resources, but it greatly reduces the complexity of transaction management.
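
One common way to give each transaction its own connection is to bind the connection to the thread that runs the transaction. The sketch below assumes that approach; TransactionContext and its methods are hypothetical names, and the borrow and giveBack callbacks stand in for whatever pool is in use.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper that gives each thread's transaction its own connection. */
class TransactionContext<T> {
    private final ThreadLocal<T> current = new ThreadLocal<>();
    private final Supplier<T> borrow;      // takes a connection out of the pool
    private final Consumer<T> giveBack;    // returns a connection to the pool

    TransactionContext(Supplier<T> borrow, Consumer<T> giveBack) {
        this.borrow = borrow;
        this.giveBack = giveBack;
    }

    /** Starts a transaction by binding a dedicated connection to the calling thread. */
    T begin() {
        if (current.get() != null) {
            throw new IllegalStateException("a transaction is already active on this thread");
        }
        T conn = borrow.get();
        current.set(conn);
        return conn;
    }

    /** The connection owned by the calling thread's transaction. */
    T connection() {
        T conn = current.get();
        if (conn == null) {
            throw new IllegalStateException("no active transaction on this thread");
        }
        return conn;
    }

    /** Ends the transaction (after commit or rollback) and returns the connection to the pool. */
    void end() {
        T conn = current.get();
        if (conn != null) {
            current.remove();
            giveBack.accept(conn);
        }
    }
}
```

Because every lookup goes through the ThreadLocal, two threads can never see each other's connection, so every operation on a connection belongs to exactly one transaction.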

3. Allocation and release of connections

How connections are allocated and released has a great impact on system performance. Sensible allocation and release increases connection reuse, which reduces the overhead of establishing new connections and speeds up user access.
  A List can be used for connection management: all created connections are put into the List and managed together. Whenever a user requests a connection, the system checks whether the list contains a connection that can be allocated. If it does, the most suitable connection is assigned to the user (how to find the most suitable connection will be pointed out in the key topic); if not, an exception is thrown to the user. Whether a connection in the list can be allocated is managed by a dedicated thread, whose implementation will be introduced later.
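
A list-based allocator of this kind might look like the following sketch. It makes several simplifying assumptions: "most suitable" is reduced to "first free", availability is tracked with a flag under synchronized methods instead of a dedicated management thread, and the names ListPool and Slot are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical list-based allocator; a Slot stands in for a pooled connection. */
class ListPool {
    static final class Slot {
        final String name;
        boolean inUse;      // only read or written inside ListPool's synchronized methods

        Slot(String name) {
            this.name = name;
        }
    }

    private final List<Slot> slots = new ArrayList<>();

    ListPool(int size) {
        for (int i = 1; i <= size; i++) {
            slots.add(new Slot("conn-" + i));
        }
    }

    /** Hands out the first free connection, or throws if every connection is in use. */
    synchronized Slot acquire() {
        for (Slot slot : slots) {
            if (!slot.inUse) {
                slot.inUse = true;
                return slot;
            }
        }
        throw new IllegalStateException("no free connection in the list");
    }

    /** Marks the connection as free again so that it can be reused. */
    synchronized void release(Slot slot) {
        slot.inUse = false;
    }

    synchronized int freeCount() {
        int free = 0;
        for (Slot slot : slots) {
            if (!slot.inUse) {
                free++;
            }
        }
        return free;
    }
}
```

The synchronized keyword is what makes the check-then-mark step in acquire() safe when several threads request connections at the same time.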

4. Configuration and maintenance of the connection pool
  How many connections should be placed in the connection pool to maximize system performance? The system can set parameters such as the minimum number of connections (minConnection) and the maximum number of connections (maxConnection) to control the pool. The minimum number of connections is the number of connections the pool creates when the system starts. If too many are created, the system starts slowly but responds quickly afterwards; if too few are created, the system starts quickly but responds slowly at first. You can therefore set a small minimum during development, so that startup is fast, and a larger one in production, so that users get faster responses. The maximum number of connections is the upper limit on connections in the pool. Its value depends on the expected system load and can be derived from the software requirements.
  How is the minimum number of connections maintained? There are two strategies, dynamic and static. With the dynamic strategy, the pool is checked at regular intervals; if the number of connections is below the minimum, the missing connections are added to keep the pool working normally. With the static strategy, the check happens only when idle connections run out.
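
The two strategies can be contrasted in a few lines. In this sketch the class MinIdleKeeper and its methods are hypothetical; the dynamic strategy is modelled as a scheduled call to topUp(), and the static strategy as a top-up that happens only when take() finds no idle connection.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical keeper of the minimum connection count; T stands in for a connection. */
class MinIdleKeeper<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final int minConnection;
    private final Supplier<T> factory;

    MinIdleKeeper(int minConnection, Supplier<T> factory) {
        if (minConnection < 1) {
            throw new IllegalArgumentException("minConnection must be at least 1");
        }
        this.minConnection = minConnection;
        this.factory = factory;
    }

    /** Creates connections until at least minConnection are idle; returns how many were added. */
    synchronized int topUp() {
        int added = 0;
        while (idle.size() < minConnection) {
            idle.push(factory.get());
            added++;
        }
        return added;
    }

    /** Dynamic strategy: check the pool at a fixed interval on a background thread. */
    ScheduledExecutorService startDynamicCheck(long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::topUp, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;   // the caller is responsible for shutting it down
    }

    /** Static strategy: top up only when a caller finds no idle connection. */
    synchronized T take() {
        if (idle.isEmpty()) {
            topUp();
        }
        return idle.pop();
    }

    synchronized int idleCount() {
        return idle.size();
    }
}
```

The dynamic check keeps latency predictable at the cost of a background thread, while the static check does no work until a caller actually finds the pool empty.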

Types of database connection pools

First- and second-generation connection pools: the most important characteristic that distinguishes a first-generation pool from a second-generation one is the threading model used in its architecture and design, because this directly affects connection performance under concurrent database access.

Generally speaking, a single-threaded synchronous design marks a first-generation connection pool, and a multi-threaded asynchronous design marks a second-generation one. A representative example is Apache Commons DBCP: the 1.x versions kept the single-threaded design, and the multi-threaded model was adopted in 2.x.

Using release dates to tell the two generations apart is a convenient shortcut. The following are the release dates of the latest versions of the common database connection pools:
[Table: latest release dates of common database connection pools]
As the table shows, C3P0 has not been updated for a long time and DBCP is updated very slowly and is basically inactive, while Druid and HikariCP are actively updated. These two are the second-generation products referred to here.
The second generation's lead over the first is overwhelming. Apart from "historical reasons", it is hard to find another reason not to choose a second-generation product. But no success is accidental: the second generation owes much of its success to the foundation laid by the previous generation. Standing on the shoulders of giants, the designers of the new connection pools have pushed this kind of utility component to the extreme. The two most representative products are HikariCP and Druid.

C3P0 (completely dead)

C3P0 was the first database connection pool I used. For a long time it was synonymous with database connection pooling in the Java world, and Hibernate bundled it as a built-in connection pool for years. Its stability is widely recognized. C3P0 is simple, easy to use and stable, which are its strengths, but its performance shortcomings have left it out in the cold. Its performance is very poor, so poor that it ranks last even among its contemporaries, let alone against Druid or HikariCP. Having problems is normal, and problems can be fixed, but the fatal flaw of C3P0 is that its architecture is too complicated, which makes refactoring practically impossible. As the Chinese Internet industry boomed, C3P0 and its performance flaws left the stage for good.

DBCP (an unlikely comeback)

DBCP (DataBase Connection Pool) is a core sub-project of Apache Commons, a top-level Apache project (it originally lived in Jakarta Commons), and it has broad influence in the Apache ecosystem. For example, the well-known Tomcat integrates DBCP internally, and OpenJPA, which implements the JPA specification, also integrates DBCP by default. However, DBCP does not implement the pooling function itself. Internally it relies on another Commons sub-project, Pool: the core "pool" of the connection pool is provided by the Pool component, so the performance of DBCP is really the performance of Pool. The dependency between DBCP and Pool is as follows:
[Figure: dependency relationship between DBCP and Pool]
Because the core functions depend on Pool, DBCP itself can only ship minor updates; a real major version depends entirely on Pool. Pool stayed at 1.x for a long time, which directly left DBCP without meaningful updates. Many applications that relied on DBCP had no choice but to replace it once they hit performance bottlenecks. Even Tomcat, a loyal DBCP supporter, designed and developed its own connection pool (Tomcat JDBC Pool) in Tomcat 7.0. Fortunately, the turning point came in 2013: Commons Pool 2.0 was released in September 2013, and in February 2014 DBCP finally got its own 2.0 release. The redesigned "pool", based on a new threading model, rejuvenated DBCP. A gap to the new generation of connection pools remains, but it is small: DBCP 2.x has reached the same level of performance as the new-generation products (see the figure below).
[Figure: performance of DBCP 2.x compared with the new-generation connection pools]
DBCP finally staged its comeback on the back of Pool, and it was an impressive turnaround, but the long wait had exhausted users' patience. Compared with the new-generation products DBCP has no advantage, and who, given a choice, would pick the weaker option? Perhaps the only reason to choose DBCP 2 today is nostalgia.

HikariCP (invincible performance)

HikariCP bills itself as a performance killer ("It's Faster"). How does it actually perform? First, the data provided on the official site:
[Figure: benchmark data published by HikariCP]
Not only is its performance strong, its stability is good too, as the following figure shows:
[Figure: HikariCP stability data]
How does it achieve such performance? The official explanation is as follows:

Streamlined bytecode: the code is optimized until the compiled bytecode is as small as possible, so that the CPU cache can hold more program code;
Optimized proxies and interceptors: less code, for example HikariCP's Statement proxy has only 100 lines of code;
Custom array type (FastStatementList) instead of ArrayList: avoids the range check on every get() call and avoids scanning from head to tail on remove();
Custom collection type (ConcurrentBag): improves the efficiency of concurrent reads and writes;
Other optimizations for specific defects, such as studying method calls that take more than one CPU time slice (the documentation does not say how these were optimized).
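
To make the custom list idea concrete, here is a simplified sketch written for this article. It is not HikariCP's source code: the class name FastListSketch is invented, and the assumption is that the two savings come from skipping the explicit index check in get() and from searching from the tail in remove(), where the most recently added element is usually the one being removed.

```java
import java.util.Arrays;

/** Hypothetical simplified list; it only illustrates the idea and is not HikariCP's class. */
class FastListSketch<T> {
    private Object[] items = new Object[16];
    private int size;

    void add(T item) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);   // grow by doubling
        }
        items[size++] = item;
    }

    /** No explicit range check: passing a valid index is the caller's responsibility. */
    @SuppressWarnings("unchecked")
    T get(int index) {
        return (T) items[index];
    }

    /** Scans from the tail, because the newest element is usually the one removed. */
    boolean remove(Object item) {
        for (int i = size - 1; i >= 0; i--) {
            if (items[i] == item) {                   // identity comparison, not equals()
                int moved = size - i - 1;
                if (moved > 0) {
                    System.arraycopy(items, i + 1, items, i, moved);
                }
                items[--size] = null;                 // clear the vacated slot
                return true;
            }
        }
        return false;
    }

    int size() {
        return size;
    }
}
```

The trade-off is safety for speed: an out-of-range index below size is not rejected, so such a list only suits internal code whose call pattern is fully controlled.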
From the optimizations above and the information available today, HikariCP's performance advantage is beyond doubt. Combined with its small footprint, HikariCP is sure to win over more people in the current era of cloud computing and microservices.

Druid (comprehensive function)

In recent years Alibaba has been very active in open source. In addition to projects such as fastjson and Dubbo, there is large-scale software such as AliSQL. Druid is one of Alibaba's many outstanding open source projects. Besides being an excellent connection pool, it integrates SQL monitoring, blacklist interception and other functions. In its own words, Druid is "born for monitoring". Helped by the pull of the Alibaba platform, the product won a large following as soon as it was released, and judging from user feedback, Druid has not disappointed.

Compared with other products, another big advantage of Druid is its comprehensive Chinese documentation (it is a Chinese project, after all). The GitHub wiki lists the problems likely to come up in daily use; for a new user, that content is enough to complete the configuration and use of the product.

The following figure shows the performance test data published by the Druid project itself:
[Figure: performance test data published by Druid]
In project development I still prefer Druid. It is not only a database connection pool; it also contains a ProxyDriver, a set of built-in JDBC component libraries, and a SQL parser.

Advantages of Druid over other database connection pools

Powerful monitoring. Through the monitoring functions provided by Druid, you can see clearly how the connection pool and SQL are behaving.
a. Monitoring of SQL execution time, ResultSet holding time, rows returned, rows updated, error count and error stack traces;

b. The distribution of SQL execution time across intervals. For example, suppose a certain SQL statement is executed 1000 times: 50 times in the 0-1 millisecond interval, 800 times in 1-10 milliseconds, 100 times in 10-100 milliseconds, 30 times in 100-1000 milliseconds, 15 times in 1-10 seconds and 5 times above 10 seconds. This distribution shows very clearly how long the SQL takes to execute;

c. Monitoring of the number of physical connections created and destroyed by the pool, the number of logical connections requested and closed, the number of non-empty waits, and the PSCache hit rate.
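
The interval distribution in item b amounts to a histogram with fixed bucket boundaries. The sketch below shows one way to collect it; LatencyHistogram is a hypothetical class written for this article, and the bucket boundaries are taken from the example in item b, not from Druid's implementation.

```java
/** Hypothetical histogram of SQL execution times, bucketed as in the example above. */
class LatencyHistogram {
    // Exclusive upper bounds in milliseconds: 0-1, 1-10, 10-100, 100-1000, 1-10 s.
    // Anything at or above the last bound falls into a final "above 10 s" bucket.
    private static final long[] UPPER_MS = {1, 10, 100, 1_000, 10_000};

    private final long[] counts = new long[UPPER_MS.length + 1];

    /** Records one execution that took elapsedMillis milliseconds. */
    synchronized void record(long elapsedMillis) {
        int bucket = 0;
        while (bucket < UPPER_MS.length && elapsedMillis >= UPPER_MS[bucket]) {
            bucket++;
        }
        counts[bucket]++;
    }

    /** Returns a copy of the six bucket counters, fastest bucket first. */
    synchronized long[] snapshot() {
        return counts.clone();
    }
}
```

Calling record() after each statement and reading snapshot() periodically yields the six counters described in item b; a value that sits exactly on a boundary, such as 10 ms, is counted in the slower bucket.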

Easy to extend. Druid provides a Filter-Chain extension API. You can write your own Filter to intercept any JDBC method and do whatever you need there, such as performance monitoring, SQL auditing, username and password encryption, or logging.
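
The Filter-Chain pattern itself is easy to show in isolation. The following sketch is not Druid's Filter interface: SqlFilter and SqlFilterChain are hypothetical types with a single method each, written only to show how each filter decides what to do before and after passing the call down the chain.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical single-method filter; this is not Druid's actual Filter interface. */
interface SqlFilter {
    String execute(String sql, SqlFilterChain chain);
}

/** Passes a call through each filter in order and finally to the real target. */
class SqlFilterChain {
    private final List<SqlFilter> filters;
    private final int index;                        // position of the next filter to run
    private final Function<String, String> target;  // stand-in for the real JDBC call

    SqlFilterChain(List<SqlFilter> filters, Function<String, String> target) {
        this(filters, 0, target);
    }

    private SqlFilterChain(List<SqlFilter> filters, int index, Function<String, String> target) {
        this.filters = filters;
        this.index = index;
        this.target = target;
    }

    /** Runs the next filter, or the target once every filter has been applied. */
    String proceed(String sql) {
        if (index == filters.size()) {
            return target.apply(sql);
        }
        SqlFilterChain rest = new SqlFilterChain(filters, index + 1, target);
        return filters.get(index).execute(sql, rest);
    }
}
```

An auditing filter would log the SQL and then call chain.proceed(sql); a timing filter would measure the time around the same call. Neither needs to know what the other filters do.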
Druid integrates the best features of open source and commercial database connection pools, and refines them with Alibaba's experience in large-scale, demanding production environments.
Summary:
Although every application that needs an RDBMS depends on a connection pool, in practice the pool can already be "invisible": under normal circumstances, once the initial configuration is done, nothing needs to change. Whether you choose Druid, HikariCP or even DBCP, they are all stable and efficient enough. Much of the discussion above concerns connection pool performance, but those differences only show when pools are compared with one another. For the application as a whole, the differences between second-generation pools are barely noticeable in use. System performance basically never degrades because of how the connection pool is configured or used, unless a single-node application puts a very high load on the database (for example during stress testing). Even then, the usual optimization is to turn the single node into a cluster rather than to fixate on the connection pool of a single node.


Origin blog.csdn.net/qq_41536934/article/details/111964192