How do you implement a connection pool? This article explains it in depth and in plain terms!

-Preface-

[2w1h] is a very effective way of thinking and learning in the technical field. It stands for What, Why, and How; sticking to [2w1h] quickly builds the habit of thinking deeply.

Today we discuss the "connection pool" through [2w1h]: What is a connection pool (what)? Why do I need a connection pool (why)? How to make a connection pool (how)?

-What is a connection pool?-

Think hard about the nature of a connection pool, but don't overcomplicate it!

"Pool" is a vivid description: it is a kind of container used for storage. In programming, a pool is usually represented by an array, linked list, queue, or map.

"Connection" is the channel used to transmit data in the network; "connection" is what we really want to use, and "pool" is a way to manage "multiple connections".

Without a "pool" to manage connections in one place, connections end up scattered throughout the program: for convenience, each piece of code tends to open a connection when it needs one and close it when it is done. A "connection pool" gives us one convenient, central place to obtain and return connections.

Because a pool is used for storage, the connections in a connection pool must be long-lived connections, such as TCP connections or WebSocket connections, which are taken when needed and put back when finished. If you have not grasped this essence, you may embarrass yourself in an interview by describing an "HTTP connection pool" as if HTTP requests themselves were pooled; what such a pool actually holds is the long-lived TCP connections underneath.

Depending on the downstream type, the common kinds are the database connection pool, the cache connection pool, and the service connection pool, as shown in the following figures:

Figure 1 Database connection pool

Figure 2 Cache connection pool

 

Figure 3 Service connection pool

Beyond connection pools, in programming we also often encounter process pools, thread pools, coroutine pools, memory pools, object pools, and so on.

-Why do I need a connection pool?-

Besides making connection management convenient, a connection pool, in short, greatly improves the efficiency of data transmission under high throughput.

It does so in two ways:

1. Avoid repeated three-way handshakes and four-way teardowns

Establishing a long-lived TCP connection requires a three-way handshake, and releasing it requires a four-way teardown (often loosely called a four-way handshake). Both happen at the system level. For a single connection the cost is small, but in a high-throughput scenario it cannot be ignored.

Because connections in the pool are taken when needed and returned when done, the pool avoids a large number of wasteful, time-consuming three-way handshakes and four-way teardowns, and saves system resources.
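To make the saving concrete, here is a minimal sketch that simply counts connection setups and releases. `FakeConnection` is a hypothetical stand-in for a real TCP connection (no network I/O happens), and the request count and pool size are made-up numbers chosen only for illustration.

```python
class FakeConnection:
    """Hypothetical stand-in for a TCP connection; it only counts events."""
    opened = 0  # each construction stands for one three-way handshake
    closed = 0  # each close() stands for one four-way connection release

    def __init__(self):
        FakeConnection.opened += 1

    def send(self, payload):
        return f"ack:{payload}"

    def close(self):
        FakeConnection.closed += 1


def connect_per_request(requests):
    """Open and close a fresh connection for every single request."""
    for payload in requests:
        conn = FakeConnection()
        conn.send(payload)
        conn.close()


def reuse_pooled(requests, pool_size=4):
    """Open pool_size connections once and reuse them for all requests."""
    pool = [FakeConnection() for _ in range(pool_size)]
    for i, payload in enumerate(requests):
        pool[i % pool_size].send(payload)


requests = [f"req-{i}" for i in range(1000)]

connect_per_request(requests)
per_request_opened = FakeConnection.opened
per_request_closed = FakeConnection.closed

FakeConnection.opened = FakeConnection.closed = 0
reuse_pooled(requests)
pooled_opened = FakeConnection.opened
pooled_closed = FakeConnection.closed

print("connect per request:", per_request_opened, "setups")
print("pooled:", pooled_opened, "setups")
```

With these assumed numbers, the per-request style pays for one setup and one release per request, while the pooled style pays only for the connections it opened up front.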

2. Add parallel lanes to achieve full-duplex parallelism

Data communication can be simplex, half-duplex, or full-duplex. Simplex communication is shown in the figure below: data can only flow from A to B, which does not fit the scenario of calling a downstream service and receiving its response.

Figure 4 Simplex communication

Half-duplex communication is shown in the figure below. Data can flow from A to B or from B to A, but only in one direction at any given moment, so the channel utilization rate is 50%.

 

Figure 5 Half-duplex communication

Full-duplex communication is shown in the figure below. Data can flow from A to B and from B to A at the same time, so the channel utilization rate is 100%. A long-lived connection is full-duplex.

 

Figure 6 Full duplex communication

In IO-intensive Internet applications, what can we do when a single full-duplex channel still cannot meet the demand for data throughput?

Internet performance testing uses the following formula:

QPS (throughput) = concurrency / average response time

With the average response time unchanged, a moderate increase in concurrency increases throughput. Using multiple full-duplex connections in parallel therefore improves throughput to a certain extent (as long as the average response time does not increase significantly), and a connection pool is exactly the way to achieve this.
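A quick worked example of the formula, as a sketch only: the concurrency levels and response times below are invented for illustration, and the helper name `qps` is hypothetical.

```python
def qps(concurrency, avg_response_time_ms):
    """QPS = concurrency / average response time (time given in milliseconds)."""
    return concurrency * 1000 / avg_response_time_ms


# One connection, 10 ms per request.
single = qps(concurrency=1, avg_response_time_ms=10)

# Eight pooled connections, response time still 10 ms.
pooled = qps(concurrency=8, avg_response_time_ms=10)

# Eight connections, but the downstream is saturated and slows to 40 ms.
degraded = qps(concurrency=8, avg_response_time_ms=40)

print(single, pooled, degraded)
```

The third case is why the increase must be moderate: extra connections only help while the average response time stays roughly flat.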

To summarize: Why do you need a connection pool?

(1) Convenient connection management;

(2) Avoid repeated three-way handshakes and four-way teardowns;

(3) Better realize full-duplex parallelism.

-How to make a connection pool?-

To implement a connection pool, the two most important things are balance and keep-alive, as shown in the following figure:

Figure 7 Principle of connection pool implementation

The "pool" of the connection pool is implemented with a queue. The first-in-first-out property of the queue keeps usage balanced, so that every connection is used evenly.

The connection pool provides two APIs: get() and free(). get() "dequeues" an available connection from the head of the queue, and free() "enqueues" the used connection back at the tail of the queue.
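The queue-based design can be sketched in a few lines. This is a minimal illustration rather than a production pool: the class name `ConnectionPool`, the `factory` parameter, and the string "connections" are all assumptions made for the example; only get() and free() come from the description above.

```python
import itertools
import queue


class ConnectionPool:
    """Minimal FIFO pool: get() dequeues at the head, free() enqueues at the tail."""

    def __init__(self, factory, size):
        # factory is a hypothetical callable that creates one connection.
        self._queue = queue.Queue(maxsize=size)
        for _ in range(size):
            self._queue.put(factory())

    def get(self, timeout=None):
        """Take the connection at the head; block until one is available.

        Raises queue.Empty if timeout (in seconds) expires first.
        """
        return self._queue.get(timeout=timeout)

    def free(self, conn):
        """Return a used connection to the tail of the queue."""
        self._queue.put(conn)


# Strings stand in for real connections so the rotation is easy to see.
counter = itertools.count()
pool = ConnectionPool(lambda: f"conn-{next(counter)}", size=3)

order = []
for _ in range(6):
    conn = pool.get()
    order.append(conn)   # "use" the connection
    pool.free(conn)

print(order)
```

Because a freed connection goes to the tail, sequential callers rotate through every connection in turn, which is the balance property the queue is meant to provide.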

At off-peak times the business code calls get() less often, so a connection in the pool may go stale after sitting unused for a long time. When the keep-alive thread detects that get() usage is low, it imitates the business code: it calls get() to obtain a connection, sends a heartbeat packet, and then puts the connection back into the queue through free(), so that all connections in the pool are kept alive.
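A keep-alive thread along these lines might look like the sketch below. Several things here are assumptions: `FakeConnection` and its `ping()` method are hypothetical, a plain `queue.Queue` plays the role of the pool (its get/put correspond to get() and free()), and "low usage" is approximated by checking how long each connection has been idle rather than by measuring the get() call rate.

```python
import queue
import threading
import time


class FakeConnection:
    """Hypothetical connection that remembers when it was last used."""

    def __init__(self, name):
        self.name = name
        self.last_used = time.monotonic()
        self.heartbeats = 0

    def ping(self):
        # A real connection would send a heartbeat packet here.
        self.heartbeats += 1
        self.last_used = time.monotonic()


def keep_alive_once(pool, idle_after_s):
    """Cycle each idle connection through get -> heartbeat -> free."""
    for _ in range(pool.qsize()):
        try:
            conn = pool.get_nowait()      # same path as a business get()
        except queue.Empty:
            break                         # business code holds the rest
        if time.monotonic() - conn.last_used >= idle_after_s:
            conn.ping()
        pool.put(conn)                    # same path as free()


def keep_alive_loop(pool, interval_s, idle_after_s, stop_event):
    """Run keep_alive_once every interval_s seconds until stop_event is set."""
    while not stop_event.wait(interval_s):
        keep_alive_once(pool, idle_after_s)


pool = queue.Queue()
for i in range(3):
    pool.put(FakeConnection(f"conn-{i}"))

stop = threading.Event()
worker = threading.Thread(
    target=keep_alive_loop, args=(pool, 0.01, 0.0, stop), daemon=True
)
worker.start()
time.sleep(0.2)          # let the keep-alive thread run a few rounds
stop.set()
worker.join()

conns = [pool.get_nowait() for _ in range(3)]
print([(c.name, c.heartbeats > 0) for c in conns])
```

Going through the same queue as the business code means the keep-alive thread never touches a connection that is currently borrowed, and every connection it pings returns to the tail like any other.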

 

Once you fully understand the above, you are ready for the extended topic of advanced connection pools!

-Advanced connection pooling-

Advanced connection pools are usually used in microservice systems. As shown in the figure below, the pool connects to multiple downstream nodes.

Figure 8 Advanced connection pool

An advanced connection pool has several characteristics:

1. High availability: When any downstream server goes down, the connection pool closes the now-invalid connections to it so that clients no longer access it;

2. High scalability: When a server node is added downstream, the connection pool discovers it and establishes connections to the new node for clients to use;

3. Load balancing: The connection pool distributes data requests according to the service capacity of each downstream server;

4. Middleware: When the downstream server is a MySQL-like database and is sharded, the connection pool routes each request to the corresponding data node and aggregates the results.
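The first three characteristics can be sketched as follows. Everything here is hypothetical: the class `MultiNodePool`, its method names, the node names, and the weights are invented for illustration, weighted round-robin is just one possible load-balancing policy, node discovery and failure detection are left to the caller, and the sharding middleware of point 4 is not covered.

```python
import itertools
from collections import deque


class FakeConnection:
    """Hypothetical connection bound to one downstream node."""

    def __init__(self, node):
        self.node = node
        self.closed = False

    def close(self):
        self.closed = True


class MultiNodePool:
    """Sketch of a pool holding connections to several downstream nodes."""

    def __init__(self, connect):
        self._connect = connect      # callable: node name -> connection
        self._conns = {}             # node name -> deque of idle connections
        self._weights = {}           # node name -> relative service capacity
        self._schedule = iter(())

    def add_node(self, node, weight=1, conns_per_node=2):
        """Scalability: start serving a newly added downstream node."""
        self._conns[node] = deque(self._connect(node) for _ in range(conns_per_node))
        self._weights[node] = weight
        self._rebuild()

    def remove_node(self, node):
        """Availability: close connections to a failed node and stop using it."""
        for conn in self._conns.pop(node, ()):
            conn.close()
        self._weights.pop(node, None)
        self._rebuild()

    def _rebuild(self):
        # Weighted round-robin: a node with weight w gets w slots per cycle.
        slots = [n for n, w in self._weights.items() for _ in range(w)]
        self._schedule = itertools.cycle(slots) if slots else iter(())

    def get(self):
        """Load balancing: pick the next node by weight, then borrow from it."""
        node = next(self._schedule, None)
        if node is None:
            raise RuntimeError("no downstream node available")
        return node, self._conns[node].popleft()

    def free(self, node, conn):
        if node in self._conns:
            self._conns[node].append(conn)
        else:
            conn.close()             # node was removed while conn was borrowed


pool = MultiNodePool(FakeConnection)
pool.add_node("db-1", weight=2)
pool.add_node("db-2", weight=1)

picked = []
for _ in range(6):
    node, conn = pool.get()
    picked.append(node)
    pool.free(node, conn)

pool.remove_node("db-1")             # simulate db-1 going down
after_failure = []
for _ in range(3):
    node, conn = pool.get()
    after_failure.append(node)
    pool.free(node, conn)

print(picked, after_failure)
```

In this run the node with weight 2 receives twice as many requests as the node with weight 1, and once it is removed all traffic goes to the remaining node.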

Those who accomplished great things in ancient times had not only extraordinary talent but also unshakable perseverance!

Keep at your technical learning, and you will achieve something!

-Follow "The Beauty of Architecture"-

Origin blog.csdn.net/musicml/article/details/109685445