Switched to the HikariCP connection pool, and it's blazing fast!

Background

In everyday coding, we often hold on to certain objects and reuse them, mainly because of the cost of creating them.

Thread resources, database connections, and TCP connections, for example, usually take a long time to initialize. Frequently acquiring and destroying them consumes a lot of system resources and causes unnecessary performance loss.

These objects also share a notable trait: with a lightweight reset, they can be recycled and reused again and again.

So we can keep these resources in a virtual pool and, whenever one is needed, quickly fetch it from the pool.

Pooling is widely used in Java; the most common examples are database connection pools and thread pools. This article focuses on connection pools; thread pools will be covered in a later post.

Common pooling package Commons Pool 2

Let's first look at Commons Pool 2, a common pooling package in Java, to understand the general structure of an object pool.

With this set of APIs, we can easily implement object pool management tailored to our business needs.

<!-- https://mvnrepository.com/artifact/org.apache.commons/commons-pool2 -->
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-pool2</artifactId>
    <version>2.11.1</version>
</dependency>

GenericObjectPool is the core class of the object pool: pass in a pool configuration and an object factory, and a pool can be created quickly.

public GenericObjectPool(
            final PooledObjectFactory<T> factory,
            final GenericObjectPoolConfig<T> config)
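
To make this concrete, here is a minimal sketch of creating and using a pool with this constructor. The HeavyResource type and its factory are hypothetical, invented purely for illustration; they are not part of Commons Pool.

import org.apache.commons.pool2.BasePooledObjectFactory;
import org.apache.commons.pool2.PooledObject;
import org.apache.commons.pool2.impl.DefaultPooledObject;
import org.apache.commons.pool2.impl.GenericObjectPool;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;

// Hypothetical expensive-to-create resource, used only for illustration.
class HeavyResource {
    void reset() { /* lightweight reset before reuse */ }
}

// The factory tells the pool how to create and wrap pooled objects.
class HeavyResourceFactory extends BasePooledObjectFactory<HeavyResource> {
    @Override
    public HeavyResource create() {
        return new HeavyResource();          // the costly initialization happens here
    }

    @Override
    public PooledObject<HeavyResource> wrap(HeavyResource resource) {
        return new DefaultPooledObject<>(resource);
    }
}

public class PoolDemo {
    public static void main(String[] args) throws Exception {
        GenericObjectPoolConfig<HeavyResource> config = new GenericObjectPoolConfig<>();
        config.setMaxTotal(20);              // upper limit of objects managed by the pool
        config.setMaxIdle(10);               // maximum number of idle objects
        config.setMinIdle(2);                // minimum number of idle objects

        GenericObjectPool<HeavyResource> pool =
                new GenericObjectPool<>(new HeavyResourceFactory(), config);

        HeavyResource resource = pool.borrowObject();   // take one from the pool
        try {
            // ... use the resource ...
        } finally {
            pool.returnObject(resource);                // always give it back
        }
        pool.close();
    }
}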

The Jedis case

Jedis, a common Redis client, uses Commons Pool to manage its connection pool, and it can fairly be called a best practice. The code below is the main block where Jedis uses the factory to create objects.

The key method of the object factory class is makeObject. Its return type is PooledObject, and the created object can simply be wrapped and returned with new DefaultPooledObject<>(obj).

Here is redis.clients.jedis.JedisFactory, which uses the factory to create objects:

@Override
public PooledObject<Jedis> makeObject() throws Exception {
  Jedis jedis = null;
  try {
    jedis = new Jedis(jedisSocketFactory, clientConfig);
    // the main time-consuming operation: establishing the connection
    jedis.connect();
    // return the wrapped object
    return new DefaultPooledObject<>(jedis);
  } catch (JedisException je) {
    if (jedis != null) {
      try {
        jedis.quit();
      } catch (RuntimeException e) {
        logger.warn("Error while QUIT", e);
      }
      try {
        jedis.close();
      } catch (RuntimeException e) {
        logger.warn("Error while close", e);
      }
    }
    throw je;
  }
}

Let's look next at how objects are produced. As the code below shows, when an object is requested, the pool first tries to take one from its idle queue; only if no idle object is available does it call the factory method to create a new one.

public T borrowObject(final Duration borrowMaxWaitDuration) throws Exception {
    // several lines omitted here
    while (p == null) {
        create = false;
        // first, try to take an object from the pool
        p = idleObjects.pollFirst();
        // only when nothing idle is available does the factory create a new instance
        if (p == null) {
            p = create();
            if (p != null) {
                create = true;
            }
        }
        // several lines omitted here
    }
    // several lines omitted here
}

Where do the objects live? Storage is handled by a structure called LinkedBlockingDeque, a double-ended blocking queue.

Next, look at the main properties of GenericObjectPoolConfig:

// properties declared on GenericObjectPoolConfig itself
private int maxTotal = DEFAULT_MAX_TOTAL;
private int maxIdle = DEFAULT_MAX_IDLE;
private int minIdle = DEFAULT_MIN_IDLE;
// properties inherited from its parent class, BaseObjectPoolConfig
private boolean lifo = DEFAULT_LIFO;
private boolean fairness = DEFAULT_FAIRNESS;
private long maxWaitMillis = DEFAULT_MAX_WAIT_MILLIS;
private long minEvictableIdleTimeMillis = DEFAULT_MIN_EVICTABLE_IDLE_TIME_MILLIS;
private long evictorShutdownTimeoutMillis = DEFAULT_EVICTOR_SHUTDOWN_TIMEOUT_MILLIS;
private long softMinEvictableIdleTimeMillis = DEFAULT_SOFT_MIN_EVICTABLE_IDLE_TIME_MILLIS;
private int numTestsPerEvictionRun = DEFAULT_NUM_TESTS_PER_EVICTION_RUN;
private EvictionPolicy<T> evictionPolicy = null;
// Only 2.6.0 applications set this
private String evictionPolicyClassName = DEFAULT_EVICTION_POLICY_CLASS_NAME;
private boolean testOnCreate = DEFAULT_TEST_ON_CREATE;
private boolean testOnBorrow = DEFAULT_TEST_ON_BORROW;
private boolean testOnReturn = DEFAULT_TEST_ON_RETURN;
private boolean testWhileIdle = DEFAULT_TEST_WHILE_IDLE;
private long timeBetweenEvictionRunsMillis = DEFAULT_TIME_BETWEEN_EVICTION_RUNS_MILLIS;
private boolean blockWhenExhausted = DEFAULT_BLOCK_WHEN_EXHAUSTED;

There are quite a few parameters. To understand what they mean, let's first look at the life cycle of a pooled object inside the pool.

Two kinds of threads operate on the pool: business threads and the background detection (eviction) thread.

When the object pool is initialized, three main parameters must be specified:

  • maxTotal: the upper limit of objects managed by the pool
  • maxIdle: the maximum number of idle objects
  • minIdle: the minimum number of idle objects

Among them, maxTotal concerns the business threads. When a business thread wants to get an object, it first checks whether an idle object is available.

If there is one, it is returned; otherwise the pool enters its creation logic. If the number of objects in the pool has already reached the maximum, creation fails and null is returned.

When borrowing an object, a very important parameter is the maximum wait time (maxWaitMillis). It has a considerable impact on the calling application's performance. It defaults to -1, meaning the caller never times out and waits until an object becomes free.

If object creation is very slow or the pool is very busy, business threads keep blocking (blockWhenExhausted defaults to true), which can prevent normal requests from being served at all.

Interview questions

An interviewer will typically ask: how large would you set this timeout? I usually set the maximum wait time to the largest delay the interface can tolerate.

For example, if a normal request completes in about 10 ms but feels stuck once it takes 1 second, this parameter can be set to 500 to 1000 ms.

After the timeout, a NoSuchElementException is thrown and the request fails quickly without affecting other business threads. This fail-fast idea is widely used in Internet services.
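
Here is a minimal fail-fast sketch under those settings, reusing the hypothetical HeavyResource and HeavyResourceFactory from the earlier example; the Duration-based setter assumes Commons Pool 2.10 or later.

import java.time.Duration;
import java.util.NoSuchElementException;
import org.apache.commons.pool2.impl.GenericObjectPool;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;

public class FailFastBorrow {

    static GenericObjectPool<HeavyResource> buildPool() {
        GenericObjectPoolConfig<HeavyResource> config = new GenericObjectPoolConfig<>();
        config.setBlockWhenExhausted(true);          // block when the pool is exhausted...
        config.setMaxWait(Duration.ofMillis(800));   // ...but give up after ~800 ms
        return new GenericObjectPool<>(new HeavyResourceFactory(), config);
    }

    static void handleRequest(GenericObjectPool<HeavyResource> pool) {
        try {
            HeavyResource resource = pool.borrowObject();   // waits at most maxWait
            try {
                // ... use the resource ...
            } finally {
                pool.returnObject(resource);
            }
        } catch (NoSuchElementException timedOut) {
            // fail fast: the wait exceeded maxWait, so report an error
            // instead of letting the business thread hang indefinitely
        } catch (Exception creationFailure) {
            // factory/validation failures surface here
        }
    }
}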

Parameters containing the word evict mainly deal with object eviction. Besides being expensive to initialize and destroy, pooled objects also occupy system resources while they sit in the pool.

For example, a connection pool holds open connections, and a thread pool adds scheduling overhead. Under a traffic spike, the business acquires more objects than usual and puts them into the pool; once these objects are no longer needed, they should be cleaned up.

Objects idle for longer than minEvictableIdleTimeMillis are forcibly reclaimed; the default is 30 minutes. The softMinEvictableIdleTimeMillis parameter is similar, but it only evicts when the number of idle objects exceeds minIdle, so the former is the more aggressive of the two.

There are also 4 test parameters: testOnCreate, testOnBorrow, testOnReturn, and testWhileIdle, respectively specifying whether to check the validity of pooled objects during creation, acquisition, return, and idle detection.

Enabling these checks helps guarantee resource availability, but it costs performance, so they all default to false.

In production, it is recommended to enable only testWhileIdle and to tune the idle-detection interval (timeBetweenEvictionRunsMillis), for example to 1 minute, which balances resource availability against efficiency.
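
A minimal sketch of that recommendation, using the Duration-based setters available since Commons Pool 2.10; HeavyResource is still the hypothetical type from the earlier examples.

import java.time.Duration;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;

public class IdleCheckConfig {

    static GenericObjectPoolConfig<HeavyResource> productionConfig() {
        GenericObjectPoolConfig<HeavyResource> config = new GenericObjectPoolConfig<>();
        config.setTestOnBorrow(false);                              // keep the hot path cheap
        config.setTestWhileIdle(true);                              // validate objects in the background instead
        config.setTimeBetweenEvictionRuns(Duration.ofMinutes(1));   // run the evictor once a minute
        config.setMinEvictableIdleTime(Duration.ofMinutes(30));     // hard-evict after 30 minutes idle (the default)
        return config;
    }
}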

JMH test

How big is the performance gap between using a connection pool and not using one?

Below is a simple JMH benchmark (see the repository) that performs a simple SET operation, writing a random value to a Redis key.

@Fork(2)
@State(Scope.Benchmark)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 5, time = 1)
@BenchmarkMode(Mode.Throughput)
public class JedisPoolVSJedisBenchmark {
   JedisPool pool = new JedisPool("localhost", 6379);

   @Benchmark
   public void testPool() {
       Jedis jedis = pool.getResource();
       jedis.set("a", UUID.randomUUID().toString());
       jedis.close();
   }

   @Benchmark
   public void testJedis() {
       Jedis jedis = new Jedis("localhost", 6379);
       jedis.set("a", UUID.randomUUID().toString());
       jedis.close();
   }
   // several lines omitted here
}

Plotting the benchmark results shows that the pooled approach achieves roughly 5 times the throughput of creating a new connection for every call!

Database connection pool HikariCP

HikariCP takes its name from the Japanese word "光" (hikari), meaning "light", the implication being that the software runs as fast as light. It is the default database connection pool in Spring Boot.

The database is a component we use constantly at work, and many client-side connection pools have been designed for it. Their design principle is essentially the same as described at the beginning of this article: they effectively reduce the resources consumed by creating and destroying database connections.

Even among connection pools, performance differs. HikariCP's official benchmark chart shows its excellent results; the official JMH benchmark code can be found on GitHub.

A common interview question goes: why is HikariCP so fast?

There are three main aspects:

  • It uses FastList instead of ArrayList; with a suitable initial size, it avoids the overhead of index range checks
  • It optimizes and trims the generated bytecode, using Javassist to reduce the cost of dynamic proxies, for example preferring the invokestatic instruction over invokevirtual
  • It implements a lock-free ConcurrentBag collection, reducing lock contention in concurrent scenarios

Several of HikariCP's performance optimizations are well worth studying; later posts will analyze some of these scenarios in detail.

A database connection pool also faces the question of a maximum size (maximumPoolSize) and a minimum size (minimumIdle). And here comes another very common interview question: how large do you usually make the connection pool?

Many people assume that the bigger the pool, the better; some even set this value above 1000, which is a misconception.

In practice, 20 to 50 database connections are usually enough. The exact size should be tuned to the workload, but a very large value is definitely inappropriate.

HikariCP officially recommends not setting minimumIdle; by default it is set equal to maximumPoolSize. If your database server has connection capacity to spare, you might as well forgo the pool's dynamic resizing.
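
Here is a minimal HikariCP configuration sketch along those lines; the JDBC URL and credentials are placeholders, and the sizes are only illustrative.

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class HikariDemo {
    public static void main(String[] args) {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://localhost:3306/test");
        config.setUsername("user");
        config.setPassword("password");
        config.setMaximumPoolSize(20);      // a modest, fixed pool size is usually enough
        // minimumIdle is left unset, so it defaults to maximumPoolSize (a fixed-size pool)
        config.setConnectionTimeout(800);   // fail fast after ~800 ms of waiting for a connection

        try (HikariDataSource dataSource = new HikariDataSource(config)) {
            // dataSource.getConnection() borrows a pooled connection
        }
    }
}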

In addition, an application can configure multiple database connection pools according to the type of query or transaction. Few people know this optimization technique, so let me describe it briefly.

There are usually two kinds of workloads: one needs a fast response and should return data to the user as soon as possible; the other can run slowly in the background, takes a long time, and is not time-sensitive.

If these two kinds of work share one database connection pool, they easily compete for connections, which in turn hurts the response time of user-facing interfaces.

Splitting into microservices would solve this, but most services are not in a position to do so; splitting the connection pool is the practical alternative.

Within the same application, we divide the workload by its characteristics and give each kind its own connection pool, as sketched below.
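
A minimal sketch of the split; the pool names, sizes, timeouts, and the JDBC URL are illustrative assumptions, not the article's own configuration.

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class SplitPools {

    static HikariDataSource buildPool(String poolName, int size, long timeoutMs) {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://localhost:3306/test");
        config.setUsername("user");
        config.setPassword("password");
        config.setPoolName(poolName);
        config.setMaximumPoolSize(size);
        config.setConnectionTimeout(timeoutMs);
        return new HikariDataSource(config);
    }

    // interactive queries: small latency budget, fail fast
    static final HikariDataSource interactivePool = buildPool("interactive", 20, 500);

    // background reports: allowed to wait longer without starving user-facing calls
    static final HikariDataSource batchPool = buildPool("batch", 10, 10_000);
}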

HikariCP also highlights another detail: since JDBC4, connection validity can be checked through Connection.isValid().

That way we don't need to set a pile of test parameters, and HikariCP does not even provide them.
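
A minimal sketch of the JDBC4 validity check; the URL and credentials are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class ValidityCheck {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "user", "password")) {
            // isValid() asks the driver to ping the server within a 2-second timeout,
            // so no hand-written test query (e.g. SELECT 1) needs to be configured
            boolean usable = conn.isValid(2);
            System.out.println("connection valid: " + usable);
        }
    }
}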

Result cache pool

By the time you get here, you may notice that pools and caches have a lot in common.

One thing they share is that prepared objects are kept in a relatively fast-to-access area. I habitually think of cached entries as data objects and pooled entries as execution objects. Cached data has a hit-rate problem, whereas the objects in a pool are generally interchangeable.

Consider these scenarios: JSP provides dynamic web pages and is compiled into class files after the first execution to speed up later requests; or some media platforms periodically render popular articles into static HTML pages so that nginx load balancing alone can handle highly concurrent requests (dynamic/static separation).

In cases like these it is hard to say whether the optimization is caching or object pooling. In essence, both simply save the result of some execution step so the work does not have to start from scratch on the next access.

I usually call this technique a result cache pool: a combination of several optimization methods.

Summary

To briefly summarize the key points of this article: we started with Commons Pool 2, the most common pooling library in Java, looked at some of its implementation details, and explained how several important parameters are used.

Jedis builds on top of Commons Pool 2, and through the JMH benchmark we saw a performance improvement of nearly 5x after pooling.

Next I introduced HikariCP, a very fast database connection pool. It is built on pooling and squeezes out further performance through careful coding techniques. HikariCP is one of the libraries I pay close attention to, and I recommend adding it to your own study list.

In general, when you encounter the following scenarios, you can consider using pooling to increase system performance:

  • Creating or destroying the object consumes significant system resources
  • Creating or destroying the object takes a long time and involves complicated operations or long waits
  • After creation, the object can be reused repeatedly after a simple state reset

Pooling objects is only the first step of the optimization. To reach the best performance, the pool's key parameters must also be tuned: a reasonable pool size plus a reasonable timeout lets the pool deliver much more value. And just as with cache hit rate, monitoring the pool is very important.

For example, if monitoring shows that the number of connections in the database pool stays high for a long time without being released while the number of waiting threads rises sharply, that can help us quickly pinpoint a database transaction problem.

There are many similar scenarios in everyday coding. HTTP connection pools, for example: both OkHttp and HttpClient provide a connection pool, which you can analyze by analogy; the focus is again on pool size and timeouts (see the sketch below).
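
A minimal sketch of an HTTP client with an explicit connection pool, assuming OkHttp 3.12+ is on the classpath; the pool size and timeouts are illustrative.

import java.util.concurrent.TimeUnit;
import okhttp3.ConnectionPool;
import okhttp3.OkHttpClient;

public class HttpPoolDemo {
    public static void main(String[] args) {
        // at most 20 idle connections, each kept alive for 5 minutes
        ConnectionPool pool = new ConnectionPool(20, 5, TimeUnit.MINUTES);

        OkHttpClient client = new OkHttpClient.Builder()
                .connectionPool(pool)
                .connectTimeout(500, TimeUnit.MILLISECONDS)   // time allowed to establish a connection
                .callTimeout(2, TimeUnit.SECONDS)             // overall per-call budget
                .build();

        // reuse this single client across requests; idle connections stay pooled
    }
}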

Lower-level middleware such as RPC frameworks also commonly uses connection pooling to speed up resource acquisition, for example Dubbo's connection pool, or Feign switched over to HttpClient.

You will find that pooling designs look similar across different resource levels. The thread pool, for example, adds a second layer that buffers tasks in a queue and offers various rejection policies; we will cover thread pools in subsequent articles.

These thread pool features can also inform connection pool technology, for example to relieve request overflow by introducing overflow policies.

In reality we do much the same. How exactly, and with what practices? That part is left for you to think about.

Copyright statement: This article is an original article of CSDN blogger "Philadelphia Migrant Worker" and follows the CC 4.0 BY-SA copyright agreement. For reprinting, please attach the original source link and this statement.

Original link: https://blog.csdn.net/monarch91/article/details/123867269
