Caffeine local cache

 

In short: a high-performance caching library for Java 8 that replaces GuavaCache. Caffeine is the king of local caches.

Caffeine is a high-performance, near-optimal caching library based on Java 8.

Caffeine provides an in-memory cache using an API inspired by Google Guava. Its improvements draw on the experience gained from designing Guava's cache and ConcurrentLinkedHashMap.

LoadingCache<Key, Graph> graphs = Caffeine.newBuilder()
    .maximumSize(10_000)
    .expireAfterWrite(5, TimeUnit.MINUTES)
    .refreshAfterWrite(1, TimeUnit.MINUTES)
    .build(key -> createExpensiveGraph(key));

Foreword:

According to the official introduction, Caffeine is a high-performance local cache library based on JDK 8 that provides a near-optimal hit rate. It is somewhat similar to ConcurrentMap in the JDK. In fact, the LocalCache interface in Caffeine extends the JDK's ConcurrentMap interface, but the two are not the same. The most fundamental difference is that a ConcurrentMap keeps every element added to it until the element is explicitly removed (for example by calling the remove method). A local cache, by contrast, is generally configured with an automatic eviction policy that limits memory usage and protects the application from running out of memory.

Caffeine provides a flexible construction method to create a local cache that can meet the following characteristics:

  1. Automatic loading of entries into the local cache, optionally asynchronously;

  2. Size-based eviction when a maximum number of entries is exceeded;

  3. Time-based expiration, measured from the last access or the last write;

  4. Asynchronous refresh;

  5. Keys can be wrapped in weak references;

  6. Values can be wrapped in weak or soft references, so that they can be garbage collected without leaking memory;

  7. Notification when entries are removed;

  8. Propagation of writes to an external resource;

  9. Accumulation of cache access statistics;

Benchmarks:

Judging from the official benchmark results, the comparison is clear: whether the workload is read-only, write-only, or mixed read-write, and whether it runs with 8 threads or 16 threads, Caffeine comes out well ahead.

 

Usage:

Caffeine is very simple to use. If you have used GuavaCache it is even simpler, because Caffeine's API design draws heavily on GuavaCache. First, add the Maven dependency:

<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>2.8.4</version>
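</dependency>

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalListener;
import java.util.concurrent.TimeUnit;

public class CaffeineDemo {
  public static void main(String[] args) {
    Cache<String, String> cache = Caffeine.newBuilder()
        .maximumSize(1024)
        .expireAfterWrite(5, TimeUnit.SECONDS)
        // Note: weakKeys() compares keys by identity (==) rather than equals()
        .weakKeys()
        .weakValues()
        .removalListener((RemovalListener<String, String>) (key, value, cause) ->
            System.out.println("key:" + key + ",value:" + value + ",cause:" + cause))
        .build();

    // Put data into the local cache
    cache.put("username", "caffer");
    cache.put("password", "123456");

    // Wait 4 seconds; change the sleep time to observe expiration (entries expire after 5 seconds)
    try {
      Thread.sleep(4000L);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    // Read data from the local cache
    System.out.println(cache.getIfPresent("username"));
    System.out.println(cache.getIfPresent("password"));
    // On a miss, compute the value with the mapping function (for example, fetch it from Redis)
    System.out.println(cache.get("blog", key -> "fetched from the Redis cache"));
  }
}

AsyncLoadingCache<String, String> cache = Caffeine.newBuilder()
        // Maximum number of entries
        .maximumSize(2)
        // Expiration time
        .expireAfterWrite(5, TimeUnit.MINUTES)
        .refreshAfterWrite(1, TimeUnit.MINUTES)
        // Asynchronous loading
        .buildAsync(new CacheLoader<String, String>() {
            @Nullable
            @Override
            public String load(@NonNull String key) throws Exception {
                return getValue(key);
            }
        });
// cache.get(key) returns a CompletableFuture; calling get() on it blocks until the value is loaded
System.out.println(cache.get("username").get());
System.out.println(cache.get("password").get(6, TimeUnit.MINUTES));
System.out.println(cache.get("username").get(6, TimeUnit.MINUTES));
System.out.println(cache.get("blog").get());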

 

Expiration mechanism

The expiration mechanism of a local cache is very important because cached data, unlike business data, does not have to be protected against loss. A local cache should generally use as little memory as possible while keeping the hit rate up, and in extreme cases its entries may be garbage collected.

Caffeine's expiration policy is declared when the Cache is built. The main options are:

  1. expireAfterWrite: the entry expires a fixed duration after its last write;

  2. expireAfterAccess: the entry expires a fixed duration after its last access (read or write);

  3. expireAfter: a custom expiration policy;
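The difference between the first two policies can be sketched with plain arithmetic. The class below is a hypothetical model using only the JDK (the class and method names are illustrative and are not part of Caffeine's API); it only shows which timestamp each policy measures from.

```java
// Hypothetical helper class; the method names are illustrative, not Caffeine API.
final class ExpirationModel {

    // expireAfterWrite: only the time of the last write matters.
    static boolean expiredAfterWrite(long lastWrite, long now, long ttl) {
        return now - lastWrite >= ttl;
    }

    // expireAfterAccess: the timer restarts on every read or write.
    static boolean expiredAfterAccess(long lastWrite, long lastRead, long now, long ttl) {
        long lastAccess = Math.max(lastWrite, lastRead);
        return now - lastAccess >= ttl;
    }

    public static void main(String[] args) {
        long ttl = 5;        // seconds
        long lastWrite = 0;  // entry written at t = 0
        long lastRead = 4;   // entry read at t = 4
        long now = 6;        // checked at t = 6

        System.out.println(expiredAfterWrite(lastWrite, now, ttl));            // true
        System.out.println(expiredAfterAccess(lastWrite, lastRead, now, ttl)); // false
    }
}
```

With the same 5-second duration, the entry read at t = 4 is already expired under the write-based policy but still alive under the access-based one.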

Refresh mechanism

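Specify the refresh interval with the refreshAfterWrite method when building the Cache. For example, refreshAfterWrite(10, TimeUnit.SECONDS) makes an entry eligible for refresh 10 seconds after it was last written:

LoadingCache<String, String> cache = Caffeine.newBuilder()
    .refreshAfterWrite(10, TimeUnit.SECONDS)
    .build(new CacheLoader<String, String>() {
        @Override
        public String load(String k) {
            // Query the latest data from the database or another source here
            return getValue(k);
        }
    });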

Tips: Caffeine's refresh mechanism is "passive". Suppose we declare a refresh interval of 10 seconds and read the value v1 at time T. At T+5 seconds, the value in the database is updated to v2. At T+12 seconds, more than 10 seconds after the write, the value we get from the local cache is still v1, not v2. During that read, Caffeine notices that 10 seconds have passed and loads v2 into the local cache, so v2 is returned by the next read. In terms of implementation, the get method calls afterRead, which calls refreshIfNeeded to decide whether the entry needs to be refreshed. This means that if nothing reads the entry, the local cache keeps the old data no matter what the refresh interval is!
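The timeline above can be replayed with a small model. This is a sketch under simplifying assumptions, not Caffeine's code: PassiveRefreshModel is a hypothetical single-entry cache built only on the JDK, time comes from an injectable clock, and the reload runs on a direct executor so that the result is deterministic. It also does not deduplicate concurrent reloads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical single-entry model of refresh-on-read; not Caffeine's actual code.
final class PassiveRefreshModel {
    private final long refreshAfterSeconds;
    private final LongSupplier clock;       // injectable time source, in seconds
    private final Supplier<String> loader;  // stands in for CacheLoader.load
    private final Executor executor;        // runs the reload
    private volatile String value;
    private volatile long writeTime;

    PassiveRefreshModel(long refreshAfterSeconds, LongSupplier clock,
                        Supplier<String> loader, Executor executor) {
        this.refreshAfterSeconds = refreshAfterSeconds;
        this.clock = clock;
        this.loader = loader;
        this.executor = executor;
        this.value = loader.get();          // initial load
        this.writeTime = clock.getAsLong();
    }

    // Returns the cached value. If the entry is stale, a reload is scheduled,
    // but this call still returns the old value.
    String get() {
        String current = value;
        if (clock.getAsLong() - writeTime >= refreshAfterSeconds) {
            executor.execute(() -> {
                value = loader.get();
                writeTime = clock.getAsLong();
            });
        }
        return current;
    }

    // Replays the timeline from the text: v1 at T, v2 in the database at T+5.
    static List<String> demo() {
        AtomicLong now = new AtomicLong(0);
        AtomicReference<String> database = new AtomicReference<>("v1");
        PassiveRefreshModel cache =
            new PassiveRefreshModel(10, now::get, database::get, Runnable::run);

        List<String> reads = new ArrayList<>();
        now.set(5);
        database.set("v2");
        reads.add(cache.get());  // T+5: not stale yet
        now.set(12);
        reads.add(cache.get());  // T+12: stale, reload triggered, old value returned
        reads.add(cache.get());  // the next read sees the reloaded value
        return reads;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [v1, v1, v2]
    }
}
```

If get() is never called after T+10, the reload never runs, which is exactly the "passive" behavior the tip warns about.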

 

Removal mechanism

The removalListener method can be used to register a removal listener when building the Cache, so that entries removed from the local cache can be tracked. According to the RemovalCause enum, there are five removal causes:

  • "EXPLICIT" : The entry was deleted explicitly by calling a method (for example: cache.invalidate(key), cache.invalidateAll);

  • "REPLACED" : The entry was not really removed; the user called a method (for example: put(), putAll(), etc.) that overwrote the previous value;

  • "COLLECTED" : The key or value of the entry was garbage collected;

  • "EXPIRED" : The entry was removed because it was not written (expireAfterWrite) or accessed (expireAfterAccess) within the configured time;

  • "SIZE" : The entry was evicted because the number of elements exceeded the maximumSize limit;
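To make the mapping from operations to causes concrete, here is a minimal sketch using only the JDK. RemovalCauseModel and its Cause enum are hypothetical stand-ins that cover three of the five causes; eviction is plain FIFO purely to keep the example short, which is not how Caffeine chooses a victim.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model; Caffeine's real enum is RemovalCause with five values.
final class RemovalCauseModel {
    enum Cause { EXPLICIT, REPLACED, SIZE }

    private final int maximumSize;
    private final Map<String, String> store = new LinkedHashMap<>(); // insertion order
    private final List<String> events = new ArrayList<>();

    RemovalCauseModel(int maximumSize) {
        this.maximumSize = maximumSize;
    }

    void put(String key, String value) {
        String old = store.put(key, value);
        if (old != null) {
            emit(key, old, Cause.REPLACED);   // overwritten, not really removed
        }
        if (store.size() > maximumSize) {
            // Evict the oldest insertion (FIFO, chosen only to keep the sketch short).
            Iterator<Map.Entry<String, String>> it = store.entrySet().iterator();
            Map.Entry<String, String> eldest = it.next();
            String evictedKey = eldest.getKey();
            String evictedValue = eldest.getValue();
            it.remove();
            emit(evictedKey, evictedValue, Cause.SIZE);
        }
    }

    void invalidate(String key) {
        String old = store.remove(key);
        if (old != null) {
            emit(key, old, Cause.EXPLICIT);   // the caller removed it by hand
        }
    }

    // Plays the role of the removal listener: records what was removed and why.
    private void emit(String key, String value, Cause cause) {
        events.add(key + "=" + value + ":" + cause);
    }

    static List<String> demo() {
        RemovalCauseModel cache = new RemovalCauseModel(2);
        cache.put("a", "1");
        cache.put("a", "2");    // REPLACED
        cache.put("b", "1");
        cache.put("c", "1");    // SIZE: "a" is evicted
        cache.invalidate("b");  // EXPLICIT
        return cache.events;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [a=1:REPLACED, a=2:SIZE, b=1:EXPLICIT]
    }
}
```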

 

Differences between GuavaCache and Caffeine

  1. In terms of eviction algorithm, GuavaCache uses the "LRU" algorithm, while Caffeine uses the "Window TinyLFU" algorithm. This is the biggest and most fundamental difference between the two.

  2. For immediate expiration, Guava converts it (for example: expireAfterAccess(0) and expireAfterWrite(0)) into setting the maximum size to 0. As a result, the removal notification reports the cause as SIZE instead of EXPIRED. Caffeine identifies the removal cause correctly.

  3. In terms of replacement notifications, Guava triggers the removal listener whenever a value is replaced, whatever the reason. Caffeine does not trigger the listener when the new value is the same reference as the previous value.

  4. In terms of asynchronous processing, Caffeine hands much of its work to a thread pool (default: ForkJoinPool.commonPool()), for example: removal listeners, refreshing, and maintenance work.

Memory footprint comparison

Caffeine can initialize its internal data structures lazily, or resize them dynamically, depending on usage. This reduces the memory footprint. The memory usage is measured with gradle memoryOverhead. The result may be affected by JVM pointer compression, object padding, and so on.

 

LRU vs. W-TinyLFU

A cache eviction policy tries to predict which data is most likely to be used again in the near future, and thereby increases the cache hit rate. Because it is simple to implement, efficient at runtime, and gives a good hit rate in common workloads, LRU (Least Recently Used) is probably the most popular eviction policy. However, LRU's ability to predict the future has obvious limits. It assumes that the data that arrived most recently is the most likely to be accessed again, and gives that data the highest priority.

Modern caches make wider use of historical data, combining recency and frequency to predict more accurately which data will be used again. W-TinyLFU is one such policy.
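The limitation of LRU described above can be reproduced in a few lines. The sketch below is only an illustration using the JDK's LinkedHashMap in access order; LruScanDemo is a hypothetical class and is not how Guava or Caffeine are implemented. A key that was read 100 times is still evicted by a short burst of keys that are each touched once.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class: a tiny LRU cache built on the JDK's LinkedHashMap.
final class LruScanDemo {

    // Builds an access-ordered map that evicts the least recently used entry
    // once it holds more than `capacity` entries.
    static <K, V> Map<K, V> lruCache(int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns true if the frequently read key survives a one-off scan.
    static boolean hotKeySurvivesScan(int capacity, int scanLength) {
        Map<String, String> cache = lruCache(capacity);
        cache.put("hot", "value");
        for (int i = 0; i < 100; i++) {
            cache.get("hot");                // read many times: high frequency
        }
        for (int i = 0; i < scanLength; i++) {
            cache.put("scan-" + i, "once");  // each key is touched only once
        }
        return cache.containsKey("hot");
    }

    public static void main(String[] args) {
        System.out.println(hotKeySurvivesScan(3, 2)); // true: the cache is not full yet
        System.out.println(hotKeySurvivesScan(3, 3)); // false: the scan pushed "hot" out
    }
}
```

A policy that also tracks frequency would have reason to keep "hot" and reject the one-off keys instead, which is the motivation for combining recency with frequency.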

 

Guava migration

If a project already uses GuavaCache, how can it migrate to Caffeine at the lowest possible cost? Caffeine has thought of this: it provides an adapter that lets you operate a Caffeine cache through Guava's interfaces. The code snippet is as follows:

// Guava's LoadingCache interface
LoadingCache<Key, Graph> graphs = CaffeinatedGuava.build(
    Caffeine.newBuilder().maximumSize(10_000),
    new CacheLoader<Key, Graph>() { // Guava's CacheLoader
        @Override public Graph load(Key key) throws Exception {
          return createExpensiveGraph(key);
        }
    });

 

In practice:

Population strategy (Population)

Caffeine provides three population strategies: manual, synchronous, and asynchronous.

Eviction strategy (Eviction)

Caffeine provides three types of eviction strategies: size-based, time-based, and reference-based.

Reference-based:

The differences between strong, soft, weak, and phantom references are as follows.

Java's four reference types, from strongest to weakest, are: strong reference > soft reference > weak reference > phantom reference

Reference type     | Garbage collection time    | Use                          | Survival time
Strong reference   | Never                      | General state of the object  | Ends when the JVM stops running
Soft reference     | When memory is low         | Object cache                 | Ends when memory is insufficient
Weak reference     | During garbage collection  | Object cache                 | Ends after GC runs
Phantom reference  | Unknown                    | Unknown                      | Unknown
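The weak and soft rows of the table correspond to the JDK classes WeakReference and SoftReference, which is what the weakKeys(), weakValues(), and softValues() builder options rely on. The sketch below (ReferenceDemo is a hypothetical class name) shows the one behavior that is guaranteed, namely that a referent is never cleared while it is still strongly reachable. What happens after the strong reference is dropped depends on the JVM, so no output is promised for that part.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo class using only the JDK's reference types.
final class ReferenceDemo {

    // While a strong reference exists, weak and soft references are never cleared.
    static boolean heldWhileStronglyReachable() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        SoftReference<Object> soft = new SoftReference<>(strong);
        System.gc(); // only a hint to the JVM
        return weak.get() == strong && soft.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println(heldWhileStronglyReachable()); // true

        // No strong reference is kept to these two objects.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        SoftReference<Object> soft = new SoftReference<>(new Object());
        System.gc();
        // Typically the weak referent is gone and the soft one survives until
        // memory runs low, but the JVM gives no guarantee about the timing.
        System.out.println("weak cleared: " + (weak.get() == null));
        System.out.println("soft cleared: " + (soft.get() == null));
    }
}
```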

Removal listener (Removal)

Concepts:

  • Eviction: an entry is deleted automatically in the background because an eviction policy applies to it
  • Invalidation: the caller deletes the entry manually
  • Removal: the result of an eviction or an invalidation; a removal listener is notified of both

Refresh

LoadingCache<Key, Graph> graphs = Caffeine.newBuilder()
    .maximumSize(10_000)
    // Refresh the entry once a fixed interval has passed since it was created or last updated
    .refreshAfterWrite(1, TimeUnit.MINUTES)
    .build(key -> createExpensiveGraph(key));

Refresh and eviction are not the same.

Refreshing is specified by the LoadingCache.refresh(key) method and is carried out by calling CacheLoader.reload. Refreshing a key loads the new value for that key asynchronously, and the old value (if any) is still returned while the refresh is in progress. After an eviction, by contrast, a query has to wait until the value has been loaded again.

The difference from expireAfterWrite is that refreshAfterWrite only makes an entry eligible for refresh after the specified duration; the refresh is actually started only when the entry is queried. For example, you can specify refreshAfterWrite and expireAfterWrite on the same cache. An entry is then refreshed only when it is eligible and is actually queried, so the expiration timer is not reset blindly.

If an entry is not queried after it becomes eligible for refresh, it is allowed to expire.

The refresh operation is performed asynchronously on an Executor. The default executor is ForkJoinPool.commonPool(), which can be overridden with Caffeine.executor(Executor).

If an exception is thrown during a refresh, the old value is kept and the exception is logged rather than rethrown.

Statistics (Statistics)

Cache<Key, Graph> graphs = Caffeine.newBuilder()
    .maximumSize(10_000)
    .recordStats()
    .build();

Using Caffeine.recordStats(), you can turn on statistics collection. The Cache.stats() method returns CacheStats that provide statistical information, such as:

  • hitRate(): Returns the ratio of hits to requests
  • hitCount(): Returns the total number of hits in the cache
  • evictionCount(): the number of cache evictions
  • averageLoadPenalty(): The average time it takes to load a new value
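How these figures relate to each other can be sketched with a few counters. StatsModel below is a hypothetical JDK-only model, not Caffeine's CacheStats. It assumes hitRate is hits divided by total requests and averageLoadPenalty is total load time divided by the number of loads, and the values it returns when nothing has been requested or loaded yet (1.0 and 0.0) are conventions chosen for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of the counters behind hitRate() and averageLoadPenalty().
final class StatsModel {
    private final Map<String, String> store = new HashMap<>();
    private long hitCount;
    private long missCount;
    private long loadCount;
    private long totalLoadNanos;

    // Returns the cached value, loading and timing it on a miss.
    String get(String key, Function<String, String> loader) {
        String cached = store.get(key);
        if (cached != null) {
            hitCount++;
            return cached;
        }
        missCount++;
        long start = System.nanoTime();
        String loaded = loader.apply(key);
        totalLoadNanos += System.nanoTime() - start;
        loadCount++;
        store.put(key, loaded);
        return loaded;
    }

    long hitCount() {
        return hitCount;
    }

    // Ratio of hits to requests; 1.0 before any request (convention of this sketch).
    double hitRate() {
        long requests = hitCount + missCount;
        return requests == 0 ? 1.0 : (double) hitCount / requests;
    }

    // Average time in nanoseconds spent loading a value; 0.0 before any load.
    double averageLoadPenalty() {
        return loadCount == 0 ? 0.0 : (double) totalLoadNanos / loadCount;
    }

    static StatsModel demo() {
        StatsModel cache = new StatsModel();
        Function<String, String> loader = key -> "value-of-" + key;
        cache.get("a", loader); // miss
        cache.get("a", loader); // hit
        cache.get("b", loader); // miss
        cache.get("a", loader); // hit
        return cache;
    }

    public static void main(String[] args) {
        StatsModel stats = demo();
        System.out.println(stats.hitCount()); // 2
        System.out.println(stats.hitRate());  // 0.5
    }
}
```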

 

TIPS:

  • expireAfterAccess(long, TimeUnit): the entry expires a fixed duration after its last access (read or write)
  • expireAfterWrite(long, TimeUnit): the entry expires a fixed duration after it was created or last modified
  • expireAfter(Expiry): a custom policy in which the Expiry implementation decides how long each entry lives

 

Reference documents:

Kill GuavaCache: Caffeine is the king of local cache: https://blog.csdn.net/u013256816/article/details/106740641

Caffeine source code: https://github.com/ben-manes/caffeine

Caffeine cache: https://my.oschina.net/xiaolyuh/blog/3109290

 

 


Origin blog.csdn.net/chajinglong/article/details/113079264