Guava Cache: a powerful local caching framework

Guava Cache is an excellent local caching framework.

1. Classic configuration

Guava Cache's data structure is similar to the ConcurrentHashMap of JDK 1.7. It provides three eviction strategies, based on time, capacity, and reference, as well as features such as automatic loading and access statistics.

Basic configuration

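    // Imports for this test method (JUnit 4 is an assumption; JUnit 5 uses org.junit.jupiter.api.Test)
    import com.google.common.cache.CacheBuilder;
    import com.google.common.cache.CacheLoader;
    import com.google.common.cache.LoadingCache;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.TimeUnit;
    import org.junit.Test;

    @Test
    public void testLoadingCache() throws ExecutionException {
        CacheLoader<String, String> cacheLoader = new CacheLoader<String, String>() {
            @Override
            public String load(String key) throws Exception {
                System.out.println("Loading key: " + key);
                return "value";
            }
        };

        LoadingCache<String, String> cache = CacheBuilder.newBuilder()
                // Maximum size of 100 entries (size-based eviction)
                .maximumSize(100)
                // Expire an entry 10 seconds after it was written
                .expireAfterWrite(10, TimeUnit.SECONDS)
                // Make an entry eligible for refresh 1 second after it was written
                .refreshAfterWrite(1, TimeUnit.SECONDS)
                .build(cacheLoader);

        cache.put("Lasse", "穗爷");
        System.out.println(cache.size());
        // Present in the cache, so the loader is not called
        System.out.println(cache.get("Lasse"));
        // Cache miss: the loader's load method is invoked for "hello"
        System.out.println(cache.getUnchecked("hello"));
        System.out.println(cache.size());
    }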

In the example, the maximum cache capacity is set to 100 (size-based eviction), and both an expiration policy and a refresh policy are configured.

1. Expiration strategy

With expireAfterWrite configured, an entry expires once the specified duration has elapsed since it was created or last replaced.

2. Refresh strategy

With refreshAfterWrite configured, an entry becomes eligible for refresh once the specified duration has elapsed since it was written. The next read of that key then triggers a reload of the value.

At this point some readers may ask: why configure a refresh strategy at all? Isn't the expiration strategy enough?

It is enough to be functionally correct, but in high-concurrency scenarios a refresh strategy makes a big difference. Next, we will write a test case to help explain Guava Cache's thread model.

2. Understand the thread model

We simulate two operations in a multi-threaded scenario: a cache miss (or expiration) that executes the load method, and a refresh that executes the reload method.

    // Imports for this test method (JUnit 4 is an assumption; JUnit 5 uses org.junit.jupiter.api.Test)
    import com.google.common.cache.CacheBuilder;
    import com.google.common.cache.CacheLoader;
    import com.google.common.cache.LoadingCache;
    import com.google.common.util.concurrent.ListenableFuture;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import org.junit.Test;

    @Test
    public void testLoadingCache2() throws InterruptedException, ExecutionException {
        CacheLoader<String, String> cacheLoader = new CacheLoader<String, String>() {
            @Override
            public String load(String key) throws Exception {
                System.out.println(Thread.currentThread().getName() + " loading key " + key);
                try {
                    // Simulate a slow data source
                    Thread.sleep(500);
                } catch (InterruptedException e) {
                    throw new RuntimeException(e);
                }
                return "value_" + key.toLowerCase();
            }

            @Override
            public ListenableFuture<String> reload(String key, String oldValue) throws Exception {
                System.out.println(Thread.currentThread().getName() + " reloading key " + key);
                Thread.sleep(500);
                // The default reload calls load(key) synchronously on the calling thread
                return super.reload(key, oldValue);
            }
        };
        LoadingCache<String, String> cache = CacheBuilder.newBuilder()
                // Maximum size of 20 entries (size-based eviction)
                .maximumSize(20)
                // Expire an entry 10 seconds after it was written
                .expireAfterWrite(10, TimeUnit.SECONDS)
                // Make an entry eligible for refresh 1 second after it was written
                .refreshAfterWrite(1, TimeUnit.SECONDS)
                .build(cacheLoader);

        System.out.println("Testing load on a cache miss ------------------");

        ExecutorService executorService = Executors.newFixedThreadPool(5);
        for (int i = 0; i < 5; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    try {
                        long start = System.currentTimeMillis();
                        System.out.println(Thread.currentThread().getName() + " starts query");
                        String hello = cache.get("hello");
                        long elapsed = System.currentTimeMillis() - start;
                        System.out.println(Thread.currentThread().getName() + " finished query, elapsed ms: " + elapsed);
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                }
            });
        }

        cache.put("hello2", "old value");
        Thread.sleep(2000);
        System.out.println("Testing reload");
        // Wait until the entry is due for refresh, then start reading to trigger the reload
        Thread.sleep(1500);
        ExecutorService executorService2 = Executors.newFixedThreadPool(5);
        for (int i = 0; i < 5; i++) {
            executorService2.execute(new Runnable() {
                @Override
                public void run() {
                    try {
                        long start = System.currentTimeMillis();
                        System.out.println(Thread.currentThread().getName() + " starts query");
                        String hello = cache.get("hello2");
                        System.out.println(Thread.currentThread().getName() + ": " + hello);
                        long elapsed = System.currentTimeMillis() - start;
                        System.out.println(Thread.currentThread().getName() + " finished query, elapsed ms: " + elapsed);
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                }
            });
        }
        // Let the queued queries finish, then release both pools
        executorService.shutdown();
        executorService2.shutdown();
        executorService2.awaitTermination(10, TimeUnit.SECONDS);
    }

The execution results are shown in the figure below.

The results show that Guava Cache has no background thread that executes the load or reload method asynchronously. Both run on the threads that issue the requests.

Expiration strategy: with expireAfterWrite, one thread executes the load method while the other threads requesting the same key block and wait.

When a large number of threads request the value for the same key, only one of them enters the load method; the rest wait until the value has been loaded. This avoids cache breakdown, where the expiry of one hot key sends every request to the database. In high-concurrency scenarios, however, a large number of threads are still blocked.

Refresh strategy: with refreshAfterWrite, one thread executes the reload method (which calls load by default) while the other threads return the old value.

Under concurrent access to a single key, refreshAfterWrite blocks only the thread that performs the reload; the other threads return immediately. However, if many keys happen to expire at the same time, the resulting loads still put pressure on the database.
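
The blocking behavior on a cache miss can be modeled with the JDK alone. The following is a minimal sketch and not Guava's actual implementation: it relies on ConcurrentHashMap.computeIfAbsent, which applies the loader at most once per key while concurrent callers for that key wait. The class SingleFlightDemo and the method slowLoad are hypothetical names made up for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: models "one thread loads, the others wait".
// This is not how Guava is implemented internally.
class SingleFlightDemo {
    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();
    private final AtomicInteger loadCount = new AtomicInteger();

    // Stand-in for a slow database query.
    private String slowLoad(String key) {
        loadCount.incrementAndGet();
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value_" + key.toLowerCase();
    }

    String get(String key) {
        // At most one caller per key runs slowLoad; the others block until it finishes.
        return store.computeIfAbsent(key, this::slowLoad);
    }

    // Issues `threads` concurrent reads of one key and returns how many loads ran.
    static int loadsForConcurrentReads(int threads) {
        SingleFlightDemo cache = new SingleFlightDemo();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    cache.get("hello");
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return cache.loadCount.get();
    }

    public static void main(String[] args) {
        // Five concurrent readers of the same key share a single load.
        System.out.println("loads: " + loadsForConcurrentReads(5));
    }
}
```

The database is queried once, yet four of the five threads spend the whole load time waiting, which is exactly the cost the refresh strategy is meant to reduce.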

To improve system performance, we can optimize in two ways:

1. Configure the refresh interval to be shorter than the expiration interval (refresh < expire), which reduces the probability of blocking a large number of threads;

2. Adopt an asynchronous refresh strategy: a separate thread loads the data asynchronously while all requests return the old cached value, which guards against cache avalanche.
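
To make the second point concrete, here is a simplified, JDK-only model of a single cache entry that keeps serving the old value while one reload runs on an executor. It is a sketch under stated assumptions, not Guava's code: the class StaleWhileRefresh is hypothetical, the clock is injected so the walk-through is deterministic, and error handling for failed reloads is left out.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical single-entry model of "return the old value while one reload runs".
class StaleWhileRefresh<V> {
    private final Supplier<V> loader;
    private final long refreshAfterMillis;
    private final LongSupplier clock;   // injectable so the example is deterministic
    private final Executor executor;    // where reloads run
    private final AtomicBoolean refreshing = new AtomicBoolean(false);
    private volatile V value;
    private volatile long writeTime;

    StaleWhileRefresh(Supplier<V> loader, long refreshAfterMillis,
                      LongSupplier clock, Executor executor) {
        this.loader = loader;
        this.refreshAfterMillis = refreshAfterMillis;
        this.clock = clock;
        this.executor = executor;
        this.value = loader.get();      // the initial load is synchronous
        this.writeTime = clock.getAsLong();
    }

    V get() {
        V current = value;
        boolean due = clock.getAsLong() - writeTime >= refreshAfterMillis;
        // Only the caller that wins the compare-and-set schedules a reload;
        // every caller, including the winner, returns the value read above.
        if (due && refreshing.compareAndSet(false, true)) {
            executor.execute(() -> {
                try {
                    value = loader.get();
                    writeTime = clock.getAsLong();
                } finally {
                    refreshing.set(false);
                }
            });
        }
        return current;
    }

    // Walk-through with a manual clock and a same-thread executor.
    static String demo() {
        long[] now = {0};
        AtomicInteger version = new AtomicInteger();
        StaleWhileRefresh<String> entry = new StaleWhileRefresh<>(
                () -> "v" + version.incrementAndGet(), 1000, () -> now[0], Runnable::run);
        String fresh = entry.get();     // still inside the refresh window
        now[0] = 1500;                  // the entry is now due for refresh
        String stale = entry.get();     // old value returned, reload triggered
        String reloaded = entry.get();  // the reloaded value is visible
        return fresh + "," + stale + "," + reloaded;
    }

    public static void main(String[] args) {
        System.out.println(demo());     // prints v1,v1,v2
    }
}
```

In production the executor would be a real thread pool, so the read that triggers the reload returns without waiting for the data source.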

The figure below shows the timeline of the optimization plan:

3. Two ways to implement asynchronous refresh

3.1 Override the reload method

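    import com.google.common.cache.CacheBuilder;
    import com.google.common.cache.CacheLoader;
    import com.google.common.cache.LoadingCache;
    import com.google.common.util.concurrent.ListenableFuture;
    import com.google.common.util.concurrent.ListenableFutureTask;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    ExecutorService executorService = Executors.newFixedThreadPool(5);
    CacheLoader<String, String> cacheLoader = new CacheLoader<String, String>() {
        @Override
        public String load(String key) throws Exception {
            System.out.println(Thread.currentThread().getName() + " loading key " + key);
            // Load from the database
            return "value_" + key.toLowerCase();
        }

        @Override
        public ListenableFuture<String> reload(String key, String oldValue) throws Exception {
            // Run load on the thread pool; the cache keeps serving oldValue until the task completes
            ListenableFutureTask<String> futureTask = ListenableFutureTask.create(() -> {
                System.out.println(Thread.currentThread().getName() + " asynchronously loading key " + key);
                return load(key);
            });
            executorService.submit(futureTask);
            return futureTask;
        }
    };
    LoadingCache<String, String> cache = CacheBuilder.newBuilder()
            // Maximum size of 20 entries (size-based eviction)
            .maximumSize(20)
            // Expire an entry 10 seconds after it was written
            .expireAfterWrite(10, TimeUnit.SECONDS)
            // Make an entry eligible for refresh 1 second after it was written
            .refreshAfterWrite(1, TimeUnit.SECONDS)
            .build(cacheLoader);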

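3.2 Use the CacheLoader.asyncReloading method

    ExecutorService executorService = Executors.newFixedThreadPool(5);

    // asyncReloading wraps the loader so that reload runs load on the given executor
    CacheLoader<String, String> cacheLoader = CacheLoader.asyncReloading(
            new CacheLoader<String, String>() {
                @Override
                public String load(String key) throws Exception {
                    System.out.println(Thread.currentThread().getName() + " loading key " + key);
                    // Load from the database
                    return "value_" + key.toLowerCase();
                }
            }, executorService);

    // Build the cache with the same settings as in section 3.1
    LoadingCache<String, String> cache = CacheBuilder.newBuilder()
            .maximumSize(20)
            .expireAfterWrite(10, TimeUnit.SECONDS)
            .refreshAfterWrite(1, TimeUnit.SECONDS)
            .build(cacheLoader);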

4. Asynchronous refresh + multi-level cache

Scenario:

An e-commerce company needed to optimize the performance of its app's home page interface. It took the author about two days to complete the whole solution, which uses a two-level cache together with Guava's asynchronous refresh mechanism.

The overall architecture is shown in the figure below:

The cache reading process is as follows:

1. When the business gateway has just started, the local cache is empty, so it reads the Redis cache. If Redis has no data either, the gateway calls the shopping guide service over RPC and then writes the result to both the local cache and Redis; if Redis does have the data, that data is written to the local cache.

2. Since the local cache has been warmed up in step 1, subsequent requests read the local cache directly and return the result to the user.

3. Guava is configured with a refresh mechanism: once an entry is due for refresh, a request triggers a reload on the custom LoadingCache thread pool (5 core threads, 5 maximum threads), which synchronizes data from the shopping guide service to the local cache and Redis.
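
The read path in steps 1 and 2 can be sketched as follows. This is a simplified model, not the project's code: the class TwoLevelCache is hypothetical, a plain Map stands in for the Redis client, and a Function stands in for the RPC call to the shopping guide service. In the real system the local level would be a Guava LoadingCache whose loader performs the Redis and RPC lookups, which also prevents concurrent misses from calling the RPC more than once.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of the read path: local cache -> Redis -> RPC.
class TwoLevelCache {
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final Map<String, String> redis;           // stand-in for a Redis client
    private final Function<String, String> rpcLoader;  // stand-in for the RPC call

    TwoLevelCache(Map<String, String> redis, Function<String, String> rpcLoader) {
        this.redis = redis;
        this.rpcLoader = rpcLoader;
    }

    String get(String key) {
        String value = local.get(key);
        if (value != null) {
            return value;                  // step 2: served from the warmed local cache
        }
        value = redis.get(key);
        if (value == null) {
            value = rpcLoader.apply(key);  // step 1: both levels are empty
            redis.put(key, value);
        }
        local.put(key, value);             // warm the local cache either way
        return value;
    }

    // Two gateways share one Redis; returns how many RPC calls were needed.
    static int rpcCallsForTwoGateways() {
        Map<String, String> sharedRedis = new ConcurrentHashMap<>();
        AtomicInteger rpcCalls = new AtomicInteger();
        Function<String, String> rpc = key -> {
            rpcCalls.incrementAndGet();
            return "home_" + key;
        };
        TwoLevelCache gatewayA = new TwoLevelCache(sharedRedis, rpc);
        TwoLevelCache gatewayB = new TwoLevelCache(sharedRedis, rpc);
        gatewayA.get("banner");  // misses both levels, calls the RPC
        gatewayA.get("banner");  // local cache hit
        gatewayB.get("banner");  // local miss, Redis hit
        return rpcCalls.get();
    }

    public static void main(String[] args) {
        System.out.println("RPC calls: " + rpcCallsForTwoGateways());
    }
}
```

Only the first read on the first gateway reaches the backing service; a second gateway warms its local cache from Redis.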

After the optimization, performance was very good: the average response time is about 5 ms, and the application's GC frequency dropped significantly.

This solution still has a flaw. One night we found that the data shown on the app's home page was inconsistent: successive requests sometimes returned the same data and sometimes different data.

In other words, although the LoadingCache threads kept calling the interface to update the cache, the data in the local caches of the individual servers was not fully consistent.

This illustrates two important points:

1. Lazy loading can still leave the data on multiple machines inconsistent;

2. The LoadingCache thread pool was not sized reasonably, which caused tasks to pile up.

The suggested solution is:

1. Combine asynchronous refresh with a messaging mechanism to update the cached data: when the configuration in the shopping guide service changes, the business gateway is notified so that it pulls the data again and updates the cache.

2. Increase the LoadingCache thread pool parameters appropriately, and instrument the thread pool to monitor its usage. When the threads are busy, an alarm can be raised and the thread pool parameters can then be adjusted dynamically.
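
The second suggestion can be sketched with the JDK's ThreadPoolExecutor, which exposes its load and allows resizing at runtime. The helper RefreshPoolMonitor, its alarm threshold, and the queue capacity are assumptions made for this illustration; the original system's monitoring setup is not described.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for watching and resizing the refresh thread pool.
class RefreshPoolMonitor {
    private final ThreadPoolExecutor pool;
    private final int queueAlarmThreshold;

    RefreshPoolMonitor(ThreadPoolExecutor pool, int queueAlarmThreshold) {
        this.pool = pool;
        this.queueAlarmThreshold = queueAlarmThreshold;
    }

    // True when every worker is busy and the backlog exceeds the threshold.
    // getActiveCount is an approximation, which is good enough for an alarm.
    boolean isOverloaded() {
        return pool.getActiveCount() >= pool.getMaximumPoolSize()
                && pool.getQueue().size() > queueAlarmThreshold;
    }

    // Changes core and maximum size in an order that keeps core <= maximum
    // at every step, because the setters reject an inconsistent pair.
    void resize(int threads) {
        if (threads > pool.getMaximumPoolSize()) {
            pool.setMaximumPoolSize(threads);
            pool.setCorePoolSize(threads);
        } else {
            pool.setCorePoolSize(threads);
            pool.setMaximumPoolSize(threads);
        }
    }

    // Starts from the 5/5 configuration in the article and grows the pool to 10.
    static int demo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                5, 5, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(100));
        RefreshPoolMonitor monitor = new RefreshPoolMonitor(pool, 50);
        try {
            if (monitor.isOverloaded()) {
                throw new IllegalStateException("an idle pool must not report overload");
            }
            monitor.resize(10);
            return pool.getCorePoolSize();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("core pool size after resize: " + demo());
    }
}
```

A scheduled task could call isOverloaded periodically and raise an alarm, and resize could then be driven from a configuration center.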

5. Summary

Guava Cache is very powerful. It has no background thread that executes the load or reload method asynchronously; instead, these operations are performed by the request threads.

To improve system performance, we can approach the problem in two ways:

1. Configure the refresh interval to be shorter than the expiration interval (refresh < expire), which reduces the probability of blocking a large number of threads.

2. Adopt an asynchronous refresh strategy: a separate thread loads the data asynchronously while all requests return the old cached value.

Even so, we still need to consider consistency between the cache and the database when using this approach.