[Guava] Guava Cache's refresh and expire mechanisms

1. Overview

Reprinted: https://www.cnblogs.com/liuxiaochong/p/13613071.html
Overview reference: [Guava] Google Guava local high-efficiency cache
Case reference: [guava] Do some operations when the GuavaCache cache is invalid. RemovalListener

2. Thinking and guessing

First look at three time-based ways to clean or refresh cached data:

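expireAfterAccess: the entry is evicted if it has not been read or written within the specified duration.
expireAfterWrite: the entry is evicted if it has not been updated within the specified duration.
refreshAfterWrite: the entry is refreshed once the specified duration has passed since its last update.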
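
To make the difference between the three settings concrete, here is a minimal plain-JDK sketch. It is a simplified model, not Guava's code: the Entry class, its fields, and the method names are hypothetical and exist only for illustration.

```java
import java.util.concurrent.TimeUnit;

/** Simplified model of the three time checks; NOT Guava's implementation. */
public class TimePolicies {

    /** Hypothetical cache entry; timestamps are in nanoseconds. */
    static final class Entry {
        long lastWriteNanos;   // updated on load, put, or refresh
        long lastAccessNanos;  // updated on every read or write

        Entry(long now) { lastWriteNanos = now; lastAccessNanos = now; }
        void read(long now) { lastAccessNanos = now; }
        void write(long now) { lastWriteNanos = now; lastAccessNanos = now; }
    }

    /** expireAfterAccess: measured from the last read or write. */
    static boolean expiredAfterAccess(Entry e, long now, long ttlNanos) {
        return now - e.lastAccessNanos >= ttlNanos;
    }

    /** expireAfterWrite: measured from the last write only; reads do not extend it. */
    static boolean expiredAfterWrite(Entry e, long now, long ttlNanos) {
        return now - e.lastWriteNanos >= ttlNanos;
    }

    /** refreshAfterWrite: the entry stays readable but is due for a reload. */
    static boolean refreshDue(Entry e, long now, long refreshNanos) {
        return now - e.lastWriteNanos > refreshNanos;
    }

    public static void main(String[] args) {
        long sec = TimeUnit.SECONDS.toNanos(1);
        Entry e = new Entry(0);   // written at t = 0 s
        e.read(4 * sec);          // read at t = 4 s
        long now = 5 * sec;
        System.out.println(expiredAfterAccess(e, now, 5 * sec)); // false
        System.out.println(expiredAfterWrite(e, now, 5 * sec));  // true
        System.out.println(refreshDue(e, now, 1 * sec));         // true
    }
}
```

The read at t = 4 s keeps the entry alive under expireAfterAccess but has no effect on the two write-based checks.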

Considering timeliness, we can use expireAfterWrite to invalidate an entry a fixed time after each update and then load it again. Guava Cache strictly allows only one load operation per key at a time, which prevents a flood of requests from penetrating to the back end at the moment the entry expires and causing an avalanche effect.

However, analysis of the source code shows that Guava Cache takes a lock to ensure there is only one load operation, and other requests must block until that load completes. Moreover, after the load completes, the threads of the other requests acquire the lock one by one to check whether the load has finished: each thread takes its turn through an "acquire the lock, get the value, release the lock" sequence, which costs some performance. Since we plan to cache locally for only 1 second, the frequent expiration, loading, and lock waiting would cost even more.

So we consider using refreshAfterWrite. Its distinguishing feature is that during a refresh only one reload operation is allowed, while other queries return the old value right away. This effectively reduces waiting and lock contention, so refreshAfterWrite performs better than expireAfterWrite. It also has a disadvantage: once the specified time is reached, it cannot strictly guarantee that every query gets the new value. Anyone who has studied the timed expiration (or refresh) of Guava Cache knows that it does not use additional threads for timed cleanup and loading; it relies on query requests instead. On each query it compares the current time with the last update time, and loads or refreshes if the specified time has been exceeded. Therefore, with refreshAfterWrite and very low throughput, for example no query for a long time, the next query may get an old value (possibly one from a long time ago), which can cause problems.

It can be seen that refreshAfterWrite and expireAfterWrite each have their own advantages, disadvantages, and usage scenarios. So is there a compromise between the two? For example, refresh the cache every 1s, but if there is no access for more than 2s, invalidate the entry so that the next access does not get the old value and must load a new one. Since the official Guava documentation gives no detailed explanation, and I found no answer in online materials, I could only analyze the source code. The analysis shows that when both are used at the same time, the expected effect is achieved. This is really good news!

3. Source code analysis

By tracing the source code of the get method of LoadingCache, I found that the following core methods will eventually be called. The source code is posted below:

The com.google.common.cache.LocalCache.Segment.get method:

[Image: source code of LocalCache.Segment.get with steps numbered 1 to 3; not reproduced here]

In this cache get method, number 1 checks whether there is a live value, that is, whether the entry has expired according to expireAfterAccess and expireAfterWrite. If it has expired, the value is null and number 3 is executed. Number 2 checks, for an entry that has not expired, whether a refresh is needed according to refreshAfterWrite. Number 3 performs a load (load, not reload), because there is no live value: either it expired or there was never a value at all. The code shows that get checks expiration first and refresh second, so we can set refreshAfterWrite to 1s and expireAfterWrite to 2s. When access is frequent, a refresh is performed every second; when there is no access for more than 2s, the next access must load the new value.
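
The ordering described above (expiration first, then refresh) can be sketched with a single-threaded plain-JDK model. This is an illustration under simplifying assumptions, not the LocalCache code: the class, the Outcome enum, and the explicit clock parameter are all hypothetical.

```java
import java.util.concurrent.TimeUnit;

/** Single-threaded model of the check order in get(); NOT Guava's implementation. */
public class GetFlowSketch {

    enum Outcome { LOAD, RELOAD, HIT }

    private final long refreshNanos;
    private final long expireNanos;
    private boolean present = false; // whether any value has been stored yet
    private long writeNanos;         // time of the last load or reload

    GetFlowSketch(long refreshNanos, long expireNanos) {
        this.refreshNanos = refreshNanos;
        this.expireNanos = expireNanos;
    }

    /** Models one get() at the given time and reports which path it took. */
    Outcome get(long now) {
        // (1) Is there a live value? An expired value counts as absent.
        boolean live = present && now - writeNanos < expireNanos;
        if (live) {
            // (2) Live value: refresh it if the refresh interval has passed.
            if (now - writeNanos > refreshNanos) {
                writeNanos = now; // the reload replaces the value
                return Outcome.RELOAD;
            }
            return Outcome.HIT;
        }
        // (3) No live value (expired or never loaded): a full load is required.
        present = true;
        writeNanos = now;
        return Outcome.LOAD;
    }

    public static void main(String[] args) {
        long ms = TimeUnit.MILLISECONDS.toNanos(1);
        GetFlowSketch cache = new GetFlowSketch(1000 * ms, 2000 * ms);
        System.out.println(cache.get(0));         // LOAD
        System.out.println(cache.get(500 * ms));  // HIT
        System.out.println(cache.get(1500 * ms)); // RELOAD
        System.out.println(cache.get(4000 * ms)); // LOAD (2.5 s since the last write)
    }
}
```

With refresh at 1s and expire at 2s, the last call finds the entry expired, so it takes the load path and never sees the stale value.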

Let's follow the trail further, look at what load and refresh each do, and verify the theories mentioned above.

Let's take a look at the com.google.common.cache.LocalCache.Segment.lockedGetOrLoad method:

[Image: excerpt of the LocalCache.Segment.lockedGetOrLoad source code; not reproduced here]

This method is rather long, so for reasons of space not all of the code is posted. There are 7 key steps:

1. Obtain the lock;

2. Obtain the valueReference corresponding to the key;

3. Check whether the cached value is currently loading. If it is, do not start another load (createNewEntry is set to false) and wait for the new value instead;

4. If it is not loading, check whether there is already a new value (loaded by another request). If there is, return it;

5. Prepare to load and set the value reference to loadingValueReference. The loadingValueReference makes other requests see in step 3 that a load is in progress;

6. Release the lock;

7. If you really need to load, perform the load operation.

The analysis shows that there is only one load operation and that other get calls block until it finishes, which confirms the earlier theory.
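
The seven steps can be approximated with a small plain-JDK sketch. It is a deliberately simplified model, not the lockedGetOrLoad code: the class name, the use of CompletableFuture as the loading marker, and the demo helper are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Simplified model of "one load, everyone else waits"; NOT Guava's implementation. */
public class SingleFlightLoad<K, V> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Map<K, CompletableFuture<V>> entries = new HashMap<>();
    private final Function<K, V> loader;

    SingleFlightLoad(Function<K, V> loader) { this.loader = loader; }

    V get(K key) {
        CompletableFuture<V> future;
        boolean createNewEntry = false;
        lock.lock();                                // step 1: acquire the lock
        try {
            future = entries.get(key);              // step 2: look up the value reference
            if (future == null) {                   // steps 3-4: not loading, no value yet
                future = new CompletableFuture<>(); // step 5: publish a loading marker
                entries.put(key, future);
                createNewEntry = true;
            }
        } finally {
            lock.unlock();                          // step 6: release the lock
        }
        if (createNewEntry) {                       // step 7: load outside the lock
            try {
                future.complete(loader.apply(key));
            } catch (RuntimeException e) {
                future.completeExceptionally(e);
                lock.lock();
                try { entries.remove(key, future); } finally { lock.unlock(); }
            }
        }
        return future.join(); // other callers block here until the load finishes
    }

    /** Starts the given number of clients on one key and returns how many loads ran. */
    static int demo(int threads) {
        AtomicInteger loads = new AtomicInteger();
        SingleFlightLoad<String, String> cache = new SingleFlightLoad<>(key -> {
            loads.incrementAndGet();
            try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            return "value-of-" + key;
        });
        Thread[] clients = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            clients[i] = new Thread(() -> cache.get("key"), "client-" + i);
            clients[i].start();
        }
        for (Thread t : clients) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loads: " + demo(10)); // loads: 1
    }
}
```

Whichever client takes the lock first installs the marker, so the loader runs once no matter how the threads are scheduled.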

Let's take a look at the com.google.common.cache.LocalCache.Segment.scheduleRefresh method:

[Image: source code of LocalCache.Segment.scheduleRefresh with the added print statement; not reproduced here]

1. Check whether a refresh is needed and the entry is not currently loading. If so, perform the refresh and return the new value.

2. Step 2 was added by me in preparation for the test below: if a refresh is needed but another thread is already refreshing the value, print a message. The old value is ultimately returned.

Continue to dive into the refresh method called in step 1:

[Image: source code of LocalCache.Segment.refresh; not reproduced here]

1. Insert a loadingValueReference to mark the value as loading. Other requests use this marker to decide whether to refresh or to return the old value. insertLoadingValueReference takes a lock to ensure that only one refresh penetrates to the back end; for reasons of space it is not expanded here. The locked range is smaller here than during load, however. In the expire->load process, once the get calls find the entry expired, they all have to acquire the lock and wait until the new value is available, so the blocking lasts from expiry until the load has produced the new value. In the refresh->reload process, a get that finds a refresh is needed first checks whether a load is already in progress, then acquires the lock, and releases it before reloading. The locked range covers only the object creation and set operations inside insertLoadingValueReference, which is almost negligible. This is one of the reasons why refresh is more efficient than expire.

2. Perform the refresh. loadAsync is not expanded here; it calls the reload method of CacheLoader. The reload method can be overridden to load asynchronously, in which case the current thread returns the old value and performance is better. The default implementation calls the load method of CacheLoader synchronously.
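
The two refresh steps can be modelled in plain JDK code as follows. This is a sketch under stated assumptions, not Guava's implementation: the class, the AtomicBoolean gate (standing in for the loading marker), and the get(boolean) signature are hypothetical. In Guava itself the hook is CacheLoader.reload, which returns a ListenableFuture.

```java
import java.util.ArrayDeque;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.UnaryOperator;

/** Simplified model of refresh: one reloader, old value for everyone else. */
public class RefreshSketch {
    private volatile String value;
    private final AtomicBoolean refreshing = new AtomicBoolean(false);
    private final UnaryOperator<String> reloader; // maps old value to new value
    private final Executor executor;              // where the reload runs

    RefreshSketch(String initial, UnaryOperator<String> reloader, Executor executor) {
        this.value = initial;
        this.reloader = reloader;
        this.executor = executor;
    }

    /** Returns the cached value; starts a reload if one is due and none is running. */
    String get(boolean refreshDue) {
        // Only the caller that flips the flag starts a reload (step 1).
        if (refreshDue && refreshing.compareAndSet(false, true)) {
            CompletableFuture
                    .supplyAsync(() -> reloader.apply(value), executor) // step 2
                    .whenComplete((fresh, error) -> {
                        if (error == null) {
                            value = fresh;     // publish the new value
                        }                      // on failure the old value is kept
                        refreshing.set(false);
                    });
        }
        return value; // whatever is cached right now, without waiting
    }

    public static void main(String[] args) {
        // A queue stands in for a background thread so the demo is deterministic.
        ArrayDeque<Runnable> pending = new ArrayDeque<>();
        RefreshSketch cache = new RefreshSketch("v1", old -> "v2", pending::add);
        System.out.println(cache.get(true));  // v1 (reload queued)
        System.out.println(cache.get(true));  // v1 (reload already in flight)
        System.out.println(pending.size());   // 1
        pending.poll().run();                 // the background reload completes
        System.out.println(cache.get(false)); // v2
    }
}
```

Passing Runnable::run as the executor makes the reload run on the calling thread, so the caller that triggers the refresh gets the new value; this corresponds to the synchronous default described in step 2.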

Now we know the difference between refresh and expire: refresh executes reload, whereas after expiry load is executed again, the same as during initialization.

4. Testing and verification

In the source code posted above you may have noticed some System.out.println statements, which I added to make the following test easier. Now let's verify the analysis with a program.

Post the test source code:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.ListenableFuture;

public class ConcurrentTest {

    private static final int CONCURRENT_NUM = 10; // number of concurrent clients

    private static volatile int value = 1;

    private static LoadingCache<String, String> cache = CacheBuilder.newBuilder()
            .maximumSize(1000)
            .expireAfterWrite(5, TimeUnit.SECONDS)
            .refreshAfterWrite(1, TimeUnit.SECONDS)
            .build(new CacheLoader<String, String>() {
                @Override
                public String load(String key) throws InterruptedException {
                    System.out.println("load by " + Thread.currentThread().getName());
                    return createValue(key);
                }

                @Override
                public ListenableFuture<String> reload(String key, String oldValue) throws Exception {
                    System.out.println("reload by " + Thread.currentThread().getName());
                    return Futures.immediateFuture(createValue(key));
                }
            });

    // Creates the value
    private static String createValue(String key) throws InterruptedException {
        // Sleep for 1 second to expose the concurrency behavior of load and reload
        Thread.sleep(1000L);
        return String.valueOf(value++);
    }

    public static void main(String[] args) throws InterruptedException, ExecutionException {
        CyclicBarrier barrier = new CyclicBarrier(CONCURRENT_NUM);
        CountDownLatch latch = new CountDownLatch(CONCURRENT_NUM);
        for (int i = 0; i < CONCURRENT_NUM; i++) {
            final ClientRunnable runnable = new ClientRunnable(barrier, latch);
            Thread thread = new Thread(runnable, "client-" + i);
            thread.start();
        }

        // Test that, after a period without access, expire (load) happens rather than refresh
        latch.await();
        Thread.sleep(5100L);
        System.out.println("\nAfter no reads for longer than the expire time...");
        System.out.println(Thread.currentThread().getName() + ",val:" + cache.get("key"));
    }

    static class ClientRunnable implements Runnable {

        CyclicBarrier barrier;
        CountDownLatch latch;

        public ClientRunnable(CyclicBarrier barrier, CountDownLatch latch) {
            this.barrier = barrier;
            this.latch = latch;
        }

        public void run() {
            try {
                barrier.await();
                // Each client sleeps for a random time to exercise both refresh and load
                Thread.sleep((long) (Math.random() * 4000));
                System.out.println(Thread.currentThread().getName() + ",val:" + cache.get("key"));
                latch.countDown();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}

Execution result:

[Image: console output of the test run; not reproduced here]

The verification results are consistent with expectations:

1. When the cache has not yet been initialized, client-1 is the first to obtain the load lock and performs the load. While it loads, the other clients also enter the load path, block until client-1 releases the lock, and then acquire the lock in turn. In the end, the load is performed only by client-1.

2. Once the time set by refreshAfterWrite has passed, the value needs to be refreshed, and client-5 performs the refresh. During this process the other clients do not acquire the lock; they read the old value directly and get the new value once the refresh has completed. The transition is smooth.

3. After no access for longer than the time set by expireAfterWrite, the value has expired by the time the main thread accesses it, so a load is performed and the old value is not returned.

Reprinted at: https://blog.csdn.net/abc86319253/article/details/53020432

Origin blog.csdn.net/qq_21383435/article/details/108835727