Local caching and Redis data caching strategies



Hello everyone, I am Nezha.

When I first encountered caching, I simply used a map. At the time, I was building a real-time data synchronization feature.

The requirement seemed simple: receive data on one end and pass it along on the other.

  1. The server data was received over WebSocket;
  2. The data was then cached in a local map by category;
  3. A scheduled task uploaded it to a third-party server via FTP.

Under concurrency, a plain map breaks down and the data gets corrupted. Switching to ConcurrentHashMap solved the concurrent-access problem.

  • The on-site network was very unstable, and the FTP connection kept going up and down;
  • We were building a real-time safety monitoring system, and the third party's data requirements were strict: the data had to be 100% accurate.

How to resolve this contradiction was not obvious at first.

At first we solved it by restarting; haha, a restart fixes everything.

  1. Added a heartbeat to monitor the status of the FTP service in real time;
  2. If the connection was down for more than 7 seconds, an alarm was triggered. I remember setting the alarm sound to fire-alarm music to remind on-site staff to check the FTP network;
  3. If it was disconnected for more than 1 minute, the software restarted automatically.

However, a new problem appeared: data was lost.

Because the data was cached in a ConcurrentHashMap, i.e., a local in-process cache, a restart wiped it all out.

Looking back, my implementation was really crude; a proper local cache should provide many capabilities, none of which I had at the time:

  1. An eviction policy such as LRU or LFU when the maximum capacity is exceeded
  2. Expiration handling, e.g. timed, lazy, or periodic deletion
  3. Persistence
  4. Statistics and monitoring

Below, we will look at caching systematically along these dimensions: caching in general, local caching, Redis caching, and Redis caching strategies.

1. Cache

Caching loads frequently accessed hot data from the traditional relational database into memory. When users access that hot data again, it is served from memory, which reduces the number of database queries and avoids the database being overwhelmed, or even brought down, in high-concurrency scenarios.

Common categories of caches:

  1. Operating system disk cache, reducing mechanical disk operations
  2. Database cache, reducing filesystem I/O
  3. Application cache, reducing queries to the database
  4. Web server cache, reducing requests to the application server
  5. Client browser cache, reducing visits to the website

Local cache: a portion of the client's local physical memory is set aside to cache data that the client writes back to the server. When this local write-back cache reaches its threshold, the data is flushed to the server.

2. Advantages of local cache

Data caching brings many advantages, two of which are core:

  • Reduce database pressure: by keeping commonly used data in fast memory, a cache effectively reduces the load on the back-end database. The database can then focus on complex queries and updates instead of handling frequent, repetitive read requests.
  • Improve response speed: serving data from the cache lets the system respond to user requests much faster. Compared with hitting the database every time, a cache can return the required information within milliseconds, greatly improving the user experience.

3. Local cache solutions

ConcurrentHashMap was introduced above, so I won't go into detail on it here.

1. Local cache based on Guava Cache

Guava is an open-source Java core library from Google. It provides utilities for collections, concurrency, caching, I/O, reflection and more, with solid performance and stability, and is very widely used.

Guava Cache supports many useful features (a usage sketch follows the list):

  1. Maximum capacity limit
  2. Two expiration policies: by write (insertion) time and by access time
  3. Simple statistics
  4. An implementation based on the LRU algorithm
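
Below is a minimal usage sketch of Guava Cache exercising the features listed above (capacity limit, write/access expiry, statistics); the key and value names are illustrative:

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

public class GuavaCacheDemo {
    public static void main(String[] args) {
        // Bounded cache with write/access expiry and statistics enabled
        Cache<String, String> cache = CacheBuilder.newBuilder()
                .maximumSize(1000)                       // capacity limit, LRU-style eviction
                .expireAfterWrite(10, TimeUnit.MINUTES)  // expire 10 minutes after insertion
                .expireAfterAccess(5, TimeUnit.MINUTES)  // expire 5 minutes after last access
                .recordStats()                           // enable hit/miss statistics
                .build();

        cache.put("product:1", "iPhone");
        String value = cache.getIfPresent("product:1");  // returns null on a miss
        System.out.println(value + ", hitRate=" + cache.stats().hitRate());
    }
}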

2. Local cache based on Caffeine

Caffeine is a new-generation caching library built on Java 8, with performance close to the theoretical optimum. It can be regarded as an enhanced Guava Cache with a very similar API.

The difference is that Caffeine uses W-TinyLFU, an algorithm that combines the advantages of LRU and LFU, which gives it a clear edge in hit rate and performance.
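
The API is nearly identical to Guava Cache. Here is a minimal sketch, with illustrative key and value names, that lets Caffeine's W-TinyLFU policy decide what to keep within the size bound:

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.util.concurrent.TimeUnit;

public class CaffeineCacheDemo {
    public static void main(String[] args) {
        // Bounded cache; Caffeine applies W-TinyLFU internally to pick eviction victims
        Cache<String, String> cache = Caffeine.newBuilder()
                .maximumSize(10_000)
                .expireAfterWrite(10, TimeUnit.MINUTES)
                .recordStats()
                .build();

        cache.put("product:1", "iPhone");
        String value = cache.getIfPresent("product:1");
        System.out.println(value + ", hitRate=" + cache.stats().hitRate());
    }
}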

3. Local cache based on Ehcache

Ehcache is a pure-Java in-process caching framework that is fast and lean.

Compared with Caffeine and Guava Cache, Ehcache offers richer features and stronger scalability.

Advantages (a minimal configuration sketch follows the list):

  1. Supports multiple eviction algorithms, including LRU, LFU and FIFO
  2. Supports on-heap, off-heap, and disk storage; disk storage enables persistence
  3. Supports several clustering options to solve data-sharing problems
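
As a rough illustration, here is a minimal heap-only configuration using the Ehcache 3 builder API; the cache name, types, and sizes are assumptions for the example:

import org.ehcache.Cache;
import org.ehcache.CacheManager;
import org.ehcache.config.builders.CacheConfigurationBuilder;
import org.ehcache.config.builders.CacheManagerBuilder;
import org.ehcache.config.builders.ResourcePoolsBuilder;

public class EhcacheDemo {
    public static void main(String[] args) {
        // A heap-only cache holding up to 1000 entries
        CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder()
                .withCache("products",
                        CacheConfigurationBuilder.newCacheConfigurationBuilder(
                                String.class, String.class,
                                ResourcePoolsBuilder.heap(1000)))
                .build(true); // true = initialize the manager immediately

        Cache<String, String> cache = cacheManager.getCache("products", String.class, String.class);
        cache.put("product:1", "iPhone");
        System.out.println(cache.get("product:1"));

        cacheManager.close();
    }
}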

4. Introducing Redis

Later, because of an incident, the client (Party A) was fined 1 million yuan by the regulatory platform. The root cause was data loss.

That was bad news. I broke into a cold sweat and spent the night working out a remediation plan. The final answer: introduce Redis.

As a high-performance, in-memory data store, Redis is widely used for caching.

  1. When a user accesses data for the first time, there is nothing in the cache, so the data must be fetched from the database; reading from disk is relatively slow.
  2. After fetching the data, store it in the cache.
  3. When the user accesses the same data again, it is served directly from the cache; because the cache operates purely in memory, access is fast (a minimal cache-aside sketch follows this list).
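
This read path is the classic cache-aside pattern. Below is a minimal sketch of it using the Jedis client; the connection details and the loadFromDatabase helper are assumptions for illustration, not part of the original system:

import redis.clients.jedis.Jedis;

public class RedisCacheAsideDemo {
    public static void main(String[] args) {
        String key = "product:123";
        try (Jedis jedis = new Jedis("localhost", 6379)) { // host/port are illustrative
            String value = jedis.get(key);                 // 1. try the cache first
            if (value == null) {
                value = loadFromDatabase(key);             // 2. cache miss: read from the database
                jedis.setex(key, 600, value);              // 3. write back with a 10-minute TTL
            }
            System.out.println(value);
        }
    }

    // Hypothetical database lookup, standing in for a real DAO call
    private static String loadFromDatabase(String key) {
        return "data-for-" + key;
    }
}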


The following sections discuss Redis's data caching strategies in depth, focusing on algorithms such as LRU (Least Recently Used) and LFU (Least Frequently Used), and share how performance tuning can improve the efficiency of the cache system.

5. Redis data caching strategy

1. Why do you need a data caching strategy?

In modern applications, data caching plays a vital role.

By storing frequently accessed data in memory, we are able to avoid unnecessary database queries, thereby significantly improving system responsiveness and throughput.

However, as application scale and traffic keep growing, an effective data caching strategy becomes especially important.

We need to find the best balance between performance and resource utilization to meet different needs and challenges.

This leads to a key question: how do you choose a data caching strategy suited to different application scenarios?

The following diagram illustrates in detail the advantages of data caching and the process of selecting a suitable data caching strategy:

[Figure: the advantages of data caching and the process of selecting a suitable caching strategy]

As the figure shows, the key when choosing a caching strategy is to find the best balance between performance gains and resource utilization.

Choosing a suitable strategy can effectively reduce the pressure on the database and provide a better user experience by improving the response speed.

2. Advantages of Redis as a cache

Redis (Remote Dictionary Server) is a powerful, high-performance, open-source in-memory database. Besides caching, it is also widely used for queues, publish/subscribe systems, and more. As a cache, Redis has a number of outstanding advantages:

(1) High performance features

Redis keeps data in memory, so reads and writes are extremely fast. Its efficient data structures and optimized algorithms allow most operations to complete within microseconds, meeting the needs of high-concurrency applications.

(2) Diverse cache strategies

Redis offers a variety of caching (eviction) strategies, letting developers choose the appropriate one based on business characteristics. This flexibility lets us decide when data is evicted or retained according to its access pattern, usage frequency, and other factors.

The following diagram illustrates the caching strategy selection process:

[Figure: caching strategy selection process]

Analyze the data access pattern and choose a caching strategy based on how frequently the data is accessed; then keep monitoring actual access, optimize the strategy over time, and apply different strategies flexibly in different scenarios.

6. LRU Algorithm: Least Recently Used

The LRU (Least Recently Used) algorithm is a classic cache replacement strategy. Its core idea is to first eliminate the least recently used data to make room for new data. In data caching scenarios, the LRU algorithm can retain popular data, thereby improving the cache hit rate.

1. How the LRU algorithm works

The principle of the LRU algorithm is very intuitive: when the cache space is full, the system first evicts the data that has not been accessed for the longest time. The idea behind this strategy is that if a piece of data has not been accessed recently, it is unlikely to be accessed in the near future either. This replacement strategy helps keep the cache filled with "hot" data, that is, the most recently accessed data.

[Figure: LRU keeps the most recently accessed data and evicts the data accessed longest ago]

The above figure illustrates how the LRU algorithm retains data in the cache according to the order of access. The most recently accessed data will be kept in the cache, while the oldest accessed data will be replaced first.

The sample code below shows how to implement an LRU cache by inheriting from LinkedHashMap:

import java.util.LinkedHashMap;
import java.util.Map;

class LRUCache<K, V> extends LinkedHashMap<K, V> {

    private final int MAX_CAPACITY;

    public LRUCache(int capacity) {
        // accessOrder = true: iteration order follows access order, i.e. LRU order
        super(capacity, 0.75f, true);
        MAX_CAPACITY = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the eldest (least recently used) entry once the capacity is exceeded
        return size() > MAX_CAPACITY;
    }
}

In this example, we create an LRUCache class that inherits from LinkedHashMap. By overriding the removeEldestEntry method, we specify that when the cache size exceeds the threshold, the entry that has not been accessed for the longest time is automatically removed.
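
A short usage sketch of this class (capacity and keys are illustrative):

// A capacity-3 cache: the least recently used entry is evicted first
LRUCache<String, Integer> cache = new LRUCache<>(3);
cache.put("a", 1);
cache.put("b", 2);
cache.put("c", 3);
cache.get("a");            // touch "a" so it becomes the most recently used
cache.put("d", 4);         // exceeds capacity, evicts "b" (least recently used)
System.out.println(cache.keySet()); // prints [c, a, d]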

2. Applying the LRU algorithm in Redis

In Redis, we can enable an LRU-based eviction policy through the maxmemory-policy configuration option. When Redis's memory usage reaches the limit, the LRU algorithm is used to evict some keys and make room for new data.

Here is an example of how to enable LRU caching strategy in Redis:

# Enable the LRU eviction policy
CONFIG SET maxmemory-policy allkeys-lru
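
Note that eviction only kicks in once Redis actually reaches its memory limit; maxmemory defaults to 0 (no limit on 64-bit builds), so it must be set explicitly. Redis also implements approximate LRU by sampling keys, tunable via maxmemory-samples, and allkeys-lru considers every key while volatile-lru only considers keys that have a TTL. A sketch with illustrative values:

# Cap memory so that eviction can actually trigger (the 2gb value is illustrative)
CONFIG SET maxmemory 2gb
# Evict across all keys using approximate LRU
CONFIG SET maxmemory-policy allkeys-lru
# Keys sampled per eviction; larger values approximate true LRU more closely (default 5)
CONFIG SET maxmemory-samples 5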

3. Advantages and limitations of the LRU algorithm


The LRU (Least Recently Used) algorithm is a commonly used data caching strategy, which has some obvious advantages and some limitations when managing cached data.

Advantages

  • Suited to hot data: the LRU algorithm retains the most recently accessed data, so it fits scenarios with obvious access hotspots very well.
  • Simple and effective: the LRU algorithm is relatively simple to implement and does not require complex computation or bookkeeping.

Limitations

  • Periodic access: periodic access patterns may cause unnecessary evictions, especially in certain special business scenarios.
  • Cache pollution: a burst of one-off accesses can evict genuinely "hot" data from the cache, reducing the cache's effectiveness.

7. LFU Algorithm: Least Frequently Used

The LFU (Least Frequently Used) algorithm is a cache replacement strategy similar to LRU. Its core idea is to first eliminate the least frequently used data to make room for new data. In some specific scenarios, the LFU algorithm can better adapt to changes in data access patterns.

1. How the LFU algorithm works

The principle of the LFU algorithm is similar to the LRU algorithm, but the difference is that the LFU algorithm makes replacement decisions based on the frequency of data being accessed, not just the chronological order of access. The LFU algorithm maintains a record of data access frequency. When data needs to be eliminated, the data with the lowest access frequency will be selected first.

[Figure: LFU keeps frequently accessed data and evicts rarely accessed data]

The figure above illustrates how the LFU algorithm retains data in the cache according to the frequency of data access. Frequently accessed data will be retained, while infrequently accessed data will be replaced first.

2. Applying the LFU algorithm in Redis

In Redis, you can enable the cache policy of the LFU algorithm by configuring the maxmemory-policy option. When the memory usage of Redis reaches the limit, the LFU algorithm will be used to evict some data to make room for new data.

Here is an example of how to enable the LFU caching strategy in Redis:

# Enable the LFU eviction policy
CONFIG SET maxmemory-policy allkeys-lfu
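
Redis's LFU (available since Redis 4.0) uses a probabilistic counter rather than an exact access count, and exposes two tuning knobs: lfu-log-factor controls how quickly the counter saturates as accesses grow, and lfu-decay-time controls how fast counters decay when a key is not accessed. A sketch showing the default values:

# How aggressively the frequency counter saturates as accesses grow (default 10)
CONFIG SET lfu-log-factor 10
# Minutes of inactivity after which a key's counter is decayed (default 1)
CONFIG SET lfu-decay-time 1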

3. Advantages and limitations of the LFU algorithm


The LFU (Least Frequently Used) algorithm is an alternative data caching strategy, which has some obvious advantages and some limitations in different scenarios.

Advantages

  • Retains frequently accessed data: the LFU algorithm prioritizes keeping data that is accessed often, which suits certain periodic access scenarios.
  • Sensitive to changes in data popularity: compared with LRU, LFU adapts better to shifts in access patterns and better reflects how hot the data really is.

Limitations

  • Computational overhead: LFU must maintain access-frequency counters, which adds cost, especially with large data sets.
  • Cold-start problem: newly cached data has little frequency information, so LFU may struggle to make a good eviction decision for it.


8. Other data caching strategies

1. Least Recently Used with Sampling (LRUS)

In addition to the traditional LRU algorithm, there is an improved variant, the LRUS (Least Recently Used with Sampling) algorithm. LRUS records data accesses through periodic sampling, which gives a better estimate of what was recently used and reduces the "cold start" problem of plain LRU.

LRUS algorithm principle

The LRUS algorithm introduces a sampling mechanism to more accurately determine which data is hot and which is cold by periodically recording the access status of some data. Different from the traditional LRU algorithm, the LRUS algorithm can better adapt to changes in data access patterns and improve the hit rate of data caches.

[Figure: LRUS periodically samples data accesses to decide what to keep and what to evict]

As the figure shows, the LRUS algorithm records data accesses through periodic sampling, allowing it to judge more accurately which data should be retained and which should be replaced. A toy sketch of the idea follows.
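
LRUS is not a precisely standardized algorithm, so the following is only a minimal toy sketch of the underlying idea: instead of tracking a full recency order, sample a few random entries at eviction time and evict the one with the oldest recorded access time (this is also roughly how Redis approximates LRU). All class and member names here are made up for illustration:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Toy sketch: evict by sampling a few entries and removing the least recently used one
class SampledLruCache<K, V> {
    private final int capacity;
    private final int sampleSize;
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Long> lastAccess = new HashMap<>();

    SampledLruCache(int capacity, int sampleSize) {
        this.capacity = capacity;
        this.sampleSize = sampleSize;
    }

    V get(K key) {
        V v = values.get(key);
        if (v != null) {
            lastAccess.put(key, System.nanoTime()); // record the access time
        }
        return v;
    }

    void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= capacity) {
            evictOne();
        }
        values.put(key, value);
        lastAccess.put(key, System.nanoTime());
    }

    private void evictOne() {
        // Sample up to sampleSize random keys and evict the one accessed longest ago
        List<K> keys = new ArrayList<>(values.keySet());
        K victim = null;
        long oldest = Long.MAX_VALUE;
        for (int i = 0; i < sampleSize && !keys.isEmpty(); i++) {
            K candidate = keys.get(ThreadLocalRandom.current().nextInt(keys.size()));
            long accessedAt = lastAccess.get(candidate);
            if (accessedAt < oldest) {
                oldest = accessedAt;
                victim = candidate;
            }
        }
        if (victim != null) {
            values.remove(victim);
            lastAccess.remove(victim);
        }
    }
}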

2. Random Replacement

Random replacement is a simple but effective caching strategy. Unlike LRU and LFU, the random replacement strategy does not consider the access time or frequency of data, but randomly selects the data to be replaced. Although this sounds unintelligent, in some scenarios, the random replacement strategy shows unexpected advantages.

The principle of random replacement

The core idea of random replacement is to randomly select a piece of data from the cache to evict whenever data needs to be replaced. Although this strategy does not take the popularity or frequency of data into account, in some special cases random replacement can prevent specific data from being evicted over and over, thereby maintaining a certain degree of data diversity.

[Figure: random replacement picks a random entry to evict]

In the above figure, the random replacement algorithm randomly selects the data to be replaced, thereby maintaining data diversity in some cases.
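
A toy sketch of the idea (class and member names are illustrative): when the cache is full, evict a uniformly random entry, with no access-time or frequency tracking at all. Redis itself offers this behaviour as the allkeys-random and volatile-random values of maxmemory-policy.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Toy sketch of random replacement: evict a uniformly random entry when full
class RandomReplacementCache<K, V> {
    private final int capacity;
    private final Map<K, V> values = new HashMap<>();

    RandomReplacementCache(int capacity) {
        this.capacity = capacity;
    }

    V get(K key) {
        return values.get(key);
    }

    void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= capacity) {
            // Pick a random victim; no recency or frequency information is kept
            List<K> keys = new ArrayList<>(values.keySet());
            K victim = keys.get(ThreadLocalRandom.current().nextInt(keys.size()));
            values.remove(victim);
        }
        values.put(key, value);
    }
}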

9. Performance optimization and practical application

1. Performance considerations of data caching strategy

Performance is a key factor when selecting and configuring a data caching strategy. Different caching strategies are applicable to different business scenarios, so multiple factors need to be considered comprehensively when making a decision.

(1) Balance between cache size and hit rate

When configuring the cache size, you need to weigh the total size of the cache and the amount of data actually stored. A cache that is too small may reduce the hit rate and cannot effectively reduce the load on the database, while a cache that is too large may waste memory resources. Cache size can usually be optimized by monitoring hit ratio and cache utilization.
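
For Redis specifically, the hit ratio can be derived from the keyspace_hits and keyspace_misses counters reported by the INFO command:

# Cache hit statistics are reported in the stats section
INFO stats
# hit ratio = keyspace_hits / (keyspace_hits + keyspace_misses)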

(2) Analysis of data access patterns

Analyzing the data access patterns of the business is critical to choosing an appropriate caching strategy. For example, if some data is accessed frequently and other data is rarely accessed, choosing an appropriate strategy can improve the effectiveness of caching. For frequently accessed hot data, you can choose the LRU or LFU strategy, and for less frequently accessed cold data, you can consider a random replacement strategy.

2. Practical application case: e-commerce website

Let us use a practical application case to show how to choose the appropriate caching strategy according to business needs. Consider an e-commerce site where users frequently visit product listings, product details, and shopping cart pages. For this scenario, different caching strategies can be selected to optimize performance.

(1) Cache strategy selection for e-commerce websites

Product list page: since the product information on the list page changes frequently, you can choose the LRU or random replacement strategy. This preserves recent product data and improves page load speed.

// Product list cache using the LRU implementation shown earlier
LRUCache<String, List<Product>> productListCache = new LRUCache<>(1000); // capacity 1000

List<Product> cachedProductList = productListCache.get("productList");
if (cachedProductList == null) {
    // Cache miss: load the product list from the database and cache it
    List<Product> productList = database.getProductList();
    productListCache.put("productList", productList);
    cachedProductList = productList;
}

Product details page: the data on the product details page is relatively stable, so the LFU strategy is a good fit. This keeps frequently accessed product details in the cache and improves page response speed.

// Product details cache using an LFU-based cache class (analogous to the LRUCache above, implementation not shown)
LFUCache<String, ProductDetails> productDetailsCache = new LFUCache<>(500); // capacity 500

ProductDetails cachedProductDetails = productDetailsCache.get("product123");
if (cachedProductDetails == null) {
    // Cache miss: load the product details from the database and cache them
    ProductDetails productDetails = database.getProductDetails("product123");
    productDetailsCache.put("product123", productDetails);
    cachedProductDetails = productDetails;
}

Shopping cart page: the shopping cart data is closely tied to the user, so the LRU or LRUS strategy can be chosen. This keeps the most recently accessed cart data, providing a better user experience.

// Shopping cart cache using a sampling-based LRUS cache class (implementation not shown)
LRUSCache<String, ShoppingCart> shoppingCartCache = new LRUSCache<>(200); // capacity 200

ShoppingCart cachedShoppingCart = shoppingCartCache.get("user123");
if (cachedShoppingCart == null) {
    // Cache miss: load the shopping cart from the database and cache it
    ShoppingCart shoppingCart = database.getShoppingCart("user123");
    shoppingCartCache.put("user123", shoppingCart);
    cachedShoppingCart = shoppingCart;
}

(2) Performance optimization and improvement in practice

In practical applications, e-commerce websites can significantly improve page loading speed and user experience by properly configuring caching strategies and optimizing cache size. At the same time, by monitoring changes in data access patterns, the caching strategy can be dynamically adjusted to further optimize performance.

10. Summary and Practical Guidance

1. The importance of Redis data caching strategy

Data caching can not only improve system performance, but also reduce the pressure on the back-end database, resulting in faster response time and better user experience. In modern high-concurrency applications, optimizing data caching strategies has become an indispensable part of system design.

2. How to choose an appropriate caching strategy

In practical applications, choosing an appropriate caching strategy is crucial. According to different business scenarios and data access patterns, we can flexibly choose cache strategies such as LRU, LFU, LRUS, and random replacement. At the same time, the size of the cache can be dynamically adjusted according to actual needs to achieve the best balance between performance and resource utilization.

Practical guidance:

  1. Analyze data access patterns: before choosing a caching strategy, analyze the data access patterns in detail. Which data is accessed frequently? Which data rarely changes? Choose a strategy based on this information.
  2. Select an appropriate algorithm: choose a cache algorithm that matches the business requirements. LRU is good at keeping the most recently accessed data, LFU at keeping the most frequently accessed data, and LRUS copes better with changes in access patterns.
  3. Monitor and optimize: a caching strategy is not set once and for all; keep monitoring data access and tune the cache size and strategy. Adjustments can be made dynamically by watching the cache's hit rate and utilization.
  4. Apply flexibly: different business modules may need different caching strategies. Depending on the actual situation, several strategies can be used side by side in one system to maximize performance.
