Typical issues in Redis cache design

Table of contents

Cache penetration

Cache invalidation (breakdown)

Cache avalanche

Hotspot cache key reconstruction optimization

Cache and database double-write inconsistency


Cache penetration

Cache penetration refers to querying data that does not exist at all: neither the cache layer nor the storage layer hits. Usually, for fault-tolerance reasons, data that cannot be found in the storage layer is not written to the cache layer, so every request for a non-existent key goes straight to the storage layer, and the cache no longer protects the back-end storage at all.
There are two basic causes of cache penetration:
First, a problem in your own business code or data.
Second, malicious attacks, crawlers, and the like generating a large number of queries for non-existent keys.

Solutions to the cache penetration problem:
1. Cache empty objects. When a lookup misses both the cache and the database, write a placeholder (empty) value into the cache, ideally with a short expiration time, so repeated queries for the same non-existent key no longer reach the database.

2. Bloom filter
For malicious attacks that flood the server with requests for non-existent data, you can also filter with a Bloom filter first. A Bloom filter can generally screen out data that does not exist and stop those requests from reaching the backend. When a Bloom filter says a value exists, it may not actually exist; when it says a value does not exist, it definitely does not exist.

A Bloom filter is just a large bit array plus several independent, unbiased hash functions ("unbiased" meaning each function spreads hash values relatively evenly). To add a key, each hash function computes an integer for the key, which is taken modulo the bit-array length to get a position; each hash function yields a different position, and all of those positions are set to 1. To ask the Bloom filter whether a key exists, compute the same positions and check whether they are all 1. If any bit is 0, the key is definitely not in the filter. If all are 1, the key is only probably present, because those bits may have been set by other keys: the sparser the bit array, the more reliable a positive answer; the more crowded it is, the higher the false-positive rate.

This approach suits scenarios where hits are infrequent and the data set is large, relatively fixed, and has low real-time requirements. Code maintenance is more complex, but the cache space occupied is very small.
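The two defenses above can be sketched together. This is a minimal illustration, not production code: the cache and database are plain dicts, `BloomFilter` derives its bit positions from salted SHA-256 digests (a stand-in for proper unbiased hash functions), and `NULL_SENTINEL` plays the role of a cached empty object (in real Redis you would store it with a short TTL).

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a bit array plus k hash positions per key."""
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Salt the key per hash function, then take the digest mod array length.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # Any bit 0 -> key definitely absent; all bits 1 -> key *may* exist.
        return all(self.bits[pos] for pos in self._positions(key))

NULL_SENTINEL = object()  # cached "empty object" marker

def cached_get(key, cache, bloom, load_from_db):
    # 1) The Bloom filter rejects keys that were never written at all.
    if not bloom.might_contain(key):
        return None
    # 2) Cache hit (including a cached empty object for a known miss).
    if key in cache:
        value = cache[key]
        return None if value is NULL_SENTINEL else value
    # 3) Real miss: load from storage; on a DB miss, cache the empty object
    #    so the next query for this key does not hit the database again.
    value = load_from_db(key)
    cache[key] = value if value is not None else NULL_SENTINEL
    return value
```

A request for a key the Bloom filter has never seen returns immediately without touching the database, and a key that exists in the filter but not in the database costs at most one database read.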

Cache invalidation (breakdown)

When a large batch of cache entries expires at the same moment, the resulting requests all penetrate the cache and hit the database simultaneously, which can put the database under sudden pressure or even bring it down. So when warming the cache in batches, it is best to spread that batch's expiration times across a time window rather than giving every key the same TTL.
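Staggering expirations can be as simple as adding random jitter to a base TTL. A minimal sketch, where `BASE_TTL` and `JITTER` are illustrative values and the dict-of-tuples cache stands in for a Redis `SETEX`:

```python
import random

BASE_TTL = 600  # base expiration in seconds (illustrative)
JITTER = 120    # spread this batch's expirations over a 2-minute window

def ttl_with_jitter(base=BASE_TTL, jitter=JITTER):
    """Randomize each key's TTL so a warmed batch does not expire all at once."""
    return base + random.randint(0, jitter)

def warm_batch(cache, items):
    # In real Redis this would be SETEX key ttl value per entry.
    for key, value in items.items():
        cache[key] = (value, ttl_with_jitter())
```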

Cache avalanche

Cache avalanche means that once the cache layer fails or can no longer cope, traffic stampedes into the back-end storage layer like a herd of fleeing bison.
Because the cache layer carries most requests, it effectively shields the storage layer. But if the cache layer cannot serve for some reason (for example, concurrency so extreme the cache cannot support it, or poor cache design, such as massive traffic to a bigkey that sharply reduces the concurrency the cache can sustain), the requests pour into the storage layer, its call volume surges, and the storage layer can go down in a cascade.
To prevent and solve the cache avalanche problem, you can start from the following three aspects.
1) Ensure high availability of cache layer services, such as using Redis Sentinel or Redis Cluster.
2) Rely on isolation components to rate-limit traffic and degrade the backend, for example rate-limiting and degradation components such as Alibaba Sentinel or Hystrix.
For example, under service degradation we can treat different data differently. When the application accesses non-core data (such as e-commerce product attributes or user information), it temporarily stops querying the cache and directly returns a predefined default, a null value, or an error message; when it accesses core data (such as e-commerce product inventory), it is still allowed to query the cache and, on a miss, to continue reading from the database.
3) Rehearse in advance. Before the project goes live, drill what happens to the application and the back-end load after the cache layer goes down, identify the likely problems, and prepare contingency plans accordingly.
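The core/non-core split in point 2) can be sketched as a single read path. This is a hedged illustration: `degraded` would in practice come from a circuit breaker or configuration switch, and the dict-based `cache`/`load_from_db` stand in for Redis and the database.

```python
def get_with_degradation(key, *, is_core, degraded, cache, load_from_db,
                         default=None):
    """During a degradation event, serve non-core data from a predefined
    default; only core data may still hit the cache and the database."""
    if degraded and not is_core:
        return default            # non-core: fallback value, no cache/DB access
    value = cache.get(key)        # core data: try the cache first
    if value is None:
        value = load_from_db(key) # cache miss: fall back to the database
        if value is not None:
            cache[key] = value
    return value
```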

Hotspot cache key reconstruction optimization

Developers use the "cache + expiration time" strategy both to speed up reads and writes and to keep data periodically refreshed, and this model meets most needs. But two problems, if they occur at the same time, can do fatal harm to the application:
The current key is a hot key (for example, a piece of trending entertainment news) with very high concurrency, and rebuilding its cache entry cannot finish quickly because it may involve complex computation: complex SQL, multiple IOs, multiple dependencies, and so on. At the instant the cache expires, a large number of threads all try to rebuild it, which raises the back-end load and may even crash the application. The main fix is to stop many threads from rebuilding the cache simultaneously. We can use a mutex lock: only one thread is allowed to rebuild the cache, while the other threads wait for it to finish and then re-read the value from the cache.
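A minimal sketch of the mutex approach, using a local `threading.Lock` as a stand-in for a distributed lock (in real Redis this would typically be `SET lock_key value NX EX ...`), with a dict as the cache and `rebuild` as the expensive recomputation:

```python
import threading
import time

class HotKeyCache:
    """Mutex-protected rebuild: only one thread recomputes an expired hot
    key; the others back off briefly and then re-read the cache."""
    def __init__(self, rebuild):
        self._cache = {}
        self._rebuild = rebuild        # expensive recomputation (SQL, IO, ...)
        self._lock = threading.Lock()  # stand-in for a Redis SETNX-style lock
        self.rebuild_count = 0

    def get(self, key):
        while True:
            if key in self._cache:
                return self._cache[key]
            # Try to become the single rebuilder (non-blocking, like SET ... NX).
            if self._lock.acquire(blocking=False):
                try:
                    if key not in self._cache:  # re-check after winning the lock
                        self._cache[key] = self._rebuild(key)
                        self.rebuild_count += 1
                    return self._cache[key]
                finally:
                    self._lock.release()
            time.sleep(0.01)  # losers wait briefly, then re-read the cache
```

Even with many concurrent readers of the same expired key, the expensive rebuild runs exactly once; everyone else gets the freshly cached value.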

Cache and database double-write inconsistency

Under high concurrency, operating the database and the cache at the same time can leave the two inconsistent.
1. Double-write inconsistency: two writers update the database in one order but update the cache in the opposite order, leaving the cache holding the older value.

2. Read-write concurrency inconsistency: a reader misses the cache, reads an old value from the database, and writes that stale value back into the cache after a concurrent writer has already updated the database and the cache.

Solutions:
1. For data with a low probability of concurrent access (such as per-user order data or user profile data), this problem hardly needs consideration, since inconsistency rarely arises. If it does occur, simply give cached entries an expiration time so that reads trigger an active refresh every so often.
2. Even if concurrency is very high, as long as the business can tolerate briefly inconsistent cached data (such as product names or product category menus), cache-plus-expiration still satisfies most caching needs.
3. If cache inconsistency cannot be tolerated, add read-write locks so that concurrent read-write and write-write operations queue in order, while read-read access remains effectively lock-free.
4. You can also use Alibaba's open-source Canal to update the cache promptly by subscribing to the database binlog. However, this introduces new middleware and increases system complexity.
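Point 3 can be sketched with a minimal reader-writer lock (Python's standard library has none, so a tiny one is built here; this is an illustration, not a distributed lock): writers update the database and then the cache atomically with respect to readers, and readers fill the cache under the write lock on a miss.

```python
import threading

class RWLock:
    """Minimal reader-writer lock: readers share, writers are exclusive."""
    def __init__(self):
        self._readers = 0
        self._readers_lock = threading.Lock()
        self._writer_lock = threading.Lock()

    def acquire_read(self):
        with self._readers_lock:
            self._readers += 1
            if self._readers == 1:
                self._writer_lock.acquire()  # first reader blocks writers

    def release_read(self):
        with self._readers_lock:
            self._readers -= 1
            if self._readers == 0:
                self._writer_lock.release()  # last reader lets writers in

    def acquire_write(self):
        self._writer_lock.acquire()

    def release_write(self):
        self._writer_lock.release()

class ConsistentCache:
    """DB writes and the matching cache update happen under one write lock,
    so readers never observe the cache lagging behind the database."""
    def __init__(self, db):
        self.db = db        # dict standing in for the database
        self.cache = {}     # dict standing in for Redis
        self.lock = RWLock()

    def write(self, key, value):
        self.lock.acquire_write()
        try:
            self.db[key] = value     # 1) update the database
            self.cache[key] = value  # 2) then the cache, before releasing
        finally:
            self.lock.release_write()

    def read(self, key):
        self.lock.acquire_read()
        try:
            if key in self.cache:
                return self.cache[key]   # read-read: shared lock, no queueing
        finally:
            self.lock.release_read()
        self.lock.acquire_write()        # miss: fill the cache exclusively
        try:
            value = self.db.get(key)
            if value is not None:
                self.cache[key] = value
            return value
        finally:
            self.lock.release_write()
```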

All of the above is aimed at read-heavy, write-light scenarios where adding a cache improves performance. If both writes and reads are heavy and stale cached data cannot be tolerated, there is no point adding a cache; operate the database directly. Of course, if the database cannot withstand the load, you can instead use the cache as the primary store for reads and writes and synchronize data to the database asynchronously, with the database serving only as a backup.
The data placed in a cache should be data without strict real-time and consistency requirements. Do not pile on over-design and control, increasing system complexity, just to use a cache while guaranteeing absolute consistency!

Origin blog.csdn.net/qq_43649937/article/details/134668192