Three caching problems frequently asked about in interviews, and their solutions

1. The reason for caching
As Internet systems have developed, the QPS they must handle has kept rising. Most systems today add a caching layer so that excessive requests do not operate on the database directly and create a system bottleneck, which greatly improves user experience and system stability.
2. Problems with caching
Although caching brings a real qualitative improvement to a system, it also introduces some problems that deserve attention.

2.1 Cache penetration

Cache penetration means querying data that definitely does not exist. Because the cache holds no entry for such data, the query goes straight to the database layer; seen from the system level, it penetrates the cache layer and reaches the db directly. Without the protection of the cache layer, queries for data that cannot exist are a danger to the system: if someone uses such non-existent keys to request the system frequently, or more accurately to attack it, every request reaches the database layer, which can overwhelm the db and bring the system down.

2.2 Solution

The industry's solutions to cache penetration are fairly mature; the most commonly used ones are the following:

Bloom filter: a structure similar to a hash table. All keys that could legitimately be queried are hashed into a bitmap, and the bitmap is consulted before the database query; if a key is not in it, the request is filtered out immediately, which reduces the pressure on the database layer. Guava provides a ready-made BloomFilter implementation.
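As a rough illustration, here is a minimal sketch of the idea using Guava's BloomFilter; the key names, expected insertion count, and false-positive rate are assumptions chosen for the example.

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

public class BloomFilterDemo {
    public static void main(String[] args) {
        // Pre-load the filter with every key that can legally exist,
        // e.g. all product ids currently in the database (illustrative keys).
        BloomFilter<String> filter = BloomFilter.create(
                Funnels.stringFunnel(StandardCharsets.UTF_8),
                1_000_000,   // expected number of keys
                0.01);       // acceptable false-positive rate
        filter.put("product:1001");
        filter.put("product:1002");

        String requestedKey = "product:9999";
        if (!filter.mightContain(requestedKey)) {
            // Definitely not in the database: reject before touching cache or db.
            System.out.println("rejected by bloom filter: " + requestedKey);
        } else {
            // Possibly exists (false positives can occur): fall through to cache/db lookup.
            System.out.println("pass to cache/db lookup: " + requestedKey);
        }
    }
}
```

A Bloom filter can report false positives but never false negatives, which is why a "not present" answer can safely skip the database.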

Null value cache: a relatively simple solution. After the first query for non-existent data misses, the key and a corresponding null value are still put into the cache, but with a short expiration time, for example a few minutes, so that a burst of attacks on that key can be absorbed for a while. The expiration time is kept short because the value has no business meaning, and the query may not even come from an attacker, so long-term storage is unnecessary and the entry can expire early.
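A minimal sketch of null value caching, assuming a Redis client such as Jedis; the key format, the getUser/queryDatabase methods, and the sentinel value are illustrative, not part of the original article.

```java
import redis.clients.jedis.Jedis;

public class NullValueCache {
    private static final String NULL_PLACEHOLDER = "__NULL__"; // sentinel for "does not exist"
    private static final int NULL_TTL_SECONDS = 120;           // short expiry for null entries
    private static final int NORMAL_TTL_SECONDS = 3600;

    private final Jedis jedis = new Jedis("localhost", 6379);

    public String getUser(String userId) {
        String cacheKey = "user:" + userId;
        String cached = jedis.get(cacheKey);
        if (cached != null) {
            // Either a real value or the null placeholder; in both cases we skip the db.
            return NULL_PLACEHOLDER.equals(cached) ? null : cached;
        }
        String fromDb = queryDatabase(userId);
        if (fromDb == null) {
            // Cache the miss with a short TTL so repeated requests for this key stop at the cache.
            jedis.setex(cacheKey, NULL_TTL_SECONDS, NULL_PLACEHOLDER);
            return null;
        }
        jedis.setex(cacheKey, NORMAL_TTL_SECONDS, fromDb);
        return fromDb;
    }

    private String queryDatabase(String userId) {
        // Stand-in for the real database lookup.
        return null;
    }
}
```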

2.3 Cache Avalanche

In common caching systems such as redis or memcache, we set an expiration time on cached entries. If all entries share the same expiration time, then when they expire at the same moment every request in the system is sent to the database layer; the db may not withstand such pressure, and the system crashes.

2.4 Solution

Thread mutual exclusion: let only one thread rebuild the cache; the other threads wait for that thread to finish and then read from the cache again. Since only one thread executes the database request at a time, the pressure on the db is reduced, but the drawback is equally obvious: the QPS of the system drops.
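A minimal single-JVM sketch of this idea using a lock and a double check (the class and method names are illustrative); a distributed deployment would typically use a distributed lock, for example built on Redis SETNX, instead.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class MutexCacheLoader {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final ReentrantLock rebuildLock = new ReentrantLock();

    public String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;
        }
        rebuildLock.lock(); // only one thread rebuilds; the rest wait here
        try {
            // Double-check: another thread may have rebuilt the entry while we waited.
            value = cache.get(key);
            if (value == null) {
                value = loadFromDatabase(key);
                cache.put(key, value);
            }
            return value;
        } finally {
            rebuildLock.unlock();
        }
    }

    private String loadFromDatabase(String key) {
        return "value-for-" + key; // stand-in for the real db query
    }
}
```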

Staggered invalidation times: this method is simple and crude. Since expiring at the same time causes an avalanche of requests, we can largely avoid the problem by staggering the expiration times: when setting a cache entry's expiration time, add a random value drawn from an appropriate range.
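A small sketch of this TTL jitter; the base TTL and jitter range are arbitrary example values.

```java
import java.util.concurrent.ThreadLocalRandom;

public class StaggeredTtl {
    private static final int BASE_TTL_SECONDS = 3600;   // base expiry
    private static final int MAX_JITTER_SECONDS = 600;  // up to 10 minutes of random jitter

    // Returns a TTL spread over [BASE_TTL, BASE_TTL + MAX_JITTER) so keys
    // written at the same moment do not all expire at the same moment.
    public static int ttlWithJitter() {
        return BASE_TTL_SECONDS + ThreadLocalRandom.current().nextInt(MAX_JITTER_SECONDS);
    }
}
```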

2.5 Cache breakdown

Cache breakdown is in fact a special case of cache avalanche. Anyone who has used Weibo knows about its trending-topics feature: at certain moments the search volume for a hot topic is far higher than for other topics, and such data becomes a "hotspot" of the system. Because the cached copies of these hotspots also have expiration times, a flood of requests may still be arriving when a hotspot's cache expires, and without the protection of the cache layer those requests reach the db and may cause a failure. The difference between breakdown and avalanche is that breakdown concerns specific hot data, while avalanche concerns all data.

2.6 Solution

Second-level cache: keep hot data in a second-level cache and set different expiration times for the different levels, so that requests cannot penetrate every cache layer at once and reach the database directly.
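A sketch of one way to arrange such a two-level lookup, with an in-process map as level 1 and Redis as level 2; the TTL values, key format, and loadFromDatabase stand-in are assumptions for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import redis.clients.jedis.Jedis;

// Two-level lookup: an in-process map with a short expiry sits in front of
// Redis with a longer expiry, so both levels rarely miss at the same moment.
public class TwoLevelCache {
    private static final long LOCAL_TTL_MILLIS = 60_000;  // level 1: 1 minute
    private static final int REMOTE_TTL_SECONDS = 600;    // level 2: 10 minutes

    private static final class Entry {
        final String value;
        final long expiresAt;
        Entry(String value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<String, Entry> local = new ConcurrentHashMap<>();
    private final Jedis jedis = new Jedis("localhost", 6379);

    public String get(String key) {
        Entry entry = local.get(key);
        if (entry != null && entry.expiresAt > System.currentTimeMillis()) {
            return entry.value;                            // level-1 hit
        }
        String remote = jedis.get(key);                    // level-2 lookup
        if (remote == null) {
            remote = loadFromDatabase(key);                // both levels missed
            jedis.setex(key, REMOTE_TTL_SECONDS, remote);
        }
        local.put(key, new Entry(remote, System.currentTimeMillis() + LOCAL_TTL_MILLIS));
        return remote;
    }

    private String loadFromDatabase(String key) {
        return "value-for-" + key;                         // stand-in for the real db query
    }
}
```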

Here we can refer to Ali's Double 11 trillion-traffic solution to cache breakdown. The key to the problem is hotspot access: since hotspots change over time, caching a fixed set of data specially cannot cure the problem, but combining this with the LRU algorithm helps solve it much better.

The LRU (Least Recently Used) algorithm evicts data based on its access history; its core idea is that "if data has been accessed recently, it is more likely to be accessed again in the future." The most common implementation stores the cached data in a linked list, as shown in the following figure.

[Figure: the linked-list structure of an LRU cache]

This linked list is our cache structure, and it is processed as follows:

First, new data is inserted at the head of the linked list.

Whenever data already in the list is accessed again, that is, a request hits it once more, it is moved to the head of the list, because relative to the other entries it is likely hot data and is worth keeping longer.

Finally, when the list is full, the data at the tail is evicted, i.e. the data that is accessed least often.
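A compact way to realize exactly these steps in Java is a LinkedHashMap in access order, which moves an entry on every access and evicts the eldest entry once capacity is exceeded; this is a minimal sketch, not the exact structure from the figure above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache: with accessOrder = true, every get/put moves the entry to
// the "most recently used" end, and removeEldestEntry evicts the least recently
// used entry once capacity is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(capacity, 0.75f, true); // accessOrder = true
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");        // "a" becomes most recently used
        cache.put("c", "3");   // evicts "b", the least recently used entry
        System.out.println(cache.keySet()); // [a, c]
    }
}
```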

The LRU-K algorithm: the algorithm above is in fact a special case of it, namely LRU-1. The basic algorithm has some weaknesses and is refined in practice, for example because occasional one-off accesses distort it and lower the hit rate. Suppose an entry is about to reach the tail and be evicted, but a single request moves it back to the head, and after that it is never requested again; keeping it around is clearly unreasonable. LRU-K handles this case better. Its structure is as follows:

[Figure: the queue structure of the LRU-K algorithm]

LRU-K maintains one or more extra queues to record the access history of all cached data. Only when a piece of data has been accessed K times is it put into the cache. When data needs to be evicted, LRU-K evicts the entry whose K-th most recent access is furthest in the past.

The first step is to add new data to the head of the first (history) queue.

If the data is not accessed K times while in that queue (the value of K is chosen according to the system's QPS), it drifts to the tail of the list and is eventually evicted; if it is accessed K times while in the queue, it is promoted into the next-level linked list (how many levels to use is likewise decided by analysing the system), where entries are ordered by time.

The next-level linked list works like the basic algorithm above: if an entry in it is accessed again it is moved to the head, and when the list is full the entry at the tail is evicted.
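The following is a deliberately simplified sketch in that spirit (K = 2), built from two access-ordered maps: a history queue that counts accesses and a cache that only admits keys seen K times. A full LRU-K implementation would also evict based on the time of the K-th access, as described above; all names and sizes here are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class LruKCache<K, V> {
    private static final int K = 2;

    private final Map<K, Integer> historyHits; // access counts, LRU-evicted
    private final Map<K, V> cache;             // promoted hot data, LRU-evicted

    public LruKCache(int capacity) {
        this.historyHits = new LinkedHashMap<K, Integer>(16, 0.75f, true) {
            @Override protected boolean removeEldestEntry(Map.Entry<K, Integer> e) {
                return size() > capacity * 2; // keep the history a bit larger than the cache
            }
        };
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override protected boolean removeEldestEntry(Map.Entry<K, V> e) {
                return size() > capacity;
            }
        };
    }

    public V get(K key, Function<K, V> loader) {
        V hit = cache.get(key);
        if (hit != null) {
            return hit; // already promoted: a normal LRU hit
        }
        int hits = historyHits.merge(key, 1, Integer::sum);
        V value = loader.apply(key); // load from the underlying store
        if (hits >= K) {
            // Seen K times: promote into the cache and drop the history entry.
            historyHits.remove(key);
            cache.put(key, value);
        }
        return value;
    }
}
```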

Compared with LRU, LRU-K has to maintain an extra queue recording the access history of all cached data, so it needs more memory to build the cache, but the advantage is equally obvious: it reduces cache pollution and raises the hit rate, trading some hardware cost for system performance. There are of course more elaborate cache structures and algorithms, such as Two Queues and Multi Queue, which readers can explore starting from the LRU algorithm; this article does not go into them and only aims to offer a direction.

Reprinted from: https://gper.club/articles/7e7e7f7ff4g5bgc7g6f
