Java interview questions: Three major cache problems and solutions!

Written technical questions are an unavoidable part of Java interviews. Many candidates present themselves very well in conversation, yet their Java fundamentals turn out to be shaky. So today, for the written-test portion, the editor summarizes the three major caching problems that come up in Java interviews, along with their solutions.

1. Why caching is used

As Internet systems have evolved and their QPS has grown, most systems now add a caching layer so that excessive requests do not hit the database directly and create bottlenecks. This greatly improves both user experience and system stability.

2. Problems introduced by caching

Although caching brings a qualitative improvement to the system, it also introduces some problems that deserve attention.

2.1 Cache penetration

Cache penetration refers to querying data that definitely does not exist. Because the cache holds no entry for such data, the query goes straight to the database layer; viewed from the system, the request appears to penetrate the cache and reach the DB directly, hence the name. Without the protection of the cache layer, queries for non-existent data can be dangerous to the system. If someone maliciously uses such non-existent keys to flood the system with requests (more accurately, to attack it), every one of those requests reaches the database layer, which can paralyze the DB and bring the system down.

2.2 Solutions

Industry solutions to cache penetration are fairly mature; the most commonly used ones are the following:

Bloom filter: a probabilistic structure similar to a hash table. All keys that can legitimately be queried are used to populate a bitmap, and every request is checked against this bitmap before the database is queried; if the key is definitely not in it, the request is filtered out directly, reducing the pressure on the database layer. Guava ships a ready-made BloomFilter implementation, as sketched below.
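For example, with Guava the filter can be built and consulted like this (a minimal sketch; the key format and sizing numbers are made up for illustration):

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

public class BloomFilterDemo {
    public static void main(String[] args) {
        // Expect up to 1,000,000 keys with roughly a 1% false-positive rate.
        BloomFilter<String> filter = BloomFilter.create(
                Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

        // Pre-load the filter with every key that actually exists (e.g. all user IDs).
        filter.put("user:1001");
        filter.put("user:1002");

        String requestedKey = "user:9999";
        if (!filter.mightContain(requestedKey)) {
            // Definitely not in the data set: reject without touching cache or DB.
            System.out.println("blocked: " + requestedKey);
        } else {
            // Possibly present (false positives can happen): fall through to cache/DB lookup.
            System.out.println("pass through: " + requestedKey);
        }
    }
}
```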

Null value cache: a relatively simple solution. The first time a non-existent key is queried, the key and a corresponding null (or sentinel) value are also placed in the cache, but with a short expiration time, for example a few minutes. This absorbs a burst of requests for that key in a short period of time. The expiration time is kept short because the value carries little business meaning, the query may not even come from an attacker, and there is no need to keep it for long, so it can expire earlier. A sketch of the idea follows.
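A minimal sketch, assuming a Redis cache accessed through the Jedis client; the sentinel value, TTLs and method names are illustrative, not a specific project's API:

```java
import redis.clients.jedis.Jedis;

public class NullValueCacheDemo {
    private static final String NULL_SENTINEL = "__NULL__"; // marks "key does not exist in the DB"
    private static final int NULL_TTL_SECONDS = 300;        // short TTL, e.g. five minutes

    private final Jedis jedis = new Jedis("localhost", 6379);

    public String query(String key) {
        String cached = jedis.get(key);
        if (cached != null) {
            // Either a real value or the null sentinel; both avoid hitting the DB.
            return NULL_SENTINEL.equals(cached) ? null : cached;
        }
        String fromDb = queryDatabase(key);
        if (fromDb == null) {
            // Cache the "not found" result briefly so repeated misses stop at the cache.
            jedis.setex(key, NULL_TTL_SECONDS, NULL_SENTINEL);
            return null;
        }
        jedis.setex(key, 3600, fromDb);  // normal entries get a longer TTL
        return fromDb;
    }

    private String queryDatabase(String key) {
        // Placeholder for the real database lookup.
        return null;
    }
}
```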

2.3 Cache avalanche

In common cache systems such as Redis or Memcache, we set an expiration time on cached entries. If a large number of entries share the same expiration time, then the moment they all expire, every request falls through to the database layer at once, and the DB may not withstand that much pressure, causing the system to crash.

2.4 Solutions

Thread mutual exclusion: only one thread is allowed to rebuild the cache; the other threads wait until that thread finishes and then read the data from the cache again. Because only one thread executes the expensive request at a time, the pressure on the DB is reduced, but the drawback is just as obvious: system QPS drops. A sketch follows.
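A minimal sketch of "only one thread rebuilds the cache", using double-checked locking around the DB load; the class and method names are illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MutexCacheLoader {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Object rebuildLock = new Object();

    public String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;                      // cache hit, no lock needed
        }
        synchronized (rebuildLock) {
            // Double-check: another thread may have rebuilt the entry while we waited.
            value = cache.get(key);
            if (value == null) {
                value = loadFromDatabase(key); // only one thread at a time reaches the DB here
                cache.put(key, value);
            }
        }
        return value;
    }

    private String loadFromDatabase(String key) {
        return "value-for-" + key;             // placeholder for the real DB query
    }
}
```

A single global lock is what caps QPS here, which is exactly the drawback the text mentions; per-key locks are a common refinement.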

Staggered expiration time: this approach is relatively simple and crude. Since simultaneous expiration is what triggers the avalanche of requests, we can largely avoid the problem by staggering expiration times: when setting the expiration time of a cache entry, add a random offset drawn from an appropriate range, as in the sketch below.
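A minimal sketch of randomized TTLs; the base TTL and jitter range are illustrative values:

```java
import java.util.concurrent.ThreadLocalRandom;

public class StaggeredTtl {
    // Base TTL of one hour plus a random jitter of up to 10 minutes.
    private static final int BASE_TTL_SECONDS = 3600;
    private static final int MAX_JITTER_SECONDS = 600;

    public static int nextTtlSeconds() {
        int jitter = ThreadLocalRandom.current().nextInt(MAX_JITTER_SECONDS + 1);
        return BASE_TTL_SECONDS + jitter;
    }

    public static void main(String[] args) {
        // Each entry gets a slightly different TTL, so entries do not all expire at once.
        for (int i = 0; i < 3; i++) {
            System.out.println("ttl=" + nextTtlSeconds() + "s");
        }
    }
}
```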

2.5 Cache breakdown

Cache breakdown is really a special case of cache avalanche. Anyone who has used Weibo knows its trending-topics feature: at certain moments, searches for a hot topic vastly outnumber searches for everything else, and such topics become the "hot spots" of the system. Because the cached data for these hot spots also has an expiration time, the moment a hot key expires there may still be a huge number of requests arriving, and without the protection of the cache layer they all reach the DB and may cause a failure. The difference between breakdown and avalanche is that breakdown concerns specific hot keys, while an avalanche concerns all data.

2.6 Solutions

Second-level cache: keep hot data in two cache levels and give each level a different expiration time, so requests do not penetrate straight through the cache layer to the database. A minimal two-level lookup might look like the sketch below.
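A minimal sketch, assuming an in-process L1 map sitting in front of a remote L2 store (represented here by another map); TTL handling is omitted for brevity and none of the names refer to a specific library:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TwoLevelCache {
    private final Map<String, String> l1 = new ConcurrentHashMap<>(); // local cache, short TTL in practice
    private final Map<String, String> l2 = new ConcurrentHashMap<>(); // stand-in for Redis etc., longer TTL

    public String get(String key) {
        String value = l1.get(key);
        if (value != null) {
            return value;                 // L1 hit
        }
        value = l2.get(key);
        if (value != null) {
            l1.put(key, value);           // backfill L1 so the next read stays local
            return value;                 // L2 hit
        }
        value = loadFromDatabase(key);    // only reached when both levels miss
        l2.put(key, value);
        l1.put(key, value);
        return value;
    }

    private String loadFromDatabase(String key) {
        return "value-for-" + key;        // placeholder for the real DB query
    }
}
```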

Here we can refer to Alibaba's cache-breakdown solution for the trillion-level traffic of Double 11. The key to the problem lies in hot-spot access. Since hot spots change over time, a special cache for a fixed set of data cannot solve it; combining the cache with the LRU algorithm helps a great deal. So what is LRU? A rough introduction follows.

The LRU (Least Recently Used) algorithm evicts data based on its access history. Its core idea is that "if data has been accessed recently, the probability that it will be accessed again in the future is higher". The most common implementation keeps the cached data in a linked list, as shown in the figure below.

(Figure: LRU cache implemented as a linked list)

This linked list is our cache structure, and it is processed in the following steps (a minimal Java sketch follows the list):

First, new data is inserted at the head of the linked list.

Whenever data already in the linked list is accessed again, that is, a request hits it once more, it is moved to the head of the list, because compared with other data it is likely to be hot and therefore worth keeping longer.

Finally, when the linked list is full, the data at the tail is evicted, which is exactly the data that is accessed least often.
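In Java this behavior is commonly reproduced with an access-ordered LinkedHashMap rather than a hand-rolled linked list (a minimal sketch, not the article's own code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** A minimal LRU cache built on an access-ordered LinkedHashMap. */
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        // accessOrder = true: get() moves the entry to the "most recently used" end.
        super(capacity, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // When the cache is over capacity, drop the least recently used entry.
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");          // "a" becomes the most recently used entry
        cache.put("c", "3");     // evicts "b", the least recently used
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```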

LRU-K algorithm: the algorithm above is in fact a special case of it, namely LRU-1. Plain LRU has some weaknesses that were addressed in practice. For example, occasional accesses can pollute the cache and lower the hit rate: a piece of data that is near the tail and about to be evicted gets moved back to the head by a single request, and then never receives another request; keeping it in the cache is clearly unreasonable. LRU-K handles this situation much better. Its structure is as follows:

LRU-K maintains one or more additional queues that record the access history of all cached data. Only when a piece of data has been accessed K times is it actually placed into the cache. When data needs to be evicted, LRU-K evicts the entry whose K-th most recent access lies furthest in the past.

In the first step, newly added data is still placed at the head of the first (history) queue.

If data in the history queue is not accessed K times (the value of K is chosen according to the system's QPS), it keeps sliding toward the tail of the list until it is eventually evicted. If the data is accessed K times while still in the queue, it is promoted into the next, level-2 linked list (how many levels are actually needed is likewise decided by analysing the system), where it is ordered by access time.

The operations on the level-2 linked list are the same as in the plain LRU algorithm above: if data in the list is accessed again it moves to the head, and when the list is full the data at the tail is evicted. A simplified sketch of this promotion scheme follows.
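Below is a simplified sketch of the promotion idea, assuming a key is promoted from a history map into the LruCache from the previous sketch only after K accesses; a full LRU-K implementation would also evict by the K-th access timestamp and bound the history queue, which this sketch deliberately omits:

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified LRU-K style cache: a key only enters the real cache after K accesses. */
public class LruKCache<K, V> {
    private final int k;                                      // promotion threshold
    private final Map<K, Integer> history = new HashMap<>();  // pre-promotion access counts (unbounded here)
    private final LruCache<K, V> cache;                       // the LRU cache from the previous sketch

    public LruKCache(int k, int capacity) {
        this.k = k;
        this.cache = new LruCache<>(capacity);
    }

    public V get(K key) {
        return cache.get(key);                                // hit refreshes recency, miss returns null
    }

    /** Record an access; the value is cached only once the key has been seen K times. */
    public void access(K key, V value) {
        if (cache.containsKey(key)) {
            cache.get(key);                                   // already cached: just refresh its LRU position
            return;
        }
        int count = history.merge(key, 1, Integer::sum);
        if (count >= k) {
            history.remove(key);
            cache.put(key, value);                            // promoted: managed by plain LRU from now on
        }
    }
}
```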

Compared with plain LRU, LRU-K must maintain an extra queue recording the access history of all cached data, so it needs more memory to build the cache. The benefit is just as obvious: it reduces cache pollution and improves the hit rate, trading a certain hardware cost for system performance. There are, of course, more elaborate cache structures such as Two Queues and Multi Queues that readers can study further; this article does not go into them and only offers readers one possible solution.
