What are cache avalanche and cache penetration?

Cache avalanche

Suppose a system receives 7000 requests per second at peak, and we rely on a cache to absorb that load. If a large number of cache entries expire at the same moment, or the cache server goes down, those requests fall straight through to the underlying database (such as MySQL). MySQL will not be able to withstand such a high volume of requests and will hang. Once the database hangs, the whole system hangs with it.


To summarize, a cache avalanche is triggered under these conditions:

  • High concurrency

  • The cache server is down

  • A large number of cache entries expire at the same time

The consequence is: the system crashes.

The solution is to add rate limiting and queue access to the database. Assuming our database can withstand at most 2000 requests per second, the requests that reach it must be kept within 2000 per second, and the rest must wait in a queue. We also need to deploy the cache as a cluster to keep the cache server highly available, and to randomize the expiration time of cache keys so that they do not all expire at the same time.
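The randomized expiration time can be sketched as follows. This is only a minimal illustration, not the original author's code: the class name TtlJitter, the method withJitter, and the 30-minute and 5-minute values in the usage note are assumptions made for the example.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper: spreads expirations out by adding random jitter to a base TTL. */
final class TtlJitter {
    private TtlJitter() {}

    /** Returns baseTtl plus a random extra between 0 and maxJitter (inclusive, in milliseconds). */
    static Duration withJitter(Duration baseTtl, Duration maxJitter) {
        if (baseTtl.isNegative() || maxJitter.isNegative()) {
            throw new IllegalArgumentException("TTL and jitter must not be negative");
        }
        // nextLong(bound) returns a value in [0, bound), so add 1 to include maxJitter itself.
        long extraMillis = ThreadLocalRandom.current().nextLong(maxJitter.toMillis() + 1);
        return baseTtl.plusMillis(extraMillis);
    }
}
```

For example, TtlJitter.withJitter(Duration.ofMinutes(30), Duration.ofMinutes(5)) gives each key a lifetime somewhere between 30 and 35 minutes, so keys written in the same batch no longer expire in the same instant.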

Figure: Avalanche resolution

For rate limiting, you can use the rate-limiting components available for Spring Cloud. Here we use a Semaphore to simulate rate limiting.


Here we allow at most 5 requests to access the resource at the same time. When 9 requests arrive, only 5 of them obtain a permit, and the remaining 4 wait outside. When an earlier request finishes, it releases its permit, and the 4 waiting requests compete for the released permits. This repeats until every request has run.

This is how the Semaphore acts as a rate limiter.
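The 5-permit, 9-request scenario can be sketched like this. It is a reconstruction under assumptions rather than the original sample: the class name SemaphoreLimitDemo, the run method, and the 50 ms sleep standing in for a database query are all made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo: at most `permits` simulated requests run at the same time. */
final class SemaphoreLimitDemo {

    /** Fires `requests` simulated requests and returns the highest concurrency observed. */
    static int run(int permits, int requests) throws InterruptedException {
        Semaphore semaphore = new Semaphore(permits);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(requests);
        ExecutorService pool = Executors.newFixedThreadPool(requests);

        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                try {
                    semaphore.acquire();              // wait outside until a permit is free
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(50);             // assumed stand-in for the database query
                    } finally {
                        inFlight.decrementAndGet();
                        semaphore.release();          // hand the permit to a waiting request
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }

        done.await();
        pool.shutdown();
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Peak concurrent requests: " + run(5, 9));
    }
}
```

The value returned by run never exceeds the number of permits, because a request only counts itself as in flight after acquire() succeeds and stops counting before release().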

Cache penetration

Suppose there is a key that will never exist in the cache. When a hacker attacks the system with this key, for example with 7000 requests per second, the cache is never hit. Every attack request lands directly on the database, which cannot hold up under that load. The database crashes, and the system crashes with it.

Figure: Cache penetration

Our solution is to check whether the target data exists before querying, and to ignore requests for data that does not exist. In other words, we intercept the traffic before it reaches the cache and the database.

Figure: Penetration solution

As shown in the figure above, we add a filter. Before the system starts, we load the hotspot data into the cache and register its keys in the filter. Every access to hotspot data is first checked by the filter. If the filter judges that the requested data does not exist, the request returns immediately and never reaches the database.

For the filter, we can use a Bloom filter.


A Bloom filter runs each key through its hash functions and sets the resulting positions in a bitmap to 1; all other positions stay 0. To test a key, it checks whether every position computed for that key is 1.
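The idea of hashing a key into bitmap positions can be sketched with a small hand-written filter. This is a minimal illustration, not the original sample code and not a production library: the class name SimpleBloomFilter, the method names, and the choice of combining String.hashCode with an FNV-1a hash are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical minimal Bloom filter backed by a java.util.BitSet. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;       // number of bits in the bitmap
    private final int hashCount;  // number of positions set per key

    SimpleBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** Marks every position derived from the key as 1. */
    void put(String key) {
        int h1 = key.hashCode();
        int h2 = fnv1a(key);
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(h1, h2, i));
        }
    }

    /** false means the key was definitely never added; true means it may have been added. */
    boolean mightContain(String key) {
        int h1 = key.hashCode();
        int h2 = fnv1a(key);
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(h1, h2, i))) {
                return false;
            }
        }
        return true;
    }

    // Derives the i-th position from two base hashes (double hashing).
    private int index(int h1, int h2, int i) {
        return Math.floorMod(h1 + i * h2, size);
    }

    // 32-bit FNV-1a over the UTF-8 bytes of the key, used as the second base hash.
    private static int fnv1a(String key) {
        int hash = 0x811C9DC5;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xFF);
            hash *= 16777619;
        }
        return hash;
    }
}
```

In the flow described above, the keys of the preloaded data would be passed to put at startup, and each incoming request would be rejected early whenever mightContain returns false.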

Figure: bitmap

However, the above algorithm has flaws:

A Bloom filter cannot filter exactly. If it reports that a key does not exist, the key definitely does not exist; if it reports that a key exists, the key may still be absent. This is because hash values can collide (different inputs produce the same hash value), so an element that was never added may be judged to exist.

Of course, the Bloom filter does not need to intercept every invalid request; it only needs to keep the traffic that penetrates to the database within a manageable amount.

Regarding cache penetration, we can also use the following solutions:

When a key cannot be found in the cache and cannot be found in the database either, we can still write it to the cache as a key-null pair with a short expiration time, such as 30 seconds (if the expiration time is too long, the key stays unusable even after the data is created). This prevents an attacker from repeatedly hitting the database with the same id.
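The key-null approach can be sketched with an in-memory map standing in for the cache. This is a simplified illustration under assumptions: the class NullCachingLookup, the Function used as the database lookup, and the two TTL parameters are hypothetical names, and a real system would use its actual cache server instead of a ConcurrentHashMap.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical lookup that also caches "not found" results for a short time. */
final class NullCachingLookup {

    private static final class Entry {
        final String value;          // null marks "not found in the database"
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<String, String> database; // returns null when the row does not exist
    private final long hitTtlMillis;                 // lifetime of real values
    private final long missTtlMillis;                // short lifetime of key-null entries

    NullCachingLookup(Function<String, String> database, long hitTtlMillis, long missTtlMillis) {
        this.database = database;
        this.hitTtlMillis = hitTtlMillis;
        this.missTtlMillis = missTtlMillis;
    }

    Optional<String> get(String key) {
        long now = System.currentTimeMillis();
        Entry cached = cache.get(key);
        if (cached != null && cached.expiresAtMillis > now) {
            // Served from the cache, including the cached "not found" answer.
            return Optional.ofNullable(cached.value);
        }
        String loaded = database.apply(key);
        long ttl = (loaded == null) ? missTtlMillis : hitTtlMillis;
        cache.put(key, new Entry(loaded, now + ttl));
        return Optional.ofNullable(loaded);
    }
}
```

With missTtlMillis set to 30000, repeated requests for the same missing id query the database at most once per 30 seconds, while a record created later becomes visible as soon as the key-null entry expires.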

However, this scheme also has a flaw: a hacker may access many different non-existent keys within a short time, and those requests still reach the database. In practice, the approach needs to be tuned to the scenario.

Cache avalanche and cache penetration usually show up under high concurrency, and they are frequent interview topics at Internet companies. Once you have mastered them, I believe you will have entered a new world.


Origin blog.csdn.net/wujialv/article/details/115325208