Four hard problems in Redis: cache penetration, cache breakdown, cache avalanche, and distributed locks

1. The concept of cache

2. Cache avalanche

3. Cache penetration

4. Cache breakdown

5. Distributed lock


1. The concept of cache

Cache in the broad sense
The first time you load data that may be reused, you also store it in a designated place; the next time, you fetch it from that place instead of the original source. The prerequisite for adding a cache is that reading from this place is much faster than reading from the data source.
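This load-then-store pattern is often called cache-aside. A minimal in-JVM sketch (plain Java; loadFromDatabase is a hypothetical stand-in for the slow data source):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheAside {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public String load(String key) {
        String value = cache.get(key);      // 1. try the fast store first
        if (value == null) {
            value = loadFromDatabase(key);  // 2. miss: go to the slow data source
            cache.put(key, value);          // 3. save it for the next access
        }
        return value;
    }

    // Hypothetical stand-in for a real database query.
    private String loadFromDatabase(String key) {
        return "value-for-" + key;
    }
}
```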
Cache in the narrow sense (Java)
1. In-JVM cache (Ehcache, JBoss Cache)
2. Distributed cache (Redis, Memcached)
3. Database cache

2. Cache avalanche

Causes of a cache avalanche
In plain terms, a cache avalanche happens when the original cache expires (or the data was never loaded into the cache): until the new cache is built, every request that should have hit the cache goes straight to the database, putting enormous pressure on the database's CPU and memory. In severe cases the database goes down and the system crashes.
Solutions
1: After the cache expires, use locking or queueing to control the number of threads that read the database and rebuild the cache.
For example, allow only one thread to query the data and write the cache for a given key while the other threads wait. This relieves pressure on the database to some extent, but it also reduces system throughput. (A sketch combining this with solution 2 follows the list.)
Note: lock queueing only reduces pressure on the database; it does not improve throughput. Under high concurrency, if a key is locked while its cache is rebuilt, 999 out of 1000 requests are blocked, and users may time out waiting. It is a stopgap, not a cure.

2: Analyze user behavior and set different expiration times for different keys, so that cache expirations are spread as evenly as possible over time.
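A minimal sketch combining both ideas, using the Jedis client (class and method names are illustrative assumptions, not from the original post; in production you would take connections from a JedisPool, since a single Jedis instance is not thread-safe):

```java
import java.util.concurrent.ThreadLocalRandom;
import redis.clients.jedis.Jedis;

public class AvalancheSafeCache {
    private static final int BASE_TTL_SECONDS = 600;

    private final Jedis jedis;                        // assumed exclusively owned; pool in production
    private final Object rebuildLock = new Object();  // solution 1: one rebuilder per JVM

    public AvalancheSafeCache(Jedis jedis) {
        this.jedis = jedis;
    }

    public String get(String key) {
        String value = jedis.get(key);
        if (value != null) {
            return value;
        }
        synchronized (rebuildLock) {                  // other threads queue here
            value = jedis.get(key);                   // double-check after acquiring the lock
            if (value != null) {
                return value;
            }
            value = loadFromDatabase(key);
            // Solution 2: jitter the TTL so keys do not all expire at the same moment.
            int ttl = BASE_TTL_SECONDS + ThreadLocalRandom.current().nextInt(300);
            jedis.setex(key, ttl, value);
            return value;
        }
    }

    // Hypothetical stand-in for a real database query.
    private String loadFromDatabase(String key) {
        return "value-for-" + key;
    }
}
```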

3. Cache penetration

Cache penetration
Cache penetration occurs when a user queries data that does not exist in the database, and therefore cannot exist in the cache either. The cache can never be hit, so every such request falls through to the database, which then returns empty. The requests bypass the cache and hit the database directly; this is the frequently mentioned cache hit-rate problem.
Solutions
1. If the database query also returns empty, store a default value in the cache, so that the next lookup finds the value in the cache instead of hitting the database again. This is the simplest and bluntest approach.
2. Cache the empty result, so the same request returns empty from the cache next time; this avoids the penetration caused by empty query results. You can also set up a separate cache area for null values, pre-check the queried key there, and only then let it through to the normal cache-handling logic.
Note: when the real value for a key is later written, the corresponding empty cache entry must be cleared first.
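A minimal sketch of null-value caching with Jedis (the sentinel string, TTL values, and method names are illustrative assumptions):

```java
import redis.clients.jedis.Jedis;

public class PenetrationSafeCache {
    private static final String NULL_MARKER = "<NULL>"; // sentinel meaning "known to be absent"
    private static final int NULL_TTL_SECONDS = 60;     // keep empty results only briefly
    private static final int VALUE_TTL_SECONDS = 600;

    private final Jedis jedis;

    public PenetrationSafeCache(Jedis jedis) {
        this.jedis = jedis;
    }

    public String get(String key) {
        String cached = jedis.get(key);
        if (NULL_MARKER.equals(cached)) {
            return null;                                // known-missing: skip the database
        }
        if (cached != null) {
            return cached;
        }
        String value = loadFromDatabase(key);
        if (value == null) {
            jedis.setex(key, NULL_TTL_SECONDS, NULL_MARKER); // cache the empty result
            return null;
        }
        jedis.setex(key, VALUE_TTL_SECONDS, value);
        return value;
    }

    /** Writing a real value: clear any stale empty marker first, per the note above. */
    public void put(String key, String value) {
        jedis.del(key);
        jedis.setex(key, VALUE_TTL_SECONDS, value);
    }

    // Hypothetical stand-in for a real database query; returns null when the row is absent.
    private String loadFromDatabase(String key) {
        return null;
    }
}
```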

4. Cache breakdown

Some keys have an expiration time set but may be accessed with very high concurrency at certain points in time; this is very "hot" data. The problem to consider here is the cache being "broken down". The difference from a cache avalanche is that breakdown concerns a single key, while an avalanche concerns many keys.
Hot key
A key that is accessed extremely frequently. The moment it expires, a large number of threads rush to rebuild the cache at once, driving the load up and possibly crashing the system.
Solutions
(1) Use locks: synchronized, Lock, etc. on a single machine; a distributed lock in a distributed deployment.
(2) Do not set an expiration time on the cache entry itself; instead, store a logical expiration time inside the value for the key. When a read detects that the stored time has passed, update the cache asynchronously. (A sketch of this idea follows.)
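A minimal in-JVM sketch of approach (2), logical expiration: the expiry timestamp lives inside the cached value, exactly one background task refreshes it, and readers keep serving the stale value in the meantime. With Redis, the timestamp would be serialized into the value string; everything here is illustrative:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HotKeyCache {
    /** The logical expiry lives inside the value, so the entry itself never expires. */
    static final class Entry {
        final String value;
        final long expireAtMillis;
        Entry(String value, long expireAtMillis) {
            this.value = value;
            this.expireAtMillis = expireAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Boolean> rebuilding = new ConcurrentHashMap<>();
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    public String get(String key) {
        Entry e = cache.get(key);
        if (e == null) {
            return null; // first load is handled elsewhere
        }
        // Logically expired? Let exactly one thread trigger an async rebuild.
        if (e.expireAtMillis < System.currentTimeMillis()
                && rebuilding.putIfAbsent(key, Boolean.TRUE) == null) {
            pool.submit(() -> {
                try {
                    String fresh = loadFromDatabase(key);
                    cache.put(key, new Entry(fresh, System.currentTimeMillis() + 600_000));
                } finally {
                    rebuilding.remove(key);
                }
            });
        }
        return e.value; // serve the (possibly stale) value: no thundering herd on expiry
    }

    // Hypothetical stand-in for a real database query.
    private String loadFromDatabase(String key) {
        return "fresh-" + key;
    }
}
```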

5. Distributed lock

A distributed lock is needed when all of the following hold:
1. The system is a distributed system.
2. There is a shared resource (every node accesses the same resource, whose carrier may be a traditional relational database or a NoSQL store).
3. Access is concurrent (many processes access the same shared resource at the same time).
Distributed lock
When multiple processes are not on the same machine, a distributed lock controls their access to shared resources.
Thread lock
Mainly used to lock methods and code blocks. When a method or block holds the lock, only one thread executes that method or code segment at a time. Thread locks only work within the same JVM, because they are fundamentally implemented through memory shared between threads: for example, synchronized relies on the shared object header, while an explicit Lock relies on a shared variable (state).
Process lock
Controls access by multiple processes on the same operating system to a shared resource. Because processes are independent, one process cannot touch another process's memory, so a process lock cannot be implemented with thread-lock primitives such as synchronized.
Application scenarios
A distributed lock could also solve thread-level and process-level concurrency problems, but this is strongly discouraged: using a distributed lock on such small problems wastes resources. Distributed locks should be reserved for multi-process concurrency in distributed deployments.
Example
On a single machine (a single JVM), threads share memory, so a thread lock solves the concurrency problem.
In a distributed deployment (multiple JVMs), thread A and thread B are likely not in the same JVM, so thread locks no longer work and a distributed lock must be used.
Implementing a distributed lock with Redis's SETNX command
Redis executes commands in a single thread: concurrent accesses are queued and handled serially, so there is no race among the clients connected to Redis. This makes it easy to implement a distributed lock with the SETNX ("set if not exists") command.
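A minimal sketch with Jedis. Modern Redis favors the atomic SET key value NX PX ttl form (SETNX plus a separate EXPIRE can leak locks if the client dies in between), and the unlock uses a small Lua script so the check and delete are atomic. Class and method names are illustrative; a production lock needs more care (lease renewal, Redlock, etc.):

```java
import java.util.Collections;
import java.util.UUID;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedisLock {
    private final Jedis jedis;

    public RedisLock(Jedis jedis) {
        this.jedis = jedis;
    }

    /** Try to acquire: SET key token NX PX ttl. Returns the token on success, null on failure. */
    public String tryLock(String lockKey, long ttlMillis) {
        String token = UUID.randomUUID().toString();   // identifies this lock holder
        String reply = jedis.set(lockKey, token, SetParams.setParams().nx().px(ttlMillis));
        return "OK".equals(reply) ? token : null;
    }

    /** Release only if we still hold the lock; GET + DEL must be atomic, hence Lua. */
    public boolean unlock(String lockKey, String token) {
        String script =
            "if redis.call('get', KEYS[1]) == ARGV[1] then " +
            "  return redis.call('del', KEYS[1]) " +
            "else return 0 end";
        Object result = jedis.eval(script,
                Collections.singletonList(lockKey),
                Collections.singletonList(token));
        return Long.valueOf(1L).equals(result);
    }
}
```

Typical use: acquire with tryLock, run the critical section in a try block, and release the token in finally so the lock is freed even if the section throws.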

Origin: blog.csdn.net/qinluyu111/article/details/123698394