Caching is great, but it can really hurt too!


Using a cache is an evolutionary process. Ask yourself: what is the most direct reason for using a cache? In one word: speed!




Think back to the first time you used a cache: some rarely changing configuration information stored in the database was loaded into a shared local module when the service started, so other functional modules could use it. That is the most basic and simplest local cache application.



Service and cache



A service, in short, is one layer of application plus one layer of data: the application fetches data from the data layer, processes it, and returns the output.




The data layer usually refers to persistent storage on a persistent medium. It takes many forms: it can be a file or a database.




Data lives on persistent media, while applications run in memory. Memory and persistent media differ in speed by orders of magnitude, so a "tension" arises between applications and their data.




This "tension" is exactly what creates the urgent need for caching.



The cache we are talking about lives in memory, closer to the application, so the data the application needs can be obtained faster and the service can respond faster.



Of course, the cache does not completely cut the application off from the persistence layer. One term that always comes with caching is the hit rate.




When the data we query exists in the cache, we call it a "hit": the required data can be served directly from the cache.




Data that is not "hit" must pass through the cache layer and be fetched from the persistent data layer. This scenario is called cache penetration.




Once the data is fetched, we write it back into the cache so the next query can hit, and then return it to the application.
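To make this read path concrete, here is a minimal cache-aside sketch in Java using the Jedis client. The key layout `user:{id}`, the `loadUserFromDb` helper, and the 300-second TTL are hypothetical placeholders, not part of the original article.

```java
import redis.clients.jedis.Jedis;

public class UserCache {

    private final Jedis jedis = new Jedis("localhost", 6379); // assumed local Redis

    public String getUser(String userId) {
        String key = "user:" + userId;            // hypothetical key layout
        String cached = jedis.get(key);
        if (cached != null) {
            return cached;                        // cache hit: served directly from memory
        }
        // Cache miss: the request "penetrates" to the persistence layer
        String fromDb = loadUserFromDb(userId);
        if (fromDb != null) {
            jedis.setex(key, 300, fromDb);        // repopulate so the next query hits
        }
        return fromDb;
    }

    // Placeholder for the real persistence-layer lookup
    private String loadUserFromDb(String userId) {
        return "{\"id\":\"" + userId + "\",\"name\":\"demo\"}";
    }
}
```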




Of course, the above only covers the "read" scenario. When the application modifies data, the change must be applied to both the persistence layer and the cache layer.



At this point we face another problem: which to update "first" and which "second".


The problem of "first" and "after" is also called the cache coherency problem. If you update the cache first, you may face a failure to update the persistence layer, resulting in the problem of cached dirty data.




If the persistence layer is updated first, then in the window between that update succeeding and the cache being refreshed, the cache keeps serving stale data to the outside.




Cache consistency, especially in high-concurrency environments, requires finer-grained control tailored to the specific scenario: for example, locks around concurrent modification, or asynchronous and delayed cache refreshes.
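One common (but not the only) way to handle the "first or second" problem is to update the persistence layer first and then delete the cache entry, optionally deleting it again after a short delay to cover concurrent readers that may have repopulated stale data. A rough sketch, assuming the same Jedis setup and a hypothetical `updateUserInDb` helper:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import redis.clients.jedis.Jedis;

public class UserWriter {

    private final Jedis jedis = new Jedis("localhost", 6379);
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public void updateUser(String userId, String newJson) {
        String key = "user:" + userId;

        updateUserInDb(userId, newJson);   // 1. update the persistence layer first
        jedis.del(key);                    // 2. invalidate the cache entry

        // 3. optional "delayed double delete": remove any stale value written by a
        //    concurrent reader during the update window (the 1 s delay is an assumption)
        scheduler.schedule(() -> {
            try (Jedis j = new Jedis("localhost", 6379)) {
                j.del(key);
            }
        }, 1, TimeUnit.SECONDS);
    }

    private void updateUserInDb(String userId, String newJson) {
        // placeholder for the real database update
    }
}
```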




Cache and update



We mentioned the cache update consistency problem above. In terms of practical application scenarios, it can be subdivided into strong consistency requirements, weak consistency requirements, and eventual consistency requirements.



Strong consistency requirements



For example, transaction status information such as an order being placed, paid, or confirmed as paid requires us to update the associated data promptly and guarantee consistency at the transaction level.




Many approaches that emerged for this situation, including distributed transactions, give us solid, practical patterns for solving these problems.



Weak consistency requirements



Some scenarios that involve less important information can tolerate inconsistency between the persistence layer and the cache for a short time (for example, a few minutes).




Examples include descriptive information not exposed externally, cached statistics and counters, and so on. Asynchronous updates are usually sufficient here.




Other scenarios only need to serve fixed information over a short window (a few seconds or minutes), for example hotspot or fare information refreshed every 30 s. These can be handled by setting a cache timeout so entries expire automatically.
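For this kind of weak consistency, a plain TTL is usually enough. A minimal sketch using Jedis; the key name and payload are hypothetical:

```java
import redis.clients.jedis.Jedis;

public class HotspotCache {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // Refresh hotspot data and let it expire automatically after 30 seconds;
            // readers simply tolerate data that is at most 30 s old.
            jedis.setex("hotspot:top10", 30, "[\"item-1\",\"item-2\",\"item-3\"]");
        }
    }
}
```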



Eventual consistency requirements



Here we only guarantee that the data eventually converges to a consistent state.




Cache granularity



Granularity refers to the level and size of the information blocks we cache. The choice of cache granularity depends on the overall application architecture, the data storage design, and the specific scenario.




Take user information as an example: do we cache the frequently changing parts or the relatively static parts? Do we cache at the level of a single attribute, or the entire object?




The granularity also determines how we store cache entries: the whole object as binary serialized data, a more transparent and readable JSON string, or a one-to-one mapping of attributes to values?




Each form has its own advantages and disadvantages; developers should weigh them against application needs, storage footprint, and maintenance cost.
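The difference between object-level and attribute-level granularity is easy to see with Redis: the whole object can be stored as a single JSON string, or each attribute as a field of a hash. A sketch with hypothetical key and field names:

```java
import redis.clients.jedis.Jedis;

public class GranularityDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // Object granularity: one key holds the whole user as a JSON string.
            // Simple to read and write, but any change rewrites the entire value.
            jedis.set("user:42:json", "{\"name\":\"alice\",\"age\":\"30\",\"city\":\"Beijing\"}");

            // Attribute granularity: a hash maps each attribute to its value.
            // Individual fields can be read or updated without touching the rest.
            jedis.hset("user:42", "name", "alice");
            jedis.hset("user:42", "age", "30");
            jedis.hset("user:42", "city", "Beijing");

            String city = jedis.hget("user:42", "city"); // fetch just one attribute
            System.out.println(city);
        }
    }
}
```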




The harm of cache penetration



Earlier we mentioned what causes cache penetration: a cache miss. So why does a miss happen?



The data is temporarily absent from the cache



"Temporarily" can mean the data was never preloaded into the cache and is instead lazy-loaded on demand when a request arrives.



It can also mean the cached data was evicted, automatically or deliberately, by our expiration strategy. Common strategies include limits on element count, memory usage, and time to live.


Whether the data was never loaded, has expired, or was deleted, these are all normal, expected scenarios, so we will not dwell on them.



The data never existed



When a request queries data that simply does not exist, it inevitably passes through the cache and reaches the persistent storage layer.




Persistent storage has limited capacity to respond. Once such requests reach a certain volume, the service risks going down.




At this point, our understanding of the cache's role needs to expand: it also reduces the load underneath and protects back-end resources.




The causes of this kind of cache penetration can be roughly divided into internal and external: application logic problems on the inside, and malicious attacks, crawler traffic, and the like on the outside.




Internal problems are easy to handle: we can inspect and predict them, and straightforward optimization is enough. External causes, by contrast, are unpredictable and may require more careful, multi-layered defenses.




In fact, whether the cause is internal or external, there is only one thing to do at the cache level: intercept the penetration effectively.




The usual first instinct is to cache the keys that caused the penetration, regardless of whether they exist in persistent storage.




For example, for user data that was legitimately deleted, we can do a soft delete at the cache level and mark the entry with a status flag ("I exist, but I don't exist!"). This handles the penetration pressure from such cases well.
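A sketch of this idea: cache a short-lived placeholder for keys that turn out not to exist (or have been soft-deleted), so repeated lookups for the same key stop reaching the database. The marker value, key layout, and TTLs below are assumptions, not the article's prescription.

```java
import redis.clients.jedis.Jedis;

public class NullValueCache {

    private static final String NOT_FOUND = "__NOT_FOUND__"; // hypothetical marker value

    private final Jedis jedis = new Jedis("localhost", 6379);

    public String getUser(String userId) {
        String key = "user:" + userId;
        String cached = jedis.get(key);
        if (NOT_FOUND.equals(cached)) {
            return null;                     // "I exist, but I don't exist": intercepted in cache
        }
        if (cached != null) {
            return cached;
        }
        String fromDb = loadUserFromDb(userId);
        if (fromDb == null) {
            jedis.setex(key, 60, NOT_FOUND); // cache the absence, with a short TTL
        } else {
            jedis.setex(key, 300, fromDb);
        }
        return fromDb;
    }

    private String loadUserFromDb(String userId) {
        return null; // placeholder: pretend the row was deleted
    }
}
```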


However, we should be clear that it is abnormal data that really causes harm.


For example, if an attacker exhaustively enumerates keys that differ on every request and we cache each one, the only outcome is that cache resources overflow. That is a frightening scenario.


For this type of high-volume attack, interception with a Bloom filter may be a good choice.
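A sketch of Bloom filter interception using Guava's `BloomFilter` (an assumed dependency): all legitimate keys are registered in the filter, and requests for keys the filter has definitely never seen are rejected before they touch the cache or the database. False positives are possible, false negatives are not.

```java
import java.nio.charset.StandardCharsets;

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

public class BloomFilterGuard {

    // Sized for ~1M keys with a ~1% false-positive rate (both numbers are assumptions).
    private final BloomFilter<String> knownKeys =
            BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

    // Called when legitimate data is created or preloaded.
    public void register(String key) {
        knownKeys.put(key);
    }

    // Called before the cache/database lookup.
    public boolean mightExist(String key) {
        // false => the key was definitely never registered: reject the request outright
        return knownKeys.mightContain(key);
    }
}
```

If `mightExist` returns false, the request can be rejected immediately; if true, the normal cache-then-database path runs, accepting the small false-positive rate.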


Let's also talk about cache avalanches


In the previous section we talked about the cache's load-bearing, protective role: on one hand it responds quickly, and on the other it shields the persistence layer. In some read-heavy services, the cache carries upwards of 90% of requests.


However, if the cache cannot serve normally for some reason, every request penetrates to the persistent storage layer, which can be overwhelmed to the point of going down.

So, how should we deal with this situation? 


High availability


A highly available cache is the primary safeguard against cache avalanches: master-slave replication, read-write separation, dynamic scaling, consistent-hashing balancing, cross-region disaster recovery, and so on.


Practical examples include Redis Sentinel mode, cluster deployment, and so on.
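On the client side, connecting through Sentinel rather than to a fixed master lets the application follow failovers automatically. A minimal sketch with Jedis; the master name `mymaster` and the sentinel addresses are placeholders for your own deployment:

```java
import java.util.HashSet;
import java.util.Set;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisSentinelPool;

public class SentinelClient {
    public static void main(String[] args) {
        Set<String> sentinels = new HashSet<>();
        sentinels.add("10.0.0.1:26379");   // placeholder sentinel addresses
        sentinels.add("10.0.0.2:26379");
        sentinels.add("10.0.0.3:26379");

        // The pool asks the sentinels who the current master is and
        // reconnects transparently after a failover.
        try (JedisSentinelPool pool = new JedisSentinelPool("mymaster", sentinels);
             Jedis jedis = pool.getResource()) {
            jedis.setex("hotspot:top10", 30, "[]");
            System.out.println(jedis.get("hotspot:top10"));
        }
    }
}
```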

Service governance: rate limiting, circuit breaking, and degradation


What is service governance for? The stability of the service. Rate limiting controls abnormal traffic; circuit breaking and degradation protect core service resources.


The cache and the persistent data store are both resources, so we can tackle cache avalanche scenarios by applying flow control at the cache and circuit-breaking and degradation protection at the persistent data store.
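A rough illustration of the idea, using Guava's `RateLimiter` (an assumed dependency) as the flow-control element and a static fallback as the degradation path; the limit of 100 database lookups per second and the fallback payload are arbitrary assumptions:

```java
import com.google.common.util.concurrent.RateLimiter;

public class DegradingReader {

    // Allow at most ~100 requests per second through to the persistence layer.
    private final RateLimiter dbLimiter = RateLimiter.create(100.0);

    public String getUser(String userId) {
        String cached = lookupCache(userId);
        if (cached != null) {
            return cached;
        }
        if (dbLimiter.tryAcquire()) {
            return loadUserFromDb(userId);   // within the allowed flow: hit the database
        }
        // Degraded path: protect the database and return a safe default instead of failing hard.
        return "{\"status\":\"degraded\"}";
    }

    private String lookupCache(String userId)    { return null; } // placeholder
    private String loadUserFromDb(String userId) { return "{}"; } // placeholder
}
```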


Mass invalidation from cache entries expiring at the same time


If cache entries with expiration times all expire at the same moment, the external requests arriving at that instant go straight to the persistent storage layer.


In practice, we need to take measures to spread cache expiration times out evenly.
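A common way to do this is to add random jitter to each TTL so that entries written together do not all expire together. A minimal sketch; the base TTL and jitter range are assumptions:

```java
import java.util.concurrent.ThreadLocalRandom;

import redis.clients.jedis.Jedis;

public class JitteredCache {

    private static final int BASE_TTL_SECONDS = 300;  // assumed base lifetime
    private static final int MAX_JITTER_SECONDS = 60; // assumed spread

    private final Jedis jedis = new Jedis("localhost", 6379);

    public void put(String key, String value) {
        // Each entry lives 300-360 s, so entries written in the same batch
        // expire spread out over a minute instead of all at once.
        int ttl = BASE_TTL_SECONDS + ThreadLocalRandom.current().nextInt(MAX_JITTER_SECONDS + 1);
        jedis.setex(key, ttl, value);
    }
}
```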



Origin: blog.51cto.com/14410880/2544323