How to use cache correctly to improve system performance

Introduction

  In the previous article, Three ways to improve the performance of IO-intensive services, we covered three methods for optimizing IO-intensive systems. Among them, adding a cache is the most commonly used and the most broadly applicable. Today I will talk about how to use caching correctly. To be precise, we need to answer the following three big questions.

  1. Under what circumstances is it appropriate to add caching?
  2. How should the cache be configured?
  3. How to solve or reduce the side effects of caching?

When is it appropriate to add caching?

  Let's start with the first question: under what circumstances is it appropriate to add caching? In one sentence: if a piece of data is expensive to obtain and remains valuable for a long time, it can be cached. (How long this "long time" is will be explained in detail later.) Note that high acquisition cost and long-term value are both necessary conditions for adding a cache; only when both are met does caching make sense. "High cost" and "long-term value" are abstract terms. How do we decide whether the cost is high? What counts as long-term value? There is no objective standard here; it can only be judged by comparison. Let's look at a few concrete examples.

Example 1:

  After you log in to WeChat, your friends' avatars need to be displayed on various pages (chats, likes, comments...). An avatar is an image, and compared with plain text of a few dozen bytes its acquisition cost is relatively high: even a thumbnail is tens of KB. This reflects the high acquisition cost of image data. A WeChat avatar rarely changes, so pulling it once keeps it useful for a long time, which reflects the long-term-value characteristic. This scenario fits the profile of cacheable data very well, and in fact WeChat does exactly this: if you change to a new avatar and then check from a friend's phone, the change does not take effect immediately.

Example 2:

  Those of you who have shopped during Double Eleven will know that the inventory count of a product you are interested in changes constantly: what shows as 100 now may become 0 the next second. Because this data is always changing, it has strict freshness requirements, so it does not satisfy the "long-term value" condition and cannot be cached.

Example 3:

  Suppose I need to scan all the content in our system to find non-compliant content. Judging a single piece of content consumes a lot of resources, and the judgment result stays valid afterwards. Even so, there is no need to add a cache, because each result is used only once and never read again. The data here does not have the long-term-value characteristic, so there is no need to cache it.

  Some data is cheap to fetch once but is fetched very frequently, and the resulting total acquisition cost is high, which also counts as a high acquisition cost. For example, basic information about systems and personnel can be returned quickly by an interface, but the interface cannot withstand the sheer number of calls. In such cases caching can also be added to relieve the pressure. We can split data requests into four quadrants along the two dimensions of acquisition cost and long-term value.

[Figure: data requests split into four quadrants by acquisition cost and long-term value]

  Using these two dimensions of acquisition cost and long-term value, you can go through all the business scenarios you encounter and think about which of them could be made faster by adding a cache.

How should the cache be configured?

  Once you have figured out which scenarios are suitable for caching, you have to consider how to configure the cache. Let's first introduce a formula for evaluating cache performance. To keep things concrete, take an IO-intensive scenario as an example: a key performance indicator there is access latency. With a cache in place, the average latency of the whole system is calculated as follows:

$$avgLatency = hitRate \times cacheLatency + (1 - hitRate) \times originalLatency \tag{1}$$

  Here hitRate is the cache hit rate, cacheLatency is the latency of a cache access, and originalLatency is the latency of accessing the original data. avgLatency is the resulting latency of the whole system; the lower, the better. Some students may ask: why not simply put all the data into the fastest storage medium? This comes down to the characteristics of different storage media, mainly their access speed and price. In short, the faster the medium, the higher the unit price. I will not go into the details here; you can look them up yourself.

  Once we have chosen a caching medium, the cache latency (cacheLatency) is essentially fixed, and we obviously cannot influence the latency of accessing the original data (originalLatency). The only thing we can influence is the cache hit rate (hitRate). Take Redis as the cache and an interface that takes 10ms as the data source: Redis access latency within the same data center is on the order of 0.1ms. At cache hit rates of 1%, 10%, 50%, 90%, and 100%, the formula above gives roughly 9.9ms, 9ms, 5ms, 1.1ms, and 0.1ms respectively. The results differ enormously under different hit rates: the higher the hit rate, the better the performance.
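  As a quick check, the minimal sketch below plugs the same assumed numbers (0.1ms Redis latency, 10ms interface latency) into formula (1); the class and method names are only for illustration.

```java
// A quick check of formula (1): avgLatency = hitRate * cacheLatency + (1 - hitRate) * originalLatency
public class CacheLatencyDemo {
    static double avgLatency(double hitRate, double cacheLatency, double originalLatency) {
        return hitRate * cacheLatency + (1 - hitRate) * originalLatency;
    }

    public static void main(String[] args) {
        double cacheLatency = 0.1;    // assumed same-room Redis latency, in ms
        double originalLatency = 10;  // assumed latency of the original interface, in ms
        for (double hitRate : new double[]{0.01, 0.10, 0.50, 0.90, 1.00}) {
            System.out.printf("hitRate=%3.0f%% -> avgLatency=%.2f ms%n",
                    hitRate * 100, avgLatency(hitRate, cacheLatency, originalLatency));
        }
        // Prints roughly 9.90, 9.01, 5.05, 1.09 and 0.10 ms
    }
}
```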

  Due to cost constraints, we generally do not load the full data set into the cache medium, so in most cases the hit rate will not be 100%, and we also want to use as little cache space as possible. How do we improve system performance with a limited cache? As the formula shows, the key to cache performance is raising the cache hit rate, and the hit rate is closely related to the access distribution of the data, the cache size, and the cache's data elimination strategy.

**Data distribution:** is there a pattern in how the data is accessed, and can that pattern be exploited?
**Cache size:** the maximum amount of data the cache can hold.
**Data elimination strategy:** when the cache is full, how to evict the lowest-value data to free up space for other data.

Data distribution

[Figure: data access frequency per unit time, sorted from highest to lowest, showing a long-tail distribution]
  If we plot the access frequency per unit time on the vertical axis and sort the data from most to least frequently accessed, in most cases we get a chart like the one above. The leftmost slice of the data accounts for a very large share of all accesses (the head effect), while the huge amount of data in the long tail is rarely accessed. This is the well-known long-tail distribution. Data with this distribution is the prime candidate for caching: the more pronounced the head effect, the more obvious the performance gain after adding a cache. Fortunately, in the real world the access pattern of most data looks like this.
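  If you want to know whether your own traffic has this kind of head effect, a rough sketch like the following can help: count accesses per key, sort by frequency, and see what share of total traffic the top keys cover. The sample access log and key names here are made up for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Rough check for a head effect: what share of total accesses do the hottest keys cover?
public class HeadEffectCheck {
    public static void main(String[] args) {
        // A made-up access log; in practice this would come from your own traffic
        List<String> accessLog = List.of("a", "a", "a", "b", "a", "c", "b", "a", "d", "a");

        Map<String, Long> freq = accessLog.stream()
                .collect(Collectors.groupingBy(k -> k, Collectors.counting()));
        List<Long> sorted = freq.values().stream()
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.toList());

        long total = accessLog.size();
        long covered = 0;
        for (int i = 0; i < sorted.size(); i++) {
            covered += sorted.get(i);
            System.out.printf("top %d key(s) cover %.0f%% of accesses%n",
                    i + 1, 100.0 * covered / total);
        }
        // Here the single hottest key already covers 60% of the traffic
    }
}
```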

  Of course, some data is accessed evenly over a unit of time, typically accesses triggered by programs at fixed intervals. In that case, the benefit of adding a cache is very weak or even nonexistent, which is why caching is rarely considered in scheduled tasks.

Cache capacity

[Figure: access coverage vs. cache capacity; doubling the capacity (blue box vs. red box) covers far less than double the accesses]
  After determining the data distribution, you have to consider the cache capacity. From the performance formula above, in theory the larger the cache, the bigger the performance gain. But as noted, the faster the storage medium, the higher the price. In the figure above, after the blue box's capacity is doubled relative to the red box, the access volume it covers (the shaded area under the curve) is far from doubled, and every further increase in cache size buys less and less additional coverage.

  So the cache capacity also significantly affects the hit rate and, with it, the performance gain. Conversely, once we know the performance target, we can use the formula above to work backwards to the required hit rate, and from there to the minimum reasonable cache size.
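  Rearranging formula (1) shows how to work backwards from a latency target to the required hit rate. With the same assumed numbers as before (0.1ms Redis latency, 10ms origin latency) and an illustrative 2ms target:

$$hitRate = \frac{originalLatency - avgLatency}{originalLatency - cacheLatency} = \frac{10 - 2}{10 - 0.1} \approx 0.81$$

  In other words, roughly 81% of requests would need to be served from the cache, and the cache must be sized (given the access distribution and elimination strategy) so that this hit rate is actually achievable.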

Data elimination strategy

  When the cache is full, we need to decide how to evict the least valuable data currently in the cache, that is, the data least likely to be accessed in the future. No person or system can predict the future precisely, but there is a simple strategy for estimating how likely each piece of data is to be used again, and the basis behind it is locality: if a piece of data has been accessed, its probability of being accessed again is higher than that of data that has not been accessed, and the more often it has been accessed, the more likely it is to be accessed again. So we only need to record the access frequency of each piece of data and, when space is needed, delete the entry with the lowest frequency.

  With that, we have accidentally reinvented the LFU (Least Frequently Used) elimination strategy: it maintains an access frequency for each piece of data and, when cache space runs out, evicts the entry with the lowest frequency. Equally famous is LRU (Least Recently Used), which evicts the data that has gone unused the longest. In implementation, the entries are kept in a queue ordered from most recently to least recently accessed, and the last one is evicted, which is much simpler to implement than LFU.

  LFU and LRU both approximate "evict the lowest-value data" to some extent, but there are still some differences.

  • LFU (Least Frequently Used): evicts the data that has been accessed least frequently over a period of time. Even if an item was accessed recently, it will still be evicted as long as its total access count is the lowest. LFU suits scenarios with long-lived hot data, where some items are consistently accessed much more often than others.
  • LRU (Least Recently Used): evicts the data that has gone unused for the longest time; the item whose last access is furthest in the past is the first to go when space is needed. LRU suits access patterns with an obvious "recency preference", where recently accessed data has a relatively high probability of being accessed again soon.

  In general, LFU can cover a larger time window of data, while LRU only covers a window roughly as large as the cache itself. So LFU is better suited to data whose head effect shows up over a large time window, and LRU to data whose head effect shows up over a small one. In my experience, LRU is sufficient in most cases.
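  As a concrete reference, here is a minimal LRU sketch built on Java's LinkedHashMap in access-order mode. Real caches (Redis, Caffeine, ...) are far more sophisticated, and the capacity of 3 below is arbitrary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache: LinkedHashMap keeps entries ordered by access when accessOrder = true
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: least recently used entry is first
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least recently used entry once over capacity
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(3);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.put("c", "3");
        cache.get("a");      // touching "a" makes it most recently used
        cache.put("d", "4"); // evicts "b", the least recently used key
        System.out.println(cache.keySet()); // [c, a, d]
    }
}
```

  The main method shows the "recency preference" in action: touching "a" saves it from eviction, so the oldest untouched key "b" is the one dropped.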

  Besides LRU and LFU there are other elimination strategies, such as FIFO and Random, as well as various improvements on LRU and LFU; the core goal is always to raise the cache hit rate. When making a concrete choice, you still need to pick the strategy that fits your cache size and data distribution.

Caching side effects

  The biggest side effect of caching is data inconsistency. It is what many people bring up whenever caching is discussed, and for some it even becomes a reason to refuse to use caching at all. It is true that caching can lead to data inconsistency, but in practice many systems can tolerate temporary inconsistency. Take the WeChat avatar example above: testing shows that after you change your avatar, others may still see the old one even several days later. It is not that WeChat cannot solve avatar consistency; rather, the problem is not valuable enough to be worth solving, and most people can accept it.

  There are two ways to mitigate data consistency issues: passive and active. The passive way is to set a validity period on the data; for example, with a Redis cache or spring-cache you can set an expiration time on entries. Choosing the expiration time takes some care: setting it too long not only wastes space but also increases the probability of reading stale data. The probability of hitting expired data can be estimated with the following formula:

$$invalidHitRate = \frac{cacheExpireTime / 2}{dataUpdateInterval} \times cacheHitRate \tag{2}$$

  Here dataUpdateInterval is the average interval between data updates; for example, if some data is updated every 3 days on average, it is 3 days. cacheExpireTime is the validity period you set when caching the data; at any given moment the remaining validity is, on average, half of that value. cacheHitRate is the cache hit rate discussed above. Given an acceptable error rate, a reasonable expiration time can be derived from the hit rate and the data update cycle. Note that setting an expiration time on cached data may itself significantly affect the hit rate, so the final hit rate has to be considered together with the cache size, the elimination strategy, and the expiration time.
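  As a quick sanity check of formula (2), the sketch below uses assumed numbers: a 1-day expiration, data that changes every 3 days on average, and a 90% hit rate, which works out to roughly a 15% chance that a cache hit returns stale data.

```java
// Sanity check of formula (2): invalidHitRate = (cacheExpireTime / 2) / dataUpdateInterval * cacheHitRate
public class StaleHitRateDemo {
    static double invalidHitRate(double cacheExpireTime, double dataUpdateInterval, double cacheHitRate) {
        return (cacheExpireTime / 2) / dataUpdateInterval * cacheHitRate;
    }

    public static void main(String[] args) {
        // Assumed numbers: 1-day expiration, data changes every 3 days on average, 90% hit rate
        System.out.printf("invalidHitRate = %.1f%%%n",
                100 * invalidHitRate(1.0, 3.0, 0.9)); // prints 15.0%
    }
}
```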

  The other, active way is to delete the cached entry, or write the new value into the cache, as soon as a change in the data is detected. Compared with the passive approach, the rate of stale hits drops dramatically; to estimate it, simply replace cacheExpireTime in formula (2) with the time it takes to process a change, which in practice is close to 0. However, the active approach is more complex to implement. First, the data must be able to emit change events, which in many systems requires extra development. Second, the change-listening and write-back logic has to be built, which adds further work. So unless the data is very important, the active approach is generally not chosen.
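  For reference, a minimal sketch of the "delete on change" variant might look like the following; it assumes a Jedis client, a hypothetical updateInDatabase method, and a made-up key naming scheme. Writing the source of truth first and then deleting the cache key keeps the stale-read window down to the gap between the two operations.

```java
import redis.clients.jedis.Jedis;

// Sketch of active invalidation: update the source of truth, then delete the cached entry.
// Jedis as the client and the "avatar:<userId>" key scheme are assumptions for illustration.
public class UserProfileService {
    private final Jedis jedis = new Jedis("localhost", 6379);

    public void updateAvatar(long userId, byte[] newAvatar) {
        updateInDatabase(userId, newAvatar); // 1. write the source of truth first
        jedis.del("avatar:" + userId);       // 2. drop the stale cache entry; the next read repopulates it
    }

    private void updateInDatabase(long userId, byte[] newAvatar) {
        // hypothetical persistence call, not shown here
    }
}
```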

Summary

  In this article we explored the key elements of using caching correctly to improve system performance. First, we decide whether caching is appropriate along the two dimensions of data acquisition cost and long-term value. Then we discussed cache configuration strategies, including how to evaluate and optimize cache performance in terms of access latency, hit rate, and storage cost. The cache hit rate is at the core of the performance gain, and it is closely related to the data access distribution, the cache size, and the data elimination strategy. Finally, we discussed the main side effect of caching, data consistency, and introduced two ways to deal with it: passively setting a validity period on the data, and actively updating or invalidating it. Each method has its applicable scenarios and trade-offs.

  Cache is a powerful tool, and if used properly, it can significantly improve system performance. Choosing whether and how to use caching requires a comprehensive consideration of data characteristics, business needs, and cost-benefit. Proper configuration and management can maximize the benefits of caching while reducing potential risks.


Origin blog.csdn.net/xindoo/article/details/134986225