SQL optimization in practice: how to use caching to cut traffic peaks for MySQL queries

WeChat: ruyuanhadeng — add to receive a 600+ page PDF compilation of original articles

Foreword

With the earlier index and SQL optimization done, queries are now blazingly fast, so we return to day-to-day feature development.

Three months later, the order table has grown to 50 million rows, yet SQL query time has remained stable below 300 ms.

But one Monday, right after the weekly meeting, the leader came straight over and said: "The DBA raised an issue at the meeting: the SQL from our order group occasionally takes more than 2 s, and the DBA now requires the average time to be optimized to below 300 ms. Before optimizing, though, you need to find out why the SQL query time occasionally spikes."


Troubleshooting

So we took over the task. Starting from the slow SQL report the DBA provided, we checked the relevant logs, combined them with monitoring, and finally found that this SQL had always been stable, but during peak periods it occasionally took more than 2 s.

We then checked the resource usage of the physical machine hosting the order database and found that during peak periods both CPU and memory usage were very high. The cause was now basically clear.

Put bluntly, during peak periods a large number of requests hit MySQL at once. This concentrated load drives up the CPU and memory usage of the database host, which in turn sharply reduces MySQL's query efficiency.

After hearing the situation, the leader said: "Actually, slow database queries are not necessarily caused by a large volume of data in MySQL. In the current situation, for example, it is clear that a flood of concurrent requests is hitting the database, raising its load and greatly reducing its query efficiency. What we need here is a cache layer in front of MySQL to cut the traffic peak, so that MySQL can complete its queries stably."

With this little push from the leader, it suddenly made sense. In short, we can add a cache to absorb the traffic peak for MySQL; the flow after adding the cache looks roughly like this:

(figure: request flow after adding a cache layer in front of MySQL)

In the standard request flow, every user request hits the database. With a cache in place, a request can fetch data directly from the cache and return it, skipping the later steps such as querying the database, which effectively reduces the database's load.

In short, the cache absorbs the majority of query requests, cutting the traffic peak and lowering the database's load so that MySQL can complete queries stably and efficiently. This solves the problem of MySQL query time suddenly spiking during peak hours.

Useful as the cache is, we must pay attention to its hit rate while using it. Hit rate = number of requests answered by the cache / total number of cache requests. The hit rate is the key metric for measuring how effective the cache is: the higher the hit rate, the more work the cache is actually doing.
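The hit-rate formula above is simple division; a minimal sketch (the function name and sample numbers are illustrative, not from the original):

```python
def hit_rate(hits, requests):
    """Cache hit rate = requests answered by the cache / total cache requests."""
    return hits / requests if requests else 0.0

# e.g. 9,500 of 10,000 lookups answered by the cache:
print(f"{hit_rate(9_500, 10_000):.0%}")  # -> 95%
```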

Besides the hit rate, we also need to understand cache eviction strategies, such as first in first out (FIFO), least frequently used (LFU), and least recently used (LRU).
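To make the LRU strategy concrete, here is a minimal in-memory sketch built on `collections.OrderedDict` (a stand-in for illustration; production systems like Redis implement approximated LRU internally):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU eviction: discard the least recently used key when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" is now most recently used
cache.put("c", 3)       # capacity exceeded -> evicts "b"
print(cache.get("b"))   # -> None
```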


How to improve cache hit rate

As noted above, the hit rate is the key metric for measuring the effectiveness of a cache. So how can we improve it?

There are several points to consider when trying to improve the cache hit rate, including the following:

1. Choose the right business scenario

First, caching suits read-heavy, write-light workloads, ideally data that is accessed frequently: the higher the access frequency, the higher the hit rate.

2. Reasonably set the cache capacity

If the cache capacity is too small, Redis's memory eviction mechanism will kick in and delete some cached keys, lowering the hit rate. Setting a reasonable cache capacity is therefore essential.

3. Control the cache granularity

The finer the cache granularity, the higher the hit rate, because the smaller the unit of data behind a single key, the less likely that data is to change and invalidate the entry.

4. Flexibly set the expiration time of the cache key

The point here is to avoid having many cache entries expire at the same moment. If they do, any query requests arriving in that window all fall through to the database at once. This situation is commonly called a cache avalanche, and it puts heavy pressure on the database.
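A common way to stagger expirations is to add random jitter to the base TTL; a minimal sketch (the base TTL and jitter window are illustrative values):

```python
import random

BASE_TTL = 3600  # base expiry: one hour, in seconds

def ttl_with_jitter(base=BASE_TTL, spread=300):
    """Add a random offset so keys written together do not all expire together."""
    return base + random.randint(0, spread)

# each key gets an expiry somewhere in [3600, 3900] seconds,
# spreading the reload load over a five-minute window
ttl = ttl_with_jitter()
```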

5. Avoid cache penetration

First, recall the hit rate: when a request arrives to query a piece of data and that data is not in the cache, we say the cache was missed. If a large share of query requests fail to find their data in the cache, the hit rate is very low.

When the hit rate is very low, requests that miss the cache fall through to the database. If the data is not found in the database either, the request has penetrated the cache.

Once requests penetrate the cache at scale, a flood of traffic goes straight to the database, whose capacity to handle requests is limited. If the pressure is too high the database can go down, and a database outage can easily escalate into a cache avalanche that paralyzes large parts of the system, which is a frightening prospect.

So we need a fallback plan in advance to prevent cache penetration. For example, when a query request finds no data in the cache and none in the database either, we can store an empty object in the cache for that query and return that empty object to the caller.

The next time the same query arrives, it hits the cached empty object directly and never reaches the database. Even under heavy traffic the hit rate stays high, and the penetration problem is solved.
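The empty-object technique can be sketched as follows; the dicts stand in for Redis and MySQL, and the sentinel name is an illustrative choice (real code would also give the sentinel a short TTL so genuinely new data is eventually picked up):

```python
NULL_MARKER = "__NULL__"   # sentinel cached for "not found" results
cache = {}                 # stand-in for Redis
db = {"order:1": {"id": 1, "status": "done"}}  # stand-in for MySQL

def query(key):
    # 1. cache lookup (a hit may be the null sentinel)
    if key in cache:
        val = cache[key]
        return None if val == NULL_MARKER else val
    # 2. cache miss: fall through to the database
    val = db.get(key)
    # 3. cache the result, or a null marker if nothing was found,
    #    so repeated misses stop at the cache instead of the database
    cache[key] = val if val is not None else NULL_MARKER
    return val

query("order:999")         # misses cache AND database -> null marker cached
print(cache["order:999"])  # -> __NULL__
```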

6. Do a good job of cache preheating

Generally, the first query for a piece of data goes to the database. We can instead load database data into the cache ahead of time, known as cache warm-up, so that even the first query is served from the cache.
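Warm-up is just a preload pass over the keys you expect to be hot; a minimal sketch with dict stand-ins for Redis and MySQL (key names are illustrative):

```python
cache = {}                                           # stand-in for Redis
db = {f"order:{i}": {"id": i} for i in range(1, 6)}  # stand-in for MySQL

def warm_up(hot_keys):
    """Load expected-hot rows into the cache before traffic arrives."""
    for key in hot_keys:
        row = db.get(key)
        if row is not None:
            cache[key] = row

warm_up(["order:1", "order:2"])   # e.g. run at startup or off-peak
print("order:1" in cache)         # -> True
```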

If the points above are handled well, the hit rate will naturally rise. Enough theory; let's put caching into practice and see the query performance after adding a cache.


Cache combat

Scenario introduction: historical order query

Since the status of a completed order never changes, we cache the query result in Redis when a user queries historical orders, with the expiration time set to one hour. Until the cache expires, repeat queries for historical orders go to Redis, reducing the load on the database.

Query time without adding cache

(figure: query time before adding the cache)

Redis optimization ideas

When querying historical orders, we first check whether the result is cached in Redis. If it is, we return the Redis data directly; if not, we query MySQL, return the result, and write it to the cache so the next query can be served from the cache.

Cache Key Generation Rules

The Redis key is generated from the user id, the page number, and the page size.

(figure: cache key generation code)
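The original key-generation code is shown only as an image; a minimal sketch of such a scheme, with an assumed key prefix and delimiter:

```python
def order_history_key(user_id, page_no, page_size):
    """Illustrative key scheme: prefix + user id + page number + page size."""
    return f"order:history:{user_id}:{page_no}:{page_size}"

print(order_history_key(42, 1, 20))  # -> order:history:42:1:20
```

Including every query parameter in the key ensures that different pages (or page sizes) never collide on the same cache entry.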

Cache core code

(figure: cache core code)
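The core code is shown only as an image; the cache-aside read path it describes can be sketched like this, with dicts standing in for Redis and MySQL (real code would use a Redis client and set the one-hour expiry, e.g. `SETEX key 3600 value`):

```python
import json

cache = {}  # stand-in for Redis
db = {(42, 1, 20): [{"order_id": 1001, "status": "completed"}]}  # stand-in for MySQL

def get_history(user_id, page_no, page_size):
    key = f"order:history:{user_id}:{page_no}:{page_size}"
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)                # 1. cache hit: return directly
    rows = db.get((user_id, page_no, page_size), [])  # 2. miss: query "MySQL"
    cache[key] = json.dumps(rows)             # 3. write result back to cache
    return rows

get_history(42, 1, 20)         # first call goes to the "database"
print(get_history(42, 1, 20))  # second call is served from the cache
```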

The effect of cache optimization

After adding the cache, you can see that the second request is served from the Redis cache, and efficiency improves dramatically.

(figure: query time after adding the cache)

So after adding the cache you find the effect is indeed good: a large share of requests hit the cache, the database host's resource usage stays within a reasonable range, and SQL query time holds steady below 300 ms.




Origin: blog.csdn.net/qq_42046105/article/details/127350933