Redis big key problem

Background

During the Double Eleven promotion, customer service reported that the coupon-claiming interface in the app was slow. I picked one case and, tracing its call chain, found that the Redis operations were slow; some Redis exceptions turned up as well.

The root cause was eventually located. In the coupon-issuing scenario, Redis caches coupon batches and records all the batches a user has claimed on the same day.
The structure is: key = userId, value = list of coupon batch IDs. Querying Redis revealed quite a few big keys: some users had claimed thousands or even tens of thousands of coupons, which made Redis queries slow or even throw exceptions. As for why some users claimed so many coupons: risk control confirmed that these users were abusing the promotion to farm coupons, and the abuse was not intercepted, which produced the anomaly on the service side.

Although the root cause was ultimately not my problem, the Redis big key problem itself is quite interesting. Next, let's look at how big keys affect Redis.

What is a big key

The so-called big key problem is that the value of some key is very large, so it is essentially a big value problem. The key is usually set by the program itself, while the value is often driven by data the program cannot control, so the value can grow very large.

Imagine a scenario:

In an online music app, a playlist may be favorited by a huge number of users. Suppose the data is structured like this:

The Redis key is the playlist ID, and the value is a list of user IDs. There may be a great many users, so the length of the list is unbounded.
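
A rough sketch of that structure, using Python with the redis-py client (the playlist key and user IDs are made up for illustration):

    import redis

    r = redis.Redis(host="localhost", port=6379, db=0)

    # Key: playlist ID; value: a list of the user IDs who favorited it.
    playlist_key = "playlist:10086:favorites"

    # Every new favorite appends one more element. Nothing bounds the
    # growth, so a popular playlist quietly becomes a big key.
    r.rpush(playlist_key, "user:1", "user:2", "user:3")

    print(r.llen(playlist_key))  # grows without limit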

What is the impact of big keys

We all know that a defining characteristic of Redis is that its core worker thread is single-threaded.
A single thread processes requests serially: until the command at the front finishes, the ones behind it cannot run. In a distributed architecture, big keys also leave memory and CPU load unevenly distributed across nodes.

  • For the client executing a command against a big key, latency rises sharply and can end in a timeout
  • Reads of or deletes on a big key consume a lot of bandwidth and CPU, which affects other clients
  • Storing big keys skews the data distribution across shards and unbalances CPU usage in a distributed setup
  • A big key is sometimes also a hot key; frequent reads then magnify the impact
  • In lower versions of Redis, deleting a big key can block the worker thread

Seen this way, the impact of big keys is substantial. The most typical symptom is a blocked worker thread: concurrency drops, clients time out, and the success rate of the server-side business falls.

How big keys are generated

Big keys usually arise from unreasonable design on the business side, where the dynamic growth of the value was not foreseen:

  • Data keeps being stuffed into the value with no deletion or expiry mechanism, so sooner or later it blows up
  • The data is not properly sharded, so what should have been many small keys ends up as one big key

How to find big keys

  • Add monitoring on metrics such as memory, traffic, and timeouts

Because the value of a big key is very large, reads of it may block the worker thread: the overall QPS of Redis drops, client timeouts climb, and network bandwidth usage rises. Configuring alarms on these metrics lets us discover big keys early.
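
As a minimal sketch of such a check (Python with redis-py, polling the INFO command; the thresholds are arbitrary placeholders, and a real setup would push these numbers into an alerting system):

    import redis

    r = redis.Redis(host="localhost", port=6379)

    info = r.info()  # INFO returns server statistics as a dict
    used_memory_mb = info["used_memory"] / 1024 / 1024
    ops_per_sec = info["instantaneous_ops_per_sec"]

    # Placeholder thresholds; tune them against your own baseline.
    if used_memory_mb > 4096:
        print(f"ALERT: high memory usage ({used_memory_mb:.0f} MB)")
    if ops_per_sec < 100:
        print(f"ALERT: QPS unusually low ({ops_per_sec}); a big key may be blocking")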

  • The redis-cli --bigkeys command

redis-cli --bigkeys traverses and analyzes all keys in the Redis instance, then returns overall statistics together with the single biggest key of each data type.
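
The same idea can also be scripted by hand. Here is a sketch of a SCAN-based finder in Python with redis-py, roughly what redis-cli --bigkeys does internally (the 10,000 threshold is an arbitrary choice):

    import redis

    r = redis.Redis(host="localhost", port=6379)

    # Size command per data type. Note: for strings this measures bytes,
    # while for collections it counts elements.
    SIZE_CMDS = {
        b"string": r.strlen,
        b"list": r.llen,
        b"set": r.scard,
        b"hash": r.hlen,
        b"zset": r.zcard,
    }
    THRESHOLD = 10_000  # arbitrary cutoff for "big"

    # SCAN walks the keyspace incrementally, unlike the blocking KEYS *.
    for key in r.scan_iter(count=1000):
        size_cmd = SIZE_CMDS.get(r.type(key))
        if size_cmd is None:
            continue
        size = size_cmd(key)
        if size > THRESHOLD:
            print(f"big key candidate: {key!r}, size={size}")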

  • redis-rdb-tools

Use the redis-rdb-tools offline analysis tool to scan RDB persistence files. The results are not real-time, but since the analysis runs completely offline it has no impact on the live instance.

redis-rdb-tools is a tool written in Python for analyzing Redis RDB snapshot files. It can convert an RDB snapshot into a JSON file, or generate reports that break down how memory is used in Redis.

  • Integrated visualization tools

Redis offerings on public clouds, or inside a company's own infrastructure, generally come with visual dashboards and analysis tools that help locate big keys. Under the hood, those pages may themselves be built on bigkeys scans or offline RDB-file analysis.

How to solve the big key problem

Depending on how a big key is actually used, there are two situations: it can be deleted, or it cannot.

Can be deleted

If you find that some big keys are not hot keys and the data can be queried from the DB instead, you can simply delete them from Redis:

  • When the Redis version is 4.0 or above, you can use the UNLINK command to delete big keys safely. It cleans up the given keys gradually, in a non-blocking manner.

The Redis UNLINK command is similar to DEL in that it deletes the specified keys; keys that do not exist are ignored.
UNLINK differs from DEL in that the memory reclamation is performed asynchronously, so it does not block.
In short, UNLINK is a non-blocking delete: the actual reclamation work is handed off to another thread.
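
In redis-py, for example, UNLINK is exposed directly (a minimal sketch; big_key is a placeholder name):

    import redis

    r = redis.Redis(host="localhost", port=6379)

    # UNLINK removes the key from the keyspace immediately and hands the
    # actual memory reclamation to a background thread, so the worker
    # thread is not blocked the way a synchronous DEL could be.
    r.unlink("big_key")  # requires Redis >= 4.0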

  • When the Redis version is below 4.0, avoid the blocking KEYS command. Instead, use the SCAN command to scan keys through incremental iteration and then decide which to delete (a sketch follows the SCAN description below).

The Redis SCAN command iterates over the keys in the database incrementally.
SCAN is a cursor-based iterator: each call returns a new cursor, and the caller passes that cursor as the cursor argument of the next SCAN call to continue the iteration, until the cursor comes back around to 0.
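
A sketch of that cursor loop in Python with redis-py (the match pattern is a made-up example; for a single big collection you would additionally drain its elements in batches, e.g. HSCAN plus HDEL, rather than issue one blocking DEL):

    import redis

    r = redis.Redis(host="localhost", port=6379)

    cursor = 0
    while True:
        # Each SCAN call returns a new cursor plus a batch of keys; pass
        # the cursor back in until it comes around to 0.
        cursor, keys = r.scan(cursor=cursor, match="coupon:batch:*", count=500)
        for key in keys:
            r.delete(key)  # decide per key whether it should really go
        if cursor == 0:
            break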

Compress and split keys

  • When the value is a string, it is hard to split. Use serialization plus a compression algorithm to keep the value within a reasonable size, at the cost of extra time spent serializing and deserializing.

  • When the value is a string and is still too large after compression, it needs to be split: break the big value into parts, record the key of each part, and use operations such as MGET to read them back together.

  • When the value is a collection type such as a list or set, shard it according to the estimated data size: compute a shard for each element and place different elements into different shards (see the sketch after this list).
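
A minimal sharding sketch in Python with redis-py; the shard count, key naming scheme, and CRC32 hash are illustrative choices, not a fixed recipe (here the playlist favorites are modeled as a set to allow membership checks):

    import zlib
    import redis

    r = redis.Redis(host="localhost", port=6379)

    SHARDS = 16  # estimate from the expected data volume

    def shard_key(base_key: str, member: str) -> str:
        # Route each element to a fixed shard by hashing the element,
        # turning one huge set into SHARDS smaller ones.
        return f"{base_key}:{zlib.crc32(member.encode()) % SHARDS}"

    def add_favorite(playlist_id: str, user_id: str) -> None:
        r.sadd(shard_key(playlist_id, user_id), user_id)

    def is_favorited(playlist_id: str, user_id: str) -> bool:
        return bool(r.sismember(shard_key(playlist_id, user_id), user_id))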

In general, the user's data is spread across multiple Redis instances, which scales out the in-memory data.
To the user, the sharded Redis instances appear as a single whole; they don't care where the data lives, only that it can be stored and retrieved. Reference: https://segmentfault.com/a/1190000024430014

The Redis big key problem comes up all the time in interviews and in day-to-day work, so it is well worth understanding thoroughly.

Source: blog.csdn.net/itguangit/article/details/123049666