What are the dangers of Redis big key? How to troubleshoot and deal with it?

A reader ran into this question in a Dewu (Poizon) Java interview this fall.

This question comes up fairly often in interviews, especially when Redis performance optimization is being examined.

An interviewer who asks about bigkey (big key) will usually follow up with hotkey (hot key). Even if you are not preparing for an interview, this material is worth a look; it is useful in day-to-day development as well.

Hotkey will be covered in the next article and will also be added to the interview question album "Detailed Explanations of Common Java Interview Questions".

What is bigkey?

Simply put, if the value of a key occupies a relatively large amount of memory, that key can be regarded as a bigkey. How big counts as "big"? A common, though not particularly precise, reference standard:

  • String type value exceeds 1MB
  • The value of a composite type (List, Hash, Set, Sorted Set, etc.) contains more than 5,000 elements (note that for composite types, element count is only a proxy: the more elements a value contains, the more memory it tends to occupy).
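The thresholds above can be expressed as a small helper. This is a minimal sketch in Python; the 1 MB and 5,000-element cutoffs come from the reference standard above and should be treated as starting points, not hard rules.

```python
# Rough bigkey thresholds from the article (imprecise by design).
STRING_BYTES_LIMIT = 1 * 1024 * 1024   # 1 MB for String values
COLLECTION_ELEMENTS_LIMIT = 5000       # element count for List/Hash/Set/ZSet

def is_bigkey(key_type: str, size: int) -> bool:
    """size = value size in bytes for 'string', element count otherwise."""
    if key_type == "string":
        return size > STRING_BYTES_LIMIT
    return size > COLLECTION_ELEMENTS_LIMIT
```

For example, `is_bigkey("string", 2 * 1024 * 1024)` flags a 2 MB String, while `is_bigkey("hash", 100)` does not flag a 100-field Hash.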

bigkey judgment criteria

How did bigkey come about? What's the harm?

Bigkey is usually generated for the following reasons:

  • Improper program design, such as using the String type to store the binary data of large files directly.
  • The business data scale is not considered carefully. For example, when using collection types, the rapid growth of data volume is not considered.
  • Junk data is not cleaned up in time, such as a large number of redundant useless key-value pairs in the hash.

In addition to consuming more memory space and bandwidth, bigkey will also have a relatively large impact on performance.

In the article Summary of Common Causes of Blocking in Redis [1], we mentioned that bigkeys can also cause blocking. Specifically, this shows up in three ways:

  1. Client timeout blocking: Redis executes commands on a single thread, and operating on a bigkey takes longer, so subsequent commands queue up behind it. From the client's perspective, requests go unanswered for a long time.
  2. Network blocking: every read of a bigkey generates heavy traffic. If a key is 1 MB and is read 1,000 times per second, that is 1,000 MB/s (roughly 8 Gbps) of outbound traffic, which is catastrophic for a server with an ordinary Gigabit NIC (about 125 MB/s of capacity).
  3. Worker thread blocking: deleting a bigkey with DEL blocks the worker thread, so subsequent commands cannot be processed until the deletion finishes.

The blocking problem caused by large keys will further affect master-slave synchronization and cluster expansion.

In summary, there are many potential problems caused by big keys, and we should try to avoid the existence of big keys in Redis.

How to discover bigkey?

1. Use the --bigkeys parameter that comes with Redis to search.

# redis-cli -p 6379 --bigkeys

# Scanning the entire keyspace to find biggest keys as well as
# average sizes per key type.  You can use -i 0.1 to sleep 0.1 sec
# per 100 SCAN commands (not usually needed).

[00.00%] Biggest string found so far '"ballcat:oauth:refresh_auth:f6cdb384-9a9d-4f2f-af01-dc3f28057c20"' with 4437 bytes
[00.00%] Biggest list   found so far '"my-list"' with 17 items

-------- summary -------

Sampled 5 keys in the keyspace!
Total key length in bytes is 264 (avg len 52.80)

Biggest   list found '"my-list"' has 17 items
Biggest string found '"ballcat:oauth:refresh_auth:f6cdb384-9a9d-4f2f-af01-dc3f28057c20"' has 4437 bytes

1 lists with 17 items (20.00% of keys, avg size 17.00)
0 hashs with 0 fields (00.00% of keys, avg size 0.00)
4 strings with 4831 bytes (80.00% of keys, avg size 1207.75)
0 streams with 0 entries (00.00% of keys, avg size 0.00)
0 sets with 0 members (00.00% of keys, avg size 0.00)
0 zsets with 0 members (00.00% of keys, avg size 0.00)

As the output shows, this command scans (SCAN) every key in Redis, so it has a slight impact on Redis performance. It also reports only the single biggest key of each data type: the String that occupies the most memory, and for each composite type, the key containing the most elements. Note that a key with many elements does not necessarily occupy the most memory; further judgment is needed based on the actual business data.

When running this command online, specify the -i parameter to throttle the scan and reduce the impact on Redis. For example, redis-cli -p 6379 --bigkeys -i 3 sleeps 3 seconds after every 100 SCAN commands.

2. Use the SCAN command that comes with Redis

The SCAN command iterates the keyspace incrementally, returning keys that match a given pattern in batches. After obtaining a key, you can use commands such as STRLEN, HLEN, and LLEN to get its length or number of members.

Data structure   Command   Complexity   Result (for the key)
String           STRLEN    O(1)         Length of the string value
Hash             HLEN      O(1)         Number of fields in the hash table
List             LLEN      O(1)         Number of list elements
Set              SCARD     O(1)         Number of set members
Sorted Set       ZCARD     O(1)         Number of sorted set members

For composite types, element count alone does not tell you how much memory a key uses, so you can also run the MEMORY USAGE command (Redis 4.0+), which returns the number of bytes occupied by a key-value pair.
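Combining SCAN with MEMORY USAGE gives a simple bigkey scanner. The sketch below is written against the redis-py client API (an assumption; any client exposing SCAN and MEMORY USAGE works the same way) and never uses the blocking KEYS * command.

```python
def find_big_keys(client, threshold_bytes=1024 * 1024, count=100):
    """Iterate the keyspace incrementally with SCAN and report keys
    whose memory footprint exceeds threshold_bytes."""
    big = {}
    for key in client.scan_iter(count=count):    # SCAN cursor under the hood
        used = client.memory_usage(key)          # MEMORY USAGE <key>
        if used is not None and used > threshold_bytes:
            big[key] = used
    return big

# Usage against a real server (requires `pip install redis`):
#   import redis
#   r = redis.Redis(host="localhost", port=6379)
#   print(find_big_keys(r, threshold_bytes=1024 * 1024))
```

Keeping `count` modest (e.g. 100) limits how long each SCAN call holds the server, which matters when scanning a production instance.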

3. Use open source tools to analyze RDB files.

Find the big key by analyzing the RDB file. The premise of this solution is that your Redis uses RDB persistence.

There are ready-made codes/tools available online that can be used directly:

  • redis-rdb-tools[2] : A tool written in Python language to analyze Redis RDB snapshot files
  • rdb_bigkeys[3]  : A tool written in Go language to analyze Redis RDB snapshot files, with better performance.

4. Use the Redis analysis service of the public cloud.

If you use a public cloud Redis service, check whether it provides a key analysis feature (most do).

Here we take Alibaba Cloud Redis as an example. It supports real-time bigkey analysis and discovery. Document address: https://www.alibabacloud.com/help/zh/apsaradb-for-redis/latest/use-the-real-time-key-statistics-feature .

Alibaba Cloud Key Analysis

How to deal with bigkey?

Common processing and optimization methods for bigkey are as follows (these methods can be used in conjunction):

  • Split the bigkey: break one bigkey into multiple smaller keys. For example, split a Hash containing tens of thousands of fields into several Hashes according to some strategy (such as secondary hashing).
  • Clean up manually: on Redis 4.0+, use the UNLINK command to delete one or more keys asynchronously. Below Redis 4.0, combine the SCAN command with DEL to delete in batches.
  • Use appropriate data structures: for example, do not store file binary data in a String; use HyperLogLog to count page UVs and a Bitmap to store binary (0/1) status information.
  • Enable lazy-free (lazy deletion / delayed release): introduced in Redis 4.0, lazy-free lets Redis release the memory of a deleted key asynchronously, handing the work to a separate background thread so that the main thread is not blocked.
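The "secondary hashing" split mentioned above can be sketched as follows: one huge Hash is routed into N smaller Hashes by hashing the field name. The key naming scheme and the shard count of 16 here are illustrative assumptions, not a fixed convention.

```python
import zlib

SHARDS = 16  # illustrative shard count; tune to your data volume

def shard_key(base_key: str, field: str, shards: int = SHARDS) -> str:
    """Route a field to one of `shards` sub-hashes, e.g. 'user:profile:7'.

    Instead of HSET user:profile <field> <value> on one giant Hash,
    do HSET shard_key('user:profile', field) <field> <value>. Reads
    compute the same sub-key, so lookups remain O(1).
    """
    return f"{base_key}:{zlib.crc32(field.encode()) % shards}"
```

Because CRC32 is deterministic, the same field always maps to the same sub-hash, so no extra routing table is needed; the cost is that operations spanning all fields (e.g. HGETALL) must now visit every shard.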


Origin blog.csdn.net/pantouyuchiyu/article/details/135090891