Demystifying Huawei Cloud GaussDB (for Redis): Big Key Governance

This article is shared from the Huawei Cloud Community post "Huawei Cloud GaussDB (for Redis) Secret No. 31: Big Key Governance", author: Gauss Redis official blog.

From a DBA's perspective, big keys are one of the most common causes of online Redis incidents. To remove this hidden risk, the business should first follow sound development conventions that limit both the creation of big keys and any dependence on accessing them. Sometimes, however, a big key grows quietly while the program runs and is hard to guard against. That is why a Redis service that can diagnose big keys online at any time, and warn proactively before problems occur, is particularly valuable.

GaussDB (for Redis): supports online big key diagnosis

GaussDB (for Redis) uses a highly reliable architecture that separates compute from storage, with a background task deployed on every compute node. These background tasks continuously detect and analyze big keys in the storage pool. When a user runs the diagnosis command, the result is fetched directly, so online traffic is not affected. This is safer than the blocking full scan commonly used in the industry.

cke_221.png

After the user runs the bigkeys command, the "answer" is returned directly from the node, avoiding the performance impact of a full database scan.

cke_222.png

In addition, GaussDB (for Redis) lets users define their own thresholds for what counts as a big key, such as strings larger than 1 MB or hashes with more than 10,000 elements. Since its launch, this feature has been well received by many customers and DBAs.
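To make the idea of a user-defined big key standard concrete, here is a minimal sketch in Python. It is not the product's configuration interface: the class `BigKeyThresholds`, the function `is_big_key`, and the sample keys are all hypothetical, and applying the 10,000-element limit to every collection type (not only hashes) is an assumption made for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BigKeyThresholds:
    """Hypothetical thresholds; the defaults mirror the examples in the text."""
    max_string_bytes: int = 1024 * 1024  # strings larger than 1 MB
    max_elements: int = 10_000           # collections with more than 10,000 elements


def is_big_key(key_type: str, size: int,
               limits: BigKeyThresholds = BigKeyThresholds()) -> bool:
    """Return True when a key exceeds the configured threshold.

    `size` is a byte length for strings and an element count for collections.
    """
    if key_type == "string":
        return size > limits.max_string_bytes
    return size > limits.max_elements


# Made-up sample keys: (name, type, size)
sample = [
    ("session:42", "string", 512),
    ("report:blob", "string", 3 * 1024 * 1024),
    ("cart:7", "hash", 25),
    ("user:tags", "hash", 18_000),
]
big = [name for name, key_type, size in sample if is_big_key(key_type, size)]
print(big)  # ['report:blob', 'user:tags']
```

Keeping the thresholds in one immutable object means a stricter or looser standard can be passed per call without touching the classification logic.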

GaussDB (for Redis): supports big key monitoring and early warning

Here are two real cases:

1. The business periodically executed "lrange 0 -1" to fetch all elements of a list key. Because of a program bug, it also kept slowly appending to the same key over a long period, so the list grew longer and longer. The dangerous key was only discovered after the online business ran into trouble and a lengthy investigation followed.

2. The business had been running stably for a long time. One day a new component went live, and online requests began timing out continuously. After several rounds of investigation, it turned out that the new component was executing hmset f1 v1 f2 v2... against Redis. A single write command carried as many as 20,000 parameters, which seriously affected production traffic.
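Both cases share one pattern: a single command that touches too many elements at once. The sketch below shows one common client-side way to bound the cost of each command, by splitting an oversized field map into batches and by reading a list in pages. The helper names `chunk_fields` and `page_ranges` and the batch sizes are hypothetical. This only limits per-command cost; it does not shrink the key or fix a bug that keeps growing it.

```python
from itertools import islice
from typing import Dict, Iterator, Tuple


def chunk_fields(fields: Dict[str, str], batch_size: int) -> Iterator[Dict[str, str]]:
    """Split one oversized field map into maps of at most batch_size entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    items = iter(fields.items())
    while True:
        batch = dict(islice(items, batch_size))
        if not batch:
            return
        yield batch


def page_ranges(length: int, page_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, stop) index pairs, the form LRANGE expects."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for start in range(0, length, page_size):
        yield start, min(start + page_size, length) - 1


# 10,000 field/value pairs, i.e. 20,000 parameters if sent as one hmset.
fields = {f"f{i}": f"v{i}" for i in range(10_000)}
batches = list(chunk_fields(fields, 500))
print(len(batches))                   # 20
print(list(page_ranges(2500, 1000)))  # [(0, 999), (1000, 1999), (2000, 2499)]
```

Each batch would be sent as its own hmset, and each index pair as its own lrange, so no single request or reply carries more than the chosen number of elements.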

From a DBA's point of view, this kind of problem calls for a "big key detective" that keeps watch at all times and raises a warning as soon as a high-risk operation touches a big key.

GaussDB (for Redis) provides this "big key detective" capability through more than 10 monitoring metrics, for example: the maximum number of elements in a single response (which identifies blocking caused by lrange 0 -1 on a big key) and the maximum number of parameters carried by a single request (which identifies blocking caused by an hmset with tens of thousands of elements). A DBA only needs to subscribe to alarms on these metrics, with thresholds chosen from experience, to "catch the big key at the scene of the crime" right away and nip the risk in the bud.
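As an illustration of how two such metrics could drive an alarm, here is a small Python sketch. It is a toy model, not the product's monitoring implementation: the class `BigKeyMonitor`, its field names, and the 10,000 alarm thresholds are assumptions, and a parameter is counted as everything after the command name.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BigKeyMonitor:
    """Tracks two metrics named in the text; thresholds are illustrative."""
    request_args_alarm: int = 10_000
    reply_elements_alarm: int = 10_000
    max_request_args: int = 0
    max_reply_elements: int = 0
    alerts: List[str] = field(default_factory=list)

    def observe(self, command: List[str], reply_elements: int) -> None:
        """Record one request/response pair and alert on a threshold breach."""
        arg_count = len(command) - 1  # exclude the command name itself
        self.max_request_args = max(self.max_request_args, arg_count)
        self.max_reply_elements = max(self.max_reply_elements, reply_elements)
        if arg_count > self.request_args_alarm:
            self.alerts.append(f"{command[0]}: {arg_count} parameters in one request")
        if reply_elements > self.reply_elements_alarm:
            self.alerts.append(f"{command[0]}: {reply_elements} elements in one reply")


monitor = BigKeyMonitor()
monitor.observe(["get", "k1"], reply_elements=1)
# A full-range read of a list that has grown to 50,000 elements.
monitor.observe(["lrange", "queue", "0", "-1"], reply_elements=50_000)
# One write carrying a key plus 10,000 field/value pairs.
hmset = ["hmset", "profile"] + [str(i) for i in range(20_000)]
monitor.observe(hmset, reply_elements=1)
for alert in monitor.alerts:
    print(alert)
```

The ordinary get passes silently, while the full-range read and the oversized write each produce one alert, which is the behavior a DBA would subscribe to.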

GaussDB (for Redis): stronger tolerance for big keys

Even in business scenarios where big keys do exist, GaussDB (for Redis) performs far better than open source Redis. The problems that big keys most often cause are described below:

1. Big keys drive CPU usage to 100% and block production traffic

In open source Redis, a big key can easily drive CPU usage to 100%, harm production services, and cause online incidents. This is because open source Redis executes commands on a single thread, a relatively fragile design in which one expensive operation on a big key blocks that thread and therefore affects the entire instance.

The multi-threaded architecture of GaussDB (for Redis) is inherently more tolerant of big keys and does not suffer from this problem. Even if one thread is slowed by a single big key, a GaussDB (for Redis) instance contains dozens or even hundreds of threads, so the business as a whole is largely unaffected.

2. Big keys push a single shard's bandwidth so high that Redis is frequently "flow controlled"

Some open source Redis offerings on the market deploy the Redis processes of many tenants together in one large container. To prevent one customer's Redis from affecting the others under this architecture, flow control is often applied to each customer's Redis process. When a customer's business operates on big keys frequently, it easily exceeds the bandwidth threshold set for that tenant and triggers flow control, which damages the online business.

In contrast, each shard of GaussDB (for Redis) is an independent container whose resources are dedicated to the customer, which makes it more reliable. No active flow control is applied to resources such as connection count and bandwidth, and the "ceiling" for node bandwidth in particular is very high.

3. Big keys cause data skew and uneven memory usage across shards

In an open source Redis cluster, storing big keys leads to uneven memory usage across shards, and the shard that holds a big key is at risk of running out of memory (OOM).

cke_223.png
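The skew can be reproduced with a few lines of Python. Redis Cluster maps each key to one of 16384 hash slots using CRC16 of the key, so a key always lives entirely on one shard no matter how large it is. The sketch below is an illustration under stated assumptions: slots are split into equal contiguous ranges per shard, hash tags are ignored, and the workload (3000 keys of 1 KB plus one 2 GB key) is made up.

```python
import binascii
from collections import Counter
from typing import Dict

SLOTS = 16384  # number of hash slots in Redis Cluster


def key_slot(key: str) -> int:
    """CRC16 (XMODEM variant) of the key modulo 16384; hash tags are ignored."""
    return binascii.crc_hqx(key.encode(), 0) % SLOTS


def shard_of(key: str, shard_count: int) -> int:
    """Assumption: slots are split into equal contiguous ranges, one per shard."""
    return key_slot(key) * shard_count // SLOTS


def memory_per_shard(key_sizes: Dict[str, int], shard_count: int) -> Counter:
    """Sum the bytes stored on each shard for the given key sizes."""
    usage = Counter()
    for key, size in key_sizes.items():
        usage[shard_of(key, shard_count)] += size
    return usage


# Made-up workload: 3000 small keys of 1 KB each plus one 2 GB key.
keys = {f"user:{i}": 1024 for i in range(3000)}
keys["big:list"] = 2 * 1024**3
usage = memory_per_shard(keys, shard_count=3)
for shard in range(3):
    print(f"shard {shard}: {usage[shard] / 1024**2:.1f} MB")
```

The small keys spread across the three shards, while the shard that owns the slot of the one big key carries roughly 2 GB more than the others.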

GaussDB (for Redis) uses a high-performance storage pool, so data volume does not skew toward any single node or shard. Big keys are stored reliably and do not cause a shard to run out of memory.

cke_224.png

4. Redis has to migrate data when scaling out, and big keys always cause problems

When open source Redis is scaled out, data must be migrated across shards, so the process takes a long time and carries a risk of blocked access, as shown in the figure. Open source Redis therefore has to be scaled with great caution when big keys are present.

cke_225.png

GaussDB (for Redis) supports scaling within seconds that is imperceptible to the business. Whether storage capacity or CPU is being expanded, no data needs to be migrated, so scaling is unaffected by big keys and the operations experience is excellent.

cke_226.png

This article introduced the big key diagnosis and big key early warning features of GaussDB (for Redis), and showed how it addresses the stability pain points of open source Redis in big key scenarios, giving customers an efficient and reliable big key solution. Going forward, GaussDB (for Redis) will continue to develop more practical enterprise-grade features to help customers operate easily and develop efficiently.

Appendix

  • Author of this article:

       Huawei cloud database GaussDB (for Redis) team

  • For more product information, welcome to visit the official blog:

       bbs.huaweicloud.com/blogs/248875

Origin blog.csdn.net/devcloud/article/details/132236147