What impact does Redis's large Key have on persistence?

Author: Xiaolin coding
Illustrated computer-science fundamentals (operating systems, computer networks, computer organization, databases, etc.): https://xiaolincoding.com

Hello everyone, I am Xiaolin.

Last week, a reader was asked: what impact do Redis's big keys have on persistence?


Redis has two persistence mechanisms: the AOF log and RDB snapshots.

Next, let's analyze in detail how big keys affect each of these two mechanisms.

The impact of large Keys on AOF logs

Let's start with the three strategies for writing the AOF log back to disk.

Redis provides three strategies for flushing the AOF log to the disk, namely:

  • Always, which means "every time": after each write command is executed, the AOF log data is synchronously written back to the disk;
  • Everysec, which means "every second": after each write command is executed, the command is first written to the kernel buffer of the AOF file, and the buffer contents are flushed to the disk once per second;
  • No, which means Redis does not control the flush timing at all: after each write command is executed, the command is written to the kernel buffer of the AOF file, and the operating system decides when to write the buffer contents back to the disk.
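In redis.conf, these three strategies map to the appendfsync directive (only one of the lines below should be active at a time):

```conf
# redis.conf -- AOF flush strategy (choose one)
appendfsync always    # fsync after every write command: safest, slowest
appendfsync everysec  # fsync once per second (the default trade-off)
appendfsync no        # never fsync; the OS decides when to flush
```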

These three strategies only control when the fsync() function is called.

When an application writes data to a file, the kernel usually copies the data into a kernel buffer first, then decides on its own when to write it to the disk.

If you want the data to reach the disk immediately after a write, you can call the fsync() function: the kernel then writes the contents of the kernel buffer directly to the disk, and the function returns only after the disk write has completed.
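The difference between a buffered write and fsync() can be seen in a short sketch (the file contents are illustrative; with a big key, the fsync() call is where the time goes):

```python
import os
import tempfile

# Write data into a file: os.write() only copies it into the kernel buffer.
fd, path = tempfile.mkstemp()
os.write(fd, b"SET user:1 hello\n")

# At this point the data may still live only in the kernel's page cache.
# fsync() blocks until the kernel has flushed the buffer to the disk.
os.fsync(fd)
os.close(fd)

with open(path, "rb") as f:
    data = f.read()
print(data)  # b'SET user:1 hello\n'
os.remove(path)
```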

  • The Always strategy calls fsync() every time data is written to the AOF file;
  • The Everysec strategy creates an asynchronous task that calls fsync() once per second;
  • The No strategy never calls fsync().

Now let's look at each strategy in turn: what is the impact when a big key is persisted?

With the Always strategy, after executing a command the main thread writes the data to the AOF log file and then calls fsync() to write the kernel buffer contents directly to the disk; the function returns only after the disk write completes.

So under the Always strategy, if a big key is written, the main thread blocks for a long time in the fsync() call, because synchronizing a large amount of data to the disk is time-consuming.

With the Everysec strategy, since fsync() is executed asynchronously, persisting a big key (synchronizing the data to disk) does not affect the main thread.

With the No strategy, since fsync() is never executed, persisting a big key does not affect the main thread either.

The impact of large Key on AOF rewriting and RDB

When many big keys are written to the AOF log, the AOF file grows quickly, which soon triggers the AOF rewrite mechanism.

Both the AOF rewrite mechanism and the RDB snapshot (the bgsave command) create a child process via the fork() function to handle the task.

When the child process is created, the operating system copies the parent process's "page table" to the child. The page table records the mapping between virtual addresses and physical addresses; the physical memory itself is not copied. In other words, the two processes have different virtual address spaces but map to the same physical memory.


In this way, the child process shares the parent process's physical memory, which saves physical memory resources; at the same time, the page table entries for these pages mark the physical memory as read-only.

As Redis accumulates more and more big keys, it occupies more and more memory, and the corresponding page table grows larger.

Although fork() does not copy the parent process's physical memory, the kernel does copy the parent's page table to the child. If the page table is large, the copy is time-consuming, and the fork() call itself blocks.

Moreover, fork() is called by the Redis main thread. If fork() blocks, the Redis main thread blocks; and since Redis executes commands on the main thread, a blocked main thread cannot process subsequent commands from clients.

We can run the INFO command and check the latest_fork_usec metric, which reports how long the most recent fork operation took.

# duration of the most recent fork operation, in microseconds
latest_fork_usec:315
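Programmatically, you can pull this metric out of the INFO output. A sketch that parses a sample fragment of the INFO Stats section (the value 315 and the surrounding lines are illustrative):

```python
def parse_latest_fork_usec(info_text: str) -> int:
    """Extract the latest_fork_usec metric from Redis INFO output."""
    for line in info_text.splitlines():
        if line.startswith("latest_fork_usec:"):
            return int(line.split(":", 1)[1])
    raise KeyError("latest_fork_usec not found")

# Sample fragment of an INFO Stats section (Redis uses \r\n line endings)
sample = "total_connections_received:5\r\nlatest_fork_usec:315\r\n"
usec = parse_latest_fork_usec(sample)
print(usec)  # 315
```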

If fork takes a long time, for example more than 1 second, optimization is needed:

  • Keep the memory usage of a single instance below 10 GB so that fork() returns quickly.
  • If Redis is used purely as a cache and data safety does not matter, consider turning off AOF and AOF rewriting, so that fork() is never called.
  • In a master-slave architecture, increase repl-backlog-size appropriately, to avoid the master frequently falling back to full synchronization because repl_backlog_buffer is too small; a full synchronization creates an RDB file, which means calling fork().

When will the copying of physical memory occur?

When the parent or child process writes to the shared memory, the CPU triggers a page fault, caused by the write-permission violation. In the page-fault handler, the operating system then copies the physical page, resets the memory mapping, marks the page as readable and writable for both processes, and finally performs the write. This mechanism is called "copy-on-write".

Copy-on-write, as the name suggests, means the operating system copies physical memory only when a write actually occurs. This prevents the parent process from blocking for a long time on copying physical memory while fork() creates the child process.

If the parent process modifies a big key in the shared memory after the child process has been created, the kernel performs copy-on-write and copies the physical memory. Since a big key occupies a lot of physical memory, the copy is time-consuming, and the parent process (main thread) is blocked.

So, there are two phases that can block the parent process:

  • While the child process is being created, data structures such as the parent's page table must be copied; the blocking time grows with the size of the page table;
  • After the child process is created, if the parent or child process modifies the shared data, copy-on-write kicks in and physical memory is copied; the more memory involved, the longer the blocking.
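The copy-on-write semantics described above can be observed with os.fork() on a POSIX system. A minimal sketch: it only demonstrates that the child's writes do not leak back into the parent (each process ends up with its own copy of the modified page), not the timing cost:

```python
import os

# Stand-in for Redis's in-memory dataset (a "big key" value).
data = bytearray(b"big-key-value")

r, w = os.pipe()
pid = os.fork()          # POSIX-only; copies the page table, not the memory
if pid == 0:
    # Child process (like the bgsave / AOF-rewrite child): writing to the
    # shared page triggers a page fault, and the kernel copies the page.
    data[0:7] = b"CHANGED"
    os.write(w, bytes(data))
    os._exit(0)

os.waitpid(pid, 0)
child_view = os.read(r, 64)
print(bytes(data))   # b'big-key-value' -- the parent's copy is untouched
print(child_view)    # b'CHANGED-value' -- the child wrote to its own copy
```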

One more thing worth mentioning: if Linux has huge pages enabled, Redis performance suffers.

The Linux kernel has supported huge pages since 2.6.38, allowing memory pages of 2 MB, whereas regular memory pages are allocated at a 4 KB granularity.

With huge pages, even if a client modifies only 100 B of data, copy-on-write forces Redis to copy an entire 2 MB page; with regular pages, only 4 KB is copied.

Comparing the two, the unit of memory copied per write command is amplified 512 times, which slows down write operations and ultimately degrades Redis performance.
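The 512x figure is simply the ratio of the two page sizes; a quick check:

```python
HUGE_PAGE = 2 * 1024 * 1024    # 2 MB huge page
REGULAR_PAGE = 4 * 1024        # 4 KB regular page

# Copy-on-write copies whole pages, so even a 100 B modification copies
# 4 KB with regular pages but 2 MB with huge pages.
amplification = HUGE_PAGE // REGULAR_PAGE
print(amplification)   # 512
```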

So what should we do? Very simple: disable huge pages (they are off by default).

The method of disabling is as follows:

echo never >  /sys/kernel/mm/transparent_hugepage/enabled

Summary

When the AOF write-back policy is configured as Always, writing a big key makes the main thread block for a long time in the fsync() call, because synchronizing a large amount of data to the disk is time-consuming.

Both the AOF rewrite mechanism and the RDB snapshot (the bgsave command) create a child process via the fork() function to handle the task. Two phases can block the parent process (main thread):

  • While the child process is being created, data structures such as the parent's page table must be copied; the blocking time grows with the size of the page table;
  • After the child process is created, if the parent process modifies a big key in the shared data, copy-on-write kicks in and the physical memory is copied; since a big key occupies a lot of physical memory, the copy is time-consuming and may block the parent process.

In addition to affecting persistence, large keys also have the following effects.

  • Client timeouts and blocking. Redis executes commands on a single thread, and operating on a big key takes time, so Redis blocks; from the client's point of view, there is simply no response for a long time.

  • Network congestion. Each fetch of a big key generates considerable network traffic. If a key is 1 MB and is accessed 1,000 times per second, that is 1,000 MB of traffic per second, which is catastrophic for a server with an ordinary gigabit network card.

  • Blocked worker threads. Deleting a big key with DEL blocks the worker thread, so subsequent commands cannot be processed.

  • Uneven memory distribution. In a cluster, even when slots are evenly distributed, data and query skew can occur: the Redis nodes holding big keys use much more memory and receive higher QPS.

How to avoid big keys?

It is best to split big keys into smaller keys at the design stage, or to check Redis regularly for big keys. When a big key needs to be deleted, do not use the DEL command: its deletion process blocks the main thread. Use the UNLINK command (Redis 4.0+) instead, which deletes the key asynchronously and does not block the main thread.
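The idea behind UNLINK (lazy free) can be sketched in Python: the command handler only detaches the key from the keyspace in O(1) and returns immediately, while the expensive reclamation runs on a background thread. This is a conceptual sketch, not Redis's actual C implementation:

```python
import queue
import threading

keyspace = {"big:key": ["v"] * 100_000}   # a hypothetical big key
free_queue = queue.Queue()

def lazy_free_worker():
    # Background thread: the expensive reclamation happens here,
    # off the main (command-processing) thread.
    while True:
        obj = free_queue.get()
        if obj is None:
            break
        del obj   # drop the last reference so the value can be reclaimed

threading.Thread(target=lazy_free_worker, daemon=True).start()

def unlink(key):
    """UNLINK-style delete: detach the key immediately, free it later."""
    obj = keyspace.pop(key, None)
    if obj is None:
        return 0
    free_queue.put(obj)   # hand the big value to the background thread
    return 1

removed = unlink("big:key")
print(removed)                 # 1
print("big:key" in keyspace)   # False -- the key is gone at once
```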

That's all!


Origin blog.csdn.net/qq_34827674/article/details/126829220