Analyzing Redis performance: the impact of CPU cores and the NUMA architecture, and how to optimize for them

A major reason Redis is so widely used is its high performance, so it pays to understand every factor that can affect that performance, the mechanism behind each, and how to counter it. There are five potential factors:

Blocking operations inside Redis
The impact of CPU cores and the NUMA architecture
Key system configuration for Redis
Redis memory fragmentation
Redis buffers

In this lecture, let's learn about the impact of CPU on Redis performance and how to deal with it.

Mainstream CPU architecture

Before we start, let's look at mainstream CPU architectures and their characteristics, so that we can better understand how the CPU affects Redis.

CPU multi-core architecture

A CPU generally has multiple cores, called physical cores.
Each physical core has its own private level-one instruction and data caches (L1 cache) and a private level-two cache (L2 cache).
With hyper-threading, each physical core typically runs two hardware threads, also called logical cores. The logical cores of one physical core share its L1 and L2 caches.
The physical cores of one CPU share the level-three cache (L3 cache).

Multi-CPU Socket architecture

On a server with multiple CPU sockets, an application can run on different processors.

When an application that has been running on one socket is rescheduled onto another, it still needs data held in the memory attached to the first socket. Reaching that memory from the new socket is a remote memory access.

Compared with accessing memory attached directly to the local socket, a remote memory access adds latency to the application.

An architecture with this property is called a Non-Uniform Memory Access (NUMA) architecture.

The impact of multi-core CPUs on Redis performance

In a multi-core scenario, if a Redis instance is frequently rescheduled onto different CPU cores, its request processing time suffers noticeably. Each time the instance moves to another core, a context switch occurs: its runtime information, instructions, and data must be reloaded on the new core, whose private L1 and L2 caches do not yet hold them. The requests processed during that reload therefore see significantly higher latency than the rest.

To keep Redis from bouncing back and forth between CPU cores, the most direct method is to bind the Redis instance to a CPU core, so that each instance always runs on the same core.

The taskset command does this. For example, the following binds the Redis instance to core 0:

taskset -c 0 ./redis-server

Binding cores not only reduces tail latency; it also lowers average latency and increases throughput, improving Redis performance overall.
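
The same binding can also be applied and checked from inside a process. The following is a minimal Python sketch, assuming Linux (os.sched_setaffinity does not exist on other platforms). The helper name pin_to_cpu is made up for this illustration and is not part of Redis or taskset.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict a process to one CPU core and return its resulting affinity set.

    pid=0 means the caller itself. This is roughly what
    `taskset -c <cpu>` arranges before it starts a program. Linux only.
    """
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


# Example (not executed here): pin the current process to the lowest-numbered
# core it is currently allowed to run on.
#   pin_to_cpu(min(os.sched_getaffinity(0)))
```

For an instance that is already running, taskset -cp <pid> prints its current affinity list, which is a quick way to confirm that the binding took effect.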

The impact of the NUMA architecture on Redis performance

In practice, one common technique for improving Redis's network performance is to bind the operating system's network interrupt handler to a CPU core.

On a NUMA machine, binding the network interrupt handler and the Redis instance to CPU cores separately carries a potential risk: if the two cores are not on the same CPU socket, the Redis instance has to access memory across sockets when it reads network data, which takes more time.

To keep Redis from reading network data across CPU sockets, it is best to bind the network interrupt handler and the Redis instance to cores on the same CPU socket, so that the Redis instance can read network data directly from local memory.
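
One way to find out which cores are local to the network card is to ask the kernel. The sketch below is a hedged illustration, assuming Linux sysfs and a physical NIC. The function name nic_local_cpus and the interface name eth0 are assumptions for the example, not something the article prescribes.

```python
from pathlib import Path


def nic_local_cpus(iface, sysfs_root="/sys"):
    """Return (numa_node, cpulist) for the NUMA node a network interface sits on.

    cpulist is the kernel's own range string (for example "0-5,12-17"),
    which can be passed as-is to `taskset -c`. Returns (None, None) when the
    interface has no backing device entry (for example a virtual interface)
    or when the kernel reports node -1, meaning no NUMA node is known.
    """
    root = Path(sysfs_root)
    numa_file = root / "class/net" / iface / "device/numa_node"
    if not numa_file.exists():
        return None, None
    node = int(numa_file.read_text().strip())
    if node < 0:
        return None, None
    cpulist_file = root / "devices/system/node" / f"node{node}" / "cpulist"
    return node, cpulist_file.read_text().strip()


# Example (not executed here; "eth0" is an assumed interface name):
#   node, cpus = nic_local_cpus("eth0")
```

The returned list gives the candidate cores for both the Redis instance and the interrupt handler. On Linux, the interrupt side is usually set by writing a CPU list to /proc/irq/<irq>/smp_affinity_list as root. A running irqbalance daemon may later change that setting.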

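When binding cores on a NUMA machine, pay attention to how the CPU cores are numbered, because the cores of one socket are not necessarily numbered contiguously. Run the lscpu command to view the numbering. In the sample output below, node0 holds cores 0-5 and 12-17, while node1 holds cores 6-11 and 18-23: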

lscpu

Architecture: x86_64
...
NUMA node0 CPU(s): 0-5,12-17
NUMA node1 CPU(s): 6-11,18-23
...
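
To avoid reading these ranges by eye, the NUMA lines of the lscpu output can be parsed and checked programmatically. This is a small sketch, assuming the "NUMA nodeN CPU(s):" line format shown above. The function names are illustrative only.

```python
import re


def expand_cpu_list(spec):
    """Expand a CPU list such as "0-5,12-17" into a list of core numbers."""
    cpus = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def parse_numa_nodes(lscpu_text):
    """Map each NUMA node number to its logical core numbers, from lscpu text."""
    nodes = {}
    for line in lscpu_text.splitlines():
        match = re.match(r"\s*NUMA node(\d+) CPU\(s\):\s*(\S+)", line)
        if match:
            nodes[int(match.group(1))] = expand_cpu_list(match.group(2))
    return nodes


def same_node(nodes, cpu_a, cpu_b):
    """Return True when both logical cores belong to the same NUMA node."""
    return any(cpu_a in cpus and cpu_b in cpus for cpus in nodes.values())
```

With the sample output above, same_node reports that cores 0 and 12 share a node but cores 5 and 6 do not, even though 5 and 6 are adjacent numbers. Binding Redis to core 6 and the interrupt handler to core 5 would therefore cross sockets.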

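However, core binding has risks of its own. A Redis instance is more than its main thread: it also forks child processes (for example, to write RDB snapshots or rewrite the AOF) and runs background threads. If the instance is bound to a single logical core, all of them compete with the main thread for that one core, and the main thread can stall, which raises request latency. There are two ways to address this.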

Risks of core binding and their solutions

Option 1: Bind each Redis instance to one physical core

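When binding a Redis instance, we should bind it not to a single logical core but to a physical core, that is, to both logical cores of that physical core. The main thread, child processes, and background threads then share two logical cores instead of one, which eases the competition for CPU.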
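
Which logical cores belong to the same physical core is best read from the system rather than guessed. The sketch below assumes the Linux sysfs topology files. The helper names and the ./redis-server default are illustrative, not a prescribed procedure.

```python
from pathlib import Path


def physical_core_cpus(cpu, sysfs_root="/sys"):
    """Return the CPU list of all logical cores that share cpu's physical core.

    The kernel reports this as a string such as "0,12", which is already in
    the format `taskset -c` accepts.
    """
    path = (Path(sysfs_root) / "devices/system/cpu" / f"cpu{cpu}"
            / "topology/thread_siblings_list")
    return path.read_text().strip()


def taskset_command(cpu, program="./redis-server", sysfs_root="/sys"):
    """Build the argument list that binds program to one whole physical core."""
    return ["taskset", "-c", physical_core_cpus(cpu, sysfs_root), program]
```

If the kernel reports "0,12" for core 0, the resulting command is taskset -c 0,12 ./redis-server. Whether cores 0 and 12 really are siblings depends on the machine, which is why the sketch reads the sibling list instead of assuming it.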

Option 2: Optimize Redis source code

By modifying the Redis source code, we can bind the child processes and background threads to different CPU cores from the main thread, so that they no longer compete with it for CPU.
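
Redis itself is written in C, so the following is not Redis code. It is a minimal Python sketch of the underlying idea, assuming Linux, where CPU affinity is set per thread and a pid of 0 refers to the calling thread. The name run_in_pinned_thread is hypothetical.

```python
import os
import threading


def run_in_pinned_thread(cpu, work):
    """Run work() in a background thread bound to one CPU core.

    Returns (affinity observed inside the thread, work's return value).
    Only the background thread is rebound; the thread that calls this
    function keeps whatever affinity it already had.
    """
    outcome = {}

    def target():
        # pid 0 = the calling thread, so this does not touch the main thread.
        os.sched_setaffinity(0, {cpu})
        outcome["affinity"] = os.sched_getaffinity(0)
        outcome["result"] = work()

    thread = threading.Thread(target=target)
    thread.start()
    thread.join()
    return outcome["affinity"], outcome["result"]
```

A forked child process can rebind itself in the same way right after the fork. Redis 6.0 and later also provide redis.conf directives for this purpose (server_cpulist, bio_cpulist, aof_rewrite_cpulist, bgsave_cpulist), so patching the source may be unnecessary on those versions.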
