[Redis] Why can single-threaded Redis support high concurrency? Why was multithreading introduced in Redis 6.0?

1. Why can single-threaded Redis support high concurrency?

What is the difference between Redis and memcached? What is Redis's threading model? Why can single-threaded Redis support high concurrency?

  These are among the most basic Redis interview questions. One of the most fundamental internal characteristics of Redis is that it uses a single-threaded working model. If you don't know this, problems will surface later when you work with Redis, and you won't know where to start looking.

  The interviewer may also ask about the difference between Redis and memcached. Memcached was a caching solution commonly used by major Internet companies in the early years, but in recent years it has largely been replaced by Redis, and few companies still use memcached.

What is the difference between redis and memcached?

  • Storage method: memcached keeps all data in memory, so the data is lost on power failure and cannot exceed the size of memory. Redis can persist part of its data to disk, which ensures durability.
  • Supported data types: memcached supports only relatively simple data types, while Redis supports rich, complex data structures.
  • Low-level model: their underlying implementations, and the application protocols they use to communicate with clients, are different. Redis built its own VM mechanism, because going through generic system calls wastes a certain amount of time moving and requesting memory.

Compared with memcached, Redis has more data structures and supports richer data operations. If your cache needs to support more complex structures and operations, Redis is a good choice.


  Redis natively supports cluster mode as of Redis 3.x, while memcached has no native cluster mode and must rely on the client to shard data across the cluster.

Performance comparison
  Because Redis uses only a single core while memcached can use multiple cores, Redis on average performs better per core than memcached when storing small values. For values above 100k, memcached outperforms Redis. Although Redis has recently optimized its performance for storing large values, it is still slightly inferior to memcached there.

Redis thread model

  Redis uses a file event handler internally. This file event handler is single-threaded, which is why Redis is called a single-threaded model. That is, the "single thread" of Redis means that the core module executing Redis commands is single-threaded, not that the entire Redis instance has only one thread; other Redis modules have threads of their own. Redis 4.0 already introduced some multithreading, for example deleting objects in the background and blocking commands implemented through Redis modules. Redis uses an IO multiplexing mechanism to monitor multiple sockets at the same time, and selects the corresponding event handler to run according to the events on each socket.

The file event handler consists of four parts: multiple sockets, the IO multiplexing program, the file event dispatcher, and the event handlers (connection response handler, command request handler, command reply handler).
  Multiple sockets may concurrently generate different operations, and each operation corresponds to a different file event. The IO multiplexing program monitors the sockets and puts the events they generate into a queue; the event dispatcher takes one event at a time from the queue and hands it to the corresponding event handler for processing.

Let's look at one round of communication between a client and Redis:

  The client socket01 requests a connection from the Redis server socket, and the server socket generates an AE_READABLE event. After the IO multiplexer detects the event on the server socket, it pushes the event into the queue. The file event dispatcher takes the event from the queue and hands it to the connection response handler, which creates a socket01 for communicating with the client and associates socket01's AE_READABLE event with the command request handler.

  Suppose the client now sends a `set key value` request. socket01 in Redis generates an AE_READABLE event, and the IO multiplexer pushes it into the queue. The event dispatcher takes the event from the queue; since socket01's AE_READABLE event has already been associated with the command request handler, the dispatcher hands the event to it. The command request handler reads the key and value from socket01 and sets the key-value pair in memory. When the operation completes, it associates socket01's AE_WRITABLE event with the command reply handler.

  When the client is ready to receive the result, socket01 in Redis generates an AE_WRITABLE event, which is also pushed into the queue. The event dispatcher finds the associated command reply handler, which writes the result of the operation, such as `ok`, to socket01, and then disassociates socket01's AE_WRITABLE event from the command reply handler.

This completes a communication.
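The flow above can be sketched as a toy event loop. This is a simulation for illustration only, not Redis's actual C implementation: the event queue, dispatcher, handler names, and the `socket01` identifier all follow the article's narrative rather than real Redis internals.

```python
from collections import deque

AE_READABLE, AE_WRITABLE = "AE_READABLE", "AE_WRITABLE"

class FileEventHandler:
    """Toy model of the file event handler: an event queue filled by the
    'IO multiplexer', a dispatcher loop, and per-(socket, event) handlers."""

    def __init__(self):
        self.queue = deque()   # events queued by the IO multiplexing program
        self.handlers = {}     # (socket_id, event_type) -> handler
        self.store = {}        # the in-memory key space
        self.replies = []      # replies "written back" to clients

    def connection_handler(self, sock, _):
        # "accept" the client; in this toy model the new socket is socket01,
        # and its AE_READABLE event is wired to the command request handler
        self.handlers[("socket01", AE_READABLE)] = self.command_handler

    def command_handler(self, sock, data):
        cmd, key, value = data.split()
        if cmd == "SET":
            self.store[key] = value          # execute in the single thread
        # after executing, associate AE_WRITABLE with the reply handler
        self.handlers[(sock, AE_WRITABLE)] = self.reply_handler

    def reply_handler(self, sock, _):
        self.replies.append((sock, "+OK"))
        # reply sent: disassociate AE_WRITABLE from the reply handler
        del self.handlers[(sock, AE_WRITABLE)]

    def run(self):
        # the dispatcher takes ONE event at a time from the queue
        while self.queue:
            sock, event, data = self.queue.popleft()
            self.handlers[(sock, event)](sock, data)

loop = FileEventHandler()
# the server socket's AE_READABLE event maps to the connection handler
loop.handlers[("server", AE_READABLE)] = loop.connection_handler
loop.queue.append(("server", AE_READABLE, None))            # client connects
loop.queue.append(("socket01", AE_READABLE, "SET key value"))
loop.queue.append(("socket01", AE_WRITABLE, None))
loop.run()
print(loop.store)     # {'key': 'value'}
print(loop.replies)   # [('socket01', '+OK')]
```

Every step, from accept to reply, runs on the same thread; concurrency comes purely from interleaving small events, never from parallel execution.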

Why is the Redis single-threaded model so efficient? In summary:

  • Pure in-memory operations
  • A core based on a non-blocking IO multiplexing mechanism
  • A single thread avoids the frequent context switching of multithreading

2. Why was multithreading introduced in Redis 6.0?

1. The traditional blocking IO model

  Before looking at the Reactor model, it is worth reviewing the traditional blocking IO model.
  In the traditional blocking IO model, an independent Acceptor thread listens for client connections. Whenever a client sends a request, a new thread is allocated to handle it. When many requests arrive at the same time, the server allocates a matching number of threads, causing frequent CPU context switches and wasted resources.
  Some connections do nothing after being established, yet the server still allocates a thread for each of them, which is pure overhead. It is like going to a restaurant, holding the menu for a long time, discovering it is really expensive, and leaving. The waiter who stands there waiting for your order is the allocated thread, and your order is the connection request.
  At the same time, whenever a connection's thread calls a read or write method, the thread blocks until there is data to read or write, and during that time it can do nothing else. Continuing the restaurant example: you walk around, find this restaurant is the best value, come back, take the menu, and study it for a long time. During that whole process the waiter can do nothing but wait. That waiting is the blocking.
  In this model, a thread must be allocated for each request, and the thread blocks until its request is fully processed. Requests that only connect and do nothing still get their own thread, which demands a lot of server resources; under high concurrency this falls apart. It is only viable for fixed architectures with a relatively small number of connections.
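A minimal sketch of this thread-per-connection model, assuming a trivial echo-style protocol invented for the example (the `+OK` prefix and `handle_client`/`serve` names are not from any real server):

```python
import socket
import threading

def handle_client(conn):
    """A dedicated thread per connection: recv() blocks this whole thread
    until the client sends data (the waiter stuck waiting at one table)."""
    with conn:
        data = conn.recv(1024)          # blocks until data arrives
        conn.sendall(b"+OK " + data)

def serve(server):
    while True:
        try:
            conn, _ = server.accept()   # blocks until a client connects
        except OSError:                 # listening socket closed: shut down
            return
        # allocate a brand-new thread for every single connection (1:1)
        threading.Thread(target=handle_client, args=(conn,)).start()

server = socket.socket()
server.bind(("127.0.0.1", 0))           # any free port
server.listen()
threading.Thread(target=serve, args=(server,), daemon=True).start()

with socket.create_connection(server.getsockname()) as c:
    c.sendall(b"SET key value")
    reply = c.recv(1024)
print(reply)
server.close()
```

With one client this looks harmless; with ten thousand idle connections it means ten thousand mostly-sleeping threads, which is exactly the waste the article describes.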

2. Pseudo asynchronous IO model

  One optimization is to use a thread pool and a task queue; this is called the pseudo-asynchronous IO model.
When a client request arrives, it is wrapped as a task and delivered to a back-end thread pool for processing. The thread pool maintains a message queue and multiple active threads that process the tasks in the queue.
  This solution avoids the thread-resource exhaustion caused by creating a thread per request. But the underlying model is still synchronous blocking: if every thread in the pool is blocked, further requests cannot be served. This model therefore limits the maximum number of connections and does not solve the problem fundamentally.
  Continuing with the restaurant example: after being in business for a while, the owner has more customers, and the original five waiters can no longer manage one-to-one service. So the owner switches to a five-person "thread pool": each waiter moves on to the next guest as soon as they finish serving the current one.
  Now a problem appears: some customers order very slowly, and a waiter has to wait a long time until the customer finishes. If all five current customers order slowly, all five waiters are stuck waiting, and the remaining customers go unserved. This is exactly the situation above where every thread in the pool is blocked.
So how can this problem be solved? Don't worry, the Reactor pattern is about to appear.
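The pseudo-asynchronous model can be sketched with Python's standard thread pool. The `handle_request` function and its `time.sleep` stand-in for a blocking read are assumptions made for the example:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def handle_request(name, order_time):
    """Stand-in for one request: the worker thread still BLOCKS while the
    'customer orders' (a synchronous read), exactly as in the 1:1 model."""
    time.sleep(order_time)
    return f"served {name}"

# five waiters (worker threads) shared by all customers
with ThreadPoolExecutor(max_workers=5) as pool:
    # eight customers arrive: no thread-per-request explosion, but a slow
    # order still pins one worker for its whole duration
    futures = [pool.submit(handle_request, f"client{i}", 0.01)
               for i in range(8)]
    results = [f.result() for f in futures]

print(results[0], len(results))   # served client0 8
```

Submitting a ninth slow request while five are already sleeping would simply sit in the queue, which is the "all waiters stuck" failure mode described above.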

3. Reactor design pattern

  The basic design idea of the Reactor pattern is based on the I/O multiplexing model.
  Let's talk about the I/O multiplexing model here. Unlike traditional multi-threaded blocking IO, in the I/O multiplexing model multiple connections share one blocking object, and the application only waits on that one object. When a connection has new data to process, the operating system notifies the application; the thread returns from the blocked state and starts business processing.
  What does that mean? The restaurant owner also notices the slow-ordering problem, so he takes a bold approach and keeps only one waiter. While a guest is still deciding, the waiter attends to other guests; once a guest has decided, they call the waiter over directly. Here the customers are the connections and the single waiter is the one thread: instead of blocking on one customer, the waiter immediately goes to serve whichever customer is ready.
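The "one blocking object shared by many connections" idea is exactly what the OS exposes through `select`/`epoll`, wrapped in Python by the `selectors` module. A minimal demonstration, using a listening socket as the watched connection:

```python
import selectors
import socket

sel = selectors.DefaultSelector()       # the single shared blocking object

server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ)   # watch this fd, among others

# a client connects while the application's one thread would normally be
# parked inside sel.select()
client = socket.create_connection(server.getsockname())
events = sel.select(timeout=2)          # returns as soon as any fd is ready
ready = [key.fileobj for key, _ in events]
print(server in ready)                  # the OS reports "accept won't block"
client.close()
server.close()
```

Many sockets can be registered on the same selector; the single thread wakes only when at least one of them actually has work, instead of blocking on each in turn.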
  After understanding the design idea of the Reactor pattern, let's look at the single-threaded implementation of a single Reactor:
  The Reactor monitors client request events through the I/O multiplexing program and, after receiving an event, distributes it through the task dispatcher.

  • Connection establishment events are handled by the Acceptor, which creates the corresponding handler responsible for subsequent business processing.
  • For non-connection events, the Reactor calls the corresponding handler to complete the read -> business processing -> write flow and returns the result to the client.

The whole process is completed in one thread.
The single-threaded era

  Redis is implemented based on the single-threaded Reactor model.
  After the IO multiplexing program receives a user request, it pushes it into a queue and delivers it to the file event dispatcher. The subsequent steps are the Reactor single-threaded flow described above: the entire process is completed in one thread, which is why Redis is called single-threaded.
Single-threaded Redis is memory-based, and its command operations have low time complexity, so its read/write speed is very fast.
The multithreaded era

  Multithreading was introduced in Redis 6. As mentioned above, single-threaded Redis processes requests very quickly, so why introduce multithreading? Where is the single-threaded bottleneck?
  Let's look at the second question first. In Redis, the single-threaded performance bottleneck is mainly network IO: most of the CPU time is consumed in read/write system calls on the network. In addition, deleting some large key-value pairs cannot finish in a short time, so in a single thread such work blocks all subsequent operations.
  Recall the single-threaded Reactor flow above: for non-connection events, the Reactor calls the corresponding handler to complete read -> business processing -> write. It is this step that becomes the performance bottleneck.
Redis 6.0 is therefore designed to handle network reads/writes and protocol parsing with multiple threads, while command execution itself remains single-threaded.
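In Redis 6.0 this split is controlled through configuration. A minimal sketch of the relevant redis.conf settings (consult the comments in your version's redis.conf for the authoritative guidance on sizing):

```conf
# Enable threaded network IO: this many IO threads in total (including
# the main thread). Command execution still runs only on the main thread.
io-threads 4

# By default the extra IO threads are only used for writing replies;
# enable this to also use them for reads and protocol parsing.
io-threads-do-reads yes
```

Threaded IO is off by default (`io-threads 1`), and it mainly helps when the instance is genuinely network-bound on a multi-core machine.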

Summary

Reactor mode

  • In the traditional blocking IO model, client and server threads are allocated 1:1, which does not scale.
  • The pseudo-asynchronous IO model uses a thread pool, but the underlying layer is still synchronous blocking, which limits the maximum number of connections.
  • Reactor monitors client request events through the I/O multiplexing program, and distributes them through the task dispatcher.
The single-threaded era

  • Based on the single-threaded Reactor model: after the IO multiplexing program receives a user request, it is pushed into a queue and handed to the file event dispatcher for processing.

Multithreaded era

  • The single-threaded performance bottleneck is mainly network IO.
  • Network reads/writes and protocol parsing are handled by multiple threads; command execution remains single-threaded.

Article sources:
[1] Why is Redis single-threaded but still able to support high concurrency?
[2] Redis 6.0 multithreading

Origin blog.csdn.net/dl962454/article/details/115218011