Redis Chapter 9 - The Redis Stream Message Queue and Redis 6 Multithreading in Detail

The Redis Stream message queue

Preface: The biggest new feature of Redis 5.0 is the addition of the Stream data structure, a powerful new message queue that supports multicast and persistence. Its author has stated that Redis Stream borrows from Kafka's design.

(figure: the structure of a Redis Stream)

  • The structure of Redis Stream is shown in the figure above. Each Stream has a message list, which strings all the added messages together. Each message has a unique ID and corresponding content. The message is persistent, and the content is still there after Redis restarts.
  • Each Stream has a unique name, which is its Redis key; the Stream is created automatically the first time we append a message with the xadd command.
  • Each Stream can have multiple consumer groups attached. Each consumer group maintains a cursor, last_delivered_id, that moves forward along the Stream and records which message the group has consumed up to. Each consumer group's name is unique within the Stream. Consumer groups are not created automatically; they must be created with a separate command, xgroup create, which requires a message ID of the Stream to start consuming from. That ID is used to initialize the last_delivered_id variable.
  • The state of each consumer group (Consumer Group) is independent; groups do not affect each other. In other words, every message in the same Stream will be consumed by every consumer group.
  • The same consumer group (Consumer Group) can be attached to multiple consumers (Consumer), and these consumers compete with one another: whichever consumer reads a message moves the cursor last_delivered_id forward. Each consumer has a unique name within its group.
  • Each consumer (Consumer) has a state variable, pending_ids, which records the messages that have been delivered to the client but not yet acknowledged. If the client does not ack, the set of message IDs in this variable grows; once a message is acked, it shrinks. Redis officially calls this pending_ids variable the PEL (Pending Entries List). It is a core data structure that guarantees a client consumes each message at least once, so that no message is lost in transit and left unprocessed.
  • The format of a message ID is timestampInMillis-sequence, for example 1527846880572-5, meaning the message was generated at millisecond timestamp 1527846880572 and is the 5th message generated within that millisecond. Message IDs can be generated automatically by the server or specified by the client, but the form must be integer-integer, and each newly added message's ID must be greater than the previous one's.
  • The content of a message is a set of key-value pairs, similar to a hash structure.
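The relationships above (last_delivered_id cursors, competing consumers, per-group independence, and the PEL) can be sketched with a toy model. This is an illustration of the concepts, not the real Redis implementation; IDs are simplified to plain integers:

```python
class MiniStream:
    """Toy model of a Redis Stream with consumer groups (illustrative only)."""

    def __init__(self):
        self.entries = []   # [(id, fields)] in insertion order
        self.groups = {}    # group -> {"cursor": last_delivered_id, "pel": {id: consumer}}

    def xadd(self, entry_id, fields):
        self.entries.append((entry_id, fields))
        return entry_id

    def xgroup_create(self, group, start=0):
        # groups are never created automatically; `start` initializes last_delivered_id
        self.groups[group] = {"cursor": start, "pel": {}}

    def xreadgroup(self, group, consumer):
        g = self.groups[group]
        for entry_id, fields in self.entries:
            if entry_id > g["cursor"]:
                g["cursor"] = entry_id          # the shared cursor moves forward
                g["pel"][entry_id] = consumer   # pending until acknowledged
                return entry_id, fields
        return None

    def xack(self, group, entry_id):
        # acknowledging removes the entry from the group's PEL
        return self.groups[group]["pel"].pop(entry_id, None) is not None
```

Note how two consumers in the same group advance one shared cursor (competition), while a second group gets its own independent copy of every message.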

Commonly used operation commands
On the producer side:

  • xadd: append a message
  • xdel: delete a message. Deletion here only sets a flag; the message is not actually removed.
  • xrange: get a list of messages; deleted messages are filtered out automatically.
  • xlen: get the length (number of messages) of the Stream
  • del: delete the entire Stream

For example:

  1. streamtest is the name of the queue
  2. * indicates that the server should generate the ID automatically, which is generally recommended
  3. What follows is the key-value pairs we store in the message
  4. The return value is the generated message ID, which consists of two parts: timestamp-sequence number. The timestamp is in milliseconds, taken from the Redis server that generated the message, and is a 64-bit integer. The sequence number is the position of the message within that millisecond. (Because a Redis command executes in well under a millisecond, a timestamp alone is not enough; the extra sequence number is needed to identify each message uniquely.)

To keep messages ordered, the IDs Redis generates are monotonically increasing. Because the ID contains a timestamp component, server clock errors (for example, the clock moving backwards) could break this ordering. To avoid that, each Stream in Redis maintains a latest_generated_id attribute recording the ID of the most recent message. If the current timestamp is found to have moved backwards (that is, it is less than the timestamp recorded in latest_generated_id), Redis keeps the old timestamp and increments the sequence number to form the new message ID (this is why the sequence number is an int64: to guarantee enough sequence numbers). This preserves the monotonically increasing property of IDs.
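The rule just described can be sketched in a few lines. This is my own illustration of the scheme, not the Redis source; `latest` stands for the (ms, seq) pair stored in latest_generated_id:

```python
def next_stream_id(latest, now_ms):
    """Generate the next (ms, seq) message ID: if the clock stalled or
    went backwards, keep the old timestamp and bump the sequence number."""
    ms, seq = latest
    if now_ms > ms:
        return (now_ms, 0)   # normal case: new millisecond, sequence restarts
    return (ms, seq + 1)     # clock stalled/backwards: increment seq only
```

Either branch produces an ID strictly greater than `latest`, so monotonicity holds even with a faulty clock.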

After inserting two more entries, there are three entries in the queue at the moment.

# Here '-' means the minimum ID and '+' means the maximum ID
xrange streamtest - +

Or we can specify a list of message IDs:

# Copy a message ID and query from the second entry onward
xrange streamtest 1686400124425-0 + 


# Check the length of the message queue
xlen streamtest


# Delete a message
xdel streamtest 1686400124425-0

Consumer

  1. Single consumer
    Although Stream has the concept of consumer groups, it is possible to consume messages independently without defining one; when the Stream has no new messages, the read can even block and wait. Redis provides a separate consumption command, xread, which lets a Stream be used as an ordinary message queue (like a list). When using xread we can completely ignore the existence of consumer groups, as if the Stream were a plain list.
xread count 1 streams streamtest 0-0


Reading from the tail with $ alone returns nothing when there is no newer message, so it is best to read the latest message at the tail in a blocking manner and wait until a new message arrives:

xread block 0 count 1 streams streamtest $

It can be seen that the block is released, the new message content is returned, and the waiting time is displayed (here we waited about 22 s). Generally speaking, if a client wants to consume sequentially with xread, it must remember where its consumption currently stands, namely the returned message ID. On the next call to xread, pass the message ID returned last time as the parameter, and consumption of subsequent messages can continue.
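The cursor-passing pattern just described can be sketched as follows. This is an illustrative helper, not a Redis client call; entries are (id, fields) pairs sorted by ID:

```python
def xread_after(entries, last_id, count=1):
    """Return up to `count` entries whose ID is strictly greater than
    last_id, the way a client feeds the previously returned ID back
    into XREAD to continue sequential consumption."""
    return [e for e in entries if e[0] > last_id][:count]
```

Each call's last returned ID becomes the `last_id` of the next call, so the client never re-reads or skips a message.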

Consumer group
Create a consumer group
A consumer group (Consumer Group) is created with the xgroup create command, which requires an initial message ID parameter to initialize the last_delivered_id variable.

# "streamtest" is the name of the queue to read, "cg1" is the name of the consumer group,
# "0-0" means consume from the very beginning
xgroup create streamtest cg1 0-0
# $ means consume from the tail: only new messages are accepted, and all current Stream messages are ignored
xgroup create streamtest cg2 $


# The output of xinfo shows the queue length, the ID of the last generated message, and the two consumer groups
xinfo stream streamtest

Message consumption

With a consumer group in place, we naturally need consumers. Stream provides the xreadgroup command for consuming within a consumer group; it requires the consumer group name, a consumer name, and an initial message ID.
Like xread, it can also block while waiting for new messages. After a new message is read, its ID enters the consumer's PEL (the list of messages being processed). Once the client has processed the message, it uses the xack command to notify the server, and the message ID is then removed from the PEL.

# "GROUP" is a keyword, "cg1" is the consumer group name, "c1" is the consumer name,
# "count 1" sets how many messages to consume, and ">" means start reading after the
# consumer group's last_delivered_id; each time a consumer reads a message, last_delivered_id advances
xreadgroup group cg1 c1 count 1 streams streamtest >

Reading repeatedly, we quickly finish consuming the three inserted messages.

Then set the blocking wait

xreadgroup GROUP cg1 c1 block 0 count 1 streams streamtest >


Going back to the original client, we find that the block has been released and the new message has been received.
Let's observe the status of the consumer group.
If there are multiple consumers in the same consumer group, we can also observe the status of each consumer through the xinfo consumers command.

xinfo consumers streamtest cg1

After acknowledging one message, we find there are still three messages without ack confirmation:

xack streamtest cg1 1686403439863-0


Threads and IO model in Redis

What is the Reactor pattern?
The origin of the "reactor" in the name:
"Reactor" here refers to an inversion of control: the concrete event handlers do not call the reactor. Instead, they register themselves with the reactor, declaring which events they are interested in; when such an event arrives, the reactor calls back the registered handler to respond to it. This inversion of control is also known as the "Hollywood principle" ("don't call us, we'll call you").
Overview of threads and IO in Redis
I/O multiplexer

File event dispatcher

  • The file event dispatcher receives the socket from the I/O multiplexing program, and calls the corresponding event handler according to the event type generated by the socket.

file event handler

  • The server will associate different event handlers for sockets that perform different tasks. These handlers are functions that define the actions that the server should perform when an event occurs.
  • Redis implements multiple handlers for its various file-event needs. When a client connects to Redis, the server must answer each connecting client, so the listening socket is mapped to the connection-accept handler. To receive data sent to Redis from clients, command-request sockets are mapped to the command-request handler, which reads the data. To return command execution results from Redis to clients, sockets are mapped to the command-reply handler. And when a master server and a slave server perform replication, both map their sockets to a replication handler written specifically for the copy function.

Types of file events
The I/O multiplexer can monitor two kinds of events on multiple sockets: ae.h/AE_READABLE events and ae.h/AE_WRITABLE events. The correspondence between these two event types and socket operations is as follows:
When a socket becomes readable (for example, a client performs a write or close operation on it), or when a new connectable socket appears (that is, a client performs a connect operation against the listening socket), the socket generates an AE_READABLE event.
When a socket becomes writable (for example, a client performs a read operation on it), the socket generates an AE_WRITABLE event.
The I/O multiplexer can monitor AE_READABLE and AE_WRITABLE events at the same time. If a socket generates both events simultaneously, the file event dispatcher gives priority to the AE_READABLE event. That is, when a socket is both readable and writable, the Redis server reads from it first and then writes to it.
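The read-before-write priority can be sketched as follows. The flag values here are illustrative, not the actual constants from ae.h:

```python
AE_READABLE, AE_WRITABLE = 1, 2   # illustrative flag values

def dispatch_order(fired):
    """For each (socket, event_mask) pair, handle AE_READABLE before
    AE_WRITABLE, so a socket that is both readable and writable is
    read first and then written."""
    order = []
    for sock, mask in fired:
        if mask & AE_READABLE:
            order.append((sock, "read"))
        if mask & AE_WRITABLE:
            order.append((sock, "write"))
    return order
```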


Multithreading in Redis6

1. Are versions before Redis 6.0 really single-threaded?
When Redis processes a client request, all steps, including receiving it (socket read), parsing, execution, and returning the result (socket write), are handled serially by one main thread; this is the so-called "single thread".
Strictly speaking, though, Redis has not been single-threaded since 4.0: besides the main thread, it has background threads that handle slow operations such as cleaning dirty data, releasing useless connections, and deleting large keys.

2. Why didn't Redis use multithreading before 6.0?
The official has responded to similar questions: When using Redis, there is almost no CPU bottleneck. Redis is mainly limited by memory and network. For example, on a normal Linux system, Redis can handle 1 million requests per second by using pipelining, so if the application mainly uses O(N) or O(log(N)) commands, it will hardly take up too much CPU.
Single-threading also keeps maintainability high. Although a multi-threaded model performs well in some respects, it introduces uncertainty in program execution order, brings a series of concurrent read/write problems, increases system complexity, and incurs performance costs from thread switching, locking and unlocking, and even deadlock. Redis already achieves very high processing performance through its AE event model and IO multiplexing, so there was no need for multithreading. The single-threaded design greatly reduces the complexity of Redis internals, and otherwise thread-unsafe operations such as the lazy rehash of hashes and lpush can run without locks.

3. Why does Redis6.0 introduce multi-threading?
Redis keeps all data in memory, and memory response time is around 100 nanoseconds. For small packets, a Redis server can handle 80,000 to 100,000 QPS, which is roughly its limit. For 80% of companies, single-threaded Redis is sufficient. However, as business scenarios grow more complex, some companies routinely handle hundreds of millions of requests and therefore need higher QPS.
A common solution is to partition data across multiple servers in a distributed architecture, but this has serious drawbacks: there are many Redis servers to manage, so maintenance costs are high; some commands do not work across data partitions; partitioning does not solve hot-key read/write problems; and data skew, rebalancing, and scaling up or down all become more complicated.
From the perspective of Redis itself, the read/write system calls for network IO occupy most of the CPU time during execution, so the bottleneck is mainly network IO. There are two main directions for optimization:
  • Improve network IO performance, with typical implementations such as replacing the kernel network stack with DPDK. This kind of protocol-stack optimization has little to do with Redis itself.
  • Use multithreading to make full use of multiple cores, as Memcached does. Supporting multithreading is the most effective and convenient approach.
To sum up, there are two main reasons why Redis added multithreading:
  • It makes full use of server CPU resources; previously the main thread could only use one core.
  • Multithreaded IO tasks can share the read/write load of Redis's synchronous IO.

4. Does Redis 6.0 enable multi-threading by default?
Redis 6.0's multithreading is disabled by default; only the main thread is used. To enable it, you need to modify the redis.conf configuration file: io-threads-do-reads yes

After enabling multi-threading, you also need to set the number of threads, otherwise it will not take effect.
Regarding the number of threads, the official suggestion is 2 or 3 threads on a 4-core machine and 6 threads on an 8-core machine; the thread count should always be smaller than the number of machine cores. Note also that more threads is not always better: officially, more than 8 threads is considered basically meaningless.
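Putting the two settings together, a minimal redis.conf fragment for a 4-core machine might look like this (the directive names are the real ones; the thread count follows the official suggestion above):

```conf
# enable the extra IO threads (2 or 3 recommended on a 4-core machine)
io-threads 3
# also let the IO threads handle reads and protocol parsing, not just writes
io-threads-do-reads yes
```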

5. After Redis6.0 adopts multi-threading, what is the effect of performance improvement?
Antirez, the author of Redis, mentioned in his RedisConf 2019 talk that the multi-threaded IO feature introduced in Redis 6 improves performance by at least a factor of two. Experts in China have also tested the unstable version on Alibaba Cloud ECS: GET/SET performance with 4 IO threads is nearly double that of a single thread. Enabling multithreading requires at least a 4-core machine, and it is recommended only when a Redis instance already consumes a considerable amount of CPU time; otherwise, using multithreading is pointless.

6. What is the implementation mechanism of Redis6.0 multithreading?
In short: the main thread still accepts connections and executes all commands serially. When IO threads are enabled, the main thread gathers readable clients into a queue and hands them to the IO threads, which read the requests and parse the protocol in parallel; the main thread then executes the parsed commands one by one, and the IO threads write the replies back to the clients in parallel.

7. After enabling multi-threading, will there be thread concurrency security issues?
As the implementation mechanism above shows, the multi-threaded part of Redis is used only to read and write network data and parse the protocol; commands are still executed sequentially by the single main thread. So we do not need to worry about concurrency or thread-safety issues for keys, Lua scripts, transactions, LPUSH/LPOP, and so on.
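This division of labor can be sketched with standard threads and a queue. This is an illustration of the idea, not Redis source: several "IO threads" only parse requests and hand them off, while one "main thread" applies every command serially, so the keyspace needs no locks.

```python
import queue
import threading

parsed = queue.Queue()
data = {}  # the "keyspace": only the main thread ever writes to it

def io_thread(raw_commands):
    # IO threads do network reading and protocol parsing only
    for raw in raw_commands:
        op, key, val = raw.split()
        parsed.put((op, key, val))   # hand off to the main thread

def main_thread(n_commands):
    # commands are applied one at a time by a single thread
    for _ in range(n_commands):
        op, key, val = parsed.get()
        if op == "SET":
            data[key] = val

workers = [threading.Thread(target=io_thread, args=([f"SET k{i} v{i}"],))
           for i in range(4)]
for w in workers:
    w.start()
main_thread(4)   # blocks on the queue until all 4 parsed commands arrive
for w in workers:
    w.join()
```

Because only `main_thread` touches `data`, there is no data race even though parsing happens concurrently.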


Origin blog.csdn.net/qq_39944028/article/details/131146079