Redis data structure: comprehensive analysis of Stream type

Redis is a high-performance key-value database that is widely used for its rich set of data types. Among them, the Stream type is one of the newest and most involved. Introduced in Redis 5.0, it provides a persistent, queryable, and scalable append-only log that is commonly used as a message queue.

In this article, we will fully analyze the Stream type of Redis. We'll start with the basic concepts and features of Stream, and then dive into its internal implementation and performance optimizations. We will also show how to use Stream in real applications through practical examples. Whether you are new to Redis or a developer with some experience, I believe you can learn some useful knowledge from this article.



1. Stream data type

1.1. Introduction to Stream type

Redis Stream, introduced in Redis 5.0, is an append-only data type that can serve as a persistent, queryable, and scalable message queue.

The Stream data structure resembles an append-only log: new data is added to the end of the Stream, and each entry is assigned a unique ID that increases over time. By default the ID has the form <millisecond timestamp>-<sequence number>. This makes the Stream type well suited to message queues, event-driven systems, data stream processing, and similar workloads.
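To make the ordering of IDs concrete, here is a minimal Python sketch of how an auto-generated ID can be derived from the previous one. The function names `next_stream_id` and `id_key` are made up for illustration, and the sketch ignores sequence-number overflow; it is a model of the behavior, not Redis's own code.

```python
from typing import Tuple


def next_stream_id(last_id: str, now_ms: int) -> str:
    """Return the ID that would follow last_id when generated at now_ms."""
    last_ms, last_seq = (int(part) for part in last_id.split("-"))
    if now_ms > last_ms:
        # A new millisecond: the sequence number restarts at 0.
        return f"{now_ms}-0"
    # Same millisecond (or the clock moved backwards): keep the last
    # timestamp and bump the sequence number so IDs keep increasing.
    return f"{last_ms}-{last_seq + 1}"


def id_key(entry_id: str) -> Tuple[int, int]:
    """Turn "<ms>-<seq>" into a tuple that sorts in stream order."""
    ms, seq = entry_id.split("-")
    return int(ms), int(seq)
```

Comparing IDs as plain strings would put "10-10" before "10-2", which is why the sketch compares the two numeric parts instead.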

Key features of the Stream type include:

  1. Persistence: Like other Redis data types, Stream data can be persisted to disk through RDB snapshots or the AOF log, so the data in a Stream survives a server restart when persistence is enabled.

  2. Consumer group: The Stream type supports consumer groups, which let multiple consumers read from the same Stream at the same time. Each consumer in a group receives entries that have not yet been delivered to another consumer in that group.

  3. Blocking read: Consumers can read from the Stream in a blocking manner. If there is no new data, the consumer waits until new data arrives or a timeout expires.

  4. Historical data query: Consumers can query historical data in a Stream by ID range, so they can go back to earlier entries after processing current ones.

The above are just some basic characteristics of the Stream type. In fact, the Stream type has many other characteristics and usages, which can meet various complex application scenarios.
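The consumer group behavior described above can be illustrated with a toy in-memory model. This is only a sketch under simplifying assumptions: the class `ToyGroup` and its fields are hypothetical, and it tracks just the two ideas that matter here, a delivery position shared by the group and a list of entries awaiting acknowledgement.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, Dict[str, str]]  # (entry ID, field/value pairs)


class ToyGroup:
    """Toy model of one consumer group on one stream (not Redis internals)."""

    def __init__(self, entries: List[Entry]) -> None:
        self.entries = entries               # the stream, oldest first
        self.next_index = 0                  # position after the last delivered entry
        self.pending: Dict[str, str] = {}    # entry ID -> consumer that received it

    def read(self, consumer: str, count: int) -> List[Entry]:
        """Deliver up to `count` never-delivered entries to `consumer`."""
        batch = self.entries[self.next_index:self.next_index + count]
        self.next_index += len(batch)
        for entry_id, _fields in batch:
            self.pending[entry_id] = consumer  # awaiting acknowledgement
        return batch

    def ack(self, entry_id: str) -> bool:
        """Mark an entry as processed; return False if it was not pending."""
        return self.pending.pop(entry_id, None) is not None


group = ToyGroup([("1-0", {"n": "a"}), ("2-0", {"n": "b"}), ("3-0", {"n": "c"})])
alice_batch = group.read("Alice", 2)   # Alice receives the first two entries
bob_batch = group.read("Bob", 2)       # Bob receives only what is left
```

Because the delivery position belongs to the group rather than to each consumer, Alice and Bob split the work instead of both seeing every entry.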

1.2. Stream usage scenarios

Redis Stream is a very flexible data structure that can be applied in many scenarios. The following are some common application scenarios:

  1. Message queue: Redis Stream can be used as a persistent and scalable message queue service for passing messages between different application components. Consumers can read new messages from the Stream in real time, or query historical messages.
  2. Event-driven system: In an event-driven system, Redis Stream can be used to store and deliver events. Each event can be used as a Stream element, which contains information such as event type, data and timestamp.
  3. Logging: Since Stream elements are stored in chronological order, Redis Stream is very suitable for logging. You can use log events as Stream elements, containing information such as log levels, messages, and timestamps.
  4. Data stream processing: Redis Stream can be used to implement data stream processing systems. You can treat streams of data as Stream elements and use consumer groups to process the data in parallel.
  5. Real-time analysis: You can use Redis Stream to collect real-time event data, and then analyze the data in real time, such as counting user behavior and monitoring system status.

The above are just some common application scenarios of Redis Stream. In fact, due to its powerful and flexible features, you can use Redis Stream in many other scenarios.


2. Stream underlying structure

2.1. Introduction to the underlying structure of Stream

The underlying data structure of Redis Stream is mainly composed of Radix Tree and Listpack. The radix tree is used to index the Listpack, and the Listpack is used to store the Stream Entry.

When a new Stream Entry is added to the Stream, Redis first tries to append it to the most recent Listpack. If that Listpack has already reached its configured limit (4096 bytes or 100 entries by default, set by stream-node-max-bytes and stream-node-max-entries), Redis creates a new Listpack and stores the new Stream Entry there.

The new Listpack is then inserted into the radix tree under the ID of its first Stream Entry. In this way, the radix tree can be used to quickly locate the Listpack that contains a given ID.

In short, a new Listpack is created, and the radix tree updated, only when the current Listpack is full.
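The append-and-split behavior can be sketched with a small Python model. Several things here are assumptions made for brevity: the class `ToyStream` is hypothetical, a sorted Python list stands in for the radix tree, and the node limit counts entries rather than bytes.

```python
import bisect
from typing import List, Tuple

StreamID = Tuple[int, int]  # (milliseconds, sequence)


class ToyStream:
    """Toy stream: nodes of entries indexed by the ID of their first entry."""

    def __init__(self, max_entries_per_node: int = 3) -> None:
        self.max_entries = max_entries_per_node  # stand-in for the node size limit
        self.first_ids: List[StreamID] = []      # sorted index keys
        self.nodes: List[List[StreamID]] = []    # nodes[i] starts at first_ids[i]

    def add(self, entry_id: StreamID) -> None:
        """Append an entry; IDs are assumed to arrive in increasing order."""
        if not self.nodes or len(self.nodes[-1]) >= self.max_entries:
            # The tail node is full (or missing): open a new node and
            # index it under the ID of its first entry.
            self.first_ids.append(entry_id)
            self.nodes.append([])
        self.nodes[-1].append(entry_id)

    def find_node(self, entry_id: StreamID) -> int:
        """Index of the node whose range covers entry_id, or -1 if none."""
        # Rightmost node whose first ID is <= entry_id.
        return bisect.bisect_right(self.first_ids, entry_id) - 1


stream = ToyStream(max_entries_per_node=3)
for ms in range(1, 8):
    stream.add((ms, 0))
```

With seven entries and a limit of three, the model ends up with three nodes, and a lookup only has to search the index keys before scanning a single node.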

2.2. Listpack

Listpack is a compact and efficient list type that stores multiple Stream entries. Each Stream Entry contains the following parts:

  1. Entry ID: Each Entry has a unique ID made up of two parts, a timestamp and a sequence number, which together guarantee that every Entry is unique.
  2. Field and Value: Each Entry contains one or more Field and Value pairs that hold the actual data.

In the underlying implementation of Redis Stream, all Stream entries are stored in Listpacks. Each Listpack can hold multiple Stream entries, and the Listpacks are indexed by the radix tree for quick lookup.

Listpack is a data structure introduced in Redis 5.0 and designed to replace Ziplist (compressed list). Listpack provides similar functionality to Ziplist, but with a simpler and more robust design.

Both Listpack and Ziplist are compact, efficient list types for storing multiple items. However, Listpack improves on Ziplist in the following ways:

  1. No cascading updates: Each Listpack element records its own length instead of the length of the previous element, so changing one element never forces later elements to be re-encoded, which could happen in Ziplist.
  2. Compact memory layout: Elements are stored back to back in a single allocation with variable-length encodings, which keeps the per-element overhead small.
  3. More predictable operations: Without cascading updates, the cost of an insertion or deletion is limited to the data that has to be moved.

So while Listpack can be seen as a replacement for Ziplist, it is more than a drop-in copy: its layout removes Ziplist's main weakness.

Each Listpack contains the following parts:

  1. Header: 6 bytes of metadata: the total number of bytes in the Listpack (4 bytes) followed by the number of elements (2 bytes). If the number of elements is too large for this field, the field is set to 65535 and the exact count has to be obtained by traversing the entire Listpack.

  2. Entries: The main part of the Listpack, which holds all the elements. Each element consists of an encoding header, the data, and a trailing length field. The encoding header gives the type and length of the data, and the trailing field stores the element's length so the Listpack can also be traversed from back to front.

  3. End: A single byte, 0xFF, that marks the end of the Listpack.

The elements of a Listpack can be strings or integers. Integers are stored in the smallest encoding that fits in order to save space: the smallest values are packed into a single byte, larger values use wider encodings up to 64 bits, and numbers outside the 64-bit range are stored as strings.

Listpack is designed to be very efficient at storing large numbers of small elements, while also supporting insertion or deletion of elements at arbitrary positions.
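The layout above can be made concrete with a short Python sketch that builds the bytes of a tiny Listpack. It is deliberately partial and rests on assumptions that go beyond this article: it handles only integers from 0 to 127 and strings of up to 63 bytes, and the specific bit patterns and little-endian header fields reflect my reading of the listpack format, so check them against the official specification before relying on them. The helper names are made up for illustration.

```python
import struct
from typing import List, Union


def encode_element(value: Union[int, bytes]) -> bytes:
    """Encode one small element; only two encodings are covered here."""
    if isinstance(value, int):
        if not 0 <= value <= 127:
            raise ValueError("sketch only covers integers 0..127")
        body = bytes([value])                      # 0xxxxxxx: 7-bit unsigned integer
    else:
        if len(value) > 63:
            raise ValueError("sketch only covers strings up to 63 bytes")
        body = bytes([0x80 | len(value)]) + value  # 10xxxxxx: 6-bit length, then data
    # Trailing length of the encoding+data part, used to walk backwards.
    # It fits in a single byte here because len(body) <= 64.
    return body + bytes([len(body)])


def build_listpack(values: List[Union[int, bytes]]) -> bytes:
    """Lay out the header, the elements, and the 0xFF end byte."""
    elements = b"".join(encode_element(v) for v in values)
    total_bytes = 6 + len(elements) + 1            # header + elements + end byte
    # 4-byte total size and 2-byte element count, both little-endian.
    header = struct.pack("<IH", total_bytes, len(values))
    return header + elements + b"\xff"


listpack = build_listpack([5, b"hi"])
```

In this example the integer 5 takes two bytes and the string "hi" takes four, so the whole structure, including the header and end byte, is 13 bytes long.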

2.3. Radix tree

A radix tree is a space-efficient structure for storing key-value pairs in which keys that share a prefix also share nodes. Redis Stream uses a radix tree to index its Listpacks: each key is the ID of the first Stream Entry in a Listpack, and the value is that Listpack. Through the radix tree, you can quickly locate the Listpack that contains a given ID.
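A radix tree orders its keys byte by byte, so the numeric ID has to be turned into bytes in a way that preserves its order. A common way to achieve this, and to my understanding the approach Redis takes, is to write both 64-bit parts of the ID in big-endian form. The sketch below shows the idea; the function names are hypothetical.

```python
import struct
from typing import Tuple


def encode_id_key(ms: int, seq: int) -> bytes:
    """Pack an entry ID into 16 bytes, most significant byte first."""
    return struct.pack(">QQ", ms, seq)


def decode_id_key(key: bytes) -> Tuple[int, int]:
    """Inverse of encode_id_key."""
    return struct.unpack(">QQ", key)


ids = [(2, 0), (1, 300), (1, 2), (256, 0)]
# Sorting the raw byte keys gives the same order as sorting the numeric IDs.
by_bytes = [decode_id_key(k) for k in sorted(encode_id_key(ms, seq) for ms, seq in ids)]
```

With a little-endian encoding the byte order would no longer match the numeric order, and range lookups over the tree would return entries out of sequence.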


3. Stream common commands

Redis Stream provides a series of commands for manipulating and managing Stream data structures. The following are some commonly used commands:

  1. XADD: Add a new Entry to the Stream.

    XADD mystream * sensor-id 1234 temperature 19.8
    
  2. XRANGE: Get a range of entries from the Stream (- and + stand for the smallest and largest possible IDs).

    XRANGE mystream - +
    
  3. XREAD: Read entries from one or more Streams, starting after the given ID (0 reads from the beginning).

    XREAD COUNT 2 STREAMS mystream 0
    
  4. XDEL: Delete the specified Entry from the Stream.

    XDEL mystream 1526569495631-0
    
  5. XTRIM: Trim the Stream so that only the specified number of most recent entries is kept.

    XTRIM mystream MAXLEN 1000
    
  6. XLEN: Get the number of entries in the Stream.

    XLEN mystream
    
  7. XGROUP: Manage the consumer groups of a Stream.

    XGROUP CREATE mystream mygroup 0
    
  8. XREADGROUP: Read entries through a consumer group (the ID > requests entries not yet delivered to any consumer in the group).

    XREADGROUP GROUP mygroup Alice STREAMS mystream >
    
  9. XACK: Acknowledge that entries delivered to a consumer group have been processed.

    XACK mystream mygroup 1526569498055-0
    

The above are some commonly used Redis Stream commands, which can be used to manage and manipulate Stream data structures.
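From application code, the same commands are usually called through a client library. The sketch below shows one possible read-and-acknowledge loop in Python. It assumes a client object that exposes redis-py style `xreadgroup` and `xack` methods and returns replies in the list form used by redis-py's default protocol; the function `process_new_entries` is hypothetical, and the stream, group, and consumer names are taken from the commands above.

```python
from typing import Any, Dict, List


def process_new_entries(client: Any, stream: str, group: str, consumer: str) -> List[Dict]:
    """Read entries not yet delivered to the group and acknowledge each one."""
    handled = []
    # ">" asks for entries never delivered to any consumer in this group.
    reply = client.xreadgroup(group, consumer, {stream: ">"}, count=10)
    for _stream_name, entries in reply:
        for entry_id, fields in entries:
            handled.append(fields)                # placeholder for real processing
            client.xack(stream, group, entry_id)  # confirm the entry is done
    return handled


# Example wiring (assumes the redis package and a server on localhost):
#   import redis
#   client = redis.Redis(decode_responses=True)
#   client.xadd("mystream", {"sensor-id": "1234", "temperature": "19.8"})
#   client.xgroup_create("mystream", "mygroup", id="0")
#   process_new_entries(client, "mystream", "mygroup", "Alice")
```

Acknowledging each entry only after it has been handled means that an entry whose processing fails stays in the group's pending list and can be retried later.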

Reposted from: blog.csdn.net/weixin_45187434/article/details/132593271