Redis data type and encoding

When people talk about Redis data types, the five common ones come to mind first: string (String), list (List), hash (Hash), set (Set), and sorted set (Sorted Set), along with their characteristics, application scenarios, and common commands. Before looking at these five types, we need to look at the global commands in Redis. Below are some common ones, with their time complexity and usage scenarios.

[Figure: common global commands, with their time complexity and usage scenarios]

Most of the global commands set or modify attributes of a key. Besides these, each of the five common data types has its own commands. Let's go through the types one by one and, along the way, get a rough understanding of how each is implemented internally.

To discuss the internals, we first need to look at the encodings that Redis defines:

/* Objects encoding. Some kind of objects like Strings and Hashes can be
 * internally represented in multiple ways. The 'encoding' field of the object
 * is set to one of this fields for this object. */
#define OBJ_ENCODING_RAW 0     /* Raw representation */
#define OBJ_ENCODING_INT 1     /* Encoded as integer */
#define OBJ_ENCODING_HT 2      /* Encoded as hashtable */
#define OBJ_ENCODING_ZIPMAP 3  /* Encoded as zipmap */ // deprecated
#define OBJ_ENCODING_LINKEDLIST 4 /* Encoded as regular linked list */
#define OBJ_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
#define OBJ_ENCODING_INTSET 6  /* Encoded as intset */
#define OBJ_ENCODING_SKIPLIST 7  /* Encoded as skiplist */
#define OBJ_ENCODING_EMBSTR 8  /* Embedded sds string encoding */
#define OBJ_ENCODING_QUICKLIST 9 /* Encoded as linked list of ziplists */
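
To relate these constants to what a client sees, here is a small Python sketch that maps each constant to the name the OBJECT ENCODING command is generally reported to return. The names ENCODING_NAMES and encoding_name are illustrative, and the mapping itself is an assumption that should be checked against the Redis version in use.

```python
# Assumed mapping from the OBJ_ENCODING_* constants to the names shown by
# OBJECT ENCODING. Verify against object.c in your Redis version.
ENCODING_NAMES = {
    0: "raw",
    1: "int",
    2: "hashtable",
    3: "zipmap",      # deprecated
    4: "linkedlist",
    5: "ziplist",
    6: "intset",
    7: "skiplist",
    8: "embstr",
    9: "quicklist",
}


def encoding_name(code: int) -> str:
    """Return the readable name for an encoding constant, or "unknown"."""
    return ENCODING_NAMES.get(code, "unknown")
```

For example, encoding_name(9) returns "quicklist", and a value outside the table returns "unknown".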

It doesn't matter if you don't understand these encoding methods, as we will talk about them later. Below we start from the five data types and explain them in turn.

(1) String

String is the most basic data type in Redis: one key maps to one value. Strings are binary safe, which means a Redis string can hold any data, such as an image or a serialized object. A single string value can be at most 512 MB.

There are three encodings for the string type: int, raw, and embstr.

1. int encoding

If the value saved by the string object is an integer that fits in the range of a long, Redis stores it directly as an integer and sets the encoding of the string object to int.

2. raw encoding

If the string object saves a string longer than 44 bytes, the value is stored in an SDS (simple dynamic string) structure and the encoding of the string object is set to raw. (The limit is 44 bytes since Redis 3.2; it was 39 bytes in Redis 3.0, and some older material quotes 32.)

3. embstr encoding

If the string object saves a string of at most 44 bytes, it is saved using embstr. The embstr encoding is an optimization of SDS: the string object header and the SDS value are placed in one contiguous block of memory obtained with a single allocation, which makes small string objects cheaper to create and free. In addition, embstr is effectively read-only: as soon as a modification occurs, the value is first converted to raw and then modified.

PS: If a floating-point number is stored in a string, it is first converted to a string, and the encoding is then chosen according to the three rules above. When the value is operated on as a floating-point number, it is converted from a string to a floating-point number for the calculation, and the result is converted back to a string for storage.
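
The three rules can be sketched as a small Python function. This is a simplified model, not the Redis implementation: it assumes a 64-bit long and the 44-byte embstr limit used since Redis 3.2, and the names EMBSTR_LIMIT and string_encoding are made up for this example.

```python
import re

EMBSTR_LIMIT = 44                         # assumption: limit in Redis 3.2+
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1    # assumption: 64-bit long

# Canonical decimal integers only: no leading zeros, no plus sign, no spaces.
_INT_RE = re.compile(r"-?(0|[1-9][0-9]*)")


def string_encoding(value: bytes) -> str:
    """Guess the encoding chosen for a freshly stored string value."""
    text = value.decode("latin-1")        # latin-1 can decode any byte string
    if _INT_RE.fullmatch(text) and text != "-0":
        if LONG_MIN <= int(text) <= LONG_MAX:
            return "int"
    # Everything else, including floating-point text, is stored as a string.
    return "embstr" if len(value) <= EMBSTR_LIMIT else "raw"
```

With this model, b"12345" is classified as int, b"3.14" as embstr (a floating-point number is just a short string), and a 45-byte value as raw.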

(2) Hash

The hash type is very similar to a table in a relational database. The key of the hash is unique, and the value part is a hashmap of fields and values. Compared with the string type, hash is more flexible and faster for storing objects: storing an object in a string requires serializing and parsing it (for example as JSON), and even when no conversion is needed, a hash still has the advantage in memory overhead. The hash model looks basically like this:

[Figure: the hash model]

The encodings of hash objects are ziplist and hashtable.

1. ziplist encoding

When the keys and values of all key-value pairs are at most 64 bytes long and the number of key-value pairs is at most 512, ziplist encoding is used. These limits are the defaults of the hash-max-ziplist-value and hash-max-ziplist-entries settings. The underlying data structure is a ziplist, and two adjacent ziplist nodes represent one key-value pair of the hash object.

2. hashtable encoding

When the key or value of any key-value pair is longer than 64 bytes, or the number of key-value pairs is greater than 512, hashtable encoding is used and the underlying data structure is a hashtable. A hash is structurally very similar to a hashtable, so each key-value pair of the hash object is simply one key-value pair in the hashtable.
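
As a rough illustration of the two rules, the following Python sketch decides which encoding a hash built from scratch would get, and shows the flat key, value, key, value layout of the ziplist form. It assumes the default limits (512 entries, 64 bytes); the function names are hypothetical.

```python
HASH_MAX_ZIPLIST_ENTRIES = 512   # assumed default of hash-max-ziplist-entries
HASH_MAX_ZIPLIST_VALUE = 64      # assumed default of hash-max-ziplist-value


def hash_encoding(fields: dict) -> str:
    """Pick the encoding for a hash given as {field_bytes: value_bytes}."""
    if len(fields) > HASH_MAX_ZIPLIST_ENTRIES:
        return "hashtable"
    for key, value in fields.items():
        if len(key) > HASH_MAX_ZIPLIST_VALUE or len(value) > HASH_MAX_ZIPLIST_VALUE:
            return "hashtable"
    return "ziplist"


def hash_ziplist_layout(fields: dict) -> list:
    """Flatten a hash into adjacent nodes: key1, value1, key2, value2, ..."""
    flat = []
    for key, value in fields.items():
        flat.extend((key, value))
    return flat
```

A hash such as {b"name": b"redis", b"port": b"6379"} stays in ziplist form and is laid out as four consecutive nodes; a single 65-byte value is enough to switch to hashtable.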

(3) List

A list in Redis is a simple list of strings, kept in insertion order. You can add an element to the head (left) or the tail (right) of the list, like a double-ended queue.

The list type is often used to implement message queues that exchange messages between multiple programs. In older versions (before 3.2), a list used two data structures: the compressed list (ziplist) and the doubly linked list. When the number of elements was small, a ziplist was used; when the number of elements grew, a doubly linked list was used.

Since 3.2, a new data structure, the quicklist (a linked list of ziplists), has become the underlying implementation of all lists.
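
The idea behind a quicklist can be sketched in Python as a chain of small arrays that supports pushes and pops at both ends. This is only a toy model: the class name QuickListSketch and its node_capacity parameter are made up here, and the per-node limit is a simple entry count, whereas real Redis configures node size through its own settings.

```python
from collections import deque


class QuickListSketch:
    """Toy quicklist: a chain of small arrays standing in for ziplist nodes."""

    def __init__(self, node_capacity: int = 4):
        self.node_capacity = node_capacity   # hypothetical per-node entry cap
        self.nodes = deque()                 # each node is a short Python list

    def push_left(self, value):
        # Start a new head node when the list is empty or the head is full.
        if not self.nodes or len(self.nodes[0]) >= self.node_capacity:
            self.nodes.appendleft([])
        self.nodes[0].insert(0, value)

    def push_right(self, value):
        # Start a new tail node when the list is empty or the tail is full.
        if not self.nodes or len(self.nodes[-1]) >= self.node_capacity:
            self.nodes.append([])
        self.nodes[-1].append(value)

    def pop_left(self):
        if not self.nodes:
            return None
        value = self.nodes[0].pop(0)
        if not self.nodes[0]:                # drop the node once it is empty
            self.nodes.popleft()
        return value

    def pop_right(self):
        if not self.nodes:
            return None
        value = self.nodes[-1].pop()
        if not self.nodes[-1]:
            self.nodes.pop()
        return value

    def to_list(self) -> list:
        """Return all elements from head to tail."""
        return [value for node in self.nodes for value in node]
```

Pushing and popping touch only the first or last small node, which is the property that lets one structure replace both the ziplist and the doubly linked list.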

(4) Set

A set is a collection of unique values. The set data type provided by Redis can be used to store such collections. The Redis set is an unordered collection of strings. The biggest advantage of sets is that they support intersection, union, and difference operations. A set can contain at most 4294967295 (2^32 - 1) elements.

The encodings of set objects are intset and hashtable.

1. intset encoding

When all elements of the set are integers and the number of elements is at most 512, intset encoding is used. The underlying data structure is an intset.

2. hashtable encoding

When these conditions are not met, that is, when some element is not an integer or the number of elements is greater than 512, the set object uses hashtable encoding. Each key of the hashtable is a string object that stores one element of the set, and every value of the hashtable is set to NULL.
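
The choice between the two encodings can be sketched in Python as follows. The sketch assumes the default limit of 512 entries and 64-bit integers; set_encoding, set_hashtable_layout, and the helper _is_int64 are illustrative names, not Redis functions.

```python
SET_MAX_INTSET_ENTRIES = 512              # assumed default limit
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1  # assumption: 64-bit integers


def _is_int64(member: str) -> bool:
    """True if member is a canonical decimal integer that fits in 64 bits."""
    try:
        number = int(member)
    except ValueError:
        return False
    # Reject forms such as "+5", "007", or " 5" that do not round-trip.
    return str(number) == member and INT64_MIN <= number <= INT64_MAX


def set_encoding(members) -> str:
    """Pick the encoding for a set given as an iterable of string members."""
    unique = set(members)
    if len(unique) <= SET_MAX_INTSET_ENTRIES and all(_is_int64(m) for m in unique):
        return "intset"
    return "hashtable"


def set_hashtable_layout(members) -> dict:
    """Model the hashtable form: each member is a key whose value is None."""
    return {member: None for member in set(members)}
```

A set of "1", "2", "3" is classified as intset, while adding a single non-integer member such as "two" makes the whole set use hashtable.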

(5) Sorted Set

Compared with a set, a sorted set adds a weight, the score, to each element, so that the elements can be ordered by score. Like a set, a sorted set is a collection of string elements and does not allow duplicate members. The difference is that each element is associated with a score of type double, and Redis uses the scores to sort the members from smallest to largest. Members are unique, but scores may repeat. A sorted set is kept ordered on insertion, that is, it is sorted automatically.

The encodings of sorted set objects are ziplist and skiplist.

1. ziplist encoding

When the number of elements is at most 128 and every member is at most 64 bytes long, ziplist encoding is used and the underlying data structure is a ziplist. Each element is represented by two adjacent ziplist nodes: the first holds the member and the second holds its score. Inside the ziplist, elements are sorted by score from smallest to largest.

2. skiplist encoding

When the number of elements is greater than 128 or any member is longer than 64 bytes, skiplist encoding is used. Internally the data is saved in a zset structure, which combines a skiplist with a dictionary.
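
A minimal Python sketch of these rules, assuming the default limits (128 entries, 64 bytes) and using hypothetical function names, might look like this. It also shows the member, score, member, score layout of the ziplist form, ordered by score with ties broken by member.

```python
ZSET_MAX_ZIPLIST_ENTRIES = 128   # assumed default limit on element count
ZSET_MAX_ZIPLIST_VALUE = 64      # assumed default limit on member length


def zset_encoding(scores: dict) -> str:
    """Pick the encoding for a sorted set given as {member_bytes: score}."""
    if len(scores) > ZSET_MAX_ZIPLIST_ENTRIES:
        return "skiplist"
    if any(len(member) > ZSET_MAX_ZIPLIST_VALUE for member in scores):
        return "skiplist"
    return "ziplist"


def zset_ziplist_layout(scores: dict) -> list:
    """Flatten into adjacent nodes: member1, score1, member2, score2, ..."""
    flat = []
    # Sort by score first, then by member so equal scores have a fixed order.
    for member, score in sorted(scores.items(), key=lambda kv: (kv[1], kv[0])):
        flat.extend((member, score))
    return flat
```

Because the dictionary keys are the members, a member can appear only once, while two members are free to share the same score.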

Finally, we summarize the five data types and the encoding used:

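string: int, raw, embstr
hash: ziplist, hashtable
list: quicklist (ziplist and doubly linked list before 3.2)
set: intset, hashtable
sorted set: ziplist, skiplist
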
April 28, 2020

Origin blog.csdn.net/weixin_43907422/article/details/105728058