Redis from Entry to Proficiency [Advanced Chapter]: Detailed Explanation of the Object Mechanism



0. Preface

Redis is a popular, high-performance in-memory database, and its support for several data types (strings, lists, hashes, sets, and sorted sets) is only part of the reason. Each of these data types is built from an object structure (redisObject) plus an encoded data structure that holds the actual value. The object structure is the common foundation of all data types: it records the type, encoding, reference count, and value pointer of each data object. This design is a large part of what gives Redis its strong performance and memory efficiency, and of what has earned it its standing in the field.

This article focuses on how the Redis object structure (redisObject) is implemented. We start with the basic principles of the object structure, analyze its components, and cover important concepts such as memory management and reference counting. Along the way we also look at how Redis uses the object structure in practice, to help you better understand the internal implementation of Redis.

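Today we will do things a little differently and start with some questions, the kind that often come up in interviews:

1. What is a redisObject?
2. How is the redisObject data structure laid out?
3. How does redisObject implement data sharing and object pooling?
4. Does the size of a redisObject change with the data type?
5. What are the common ways to serialize and deserialize a redisObject?
6. What other techniques does Redis use to improve performance and stability?
7. How does the object pool in Redis manage memory?
8. How does the shared pool in Redis manage shared string objects?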

1. Detailed explanation

redisObject is the basic data structure in Redis and one of the keys to storing and processing data efficiently. A redisObject can represent any of the Redis value types, such as strings, lists, and hashes, and it is used throughout the Redis code base.

1.1 Design purpose of redisObject object

Enhance the flexibility of data structures: Redis supports many different data structures, such as strings, lists, hashes, sets, and sorted sets, and all of them are represented as redisObject objects. By wrapping every type of data in the same structure, Redis can handle different types of data uniformly and support a wide variety of application scenarios.

Simplified memory management: All data in Redis is stored in memory, so memory must be managed effectively. Because every value is wrapped in a redisObject that carries its own reference count, Redis can decide when an object's memory may be released in one uniform way, instead of tracking the lifetime of each data structure separately.

Improve the efficiency of data storage and access: A redisObject is a lightweight object. Its header is only 16 bytes on a typical 64-bit build, so it can be stored and accessed quickly. Redis further reduces memory usage through object sharing and compact encodings, which also improves the efficiency of data storage and access.

Convenient data serialization and deserialization: Because a redisObject records the type and encoding of its value, the value can be serialized into a binary format (as RDB persistence does) and restored later, which is convenient for data transmission and storage.

1.2 redisObject data structure

redisObject is the structure Redis uses to represent every value type: strings, lists, hashes, sets, and sorted sets. It is defined in server.h:
Source: https://github.com/redis/redis/blob/6.0/src/server.h
Here is the definition copied from the source, followed by an analysis:

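typedef struct redisObject {
    unsigned type:4;        /* object type: OBJ_STRING, OBJ_LIST, ... */
    unsigned encoding:4;    /* underlying representation: OBJ_ENCODING_* */
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;           /* number of references to this object */
    void *ptr;              /* pointer to the encoded value */
} robj;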

Field meaning:

  • type: The type of the object, a 4-bit field that distinguishes the data types. Each data type (string, list, hash, set, sorted set) has its own type value, such as OBJ_STRING or OBJ_LIST.
  • encoding: The encoding of the object, also a 4-bit field. Because one data type can be backed by several different data structures, the encoding field records which one this particular object uses.
  • lru: A field of LRU_BITS bits used by key eviction. Under an LRU (Least Recently Used) policy it records the last time the object was accessed, so that the least recently used objects can be evicted first. Under an LFU policy the same bits hold an 8-bit access frequency and a 16-bit access time, as the comment in the source notes.
  • refcount: The reference count of the object, an integer that records how many references to the object currently exist. When the last reference is released, the object can be freed.
  • ptr: A pointer to the actual data of the object. The data it points to depends on the type and encoding. For example, the data of a raw string object is an sds string and the data of a list object is a quicklist; for an int-encoded string, the integer is stored in ptr itself.
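To make the layout concrete, here is a minimal sketch that copies the struct locally and inspects it. It is not linked against Redis: LRU_BITS is assumed to be 24 (the value in the 6.0 server.h), and robj_header_size and robj_store_type are hypothetical helper names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define LRU_BITS 24  /* assumed: the value used in Redis 6.0 server.h */

/* Local copy of the struct for illustration; not the Redis header. */
typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS;
    int refcount;
    void *ptr;
} robj;

/* The header size is the same whatever kind of value the object holds. */
size_t robj_header_size(void) {
    return sizeof(robj);
}

/* type is a 4-bit field, so only the low 4 bits of the argument are kept:
 * at most 16 distinct type values can be represented. */
unsigned robj_store_type(robj *o, unsigned type) {
    o->type = type;
    return o->type;
}
```

With common compilers the three bit-fields (4 + 4 + 24 bits) pack into a single unsigned, so the header is one unsigned, one int, and one pointer, which comes to 16 bytes on a typical 64-bit build.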

1.2 How Redis uses redisObject

When Redis uses redisObject, it mainly involves the following aspects:

1.2.1. Object Creation

Redis uses the createObject function to create data objects and initialize the object's type, encoding, and value pointer. Type-specific helpers built on top of it then choose an appropriate encoding for the actual value to improve memory efficiency.
The definition and implementation of createObject are located in the src/object.c file of the Redis source code.
Source location: https://github.com/redis/redis/blob/6.0/src/object.c
As before, let's copy the source code and analyze it:

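robj *createObject(int type, void *ptr) {
    robj *o = zmalloc(sizeof(*o));
    o->type = type;
    o->encoding = OBJ_ENCODING_RAW;   /* default encoding; callers may change it */
    o->ptr = ptr;
    o->refcount = 1;                  /* the creator holds the first reference */

    /* Set the LRU to the current lruclock (minutes resolution), or
     * alternatively the LFU counter. */
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
    } else {
        o->lru = LRU_CLOCK();
    }
    return o;
}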

As the source shows, createObject accepts two parameters, type and ptr: the type of the object and a pointer to its actual value. It first calls zmalloc to allocate memory the size of a redisObject structure. It then initializes the members: the type, the encoding (raw by default), the value pointer (ptr), and the reference count (refcount, starting at 1); the 6.0 source also initializes the lru field at this point. Finally, it returns the new object.

Note that object.c also contains other creation functions for specific kinds of objects. For example, createStringObject creates a string object, createQuicklistObject creates a list object, createSetObject creates a set object, and so on.
I won't go through each one here, to keep things easy to follow and remember. Follow the link above and skim the source to get a basic understanding.

1.2.2. Object reference counting

Redis uses reference counting to manage the life cycle of data objects, so that an object is not released while it is still in use. Each object has a reference count (refcount) attribute that records how many references to it currently exist. When the object is created, the count is initialized to 1; each new reference increments it by 1 (incrRefCount), and each release decrements it by 1 (decrRefCount). When the last reference is released, the object is deallocated. Reference counting is one of the classic garbage-collection strategies and is usually taught alongside the reachability analysis that mainstream JVMs use, so Java developers can learn it by analogy: the idea and the goal are the same.
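The life cycle described above can be sketched in a few lines of C. This is a simplified stand-in and not the Redis implementation: demo_obj, the demo_* functions, and the demo_live_objects counter are hypothetical names, and plain malloc/free take the place of zmalloc/zfree.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for robj: only the fields needed here. */
typedef struct demo_obj {
    int refcount;
    void *ptr;
} demo_obj;

int demo_live_objects = 0;  /* hypothetical counter, for the demo only */

demo_obj *demo_create(void *ptr) {
    demo_obj *o = malloc(sizeof(*o));
    if (o == NULL) abort();
    o->refcount = 1;        /* a new object starts with one owner */
    o->ptr = ptr;
    demo_live_objects++;
    return o;
}

/* A new owner takes a reference. */
void demo_incr_refcount(demo_obj *o) {
    o->refcount++;
}

/* An owner drops its reference; the last one frees the object. */
void demo_decr_refcount(demo_obj *o) {
    assert(o->refcount > 0);  /* releasing a dead object is a bug */
    if (o->refcount == 1) {
        free(o);
        demo_live_objects--;
    } else {
        o->refcount--;
    }
}
```

A caller that creates an object, shares it once, and then releases both references ends with demo_live_objects back at 0, which is the property reference counting is meant to guarantee.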

1.2.3. Object sharing

Some data objects in Redis are shared, such as common reply strings and small integers. To save memory, Redis represents each of these with a single object that many keys and replies can reference at once. Redis distinguishes shared objects from normal ones by a special reference count: in 6.0 a shared object's refcount is set to OBJ_SHARED_REFCOUNT (INT_MAX), and incrRefCount and decrRefCount never change it. If you are a Java developer, this may remind you of the constant pool (Constant Pool), most commonly seen as the string constant pool (String Pool), which stores string constant objects.
The string constant pool in Java is a special memory area that stores string constant objects. These objects come from string literals in the program or are interned at run time. When a string literal appears in a Java program, the compiler records it in the class file's constant pool, and the JVM shares a single String object for it at run time to improve memory efficiency.
Redis applies an idea similar to Java's constant pool to improve memory efficiency. Design ideas are independent of language: whatever solves the problem well in the right setting is good design. Implementations of C++, Python, and JavaScript use similar sharing techniques to save memory and improve performance. So it pays to learn by comparison; you will find that some ideas in computing have been in use for decades and still hold up. Enough digression, let's continue.

The value objects that Redis pre-allocates include the replies of various commands, such as +OK on success, -ERR on error, and +QUEUED when a command is queued inside a transaction, as well as all integers from 0 up to (but not including) OBJ_SHARED_INTEGERS, which is 10000 by default (older releases call this macro REDIS_SHARED_INTEGERS).

In the 6.0 source tree, the code for the pre-allocated integer objects and string value objects is in the following files under the Redis src directory:

  • server.h (redis.h in Redis 3.0 and earlier): defines the redisObject structure, whose typedef name is robj, along with the OBJ_SHARED_INTEGERS macro and the sharedObjectsStruct that holds the shared objects.

  • object.c: defines the object functions of Redis, including creation, release, and incrementing and decrementing the reference count.

  • server.c: contains createSharedObjects(), which pre-allocates the shared integer objects and string value objects at startup.

In createSharedObjects(), Redis pre-allocates a fixed set of integer objects and string value objects and keeps them in the global shared structure. In 6.0 the number of integers is fixed at compile time by the OBJ_SHARED_INTEGERS macro (10000) rather than set by an option in redis.conf, so the integers 0 through 9999 are pre-allocated. The shared string objects are common replies such as "+OK", "-ERR", and "+QUEUED".
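The pre-allocation of small integers can be sketched as follows. This is a simplified model rather than the Redis code: the demo_* names are hypothetical, the table is a static array instead of individually allocated objects, and the INT_MAX marker mirrors OBJ_SHARED_REFCOUNT as defined in the 6.0 server.h.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_SHARED_INTEGERS 10000    /* mirrors OBJ_SHARED_INTEGERS */
#define DEMO_SHARED_REFCOUNT INT_MAX  /* mirrors OBJ_SHARED_REFCOUNT */

typedef struct demo_int_obj {
    int refcount;
    void *ptr;   /* the integer is stored directly in the pointer */
} demo_int_obj;

demo_int_obj demo_shared_integers[DEMO_SHARED_INTEGERS];

/* Fill the table once at startup. */
void demo_create_shared_integers(void) {
    for (long j = 0; j < DEMO_SHARED_INTEGERS; j++) {
        demo_shared_integers[j].refcount = DEMO_SHARED_REFCOUNT;
        demo_shared_integers[j].ptr = (void *)(intptr_t)j;
    }
}

/* Small non-negative values reuse the shared object; anything else
 * gets a freshly allocated object with a refcount of 1. */
demo_int_obj *demo_integer_object(long value) {
    if (value >= 0 && value < DEMO_SHARED_INTEGERS)
        return &demo_shared_integers[value];
    demo_int_obj *o = malloc(sizeof(*o));
    if (o == NULL) abort();
    o->refcount = 1;
    o->ptr = (void *)(intptr_t)value;
    return o;
}

int demo_is_shared(const demo_int_obj *o) {
    return o->refcount == DEMO_SHARED_REFCOUNT;
}
```

Two requests for the value 42 return the same address, while 10000 falls outside the table and gets its own object that the caller must free.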

1.2.4. Object encoding

Redis supports multiple encodings for each type of data object; a string, for example, can use the int, embstr, or raw encoding. Redis chooses the encoding from the content and size of the actual value to improve memory efficiency: in 6.0 a string that parses as a long integer is stored as int, other strings of up to 44 bytes use embstr, and longer strings use raw. The choice is made automatically when the object is created and is saved in the object's encoding attribute.
The ptr pointer of the object points to the data structure that implements it, and the encoding attribute determines which data structure that is. In other words, encoding records what data structure the object uses as its underlying implementation.

/* Objects encoding. Some kind of objects like Strings and Hashes can be
 * internally represented in multiple ways. The 'encoding' field of the object
 * is set to one of this fields for this object. */
// The 10 encoding types
#define OBJ_ENCODING_RAW 0     /* Raw representation */     // raw representation; for a string object, a simple dynamic string
#define OBJ_ENCODING_INT 1     /* Encoded as integer */         // integer of type long
#define OBJ_ENCODING_HT 2      /* Encoded as hash table */      // dictionary
#define OBJ_ENCODING_ZIPMAP 3  /* Encoded as zipmap */          // no longer used
#define OBJ_ENCODING_LINKEDLIST 4 /* Encoded as regular linked list */  // doubly linked list, no longer used
#define OBJ_ENCODING_ZIPLIST 5 /* Encoded as ziplist */         // compressed list
#define OBJ_ENCODING_INTSET 6  /* Encoded as intset */          // integer set
#define OBJ_ENCODING_SKIPLIST 7  /* Encoded as skiplist */      // skip list plus dictionary
#define OBJ_ENCODING_EMBSTR 8  /* Embedded sds string encoding */   // simple dynamic string with embstr encoding
#define OBJ_ENCODING_QUICKLIST 9 /* Encoded as linked list of ziplists */   // doubly linked list of ziplists: the quicklist

Most object types can use more than one encoding. In 6.0, the encodings available to each type are:

  • string: int, embstr, raw
  • list: quicklist
  • hash: ziplist, hashtable
  • set: intset, hashtable
  • sorted set: ziplist, skiplist
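For strings, the way an encoding is chosen can be sketched roughly as follows. This is an approximation of the rule and not the Redis function itself: demo_string_encoding and demo_is_canonical_long are hypothetical names, the 44-byte limit mirrors OBJ_ENCODING_EMBSTR_SIZE_LIMIT in 6.0, and the integer check is a simplified version of the strict parsing Redis performs (no leading '+', spaces, or zeros, at most 20 characters).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_EMBSTR_SIZE_LIMIT 44  /* mirrors OBJ_ENCODING_EMBSTR_SIZE_LIMIT */

/* Returns 1 if s is the canonical decimal form of a long. */
static int demo_is_canonical_long(const char *s, size_t len) {
    char *end;
    size_t i;

    if (len == 0 || len > 20) return 0;
    i = (s[0] == '-') ? 1 : 0;
    if (i == len) return 0;                  /* a lone "-" */
    if (s[i] == '0') return len == 1;        /* only a bare "0" may start with 0 */
    if (s[i] < '1' || s[i] > '9') return 0;  /* rejects '+', spaces, letters */
    errno = 0;
    (void)strtol(s, &end, 10);
    return errno == 0 && *end == '\0';       /* no overflow, no trailing bytes */
}

/* Returns "int", "embstr" or "raw" for a NUL-terminated string. */
const char *demo_string_encoding(const char *s) {
    size_t len = strlen(s);
    if (demo_is_canonical_long(s, len)) return "int";
    if (len <= DEMO_EMBSTR_SIZE_LIMIT) return "embstr";
    return "raw";
}
```

Under this sketch "12345" maps to int, "hello" and "007" map to embstr, and any 45-byte string maps to raw.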

1.2.5. Object values

Redis uses the ptr attribute in redisObject to point to the actual data, whose type and content depend on the object's type and encoding. For example, for a raw or embstr string object ptr points to an sds string, for an int-encoded string the integer is stored in ptr itself, and for a list object ptr points to a quicklist.
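Because the meaning of ptr depends on both type and encoding, a command has to check the type first and then branch on the encoding. The following is a minimal sketch of that pattern with a string-length operation; demo_robj, the DEMO_* constants, and demo_strlen are hypothetical names, and a return value of -1 stands in for the WRONGTYPE error that Redis would send to the client.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { DEMO_STRING = 0, DEMO_LIST = 1 };      /* object types */
enum { DEMO_ENC_RAW = 0, DEMO_ENC_INT = 1 };  /* string encodings */

typedef struct demo_robj {
    unsigned type:4;
    unsigned encoding:4;
    void *ptr;
} demo_robj;

/* Returns the length of a string object, or -1 for a wrong-type object. */
long demo_strlen(const demo_robj *o) {
    if (o->type != DEMO_STRING) return -1;  /* type check */
    if (o->encoding == DEMO_ENC_INT) {
        /* The integer lives in ptr itself; its length is the number
         * of characters in its decimal form. */
        char buf[32];
        return snprintf(buf, sizeof(buf), "%ld", (long)(intptr_t)o->ptr);
    }
    /* Raw encoding: ptr points to the character data. */
    return (long)strlen((const char *)o->ptr);
}
```

The same call works on a raw string "hello" and on the int-encoded value -1234 (both have length 5), which is the kind of polymorphism the type and encoding fields make possible.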

Seen this way, all the basic Redis data types covered earlier in this series are created in essentially the same way at the bottom layer. Redis uses redisObject to represent the different types of data objects uniformly, and uses reference counting, shared objects, encodings, and the value pointer to manage and optimize them. This is what lets Redis achieve efficient memory management and data storage. If you can briefly cover the key points of this section when an interviewer asks, that already shows a solid understanding.

2. Summary

Redis uses its own object mechanism (redisObject) to implement type checking, command polymorphism, and garbage collection based on reference counts. Redis also pre-allocates some commonly used data objects and shares them, which reduces memory usage and avoids frequently allocating memory for small objects.
Finally, let's answer the questions raised above.

2.1. How does the redisObject object implement data sharing and object pool technology?

Data sharing technology: Redis shares a fixed set of pre-allocated objects rather than every repeated value. When a string value is an integer from 0 to 9999, the key can point to the pre-allocated shared integer object instead of a new one (this is skipped when a maxmemory limit is set together with an eviction policy that needs per-object LRU or LFU data). Arbitrary strings with equal contents are not shared, since finding an equal string would require comparing contents on every write. Shared objects carry the special reference count OBJ_SHARED_REFCOUNT and are never freed.

Object pool technology: Redis 6.0 does not keep a separate pool or free-object stacks for redisObject. createObject obtains memory with zmalloc and decrRefCount returns it with zfree, so reusing freed memory and limiting fragmentation are left to the underlying allocator (jemalloc by default on Linux), which serves small requests from fixed size classes. The closest thing to a pool is the set of shared objects created once at startup.

2.2. Will the size of the redisObject object change with different data types?

The redisObject header itself has a fixed size: the three bit-fields share 4 bytes, refcount takes 4 bytes, and ptr takes 8 bytes, for 16 bytes on a typical 64-bit build, whatever the data type. What varies is the size of the data that ptr points to.
Take the string object as an example. Its value is an sds string, which has a small header (including len, the length of the string, and alloc, the allocated capacity) followed by buf, the actual content. The memory used by the value is therefore the size of the sds header plus the size of the buffer. For list, hash, set, and sorted set objects, the size of the value likewise varies with the underlying data structure and the number of elements.
So when estimating the total memory used by a value, add the fixed redisObject header (type, encoding, lru, refcount, and ptr) to the memory used by the data structure it points to.
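As a worked example of this arithmetic, consider an embstr string, where the object header, the sds header, and the bytes of the string share one allocation. The sketch below assumes a 16-byte redisObject header and the 3-byte header of the smallest sds type (len, alloc, and flags, one byte each); demo_embstr_alloc_size is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ROBJ_SIZE    16  /* assumed: sizeof(robj) on a 64-bit build */
#define DEMO_SDSHDR8_SIZE  3  /* assumed: len (1) + alloc (1) + flags (1) */

/* Bytes requested for an embstr object holding `len` bytes of payload:
 * object header + sds header + payload + terminating NUL byte. */
size_t demo_embstr_alloc_size(size_t len) {
    return DEMO_ROBJ_SIZE + DEMO_SDSHDR8_SIZE + len + 1;
}
```

For a 44-byte string this gives 16 + 3 + 44 + 1 = 64 bytes, which is why the embstr limit sits at 44: the whole object still fits in a 64-byte allocation.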

2.3. How does the object pool technology in Redis manage memory?

Redis 6.0 has no dedicated object pool for redisObject; a pool split into a large object stack and a small object stack does not appear in the source. Memory management for objects has two parts: allocation and release.

Allocation: Every redisObject is allocated through zmalloc, a thin wrapper over the configured allocator (jemalloc by default on Linux). The allocator groups requests into fixed size classes, which limits fragmentation for the many small, equally sized object headers. An embstr string goes one step further and places the object header and the string bytes in a single allocation.

Release: When decrRefCount is called on an object whose reference count is 1, it frees the value according to the object's type and then returns the header to the allocator with zfree. Freed objects are not kept on a free list for reuse, so there is no threshold at which a pool shrinks; the allocator decides when memory goes back to the system.

Leaving reuse to the allocator's size classes instead of a hand-written pool is a common choice in systems software.

2.4. How does the shared pool in Redis manage shared string objects?

The shared objects live in a single global structure, shared, which createSharedObjects() fills in once at startup. Each shared object is an ordinary redisObject whose refcount is set to OBJ_SHARED_REFCOUNT (INT_MAX); incrRefCount and decrRefCount leave that value alone, so a shared object is never freed. There is no lookup by value for arbitrary strings. Instead, when Redis encodes a new string value, tryObjectEncoding checks whether it is an integer in the range 0 to 9999; if it is, and either no maxmemory limit is set or the eviction policy does not need per-object LRU or LFU data, the newly created object is released and the key points to the matching entry in shared.integers. Any other string gets its own object with a reference count of 1, which is decremented when the key is deleted or overwritten; when the last reference is released, the object is freed and its memory reclaimed.

Shared objects are read-only, similar to String objects in Java. A command that modifies a string in place, such as APPEND or SETRANGE, first calls dbUnshareStringValue, which creates a private raw string copy when the stored object is shared or is not raw-encoded, and then modifies the copy. This keeps a change made through one key from affecting any other user of the shared object.
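The copy-before-modify rule can be sketched as follows. This is loosely modeled on the idea behind dbUnshareStringValue and is not the Redis code: demo_str, demo_str_new, and demo_unshare are hypothetical names, and INT_MAX is used as the marker for a never-freed shared object.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct demo_str {
    int refcount;
    char *buf;
} demo_str;

demo_str *demo_str_new(const char *s) {
    size_t n = strlen(s) + 1;
    demo_str *o = malloc(sizeof(*o));
    if (o == NULL) abort();
    o->buf = malloc(n);
    if (o->buf == NULL) abort();
    memcpy(o->buf, s, n);
    o->refcount = 1;
    return o;
}

/* Return an object that is safe to modify in place. If `o` has other
 * owners (or is a never-freed shared object), hand back a private copy
 * and drop the caller's reference to the original. */
demo_str *demo_unshare(demo_str *o) {
    if (o->refcount == 1) return o;          /* sole owner: modify directly */
    demo_str *copy = demo_str_new(o->buf);
    if (o->refcount != INT_MAX) o->refcount--;
    return copy;
}
```

After demo_unshare returns, writing through the result never changes what the other owners of the original object see.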

2.5. How to judge whether a string object is in the shared pool?

You can use the OBJECT commands to inspect a string object. OBJECT ENCODING reports the encoding (a shared integer is always int), and OBJECT REFCOUNT reports the reference count: in 6.0 a shared object reports 2147483647 (INT_MAX), while an unshared object normally reports 1. DEBUG OBJECT prints the refcount and encoding together with other details. OBJECT IDLETIME only reports how long the key has been idle and does not tell you whether the object is shared.

3. Redis from entry to proficiency series of articles

  • "Redis from Entry to Proficiency [Advanced Chapter]: Detailed Explanation of the Message Publish and Subscribe Mode"
  • "Redis from Entry to Proficiency [Advanced Chapter]: Detailed Explanation of AOF Persistence"
  • "Redis from Entry to Proficiency [Advanced Chapter]: Detailed Explanation of RDB Persistence"
  • "Redis from Entry to Proficiency [Advanced Chapter]: Detailed Explanation of the Underlying Data Structure Dictionary"
  • "Redis from Entry to Proficiency [Advanced Chapter]: Detailed Explanation of the Underlying Data Structure QuickList"
  • "Redis from Entry to Proficiency [Advanced Chapter]: Detailed Explanation of the Underlying Data Structure Simple Dynamic String (SDS)"
  • "Redis from Entry to Proficiency [Advanced Chapter]: Detailed Explanation of the Underlying Data Structure Compressed List (ZipList)"
  • "Redis from Entry to Proficiency [Advanced Chapter]: Detailed Explanation and Usage Examples of the Stream Data Type"
Hello everyone, I am Freezing Point. That is all for today's installment of Redis from entry to proficiency [Advanced Chapter], the detailed explanation of the object mechanism. If you have questions or comments, please leave a message in the comment area.


Origin blog.csdn.net/wangshuai6707/article/details/131556935