Retrieved from: https://juejin.cn/post/6844903936520880135
Redis is written in C, but it does not use C's native string representation (a character array terminated by the null character '\0'). Instead, it builds an abstract type called the simple dynamic string (SDS) and uses it as the default string representation in Redis.
Structure of the SDS header in the source code
len: records the number of bytes currently in use (excluding the terminating '\0'), so obtaining the SDS length is O(1)
alloc: records the total number of bytes allocated for the byte array (excluding the terminating '\0')
flags: marks the type of the current header, that is, whether it is sdshdr8, sdshdr16, and so on; the possible flag values are defined in the Redis source code
buf: the byte array that stores the string, including the terminating null character '\0'
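The four fields above can be sketched as C structs. This is a minimal sketch modeled on the header layout used since Redis 3.2, not a copy of the source: only the 8-bit and 16-bit variants are shown, `header_size` is a hypothetical helper added for illustration, and the `__attribute__((__packed__))` syntax assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Type tags kept in the low 3 bits of flags (modeled on Redis's sds.h). */
#define SDS_TYPE_8    1
#define SDS_TYPE_16   2
#define SDS_TYPE_MASK 7

/* Header for strings whose length fits in 8 bits. */
struct __attribute__((__packed__)) sdshdr8 {
    uint8_t len;          /* bytes in use, excluding the trailing '\0' */
    uint8_t alloc;        /* bytes allocated for buf, excluding the trailing '\0' */
    unsigned char flags;  /* header type tag */
    char buf[];           /* string bytes followed by '\0' */
};

/* Same layout with 16-bit length fields for longer strings. */
struct __attribute__((__packed__)) sdshdr16 {
    uint16_t len;
    uint16_t alloc;
    unsigned char flags;
    char buf[];
};

/* Packing removes padding, so flags is always the byte just before buf. */
static_assert(sizeof(struct sdshdr8) == 3, "len + alloc + flags");
static_assert(sizeof(struct sdshdr16) == 5, "len + alloc + flags");

/* Hypothetical helper: header overhead in bytes for a type tag, 0 if unknown. */
size_t header_size(unsigned char flags) {
    switch (flags & SDS_TYPE_MASK) {
    case SDS_TYPE_8:  return sizeof(struct sdshdr8);
    case SDS_TYPE_16: return sizeof(struct sdshdr16);
    default:          return 0;
    }
}
```

Choosing the smallest header that can hold the length keeps the per-string overhead low: a short string pays 3 bytes of header here, a longer one 5.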
Differences between SDS and C strings
1. Get string length with constant complexity
A C string does not record its own length, so obtaining it requires traversing the entire string, which is O(N). The SDS structure stores the length in its len attribute, so reading it is O(1). By reducing the cost of obtaining the string length from O(N) to O(1), Redis ensures that this operation does not become a performance bottleneck.
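The contrast between the two lookups can be illustrated with a small sketch. The `mini_sds` struct and all function names below are hypothetical and simplified (a fixed-size buffer instead of a heap allocation); they are not the Redis API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified SDS-like value; not Redis's actual definition. */
struct mini_sds {
    size_t len;    /* bytes in use, excluding the trailing '\0' */
    size_t alloc;  /* capacity of buf, excluding the trailing '\0' */
    char buf[32];  /* fixed size only to keep the sketch short */
};

/* O(N): walks the array until it finds '\0', the way strlen does. */
size_t c_string_length(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* O(1): returns the stored field without touching the bytes. */
size_t mini_sds_length(const struct mini_sds *s) {
    return s->len;
}

/* Builds a mini_sds from a short C string, recording its length once. */
struct mini_sds mini_sds_from(const char *text) {
    struct mini_sds s = {0};
    size_t n = strlen(text);
    assert(n < sizeof(s.buf));   /* sketch: the text must fit in buf */
    s.len = n;
    s.alloc = sizeof(s.buf) - 1;
    memcpy(s.buf, text, n + 1);  /* copy the bytes and the '\0' */
    return s;
}
```

The length is computed once when the value is built and kept up to date by every operation that changes it, so later reads cost the same regardless of how long the string is.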
2. Prevent buffer overflow and reduce the number of memory reallocations caused by modifying strings
Because a C string does not record its own length, every operation that lengthens or shortens it must reallocate the underlying character array. If an append (concatenation) does not first grow the array through reallocation, a buffer overflow occurs; if a truncation (trim) does not release the unused space through reallocation, memory is leaked.
SDS uses unused space to decouple the length of the string from the length of the underlying array. Version 3.0 uses a free attribute to record the unused space, while version 3.2 uses the alloc attribute to record the total number of bytes allocated. With this unused space, SDS implements two optimization strategies, space pre-allocation and lazy space release, which address the allocation problems of string concatenation and truncation.
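The two strategies can be sketched as follows. This is an illustration under stated assumptions, not Redis's implementation: `dyn_str` and its functions are hypothetical names, the `reallocs` counter exists only to make the effect visible, and the growth rule (allocate twice the needed size) is a simplification of the policy Redis actually applies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string; names are illustrative, not Redis's API. */
struct dyn_str {
    size_t len;         /* bytes in use, excluding the trailing '\0' */
    size_t alloc;       /* bytes available in buf, excluding the trailing '\0' */
    char *buf;
    unsigned reallocs;  /* number of reallocations, for demonstration only */
};

struct dyn_str dyn_new(void) {
    struct dyn_str s = {0, 0, malloc(1), 0};
    assert(s.buf != NULL);  /* sketch: treat allocation failure as fatal */
    s.buf[0] = '\0';
    return s;
}

/* Append n bytes; the capacity check runs first, so the copy cannot overflow. */
void dyn_append(struct dyn_str *s, const char *src, size_t n) {
    size_t needed = s->len + n;
    if (needed > s->alloc) {
        size_t new_alloc = needed * 2;  /* pre-allocation (assumed doubling) */
        char *p = realloc(s->buf, new_alloc + 1);
        assert(p != NULL);
        s->buf = p;
        s->alloc = new_alloc;
        s->reallocs++;
    }
    memcpy(s->buf + s->len, src, n);
    s->len = needed;
    s->buf[s->len] = '\0';
}

/* Lazy release: shorten the string but keep the spare capacity for reuse. */
void dyn_truncate(struct dyn_str *s, size_t new_len) {
    if (new_len < s->len) {
        s->len = new_len;
        s->buf[new_len] = '\0';
    }
}

void dyn_free(struct dyn_str *s) {
    free(s->buf);
    s->buf = NULL;
    s->len = 0;
    s->alloc = 0;
}
```

In this sketch, appending "ab" and then "cd" to an empty string triggers a single reallocation, because the first append leaves room for the second. Truncating afterwards changes only len, so a following append can reuse the retained space.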
3. Binary safety
The characters in a C string must conform to some encoding, and the string cannot contain a null character anywhere except at its end; otherwise that character is mistaken for the end of the string. These restrictions mean that a C string can store only text data and cannot store binary data such as images.
The SDS API treats the data stored in the buf array as binary and places no restrictions on its contents. SDS uses the value of the len attribute, not a null character, to determine where the string ends.
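A short sketch shows why the stored length matters for data with embedded null bytes. The `bin_str` type and its functions are hypothetical and use a fixed-size buffer for brevity; they are not part of the SDS API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-prefixed buffer; not Redis's actual definition. */
struct bin_str {
    size_t len;    /* number of payload bytes, which may include '\0' */
    char buf[16];
};

/* Copies exactly n bytes, whatever they contain, then adds a trailing '\0'. */
struct bin_str bin_from(const void *data, size_t n) {
    struct bin_str s = {0};
    assert(n < sizeof(s.buf));  /* sketch: the payload must fit in buf */
    memcpy(s.buf, data, n);
    s.len = n;
    s.buf[n] = '\0';
    return s;
}

/* Compares by stored length and raw bytes; never searches for '\0'. */
int bin_equal(const struct bin_str *a, const struct bin_str *b) {
    return a->len == b->len && memcmp(a->buf, b->buf, a->len) == 0;
}
```

For the five-byte payloads "ab\0cd" and "ab\0xy", strlen reports 2 for both and strcmp considers them equal, because both functions stop at the first null byte. The length-based comparison sees all five bytes and reports that they differ.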
4. Compatible with some C string functions
Although the SDS API is binary safe, an SDS still ends with a null character, just like a C string. The purpose is to let an SDS that holds text data reuse some of the C string functions.
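As an illustration, a null-terminated buffer can be passed directly to functions from <string.h>. The `text_str` type and helper names below are hypothetical; the sketch assumes the stored data is text with no embedded null bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical SDS-like text value; not Redis's actual definition. */
struct text_str {
    size_t len;    /* bytes in use, excluding the trailing '\0' */
    char buf[32];  /* always null-terminated */
};

struct text_str text_from(const char *text) {
    struct text_str s = {0};
    size_t n = strlen(text);
    assert(n < sizeof(s.buf));   /* sketch: the text must fit in buf */
    memcpy(s.buf, text, n + 1);  /* keep the trailing '\0' */
    s.len = n;
    return s;
}

/* buf is a valid C string, so strncmp works on it without conversion. */
int text_has_prefix(const struct text_str *s, const char *prefix) {
    return strncmp(s->buf, prefix, strlen(prefix)) == 0;
}

/* The same holds for strstr. */
int text_contains(const struct text_str *s, const char *needle) {
    return strstr(s->buf, needle) != NULL;
}
```

This reuse is safe only for text: if the buffer held binary data with embedded null bytes, these functions would stop at the first one and see only part of the value.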
Summary: Comparison of C string and SDS
C string | SDS |
---|---|
The complexity of getting the string length is O(N) | The complexity of getting the string length is O(1) |
API is not safe and may cause buffer overflow | API is safe and will not cause buffer overflow |
Modifying the string length N times always requires N memory reallocations | Modifying the string length N times requires at most N memory reallocations |
Can only save text data | Can save text or binary data |
Can use all of the functions in the <string.h> library | Can use some of the functions in the <string.h> library |