The underlying string structure in Redis: simple dynamic string (SDS)

Retrieved from: https://juejin.cn/post/6844903936520880135

For personal backup only; please read the original post.

Redis is written in C, but it does not use C's native string representation (a character array terminated by the null character '\0'). Instead, it builds its own abstract type, the simple dynamic string (SDS), and uses it as the default string representation in Redis.

Structure of the SDS header in the source code

  • len: Records the number of bytes currently in use (not including '\0'), so obtaining the SDS length is O(1)
  • alloc: Records the total number of bytes allocated for the byte array (not including '\0')
  • flags: Marks the header type of the current byte array, for example whether it is sdshdr8, sdshdr16, and so on; the flag values are defined in sds.h in the Redis source
  • buf: Byte array used to store the string, including the trailing null character '\0'
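As a rough illustration of the layout described above, the sketch below defines a single 8-bit header in the style of sdshdr8. The helper names sketch_new, sketch_type and sketch_hdr are hypothetical and exist only for this example; the real sds.h defines several header sizes and its own API. The packed attribute assumes a GCC or Clang compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header type tags stored in the low 3 bits of flags (values as in sds.h). */
#define SDS_TYPE_8    1
#define SDS_TYPE_16   2
#define SDS_TYPE_MASK 7

/* Simplified 8-bit header: suitable for strings of up to 255 bytes. */
struct __attribute__((__packed__)) sdshdr8 {
    uint8_t len;          /* bytes in use, not including the trailing '\0' */
    uint8_t alloc;        /* bytes allocated for buf, not including '\0' */
    unsigned char flags;  /* low 3 bits hold the header type */
    char buf[];           /* string bytes followed by '\0' */
};

/* Hypothetical helper: allocate header plus buf, return a pointer to buf. */
static char *sketch_new(const char *init) {
    size_t n = strlen(init);
    assert(n <= UINT8_MAX);
    struct sdshdr8 *h = malloc(sizeof(*h) + n + 1);
    if (h == NULL) return NULL;
    h->len = (uint8_t)n;
    h->alloc = (uint8_t)n;
    h->flags = SDS_TYPE_8;
    memcpy(h->buf, init, n + 1);  /* n + 1 also copies the '\0' */
    return h->buf;
}

/* The flags byte sits immediately before buf, so s[-1] reads it. */
static int sketch_type(const char *s) {
    return (unsigned char)s[-1] & SDS_TYPE_MASK;
}

/* Step back from buf to the start of the header. */
static struct sdshdr8 *sketch_hdr(char *s) {
    return (struct sdshdr8 *)(s - sizeof(struct sdshdr8));
}
```

Because callers hold a pointer to buf rather than to the header, the same pointer can be handed to code that expects a plain char *, while the metadata stays reachable through a fixed negative offset.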

Differences between SDS and C strings

1. Getting the string length in constant time

A C string does not record its own length, so obtaining it requires traversing the entire string, which is O(N). The SDS structure stores the length in its len attribute, so reading it is O(1). By reducing the cost of obtaining the string length from O(N) to O(1), Redis ensures that this operation does not become a performance bottleneck.
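The difference in cost can be sketched as follows. The sketch_str type and its functions are hypothetical stand-ins rather than the Redis API: the length is computed once when the value is created and is read back from a field afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified SDS-like value: the length travels with the data. */
struct sketch_str {
    size_t len;       /* bytes in use, not including '\0' */
    const char *buf;  /* null-terminated bytes */
};

/* O(N): walks the array until it reaches '\0', as strlen does. */
static size_t c_length(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') n++;
    return n;
}

/* Pays the O(N) scan once, at creation time. */
static struct sketch_str sketch_from(const char *text) {
    assert(text != NULL);
    struct sketch_str s = { c_length(text), text };
    return s;
}

/* O(1): a single field read, however long the string is. */
static size_t sketch_length(const struct sketch_str *s) {
    return s->len;
}
```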

2. Preventing buffer overflow and reducing the number of memory reallocations caused by modifying strings

A C string does not record its own length, so each time a string is lengthened or shortened, the program must reallocate the underlying character array. If an append (concatenation) operation does not first expand the underlying array through memory reallocation, a buffer overflow occurs; if a trim (truncation) operation does not release the unused space through memory reallocation, memory is leaked.

SDS uses unused space to decouple the length of the string from the length of the underlying array. Version 3.0 records the unused space in a free attribute, while version 3.2 records the total number of allocated bytes in an alloc attribute. Using this unused space, SDS implements two optimized allocation strategies, space pre-allocation and lazy space release, which address the space problems of string concatenation and truncation.
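The two strategies can be sketched as below. The type and function names are hypothetical, and the struct keeps buf behind a pointer for brevity instead of placing it after the header. The growth rule follows the policy used by sdsMakeRoomFor in the Redis source (double the new length while it is under 1 MB, otherwise add 1 MB); everything else is a simplification.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_PREALLOC (1024 * 1024)  /* 1 MB, like SDS_MAX_PREALLOC */

/* Hypothetical simplified SDS: buf holds alloc + 1 bytes once allocated. */
struct sketch_sds {
    size_t len;    /* bytes in use */
    size_t alloc;  /* bytes available, not including '\0' */
    char *buf;
};

/* Space pre-allocation: request more than is needed right now. */
static size_t sketch_grow_target(size_t len, size_t addlen) {
    size_t newlen = len + addlen;
    if (newlen < SKETCH_MAX_PREALLOC) newlen *= 2;
    else newlen += SKETCH_MAX_PREALLOC;
    return newlen;
}

/* Append tlen bytes. Returns 1 if a reallocation happened, 0 if the free
 * space was enough, -1 if the allocation failed. */
static int sketch_cat(struct sketch_sds *s, const char *t, size_t tlen) {
    int grew = 0;
    assert(s->len <= s->alloc);
    if (s->buf == NULL || s->alloc - s->len < tlen) {
        size_t newalloc = sketch_grow_target(s->len, tlen);
        char *p = realloc(s->buf, newalloc + 1);
        if (p == NULL) return -1;
        s->buf = p;
        s->alloc = newalloc;
        grew = 1;
    }
    memcpy(s->buf + s->len, t, tlen);
    s->len += tlen;
    s->buf[s->len] = '\0';
    return grew;
}

/* Lazy space release: shorten by moving len; the buffer is kept as is. */
static void sketch_truncate(struct sketch_sds *s, size_t newlen) {
    if (newlen < s->len) {
        s->len = newlen;
        s->buf[newlen] = '\0';
    }
}
```

In this sketch, appending 5 bytes to an empty value allocates room for 10, so a following 4-byte append needs no reallocation, and truncating back to 5 bytes leaves alloc at 10 for later growth.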

3. Binary safety

The characters in a C string must conform to some encoding, and apart from the terminator the string cannot contain null characters, since the first one would be treated as the end of the string. These restrictions mean that C strings can only store text data and cannot store binary data such as images.

The SDS API treats the data stored in the buf array as binary and places no restrictions on its contents. SDS uses the value of the len attribute, not a null character, to determine where the string ends.
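A small sketch of what length-based termination buys: the hypothetical sketch_bin type below (not the Redis API) stores five bytes with a null byte in the middle. Functions that rely on '\0' see only the first two bytes, while length-based copying and comparison see all five.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified SDS: the end of the data is defined by len. */
struct sketch_bin {
    size_t len;
    char *buf;
};

/* Copy exactly n bytes, whatever they contain, then append a '\0'. */
static int sketch_set(struct sketch_bin *s, const void *data, size_t n) {
    assert(data != NULL);
    char *p = malloc(n + 1);
    if (p == NULL) return -1;
    memcpy(p, data, n);
    p[n] = '\0';
    s->buf = p;
    s->len = n;
    return 0;
}

/* Compare by stored length instead of scanning for '\0'. */
static int sketch_equal(const struct sketch_bin *a, const struct sketch_bin *b) {
    return a->len == b->len && memcmp(a->buf, b->buf, a->len) == 0;
}
```

With the bytes 'a', 'b', '\0', 'c', 'd' stored this way, strlen on buf stops at the embedded null byte, whereas len still records all five bytes, so the value is distinguishable from the two-byte string "ab".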

4. Compatibility with some C string functions

Although the SDS API is binary safe, SDS still terminates its data with a null character, as a C string does. The purpose is to let an SDS that holds text data reuse some of the C string functions.
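To illustrate the reuse, the sketch below keeps a trailing '\0' in a hypothetical fixed-capacity sketch_text type and then passes buf directly to standard functions such as strncmp and snprintf. The type and helper names are assumptions made for this example only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical simplified SDS with a fixed capacity to keep the sketch short. */
struct sketch_text {
    size_t len;
    char buf[16];
};

static void sketch_init(struct sketch_text *s, const char *init) {
    size_t n = strlen(init);
    assert(n < sizeof s->buf);
    memcpy(s->buf, init, n);
    s->buf[n] = '\0';  /* the terminator is what allows <string.h> reuse */
    s->len = n;
}

/* Reuses strncmp directly on buf. */
static int sketch_has_prefix(const struct sketch_text *s, const char *prefix) {
    return strncmp(s->buf, prefix, strlen(prefix)) == 0;
}

/* Reuses the %s conversion, which also relies on the '\0' terminator. */
static int sketch_format(char *out, size_t outlen, const struct sketch_text *s) {
    return snprintf(out, outlen, "value=%s", s->buf);
}
```

This only works as expected for text data: if buf contained an embedded null byte, these functions would stop at it, which is why only some of the C string functions can be reused.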

Summary: Comparison of C strings and SDS

C string | SDS
Getting the string length is O(N) | Getting the string length is O(1)
API is unsafe and may cause buffer overflow | API is safe and will not cause buffer overflow
Modifying the string length N times necessarily requires N memory reallocations | Modifying the string length N times requires at most N memory reallocations
Can only store text data | Can store text or binary data
Can use all functions in the <string.h> library | Can use some of the functions in the <string.h> library


Origin blog.csdn.net/chushoufengli/article/details/115082902