Redis data structures: the simple dynamic string (SDS)

 One: The structure of a simple dynamic string

1: SDS structure

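struct sdshdr {
  // Number of bytes in use in buf;
  // equals the length of the string the SDS holds
  int len;
  // Number of unused bytes remaining in buf
  int free;
  // Byte array that stores the string
  char buf[];
};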

2: SDS data format example

  • The free attribute is 5, which means this SDS has 5 bytes of unused space.
  • The len attribute is 5, which means this SDS stores a five-byte string.
  • The buf attribute is an array of type char. Its first five bytes hold the characters 'R', 'e', 'd', 'i', and 's'; the sixth byte holds the null character '\0', and the 5 unused bytes follow it.
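The layout described above can be sketched in C as follows. This is a minimal illustration, not Redis's own code: demo_sds_new is a hypothetical helper, and the buffer is sized as len + free + 1 so that it matches the example (5 bytes used, 5 bytes free, 1 byte for '\0').

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;    /* bytes in use in buf */
    int free;   /* unused bytes in buf */
    char buf[]; /* string bytes, then '\0', then the unused space */
};

/* Hypothetical helper: copy init into a new SDS and reserve `extra`
 * unused bytes after the terminating '\0'. Returns NULL on failure. */
static struct sdshdr *demo_sds_new(const char *init, int extra)
{
    assert(init != NULL && extra >= 0);
    int len = (int)strlen(init);
    struct sdshdr *sh = malloc(sizeof(*sh) + (size_t)len + (size_t)extra + 1);
    if (sh == NULL) return NULL;
    sh->len = len;
    sh->free = extra;
    memcpy(sh->buf, init, (size_t)len + 1); /* also copies the '\0' */
    return sh;
}
```

Calling demo_sds_new("Redis", 5) produces a header with len 5 and free 5, with buf[5] holding '\0'. The caller releases it with the standard free() function.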

Two: The advantages of simple dynamic strings

1: Constant complexity to obtain the length of the string

A C string does not record its own length, so obtaining it means traversing the whole string. An SDS only needs to read the len attribute of its structure, which reduces the cost of obtaining the string length from O(N) to O(1).
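The difference can be sketched as follows. The helper names (demo_sds_new, demo_sds_length) are hypothetical and only mirror the structure shown earlier; they are not the Redis API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;
    int free;
    char buf[];
};

/* Hypothetical constructor with no spare space. */
static struct sdshdr *demo_sds_new(const char *init)
{
    assert(init != NULL);
    size_t len = strlen(init);
    struct sdshdr *sh = malloc(sizeof(*sh) + len + 1);
    if (sh == NULL) return NULL;
    sh->len = (int)len;
    sh->free = 0;
    memcpy(sh->buf, init, len + 1);
    return sh;
}

/* O(1): a single field read, independent of the string's size.
 * strlen(sh->buf) would instead scan every byte up to '\0'. */
static size_t demo_sds_length(const struct sdshdr *sh)
{
    return (size_t)sh->len;
}
```

For text without embedded null characters, both approaches return the same number; only the cost differs.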

2: Put an end to buffer overflow

The strcat function assumes that the destination already has enough room for the appended data. If you forget to check the remaining space before calling it, the data overflows into adjacent memory and can silently modify whatever is stored there, such as another string.

Suppose two C strings S1 and S2 sit next to each other in memory: S1 stores the string "Redis" and S2 stores the string "MongoDB".

If a programmer executes strcat(S1, " Cluster") without first allocating enough space for S1, the data of S1 overflows into the memory of S2 and overwrites the string stored there.

SDS string concatenation does not have this problem. Before modifying the buffer, the SDS API checks whether the remaining space is sufficient. If it is not, it first expands the buffer to the required size and only then performs the concatenation, so no buffer overflow can occur.
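A minimal sketch of this check-then-append idea is shown below. demo_sds_cat is a hypothetical stand-in, not the real sdscat: for simplicity it grows the buffer to exactly the size needed and reports allocation failure through its return value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;
    int free;
    char buf[];
};

/* Hypothetical constructor: copies init, reserves `extra` spare bytes. */
static struct sdshdr *demo_sds_new(const char *init, int extra)
{
    int len = (int)strlen(init);
    struct sdshdr *sh = malloc(sizeof(*sh) + (size_t)len + (size_t)extra + 1);
    if (sh == NULL) return NULL;
    sh->len = len;
    sh->free = extra;
    memcpy(sh->buf, init, (size_t)len + 1);
    return sh;
}

/* Append t to *shp. The free space is checked first and the buffer is
 * enlarged when it is too small, so the copy can never run past the
 * end of buf. Returns 0 on success, -1 if memory could not be obtained. */
static int demo_sds_cat(struct sdshdr **shp, const char *t)
{
    struct sdshdr *sh = *shp;
    int add = (int)strlen(t);

    if (sh->free < add) {
        struct sdshdr *bigger =
            realloc(sh, sizeof(*sh) + (size_t)sh->len + (size_t)add + 1);
        if (bigger == NULL) return -1; /* original SDS is left untouched */
        sh = bigger;
        sh->free = add;
    }
    memcpy(sh->buf + sh->len, t, (size_t)add + 1); /* includes '\0' */
    sh->len += add;
    sh->free -= add;
    *shp = sh;
    return 0;
}
```

Because the buffer may move when it grows, the function takes a pointer to the caller's pointer and updates it.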

3: Reduce the number of memory reallocations when the string is modified

  • Every time a C string is lengthened (for example by an append), its memory must first be reallocated to make room. If this step is forgotten, the result is a buffer overflow.
  • Every time a C string is shortened (for example by a trim), its memory must be reallocated afterwards to release the space that is no longer in use. If this step is forgotten, the result is a memory leak.

    SDS addresses the problem of frequent memory reallocation with two strategies: space pre-allocation and lazy space release.

  3.1: Pre-allocation of space

      When an append finds that an SDS has too little free space and must be expanded, the program allocates not only the space the modification requires but also additional unused space. The allocation rules are as follows:

  • If the length of the SDS after modification (the new len value) is less than 1 MB, the program allocates unused space equal to len. For example, if S1 is 13 bytes long after modification, free is also set to 13 bytes, and the buf array is actually 13 + 13 + 1 = 27 bytes long (the extra byte stores the null character).
  • If the length of the SDS after modification is greater than or equal to 1 MB, the program allocates 1 MB of unused space. For example, if len is 2 MB after modification, free is 1 MB, and the buf array is actually 2 MB + 1 MB + 1 byte long.
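The two rules above can be written as a pair of small functions. The names (DEMO_SDS_MAX_PREALLOC, demo_free_after_growth, demo_buf_size) are hypothetical and chosen for this sketch; only the 1 MB threshold and the arithmetic come from the rules themselves.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SDS_MAX_PREALLOC ((size_t)1024 * 1024) /* 1 MB threshold */

/* Unused space to reserve for an SDS whose length after the
 * modification is new_len: as much as new_len below 1 MB,
 * a flat 1 MB otherwise. */
static size_t demo_free_after_growth(size_t new_len)
{
    return new_len < DEMO_SDS_MAX_PREALLOC ? new_len : DEMO_SDS_MAX_PREALLOC;
}

/* Total size of buf: used bytes + reserved bytes + 1 byte for '\0'. */
static size_t demo_buf_size(size_t new_len)
{
    return new_len + demo_free_after_growth(new_len) + 1;
}
```

With new_len equal to 13, demo_buf_size returns 27, matching the first example; with new_len equal to 2 MB, it returns 3 MB plus 1 byte, matching the second.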

  3.2: Lazy space release

      When sdstrim shortens an SDS, the bytes it removes are not returned to the allocator immediately; they are counted in free instead. The next sdscat call can reuse that space, which avoids another memory reallocation.

      SDS also provides a function that releases the unused space of an SDS when the memory is really needed back, so lazy release does not have to mean wasted memory.

      Together, space pre-allocation and lazy space release avoid frequent memory reallocation.
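Lazy release can be sketched as follows. This is a simplification under stated assumptions: demo_sds_truncate only cuts the string at a given length (the real sdstrim removes a set of characters from both ends), and demo_sds_release_free is a hypothetical stand-in for the function that hands unused space back.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;
    int free;
    char buf[];
};

/* Hypothetical constructor with no spare space. */
static struct sdshdr *demo_sds_new(const char *init)
{
    size_t len = strlen(init);
    struct sdshdr *sh = malloc(sizeof(*sh) + len + 1);
    if (sh == NULL) return NULL;
    sh->len = (int)len;
    sh->free = 0;
    memcpy(sh->buf, init, len + 1);
    return sh;
}

/* Lazy release: shorten the string without calling the allocator.
 * The removed bytes are simply moved from len to free. */
static void demo_sds_truncate(struct sdshdr *sh, int new_len)
{
    assert(new_len >= 0 && new_len <= sh->len);
    sh->free += sh->len - new_len;
    sh->len = new_len;
    sh->buf[new_len] = '\0';
}

/* Explicit release: shrink the allocation so that free becomes 0.
 * If realloc fails, the SDS is returned unchanged. */
static struct sdshdr *demo_sds_release_free(struct sdshdr *sh)
{
    struct sdshdr *smaller = realloc(sh, sizeof(*sh) + (size_t)sh->len + 1);
    if (smaller == NULL) return sh;
    smaller->free = 0;
    return smaller;
}
```

Truncating "Redis Cluster" to 5 bytes leaves len at 5 and free at 8 with no allocator call; a later append of up to 8 bytes needs no reallocation either.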

4: Binary security

The characters in a C string must conform to some encoding (such as ASCII), and the string cannot contain a null character anywhere except at its end; otherwise the first null character is taken to be the end of the string. These restrictions mean that a C string can only hold text. It cannot hold binary data such as pictures, audio, video, or compressed files.

SDS treats the contents of buf as raw bytes and uses the value of the len attribute, not a null character, to decide where the string ends. This lets Redis store not only text but also binary data in any format.
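The following sketch illustrates the point with data that contains a null character in the middle. demo_sds_new_len is a hypothetical constructor that takes an explicit byte count instead of relying on strlen.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;
    int free;
    char buf[];
};

/* Hypothetical constructor: stores exactly n bytes, whatever they
 * contain, and records n in len. A '\0' is still appended after them. */
static struct sdshdr *demo_sds_new_len(const void *data, int n)
{
    assert(data != NULL && n >= 0);
    struct sdshdr *sh = malloc(sizeof(*sh) + (size_t)n + 1);
    if (sh == NULL) return NULL;
    memcpy(sh->buf, data, (size_t)n);
    sh->buf[n] = '\0';
    sh->len = n;
    sh->free = 0;
    return sh;
}
```

For the 7 bytes 'a', 'b', 'c', '\0', 'd', 'e', 'f', strlen(sh->buf) stops at the embedded null and reports 3, while sh->len is 7 and all 7 bytes remain intact in buf.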

5: Compatible with some C string functions

An SDS always ends its data with a null character, so it can reuse some of the existing C string functions, such as strcasecmp() (declared in <strings.h> on POSIX systems), which avoids unnecessary code duplication.
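As a small sketch of this compatibility, buf can be handed directly to a C library function. The helper names are hypothetical, and the example assumes a POSIX system, where strcasecmp is declared in <strings.h>.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h> /* strcasecmp (POSIX) */

struct sdshdr {
    int len;
    int free;
    char buf[];
};

/* Hypothetical constructor with no spare space. */
static struct sdshdr *demo_sds_new(const char *init)
{
    size_t len = strlen(init);
    struct sdshdr *sh = malloc(sizeof(*sh) + len + 1);
    if (sh == NULL) return NULL;
    sh->len = (int)len;
    sh->free = 0;
    memcpy(sh->buf, init, len + 1);
    return sh;
}

/* Case-insensitive comparison done by the C library itself: buf is
 * null-terminated, so no SDS-specific comparison code is needed.
 * Returns 1 when equal, 0 otherwise. */
static int demo_sds_equals_ignore_case(const struct sdshdr *sh, const char *s)
{
    return strcasecmp(sh->buf, s) == 0;
}
```

This only makes sense for an SDS that holds text. If the data contains embedded null characters, C library functions stop at the first one.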

 

Origin blog.csdn.net/qq_37469055/article/details/114411035