Redis string commands and SDS source code analysis

An interviewer at Tianmei told me bluntly that I had no technical depth. It is true that I had rarely read source code, so I decided to read the Redis source before graduating.

string commands

GET, SET, MSET, MGET

For SET, the NX option means the key is set only if it does not already exist, and XX means the key is set only if it already exists.
MSETNX is all-or-nothing: if any one of the given keys already exists, none of them is set.
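The NX, XX and MSETNX rules can be modelled in a few lines. This is only a sketch, not Redis code: `Store`, `SetMode`, `set_with` and `msetnx` are hypothetical names, and a `std::unordered_map` stands in for the keyspace.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Redis keyspace.
using Store = std::unordered_map<std::string, std::string>;

enum class SetMode { Always, NX, XX };

// Returns true if the value was written.
bool set_with(Store& db, const std::string& key, const std::string& value, SetMode mode) {
    bool exists = db.count(key) != 0;
    if (mode == SetMode::NX && exists) return false;   // NX: only create
    if (mode == SetMode::XX && !exists) return false;  // XX: only update
    db[key] = value;
    return true;
}

// All-or-nothing: if any key already exists, no key is written.
bool msetnx(Store& db, const std::vector<std::pair<std::string, std::string>>& kvs) {
    assert(!kvs.empty());  // the command needs at least one key/value pair
    for (const auto& kv : kvs)
        if (db.count(kv.first) != 0) return false;
    for (const auto& kv : kvs)
        db[kv.first] = kv.second;
    return true;
}
```

In this model, calling `msetnx` with one existing key and one new key writes neither of them.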

STRLEN returns the length of the string; its time complexity is O(1).

GETRANGE returns the substring for a given range and supports both positive and negative indexes (-1 is the last character).
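How negative indexes map onto the string can be sketched as follows. `getrange` here is a hypothetical helper that models the index handling on a `std::string`; it is an illustration, not the Redis implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Model of GETRANGE index handling (both ends are inclusive).
std::string getrange(const std::string& s, long long start, long long end) {
    long long len = static_cast<long long>(s.size());
    if (start < 0 && end < 0 && start > end) return "";
    if (start < 0) start += len;   // -1 is the last byte
    if (end < 0) end += len;
    if (start < 0) start = 0;      // clamp to the string
    if (end < 0) end = 0;
    if (end >= len) end = len - 1;
    if (len == 0 || start > end) return "";
    return s.substr(static_cast<std::size_t>(start),
                    static_cast<std::size_t>(end - start + 1));
}
```

With the string "This is a string", the range 0..3 gives "This" and the range -3..-1 gives "ing".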

SETRANGE overwrites part of the string starting at a given offset. If the existing string is shorter than the offset, the gap is padded with zero bytes (\0).
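The zero padding is easy to see in a small model. `setrange` below is a hypothetical helper working on a `std::string`, written only to illustrate the behaviour described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Model of SETRANGE: overwrite from offset, padding with zero bytes if needed.
std::string setrange(std::string s, std::size_t offset, const std::string& value) {
    if (value.empty()) return s;                 // nothing to write, string unchanged
    if (s.size() < offset + value.size())
        s.resize(offset + value.size(), '\0');   // the gap is filled with \0
    s.replace(offset, value.size(), value);
    return s;
}
```

Writing "Redis" at offset 6 of an empty string produces an 11-byte value: six zero bytes followed by "Redis".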

APPEND appends new content to the end of the string.

INCRBY, DECRBY, INCR, DECR, INCRBYFLOAT: the first four operate on integers and the last one on floating-point numbers. There is no DECRBYFLOAT; to subtract, pass a negative number, for example INCRBYFLOAT key -3.14. Like APPEND, these commands create the key automatically when it does not exist.

sds design

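SDS strings are binary safe: a plain char* string ends at the first \0, while an SDS can store \0 in its content because the length is recorded in the header. SDS also records the allocated space, reduces the number of memory reallocations by preallocating extra room, releases unused space lazily, and so on.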
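The difference between "length found by scanning" and "length stored next to the data" can be shown with a tiny sketch. `LenString`, `make_len_string` and `c_length` are made-up names for this illustration and are not part of sds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Length as a plain C string sees it: scanning stops at the first '\0'.
std::size_t c_length(const char* p) {
    assert(p != nullptr);
    return std::strlen(p);
}

// Length-prefixed view in the spirit of SDS: the length is stored, not searched for,
// so reading it is O(1) and embedded '\0' bytes do not cut the string short.
struct LenString {
    std::size_t len;
    const char* buf;
};

LenString make_len_string(const char* data, std::size_t len) {
    return LenString{len, data};
}
```

For the five bytes `a b \0 c d`, `c_length` reports 2 while the stored length stays 5.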

source code

My environment:

CentOS 8
g++ 11
Redis 6.2.1

__attribute__ ((__packed__))

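What is this? It tells the compiler not to insert alignment padding between the members of a struct, which compresses its memory footprint. Redis uses it on the sdshdr structs so that the header fields sit directly in front of the string with no gaps. This is separate from what malloc does: each malloc allocation is typically rounded up to a multiple of 8 bytes (on a 64-bit system), so the block you get is usually larger than what you asked for. I also learned a new function here, malloc_usable_size, which reports the number of bytes actually allocated. One more thing to remember: byte order (big-endian or little-endian) only applies to multi-byte values such as integers. A char* string is stored one byte at a time in the order written, so endianness does not affect it. Once I had the byte order straight, the output I had seen earlier made sense and the problem was solved.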
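A quick way to see the effect of the attribute is to compare sizeof for the same members with and without it. This is a minimal sketch assuming GCC or Clang (the attribute is a compiler extension); `Plain`, `Packed` and the helper functions are names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>

// Default layout: the compiler pads after `c` so that `v` is aligned.
struct Plain {
    char c;
    int v;
};

// GCC/Clang extension: members are laid out back to back, no padding.
struct __attribute__((__packed__)) Packed {
    char c;
    int v;
};

std::size_t plain_size() { return sizeof(Plain); }
std::size_t packed_size() { return sizeof(Packed); }
std::size_t packed_offset_of_v() { return offsetof(Packed, v); }
```

On a typical 64-bit Linux build with a 4-byte int, `Plain` takes 8 bytes and `Packed` takes 5.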

pointer magic

Pointers have many uses, but I did not expect that they could be used like this. This technique must be understood before reading the source code below.

#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <malloc.h>   // malloc_usable_size (glibc)

using namespace std;
using byte_pointer = unsigned char*;

// Print len bytes starting at start, in hex.
void show_bytes(byte_pointer start, size_t len) {
    for (size_t i = 0; i < len; ++i) {
        printf("%.2x ", start[i]);
    }
    cout << endl;
}

int main()
{
    const char* msg = "hello dxgzg";
    size_t msglen = strlen(msg);                 // 11
    size_t total = sizeof(size_t) + msglen + 1;  // number + text + '\0'

    // Request everything that will be written. malloc(1) only appeared to
    // work because the allocator rounds the request up.
    void* ptr = malloc(total);
    if (ptr == NULL) return 1;
    memset(ptr, 0, total);
    printf("%zu\n", malloc_usable_size(ptr));    // size actually handed out

    size_t size = 5;
    *((size_t*)ptr) = size;                      // the number sits at the front
    cout << *((size_t*)ptr) << endl;
    // Arithmetic on void* is a GNU extension, so cast to char* first.
    memcpy((char*)ptr + sizeof(size_t), msg, msglen + 1);
    cout << (char*)ptr + sizeof(size_t) << endl;
    cout << *((size_t*)ptr) << endl;
    show_bytes((byte_pointer)ptr, sizeof(size_t) + msglen);

    free(ptr);
    return 0;
}

In the output you can see that the size_t value is stored in little-endian order (least significant byte first), while the bytes of the string appear in the order they were written.
This is an example of combining numbers and strings.
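The byte-order observation can also be checked without reading a hex dump. The sketch below copies the raw bytes of a value and inspects them; `raw_bytes` and `is_little_endian` are helper names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy the raw bytes of a trivially copyable object, lowest address first.
template <typename T>
std::vector<unsigned char> raw_bytes(const T& value) {
    std::vector<unsigned char> out(sizeof(T));
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
}

// True when the least significant byte sits at the lowest address.
bool is_little_endian() {
    std::uint32_t probe = 1;
    return raw_bytes(probe)[0] == 1;
}
```

An integer such as 0x01020304 comes back as 04 03 02 01 on a little-endian machine, while a char array comes back in exactly the order it was written on any machine.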

The next example combines a struct object with a string, using sizeof to obtain the offset.

#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cstring>

using namespace std;
using byte_pointer = unsigned char*;

// Print len bytes starting at start, in hex.
void show_bytes(byte_pointer start, size_t len) {
    for (size_t i = 0; i < len; ++i) {
        printf("%.2x ", start[i]);
    }
    cout << endl;
}

// packed: no padding between i and val, so sizeof(test) is 5 rather than 8
// (with a 4-byte int).
struct __attribute__((__packed__)) test
{
    char i = 'a';
    int val = 0;
};

int main()
{
    const char* msg = "hello world";
    size_t msglen = strlen(msg);                // 11
    size_t total = sizeof(test) + msglen + 1;   // header + text + '\0'

    void* ptr = malloc(total);
    if (ptr == NULL) return 1;
    memset(ptr, 0, total);

    test* t = (test*)ptr;
    t->i = 'd';
    t->val = 10;

    // The string starts right after the header.
    memcpy((char*)ptr + sizeof(test), msg, msglen + 1);

    // Dump the buffer itself. Passing &ptr would dump the pointer variable.
    show_bytes((byte_pointer)ptr, sizeof(test) + msglen);

    test* t2 = (test*)ptr;
    cout << t2->i << ' ' << t2->val << endl;
    cout << (char*)ptr + sizeof(test) << endl;

    free(ptr);
    return 0;
}

sdsnew function

sdsnew ends up calling this function:

sds sdsnewlen(const void *init, size_t initlen) {
    return _sdsnewlen(init, initlen, 0);
}

memory layout of sds

Both sh and s are pointers: sh points to the start of the header, and s points to the string data that follows it.


meaning of the variables

The following is _sdsnewlen; let's analyze it. type records which sdshdr type is used, and hdrlen is the size of that sdshdr type.

sh points to a newly allocated block (of size hdrlen+initlen+1+PREFIX_SIZE), and the first hdrlen bytes store the members of sdshdrX (X is 5, 8, 16, 32 or 64).

The flags byte in the header lets us quickly find the address of sh. We always work with the address s, and the flags byte is stored at s - 1. Knowing s, and knowing from the flags how many bytes to subtract, we get the position of sh.
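The trick of stepping back from s to the header can be sketched with a single, simplified header type. Everything here is hypothetical (`MiniHdr`, `MINI_TYPE`, `mini_new`, `mini_hdr`, `mini_len`, `mini_avail`, `mini_free`); real sds has five header types and picks the size to subtract from the flags byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical single-header layout: [len][alloc][flags][chars...]['\0']
struct __attribute__((__packed__)) MiniHdr {
    std::uint8_t len;      // bytes in use
    std::uint8_t alloc;    // bytes available for text, excluding header and '\0'
    unsigned char flags;   // last header byte, so it always sits at s[-1]
};

const unsigned char MINI_TYPE = 1;  // made-up type tag for this sketch

// Allocate header + text + terminator and return a pointer to the text.
char* mini_new(const char* init, std::size_t len) {
    assert(len <= 255);  // one-byte length field
    void* sh = std::malloc(sizeof(MiniHdr) + len + 1);
    if (sh == nullptr) return nullptr;
    MiniHdr* hdr = static_cast<MiniHdr*>(sh);
    hdr->len = static_cast<std::uint8_t>(len);
    hdr->alloc = static_cast<std::uint8_t>(len);
    hdr->flags = MINI_TYPE;
    char* s = static_cast<char*>(sh) + sizeof(MiniHdr);
    if (len != 0) std::memcpy(s, init, len);
    s[len] = '\0';
    return s;
}

// Step back from the text pointer to the header that precedes it.
MiniHdr* mini_hdr(char* s) {
    assert(static_cast<unsigned char>(s[-1]) == MINI_TYPE);  // flags live at s[-1]
    return reinterpret_cast<MiniHdr*>(s - sizeof(MiniHdr));
}

std::size_t mini_len(char* s) { return mini_hdr(s)->len; }
std::size_t mini_avail(char* s) { return mini_hdr(s)->alloc - mini_hdr(s)->len; }

// free() needs the pointer malloc returned, which is the header, not s.
void mini_free(char* s) {
    if (s != nullptr) std::free(mini_hdr(s));
}
```

Callers only ever hold s, which can be passed to ordinary C string functions, while the length, the free space and the pointer to free are all recovered from the bytes in front of it.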

The size actually allocated is stored in usable, obtained through the malloc_usable_size function mentioned above. s_malloc_usable eventually calls ztrymalloc_usable, which depends on a macro called HAVE_MALLOC_SIZE. This macro indicates whether the allocator on the current system can report the size of an allocation, as malloc_usable_size does. If it cannot, Redis records the allocated size itself, in PREFIX_SIZE extra bytes in front of the block. If HAVE_MALLOC_SIZE is defined, PREFIX_SIZE is 0.

sds _sdsnewlen(const void *init, size_t initlen, int trymalloc) {
    void *sh;
    sds s;
    char type = sdsReqType(initlen);
    /* Empty strings are usually created in order to append. Use type 8
     * since type 5 is not good at this. */
    if (type == SDS_TYPE_5 && initlen == 0) type = SDS_TYPE_8;
    int hdrlen = sdsHdrSize(type); // size of the header struct
    unsigned char *fp; /* flags pointer. */
    size_t usable; // size actually allocated by malloc

    assert(initlen + hdrlen + 1 > initlen); /* Catch size_t overflow */
    sh = trymalloc?
        s_trymalloc_usable(hdrlen+initlen+1, &usable) :
        s_malloc_usable(hdrlen+initlen+1, &usable); // +1 is for the \0; PREFIX_SIZE extra bytes are allocated as well
    if (sh == NULL) return NULL;
    if (init==SDS_NOINIT)
        init = NULL;
    else if (!init)
        memset(sh, 0, hdrlen+initlen+1);
    s = (char*)sh+hdrlen;
    fp = ((unsigned char*)s)-1;
    usable = usable-hdrlen-1;
    if (usable > sdsTypeMaxSize(type))
        usable = sdsTypeMaxSize(type);
    switch(type) {
        case SDS_TYPE_5: {
            *fp = type | (initlen << SDS_TYPE_BITS);
            break;
        }
        case SDS_TYPE_8: {
            SDS_HDR_VAR(8,s);
            // equivalent to inserting this code:
            // struct sdshdr8 *sh = (void*)((s)-(sizeof(struct sdshdr8)));

            sh->len = initlen;
            sh->alloc = usable;
            *fp = type;
            break;
        }
        .............
    }
    if (initlen && init)
        memcpy(s, init, initlen);
    s[initlen] = '\0';
    return s;
}
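Two details of the function above are worth isolating: how the header type is chosen from the length, and how type 5 packs its length into the flags byte. The sketch below models both. The names (`HdrType`, `req_type`, `type5_flags`, `type5_len`, `type_of`) are my own; the numeric values are meant to mirror the SDS_TYPE_* constants, SDS_TYPE_BITS and SDS_TYPE_MASK in sds.h, and a 64-bit build is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Values mirror the SDS_TYPE_* constants in sds.h.
enum HdrType : unsigned char { TYPE_5 = 0, TYPE_8 = 1, TYPE_16 = 2, TYPE_32 = 3, TYPE_64 = 4 };

const int TYPE_BITS = 3;            // the low 3 bits of flags hold the type
const unsigned char TYPE_MASK = 7;

// Smallest header whose length field can hold string_size (64-bit build assumed).
HdrType req_type(std::uint64_t string_size) {
    if (string_size < (1ull << 5)) return TYPE_5;
    if (string_size < (1ull << 8)) return TYPE_8;
    if (string_size < (1ull << 16)) return TYPE_16;
    if (string_size < (1ull << 32)) return TYPE_32;
    return TYPE_64;
}

// Type 5 has no len/alloc fields: the length lives in the upper 5 bits of flags.
unsigned char type5_flags(std::size_t len) {
    assert(len < 32);
    return static_cast<unsigned char>(TYPE_5 | (len << TYPE_BITS));
}

std::size_t type5_len(unsigned char flags) { return flags >> TYPE_BITS; }
HdrType type_of(unsigned char flags) { return static_cast<HdrType>(flags & TYPE_MASK); }
```

A length of 0 also maps to type 5 here; as the source shows, _sdsnewlen then promotes an empty string to type 8 because it is likely to be appended to.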

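sdsfree function

Freeing works on the same principle as the function above: step back from s by the header size to recover the pointer that was originally allocated, then free it.

sdsavail function

Returns the space that is allocated but not yet used. This is the usual sds operation: find the flags through s - 1, then locate the corresponding sh. sh->alloc is the allocated size and sh->len is the used size, so subtracting one from the other gives the remaining free space. Type 5 has no alloc field, so it always returns 0.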

static inline size_t sdsavail(const sds s) {
    unsigned char flags = s[-1];
    switch(flags&SDS_TYPE_MASK) {
        case SDS_TYPE_5: {
            return 0;
        }
        case SDS_TYPE_8: {
            SDS_HDR_VAR(8,s);
            return sh->alloc - sh->len;
        }
        ........
    }
    return 0;
}

sdscat function

Appends a string to the end of an sds; this is what the Redis APPEND command uses. sdscat calls sdscatlen, which first calls sdsMakeRoomFor to decide whether the buffer must grow, then uses memcpy to copy the new bytes, and finally calls sdssetlen to update the length.

sds sdscatlen(sds s, const void *t, size_t len) {
    size_t curlen = sdslen(s);

    s = sdsMakeRoomFor(s,len);
    if (s == NULL) return NULL;
    memcpy(s+curlen, t, len);
    sdssetlen(s, curlen+len);
    s[curlen+len] = '\0';
    return s;
}

sdsMakeRoomFor

The principle of sdsMakeRoomFor: if the required new length (len + addlen) is less than 1MB (SDS_MAX_PREALLOC), twice that length is allocated; otherwise the new length plus an extra 1MB is allocated.
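The growth rule, and why it cuts down on reallocations, can be sketched in isolation. `grown_len` and `count_growths` are hypothetical helpers; the simulation makes the simplifying assumption that the allocator hands back exactly the requested size, whereas Redis stores the usable size, which may be larger.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t MAX_PREALLOC = 1024 * 1024;  // same value as SDS_MAX_PREALLOC (1MB)

// Capacity requested when a string of length len must grow by addlen bytes.
std::size_t grown_len(std::size_t len, std::size_t addlen) {
    std::size_t newlen = len + addlen;
    assert(newlen >= len);  // catch size_t overflow
    if (newlen < MAX_PREALLOC)
        newlen *= 2;
    else
        newlen += MAX_PREALLOC;
    return newlen;
}

// Simulate appending one byte n times and count how often the buffer grows.
// Simplification: alloc is exactly the requested size.
std::size_t count_growths(std::size_t n) {
    std::size_t len = 0, alloc = 0, growths = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (alloc - len < 1) {          // no free space left for one more byte
            alloc = grown_len(len, 1);
            ++growths;
        }
        ++len;
    }
    return growths;
}
```

Under these assumptions, 1000 one-byte appends trigger 9 growths rather than 1000, which is the point of preallocating.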

sds sdsMakeRoomFor(sds s, size_t addlen) {
    void *sh, *newsh;
    size_t avail = sdsavail(s);
    size_t len, newlen;
    char type, oldtype = s[-1] & SDS_TYPE_MASK; // type is only declared here
    int hdrlen;
    size_t usable;

    /* Return ASAP if there is enough space left. */
    if (avail >= addlen) return s;

    len = sdslen(s);
    sh = (char*)s-sdsHdrSize(oldtype);
    newlen = (len+addlen);
    assert(newlen > len);   /* Catch size_t overflow */
    if (newlen < SDS_MAX_PREALLOC)
        newlen *= 2;
    else
        newlen += SDS_MAX_PREALLOC;

    type = sdsReqType(newlen);

    /* Don't use type 5: the user is appending to the string and type 5 is
     * not able to remember empty space, so sdsMakeRoomFor() must be called
     * at every appending operation. */
    if (type == SDS_TYPE_5) type = SDS_TYPE_8;

    hdrlen = sdsHdrSize(type);
    assert(hdrlen + newlen + 1 > len);  /* Catch size_t overflow */
    if (oldtype==type) {
        newsh = s_realloc_usable(sh, hdrlen+newlen+1, &usable);
        if (newsh == NULL) return NULL;
        s = (char*)newsh+hdrlen;
    } else {
        /* Since the header size changes, need to move the string forward,
         * and can't use realloc */
        newsh = s_malloc_usable(hdrlen+newlen+1, &usable);
        if (newsh == NULL) return NULL;
        memcpy((char*)newsh+hdrlen, s, len+1);
        s_free(sh);
        s = (char*)newsh+hdrlen;
        s[-1] = type;
        sdssetlen(s, len);
    }
    usable = usable-hdrlen-1;
    if (usable > sdsTypeMaxSize(type))
        usable = sdsTypeMaxSize(type);
    sdssetalloc(s, usable);
    return s;
}

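When expansion is required and the new type would still be type 5, it is promoted directly to type 8, because type 5 cannot record free space and every later append would have to reallocate. If the type stays the same, realloc is enough. If the type changes, hdrlen changes with the struct, so realloc cannot be used: a new block is allocated with malloc, the string is copied over, and the old block is freed. Finally, alloc is set on the new header (and len as well when the header was rebuilt).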

I will update this post as I read further.

Origin blog.csdn.net/dxgzg/article/details/121612097