Cao Gong said Redis source code (3): complete analysis of the Redis server startup process (middle part)

Article navigation

The Redis source code series is meant to help us understand Redis better, but reading about it is not enough. I recommend following these articles to set up the environment and then reading the source code yourself, or reading along with me. I last used C several years ago, so some errors are inevitable; I hope readers will point them out.

Cao Gong said Redis source code (1): building a Redis debug environment with CLion, with the same experience as debugging Java

Cao Gong said Redis source code (2): analysis of the Redis server startup process, plus a short refresher on C basics

Topic of this lecture

First, I will add a bit of background on pointers in C. Then I will pick up where yesterday's article on the Redis startup process left off, working from the big picture down to the details so that we do not get lost in them too early.

Understanding of pointers

A pointer holds a memory address. How the bytes at that address are interpreted is up to you, as long as you know what is stored at and after that address. Here is an example:

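#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Test_Struct{
    int a;
    int b;
}Test_Struct;

int main() {
    // 1: allocate 4 bytes
    void *pVoid = malloc(4);
    if (pVoid == NULL) return 1;
    // 2: set every byte to 0x01
    memset(pVoid,0x01,4);

    // 3
    int *pInt = pVoid;
    // 4
    char *pChar = pVoid;
    // 5
    short *pShort = pVoid;
    // 6: only the first field, a, lies inside the 4 allocated bytes
    Test_Struct *pTestStruct = pVoid;

    // 7: %p expects a void *, so each pointer is cast
    printf("address:%p, point to %d\n", (void *)pChar, *pChar);
    printf("address:%p, point to %d\n", (void *)pShort, *pShort);
    printf("address:%p, point to %d\n", (void *)pInt, *pInt);
    printf("address:%p, point to %d\n", (void *)pTestStruct, pTestStruct->a);

    free(pVoid);
    return 0;
}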
  • At 1, we allocate a block of memory, 4 bytes (32 bits). malloc returns a pointer to this block, or more precisely to its first byte. Because the allocated memory is contiguous, you can think of it as an array.

    The malloc() function allocates size bytes and returns a pointer to the allocated memory.

  • At 2, we call memset to set the 4 bytes pointed to by pVoid to 0x01; that is, each byte becomes 00000001.

    The memset man page reads:

    NAME
           memset - fill memory with a constant byte
    
    SYNOPSIS
           #include <string.h>
    
           void *memset(void *s, int c, size_t n);
    
    DESCRIPTION
           The memset() function fills the first n bytes of the memory area pointed to by s with the constant byte c.
    

    Reference materials: https://www.cnblogs.com/yhlboke-1992/p/9292877.html

    Here we set each byte to 0x01, so the four bytes in binary are: 00000001 00000001 00000001 00000001

  • At 3, we define a pointer of type int and assign pVoid to it; an int takes 4 bytes here.

  • At 4, we define a pointer of type char and assign pVoid to it; a char takes 1 byte.

  • At 5, we define a pointer of type short and assign pVoid to it; a short takes 2 bytes here.

  • At 6, we define a pointer of type Test_Struct. This is a struct, similar to a class in a higher-level language. It is defined as follows:

    typedef struct Test_Struct{
        int a;
        int b;
    }Test_Struct;
    

    Similarly, we assign pVoid to it.

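  • At 7, we print the address held by each pointer and the value obtained by dereferencing it.

The output shows the same address on all four lines; the values printed are 1 (char), 257 (short), 16843009 (int), and 16843009 (the struct's a field).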

The binary of 257 is: 0000 0001 0000 0001

The binary of 16843009 is: 0000 0001 0000 0001 0000 0001 0000 0001

The struct case is also easy to understand: its first field, a, is an int and occupies 4 bytes.

Also note that the pointer addresses printed above are all identical.

If you can follow this demo, the following link should deepen your understanding of pointers:

Arithmetic operation of C pointer

Redis server approximate startup process

int main(int argc, char **argv) {
    struct timeval tv;

    /**
     * 1 Set the time zone and so on
     */
    setlocale(LC_COLLATE,"");
    ...

    // 2 Check whether the server is starting in Sentinel mode
    server.sentinel_mode = checkForSentinelMode(argc,argv);

    // 3 Initialize the server configuration
    initServerConfig();

    // 4
    if (server.sentinel_mode) {
        initSentinelConfig();
        initSentinel();
    }

    // 5 Check whether the user specified a configuration file or configuration options
    if (argc >= 2) {
        ...
        // Load the configuration file; options holds the options parsed above
        loadServerConfig(configfile,options);
        sdsfree(options);
    }

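    // 6 Run the server as a daemon
    if (server.daemonize) daemonize();

    // 7 Create and initialize the server data structures
    initServer();

    // 8 If the server is a daemon, create the PID file
    if (server.daemonize) createPidFile();

    // 9 Set the name of the server process
    redisSetProcTitle(argv[0]);

    // 10 Print the ASCII logo
    redisAsciiArt();

    // 11 If the server is not running in Sentinel mode, run the following code
    if (!server.sentinel_mode) {
        // Load data from the AOF file or the RDB file
        loadDataFromDisk();
        // Start the cluster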
        if (server.cluster_enabled) {
            if (verifyClusterConfigWithData() == REDIS_ERR) {
                redisLog(REDIS_WARNING,
                    "You can't have keys in a DB different than DB 0 when in "
                    "Cluster mode. Exiting.");
                exit(1);
            }
        }
        // Log the TCP port
        if (server.ipfd_count > 0)
            redisLog(REDIS_NOTICE,"The server is now ready to accept connections on port %d", server.port);
    } else {
        sentinelIsRunning();
    }

    // 12 Run the event loop until the server shuts down
    aeSetBeforeSleepProc(server.el,beforeSleep);
    aeMain(server.el);

    // 13 The server is shutting down; stop the event loop
    aeDeleteEventLoop(server.el);

    return 0;
}
  • 1, 2, and 3 were covered in the previous article. They mainly initialize the configuration parameters: socket-related settings; the settings that appear in redis.conf (AOF, RDB, replication, Sentinel, and so on); and fields of the server data structure such as the run ID, the configuration file path, server information (32-bit or 64-bit; because Redis runs directly on the operating system rather than on a virtual machine as some high-level languages do, type sizes differ between 32-bit and 64-bit builds), the log level, the maximum number of clients, the maximum client idle time, and so on.

  • At 4, because Sentinel and an ordinary Redis server share the same code, startup has to check which of the two is being started. If it is Sentinel, the Sentinel-specific configuration is applied.

  • At 5, check whether a configuration file was specified on the command line at startup; if so, the settings in that file take precedence.

  • At 6, run as a daemon.

  • At 7, initialize the Redis server according to the configuration gathered so far.

  • At 8, create the PID file. The usual default path is /var/run/redis.pid; it can be configured in redis.conf, for example:

    pidfile "/var/run/redis_6379.pid"

  • At 9, set the name of the server process.

  • At 10, print the logo.

  • At 11, if the server is not in Sentinel mode, load the AOF or RDB file.

  • At 12, enter an endless loop that waits for connections and processes client requests; it also runs background tasks periodically, such as deleting expired keys.

  • At 13, the server shuts down. Normally execution never gets here because it stays in the loop at 12; only in certain scenarios, after the event loop's stop flag is set to true, does the program leave the loop at 12 and reach this point.

The process of initializing the redis server

This section expands on step 7 above, initializing the Redis server. The function, initServer in redis.c, does a lot of things; they are explained in order below.

Set global signal processing function

    // Set up the signal handlers
    signal(SIGHUP, SIG_IGN);
    signal(SIGPIPE, SIG_IGN);
    setupSignalHandlers();

The most important is the last line:

void setupSignalHandlers(void) {
    // 1
    struct sigaction act;

    /* When the SA_SIGINFO flag is set in sa_flags then sa_sigaction is used.
     * Otherwise, sa_handler is used. */
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    // 2
    act.sa_handler = sigtermHandler;
    // 3
    sigaction(SIGTERM, &act, NULL);

    return;
}

At 3, we register act as the handler for the SIGTERM signal. act is a local variable defined at 1; its sa_handler field is assigned at 2, and that field is a function pointer. A function pointer is similar to a reference to a static method in Java. Why static? Because calling such a method does not require creating an object first. In C, all functions are top-level and can be called without an object, so from this point of view a C function pointer resembles a Java static method reference.

Let's look at 2:

    act.sa_handler = sigtermHandler;

sigtermHandler is an ordinary top-level function; here is how it is defined:

// Handler for the SIGTERM signal
static void sigtermHandler(int sig) {
    REDIS_NOTUSED(sig);

    redisLogFromHandler(REDIS_WARNING,"Received SIGTERM, scheduling shutdown...");
    
    // Set the shutdown flag
    server.shutdown_asap = 1;
}

This function sets the server's global shutdown_asap flag. The flag is used in the following place:

serverCron in redis.c
    
    /* We received a SIGTERM, shutting down here in a safe way, as it is
     * not ok doing so inside the signal handler. */
    // The server process received SIGTERM; shut the server down
    if (server.shutdown_asap) {

        // Try to shut down the server
        if (prepareForShutdown(0) == REDIS_OK) exit(0);

        // If the shutdown failed, log it and clear the shutdown flag
        redisLog(REDIS_WARNING,"SIGTERM received but errors trying to shut down the server, check the logs for more information");
        server.shutdown_asap = 0;
    }

The first line of the snippet above identifies where the code lives: the serverCron function in redis.c. This function is the Redis server's periodic task, similar to a task scheduled on a ScheduledThreadPoolExecutor in Java. When it detects that server.shutdown_asap is set, it shuts down the server.

That covers what happens after the signal is received. So what is a signal? A signal is a means of inter-process communication on Linux. For example, kill -9 sends the SIGKILL signal to the given pid. When Redis is running in the foreground and you press Ctrl+C, that also sends a signal: SIGINT, whose value is 2.

So which signal did we register a handler for earlier? SIGTERM, value 15. That is, the handler is triggered when we run kill -15 against the process.

For the difference between kill -9 and kill -15, see this blog post:

The difference between Linux kill -9 and kill -15

Open syslog

// Set up syslog
if (server.syslog_enabled) {
    openlog(server.syslog_ident, LOG_PID | LOG_NDELAY | LOG_NOWAIT,
        server.syslog_facility);
}

This sends logs to the Linux system's syslog. The man page describes openlog as follows:

send messages to the system logger

It does not seem to be used much; for more, see:

Redis's syslog log is not printed out of the exploration process

Initialize some properties of the current redisServer

    // Initialize and create the data structures
    server.current_client = NULL;
    // 1
    server.clients = listCreate();
    server.clients_to_close = listCreate();
    server.slaves = listCreate();
    server.monitors = listCreate();
    server.slaveseldb = -1; /* Force to emit the first SELECT command. */
    server.unblocked_clients = listCreate();
    server.ready_keys = listCreate();
    server.clients_waiting_acks = listCreate();
    server.get_ack_from_slaves = 0;
    server.clients_paused = 0;

There is not much to say here. Take server.clients as an example: server is a global variable that maintains the state of the running Redis server, and clients holds the clients currently connected to it. Its type is a linked list:

    // A linked list holding the state structures of all clients
    list *clients;              /* List of active clients */

So this code simply calls listCreate() to create an empty linked list and assigns it to clients.

Other attributes are similar.

Create a constant string pool for reuse

As we all know, a Redis response is often a short fixed string such as "+OK". Allocating a new string for every response would be wasteful, so Redis simply caches these commonly used strings.

void createSharedObjects(void) {
    int j;

    // Common replies
    shared.crlf = createObject(REDIS_STRING,sdsnew("\r\n"));
    shared.ok = createObject(REDIS_STRING,sdsnew("+OK\r\n"));
    shared.err = createObject(REDIS_STRING,sdsnew("-ERR\r\n"));
    ...
    // Common error replies
    shared.wrongtypeerr = createObject(REDIS_STRING,sdsnew(
        "-WRONGTYPE Operation against a key holding the wrong kind of value\r\n"));
    ...
}

This is like Java caching string literals, all in the name of performance. Java likewise caches Integer values from -128 to 127.

Adjust the maximum number of files that the process can open

In a real production environment, a server that has to handle high concurrency may have a very large number of clients holding TCP connections to a single server process. In that case you generally need to raise the maximum number of files the process can open (sockets are files too).

Before reading the Redis source code, the only way I knew to change the maximum number of files a process can open was ulimit. For details, see the following two links:

Linux maximum file handle number summary

Elasticsearch optimization

However, the source code uses another way:

  • The API for reading the current limit of a given resource:
#define RLIMIT_NOFILE	5		/* max number of open files */
    
struct rlimit {
	rlim_t	rlim_cur;
	rlim_t	rlim_max;
};
struct rlimit limit;

getrlimit(RLIMIT_NOFILE,&limit)

The code above reads the current limit for NOFILE (the maximum number of files the process can open).

You can read about it with man getrlimit (if the man pages are missing, install them first with yum install man-pages.noarch).

  • setrlimit sets the limit for a resource:

    limit.rlim_cur = f;
    limit.rlim_max = f;
    setrlimit(RLIMIT_NOFILE,&limit)
    

Create event loop related data structures

The structure of the event loop is as follows:

/* 
 * State of an event based program 
 *
 * State of the event handler
 */
typedef struct aeEventLoop {

    // Highest descriptor currently registered
    int maxfd;   /* highest file descriptor currently registered */

    // Maximum number of descriptors currently tracked
    int setsize; /* max number of file descriptors tracked */

    // Used to generate time event ids
    long long timeEventNextId;

    // Time at which time events were last processed
    time_t lastTime;     /* Used to detect system clock skew */

    // Registered file events
    aeFileEvent *events; /* Registered events */

    // File events that are ready
    aeFiredEvent *fired; /* Fired events */

    // Time events
    aeTimeEvent *timeEventHead;

    // On/off switch of the event loop
    int stop;

    // Private data of the multiplexing library
    void *apidata; /* This is used for polling API specific data */

    // Function to run before processing events
    aeBeforeSleepProc *beforesleep;

} aeEventLoop;

The code that initializes this structure is aeCreateEventLoop (defined in ae.c), which initServer in redis.c calls.

In the structure above, the main fields are:

  1. apidata mainly stores data specific to the multiplexing library. Each time the multiplexing library is polled, any I/O events found ready are stored in the fired field.

    For example, select is one of the older multiplexing implementations available on Linux. In Redis, the code is as follows:

    static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp) {
        ...
    	// 1
        retval = select(eventLoop->maxfd+1,
                    &state->_rfds,&state->_wfds,NULL,tvp);
        if (retval > 0) {
            for (j = 0; j <= eventLoop->maxfd; j++) {
                ...
                // 2
                eventLoop->fired[numevents].fd = j;
                eventLoop->fired[numevents].mask = mask;
                numevents++;
            }
        }
        return numevents;
    }
    

    Part of the code is omitted. At 1, select is called; this step is similar to the select operation of NIO in Java. At 2, the file descriptors that select reported as ready are filled into the fired field.

  2. In addition, we mentioned that Redis has background tasks such as cleaning up expired keys. That work is not done all at once: each periodic run of the background task cleans up a portion. These background tasks are the time events in the structure above.

        // Time events
        aeTimeEvent *timeEventHead;
    

Allocate memory space for 16 databases

server.db = zmalloc(sizeof(redisDb) * server.dbnum);

Open the listen port and listen for requests

    /* Open the TCP listening socket for the user commands. */
    // Open the TCP listening port to wait for client command requests
    listenToPort(server.port, server.ipfd, &server.ipfd_count)

This is where the usual port 6379 is opened.

Initialize the data structure corresponding to 16 databases

    /* Create the Redis databases, and initialize other internal state. */
    // Create and initialize the database structures
    for (j = 0; j < server.dbnum; j++) {
        server.db[j].dict = dictCreate(&dbDictType, NULL);
        server.db[j].expires = dictCreate(&keyptrDictType, NULL);
        server.db[j].blocking_keys = dictCreate(&keylistDictType, NULL);
        server.db[j].ready_keys = dictCreate(&setDictType, NULL);
        server.db[j].watched_keys = dictCreate(&keylistDictType, NULL);
        server.db[j].eviction_pool = evictionPoolAlloc();
        server.db[j].id = j;
        server.db[j].avg_ttl = 0;
    }

The data structure of db is as follows:

typedef struct redisDb {

    // The database keyspace, holding all key-value pairs in the database
    dict *dict;                 /* The keyspace for this DB */

    // Expiration times of keys: the dict key is the key, the dict value is the expiration time as a UNIX timestamp
    dict *expires;              /* Timeout of keys with a timeout set */

    // Keys that clients are currently blocked on
    dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP) */

    // Keys that can now be unblocked
    dict *ready_keys;           /* Blocked keys that received a PUSH */

    // Keys being watched by the WATCH command
    dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */

    struct evictionPoolEntry *eviction_pool;    /* Eviction pool of keys */

    // Database number
    int id;                     /* Database ID */

    // Average TTL of the keys in the database, for statistics
    long long avg_ttl;          /* Average TTL, just for stats */

} redisDb;

Here you can see that a key with an expiration time is stored in dict, and a record for it is also added to the expires dictionary.

In the expires dictionary, the key is a pointer to the key and the value is the expiration time.

Create and initialize pub / sub related data structures

    // Create the pub/sub related structures
    server.pubsub_channels = dictCreate(&keylistDictType, NULL);
    server.pubsub_patterns = listCreate();

Initialize some statistical properties

    // Counter of how many times serverCron() has run
    server.cronloops = 0;
    // ID of the child process performing BGSAVE
    server.rdb_child_pid = -1;
    // ID of the child process performing the AOF rewrite
    server.aof_child_pid = -1;
    aofRewriteBufferReset();
    // AOF buffer
    server.aof_buf = sdsempty();
    // Time of the last completed SAVE
    server.lastsave = time(NULL); /* At startup we consider the DB saved. */
    // Time of the last BGSAVE attempt
    server.lastbgsave_try = 0;    /* At startup we never tried to BGSAVE. */
    server.rdb_save_time_last = -1;
    server.rdb_save_time_start = -1;
    server.dirty = 0;
    resetServerStats();
    /* A few stats we don't want to reset: server startup time, and peak mem. */
    // Server start time
    server.stat_starttime = time(NULL);
    // Peak memory used
    server.stat_peak_memory = 0;
    server.resident_set_size = 0;
    // Status of the last SAVE
    server.lastbgsave_status = REDIS_OK;
    server.aof_last_write_status = REDIS_OK;
    server.aof_last_write_errno = 0;
    server.repl_good_slaves_count = 0;
    updateCachedTime();

Set the function pointer corresponding to the time event

    /* Create the serverCron() time event, that's our main way to process
     * background operations. */    
    // Create the time event for serverCron()
    if (aeCreateTimeEvent(server.el, 1, serverCron, NULL, NULL) == AE_ERR) {
        redisPanic("Can't create the serverCron time event.");
        exit(1);
    }

serverCron here is a function; from then on, it runs each time the time event fires in the event loop.

As the English comment shows, the author also notes that this is the main way background operations are processed.

It will be analyzed in a later article.

Set the connection handler corresponding to the connect event

aeCreateFileEvent(server.el, server.ipfd[j], AE_READABLE, acceptTcpHandler, NULL)

The acceptTcpHandler here is the function to handle the new connection:

void acceptTcpHandler(aeEventLoop *el, int fd, void *privdata, int mask) {
    int cport, cfd, max = MAX_ACCEPTS_PER_CALL;
    char cip[REDIS_IP_STR_LEN];
    REDIS_NOTUSED(el);
    REDIS_NOTUSED(mask);
    REDIS_NOTUSED(privdata);

    while (max--) {
        // Accept the client connection
        cfd = anetTcpAccept(server.neterr, fd, cip, sizeof(cip), &cport);
        if (cfd == ANET_ERR) {
            if (errno != EWOULDBLOCK)
                redisLog(REDIS_WARNING,
                         "Accepting client connection: %s", server.neterr);
            return;
        }
        // Create the client state (redisClient) for the client
        acceptCommonHandler(cfd, 0);
    }
}

Create aof file

If AOF is enabled, the AOF file needs to be created.

    if (server.aof_state == REDIS_AOF_ON) {
        server.aof_fd = open(server.aof_filename,
                             O_WRONLY | O_APPEND | O_CREAT, 0644);
    }

A few remaining tasks not covered for now

    // If the server was started in cluster mode, initialize the cluster
    if (server.cluster_enabled) clusterInit();

    // Initialize the script cache used by replication
    replicationScriptCacheInit();

    // Initialize the scripting system
    scriptingInit();

    // Initialize the slow log feature
    slowlogInit();

    // Initialize the BIO system
    bioInit();

We cannot explain these yet; just take a quick look for now.

At this point, initializing the redis server is basically over.

Summary

This lecture covers a lot, mainly because Redis has so much to do during startup. I hope I have explained it clearly. The part about the connection handler was only covered briefly; I will continue with it later. Thank you all.

Origin www.cnblogs.com/grey-wolf/p/12685918.html