Cao Gong Talks Redis Source Code (2): Redis server startup process analysis, plus a short refresher on basic C

Article navigation

The goal of this Redis source code series is to help us understand Redis better. Reading about it is not enough, though: I recommend following the articles to set up the environment and then stepping through the source code yourself, or reading along with me. It has been several years since I last used C, so some errors are inevitable; I hope readers will point them out.

Cao Gong Talks Redis Source Code (1): Building a Redis debug environment with CLion, for the same experience as debugging Java

Some supplementary knowledge

Project structure and entry point

Apart from the toy programs we write at university, a real project consists of a large number of source files. In Java, if one file uses functions, types, or variables from another file, it brings them in with import statements. C is similar: to use what other files provide, you use the #include directive.

For example, redis.c, the file that contains the main entry point of Redis, has the following include statements:

#include "redis.h"
#include "cluster.h"
#include "slowlog.h"
#include "bio.h"

#include <time.h>
#include <signal.h>

Headers in angle brackets, such as <time.h>, belong to the standard library or the system and are searched for in the system include paths; you can compare them to the classes that ship with the JDK. Headers in double quotes, such as "bio.h", are defined by the project itself.

For example, I found time.h at the following path on Linux:

[root@mini1 src]# locate time.h
/usr/include/time.h

For other related knowledge, please refer to:

https://www.runoob.com/cprogramming/c-header-files.html

My understanding of the header file

Generally speaking, we write our business logic as functions in .c files. Some functions are used only inside the file, similar to the private methods of a Java class; others need to be called from other source files. How can those be exposed?

Through the header file mechanism, which you can think of as the interface construct found in high-level languages. In Java, a class can simply mark a method public and other classes can then call it directly; in everyday business development, however, we generally do not access an implementation class directly but go through the interface it implements. A good implementation class should not make methods public unless they are defined in the interface.

Back to header files. Suppose there is a source file test.c as follows:


    
#include <sys/time.h>

/* Return the UNIX time in microseconds */
long long ustime(void) {
    struct timeval tv;
    long long ust;

    gettimeofday(&tv, NULL);
    ust = ((long long)tv.tv_sec)*1000000;
    ust += tv.tv_usec;
    return ust;
}

/* Return the UNIX time in milliseconds (1 second = 1,000 milliseconds) */
long long mstime(void) {
    return ustime()/1000;
}

This file defines two functions. Assuming we only want to expose mstime(void) to other files, the header file test.h should look like this:



long long mstime(void);

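In this case our other function, ustime, is not declared to the outside world. (Strictly speaking, another file could still reach it by writing its own declaration; marking ustime as static in test.c would hide it completely.)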

In short, you can think of the header file as the interface that the implementation wants to expose to the outside world. You may find the analogy odd: why call a .c file an implementation class? In fact, when I was at Huawei, we wrote C with a C++ mindset, that is, with object-oriented thinking.

I saw an article on the Internet that makes a similar point, quoted here (https://zhuanlan.zhihu.com/p/57882822):

Redis, by contrast, is written in pure C but incorporates object-oriented ideas. Contrary to the point of view above, it can be described as "design in C++ and code in C." Of course, the purpose of this article is not to provoke language disputes. Every language has its own pros and cons, and the language choice of an open source project mainly comes down to the personal experience and subjective wishes of its authors.

But the header file in the C language is different from the interface in the Java language. In Java, the interface and the implementation class are eventually compiled into independent class files.

In C, a preprocessing step runs before a .c file is compiled. Preprocessing replaces each #include directive with the content of the included header file. Take the example from the runoob tutorial:

header.h:

char *test (void);

The following program.c uses the test function, so it needs to include header.h:

int x;
#include "header.h"

int main (void)
{
   puts (test ());
}

After preprocessing (which is just a textual replacement), the result is as follows:

int x;
char *test (void);

int main (void)
{
   puts (test ());
}

We can use the following command to demonstrate this process:

[root@mini1 test]# gcc -E program.c 
int x;
# 1 "header.h" 1

char *test (void);
# 3 "program.c" 2

int main (void)
{
   puts (test ());
}

As you can see, the directive has been replaced by the header's content. What if we include the header twice?

[root@mini1 test]# gcc -E program.c 
int x;
# 1 "header.h" 1

char *test (void);
# 3 "program.c" 2
# 1 "header.h" 1

char *test (void);
# 4 "program.c" 2
int main (void)
{
   puts (test ());
}

The content of the header now appears twice. No error is reported here, because the header only contains a function declaration (a prototype) and C allows the same declaration to be repeated; the function is declared twice, not defined twice.

Why header files need #ifndef

When you read header files, you will find statements like the following, for example in redis.h:

#ifndef __REDIS_H
#define __REDIS_H

#include "fmacros.h"
#include "config.h"

...
    
typedef struct redisObject {

    // Type
    unsigned type:4;

    // Encoding
    unsigned encoding:4;

    // Time the object was last accessed
    unsigned lru:REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */

    // Reference count
    int refcount;

    // Pointer to the actual value
    void *ptr;

} robj;

...
    
#endif

As you can see, the header begins with:

#ifndef __REDIS_H
#define __REDIS_H

and ends with:

#endif

This solves the following problem:

When a header file is included more than once (directly, or indirectly through other headers), its content would be pasted in twice without this guard. With the guard, the first time the preprocessor reaches the header it finds that the macro __REDIS_H is not defined, defines it, and processes the content; when it reaches the second copy, it finds that __REDIS_H is already defined and skips everything up to #endif. This guarantees that the content of a header file is processed only once per source file, no matter how many times it is included. Note that all of this happens during preprocessing, not while the program is running.

Repeating a function declaration is harmless, as we saw above, but what if the header file contains a type definition like the following?

typedef char my_char;
char *test (void);

If you include this header file repeatedly, the type definition is repeated as well. Strangely, when I tried it on CentOS 7.3.1611 with gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), no error was reported. It seems my old C knowledge is rusty. (One likely explanation: C11 allows a typedef to be repeated as long as it names the same type, and GCC accepts this by default.)

I have not yet found a clear answer online about the concrete harm of repeated inclusion. I found two answers:

  1. Global variables are defined in the header file;
  2. Wasting compilation time

However, strictly speaking the first case should not arise, because companies generally prohibit defining variables in header files.

There is a question on Zhihu about this that you can take a look at: What are the hazards of repeated inclusion of header files?

Header file rules in Huawei's C language programming specification

You can search for it yourself: Huawei Technology Co., Ltd. C language programming specification

I only excerpt part of it here:

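Rule 1.6: Do not define variables in header files.
Explanation: A variable defined in a header file ends up defined more than once, because the header is included by several .c files.

Rule 1.7: Interfaces provided by other .c files may only be used by including their header files. Using external function interfaces or variables through extern declarations inside a .c file is prohibited.
Explanation: If a.c uses the function foo() defined in b.c, then b.h should declare extern int foo(int input); and a.c should use foo via #include <b.h>. Writing extern int foo(int input); directly in a.c is prohibited, because the declaration and the definition can easily become inconsistent when foo changes.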

Rule 1.7 matches our understanding: the header file is the external interface of the module it belongs to. In general, only the following content is allowed in it:

  • Type definition
  • Macro definition
  • Function declaration (not including implementation)
  • Variable declaration (not definition)

I want to expand on the last point. Since defining variables in header files is forbidden, our variables are defined in .c files. For example, redis.c defines a global variable:

/* Global vars */
struct redisServer server; /* server global state */

Such an important global variable, which holds essentially all the state of a redis-server instance, cannot possibly be used only inside redis.c. So that other files can use it, the redis.h header file contains the following declaration:

/*-----------------------------------------------------------------------------
 * Extern declarations
 *----------------------------------------------------------------------------*/

extern struct redisServer server;

About type definition

A struct is generally used to define a structure, similar to a class in a high-level language.

For example, strings in Redis are generally stored using the sds data structure, which is defined as follows:

struct sdshdr {

    // Number of bytes in use in buf
    int len;

    // Number of bytes still available in buf
    int free;

    // Data space
    char buf[];
};

In addition, C code uses typedef extensively to define type aliases.

You can refer to this tutorial for details:

https://www.runoob.com/cprogramming/c-typedef.html

About pointers

Basic knowledge: https://www.runoob.com/cprogramming/c-pointers.html

Here is my own understanding of pointers. A pointer holds a memory address. You can ignore the pointer's type at first; in fact, when we do not care about the type of the data at the address, we can declare the pointer as void *ptr.

Suppose the pointer points to address A. If we believe a char is stored there, we can cast the pointer from void * to char * and dereference it. Because a char occupies exactly one byte, dereferencing reads the single byte at that position and interprets it as a char.

If we cast the void * to int * instead, dereferencing reads 4 bytes starting at the pointer position (an int occupies 4 bytes on most current platforms) and interprets them as an integer.

In general, dereferencing starts from the pointer's type: an int * points to an int, so 4 bytes are read; if the pointer points to a structure, the compiler works out the size of the structure and reads that many bytes as a value of the structure type.

For this part, you can take a look at:

https://www.runoob.com/cprogramming/c-data-types.html

https://www.runoob.com/cprogramming/c-pointer-arithmetic.html

Initialization of configuration items during the redis startup process

The previous section ran long, so there is not enough room to cover the complete Redis startup process in this article. It will probably take two articles; this one covers the first part.

The startup entry point is the main function in redis.c. If you built the debugging environment from my code, you can start redis-server directly.

int main(int argc, char **argv) {
    struct timeval tv;

    /**
     * 1 Set the locale (collation order)
     */
    setlocale(LC_COLLATE,"");
    /**
     *2
     */
    zmalloc_enable_thread_safeness();
    // 3
    zmalloc_set_oom_handler(redisOutOfMemoryHandler);
    // 4
    srand(time(NULL)^getpid());
    // 5
    gettimeofday(&tv,NULL);
    // 6
    dictSetHashFunctionSeed(tv.tv_sec^tv.tv_usec^getpid());

    // Check whether the server is starting in Sentinel mode
    server.sentinel_mode = checkForSentinelMode(argc,argv);

    // 7 Initialize the server configuration
    initServerConfig();

  • Step 1 sets the locale (the LC_COLLATE category, which controls string collation order), not the time zone.

  • Step 2 switches the memory allocator's bookkeeping into thread-safe mode by setting a flag to 1; it does not set a number of threads.

  • Step 3 registers the function to call when an out-of-memory (OOM) condition occurs. The argument is a function pointer, similar to passing a method reference or lambda expression to a stream in Java 8; the function is called back when OOM happens. This resembles the template method or strategy pattern in Java.

  • Step 4 seeds the random number generator.

  • Step 5 obtains the current time and stores it in the variable tv.

    Note that the address of tv is passed in here. This is typical C usage, similar to passing an object reference in Java so that the called method can modify the object's fields.

  • Step 6 seeds the hash function.

  • Step 7 initializes the server configuration with its default values.

Here is step 7 in detail:

void initServerConfig() {
    int j;

    // Server state

    // Set the server's run ID
    getRandomHexChars(server.runid,REDIS_RUN_ID_SIZE);
    // Default configuration file path
    server.configfile = NULL;
    // Default server frequency (hz)
    server.hz = REDIS_DEFAULT_HZ;
    // Terminate the run ID string
    server.runid[REDIS_RUN_ID_SIZE] = '\0';
    // Architecture the server runs on (32 or 64 bit)
    server.arch_bits = (sizeof(long) == 8) ? 64 : 32;
    // Default server port
    server.port = REDIS_SERVERPORT;
    // Length of the TCP queue of fully established connections
    server.tcp_backlog = REDIS_TCP_BACKLOG;
    // Number of bound addresses
    server.bindaddr_count = 0;
    // UNIX socket path
    server.unixsocket = NULL;
    server.unixsocketperm = REDIS_DEFAULT_UNIX_SOCKET_PERM;
    // Bound TCP socket file descriptors
    server.ipfd_count = 0;
    server.sofd = -1;
    // Number of Redis databases available
    server.dbnum = REDIS_DEFAULT_DBNUM;
    // Log level
    server.verbosity = REDIS_DEFAULT_VERBOSITY;
    // Client timeout in seconds: clients idle for longer than this are forcibly closed
    server.maxidletime = REDIS_MAXIDLETIME;
    // Set SO_KEEPALIVE if non-zero
    server.tcpkeepalive = REDIS_DEFAULT_TCP_KEEPALIVE;
    // When enabled, expired keys are cleaned up periodically
    server.active_expire_enabled = 1;
    // Maximum size of the client query buffer. For a set command, for example,
    // a value larger than this buffer cannot be placed in the buffer at all
    server.client_max_querybuf_len = REDIS_MAX_QUERYBUF_LEN;

    // RDB save parameters, e.g. save every 60s if n keys were modified
    server.saveparams = NULL;
    // We are loading data from disk if true
    server.loading = 0;
    // Log file location
    server.logfile = zstrdup(REDIS_DEFAULT_LOGFILE);
    // syslog settings
    server.syslog_enabled = REDIS_DEFAULT_SYSLOG_ENABLED;
    server.syslog_ident = zstrdup(REDIS_DEFAULT_SYSLOG_IDENT);
    server.syslog_facility = LOG_LOCAL0;
    // Run in the background (daemonize)
    server.daemonize = REDIS_DEFAULT_DAEMONIZE;
    // AOF state
    server.aof_state = REDIS_AOF_OFF;
    // AOF fsync policy; by default fsync every second
    server.aof_fsync = REDIS_DEFAULT_AOF_FSYNC;
    // Do not fsync while a rewrite is in progress
    server.aof_no_fsync_on_rewrite = REDIS_DEFAULT_AOF_NO_FSYNC_ON_REWRITE;
    // Rewrite AOF if % growth is > M and...
    server.aof_rewrite_perc = REDIS_AOF_REWRITE_PERC;
    // ...the AOF file is at least N bytes (minimum AOF size that triggers a rewrite)
    server.aof_rewrite_min_size = REDIS_AOF_REWRITE_MIN_SIZE;
    // AOF file size at the last BGREWRITEAOF
    server.aof_rewrite_base_size = 0;
    // Rewrite once BGSAVE terminates: when set, a rewrite is triggered after BGSAVE finishes
    server.aof_rewrite_scheduled = 0;
    // Time of the most recent AOF fsync
    server.aof_last_fsync = time(NULL);
    // Time taken by the most recent AOF rewrite
    server.aof_rewrite_time_last = -1;
    // Current AOF rewrite start time
    server.aof_rewrite_time_start = -1;
    // Result of the last BGREWRITEAOF
    server.aof_lastbgrewrite_status = REDIS_OK;
    // Number of times the AOF fsync has been delayed
    server.aof_delayed_fsync = 0;
    // File descriptor of currently selected AOF file
    server.aof_fd = -1;
    // Currently selected database in the AOF
    server.aof_selected_db = -1; /* Make sure the first time will not match */
    // UNIX time of postponed AOF flush
    server.aof_flush_postponed_start = 0;
    // fsync incrementally while rewriting?
    server.aof_rewrite_incremental_fsync = REDIS_DEFAULT_AOF_REWRITE_INCREMENTAL_FSYNC;
    // PID file
    server.pidfile = zstrdup(REDIS_DEFAULT_PID_FILE);
    // RDB file name
    server.rdb_filename = zstrdup(REDIS_DEFAULT_RDB_FILENAME);
    // AOF file name
    server.aof_filename = zstrdup(REDIS_DEFAULT_AOF_FILENAME);
    // Password required from clients (NULL means none)
    server.requirepass = NULL;
    // Whether to compress the RDB file
    server.rdb_compression = REDIS_DEFAULT_RDB_COMPRESSION;
    // RDB checksum
    server.rdb_checksum = REDIS_DEFAULT_RDB_CHECKSUM;
    // Stop accepting writes if BGSAVE fails
    server.stop_writes_on_bgsave_err = REDIS_DEFAULT_STOP_WRITES_ON_BGSAVE_ERROR;
    // Perform incremental rehashing while serverCron() runs
    server.activerehashing = REDIS_DEFAULT_ACTIVE_REHASHING;

    server.notify_keyspace_events = 0;
    // Maximum number of clients supported
    server.maxclients = REDIS_MAX_CLIENTS;
    // Number of clients blocked by blocking pop operations
    server.bpop_blocked_clients = 0;
    // Maximum memory that can be used
    server.maxmemory = REDIS_DEFAULT_MAXMEMORY;
    // Memory eviction policy: which keys to evict once maxmemory is reached
    server.maxmemory_policy = REDIS_DEFAULT_MAXMEMORY_POLICY;
    server.maxmemory_samples = REDIS_DEFAULT_MAXMEMORY_SAMPLES;
    // Hashes with fewer entries than this use the ziplist encoding; the settings below are analogous
    server.hash_max_ziplist_entries = REDIS_HASH_MAX_ZIPLIST_ENTRIES;
    server.hash_max_ziplist_value = REDIS_HASH_MAX_ZIPLIST_VALUE;
    server.list_max_ziplist_entries = REDIS_LIST_MAX_ZIPLIST_ENTRIES;
    server.list_max_ziplist_value = REDIS_LIST_MAX_ZIPLIST_VALUE;
    server.set_max_intset_entries = REDIS_SET_MAX_INTSET_ENTRIES;
    server.zset_max_ziplist_entries = REDIS_ZSET_MAX_ZIPLIST_ENTRIES;
    server.zset_max_ziplist_value = REDIS_ZSET_MAX_ZIPLIST_VALUE;
    server.hll_sparse_max_bytes = REDIS_DEFAULT_HLL_SPARSE_MAX_BYTES;
    // When this flag is set, the server is shutting down
    server.shutdown_asap = 0;
    // Replication
    server.repl_ping_slave_period = REDIS_REPL_PING_SLAVE_PERIOD;
    server.repl_timeout = REDIS_REPL_TIMEOUT;
    server.repl_min_slaves_to_write = REDIS_DEFAULT_MIN_SLAVES_TO_WRITE;
    server.repl_min_slaves_max_lag = REDIS_DEFAULT_MIN_SLAVES_MAX_LAG;
    // Cluster mode
    server.cluster_enabled = 0;
    server.cluster_node_timeout = REDIS_CLUSTER_DEFAULT_NODE_TIMEOUT;
    server.cluster_migration_barrier = REDIS_CLUSTER_DEFAULT_MIGRATION_BARRIER;
    server.cluster_configfile = zstrdup(REDIS_DEFAULT_CLUSTER_CONFIG_FILE);
    // Lua scripting
    server.lua_caller = NULL;
    server.lua_time_limit = REDIS_LUA_TIME_LIMIT;
    server.lua_client = NULL;
    server.lua_timedout = 0;
    // Cached sockets used by MIGRATE
    server.migrate_cached_sockets = dictCreate(&migrateCacheDictType,NULL);
    server.loading_process_events_interval_bytes = (1024*1024*2);

    // Initialize the LRU clock
    server.lruclock = getLRUClock();

    // Reset (initialize) the save conditions
    resetServerSaveParams();

    // Default RDB save policy
    appendServerSaveParams(60*60,1);  /* save after 1 hour and 1 change */
    appendServerSaveParams(300,100);  /* save after 5 minutes and 100 changes */
    appendServerSaveParams(60,10000); /* save after 1 minute and 10000 changes */

    /* Replication related */
    // Initialize replication-related state
    server.masterauth = NULL;
    server.masterhost = NULL;
    server.masterport = 6379;
    server.master = NULL;
    server.cached_master = NULL;
    server.repl_master_initial_offset = -1;
    server.repl_state = REDIS_REPL_NONE;
    server.repl_syncio_timeout = REDIS_REPL_SYNCIO_TIMEOUT;
    server.repl_serve_stale_data = REDIS_DEFAULT_SLAVE_SERVE_STALE_DATA;
    server.repl_slave_ro = REDIS_DEFAULT_SLAVE_READ_ONLY;
    server.repl_down_since = 0; /* Never connected, repl is down since EVER. */
    server.repl_disable_tcp_nodelay = REDIS_DEFAULT_REPL_DISABLE_TCP_NODELAY;
    server.slave_priority = REDIS_DEFAULT_SLAVE_PRIORITY;
    server.master_repl_offset = 0;

    /* Replication partial resync backlog */
    // Initialize the backlog used by the PSYNC command
    server.repl_backlog = NULL;
    server.repl_backlog_size = REDIS_DEFAULT_REPL_BACKLOG_SIZE;
    server.repl_backlog_histlen = 0;
    server.repl_backlog_idx = 0;
    server.repl_backlog_off = 0;
    server.repl_backlog_time_limit = REDIS_DEFAULT_REPL_BACKLOG_TIME_LIMIT;
    server.repl_no_slaves_since = time(NULL);

    /* Client output buffer limits */
    // Set the client output buffer limits
    for (j = 0; j < REDIS_CLIENT_LIMIT_NUM_CLASSES; j++)
        server.client_obuf_limits[j] = clientBufferLimitsDefaults[j];

    /* Double constants initialization */
    // Initialize floating-point constants
    R_Zero = 0.0;
    R_PosInf = 1.0/R_Zero;
    R_NegInf = -1.0/R_Zero;
    R_Nan = R_Zero/R_Zero;


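    // Initialize the command table: the handler functions for get, set, hset, etc.
    // are put into a hash table so that later requests can be dispatched easily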
    server.commands = dictCreate(&commandTableDictType,NULL);
    server.orig_commands = dictCreate(&commandTableDictType,NULL);
    populateCommandTable();
    server.delCommand = lookupCommandByCString("del");
    server.multiCommand = lookupCommandByCString("multi");
    server.lpushCommand = lookupCommandByCString("lpush");
    server.lpopCommand = lookupCommandByCString("lpop");
    server.rpopCommand = lookupCommandByCString("rpop");
    
    /* Slow log */
    // Initialize the slow query log
    server.slowlog_log_slower_than = REDIS_SLOWLOG_LOG_SLOWER_THAN;
    server.slowlog_max_len = REDIS_SLOWLOG_MAX_LEN;

    /* Debugging */
    // Initialize debugging fields
    server.assert_failed = "<no assertion failed>";
    server.assert_file = "<no file>";
    server.assert_line = 0;
    server.bug_report_start = 0;
    server.watchdog_period = 0;
}

Everything above is commented. For now we can skip replication, cluster, Lua, and so on, and look at the rest first.

Summary

It has been a long time since I touched C and I have forgotten a lot, but overall it is not difficult. The hard part is things like memory leaks, but since we are only debugging here, we do not need to worry about them.

The pointer part requires some background; it is worth taking a moment to learn it.

If you have any questions or suggestions, please let me know.

Origin www.cnblogs.com/grey-wolf/p/12682760.html