MongoDB kernel source code implementation, performance tuning, and operations best practices series: command processing module source code implementation, part 2

About the author

       Former technical expert at Didi Chuxing, now head of mongodb at OPPO, responsible for the R&D and operations of the mongodb kernel behind OPPO's document database, which handles peak TPS on the order of 10 million and data volumes on the order of 10 trillion. He has long focused on the research and development of distributed caches, high-performance servers, databases, and middleware, and will continue to share the series "MongoDB kernel source code design, performance optimization, best operation and maintenance practices". GitHub account: https://github.com/y123456yz

Background

       The article "transport_layer network transport layer module source code implementation" covered how the mongodb kernel handles low-level network I/O: socket initialization, reading a complete mongodb message, and sending the resulting data back to the client. Mongodb supports many operations, such as insert, delete, update, query, aggregation, and cluster management. Each operation corresponds to a command in the kernel, and each command does something different. How the kernel processes these commands is the focus of this article.

In addition, mongodb provides the mongostat tool to monitor various operational statistics of the current cluster. Mongostat monitoring statistics are shown in the following figure:

       The insert, delete, update, and query columns are easy to understand: they count insert, delete, update, and query operations respectively. The command and getmore columns are less obvious. What does command count, and what does getmore count?

1. Review of the command processing module

       In "Mongodb command processing module source code implementation 1", we saw that after a client request arrives, the mongodb server processes it roughly as follows:

  • Parse the message header to delimit one complete mongodb message.
  • Parse the OpCode from the message. In version 3.6 the default OpCode is OP_MSG.
  • Based on the OP_MSG OpCode, construct the corresponding OpMsg class. The actual command request is stored in the body member of this class in BSON format.
  • Parse the command name string (such as "insert" or "update") from the body.
  • Look the command up in the global _commands map table. If it is supported, continue processing; otherwise return an error directly.
  • Once the matching command is found, call its run interface.

       The commands supported by the mongodb kernel are stored in a global map table, _commands. After the command name is parsed from the request BSON, it is looked up in this table: a hit means mongodb supports the command, a miss means it does not. The whole process is summarized in the following figure:

       After the command name string (for example "insert" or "delete") is parsed from the OpMsg class, it is looked up in the global map table _commands. If found, the corresponding command is executed; if not, the command is unsupported and an error is returned.
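The lookup-then-run flow described above can be sketched as follows. This is a minimal illustration, not the kernel code: the Command and InsertCommand types, the commandTable and dispatch helpers, and the error text are all assumptions made for the example, and requests are plain strings rather than BSON.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the kernel's command base class.
struct Command {
    virtual ~Command() = default;
    virtual std::string run(const std::string& request) = 0;
};

// Hypothetical handler used only for this illustration.
struct InsertCommand : Command {
    std::string run(const std::string& request) override {
        return "inserted: " + request;
    }
};

// Simplified model of the global _commands table: command name -> handler.
using CommandMap = std::map<std::string, Command*>;

CommandMap& commandTable() {
    static InsertCommand insertCmd;
    static CommandMap table = {{"insert", &insertCmd}};
    return table;
}

// Returns nullptr when no command is registered under the given name.
Command* findCommand(const std::string& name) {
    const CommandMap& table = commandTable();
    auto it = table.find(name);
    return it == table.end() ? nullptr : it->second;
}

// Look the command up; run it if supported, otherwise report an error.
std::string dispatch(const std::string& name, const std::string& request) {
    Command* cmd = findCommand(name);
    if (cmd == nullptr)
        return "no such command: '" + name + "'";
    return cmd->run(request);
}

// Usage example: one supported and one unsupported command name.
void dispatchExample() {
    assert(dispatch("insert", "{x: 1}") == "inserted: {x: 1}");
    assert(dispatch("bogus", "{}") == "no such command: 'bogus'");
}
```

Because the set of supported commands is just the content of the table, two processes built from the same base code can support different commands simply by registering different entries.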

       The commands that a given mongodb instance supports therefore depend entirely on the global map table _commands. The following sections analyze where the entries in this table come from.

2. Command processing module source code directory structure

       A mongodb cluster usually contains three instance roles: mongos, mongod (shardServer), and mongod (configServer). Their roles are as follows:

  • mongos: proxy; obtains routing information from configServer and forwards client requests to the shards.
  • mongod (shardServer): data storage node; all client data is stored on the shards.
  • mongod (configServer): records data routing information and some metadata.

       The mongos proxy has its own process name, "mongos", so the commands it supports are easy to identify. The shardServer and configServer processes, however, are both named "mongod". How do we tell which commands each of them supports?

       A configServer is in effect a special shardServer. Besides acting as a data shard, it also manages metadata, such as chunk metadata, mongos information, and the sharding operation log. So in addition to the shardServer commands, a configServer supports some commands of its own.

       The commands supported by the mongos proxy are all implemented in the src/mongo/s/commands directory. The source files are as follows:

       The commands supported by mongod (shardServer) are all implemented in the src/mongo/db/commands directory. The source files are as follows:

       mongod (configServer) supports almost all the commands that shardServer supports (with a few exceptions, such as "mapreduce.shardedfinish"), plus some special commands of its own, which are implemented in the src/mongo/db/s/config directory. The source files are as follows:

       As the directories above show, the mongodb source tree is organized so that the commands supported by each instance role can be identified at a glance from the directory structure, which makes the code easy to navigate. The directory structure can be summarized in the following table:

       The sets of commands supported by configServer and shardServer are related roughly by inclusion, as in the following figure. The small oval represents shardServer and the large circle represents configServer:

3. Command module class inheritance relationship

       The directory structure in Section 2 shows that most commands are implemented in their own source file; for example, find_cmd.cpp handles the "find" command. Some source files implement several commands: write_commands.cpp handles "insert", "update", and "delete".

       Because there are so many commands, it helps to understand the inheritance relationships among the command classes before analyzing the core code. Different commands do different things and need different implementations, but all commands share some common interfaces, such as whether the command requires authentication, whether it can run on a slave node, and whether it supports WriteConcern.

       Commands therefore have both shared and individual characteristics. The mongodb source code reflects this: the shared interfaces are abstracted into base classes, and the behavior specific to each command is implemented in the derived classes. The main inheritance relationships of the core classes in the command processing module are shown in the following figure:

       As the figure above shows, the classes of the command processing module form four layers of inheritance. Each layer does the following:

  • CommandInterface class: a pure virtual interface class. It only declares the virtual interfaces and contains no implementation.
  • Command class: implements basic checks and accessors, such as whether the command can run on a slave node, whether it requires authentication, whether it supports WriteConcern, what its name is, and whether it may only run against the admin database.
  • BasicCommand class: implements the authentication-related interfaces and defines the virtual run interface.
  • Concrete command classes: each command has its own class at this layer, which provides the actual run implementation.
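The four layers can be sketched as a small class hierarchy. This is a simplified model under stated assumptions: the method signatures are reduced to the bare minimum (the real interfaces take kernel types such as OperationContext and BSON objects), authentication is omitted, and PingLikeCommand with the name "pingLike" is a hypothetical command invented for the illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Layer 1: interface only, no implementation.
class CommandInterface {
public:
    virtual ~CommandInterface() = default;
    virtual const std::string& getName() const = 0;
    virtual bool slaveOk() const = 0;
    virtual bool adminOnly() const = 0;
    virtual bool supportsWriteConcern() const = 0;
};

// Layer 2: stores the name and supplies defaults for some common checks.
class Command : public CommandInterface {
public:
    explicit Command(std::string name) : _name(std::move(name)) {}
    const std::string& getName() const final { return _name; }
    bool adminOnly() const override { return false; }

private:
    std::string _name;
};

// Layer 3: declares the virtual run interface (simplified signature).
class BasicCommand : public Command {
public:
    explicit BasicCommand(std::string name) : Command(std::move(name)) {}
    virtual bool run(const std::string& db, std::string& reply) = 0;
};

// Layer 4: a concrete, hypothetical command with its own run logic.
class PingLikeCommand final : public BasicCommand {
public:
    PingLikeCommand() : BasicCommand("pingLike") {}
    bool slaveOk() const override { return true; }
    bool supportsWriteConcern() const override { return false; }
    bool run(const std::string& db, std::string& reply) override {
        reply = "ok from " + db;
        return true;
    }
};

// Usage example: callers work through the base-class interfaces.
void hierarchyExample() {
    PingLikeCommand cmd;
    BasicCommand& base = cmd;
    std::string reply;
    assert(base.run("test", reply));
    assert(reply == "ok from test");
    assert(base.getName() == "pingLike");
}
```

Only the last layer changes from command to command; the checks in the upper layers are written once and shared.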

4. Command registration core code implementation

       As described above, once the command name string (such as "insert" or "update") has been parsed, it is looked up in the global map table _commands. If it is found, the command is supported; otherwise it is not. The table holds every command that the instance supports, so each command must be registered in it beforehand. There are two ways to register a command:

  • Define a global variable of the command's class
  • Create an instance of the command class with new

       The registration itself is done by the constructor of the Command class. The core code is as follows:

//Command registration: every registered command ends up in the global _commands map table
//name and oldName refer to the same command; oldName exists because a command may have been renamed for historical reasons
Command::Command(StringData name, StringData oldName)
    //Command name string
    : _name(name.toString()),
      //Per-command execution statistics: total is the number of executions, failed is the number of failed executions
      _commandsExecutedMetric("commands." + _name + ".total", &_commandsExecuted),
      _commandsFailedMetric("commands." + _name + ".failed", &_commandsFailed) {
    //Create the _commands map table if it does not exist yet
    if (_commands == 0)
        _commands = new CommandMap();
    ......
    //Add the command registered under name to the map table
    Command*& c = (*_commands)[name];
    if (c)
        log() << "warning: 2 commands with name: " << _name;
    c = this;
    ......

    //For most commands name and oldName are the same, so only one entry is recorded in the table
    //If the command was renamed, name and oldName differ, and both are registered in the map table for the same command
    if (!oldName.empty()) //name and oldName both map to this same object
        (*_commands)[oldName.toString()] = this;
}

       The Command constructor takes two parameters: the current command name and the old command name. The old name is kept for backward compatibility.
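The constructor-based registration can be modelled in a few lines. The following is a minimal sketch rather than the kernel code: it uses std::string in place of StringData, drops the metrics and the duplicate-name warning, and RenamedCommand with the names "newName" and "oldName" is a hypothetical command made up for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

class Command;
using CommandMap = std::map<std::string, Command*>;

class Command {
public:
    explicit Command(const std::string& name, const std::string& oldName = "")
        : _name(name) {
        // The table is created lazily, so registration works regardless of
        // which global command object happens to be constructed first.
        if (_commands == nullptr)
            _commands = new CommandMap();
        (*_commands)[name] = this;
        // A renamed command is reachable under both names.
        if (!oldName.empty())
            (*_commands)[oldName] = this;
    }
    virtual ~Command() = default;

    const std::string& getName() const { return _name; }

    // Returns nullptr when the name was never registered.
    static Command* findCommand(const std::string& name) {
        if (_commands == nullptr)
            return nullptr;
        auto it = _commands->find(name);
        return it == _commands->end() ? nullptr : it->second;
    }

private:
    std::string _name;
    static CommandMap* _commands;
};

// Zero-initialized before any global command constructor runs.
CommandMap* Command::_commands = nullptr;

// Hypothetical renamed command, registered by defining a global variable.
class RenamedCommand : public Command {
public:
    RenamedCommand() : Command("newName", "oldName") {}
} renamedCommand;

// Usage example: both names resolve to the same object.
void registrationLookupExample() {
    assert(Command::findCommand("newName") == &renamedCommand);
    assert(Command::findCommand("oldName") == &renamedCommand);
    assert(Command::findCommand("missing") == nullptr);
}
```

Holding the table through a plain pointer is what makes the lazy creation safe here: the pointer is initialized to null before any dynamic initialization, so the first command constructed always finds a well-defined value.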

4.1 Command registration method 1

       More than 99% of commands are registered by defining a global class variable. This article takes "insert", "update", "delete", and "find" on a shardServer instance as examples. The write commands are registered as follows:

//insert command
class CmdInsert : public WriteCommand {
public:
    //insert command constructor
    CmdInsert() : WriteCommand("insert") {}
    ......
    //Authentication check
    Status checkAuthForRequest(...) final {
        ......
    }

    //The actual document insert is performed here
    void runImpl(...);
} cmdInsert; //Define a global variable cmdInsert directly

//update command
class CmdUpdate : public WriteCommand {
public:
    //update command constructor
    CmdUpdate() : WriteCommand("update") {}
    ......
    //Authentication check
    Status checkAuthForRequest(...) final {
        ......
    }
    //Explain: query plan execution
    Status explain(...) const override {
        ......
    }
    //The actual document update is performed here
    void runImpl(...);
} cmdUpdate; //Define a global variable cmdUpdate directly

//delete command
class CmdDelete : public WriteCommand {
public:
    //delete command constructor
    CmdDelete() : WriteCommand("delete") {}
    ......
    //Authentication check
    Status checkAuthForRequest(...) final {
        ......
    }
    //Explain: query plan execution
    Status explain(...) const override {
        ......
    }

    //The actual document delete is performed here
    void runImpl(...);
} cmdDelete; //Define a global variable cmdDelete directly

       The "find" command is registered the same way, by defining a global variable of the FindCmd class. The code is as follows:

//Implementation class of the find command
class FindCmd : public BasicCommand {
public:
    //Constructor
    FindCmd() : BasicCommand("find") {}
    ......

    //Explain: query plan execution
    Status explain(...) const override {
        ......
    }
} findCmd; //Define a global variable findCmd directly

       The classes above show not only how the shardServer read and write commands are registered, but also that their inheritance chains differ slightly. The FindCmd (query) class inherits directly from BasicCommand, whereas the three write commands, CmdInsert (insert), CmdDelete (delete), and CmdUpdate (update), inherit through the intermediate WriteCommand class. WriteCommand implements the interfaces common to all write commands, and the three subclasses implement what is specific to each.

       For a shardServer instance, the inheritance relationships of the insert, delete, update, and query commands are summarized in the following figure:
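The role of the intermediate WriteCommand class can be illustrated with a small sketch. It is a simplified model under assumptions: the shared entry point is called run here, arguments and results are plain strings, and the bodies of runImpl are placeholders; only the class names and the runImpl hook come from the listing above.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Intermediate class: holds what all write commands have in common.
class WriteCommand {
public:
    explicit WriteCommand(std::string name) : _name(std::move(name)) {}
    virtual ~WriteCommand() = default;

    // Shared property, written once for every write command.
    bool supportsWriteConcern() const { return true; }

    // Shared entry point: common handling, then delegate to the subclass.
    std::string run(const std::string& doc) {
        return _name + " -> " + runImpl(doc);
    }

private:
    // Command-specific work supplied by each subclass.
    virtual std::string runImpl(const std::string& doc) = 0;
    std::string _name;
};

class CmdInsert final : public WriteCommand {
public:
    CmdInsert() : WriteCommand("insert") {}
private:
    std::string runImpl(const std::string& doc) override {
        return "inserted " + doc;
    }
};

class CmdUpdate final : public WriteCommand {
public:
    CmdUpdate() : WriteCommand("update") {}
private:
    std::string runImpl(const std::string& doc) override {
        return "updated " + doc;
    }
};

class CmdDelete final : public WriteCommand {
public:
    CmdDelete() : WriteCommand("delete") {}
private:
    std::string runImpl(const std::string& doc) override {
        return "deleted " + doc;
    }
};

// Usage example: the same entry point reaches three different runImpl bodies.
void writeCommandExample() {
    CmdInsert ins;
    CmdUpdate upd;
    CmdDelete del;
    assert(ins.run("a") == "insert -> inserted a");
    assert(upd.run("a") == "update -> updated a");
    assert(del.run("a") == "delete -> deleted a");
}
```

A read command such as find has no need for this shared write handling, which is consistent with FindCmd inheriting from BasicCommand directly.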

4.2 Command registration method 2

       Besides defining a global command class variable, the mongodb kernel registers some commands by creating the command object with new. The commands related to the plan cache (planCache), for example, are registered this way. The code is as follows:

//Registration of the plan cache commands, done with new
MONGO_INITIALIZER_WITH_PREREQUISITES(SetupPlanCacheCommands, MONGO_NO_PREREQUISITES)
(InitializerContext* context) {
    //Register the plan cache commands
    new PlanCacheListQueryShapes();
    new PlanCacheClear();
    new PlanCacheListPlans();
    return Status::OK();
}

//Registration of several test commands, also done with new
MONGO_INITIALIZER(RegisterEmptyCappedCmd)(InitializerContext* context) {
    //These commands are only available when testCommandsEnabled is set
    if (Command::testCommandsEnabled) {
        new CapTrunc();
        new CmdSleep();
        new EmptyCapped();
        new GodInsert();
    }
    return Status::OK();
}

       This completes the analysis of command registration in the mongodb kernel. To register a new command, follow the same pattern.
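The second registration method can be sketched in the same simplified style. The assumptions here are that ordinary functions stand in for the MONGO_INITIALIZER bodies, that the test switch is passed as a parameter instead of being read from Command::testCommandsEnabled, and that a function-local static map stands in for _commands. The isRegistered helper is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

class Command {
public:
    // Constructing a command registers it, as in the kernel constructor.
    explicit Command(const std::string& name) { registry()[name] = this; }
    virtual ~Command() = default;

    // Simplified stand-in for the global _commands table.
    static std::map<std::string, Command*>& registry() {
        static std::map<std::string, Command*> commands;
        return commands;
    }

    // Hypothetical helper for the example.
    static bool isRegistered(const std::string& name) {
        return registry().count(name) > 0;
    }
};

struct PlanCacheClear : Command {
    PlanCacheClear() : Command("planCacheClear") {}
};

struct CmdSleep : Command {
    CmdSleep() : Command("sleep") {}
};

// Stand-in for an initializer body: the object is created with new and is
// intentionally never deleted, because the table refers to it for the
// lifetime of the process.
void setupPlanCacheCommands() {
    new PlanCacheClear();
}

// Test-only commands are registered only when the switch is enabled.
void registerTestCommands(bool testCommandsEnabled) {
    if (testCommandsEnabled)
        new CmdSleep();
}

// Usage example: with the switch off, the test command stays unknown.
void conditionalRegistrationExample() {
    setupPlanCacheCommands();
    registerTestCommands(false);
    assert(Command::isRegistered("planCacheClear"));
    assert(!Command::isRegistered("sleep"));
}
```

Compared with a global variable, creating the object inside a function makes it possible to decide at startup whether a command exists at all, which is what the test-command initializer in the listing above relies on.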

5. mongos, mongod (shardServer), mongod (configServer) naming convention

       Different mongodb instance roles support different commands, and those commands are implemented in different source files. The source tree is laid out so that the file name tells you both which command a file implements and which instance role the command belongs to. To recap, the command code directories for the different roles are:

  • mongos proxy: code directory src/mongo/s/commands
  • mongod (shardServer): code directory src/mongo/db/commands
  • mongod (configServer): code directory src/mongo/db/s/config

       Beyond the separate directories, the source file names and command class names also differ between roles, and they follow specific naming conventions. The following sections use mongod (both shardServer and configServer) and the mongos proxy to illustrate how the source files and classes for the most common insert, delete, update, and query commands are named.

       Working out the naming conventions for each role in advance makes the rest of the code much easier to follow. It also makes it quick to locate the source file and core implementation of any command, and to infer where related commands live.

5.1 mongos and mongod (including shardServer and configServer) naming convention

       The write commands (insert, delete, and update) of a mongod instance are implemented in write_commands.cpp. The CmdInsert, CmdDelete, and CmdUpdate classes in this file correspond to the insert, delete, and update commands respectively. The read command is implemented in find_cmd.cpp, and its command class is FindCmd.

       Besides mongod instances, mongos, as a forwarding proxy, also supports the insert, delete, and update operations. The mongos proxy is required when mongodb is deployed as a sharded cluster, in which case the proxy is the entry point for client access. Because the proxy exists only in sharded cluster mode, the source files and command classes for mongos commands carry a special marker: compared with their mongod counterparts, almost all of them add "cluster" to the name.

       Taking insert, delete, update, find, isMaster, getMore, and findAndModify as examples, the commands supported by mongos and mongod (including shardServer and configServer) are summarized as follows:

       As the file and class names above show, most mongos commands carry the "cluster" marker, with some exceptions: the class for findAndModify, for example, does not have it.

       Some mongos and mongod commands do not follow the above conventions at all, such as "dropIndexes", "createIndexes", "reIndex", "create", and "renameCollection". Their naming is as follows:

       In short, most mongos command source files and class names add the "cluster" marker relative to mongod, but a few commands follow no such rule. To find the source file that implements a particular command, search for the command name string in the code directories of the three instance roles listed above. Note: include the double quotes in the search.

5.2 mongod (configServer) specific command naming rules

       As with mongos, the source files for commands specific to configServer add a "configsvr" marker relative to shardServer, so the file name alone shows that a command is specific to configServer.

       The corresponding command classes likewise carry "ConfigSvr" in their names, for example class ConfigSvrAddShardCommand{} and class ConfigSvrMoveChunkCommand{}. This mirrors the naming of the commands supported by the mongos proxy.

5.3 Summary of naming rules

       The naming rules above can be summarized in the following figure:

7. Command run

       Combining "Command Processing Module Source Code Implementation 1" with the command processing flow in this article, the runCommandImpl interface ends up invoking the run interface of the specific command through the following call chain. The insert write path and the read path are taken as examples here. The call chain for a write on a mongod instance is shown in the following figure:

       In the end, both mongod and mongos call the run interface of the relevant command to carry out the command. The commonly used commands of mongos, mongod (shardServer), and mongod (configServer), taking the most basic read and write commands as examples, are summarized in the following table:
