In-depth understanding of the CommonJS specification and Node's module implementation

 

Reprinted 2017-05-17 Author: Little Match's blue ideal

This article takes an in-depth look at the CommonJS specification and at how Node implements modules.
 

Foreword

Node does not implement the CommonJS specification completely. It makes certain trade-offs on the module specification and adds a few features of its own. This article describes Node's module implementation in detail.

Introduction

Node.js differs from JavaScript running in the browser: in the browser the top-level object is window, while in Node the top-level object is global.

[Note] Browser JavaScript also has a global object, but it is not exposed directly; the window object refers to it.

In browser JavaScript, after var a = 100; at the top level of a script, the value 100 can be read through window.a.

In Node.js, the same declaration cannot be read through global.a; the result is undefined.

This is because the variable a declared by var a = 100; belongs to the module scope and is not a property of the global object.

In Node.js, a file is a module, and each module has its own scope. A variable declared with var is not global; it belongs to the current module.

To put a variable in the global scope, attach it to the global object explicitly, for example global.a = 100;
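The scoping rule above can be checked with a small script. This is a minimal sketch, assuming it runs under Node as a CommonJS script; the temp directory, the file name a.js, and the names localOnly and sharedValue are made up for the illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a throwaway module to a temp directory (names are illustrative).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'scope-demo-'));
const file = path.join(dir, 'a.js');
fs.writeFileSync(file, [
  'var localOnly = 100;        // module-scoped, invisible outside this file',
  'global.sharedValue = 200;   // explicitly attached to the global object',
  'module.exports = typeof global.localOnly;',
].join('\n'));

const seenInsideModule = require(file);

console.log(seenInsideModule);   // 'undefined': var did not create global.localOnly
console.log(global.localOnly);   // undefined
console.log(global.sharedValue); // 200
```

Only the value assigned through global is visible from the outside; the var declaration stays private to a.js.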

Overview

Modules in Node fall into two categories: modules provided by Node itself, called core modules, and modules written by users, called file modules.

Core modules are compiled into the binary executable when the Node source code is built. When a Node process starts, some core modules are loaded straight into memory, so requiring them skips the file location step and the compilation and execution step. They also take priority during path analysis, so they load the fastest.

File modules are loaded dynamically at runtime and go through the complete path analysis, file location, compilation, and execution process, so they load more slowly than core modules.
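As a rough illustration of the two categories, the sketch below compares how a core module and a file module resolve. It assumes a Node version that exposes builtinModules on the module package; the temp file user-module.js is hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { builtinModules } = require('module');

// A core module is built into Node and resolves by name alone.
const coreId = require.resolve('fs');
console.log(builtinModules.includes('fs')); // true
console.log(coreId);                        // 'fs'

// A file module has to be located on disk, so it resolves to an absolute path.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'kind-demo-'));
const file = path.join(dir, 'user-module.js');
fs.writeFileSync(file, 'module.exports = 42;');
const fileId = require.resolve(file);
console.log(path.isAbsolute(fileId));       // true
```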

The following sections walk through the module loading process in detail.

Module loading

In the browser, a script tag can be used to load another script. In Node.js, how does one module load another?

Use the require() method to import it.

【Cache Loading】

Before looking at how require() analyzes identifiers, note that, just as browsers cache static script files to improve performance, Node caches imported modules to reduce the cost of importing them again. The difference is that browsers cache files, while Node caches the objects produced by compiling and executing them.

Whether the target is a core module or a file module, require() always checks the cache first when the same module is loaded again; this check has the highest priority. The difference is that the cache check for core modules comes before the cache check for file modules.
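The caching behavior can be observed through require.cache. The sketch below is only an illustration: counter.js and the global __loadCount counter are invented for the demo, and a global is used purely to count how often the module body runs.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'counter.js');
// The module body bumps a counter each time it is actually executed.
fs.writeFileSync(file, [
  'global.__loadCount = (global.__loadCount || 0) + 1;',
  'module.exports = { createdAt: Date.now() };',
].join('\n'));

const first = require(file);
const second = require(file);      // served from the cache, body is not re-run

console.log(first === second);     // true: the same exports object
console.log(global.__loadCount);   // 1

// The cache is keyed by the resolved file name.
const key = require.resolve(file);
console.log(key in require.cache); // true

// Deleting the entry forces a fresh compile and execution on the next require().
delete require.cache[key];
const third = require(file);
console.log(third === first);      // false
console.log(global.__loadCount);   // 2
```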

【Identifier Analysis】

The require() method accepts an identifier as its parameter, and Node performs the module lookup based on that identifier. Module identifiers in Node fall into the following categories: [1] core modules, such as http, fs, and path; [2] relative path file modules starting with . or ..; [3] absolute path file modules starting with /; [4] non-path file modules, such as a custom connect module.

Depending on the format of the parameter, require() looks for the module file in different places.

1. If the parameter string starts with "/", the module file at that absolute path is loaded. For example, require('/home/marco/foo.js') loads /home/marco/foo.js.

2. If the parameter string starts with "./" (or "../"), the module file is located at a path relative to the directory of the file that calls require(). For example, require('./circle') loads circle.js from the same directory as the current script.

3. If the parameter string does not start with "./", "../", or "/", it refers either to a core module (built into Node) or to an installed module located in a node_modules directory at some level (installed locally or globally).

[Note] A file module in the current directory must be referenced with a leading ./, otherwise Node.js tries to load a core module or a module from node_modules.

// a.js
console.log('aaa');

// b.js
require('./a'); // prints 'aaa'
require('a');   // throws: Cannot find module 'a'
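The lookup rules can also be inspected without loading anything, using require.resolve() and require.resolve.paths(). A minimal sketch follows; the package names some-package and hypothetical-module-that-is-not-installed are placeholders that are assumed not to exist.

```javascript
const path = require('path');

// Core module: resolves to its own name, and no directories are searched.
const coreResolved = require.resolve('path');
const corePaths = require.resolve.paths('path');
console.log(coreResolved); // 'path'
console.log(corePaths);    // null

// Bare identifier: candidate directories to search, nearest node_modules first.
// 'some-package' is a hypothetical name; nothing is loaded here.
const lookupDirs = require.resolve.paths('some-package');
console.log(lookupDirs.slice(0, 3));
console.log(lookupDirs.some((d) => path.basename(d) === 'node_modules')); // true

// A bare identifier that matches nothing fails with MODULE_NOT_FOUND.
let errorCode = null;
try {
  require('hypothetical-module-that-is-not-installed');
} catch (err) {
  errorCode = err.code;
}
console.log(errorCode); // 'MODULE_NOT_FOUND'
```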

【File extension analysis】

While require() analyzes an identifier, the identifier may lack a file extension, which the CommonJS module specification allows. In that case Node first checks for a file with exactly that name and, if none exists, tries the extensions .js, .json, and .node in that order.

Each attempt calls the fs module synchronously, blocking while it checks whether the file exists. Because Node runs JavaScript on a single thread, this can cause performance problems. One tip: for .node and .json files, including the extension in the identifier passed to require() is slightly faster. Another: synchronous lookups combined with caching greatly reduce the cost of blocking calls on Node's single thread.
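The extension order can be seen by giving two files the same base name. This is a sketch with made-up file names (settings.js and settings.json) in a temp directory.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'ext-demo-'));
// Two files share the base name "settings" (names are illustrative).
fs.writeFileSync(path.join(dir, 'settings.json'), '{"from": "json"}');
fs.writeFileSync(path.join(dir, 'settings.js'), 'module.exports = { from: "js" };');

// Without an extension, .js is tried before .json.
const implicit = require(path.join(dir, 'settings'));
console.log(implicit.from); // 'js'

// Spelling out the extension skips the guessing and picks the file directly.
const explicit = require(path.join(dir, 'settings.json'));
console.log(explicit.from); // 'json'
```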

【Directory Analysis and Packages】

While analyzing an identifier, require() may finish the extension analysis without finding a file and find a directory instead. This often happens when importing custom modules and searching the module paths one by one. In this case Node treats the directory as a package.

Here Node supports the CommonJS package specification to a certain extent. First, Node looks for package.json (the package description file defined by the CommonJS package specification) in that directory, parses it into a package description object with JSON.parse(), and uses the file name given by the main property to locate the entry file. If that file name lacks an extension, it goes through the extension analysis step.

If the file name given by the main property is wrong, or there is no package.json at all, Node uses index as the default file name and looks for index.js, index.json, and index.node in turn.

If directory analysis does not locate any file, the search for a custom module moves on to the next module path. If the whole module path array has been traversed and the target file is still not found, a lookup failure exception is thrown.
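Both directory cases can be reproduced with two small temp directories. The sketch below assumes hypothetical directory names (with-main and with-index) and a main value chosen for the demo.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const root = fs.mkdtempSync(path.join(os.tmpdir(), 'pkg-demo-'));

// Directory with a package.json whose "main" property names the entry file.
const withMain = path.join(root, 'with-main');
fs.mkdirSync(path.join(withMain, 'lib'), { recursive: true });
fs.writeFileSync(
  path.join(withMain, 'package.json'),
  JSON.stringify({ name: 'with-main', main: './lib/entry' }) // no extension on purpose
);
fs.writeFileSync(path.join(withMain, 'lib', 'entry.js'), 'module.exports = "lib/entry.js";');

// Directory without package.json: Node falls back to index.js.
const withIndex = path.join(root, 'with-index');
fs.mkdirSync(withIndex);
fs.writeFileSync(path.join(withIndex, 'index.js'), 'module.exports = "index.js";');

const fromMain = require(withMain);
const fromIndex = require(withIndex);
console.log(fromMain);  // 'lib/entry.js'
console.log(fromIndex); // 'index.js'
```

The main value has no extension, so the entry file is found only after the extension analysis step adds .js.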

Accessing variables

How can one module access variables defined in another module?

【global】

The most obvious way is to copy the variables a module defines onto the global object, so that another module can read them from there.

//a.js
var a = 100;
global.a = a;

//b.js
require('./a');
console.log(global.a);//100

Although this method is simple, it is not recommended because it will pollute the global environment.

【module】

The usual approach is to use the module object that Node.js provides. Each module object is an instance of Module and holds information about the current module

function Module(id, parent) {
  this.id = id;
  this.exports = {};
  this.parent = parent;
  if (parent && parent.children) {
    parent.children.push(this);
  }
  this.filename = null;
  this.loaded = false;
  this.children = [];
}
  1. module.id: the identifier of the module, usually the module file name with an absolute path.
  2. module.filename: the file name of the module, with an absolute path.
  3. module.loaded: a boolean indicating whether the module has finished loading.
  4. module.parent: the module that first required this module.
  5. module.children: an array of the modules required by this module.
  6. module.exports: the value the module exports to the outside world.
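Some of these properties can be read back at runtime. In the sketch below, a temp module (the hypothetical info.js) reports on its own module object while it runs, and the finished Module instance is then looked up in require.cache.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'module-demo-'));
const file = path.join(dir, 'info.js');
// The module reports what its own module object looks like while it runs.
fs.writeFileSync(file, [
  'module.exports = {',
  '  idIsFilename: module.id === module.filename,',
  '  loadedWhileRunning: module.loaded,',
  '};',
].join('\n'));

const info = require(file);
const record = require.cache[require.resolve(file)]; // the Module instance

console.log(info.idIsFilename);                // true
console.log(info.loadedWhileRunning);          // false: still executing at that point
console.log(record.loaded);                    // true: finished once require() returned
console.log(record.exports === info);          // true
console.log(path.isAbsolute(record.filename)); // true
```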

【exports】

The module.exports property is the external interface of the current module. When another file loads the module, what it actually receives is the value of module.exports.

//a.js
var a = 100;
module.exports.a = a;

//b.js
var result = require('./a');
console.log(result); // { a: 100 }

For convenience, Node gives each module an exports variable that points to module.exports. As a result, you can export the module's interface by adding methods to the exports object

console.log(module.exports === exports);//true

[Note] You cannot assign a new value directly to the exports variable, because that breaks the link between exports and module.exports. To export a single value, assign it to module.exports instead
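The difference between adding a property to exports and reassigning it shows up as soon as the modules are required. A small sketch, using three throwaway modules whose names (good.js, broken.js, replaced.js) are made up for the demo:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'exports-demo-'));
// Helper for the demo: write a module file and return its path.
const write = (name, source) => {
  const file = path.join(dir, name);
  fs.writeFileSync(file, source);
  return file;
};

// Adding a property keeps exports and module.exports pointing at the same object.
const good = require(write('good.js', 'exports.a = 100;'));

// Reassigning exports only rebinds the local variable; module.exports is untouched.
const broken = require(write('broken.js', 'exports = { a: 100 };'));

// To export a single value, assign it to module.exports.
const replaced = require(
  write('replaced.js', 'module.exports = function hello() { return "hi"; };')
);

console.log(good);       // { a: 100 }
console.log(broken);     // {}
console.log(replaced()); // 'hi'
```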

Module compilation

Compilation and execution are the final stage of loading a module. After locating the file, Node creates a new module object and then loads and compiles it according to its path. The loading method depends on the file extension, as shown below

.js file - read synchronously through the fs module, then compiled and executed

.node file - an extension written in C/C++; the compiled file is loaded through the dlopen() method

.json file - read synchronously through the fs module, then parsed with JSON.parse() to produce the result

Any other extension - loaded as a .js file

Each successfully compiled module is cached on the Module._cache object, indexed by its file path, to speed up repeated imports

Node calls a different loader depending on the file extension. For example, the loader for .json files looks like this:

// Native extension for .json
Module._extensions['.json'] = function(module, filename) {
  var content = NativeModule.require('fs').readFileSync(filename, 'utf8');
  try {
    module.exports = JSON.parse(stripBOM(content));
  } catch (err) {
    err.message = filename + ': ' + err.message;
    throw err;
  }
};

Module._extensions is also assigned to the extensions property of require(), so reading require.extensions shows which extension loaders exist in the system. Note that require.extensions is marked as deprecated in the Node documentation, so treat it as something to inspect rather than extend. The following code prints it:

console.log(require.extensions);

The execution results obtained are as follows:

{ '.js': [Function], '.json': [Function], '.node': [Function] }

After determining the file extension, Node calls the matching compilation method to execute the file and returns the result to the caller

【Compilation of JavaScript Modules】

Going back to the CommonJS module specification, every module file has three variables available: require, exports, and module. They are not defined in the module file, so where do they come from? Node's API documentation also mentions two more variables in every module, __filename and __dirname. Where do those come from? And if modules were defined directly, the way scripts are in the browser, their variables would pollute the global scope

In fact, during compilation Node wraps the content of the JavaScript file it has read. It adds (function (exports, require, module, __filename, __dirname) {\n at the head and \n}); at the end

A normal JavaScript file would be wrapped like this

(function (exports, require, module, __filename, __dirname) {
  var math = require('math');
  exports.area = function (radius) {
    return Math.PI * radius * radius;
  };
});

In this way, each module file gets its own isolated scope. The wrapped code is executed through the runInThisContext() method of the native vm module (similar to eval, but with an explicit context and without polluting the global scope), which returns a function object. Finally, the exports property of the current module object, the require() method, module (the module object itself), and the full file path and directory obtained during file location are passed to this function as arguments, and the function is executed

That is why these variables exist in every module file without being defined there. After execution, the module's exports property is returned to the caller. Any method or property on exports can be used from outside, but other variables in the module cannot be accessed directly
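The wrap-and-execute step can be imitated in a few lines. This is a simplified sketch and not Node's actual loader: miniRequire is a hypothetical helper, the module object is a plain stand-in, and the path /virtual/circle.js never touches the disk.

```javascript
const vm = require('vm');
const path = require('path');

// Simplified stand-in for Node's loader; miniRequire is a hypothetical helper.
function miniRequire(source, filename) {
  const module = { id: filename, filename, exports: {}, loaded: false };

  // 1. Wrap the source text so its top-level variables become function locals.
  const wrapped =
    '(function (exports, require, module, __filename, __dirname) {\n' +
    source +
    '\n});';

  // 2. Compile the wrapper; the result is an ordinary function object.
  const compiled = vm.runInThisContext(wrapped, { filename });

  // 3. Call it with the five values every module receives.
  compiled.call(module.exports, module.exports, require, module,
    filename, path.dirname(filename));

  module.loaded = true;
  return module.exports; // what the caller of require() gets back
}

const circle = miniRequire(
  'var secret = 42;\n' +
  'exports.area = function (r) { return Math.PI * r * r; };\n' +
  'exports.dir = __dirname;',
  '/virtual/circle.js'
);

console.log(circle.area(1) === Math.PI); // true
console.log(circle.dir);                 // '/virtual'
console.log(typeof secret);              // 'undefined': never leaked out of the wrapper
```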

That completes the picture of how require, exports, and module work. This is Node's implementation of the CommonJS module specification.

【Compilation of C/C++ Modules】

Node calls the process.dlopen() method to load and execute .node files. In Node's architecture, dlopen() is implemented differently on Windows and *nix platforms, and the libuv compatibility layer hides the difference

In fact, a .node module file does not need to be compiled at load time, because it is the compiled output of a C/C++ module. Only loading and execution remain. During execution, the module's exports object is associated with the .node module and then returned to the caller

The main advantage C/C++ modules bring to Node users is execution efficiency. The disadvantage is that C/C++ modules are harder to write than JavaScript modules

【Compilation of JSON Files】

Compiling .json files is the simplest of the three compilation methods. Node reads the content of the JSON file synchronously with the fs module, calls JSON.parse() to get an object, and assigns that object to the exports of the module object for external use

JSON files are useful as project configuration files. If you define the configuration as a JSON file, you don't have to call the fs module to read and parse it yourself; just call require() to import it. You also benefit from module caching, so importing it again has no performance cost
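A short sketch of using require() for configuration, with a hypothetical config.json written to a temp directory. It also shows one consequence of the cache that is worth knowing: later changes to the file on disk are not picked up by a repeated require().

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'json-demo-'));
const file = path.join(dir, 'config.json'); // illustrative file name
fs.writeFileSync(file, JSON.stringify({ port: 8080, debug: false }));

// require() reads and parses the file; no explicit fs or JSON.parse() call needed.
const config = require(file);
console.log(config.port);      // 8080

// A second require() returns the same cached object.
const again = require(file);
console.log(again === config); // true

// The cached object is not refreshed when the file changes on disk.
fs.writeFileSync(file, JSON.stringify({ port: 9090 }));
const afterEdit = require(file);
console.log(afterEdit.port);   // 8080
```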

CommonJS

With Node's module implementation covered, the CommonJS specification itself becomes much easier to understand

The CommonJS specification was proposed mainly to make up for JavaScript's lack of standards at the time, giving it the basic ability to support large-scale applications instead of being limited to small scripts

CommonJS defines modules very simply, in three parts: module reference, module definition, and module identification

【Module reference】

var math = require('math');

The CommonJS specification defines a require() method, which accepts a module identifier and brings that module's API into the current context

【Module Definition】

Inside a module, the context provides the require() method for importing external modules. Correspondingly, the context provides the exports object for exporting the current module's methods and variables, and it is the only way to export. The module also has a module object that represents the module itself, and exports is a property of module. In Node, a file is a module, and you define exported methods by attaching them to the exports object as properties:

// math.js
exports.add = function () {
  var sum = 0, i = 0,args = arguments, l = args.length;
  while (i < l) {
    sum += args[i++];
  }
  return sum;
};

In another file, after importing the module with require(), we can call the properties and methods it defines

// program.js
var math = require('math');
exports.increment = function (val) {
  return math.add(val, 1);
};

【Module Identification】

The module identifier is the parameter passed to the require() method. It must be a string in lower camel case, a relative path starting with . or .., or an absolute path. The .js file name suffix may be omitted

The module definition is very simple, and so is its interface. Its value lies in confining a cohesive group of methods and variables to a private scope while supporting import and export, so that upstream and downstream dependencies connect smoothly. Each module has its own space, modules do not interfere with each other, and references to them stay clean

