Exploring front-end technology: how Node.js implements the CommonJS specification | JD Logistics Technology Team

Learn about Node.js

Node.js is a JavaScript runtime built on Chrome's V8 engine. Its event-driven, non-blocking I/O model lets JavaScript run on the server, putting it on par with server-side scripting languages such as PHP, Python, Perl, and Ruby. Node ships with many built-in modules that provide a variety of functions, and many more are available as third-party modules.

The module problem

Why have modules?

Complex front-end projects need to be layered and split into modules by function, business domain, and component. A modular project has at least the following advantages:

  1. Easier unit testing
  2. Easier collaboration among colleagues
  3. Shared methods can be extracted, which speeds up development
  4. On-demand loading, which improves performance
  5. High cohesion and low coupling
  6. No variable name conflicts
  7. Easier project maintenance

Common module specifications

  • CMD (implemented by SeaJS)
  • AMD (implemented by RequireJS)
  • UMD (works with both AMD and CommonJS loaders, falling back to a global variable)
  • IIFE (immediately invoked function expression)
  • CommonJS (used by Node)
  • ES Modules (the official JavaScript module standard)

Modules in Node

Node adopts the CommonJS specification

Implementation principle:

Node reads the file, takes its content, and wraps it to implement modularization. Modules are loaded synchronously with the require function.

Tip: every JS file in Node is a module.

Module types in Node

  1. Built-in (core) modules are provided by Node itself. They need no installation and are referenced by name, without a path.
  2. File modules are JS files written by the developer and are referenced by path.
  3. Third-party modules must be installed first. After installation they are referenced by name, without a path.
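
As a rough illustration of the three kinds, the sketch below checks whether a name is a built-in module using builtinModules from Node's module package, and loads a file module by path. The file name greet.js and the temporary directory are assumptions made for the demo, and some-package is a hypothetical placeholder, not a real dependency.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { builtinModules } = require('module');

// 1. Built-in module: loaded by bare name, nothing to install
const isBuiltin = builtinModules.includes('fs');
console.log(isBuiltin); // true

// 2. File module: written by us, loaded by path
//    (greet.js is a hypothetical file created only for this demo)
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'modtypes-'));
const file = path.join(dir, 'greet.js');
fs.writeFileSync(file, "module.exports = (name) => 'hello ' + name;");
const greet = require(file);
console.log(greet('node')); // hello node

// 3. Third-party module: installed with a package manager, then loaded
//    by bare name just like a built-in, e.g. require('some-package')
```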

Built-in modules in Node

fs: file system

This module is needed for working with files.

const path = require('path'); // path handling
const fs = require('fs'); // file system
// Read a file synchronously
let content = fs.readFileSync(path.resolve(__dirname, 'test.js'), 'utf8');
console.log(content);

// Check whether a file exists
let exists = fs.existsSync(path.resolve(__dirname, 'test1.js'));
console.log(exists);

path: path handling

const path = require('path'); // path handling

// join and resolve can be used together

console.log(path.join('a', 'b', 'c', '..', '/'))

// resolve builds an absolute path from the given segments; handy for locating config files
console.log(path.resolve('a', 'b', '/')); // resolve treats '/' as the root, so the result is the root path

console.log(path.join(__dirname, 'a'))
console.log(path.extname('1.js'))
console.log(path.dirname(__dirname)); // parent directory

vm: running code

How can a string be executed as JavaScript?

1. eval

Code passed to eval runs in the current scope, so it can access the local variables of the enclosing function.

global.test = 'global scope'
global.test1 = '123'
function b(){
  let test = 'fn scope'
  eval('console.log(test)'); // fn scope: eval sees the local variable
  new Function('console.log(test1)')() // 123
  new Function('console.log(test)')() // global scope: new Function only sees globals
}
b()

2. new Function

A function created with new Function() does not capture the current lexical environment; it is created in the global environment. Any variable its body uses must either be passed in as a parameter or be a global.

Because new Function can still read and write global variables, it can still pollute them.

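function getFn() {
  let value = "test"
  // new Function does not close over getFn's local scope
  let fn = new Function('console.log(value)')
  return fn
}

try {
  getFn()()
} catch (e) {
  console.log(e.message) // value is not defined
}

global.a = 100 // attached to the global object
new Function("console.log(a)")() // 100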
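
Because a function built with new Function cannot see the enclosing local scope, values have to be handed to it explicitly as parameters. A minimal sketch, with the names add and makeGreeter chosen only for illustration:

```javascript
// Parameter names come first; the function body is the last argument
const add = new Function('a', 'b', 'return a + b');
console.log(add(1, 2)); // 3

function makeGreeter() {
  const prefix = 'hi '; // local, invisible to new Function
  // Pass the local value in as an argument instead of relying on scope
  const fn = new Function('prefix', 'name', 'return prefix + name');
  return (name) => fn(prefix, name);
}

const greeter = makeGreeter();
console.log(greeter('node')); // hi node
```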

3. vm

Both previous approaches share one recurring problem: variable pollution.

The vm module runs code in an execution context that is isolated from the surrounding local scope, which is why it is often described as a sandbox.

In Node, global variables are shared across all modules, so avoid defining properties on global.

vm.runInThisContext runs code in the current global context: it can access properties on global, but not the local variables of the calling module. vm.runInNewContext runs code in a brand-new execution context, where neither the globals nor the local variables are visible.

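const vm = require('vm')
global.a = 1
let local = 'module scope' // module-level variable, not a global
vm.runInThisContext("console.log(a)") // 1: globals are visible
console.log(vm.runInThisContext("typeof local")) // undefined: module variables are not
try {
  // A new context has its own global object, so a does not exist there
  vm.runInNewContext('a')
} catch (e) {
  console.log(e.message) // a is not defined
}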
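
vm.runInNewContext also accepts an object that becomes the global object of the new context, which is a common way to give the executed code exactly the values it needs. A small sketch follows; the variable names are illustrative, and note that the Node.js documentation does not treat vm as a security mechanism.

```javascript
const vm = require('vm');

global.secret = 'outer value';

// Only the properties of this object are visible inside the new context
const sandbox = { x: 1 };
const result = vm.runInNewContext('x + 1', sandbox);
console.log(result); // 2

// The outer global is not reachable from the new context
const seesSecret = vm.runInNewContext('typeof secret', sandbox);
console.log(seesSecret); // undefined

// Assignments inside the context land on the sandbox object, not on global
vm.runInNewContext('y = x * 10', sandbox);
console.log(sandbox.y); // 10
```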

How Node implements modules

Node has its own module mechanism. Each file is a separate module and follows the CommonJS specification: modules are imported with require and exported through module.exports.

The way a Node module runs is also simple. Each module is wrapped in a function, and that function wrapper isolates the scope of one module from another.

Printing arguments at the top level of a JS file shows the parameters of that wrapper function. Keep these five parameters in mind.

console.log(arguments) // exports, require, module, __filename, __dirname

In Node, a module exports through module.exports and is imported with require. Internally, require relies on the fs module to load the module file, and what the file read returns is a string.

In JavaScript, eval or new Function can turn a string into running code. But as mentioned above, both have a fatal problem: variable pollution.

Implement the require module loader

First import the path, fs, and vm modules and create a Require function. It takes a modulePath parameter, the path of the file to import.

const path = require('path');
const fs = require('fs');
const vm = require('vm');
// Define the import function; its parameter is the module path
function Require(modulePath) {
   ...
}

Inside Require, resolve the module's absolute path, create a Module instance to represent the module, and call tryModuleLoad to load its content. Module and tryModuleLoad are implemented below. Require returns the content of the module, that is, module.exports.

// Define the import function; its parameter is the module path
function Require(modulePath) {
    // Get the absolute path of the module to load
    let absPathname = path.resolve(__dirname, modulePath);
    // Create the module as a new Module instance
    const module = new Module(absPathname);
    // Load the module
    tryModuleLoad(module);
    // Return the exports object
    return module.exports;
}

Module creates an exports object for the module. When tryModuleLoad runs, the module's content is placed on exports. The id is the module's absolute path.

// Define the module with a file id and an exports property
function Module(id) {
    this.id = id;
    // The content read from the file ends up on exports
    this.exports = {};
}

A Node module runs inside a function. Here a static wrapper property is attached to Module to hold the source text of that function. wrapper is an array: the first element is the opening of the function, including its parameter list of exports, module, Require, __dirname, and __filename, which are the variables commonly used inside a module.

The second element is the end of the function. Both are strings; to use them, wrap them around the module's source string.

// Define the function that wraps the module content
// The leading newline keeps a trailing line comment in the module from swallowing the closing brace
Module.wrapper = [
    "(function(exports, module, Require, __dirname, __filename) {",
    "\n})"
]
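
To see why the wrapper gives each module a private scope, the sketch below wraps a source string in a function with the same parameter list and evaluates it with vm.runInThisContext. The source string, the fake module object, and the /demo paths are made up for the demo.

```javascript
const vm = require('vm');

const wrapper = [
  '(function(exports, module, Require, __dirname, __filename) {',
  '\n})'
];

// Pretend this string was read from a module file
const source = 'var counter = 41; module.exports = counter + 1;';

// Evaluating the wrapped string yields a function; the module body has not run yet
const fn = vm.runInThisContext(wrapper[0] + source + wrapper[1]);

const fakeModule = { exports: {} };
fn.call(fakeModule.exports, fakeModule.exports, fakeModule, null, '/demo', '/demo/a.js');

console.log(fakeModule.exports); // 42
console.log(typeof counter); // undefined: the var stayed inside the wrapper
```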

_extensions selects a loading method based on the module's file extension, since JSON and JavaScript must be loaded differently. JSON files are parsed with JSON.parse.

JavaScript files are run with vm.runInThisContext. fs.readFileSync is given module.id, the absolute path stored when the Module was created, and returns the content as a string. Wrapping that string with Module.wrapper puts a function around the module, which gives it a private scope.

The resulting function fn is invoked with call. The first argument sets this to module.exports; the remaining arguments fill the wrapper's parameters exports, module, Require, __dirname, and __filename.

// Define the extensions; each one loads differently. Implements js and json
Module._extensions = {
    '.js'(module) {
        const content = fs.readFileSync(module.id, 'utf8');
        const fnStr = Module.wrapper[0] + content + Module.wrapper[1];
        const fn = vm.runInThisContext(fnStr);
        // Pass the loaded module's own directory and file name, in the order the wrapper declares them
        fn.call(module.exports, module.exports, module, Require, path.dirname(module.id), module.id);
    },
    '.json'(module) {
        const json = fs.readFileSync(module.id, 'utf8');
        module.exports = JSON.parse(json); // Put the parsed result on the exports property
    }
}

tryModuleLoad receives the module object, gets the module's extension with path.extname, and then uses the matching Module._extensions handler to load it.

// Define the module loading method
function tryModuleLoad(module) {
    // Get the extension
    const extension = path.extname(module.id);
    // Load the module with the handler for its extension (strategy pattern)
    Module._extensions[extension](module);
}

At this point the Require loading mechanism is essentially complete. When Require loads a module, it takes the module path and calls path.resolve(__dirname, modulePath) to get the file's absolute path. It then creates the module object with new Module, which stores the absolute path in the id property and initializes exports as an empty object.

tryModuleLoad then loads the module: it gets the file extension with path.extname and runs the loading method registered for that extension.

The loaded content ends up on module.exports. Once tryModuleLoad has run, module.exports is populated, so Require simply returns it.

Next, add caching. On each load, check the cache first: if the module is already there, return its exports directly. Otherwise create the module, store it in the cache, and then load it.

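// Cache of loaded modules, keyed by absolute path
Module._cache = {};

// Define the import function; its parameter is the module path
function Require(modulePath) {
  // Get the absolute path of the module to load
  let absPathname = path.resolve(__dirname, modulePath);
  // Read from the cache; if the module is there, return its exports directly
  if (Module._cache[absPathname]) {
      return Module._cache[absPathname].exports;
  }
  // Create the module as a new Module instance
  const module = new Module(absPathname);
  // Add it to the cache
  Module._cache[absPathname] = module;
  // Load the module
  tryModuleLoad(module);
  // Return the exports object
  return module.exports;
}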
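
Node's built-in require behaves the same way: the module body runs once, and later calls return the cached exports. A quick check, using a throwaway file (counter.js is an assumed name) and a global counter purely to observe how often the body executes:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-'));
const file = path.join(dir, 'counter.js');

// Each execution of the module body bumps a global counter
fs.writeFileSync(file,
  'global.loadCount = (global.loadCount || 0) + 1; module.exports = { id: 1 };');

const first = require(file);
const second = require(file);

console.log(first === second); // true: same exports object from the cache
console.log(global.loadCount); // 1: the body ran only once

// The cache is keyed by the resolved file name
const cached = require.resolve(file) in require.cache;
console.log(cached); // true
```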

New feature: omitting the module extension.

If the given path does not point to an existing file, try each registered extension in turn and check whether the resulting file exists.

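// Cache of loaded modules, created once
Module._cache = Module._cache || {};

// Define the import function; its parameter is the module path
function Require(modulePath) {
  // Get the absolute path of the module to load
  let absPathname = path.resolve(__dirname, modulePath);
  // Get all registered extensions
  const extNames = Object.keys(Module._extensions);

  // Try the path as given, then with each extension appended
  function findExt(basePath) {
      const candidates = [basePath, ...extNames.map((ext) => basePath + ext)];
      for (const candidate of candidates) {
          try {
              fs.accessSync(candidate);
              return candidate;
          } catch (e) {
              // Not accessible, try the next candidate
          }
      }
      throw new Error('File does not exist: ' + basePath);
  }

  // Find the file, appending an extension if needed
  absPathname = findExt(absPathname);
  // Read from the cache; if the module is there, return its exports directly
  if (Module._cache[absPathname]) {
      return Module._cache[absPathname].exports;
  }
  // Create the module as a new Module instance
  const module = new Module(absPathname);
  // Add it to the cache
  Module._cache[absPathname] = module;
  // Load the module
  tryModuleLoad(module);
  // Return the exports object
  return module.exports;
}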
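
The built-in loader exposes a comparable lookup through require.resolve, which returns the full file name without loading the module. A short check, with the file name settings.json chosen only for the demo:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'ext-'));
fs.writeFileSync(path.join(dir, 'settings.json'), '{"debug": true}');

// No extension given: the loader tries the known extensions in turn
const resolved = require.resolve(path.join(dir, 'settings'));
console.log(path.extname(resolved)); // .json

const settings = require(path.join(dir, 'settings'));
console.log(settings.debug); // true
```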

Source code debugging

We can step through the Node.js source with VS Code.

Steps

Create file a.js

module.exports = 'abc'

Create file test.js

let r = require('./a')

console.log(r)

Configuring debugging essentially means editing the .vscode/launch.json file, which defines one or more launch entries to choose from.

Some common parameters are as follows:

  • program: the path of the startup file (the entry file)
  • name: the name shown in the drop-down menu for this launch entry
  • request: either launch or attach (attach to a process that is already running)
  • skipFiles: code to skip during single-step debugging
  • runtimeExecutable: the runtime executable. The default is node; it can be set to nodemon, ts-node, npm, etc.

Edit launch.json. skipFiles lists the code that single-step debugging skips, so Node's internal files must not be listed there if you want to step into the require source.
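
A launch.json along these lines should be enough for this walkthrough. It is a sketch: the entry name and the location of test.js in the workspace root are assumptions, and skipFiles is left empty so that the debugger does not skip Node's internal files.

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "node",
      "request": "launch",
      "name": "Debug test.js",
      "skipFiles": [],
      "program": "${workspaceFolder}/test.js"
    }
  ]
}
```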

  1. Set a breakpoint on the line in test.js that calls require.
  2. Start debugging and step into the entry method in the Node source.

Walking through the code

1. First, step into the require method: Module.prototype.require.

2. Step into Module._load, which returns module.exports. Inside it, Module._resolveFilename returns the resolved file name: it converts the path to an absolute path and appends the extension if the path has none.

3. The Module class is defined here. id is the file name, and the exports property is defined in this class.

4. Continue to module.load, which uses the strategy pattern: Module._extensions[extension](this, filename) calls a different method depending on the file extension.

5. Step into that method to see the core code: it reads the file at the given path, gets its content as a string, and calls module._compile.

6. module._compile calls wrapSafe, which wraps the string in a function and uses the runInThisContext method of Node's vm module to evaluate it. Calling the resulting function then executes the code in the loaded file.

At this point we have stepped through the whole code path behind require in Node. Debugging the source helps us learn its design ideas, code style, and conventions, which in turn helps when building our own tool libraries and improves how we think about code. Knowing the underlying principles also helps us solve problems we run into in daily development.

Author: JD Logistics Qiao Panpan

Source: JD Cloud Developer Community, Ziyuanqishuo Tech. Please indicate the source when reprinting.

 


Origin my.oschina.net/u/4090830/blog/10150940