Handwritten webpack core principles

1. Core packaging principle

1.1 The main packaging process

  1. Read the contents of the entry file.
  2. Analyze the entry file, recursively read the contents of the files each module depends on, and generate the AST (abstract syntax tree).
  3. From the AST, generate code that the browser can run.

1.2 Specific details

  1. Get the main module content
  2. Analyze the module
    • Install the @babel/parser package (source to AST)
  3. Process the module content
    • Install the @babel/traverse package (traverse the AST to collect dependencies)
    • Install the @babel/core and @babel/preset-env packages (ES6 to ES5)
  4. Recurse through all modules
  5. Generate the final code

2. Basic preparations

Let's build a project first.

The project directory is, for now: index.html at the root, and a src folder containing index.js, add.js and minus.js.

The project is on GitHub: https://github.com/Sunny-lucking/howToBuildMyWebpack . A star would be appreciated.

We create add.js and minus.js, import both into index.js, and then load index.js from index.html.

The code is as follows:

add.js

export default (a,b)=>{
  return a+b;
}

minus.js

export const minus = (a,b)=>{
    return a-b
}

index.js

// The ".js" extension is needed: our bundler reads each path exactly as written
import add from "./add.js"
import {minus} from "./minus.js";

const sum = add(1,2);
const division = minus(2,1);

console.log(sum);
console.log(division);

index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
<script src="./src/index.js"></script>
</body>
</html>

Now we open index.html. What happens? The browser reports an error: index.js is loaded as a classic script, and import statements are only allowed inside modules.

But that's okay, because solving this problem is exactly what we are here for.

3. Get module content

Now let's put the packaging principle above into practice. The first step is to get the module content.

Let's create a bundle.js file.

// Read the main entry file
const fs = require('fs')
const getModuleInfo = (file)=>{
    const body = fs.readFileSync(file,'utf-8')
    console.log(body);
}
getModuleInfo("./src/index.js")

The project directory now also contains bundle.js, at the project root next to index.html and src.

Let's run bundle.js (node bundle.js) to check that the content of the entry file is read successfully.

Unsurprisingly, it works: the source of index.js is printed. The first step is done, so on to the second step: analyzing the module.

4. Analyze the module

The main task here is to parse the module content we read into an AST, which requires the @babel/parser package:

npm install @babel/parser

With the installation complete, we require @babel/parser in bundle.js:

// Read the main entry file
const fs = require('fs')
const parser = require('@babel/parser')
const getModuleInfo = (file)=>{
    const body = fs.readFileSync(file,'utf-8')
    const ast = parser.parse(body,{
        sourceType:'module' // we are parsing an ES module
    });
    console.log(ast);
}
getModuleInfo("./src/index.js")

Let's take a look at the @babel/parser documentation. It exposes a small set of APIs, and the one we use here is parse. Its job is to parse the provided code as an entire ECMAScript program.

Among the options it accepts, the only one we need for now is sourceType, which indicates the mode the code should be parsed in. We pass 'module' because index.js uses import statements.

Okay, now let's execute bundle.js to see whether the AST is generated.

It is. Note, however, that the printed object is not just the content of index.js: it is a File node that also carries other information about the file. The statements we wrote live in the body of its program property.

We can print ast.program.body instead to see them:

// Read the main entry file
const fs = require('fs')
const parser = require('@babel/parser')
const getModuleInfo = (file)=>{
    const body = fs.readFileSync(file,'utf-8')
    const ast = parser.parse(body,{
        sourceType:'module' // we are parsing an ES module
    });
    console.log(ast.program.body);
}
getModuleInfo("./src/index.js")

Run it again. What is printed now is an array of nodes, one per top-level statement of index.js (that is, the code we wrote in index.js).
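
To make the shape of that array concrete, here is a hand-written and heavily simplified stand-in for ast.program.body. It is an illustration only: real nodes produced by @babel/parser carry many more fields (positions, specifiers, declarations and so on), and only type and source.value are shown because they are what the next step relies on.

```javascript
// Hand-written, simplified stand-in for ast.program.body (illustrative only).
// Real nodes produced by @babel/parser contain many more fields.
const simplifiedBody = [
  { type: 'ImportDeclaration', source: { type: 'StringLiteral', value: './add.js' } },
  { type: 'ImportDeclaration', source: { type: 'StringLiteral', value: './minus.js' } },
  { type: 'VariableDeclaration', kind: 'const' },
  { type: 'VariableDeclaration', kind: 'const' },
  { type: 'ExpressionStatement' },
  { type: 'ExpressionStatement' },
];

// Each top-level statement of index.js becomes one node in the array.
const nodeTypes = simplifiedBody.map((node) => node.type);

// The import paths live in node.source.value.
const importSources = simplifiedBody
  .filter((node) => node.type === 'ImportDeclaration')
  .map((node) => node.source.value);

console.log(nodeTypes);
console.log(importSources);
```

The two ImportDeclaration nodes are the ones that matter for a bundler: everything else in the body is ordinary code that Babel will transform later.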

5. Collect dependencies

Now we need to traverse the AST and collect the dependencies the module uses. What does that mean? It means collecting the file paths referenced by import statements. We store the collected paths in deps.

As mentioned earlier, the @babel/traverse package is used to traverse the AST.

npm install @babel/traverse

Now we bring it into bundle.js.

const fs = require('fs')
const path = require('path')
const parser = require('@babel/parser')
const traverse = require('@babel/traverse').default
const getModuleInfo = (file)=>{
    const body = fs.readFileSync(file,'utf-8')
    const ast = parser.parse(body,{
        sourceType:'module' // we are parsing an ES module
    });

    // New code
    const deps = {}
    traverse(ast,{
        ImportDeclaration({node}){
            const dirname = path.dirname(file)
            const abspath = './' + path.join(dirname,node.source.value)
            deps[node.source.value] = abspath
        }
    })
    console.log(deps);

}
getModuleInfo("./src/index.js")

Let's take a look at the description of @babel/traverse in the official documentation.

It is short: the first parameter is the AST, and the second parameter is a configuration object (the visitor).

Let's take a look at the code we wrote:

traverse(ast,{
    ImportDeclaration({node}){
        const dirname = path.dirname(file)
        const abspath = './' + path.join(dirname,node.source.value)
        deps[node.source.value] = abspath
    }
})

In the configuration object we defined an ImportDeclaration method. What does this mean?
Recall the AST printed earlier: each import statement shows up as a node whose type is ImportDeclaration.

A method named ImportDeclaration is called for every node of that type.

Inside it we read the value of the node's source, which is node.source.value.

What is this value? It is the string that follows from in the import statement. Look at our index.js code:

import add from "./add.js"
import {minus} from "./minus.js";

const sum = add(1,2);
const division = minus(2,1);

console.log(sum);
console.log(division);

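As you can see, value is the './add.js' or './minus.js' after from.

We then join the directory of the current file with that value and save the result in deps, keyed by the original import string. This is what we call collecting dependencies. Note that, despite the variable name abspath, the result is a path relative to the project root (for example './src/add.js'), not a true absolute path.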
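
The path arithmetic can be tried on its own. The sketch below is a minimal illustration, assuming forward-slash paths: resolveDep is a hypothetical helper name, and path.posix is used instead of the article's plain path.join so that the result is the same on every platform (plain path.join returns backslashes on Windows). It also shows why the './' prefix is added by hand: path.join normalizes the leading './' away.

```javascript
const path = require('path');

// Hypothetical helper mirroring the path logic inside ImportDeclaration.
// path.posix is an assumption made here so the output uses forward slashes everywhere.
function resolveDep(fromFile, importSource) {
  const dirname = path.posix.dirname(fromFile);          // './src'
  const joined = path.posix.join(dirname, importSource); // 'src/add.js' (leading './' is normalized away)
  return './' + joined;                                  // './src/add.js'
}

const deps = {};
for (const source of ['./add.js', './minus.js']) {
  deps[source] = resolveDep('./src/index.js', source);
}

console.log(deps);
// { './add.js': './src/add.js', './minus.js': './src/minus.js' }
```

Keeping the original import string as the key matters later: the bundled code still calls require with that string, and deps is what translates it into the path under which the module is stored.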

OK, that's all for this step. Run it to see whether the collection works.

It succeeded again: deps maps each import string to the path of the file it refers to.

6. Convert ES6 to ES5

Now we need to transform the ES6 AST we obtained into ES5 code. As mentioned earlier, this step needs two packages:

npm install @babel/core @babel/preset-env

Now we require @babel/core and use it:

const fs = require('fs')
const path = require('path')
const parser = require('@babel/parser')
const traverse = require('@babel/traverse').default
const babel = require('@babel/core')
const getModuleInfo = (file)=>{
    const body = fs.readFileSync(file,'utf-8')
    const ast = parser.parse(body,{
        sourceType:'module' // we are parsing an ES module
    });
    const deps = {}
    traverse(ast,{
        ImportDeclaration({node}){
            const dirname = path.dirname(file)
            const abspath = "./" + path.join(dirname,node.source.value)
            deps[node.source.value] = abspath
        }
    })

    // New code
    const {code} = babel.transformFromAst(ast,null,{
        presets:["@babel/preset-env"]
    })
    console.log(code);

}
getModuleInfo("./src/index.js")

Let's take a look at the official documentation for transformFromAst in @babel/core.

As simple as always.

Put simply, it transforms the AST we pass in according to the options in the third parameter (here, the @babel/preset-env preset) and returns the generated code.

Okay, now let's execute it and see the result.

Success, as always. You can see that it turns our const into var, and the import statements into require calls.

This step ends here. You may wonder why the dependencies collected in the previous step are not used here. That's right, they are not: they are collected for the recursive step below.

7. Recursively get all dependencies

So far getModuleInfo obtains the content of one module, but it does not return anything. Let's change it to return what it collected:

const getModuleInfo = (file)=>{
    const body = fs.readFileSync(file,'utf-8')
    const ast = parser.parse(body,{
        sourceType:'module' // we are parsing an ES module
    });
    const deps = {}
    traverse(ast,{
        ImportDeclaration({node}){
            const dirname = path.dirname(file)
            const abspath = "./" + path.join(dirname,node.source.value)
            deps[node.source.value] = abspath
        }
    })
    const {code} = babel.transformFromAst(ast,null,{
        presets:["@babel/preset-env"]
    })
    // New code
    const moduleInfo = {file,deps,code}
    return moduleInfo
}

We return an object containing the module's path (file), its dependencies (deps), and its code transformed to ES5 (code).

This function only yields information about one module. How do we get information about the modules it depends on?

As the section title suggests: recursively.

Let's write a function that walks the dependencies:

const parseModules = (file) =>{
    const entry =  getModuleInfo(file)
    const temp = [entry]
    for (let i = 0;i<temp.length;i++){
        const deps = temp[i].deps
        if (deps){
            for (const key in deps){
                if (deps.hasOwnProperty(key)){
                    temp.push(getModuleInfo(deps[key]))
                }
            }
        }
    }
    console.log(temp)
}

How parseModules works:

  1. We pass in the path of the main module.
  2. The module information obtained for it is put into the temp array.
  3. The outer loop walks the temp array, which at first contains only the main module.
  4. Inside the loop we read the current module's dependencies, deps.
  5. We iterate over deps, call getModuleInfo for each dependency, and push the result onto temp. Because temp grows while we iterate, the dependencies of dependencies are processed too.
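
One limitation of the loop as written is worth spelling out: it calls getModuleInfo once per import, so a module imported from two places is parsed twice, and two modules that import each other would keep the loop running forever. The sketch below is one possible guard, not part of the article's code: it keeps the same queue-style loop and adds a Set of files already queued. fakeModules and getModuleInfoStub are hypothetical stand-ins so the example runs without Babel or real files.

```javascript
// Hypothetical in-memory stand-in for getModuleInfo so the sketch runs without
// Babel or real files. a.js and b.js import each other (a circular dependency).
const fakeModules = {
  './src/index.js': { './a.js': './src/a.js', './b.js': './src/b.js' },
  './src/a.js': { './b.js': './src/b.js' },
  './src/b.js': { './a.js': './src/a.js' },
};
const getModuleInfoStub = (file) => ({
  file,
  deps: fakeModules[file],
  code: `/* code of ${file} */`,
});

// Same queue-style loop as parseModules, plus a Set of files already queued.
const collectModules = (entryFile) => {
  const queue = [getModuleInfoStub(entryFile)];
  const seen = new Set([entryFile]);
  for (let i = 0; i < queue.length; i++) {
    for (const depPath of Object.values(queue[i].deps)) {
      if (!seen.has(depPath)) {
        seen.add(depPath);
        queue.push(getModuleInfoStub(depPath));
      }
    }
  }
  return queue;
};

const files = collectModules('./src/index.js').map((m) => m.file);
console.log(files);
```

Each file appears exactly once in the result, even though b.js is imported from two places and a.js and b.js form a cycle. For the three-file project in this article the guard changes nothing, which is why the simpler loop is enough here.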

The current bundle.js file:

const fs = require('fs')
const path = require('path')
const parser = require('@babel/parser')
const traverse = require('@babel/traverse').default
const babel = require('@babel/core')
const getModuleInfo = (file)=>{
    const body = fs.readFileSync(file,'utf-8')
    const ast = parser.parse(body,{
        sourceType:'module' // we are parsing an ES module
    });
    const deps = {}
    traverse(ast,{
        ImportDeclaration({node}){
            const dirname = path.dirname(file)
            const abspath = "./" + path.join(dirname,node.source.value)
            deps[node.source.value] = abspath
        }
    })
    const {code} = babel.transformFromAst(ast,null,{
        presets:["@babel/preset-env"]
    })
    const moduleInfo = {file,deps,code}
    return moduleInfo
}

// New code
const parseModules = (file) =>{
    const entry =  getModuleInfo(file)
    const temp = [entry]
    for (let i = 0;i<temp.length;i++){
        const deps = temp[i].deps
        if (deps){
            for (const key in deps){
                if (deps.hasOwnProperty(key)){
                    temp.push(getModuleInfo(deps[key]))
                }
            }
        }
    }
    console.log(temp)
}
parseModules("./src/index.js")

For our project, temp should end up holding three modules: index.js, add.js and minus.js. Run it and see.

Indeed it does.

However, the array format is awkward for the next steps. We would rather store the modules in an object with the file path as key and {code, deps} as value, so we create a new object, depsGraph.

const parseModules = (file) =>{
    const entry =  getModuleInfo(file)
    const temp = [entry] 
    const depsGraph = {} // New code
    for (let i = 0;i<temp.length;i++){
        const deps = temp[i].deps
        if (deps){
            for (const key in deps){
                if (deps.hasOwnProperty(key)){
                    temp.push(getModuleInfo(deps[key]))
                }
            }
        }
    }
    // New code
    temp.forEach(moduleInfo=>{
        depsGraph[moduleInfo.file] = {
            deps:moduleInfo.deps,
            code:moduleInfo.code
        }
    })
    console.log(depsGraph)
    return depsGraph
}

OK, the modules are now stored in an object keyed by file path.
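
To make the target format concrete, the sketch below rebuilds it from abbreviated module records. The code strings are placeholders chosen for illustration, not real Babel output; the file paths and deps follow the project described above.

```javascript
// Abbreviated module records, as parseModules would collect them in temp.
// The code strings are placeholders, not real Babel output.
const temp = [
  {
    file: './src/index.js',
    deps: { './add.js': './src/add.js', './minus.js': './src/minus.js' },
    code: '/* index */',
  },
  { file: './src/add.js', deps: {}, code: '/* add */' },
  { file: './src/minus.js', deps: {}, code: '/* minus */' },
];

// Key each record by its file path, keeping only deps and code.
const depsGraph = {};
temp.forEach(({ file, deps, code }) => {
  depsGraph[file] = { deps, code };
});

console.log(Object.keys(depsGraph));
console.log(depsGraph['./src/index.js'].deps['./add.js']);
```

With this shape, finding a module's code is a single property lookup by path, which is exactly what the runtime in the next section needs.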

8. Handle the two keywords: require and exports

Our goal now is to generate a bundle.js output file, that is, the packaged file. The idea is simple: combine the code of index.js with the code of the modules it depends on, then write the result to a new js file.

Let's format the code strings stored in depsGraph:

// index.js
"use strict"
var _add = _interopRequireDefault(require("./add.js"));
var _minus = require("./minus.js");
function _interopRequireDefault(obj) { return obj && obj.__esModule ? obj : { "default": obj }; }
var sum = (0, _add["default"])(1, 2);
var division = (0, _minus.minus)(2, 1);
console.log(sum); console.log(division);
// add.js
"use strict";
Object.defineProperty(exports, "__esModule", {  value: true});
exports["default"] = void 0;
var _default = function _default(a, b) {  return a + b;};
exports["default"] = _default;

But we can't run the index.js code in the browser yet, because the browser does not know require and exports.

Why not? Simply because no require function and no exports object are defined there. So we can define them ourselves.
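
The mechanism that makes this possible is that a direct eval runs its string in the scope of the enclosing function. A minimal sketch, with runModule and fakeRequire as hypothetical names introduced only for this illustration:

```javascript
// Direct eval runs the string in the scope of the enclosing function,
// so the evaluated code can see that function's parameters.
function runModule(code, require, exports) {
  eval(code);
  return exports;
}

// Hypothetical stand-in for a real module loader: it only knows one module.
const fakeRequire = (name) => (name === './two.js' ? { value: 2 } : {});

// The code string refers to `require` and `exports`, which no browser defines.
const code = 'var two = require("./two.js"); exports.doubled = two.value * 2;';

const result = runModule(code, fakeRequire, {});
console.log(result.doubled);
```

Whatever we pass in under the names require and exports is what the module code sees, so the bundler is free to supply its own implementations.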

We create a function, bundle:

const bundle = (file) =>{
    const depsGraph = JSON.stringify(parseModules(file))

}

This stores the depsGraph from the previous step as a JSON string.

Next, bundle has to return one combined string of code.

How? Change the bundle function:

const bundle = (file) =>{
    const depsGraph = JSON.stringify(parseModules(file))
    return `(function (graph) {
                function require(file) {
                    (function (code) {
                        eval(code)
                    })(graph[file].code)
                }
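                require('${file}')
            })(${depsGraph})`

}

Schematically, with file standing for the entry path and depsGraph for the serialized graph, the returned code looks like this: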

 (function (graph) {
        function require(file) {
            (function (code) {
                eval(code)
            })(graph[file].code)
        }
        require(file)
    })(depsGraph)

What it does:

  1. Pass the saved depsGraph into an immediately invoked function.
  2. Pass the main file path to the require function.
  3. When require runs, it immediately invokes another function, passing in the module's code.
  4. That function executes eval(code), which runs the module's code.

Let's look at the value of code:

// index.js
"use strict"
var _add = _interopRequireDefault(require("./add.js"));
var _minus = require("./minus.js");
function _interopRequireDefault(obj) { return obj && obj.__esModule ? obj : { "default": obj }; }
var sum = (0, _add["default"])(1, 2);
var division = (0, _minus.minus)(2, 1);
console.log(sum); console.log(division);

When this code runs, require is called again, this time with the path of add.js. But that path is relative to index.js, while the keys of the graph are paths relative to the project root (what the code calls absolute paths), so it has to be converted. We write a function absRequire to do the conversion. How? Let's look at the code:

(function (graph) {
    function require(file) {
        function absRequire(relPath) {
            return require(graph[file].deps[relPath])
        }
        (function (require,code) {
            eval(code)
        })(absRequire,graph[file].code)
    }
    require(file)
})(depsGraph)

In effect, this adds a layer of interception:

  1. Execute require('./src/index.js').
  2. This executes
    (function (require,code) {
    eval(code)
    })(absRequire,graph[file].code)
  3. Executing eval runs the code of index.js.
  4. That code calls require.
  5. Inside the wrapper, require is the parameter we passed in, that is, absRequire.
  6. absRequire executes return require(graph[file].deps[relPath]), which calls the outer require. Looking the relative path up in deps converts it, so the outer require receives the full path.
  7. After require("./src/add.js") is called, eval runs again, this time on the code of add.js.

Is it a bit convoluted? It is simply recursion.
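
The call order is easier to follow with a trace. The sketch below uses a tiny hand-written graph whose code strings only record when they run; it is an illustration under those assumptions, not real Babel output. The outer function is named load instead of require (a choice made here) so that it does not clash with Node's own require when run with node.

```javascript
// Hand-written graph for illustration; real code strings would come from Babel.
const graph = {
  './src/index.js': {
    deps: { './add.js': './src/add.js' },
    code: 'trace.push("run index"); require("./add.js"); trace.push("back in index");',
  },
  './src/add.js': { deps: {}, code: 'trace.push("run add");' },
};

const trace = []; // records the order in which things happen

// Plays the role of the outer require in the article.
function load(file) {
  // Translate the relative path written in the source into a graph key.
  function absRequire(relPath) {
    trace.push(`absRequire(${relPath}) -> ${graph[file].deps[relPath]}`);
    return load(graph[file].deps[relPath]);
  }
  // Inside this wrapper, the name `require` refers to absRequire.
  (function (require, code) {
    eval(code);
  })(absRequire, graph[file].code);
}

load('./src/index.js');
console.log(trace.join('\n'));
```

The trace shows index.js starting, absRequire translating './add.js' into './src/add.js', add.js running, and control returning to index.js: the same nesting as steps 1 to 7 above.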

This combines the code, but one problem remains: when the code of add.js runs, exports is undefined. Here is that code:

// add.js
"use strict";
Object.defineProperty(exports, "__esModule", {  value: true});
exports["default"] = void 0;
var _default = function _default(a, b) {  return a + b;};
exports["default"] = _default;

It uses exports as an object, but no such object has been defined yet, so we define one ourselves.

(function (graph) {
    function require(file) {
        function absRequire(relPath) {
            return require(graph[file].deps[relPath])
        }
        var exports = {}; // the semicolon is required: the next line starts with "("
        (function (require,exports,code) {
            eval(code)
        })(absRequire,exports,graph[file].code)
        return exports
    }
    require(file)
})(depsGraph)

We added an empty exports object. When the add.js code runs, it adds properties to this object.

// add.js
"use strict";
Object.defineProperty(exports, "__esModule", {  value: true});
exports["default"] = void 0;
var _default = function _default(a, b) {  return a + b;};
exports["default"] = _default;

For example, after this code runs, exports looks like this:

exports = {
  __esModule: true,
  default: function _default(a, b) {  return a + b;}
}

Then we return the exports object.

var _add = _interopRequireDefault(require("./add.js"));

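The return value is received by _interopRequireDefault. Because exports.__esModule is true, _interopRequireDefault returns the exports object unchanged, so _add is that object and _add["default"] is function _default(a, b) { return a + b;}. That is why the compiled code calls (0, _add["default"])(1, 2).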
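
The behavior of _interopRequireDefault can be checked in isolation. The helper below is copied from the Babel output shown earlier; esExports and plainFn are hypothetical sample values made up for this check.

```javascript
// Helper copied from the Babel output shown above.
function _interopRequireDefault(obj) {
  return obj && obj.__esModule ? obj : { default: obj };
}

// Case 1: exports of a transpiled ES module (flagged with __esModule).
const esExports = { default: (a, b) => a + b };
Object.defineProperty(esExports, '__esModule', { value: true });
const _add = _interopRequireDefault(esExports);

// Case 2: a plain CommonJS export without the flag gets wrapped.
const plainFn = (a, b) => a - b;
const wrapped = _interopRequireDefault(plainFn);

console.log(_add === esExports);
console.log(_add.default(1, 2));
console.log(wrapped.default === plainFn);
```

In both cases the caller can read .default afterwards, which is the whole point of the helper: it lets compiled code treat flagged ES modules and plain CommonJS exports the same way.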

This also shows why an imported module behaves like a reference to an object: what require hands back is the exports object itself.
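
Putting the pieces together, the runtime can be tried on its own with a hand-written graph. This is a minimal sketch under stated assumptions: the code strings are simplified imitations of the Babel output (no "use strict", no _interopRequireDefault), and the outer function is named load rather than require so it does not clash with Node's own require.

```javascript
// Hand-written graph in the style of the Babel output (simplified, illustrative).
const graph = {
  './src/index.js': {
    deps: { './add.js': './src/add.js' },
    code: 'var _add = require("./add.js"); exports.sum = _add["default"](1, 2);',
  },
  './src/add.js': {
    deps: {},
    code: 'exports["default"] = function (a, b) { return a + b; };',
  },
};

// Plays the role of the outer require in the article.
function load(file) {
  function absRequire(relPath) {
    return load(graph[file].deps[relPath]);
  }
  var exports = {}; // the semicolon matters: the next line starts with "("
  (function (require, exports, code) {
    eval(code);
  })(absRequire, exports, graph[file].code);
  return exports;
}

const entryExports = load('./src/index.js');
console.log(entryExports.sum);
```

Note that every call to load evaluates the module again. The article's runtime behaves the same way; a cache of already loaded modules is left out here to keep the sketch close to it.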

At this point, the handling of the two keywords is complete. The full bundle.js:

const fs = require('fs')
const path = require('path')
const parser = require('@babel/parser')
const traverse = require('@babel/traverse').default
const babel = require('@babel/core')
const getModuleInfo = (file)=>{
    const body = fs.readFileSync(file,'utf-8')
    const ast = parser.parse(body,{
        sourceType:'module' // we are parsing an ES module
    });
    const deps = {}
    traverse(ast,{
        ImportDeclaration({node}){
            const dirname = path.dirname(file)
            const abspath = "./" + path.join(dirname,node.source.value)
            deps[node.source.value] = abspath
        }
    })
    const {code} = babel.transformFromAst(ast,null,{
        presets:["@babel/preset-env"]
    })
    const moduleInfo = {file,deps,code}
    return moduleInfo
}
const parseModules = (file) =>{
    const entry =  getModuleInfo(file)
    const temp = [entry]
    const depsGraph = {}
    for (let i = 0;i<temp.length;i++){
        const deps = temp[i].deps
        if (deps){
            for (const key in deps){
                if (deps.hasOwnProperty(key)){
                    temp.push(getModuleInfo(deps[key]))
                }
            }
        }
    }
    temp.forEach(moduleInfo=>{
        depsGraph[moduleInfo.file] = {
            deps:moduleInfo.deps,
            code:moduleInfo.code
        }
    })
    return depsGraph
}
// New code
const bundle = (file) =>{
    const depsGraph = JSON.stringify(parseModules(file))
    return `(function (graph) {
        function require(file) {
            function absRequire(relPath) {
                return require(graph[file].deps[relPath])
            }
            var exports = {}; // the semicolon is required: the next line starts with "("
            (function (require,exports,code) {
                eval(code)
            })(absRequire,exports,graph[file].code)
            return exports
        }
        require('${file}')
    })(${depsGraph})`

}
const content = bundle('./src/index.js')

console.log(content);

Let's run it and see the effect.

It works. Next, write the returned code into a newly created file:

// Write the bundle into the dist directory
fs.mkdirSync('./dist',{recursive:true}); // recursive: do not throw if dist already exists
fs.writeFileSync('./dist/bundle.js',content)
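
A detail worth knowing about this step: fs.mkdirSync without options throws an EEXIST error when the directory is already there, which happens on the second run of the build. Passing { recursive: true } avoids that. The sketch below demonstrates it in a throwaway temporary directory; projectRoot and the placeholder content are assumptions made for the demo, not part of the article's project.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// A throwaway directory stands in for the project root (assumption for the demo).
const projectRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'bundle-demo-'));
const distDir = path.join(projectRoot, 'dist');
const content = 'console.log("bundled");'; // placeholder for the real bundle string

// recursive: true makes mkdirSync a no-op when the directory already exists,
// so calling it twice (as a second build would) does not throw.
fs.mkdirSync(distDir, { recursive: true });
fs.mkdirSync(distDir, { recursive: true });
fs.writeFileSync(path.join(distDir, 'bundle.js'), content);

const written = fs.readFileSync(path.join(distDir, 'bundle.js'), 'utf-8');
console.log(written);
```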

This completes our handwritten version of webpack's core principles.

Let's take a look at the generated bundle.js file.

All the dependencies we collected earlier are passed as an argument to the immediately invoked function, and the code of each dependency is then executed recursively through eval.

Now we load dist/bundle.js from index.html to see whether it runs.

It does: the console should print 3 and 1.

Thank you for reading this far. If it helped, a star on the repository would be appreciated.

Origin blog.51cto.com/14180083/2542895