Implementing the core principles of webpack by hand, so you no longer need to fear interview questions about how webpack works

1. The core bundling principle

1.1 The main bundling process

  1. Read the contents of the entry file.

  2. Analyze the entry file, recursively read the contents of the files it depends on, and generate an AST (abstract syntax tree) for each.

  3. From the AST, generate code that the browser can run.

1.2 Specific details

  1. Get the content of the main module

  2. Analyze the module

  • Install the @babel/parser package (parses source code into an AST)

  • Process the module content

    • Install the @babel/traverse package (traverses the AST to collect dependencies)

    • Install the @babel/core and @babel/preset-env packages (transform ES6 into ES5)

  • Recursively process all modules

  • Generate the final code

    2. Basic preparations

    Let's set up a project.

    For now, the project contains index.html plus a src directory with index.js, add.js and minus.js.

    The project is on GitHub: https://github.com/Sunny-lucking/howToBuildMyWebpack. A star would be appreciated.

    We created the add.js and minus.js files, imported them into index.js, and then loaded index.js from index.html.

    The code is as follows:

    add.js

    export default (a,b)=>{
      return a+b;
    }
    

    minus.js

    export const minus = (a,b)=>{
        return a-b
    }
    

    index.js

    import add from "./add.js"
    import {minus} from "./minus.js";
    
    const sum = add(1,2);
    const division = minus(2,1);
    
    console.log(sum);
    console.log(division);
    

    index.html

    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>Title</title>
    </head>
    <body>
    <script src="./src/index.js"></script>
    </body>
    </html>
    

    Now open index.html. Guess what happens? An error is reported, of course, because a classic script (one loaded without type="module") cannot contain import statements.

    But that doesn't matter, because solving exactly this problem is what we are here for.

    3. Get the module content

    Now let's put the core bundling principle above into practice. The first step is to get the module content.

    Let's create a bundle.js file.

    // Read the main entry file
    const fs = require('fs')
    const getModuleInfo = (file)=>{
        const body = fs.readFileSync(file,'utf-8')
        console.log(body);
    }
    getModuleInfo("./src/index.js")
    

    The current project directory is as follows

    Let's run bundle.js to check whether the content of the entry file is read successfully.

    As expected, it works. Everything is under control. The first step is done, so let's see what the second step is.

    It's analyzing the module.

    4. Analyze the module

    The main task of this step is to parse the module content into an AST, which requires the @babel/parser package:

    npm install @babel/parser
    

    Once it is installed, we import @babel/parser into bundle.js:

    // Read the main entry file
    const fs = require('fs')
    const parser = require('@babel/parser')
    const getModuleInfo = (file)=>{
        const body = fs.readFileSync(file,'utf-8')
        const ast = parser.parse(body,{
            sourceType:'module' // we are parsing an ES module
        });
        console.log(ast);
    }
    getModuleInfo("./src/index.js")
    

    Let's take a look at the documentation of @babel/parser:

    We can see that three APIs are provided; the one we are using is parse.

    Its main job is to parse the provided code as an entire ECMAScript program.

    Now look at the options the API accepts.

    For now we only use sourceType, which tells the parser in which mode to parse the code; 'module' means the code is parsed as an ES module, so import and export statements are allowed.

    OK, now let's run bundle.js to see whether the AST is generated.

    Success, as hoped.

    Note, however, that the parsed result contains more than the content of index.js; it also carries other information about the file. The content itself lives in the body of the program property.

    We can print ast.program.body instead to take a look:

    // Read the main entry file
    const fs = require('fs')
    const parser = require('@babel/parser')
    const getModuleInfo = (file)=>{
        const body = fs.readFileSync(file,'utf-8')
        const ast = parser.parse(body,{
            sourceType:'module' // we are parsing an ES module
        });
        console.log(ast.program.body);
    }
    getModuleInfo("./src/index.js")
    

    Run it:

    Now what is printed is the AST of the code we wrote in index.js.

    5. Collect dependencies

    Now we need to traverse the AST and collect the dependencies it uses. What does that mean? It simply means collecting the file paths brought in by import statements. We store the collected paths in deps.

    As mentioned earlier, the @babel/traverse package is used to traverse the AST:

    npm install @babel/traverse
    

    Now we import and use it.

    const fs = require('fs')
    const path = require('path')
    const parser = require('@babel/parser')
    const traverse = require('@babel/traverse').default
    const getModuleInfo = (file)=>{
        const body = fs.readFileSync(file,'utf-8')
        const ast = parser.parse(body,{
            sourceType:'module' // we are parsing an ES module
        });
        
        // New code
        const deps = {}
        traverse(ast,{
            ImportDeclaration({node}){
                const dirname = path.dirname(file)
                const abspath = './' + path.join(dirname,node.source.value)
                deps[node.source.value] = abspath
            }
        })
        console.log(deps);
    
    
    }
    getModuleInfo("./src/index.js")
    

    Let's take a look at the description of @babel/traverse in the official documentation.

    It is quite brief.

    Still, it is easy to see that the first argument is the AST and the second is a configuration object (the visitor).

    Let's take a look at the code we wrote

    traverse(ast,{
        ImportDeclaration({node}){
            const dirname = path.dirname(file)
            const abspath = './' + path.join(dirname,node.source.value)
            deps[node.source.value] = abspath
        }
    })
    

    In the configuration object we defined an ImportDeclaration method. What does that mean? Look at the AST printed earlier.

    The ImportDeclaration method is called for every node whose type is ImportDeclaration.

    Inside it we read the value of the node's source, that is, node.source.value.

    What is this value? It is the path given in the import statement, as you can see in our index.js:

    import add from "./add.js"
    import {minus} from "./minus.js";
    
    const sum = add(1,2);
    const division = minus(2,1);
    
    console.log(sum);
    console.log(division);
    

    So value is the path string that follows from in each import statement, one for add and one for minus.

    We then join the directory of the current file with that value and save the result in deps, keyed by the original import path. This is what collecting dependencies means.
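
The dispatch-by-node-type idea and the path joining can be tried without installing anything. The following is only a rough sketch: fakeAst is a hand-written, heavily simplified stand-in for the real AST, miniTraverse is a hypothetical toy and not how @babel/traverse is implemented, and the import paths are assumed to include the .js extension. path.posix is used so the result looks the same on every platform.

```javascript
const path = require('path');

// Hand-written, heavily simplified stand-in for the AST of index.js (assumption).
const fakeAst = {
  type: 'Program',
  body: [
    { type: 'ImportDeclaration', source: { value: './add.js' } },
    { type: 'ImportDeclaration', source: { value: './minus.js' } },
    { type: 'VariableDeclaration' },
  ],
};

// Toy traversal: call visitor[node.type] for every node that has a handler.
const miniTraverse = (node, visitor) => {
  if (visitor[node.type]) visitor[node.type]({ node });
  (node.body || []).forEach((child) => miniTraverse(child, visitor));
};

// Same logic as the ImportDeclaration handler in getModuleInfo.
const collectDeps = (file, ast) => {
  const deps = {};
  const dirname = path.posix.dirname(file);
  miniTraverse(ast, {
    ImportDeclaration({ node }) {
      deps[node.source.value] = './' + path.posix.join(dirname, node.source.value);
    },
  });
  return deps;
};

const exampleDeps = collectDeps('./src/index.js', fakeAst);
console.log(exampleDeps);
```

Each key is the path as written in the import statement, and each value is the same file expressed relative to the project root.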

    OK, that's it for this step. Let's run it to see whether the collection succeeds.

    It succeeded again.

    6. Convert ES6 to ES5 (AST)

    Now we need to transform the ES6 AST into ES5 code. As mentioned earlier, this step needs two packages:

    npm install @babel/core @babel/preset-env
    

    Now we import them and use them:

    const fs = require('fs')
    const path = require('path')
    const parser = require('@babel/parser')
    const traverse = require('@babel/traverse').default
    const babel = require('@babel/core')
    const getModuleInfo = (file)=>{
        const body = fs.readFileSync(file,'utf-8')
        const ast = parser.parse(body,{
            sourceType:'module' // we are parsing an ES module
        });
        const deps = {}
        traverse(ast,{
            ImportDeclaration({node}){
                const dirname = path.dirname(file)
                const abspath = "./" + path.join(dirname,node.source.value)
                deps[node.source.value] = abspath
            }
        })
        
        // New code
        const {code} = babel.transformFromAst(ast,null,{
            presets:["@babel/preset-env"]
        })
        console.log(code);
    
    }
    getModuleInfo("./src/index.js")
    

    Let's take a look at how the official documentation describes @babel/core's transformFromAst.

    Well, it is as brief as ever...

    In short, it transforms the AST we pass in into code, according to the presets configured in the third argument.

    Okay, now let's run it and see the result.

    Success, as always. We can see that it turned our const declarations into var.

    That's it for this step. You may wonder why the dependencies collected in the previous step play no role here. That's true: they are collected for the recursion that follows.

    7. Recursively get all dependencies

    After the steps above, getModuleInfo can gather the information about one module, but it does not return it yet, so let's change getModuleInfo:

    const getModuleInfo = (file)=>{
        const body = fs.readFileSync(file,'utf-8')
        const ast = parser.parse(body,{
            sourceType:'module' // we are parsing an ES module
        });
        const deps = {}
        traverse(ast,{
            ImportDeclaration({node}){
                const dirname = path.dirname(file)
                const abspath = "./" + path.join(dirname,node.source.value)
                deps[node.source.value] = abspath
            }
        })
        const {code} = babel.transformFromAst(ast,null,{
            presets:["@babel/preset-env"]
        })
        // New code
        const moduleInfo = {file,deps,code}
        return moduleInfo
    }
    

    We return an object containing the path of the module (file), the dependencies of the module (deps), and the module's code transformed to ES5 (code).

    This function only gets the information for a single module, so how do we get the information for the modules it depends on?

    As the heading suggests: recursively.

    Now let's write a function that collects the dependencies recursively:

    const parseModules = (file) =>{
        const entry =  getModuleInfo(file)
        const temp = [entry]
        for (let i = 0;i<temp.length;i++){
            const deps = temp[i].deps
            if (deps){
                for (const key in deps){
                    if (deps.hasOwnProperty(key)){
                        temp.push(getModuleInfo(deps[key]))
                    }
                }
            }
        }
        console.log(temp)
    }
    

    How parseModules works:

    1. We first pass in the path of the main module.

    2. The module information obtained for it is put into the temp array.

    3. The outer loop iterates over the temp array, which at first contains only the main module.

    4. Inside the loop we get the deps of the current module.

    5. We iterate over deps and, for each dependency, call getModuleInfo and push the result onto temp. Because temp grows while we loop over it, the dependencies of the dependencies get processed as well.
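
One thing to be aware of: the loop calls getModuleInfo once per import, so a module imported from two places is processed twice, and two modules that import each other would keep the loop running forever. Below is a minimal sketch of one way to guard against that with a set of already visited paths. It is not part of the original bundle.js; fakeModules, getFakeModuleInfo and collectModules are hypothetical names, and the in-memory table stands in for reading and parsing real files.

```javascript
// Hypothetical in-memory stand-in for getModuleInfo, so the sketch runs without Babel.
const fakeModules = {
  './src/index.js': { deps: { './a.js': './src/a.js', './b.js': './src/b.js' } },
  './src/a.js': { deps: { './b.js': './src/b.js' } },
  './src/b.js': { deps: { './a.js': './src/a.js' } }, // a.js and b.js import each other
};
const getFakeModuleInfo = (file) => ({ file, deps: fakeModules[file].deps, code: '' });

// Same growing-array loop as parseModules, plus a "seen" set so each file is visited once.
const collectModules = (entry, getInfo) => {
  const seen = new Set([entry]);
  const queue = [getInfo(entry)];
  for (let i = 0; i < queue.length; i++) {
    for (const depPath of Object.values(queue[i].deps)) {
      if (!seen.has(depPath)) {
        seen.add(depPath);
        queue.push(getInfo(depPath));
      }
    }
  }
  return queue;
};

const visitedFiles = collectModules('./src/index.js', getFakeModuleInfo).map((m) => m.file);
console.log(visitedFiles);
```

For the three-file project in this article the guard makes no difference, since no module is imported twice.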

    The current bundle.js file:

    const fs = require('fs')
    const path = require('path')
    const parser = require('@babel/parser')
    const traverse = require('@babel/traverse').default
    const babel = require('@babel/core')
    const getModuleInfo = (file)=>{
        const body = fs.readFileSync(file,'utf-8')
        const ast = parser.parse(body,{
            sourceType:'module' // we are parsing an ES module
        });
        const deps = {}
        traverse(ast,{
            ImportDeclaration({node}){
                const dirname = path.dirname(file)
                const abspath = "./" + path.join(dirname,node.source.value)
                deps[node.source.value] = abspath
            }
        })
        const {code} = babel.transformFromAst(ast,null,{
            presets:["@babel/preset-env"]
        })
        const moduleInfo = {file,deps,code}
        return moduleInfo
    }
    
    // New code
    const parseModules = (file) =>{
        const entry =  getModuleInfo(file)
        const temp = [entry]
        for (let i = 0;i<temp.length;i++){
            const deps = temp[i].deps
            if (deps){
                for (const key in deps){
                    if (deps.hasOwnProperty(key)){
                        temp.push(getModuleInfo(deps[key]))
                    }
                }
            }
        }
        console.log(temp)
    }
    parseModules("./src/index.js")
    

    With the current project, temp should end up holding three modules: index.js, add.js and minus.js. Let's run it and see.

    Indeed it does.

    However, the objects in the temp array are not in a convenient format for the steps that follow. We would rather store them with the file path as the key and {code, deps} as the value, so we create a new object, depsGraph.

    const parseModules = (file) =>{
        const entry =  getModuleInfo(file)
        const temp = [entry] 
        const depsGraph = {} // New code
        for (let i = 0;i<temp.length;i++){
            const deps = temp[i].deps
            if (deps){
                for (const key in deps){
                    if (deps.hasOwnProperty(key)){
                        temp.push(getModuleInfo(deps[key]))
                    }
                }
            }
        }
        // New code
        temp.forEach(moduleInfo=>{
            depsGraph[moduleInfo.file] = {
                deps:moduleInfo.deps,
                code:moduleInfo.code
            }
        })
        console.log(depsGraph)
        return depsGraph
    }
    

    OK, now it is stored in this format.
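
To make the target format concrete, here is a small sketch of the conversion on hand-written data. The records in sampleModules are illustrative only: the code strings are placeholders rather than real Babel output, and the import paths are assumed to carry the .js extension. toDepsGraph is a hypothetical helper that does the same thing as the forEach above.

```javascript
// Illustrative module records in the shape returned by getModuleInfo;
// the code strings are placeholders, not real Babel output.
const sampleModules = [
  {
    file: './src/index.js',
    deps: { './add.js': './src/add.js', './minus.js': './src/minus.js' },
    code: '/* transformed index.js */',
  },
  { file: './src/add.js', deps: {}, code: '/* transformed add.js */' },
  { file: './src/minus.js', deps: {}, code: '/* transformed minus.js */' },
];

// Key each record by its file path and keep only deps and code.
const toDepsGraph = (modules) =>
  Object.fromEntries(modules.map(({ file, deps, code }) => [file, { deps, code }]));

const sampleGraph = toDepsGraph(sampleModules);
console.log(Object.keys(sampleGraph));
console.log(sampleGraph['./src/index.js'].deps);
```

With the path as the key, looking up a module by the path stored in another module's deps becomes a single property access.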

    8. Handle the two keywords (require and exports)

    Our goal now is to generate a bundle.js file, that is, the bundled output. The idea is simple: combine the content of index.js with the modules it depends on, and write the resulting code to a new js file.

    Let's first format the transformed code so that it is easier to read:

    // index.js
    "use strict"
    var _add = _interopRequireDefault(require("./add.js"));
    var _minus = require("./minus.js");
    function _interopRequireDefault(obj) { return obj && obj.__esModule ? obj : { "default": obj }; }
    var sum = (0, _add["default"])(1, 2);
    var division = (0, _minus.minus)(2, 1);
    console.log(sum); console.log(division);
    // add.js
    "use strict";
    Object.defineProperty(exports, "__esModule", {  value: true});
    exports["default"] = void 0;
    var _default = function _default(a, b) {  return a + b;};
    exports["default"] = _default;
    

    But we cannot run the code of index.js yet, because the browser does not recognize require and exports.

    Why not? Simply because the require function and the exports object are not defined. So we can define them ourselves.

    We create a function:

    const bundle = (file) =>{
        const depsGraph = JSON.stringify(parseModules(file))
        
    }
    

    We save the depsGraph obtained in the previous step, serialized as a string.

    Next, the function should return the combined code as a string.

    How? Change the bundle function:

    const bundle = (file) =>{
        const depsGraph = JSON.stringify(parseModules(file))
        return `(function (graph) {
                    function require(file) {
                        (function (code) {
                            eval(code)
                        })(graph[file].code)
                    }
                    require('${file}')
                })(${depsGraph})`
        
    }
    

    Let's look at the returned code (here file and depsGraph stand for the entry path and the serialized graph):

     (function (graph) {
            function require(file) {
                (function (code) {
                    eval(code)
                })(graph[file].code)
            }
            require(file)
        })(depsGraph)
    

    What it does:

    1. Pass the saved depsGraph to an immediately invoked function.

    2. Pass the path of the main file to the require function and call it.

    3. When require runs, it in turn calls another immediately invoked function, passing in the value of code.

    4. Execute eval(code), that is, run this code.
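
Why serialize the graph with JSON.stringify at all? The code of every module is a string full of quotes and newlines, and it has to be placed inside the source text of the generated file. The sketch below, which uses a hypothetical one-module graph, suggests how JSON.stringify takes care of the escaping so that the serialized graph can be dropped into generated source as an object literal.

```javascript
// Hypothetical one-module graph; the code string contains quotes and a newline on purpose.
const exampleGraph = {
  './src/add.js': { deps: {}, code: 'var s = "hi";\nexports.s = s;' },
};

// JSON.stringify escapes the quotes and the newline, so the result is a valid object literal.
const serializedGraph = JSON.stringify(exampleGraph);

// Interpolate the literal into generated source text.
const generatedSource =
  `(function (graph) { return graph['./src/add.js'].code })(${serializedGraph})`;

// Evaluating the generated source gives back the original code string.
const recoveredCode = eval(generatedSource);
console.log(recoveredCode === exampleGraph['./src/add.js'].code);
```

Concatenating the raw code strings by hand instead would break as soon as a module contained a quote character.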

    Let's look at the value of code:

    // index.js
    "use strict"
    var _add = _interopRequireDefault(require("./add.js"));
    var _minus = require("./minus.js");
    function _interopRequireDefault(obj) { return obj && obj.__esModule ? obj : { "default": obj }; }
    var sum = (0, _add["default"])(1, 2);
    var division = (0, _minus.minus)(2, 1);
    console.log(sum); console.log(division);
    

    When this code runs, require is called again, this time with the path of add.js. But that path is relative to index.js, whereas the keys of the graph are paths relative to the project root (what this article calls absolute paths), so it has to be converted. We therefore write a function, absRequire, to do the conversion. How? Let's look at the code:

    (function (graph) {
        function require(file) {
            function absRequire(relPath) {
                return require(graph[file].deps[relPath])
            }
            (function (require,code) {
                eval(code)
            })(absRequire,graph[file].code)
        }
        require(file)
    })(depsGraph)
    

    In effect, this adds a layer of interception.

    1. Call require('./src/index.js').

    2. This executes:

    (function (require,code) {
        eval(code)
    })(absRequire,graph[file].code)
    
    3. eval runs, that is, the code of index.js is executed.

    4. During execution, a call to require is reached.

    5. The require that gets called here is the function parameter, which is the absRequire we passed in.

    6. absRequire executes return require(graph[file].deps[relPath]), that is, it calls the outer require.

    In return require(graph[file].deps[relPath]) the path has already been converted into an absolute path, so the outer require receives the absolute path.

    7. After require("./src/add.js") is called, eval runs again, this time executing the code of add.js.

    Is it a bit convoluted? It is simply recursion.
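
The order of calls can be observed with a tiny hand-written graph. This is only a sketch under stated assumptions: tinyGraph and callLog are made up for illustration, the module code strings just record when they run, and the outer function is named loadModule instead of require so that the sketch does not shadow Node's own require when run with node.

```javascript
const callLog = [];

// Hypothetical two-module graph; each code string only records when it runs.
const tinyGraph = {
  './src/index.js': {
    deps: { './add.js': './src/add.js' },
    code: 'callLog.push("index: start"); require("./add.js"); callLog.push("index: end");',
  },
  './src/add.js': { deps: {}, code: 'callLog.push("add: run");' },
};

(function (graph) {
  function loadModule(file) {
    // Translate the path written in the module into the key used by the graph.
    function absRequire(relPath) {
      return loadModule(graph[file].deps[relPath]);
    }
    // Inside the evaluated code, "require" refers to absRequire.
    (function (require, code) {
      eval(code);
    })(absRequire, graph[file].code);
  }
  loadModule('./src/index.js');
})(tinyGraph);

console.log(callLog);
```

The entry for add.js lands between the two entries for index.js, which shows that a dependency runs to completion in the middle of the module that required it.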

    This combines the code, but there is a problem: when the code of add.js is executed, exports is undefined. The code looks like this:

    // add.js
    "use strict";
    Object.defineProperty(exports, "__esModule", {  value: true});
    exports["default"] = void 0;
    var _default = function _default(a, b) {  return a + b;};
    exports["default"] = _default;
    

    It uses exports as an object, but that object has not been defined yet, so we define an exports object ourselves.

    (function (graph) {
        function require(file) {
            function absRequire(relPath) {
                return require(graph[file].deps[relPath])
            }
            var exports = {}; // the semicolon matters: the next line starts with "("
            (function (require,exports,code) {
                eval(code)
            })(absRequire,exports,graph[file].code)
            return exports
        }
        require(file)
    })(depsGraph)
    

    We added an empty exports object. When the code of add.js runs, properties are added to this empty object.

    // add.js
    "use strict";
    Object.defineProperty(exports, "__esModule", {  value: true});
    exports["default"] = void 0;
    var _default = function _default(a, b) {  return a + b;};
    exports["default"] = _default;
    

    For example, after this code has run, exports is equivalent to:

    exports = {
      __esModule: true,
      default: function _default(a, b) {  return a + b;}
    }
    

    Then we return the exports object.

    var _add = _interopRequireDefault(require("./add.js"));
    

    The return value is received by _interopRequireDefault. Because exports.__esModule is true, _interopRequireDefault returns the exports object unchanged, so _add is that object and _add["default"] is function _default(a, b) { return a + b;}.
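
The behaviour of _interopRequireDefault can be checked in isolation. The helper below is copied from the transformed code shown earlier; esExports and plainFn are hypothetical values standing in for a transformed ES module and for a plain CommonJS export.

```javascript
// Helper as it appears in the transformed index.js.
function _interopRequireDefault(obj) {
  return obj && obj.__esModule ? obj : { default: obj };
}

// Stand-in for the exports object filled in by the transformed add.js.
const esExports = {};
Object.defineProperty(esExports, '__esModule', { value: true });
esExports.default = (a, b) => a + b;

// Stand-in for a module that exported a bare function without the __esModule marker.
const plainFn = (a, b) => a - b;

const fromEsModule = _interopRequireDefault(esExports);
const fromPlainValue = _interopRequireDefault(plainFn);

console.log(fromEsModule === esExports);         // the exports object is passed through
console.log(fromEsModule.default(1, 2));         // the default export is called via .default
console.log(fromPlainValue.default === plainFn); // a plain value gets wrapped in { default }
console.log(Object.keys(esExports));             // __esModule is not enumerable
```

Either way the caller can always write _add["default"], which is exactly what the transformed index.js does.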

    This also explains why what an ES6 module import gives you is a reference to an object: exports is an object.

    With that, the handling of the two keywords is complete. The full bundle.js now reads:

    const fs = require('fs')
    const path = require('path')
    const parser = require('@babel/parser')
    const traverse = require('@babel/traverse').default
    const babel = require('@babel/core')
    const getModuleInfo = (file)=>{
        const body = fs.readFileSync(file,'utf-8')
        const ast = parser.parse(body,{
            sourceType:'module' // we are parsing an ES module
        });
        const deps = {}
        traverse(ast,{
            ImportDeclaration({node}){
                const dirname = path.dirname(file)
                const abspath = "./" + path.join(dirname,node.source.value)
                deps[node.source.value] = abspath
            }
        })
        const {code} = babel.transformFromAst(ast,null,{
            presets:["@babel/preset-env"]
        })
        const moduleInfo = {file,deps,code}
        return moduleInfo
    }
    const parseModules = (file) =>{
        const entry =  getModuleInfo(file)
        const temp = [entry]
        const depsGraph = {}
        for (let i = 0;i<temp.length;i++){
            const deps = temp[i].deps
            if (deps){
                for (const key in deps){
                    if (deps.hasOwnProperty(key)){
                        temp.push(getModuleInfo(deps[key]))
                    }
                }
            }
        }
        temp.forEach(moduleInfo=>{
            depsGraph[moduleInfo.file] = {
                deps:moduleInfo.deps,
                code:moduleInfo.code
            }
        })
        return depsGraph
    }
    // New code
    const bundle = (file) =>{
        const depsGraph = JSON.stringify(parseModules(file))
        return `(function (graph) {
            function require(file) {
                function absRequire(relPath) {
                    return require(graph[file].deps[relPath])
                }
                var exports = {}; // the semicolon matters: the next line starts with "("
                (function (require,exports,code) {
                    eval(code)
                })(absRequire,exports,graph[file].code)
                return exports
            }
            require('${file}')
        })(${depsGraph})`
    
    }
    const content = bundle('./src/index.js')
    
    console.log(content);
    

    Let's run it and see the result.

    It works. Next, write the returned code into a newly created file:

    // Write the output to our dist directory
    fs.mkdirSync('./dist', { recursive: true }); // recursive: no error if dist already exists
    fs.writeFileSync('./dist/bundle.js',content)
    

    At this point, our hand-written implementation of webpack's core principles is complete.

    Let's take a look at the generated bundle.js file.

    As we can see, all the dependencies collected earlier are passed as an argument to the immediately invoked function, and the code of each dependency is then executed recursively through eval.

    Now we load the bundle.js file in index.html to see whether it runs.

    Success!

    Thank you for reading this far, and congratulations. May I humbly ask for a star?

    https://github.com/Sunny-lucking/howToBuildMyWebpack

    Author: Sunshine is sunny

    Original address: https://juejin.im/post/6854573217336541192


Origin blog.csdn.net/liuyan19891230/article/details/107970068