web: a big refactoring from 10 compilations to 1

Introduction: For large-scale front-end projects, build stability and ease of use are vital. As Tencent Docs iterated, its complex project structure and compilation problems grew by the day, greatly increasing both the cost of onboarding newcomers and the cost of everyday development. With Webpack 5 freshly released, it was the perfect moment for a thorough overhaul.

1. Introduction

With the release of webpack 5, Tencent Docs undertook a major refactoring of its compilation. It is a large project composed of multiple repositories, any one of whose categories exceeds a million lines of code. For a project like Tencent Docs, which iterates rapidly, relies heavily on automated pipelines, and runs several large and many small projects in parallel all year round, stable and fast compilation is critical to development efficiency. This article covers the problems peculiar to large projects that I ran into during the recent refactoring, which ultimately brought everyday development builds down to 1s. I hope it offers some reference and inspiration for optimizing your own front-end builds.

2. The pain of compiling large projects

As the project system gradually expanded, we kept running into cases where the old build configuration could not support new features. Because of the readability debuff that a pile of config files brings, plus the heavy technical debt, everyone tended not to touch the old configuration, instead bolting extra configuration onto the periphery to patch the build system. For similar reasons, the historical build of Tencent Docs was not elegant.

The multi-level sub-repository structure and the complex build system made the cost of understanding and modifying anything high, and also brought long compile times, which hurt the development efficiency of the whole team.

3. All in One

To solve the problem of complex, slow builds, the crucial step is to ban the nesting dolls: the multi-level hybrid system had to go, and unified compilation is king. Among build systems, Webpack has strong advantages in bundling large projects and the richest plugin ecosystem, and Webpack 5 brings the new Module Federation feature, so I chose Webpack to unify the compilation of all the sub-repositories.

3.1. Integrating the lerna-based repository structure

Tencent Docs uses lerna to manage the subpackages in the repository; the benefits of lerna won't be expanded on here. The usual way of using lerna, however, brings certain problems. Structurally, lerna turns one repository into many. With the default usage, each subpackage has its own build configuration, and building a single project, modifying configuration, and incrementally compiling and debugging across projects all become harder.

Although the point of lerna is to keep each subpackage relatively independent, building and debugging the whole project usually needs the union of all packages. So we can ignore the physical isolation between subpackages and simply treat each sub-repository as a source directory. Without relying on lerna, all that remains to solve is the reference problem between subpackages:

/** package/layout/src/xxx.ts **/
import { Stream } from '@core/model';
// do the sum

In fact, this can be achieved through the alias attribute of resolve in the webpack configuration:

{
  resolve: {
    alias: {
      '@core/model': 'word/package/model/src/',
    },
  },
}

3.2. Managing files outside the packaging system

In large-scale projects there are sometimes special static code files that do not take part in the bundling system, but are instead injected into the HTML by other means, or merged into the final output.

Such files are generally divided into the following categories:


  1. External SDK files that must load early, shipped as pre-minified files
  2. Framework files that those external files depend on, such as jQuery
  3. Some polyfills
  4. Some special standalone logic that must run early, such as SDK initialization

Since polyfills and external SDKs usually work by attaching global variables, projects often reference them by writing script tags directly into the HTML. But as such files multiply, direct tag references become unfriendly to version management and the build process, and their corresponding initialization logic cannot join the bundling pipeline. In this case I suggest manually creating a JS entry file that imports the files above, and using it as a webpack entry. That way, these scattered files can be managed as code:


import 'jquery';
 
import 'raven.min.js';
import 'log.js';
// ...

However, some external JS may depend on another SDK such as jQuery, and the bundler knows nothing about the dependency, so jQuery may not be exposed to the global scope in time. What then? Webpack offers very flexible solutions here. For example, jQuery can be exposed globally through expose-loader for third parties to reference. Tencent Docs also includes SDK components loaded from a remote CDN, and those SDKs likewise need libraries such as jQuery. So I also split jQuery out via the splitChunks configuration and load it at an earlier point, ensuring that CDN-loaded SDKs can initialize normally too.

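The two techniques just mentioned can be sketched in configuration roughly as follows; the chunk name and match patterns are illustrative, not the project's actual config:

```javascript
module.exports = {
  module: {
    rules: [
      {
        // Expose jQuery as window.$ / window.jQuery for external SDKs.
        test: require.resolve('jquery'),
        loader: 'expose-loader',
        options: { exposes: ['$', 'jQuery'] },
      },
    ],
  },
  optimization: {
    splitChunks: {
      cacheGroups: {
        // Split jQuery into its own chunk so it can be loaded earlier
        // than any CDN-hosted SDK that expects it on the page.
        jquery: {
          test: /[\\/]node_modules[\\/]jquery[\\/]/,
          name: 'jquery',
          chunks: 'all',
          enforce: true,
        },
      },
    },
  },
};
```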

Managing these files through code references helps version management on the one hand; on the other, since they now go through the build, every change to them can be watched, which paves the way for incremental compilation later. Meanwhile, thanks to webpack's encapsulation, each library ends up wrapped in a __webpack_require__ closure, and the number of exposed global variables becomes much more controllable.

3.3. Customized webpack process

Webpack provides the very flexible html-webpack-plugin for HTML generation, with template support and a number of companion plugins. Still, the project has some special requirements that generic plugin configuration either cannot meet, or can meet only in ways that are very hard to understand. This is why Tencent Docs originally used gulp to generate HTML: the gulp configuration contained many custom steps to satisfy the product's release requirements.

Since gulp can customize the HTML generation process, I can equally write a dedicated webpack plugin to implement the same customized process.

Webpack itself is a very flexible system: a framework that executes a fixed pipeline and exposes different hooks at each stage, with plugins implementing callbacks on those hooks to carry out the bundling. In fact, webpack itself is composed of countless built-in plugins. Throughout this pipeline, I can hook in at all sorts of points to customize it.


For the HTML-generation scenario: add a plugin that, during webpack's asset-processing stage, combines the generated JS, CSS, and other resource files with the EJS template and the special configuration, then adds the result to webpack's assets collection. That completes custom HTML generation.


compilation.hooks.processAssets.tap({
  name: 'TemplateHtmlEmitPlugin',
  stage: Compilation.PROCESS_ASSETS_STAGE_ADDITIONAL,
}, () => {
  // custom generation
  compilation.emitAsset(
    `${parsed.name}.html`,
    new sources.RawSource(source, true),
  );
  compilation.fileDependencies.addAll(dependencies);
});

Note the last line of the code above: compilation.fileDependencies.addAll(dependencies). It registers every file the custom generation depends on into webpack's dependency system, so that whenever one of those files changes, webpack automatically re-triggers the generation.
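For context, the hook above lives inside a plugin's apply method. A minimal skeleton might look like this — a sketch, where renderHtml is a hypothetical helper standing in for the actual template/asset merging, not part of webpack:

```javascript
const { Compilation, sources } = require('webpack');

class TemplateHtmlEmitPlugin {
  constructor(options) {
    this.options = options; // e.g. { name, template, config }
  }

  apply(compiler) {
    compiler.hooks.thisCompilation.tap('TemplateHtmlEmitPlugin', (compilation) => {
      compilation.hooks.processAssets.tap({
        name: 'TemplateHtmlEmitPlugin',
        stage: Compilation.PROCESS_ASSETS_STAGE_ADDITIONAL,
      }, () => {
        // renderHtml would merge the emitted js/css with the ejs template
        // and return the html source plus the files it read along the way.
        const { source, dependencies } = this.renderHtml(compilation);
        compilation.emitAsset(
          `${this.options.name}.html`,
          new sources.RawSource(source, true),
        );
        // Register the template and config files for watch mode.
        compilation.fileDependencies.addAll(dependencies);
      });
    });
  }
}

module.exports = TemplateHtmlEmitPlugin;
```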

3.4. One-click development experience

At this point all compilation is unified: the whole project builds with a single webpack run, and watch and devServer come along with it. But if compilation can be unified, why not integrate every operation?

On the assumption that node_modules is never touched by hand, I can snapshot the dependencies declared in package.json and, on every run, decide whether reinstallation is needed by comparing against the previous snapshot. Developers no longer have to judge manually after syncing code, and unnecessary steps are skipped.

public async install(force = false) {
  const startTime = performance.now();
  const lastSnapshot = this.readSnapshot();
  const snapshot = this.createSnapshot();

  const runs = this.repoInfos.map((repo) => {
    if (
      this.isRepoInstallMissing(repo.root)
      || (!repo.installed
      && (force || !isEqual(snapshot[repo.root], lastSnapshot[repo.root])))
    ) {
      // create and return install cmd
    }
    return undefined;
  }).filter(script => !!script);

  const { info } = console;
  if (runs.length > 0) {
    try {
      // run the installs and save the snapshot
      await Promise.all(runs.map(run => this.exec(run!.cmd, run!.cwd, run!.name)));
      this.saveSnapshot(snapshot);
    } catch (e) {
      this.removeSnapshot();
      throw e;
    }
  } else {
    info(chalk.green('Skip install'));
  }
  info(chalk.bgGreen.black(`Install cost: ${TimeUtil.formatTime(performance.now() - startTime)}`));
}

Similarly, local debugging of Tencent Docs runs against a special test environment, proxied through whistle; those steps can be automated too. Then, for development, everything is easy: one command and off you go.


However, a complex system always needs a first-time initialization. If the build system's own dependencies have not been installed yet, there is no chicken — how can it lay eggs?

Actually, the answer is to wrap one more layer around the entire build system. For front-end development, Node is always installed first, right? So before running the actual build command, I run a script that depends only on Node. It tries to execute the main command; if the main command crashes outright, the environment isn't ready, so the script initializes the build system and retries. With that, one-click development truly works.

const cmd = 'command that starts the build';
const main = extraArg => childProcess.execSync(`${cmd} ${extraArg}`, { stdio: 'inherit', cwd: __dirname });

try {
  main('');
} catch (e) {
  // initialize, then retry
  main('after-initialize');
}

3.5. The build system as code

In this refactor, I replaced the original compilation configuration with TypeScript that calls webpack's Node API to run the build. A codified build system has many benefits:

  1. With API calls you enjoy IDE code hints, and you no longer spend all day debugging because of a typo in a config file.
  2. A code-level API expresses the build structure better, especially with multiple outputs; it is easier to manage than a simple combination of config files.
  3. A codified build system has a special bonus: the build system itself can have tests! What — tests for the build system? In fact, several past releases of Tencent Docs hit inexplicable bugs: in pre-launch testing, the whole program suddenly behaved abnormally even though none of the relevant code had changed. After a long carpet-bombing investigation, we found the compilation output differed slightly from before. A few hours before the test environment was generated, a plugin the build depended on had silently shipped a patch release; the package.json used the default ^x.x range, the pipeline installed the latest version, and everything fell over. My conclusion at the time was that build-related libraries need pinned versions. But pinning doesn't truly solve the problem: build components will be upgraded some day, so how do you guarantee an upgrade won't break anything? That is the realm of automated testing. If you read Webpack's own code, you will find plenty of test cases for compile consistency. But webpack plugins are many and varied, and not every author invests enough in quality assurance, so using automated tests to guarantee build-system stability is a topic worth studying in depth.

4. Compilation speed up

In projects that involve TypeScript compilation, the basic speed-up move is asynchronous type checking: the tried-and-true combination of ts-loader's transpileOnly option with fork-ts-checker-webpack-plugin. For a complex, large project, however, turning this combo on may be far from smooth sailing. Follow along on Tencent Docs' bumpy road to fast compilation.

4.1. The disappearing enum

Right after enabling transpileOnly, compile speed improved substantially; the results, however, were not so rosy. After compiling, the page crashed before it even opened. The error showed that an object imported from a dependency library had become undefined, crashing the program. The object in question is an enum, defined as follows:

export const enum Scope {
  VAL1 = 0,
  VAL2 = 1,
}

Why is it empty once transpileOnly is on? That comes down to its special nature: this is not a normal enum but a const enum. As we all know, enums are TypeScript syntactic sugar, and each enum corresponds to a JS object. An ordinary enum therefore compiles to JS like this:

// ts
export enum Scope {
  VAL1 = 0,
  VAL2 = 1,
}

const a = Scope.VAL1;

// js
const Scope = {
  VAL1: 0,
  VAL2: 1,
  0: 'VAL1',
  1: 'VAL2',
};

const a = Scope.VAL1;

What if the const keyword is added to Scope? It becomes this:

// ts
export const enum Scope {
  VAL1 = 0,
  VAL2 = 1,
}

const a = Scope.VAL1;

// js
const a = 0;

In other words, a const enum is effectively a macro: it no longer exists after translation to JS. So why does the build work fine with transpileOnly off? If you look carefully at the external library's declaration file (.d.ts), you will find that the Scope definition is kept intact there:

// .d.ts
export const enum Scope {
  VAL1 = 0,
  VAL2 = 1,
}

Under the normal compilation flow, tsc checks the .d.ts file, so it already knows this definition and can perform the macro substitution correctly. With transpileOnly on, all types are ignored; the original Scope no longer exists in the library module, so the compiled output cannot run. (PS: the TypeScript team has stated officially that transpile-mode compilation does not resolve .d.ts files by design, and losing const enums is not a bug, so waiting for official support is fruitless.) Knowing the cause, we can fix it. Four options:

  • Option 1: follow the official guidance and stop exporting const enums, reserving const for enums used internally. That means modifying the dependency libraries. The libraries that crashed Tencent Docs this time were indeed all our own SDKs, but what if an external library caused the problem? So this plan is not safe.
  • Option 2, the perfect version: manually parse the .d.ts files, find every const enum, and extract the definitions. But the speed-up from transpileOnly comes precisely from ignoring .d.ts files; parsing .d.ts by hand just for enums means following potentially complex chains of references, which is extremely time-consuming.


  • Option 3: string replacement. Since const enum is a macro, I can manually achieve a similar effect with string-replace-loader. But string replacement is too crude; a usage like Scope['VAL1'] could make it fail unexpectedly.


  • Option 4, the solution I finally adopted: since the definition has disappeared, just define it again. Through Webpack's DefinePlugin, I can redefine the missing objects so the build resolves them normally.

new DefinePlugin({
  Scope: { VAL1: 0, VAL2: 1 },
})

4.2. A love-hate affair: decorators and dependency injection

Unfortunately, fixing the missing objects was not enough: the code still wouldn't run. The program kept failing during initialization. Some debugging showed subtle differences in the initialization process — clearly, with transpileOnly on, the compilation output had changed.
To understand why, we need a closer look at how the transpileOnly mode is implemented. Under the hood it is based on tsc's transpileModule function, which parses each file as an independent unit. Every import is treated as an opaque module; the compiler does not resolve the relationship between a module's exports and the files behind them. For example:

// src/base/a.ts
export class A {}

// src/base/b.ts
export class B {}

// src/base/index.ts
export * from './a';
export * from './b';

// src/app.ts
import { A } from './base';

const a = new A();

This is a common way to write code: we re-export the modules in base through an index.ts so that other modules need not reference individual files. In normal mode, the compiler parses this code and attaches the information that A is exported by a.ts, so webpack can bundle A and B into different files as the scenario demands. In transpileModule mode, webpack only knows that the base module exports A, not which file A comes from, so it has no choice but to pack A and B into one file and hand the whole module to the app. For Tencent Docs, the change looked like this (modules load in order 1 through 5, and the visual size of a module indicates its size):


With transpileOnly on, a large number of files were packed into module 1 and loaded up front. In general, though, where a module gets packed shouldn't affect how the code behaves, should it? After all, with code splitting off, the code still works unsplit. For the general case that understanding is correct; for projects that use decorators, it is not. In the era when code is generally compiled down to ES5, a decorator is converted into a __decorate call, a function that self-executes when the code loads. If the packing order changes, the execution order of those self-executing functions can change too. So how did this keep Tencent Docs from starting? It starts with the project's across-the-board adoption of dependency injection.

In Tencent Docs, every function is a Feature. A Feature is not initialized by hand; a special decorator injects it into the in-house DI framework, which then performs all instance creation uniformly. In a normal build, say there are three Features A, B, and C: A and B compiled into module 1, C into module 2. When module 1 loads, the workbench runs a round of instance creation and initialization, and initializing Feature A produces a certain side effect. Then module 2 loads and the workbench runs another round; Feature C's initialization depends on Feature A's side effect, and since the first round has already finished, C instantiates successfully.


With transpileOnly on, everything changes. Because exports can no longer be told apart, Features A, B, and C are packed into the same module. Predictably, when Feature C initializes, the side effect has not happened yet, and C's initialization fails.


Since transpileOnly is inherently at odds with dependency injection, I needed a workaround. Suppose the reference in the app is replaced:

// src/app.ts
import { A } from './base/a';

const a = new A();

That solves the export-resolution problem. But rewriting that much code into such references is not just ugly and unergonomic; the workload is huge. So I designed a plugin/loader combination to solve it at compile time. At the start of compilation, a plugin parses the project files, extracts the exports, works out which file each export comes from, and stores the mapping. (You might worry about IO performance here; given that developers all have fast SSDs now, this throughput is really nothing — the export analysis measures at under 1s.) Then, during compilation, a custom loader rewrites the relevant import statements, so transpileOnly's parsing stays valid without changing how code is normally written.

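The loader's rewrite step might look roughly like this. This is a toy, regex-based sketch (a real implementation would work on the TypeScript AST); exportMap stands for the file-to-export mapping that the companion plugin produces:

```javascript
// Rewrite barrel imports like `import { A } from './base'` to point at
// the file that actually defines each symbol, e.g. './base/a'.
function rewriteImports(source, exportMap) {
  return source.replace(
    /import\s*\{([^}]+)\}\s*from\s*['"]([^'"]+)['"];?/g,
    (whole, names, modulePath) => {
      const map = exportMap[modulePath];
      if (!map) return whole; // not a barrel we know about
      return names
        .split(',')
        .map(n => n.trim())
        .filter(Boolean)
        .map(n => `import { ${n} } from '${map[n] || modulePath}';`)
        .join('\n');
    },
  );
}

module.exports = { rewriteImports };
```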

After all this tossing and turning, Tencent Docs finally ran in fast-compile mode, reaching the targeted compile speed.

5. The Webpack 5 upgrade road

5.1. Handling some compatibility issues

Webpack 5 is, after all, a breaking upgrade, and the rebuild of Tencent Docs' compilation system ran into plenty of problems.

5.1.1. SplitChunks custom chunkGroups warning

If you too are a heavy user of splitChunks, you may run into a chunk-group conflict warning during the webpack 5 upgrade.

The warning's wording is not very clear. In plain language, it appears when, in your chunkGroups configuration, a module belongs to both group A and group B (A and B being two entrypoints or two async chunks), but you have explicitly assigned the module to A. Why does webpack 5 warn here? Because normally, a module that belongs to two entrypoints or async chunks should be extracted as a shared module; if it is forced into A, then B can no longer load successfully on its own.

In practice, when such an assignment shows up and it isn't a configuration error, it means A and B already have a definite loading order — webpack just doesn't know about it. For entrypoints, webpack 5 lets you declare dependencies between entries via the dependOn attribute; for async chunks there is no equivalent setting. Of course, a custom plugin can also make sure webpack learns this extra dependency information before optimization:

compiler.hooks.thisCompilation.tap('DependOnPlugin', (compilation) => {
  compilation.hooks.optimize.tap('DependOnPlugin', () => {
    forEach(this.dependencies, (parentNames, childName) => {
      const child = compilation.namedChunkGroups.get(childName);
      if (child) {
        parentNames.forEach((parentName) => {
          const parent = compilation.namedChunkGroups.get(parentName);
          if (parent && !child.hasParent(parent)) {
            parent.addChild(child);
            child.addParent(parent);
          }
        });
      }
    });
  });
});

5.1.2. Plugins that depend on removed APIs

After Webpack 5's release, the major mainstream plugins were adapted one after another; you only need to update them to their latest versions. But some plugins haven't been updated in time, for many reasons. (PS: by now, the unadapted plugins are mostly fairly small ones.) In short, there is no universal fix, but some patience helps: in the near future most plugins should support webpack 5. Also, although webpack 5 renamed many things, moved some interfaces, and changed some calling conventions, not everything changed drastically. So if you really can't wait on a small plugin, it may be worth forking and fixing it yourself.

5.2. First experience of Module Federation

Usually, for a large project, we extract many common components to improve module sharing across projects. Those modules inevitably share some common dependencies, such as the base libraries React, ReactDOM, and jQuery. This easily causes a problem: once common components are extracted, project size balloons, and as the number of shared components grows, the bloat becomes scary. Under the traditional packaging model I found a simple, effective method: for common components, use externals to cut these shared parts out, turning each component into a stripped-down bundle that relies on its host to supply them.


However, as components multiply and the number of hosts consuming shared components grows, problems appear:

  1. A Component must be specially packaged for its Host; it is not a component that can run independently. Every Host running the Component must carry the complete runtime — otherwise the Component has to produce different stripped-down bundles for different Hosts.
  2. Large modules shared between one Component and another cannot be handled through externals.

Enter Module Federation: Webpack's transformation from static packaging to a full runtime. Module Federation introduces the concepts of Host and Remote. Content in a Remote can be consumed by a Host, and during that consumption webpack's dynamic-loading runtime fetches only the parts that are missing; parts already present are never loaded twice. (For instance, if jQuery, React, and dui are already included in the host, the Webpack runtime will load only Component1 from Remote1 and Component2 from Remote2.)


In other words, as a Remote, a public component carries a complete runtime, and the Host need not know what runtime the Remote requires to run; webpack's loader guarantees shared code is not loaded again. This avoids many of the problems of the traditional externals model. In fact, a build can be both Host and Remote at once — a program can run as the main process and also serve as an online SDK repository. I won't repeat the implementation principles of Module Federation here; if you're interested, see the analysis of webpack 5's Module Federation in the context of Tencent Docs, and the examples in the module-federation-examples repository.
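For a concrete picture, a Host/Remote pair like the one described can be configured roughly as follows; the names, URL, and shared list here are illustrative, not the project's real setup:

```javascript
const { ModuleFederationPlugin } = require('webpack').container;

// Remote build: exposes a component and declares its shared dependencies.
const remotePlugin = new ModuleFederationPlugin({
  name: 'remote1',
  filename: 'remoteEntry.js',
  exposes: { './Component1': './src/Component1' },
  shared: { react: { singleton: true }, jquery: { singleton: true } },
});

// Host build: consumes the remote; shared modules already present in the
// host are reused instead of being downloaded again.
const hostPlugin = new ModuleFederationPlugin({
  name: 'host',
  remotes: { remote1: 'remote1@https://cdn.example.com/remoteEntry.js' },
  shared: { react: { singleton: true }, jquery: { singleton: true } },
});
```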

Webpack 5's Module Federation relies on its dynamic loading mechanism, so in its demo examples you will see this structure:

// bootstrap.js
import App from "./App";
import React from "react";
import ReactDOM from "react-dom";
 
ReactDOM.render(<App />, document.getElementById("root"));
 
// index.js
import('./bootstrap.js');

Webpack's entry configuration points at index.js. The reason is that all dependencies must be determined and loaded dynamically: if the entry were not turned into an async chunk, how could the dependencies be guaranteed to load in order? After all, the core of implementing Module Federation is the dynamic loading system built on __webpack_require__. Since Module Federation requires multiple repositories to move in lockstep, rolling it out is bound to be a fairly long process. So must I restructure the existing project into the index/bootstrap shape right away? Actually, webpack's plugin mechanism can create this async indirection dynamically:

private bootstrapEntry(entrypoint: string) {
  const parsed = path.parse(entrypoint);
  const bootstrapPath = path.join(parsed.dir, `${parsed.name}_bootstrap${parsed.ext}`);
  this.virtualModules[bootstrapPath] = `import('./${parsed.name}${parsed.ext}')`;
  return bootstrapPath;
}

In the bootstrapEntry method above, I create a virtual file based on the original entrypoint file. Its content is:

	import('./entrypoint.ts');

Then, through the webpack-virtual-modules plugin, the virtual file is generated in the same directory and replaces the original entry, completing the module-federation structure conversion. With a few other corresponding configurations, I can switch module federation on and off with a single parameter, turning the project into a Webpack 5-ready structure. Once the related projects finish adapting one after another, they can all happily go live together.

6. Postscript

This major compilation refactor is something I had planned for a long time. I remember when I first joined the team and encountered, for the first time, a project with such a complex build chain: I felt keenly how shallow my project experience was — every build I had touched before was so simple. Now I know first-hand that besides the many technical difficulties, the ancestral configuration you are bound to find in a large project is itself a major obstacle to its progress.

Webpack 5's beta cycle was very long, so after release there were fewer compatibility problems than expected. Its documentation, however, is a bit of a pit: some documented parameters are still stale webpack 4 data and simply wrong, so I had to bury my head in the source to find the correct configuration — though I gained a lot along the way. Webpack 5 is also still iterating and still has bugs; for example, an inappropriate Module Federation configuration can produce strange compilation results without reporting any error. So if you hit a problem, file an issue boldly. That's all for this refactoring writeup; corrections are welcome.


Origin blog.csdn.net/weixin_51568389/article/details/125837149