A Preliminary Study of the Tree Shaking Mechanism in Flutter (Explainer)

Background

While exploring integrated Flutter engineering at Xianyu, we found that the best development experience requires a seamless connection between the FaaS-side code and the business Flutter code. The same code should be deployable to FaaS or importable directly into the main business project, so that the two are truly integrated.

To achieve this, we decoupled the two parts of the code through RPC calls, while the decoupling at the engineering level relies on the Tree-Shaking mechanism that Flutter/Dart applies during compilation. To avoid pitfalls, we need to understand how Tree-Shaking works as a whole. This article walks through the Flutter Engine source code to briefly explore the process.

Prerequisites

Tree Shaking is a dead code elimination (DCE) technique whose idea originated in LISP in the 1990s. The idea is that all possible execution flows of a program can be represented as a tree of function calls, so functions that are never called can be eliminated. The algorithm was first applied to JavaScript in Google Closure Tools, and then to the dart2js compiler, also written by Google. Flutter has a Tree Shaking mechanism of the same kind to reduce the final package size.

Flutter provides three build modes, and the Flutter compiler optimizes the output binaries differently for each. Tree-Shaking is not triggered in Debug mode. Among the AOT products compiled in Profile/Release mode, two are especially useful for seeing the Tree-Shaking mechanism at work:

  • app.dill: This is the build product of the Dart code, a binary file in Dart Kernel format. You can run the strings command on it to view its contents, which are essentially the source of our Dart code.

  • snapshot_blob.bin.d: This file lists all Dart files involved in the compilation, including our own business code, the third-party libraries declared in pubspec.yaml, and all the Flutter or Dart package code imported by our business code.

Research on Tree Shaking Mechanism

A First Look with a Minimal Demo

We write the simplest possible example; the code is as follows:

The code is very simple: it contains an _unused method that is never called. We compile it in Profile mode and use DevTools to inspect the final compiled product, as shown in the figure below.

In the Functions list there is no _unused method, which shows that this unused code was "shaken" out during compilation. In fact, besides functions, imported libraries and imported Dart files receive similar Tree-Shaking treatment during Flutter compilation. Let's dive into the code to see how this is done.

Code analysis

Here we borrow Gityuan's sequence diagram of how the flutter run command executes. The whole compilation process is fairly long. The GenSnapshot.run() method invokes the gen_snapshot binary (its source is third_party/dart/runtime/bin/gen_snapshot.cc) to generate machine code.

Zooming in on the internal execution flow of gen_snapshot:

Tree-Shaking takes place in the compilation phase, that is, in the CompileAll() method. Let's go through the code step by step to see how the Flutter compiler prunes the code.

The source file is third_party/dart/runtime/vm/compiler/aot/precompiler.cc, and readers can follow along there.

Compile stage

First comes the necessary preparation. The object pool has to be kept until the end of AOT compilation, so the handles used here must survive that long; a StackZone is used for this purpose.

To use Class Hierarchy Analysis (CHA), the class hierarchy must be stable before compilation starts. This also ensures that no function is missed during the search for entry points just because its class has not been finalized yet. CHA is a compiler optimization that uses the results of analyzing the class hierarchy to devirtualize virtual calls into direct calls.
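
To make the idea of CHA concrete, here is a minimal sketch, not the Dart VM's implementation. It assumes a toy class model in which every type and function name (ClassNode, Hierarchy, SubtypesOf, ResolveImpl, DevirtualizeTarget) is hypothetical. A virtual call can be turned into a direct call only when every class that could be the receiver resolves to the same implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified class model; these names are not from the Dart VM.
struct ClassNode {
  std::string super_name;         // empty for a root class
  std::set<std::string> methods;  // methods declared or overridden in this class
};

using Hierarchy = std::map<std::string, ClassNode>;

// Returns `cls` followed by every class that transitively extends it.
// Assumes the hierarchy is acyclic and already complete ("stable").
std::vector<std::string> SubtypesOf(const Hierarchy& h, const std::string& cls) {
  assert(h.count(cls) == 1);
  std::vector<std::string> result(1, cls);
  for (std::size_t i = 0; i < result.size(); ++i) {
    const std::string current = result[i];
    for (const auto& entry : h) {
      if (entry.second.super_name == current) result.push_back(entry.first);
    }
  }
  return result;
}

// Walks up the superclass chain and returns the class whose implementation of
// `method` is seen by instances of `cls`, or "" if there is none.
std::string ResolveImpl(const Hierarchy& h, std::string cls,
                        const std::string& method) {
  while (!cls.empty()) {
    auto it = h.find(cls);
    if (it == h.end()) return "";
    if (it->second.methods.count(method) > 0) return cls;
    cls = it->second.super_name;
  }
  return "";
}

// Returns the single class implementing `method` for all possible receivers of
// static type `cls`, or "" if the call has to stay virtual.
std::string DevirtualizeTarget(const Hierarchy& h, const std::string& cls,
                               const std::string& method) {
  std::string target;
  for (const auto& sub : SubtypesOf(h, cls)) {
    const std::string impl = ResolveImpl(h, sub, method);
    if (impl.empty()) return "";
    if (target.empty()) {
      target = impl;
    } else if (target != impl) {
      return "";  // at least two different implementations are possible
    }
  }
  return target;
}
```

With a Shape class that declares area and name, and two subclasses that override only area, a call to name on a Shape receiver resolves to a single target, while a call to area does not. This is also why the hierarchy has to be stable first: a subclass finalized later could add an override and invalidate the answer.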

Constructors are precompiled first to compute information, such as the optimized instruction count, that is later used when inlining functions.

Next, stub code is generated: StubCode::InterpretCall obtains the code together with its object pool, and methods such as StubCode::Build produce a series of stubs whose results are stored in object_store. The names of dynamically invoked functions are then collected. The AddRoots() method adds the starting points of dispatch and of calls from C++ to the roots, and the AddAnnotatedRoots() method adds everything marked with @pragma('vm:entry-point') to the roots.

After that, compilation proper begins, and Iterate() is its core. Starting from the roots found above, it traverses the targets that are added along the way as each function is compiled.

Inside this method, the main call chain is as follows:

ProcessFunction ==> CompileFunction ==> PrecompileFunctionHelper ==> PrecompileParsedFunctionHelper.Compile

Once compilation is complete, the Tree-Shaking phase begins and prunes the useless code.
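
The root-driven traversal described above can be pictured as a worklist over a call graph. The following is a minimal sketch under that assumption, not the precompiler's actual code. CallGraph and Reachable are hypothetical names, and the roots stand in for what AddRoots() and AddAnnotatedRoots() collect.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: function name -> names of the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns every function reachable from `roots`. Anything that is not in the
// returned set was never reached and is a candidate for being dropped.
std::set<std::string> Reachable(const CallGraph& graph,
                                const std::vector<std::string>& roots) {
  assert(!roots.empty());  // without roots nothing would be retained
  std::set<std::string> seen;
  std::deque<std::string> worklist;
  for (const auto& root : roots) {
    if (seen.insert(root).second) worklist.push_back(root);
  }
  while (!worklist.empty()) {
    const std::string fn = worklist.front();
    worklist.pop_front();
    auto it = graph.find(fn);
    if (it == graph.end()) continue;  // no recorded callees
    for (const auto& callee : it->second) {
      // insert() reports whether the callee is new, so each function is
      // visited at most once even when the graph contains cycles.
      if (seen.insert(callee).second) worklist.push_back(callee);
    }
  }
  return seen;
}
```

If main calls build and log, build calls layout, and _unused calls log, then starting from the root main the traversal reaches main, build, log, and layout, but never _unused. A function that is only invoked from native code has no caller in the graph, which is why such entry points must be registered as roots explicitly.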

Tree shaking stage

The compilation process above has already produced call information for functions, classes, and so on. Based on this information, the compiler can tell which code is unnecessary. We take the handling of Function as an example:

  • TraceForRetainedFunctions();

In this method, after handles for Library, Class, and so on are obtained, the code is processed one Library at a time, and the Functions of every class are traversed.

The AddTypesOf(const Function& function) method adds each function that is called to the functions_to_retain_ pool. It also reads the types that appear in the function and, through the AddType method, adds them to the typeargs_to_retain_ and types_to_retain_ pools, which later drive the Tree-Shaking of type information (in DropTypeArguments and DropTypeParameters respectively).

Class information is handled by the overload of the same name, AddTypesOf(const Class& cls). The process is very similar, so we do not repeat it here; interested readers can look at the source.
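
The retain pools can be pictured with the following minimal sketch. It is an illustration under simplifying assumptions, not the VM's code: FunctionInfo, RetainSets, and TraceRetained are hypothetical, the `reached` flag stands in for whatever criteria the precompiler really uses to decide that a function is kept, and a single set of type names replaces the separate type and type-argument pools.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the VM's Function object.
struct FunctionInfo {
  std::string name;
  std::vector<std::string> signature_types;  // parameter and result types
  bool reached;  // assumption: set when the compile stage reached the function
};

// Simplified counterpart of the functions_to_retain_ / types_to_retain_ pools.
struct RetainSets {
  std::set<std::string> functions;
  std::set<std::string> types;
};

// Records a retained function together with the types its signature mentions.
void AddTypesOf(const FunctionInfo& fn, RetainSets* sets) {
  assert(sets != nullptr);
  if (!sets->functions.insert(fn.name).second) return;  // already recorded
  for (const auto& type : fn.signature_types) sets->types.insert(type);
}

// Walks all functions and keeps only those that were reached.
RetainSets TraceRetained(const std::vector<FunctionInfo>& all_functions) {
  RetainSets sets;
  for (const auto& fn : all_functions) {
    if (fn.reached) AddTypesOf(fn, &sets);
  }
  return sets;
}
```

A type that is mentioned only by unreached functions never enters the type set, so the later Drop steps can remove it along with those functions.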

  • FinalizeDispatchTable();

This method builds the entries used to serialize the dispatch table before any Drop method runs, because the compiler may later clear references to Code objects. It also deletes the dispatch table generator, so that no new entries can be added after this point.

  • ReplaceFunctionStaticCallEntries();

This method declares a local visitor class, StaticCallTableEntryFixer, and uses it to replace the entries of static function calls.

  • Drop

Next, a series of Drop methods is executed. These methods remove redundant functions, fields, classes, libraries, and so on, as listed below:

  1. DropFunctions();

  2. DropFields();

  3. DropTypes();

  4. DropTypeParameters();

  5. DropTypeArguments();

  6. DropMetadata();

  7. DropLibraryEntries();

  8. DropClasses();

  9. DropLibraries();

The specific call sequence is shown in the following figure:

Because these methods are implemented along very similar lines, we take DropFunctions, which handles Function, as an example.

The core of this method is to use the functions_to_retain_ pool mentioned above to decide whether a Function is reachable from a root. If the function object is not in the pool, it can be discarded. The remaining Functions are then written back to their Class, and the call table of the Class is updated.

A drop_function function is declared inside the method to "shake" out a Function.

After that, all the code is traversed, and the drop_function declared above is used to mark and delete the useless functions.

Write the Functions that need to be retained back to their own Class:

Regenerate the call table of the class, removing any useless Functions from it at the same time:

Finally, there are some edge cases, such as inlined functions, which we do not cover here. After the Drop phase, all code that can be dropped has been removed, and compilation enters its final stage, which reduces the size of the binary further.
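
The retain-then-drop pattern described for DropFunctions can be sketched as follows. This is a simplified illustration rather than the VM's code: ClassEntry is a hypothetical stand-in for a Class and its function list, and the `retain` set plays the role of functions_to_retain_.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a Class and the functions it owns.
struct ClassEntry {
  std::string name;
  std::vector<std::string> functions;
};

// Removes every function that is not in `retain` and writes the survivors
// back to their class. Returns the number of dropped functions.
std::size_t DropFunctions(std::vector<ClassEntry>* classes,
                          const std::set<std::string>& retain) {
  assert(classes != nullptr);
  std::size_t dropped = 0;
  for (auto& cls : *classes) {
    std::vector<std::string> kept;
    for (const auto& fn : cls.functions) {
      if (retain.count(fn) > 0) {
        kept.push_back(fn);
      } else {
        ++dropped;  // not reachable from any root
      }
    }
    cls.functions.swap(kept);  // the class now owns only retained functions
  }
  return dropped;
}
```

Building a fresh list per class and swapping it in mirrors the "write the retained Functions back to the Class" step, and it avoids erasing elements from a container while iterating over it.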

Finishing stage

After Tree-Shaking ends, compilation moves on to its finishing work, which includes code obfuscation, garbage collection, and so on.

The Dedup method deserves attention; its key code is as follows:

This method performs a large amount of data deduplication. In AOT mode, the binder runs after Tree Shaking, by which time all targets have been compiled, so the binder replaces all static calls with direct calls to their targets, which further reduces the compiled binary. At this point all the compilation work is complete, and Tree-Shaking has fulfilled its mission.
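
The general idea of deduplication can be illustrated with a small sketch. It is an assumption-laden toy, not the Dedup method itself: DedupResult and DedupCode are hypothetical, and plain strings stand in for the compiled data that the VM canonicalizes. Functions whose compiled bodies are identical end up sharing a single stored copy.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: each function points at one shared copy of its code.
struct DedupResult {
  std::map<std::string, std::size_t> code_index;  // function -> index in unique_code
  std::vector<std::string> unique_code;           // one entry per distinct body
};

// Canonicalizes identical code bodies so that each one is stored only once.
DedupResult DedupCode(const std::map<std::string, std::string>& code_by_function) {
  DedupResult result;
  std::map<std::string, std::size_t> canonical;  // code body -> index in unique_code
  for (const auto& entry : code_by_function) {
    auto it = canonical.find(entry.second);
    if (it == canonical.end()) {
      // First time this body is seen: store it and remember its index.
      it = canonical.insert(
          std::make_pair(entry.second, result.unique_code.size())).first;
      result.unique_code.push_back(entry.second);
    }
    result.code_index[entry.first] = it->second;
  }
  assert(result.unique_code.size() <= code_by_function.size());
  return result;
}
```

Two trivial getters that compile to the same instruction sequence would map to the same index here, so only one copy of that sequence needs to be written to the output.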

Extension

In Flutter 1.20, icon fonts that are not used in the project are also removed through a Tree-Shaking mechanism, which reduces the package size further (by about 100 KB). However, this is not implemented in the compilation phase described above; it is implemented in the build_system, as an optimization of the assets. The relevant PR can be viewed at github.com/flutter/flutter/pull/49737.

Summary

This article used the Flutter Engine source code to explore how Tree-Shaking operates, starting from the compilation stage. The existence of this mechanism provides a theoretical basis for engineering decoupling and makes integrated engineering easier to achieve. It also gives us ideas for optimizing the package size further.

Origin blog.csdn.net/weixin_38912070/article/details/109505605