Douyin Swift Compilation Optimization: 60% Faster Compilation with a Custom Toolchain

This article focuses on the dependency-resolution bottleneck introduced by full modularization, mainly in incremental compilation and header file parsing.

The optimization is built on the Swift toolchain source code. This article does not cover the basic concepts or configuration of a toolchain; it focuses only on the solution itself.

Background

As more and more business scenarios are implemented with mixed OC and Swift code, performance pain points have begun to appear in development. The problems are clearly concentrated in changes to header files of the OC repositories that the Swift environment depends on. The infrastructure architecture team therefore focused its performance analysis on interface-layer dependencies in order to remove the bottleneck.

Using custom toolchain capabilities, Douyin's basic technology team trims specific content from the generated Clang header through custom compilation parameters, ultimately achieving a 60% increase in compilation speed.

The solution went live at the end of November 2022 and has been running stably on Douyin for nearly five months. Let us review the whole process, from proposal to rollout.

Initial analysis

In a mixed-compilation scenario, to keep OC and Swift as interoperable as possible, modularization cannot be enabled for Swift compilation alone. The Clang header exported by Swift compilation takes the form $(project_name)-Swift.h, and it re-exports the OC dependencies it references as module imports. This means that if OC compilation does not enable modularization, the header provided by Swift cannot be used correctly.

1b10a66a78fa7e54e8cc3b92279caf0b.png

As shown in the figure, you cannot have it both ways. To be able to parse the statement @import A;, Objc Pod D has to introduce A.modulemap, so its interoperation with A can no longer rely on textual import and switches to modules entirely.

For Douyin, the historical burden of transitive dependencies across the huge number of header files in giant OC projects makes introducing modularization into OC compilation a disaster. In a modular environment, the cache system takes longer to decide whether the .o cache is hit than a recompile takes in a textual environment. Incremental compilation also triggers extensive module recompilation, so changing a single header file means waiting several minutes.

Transitive dependency management is a long-term project, but compilation optimization can't wait that long, and we need a quick solution.

Optimization effect

Before introducing the solution, here are the results.

We tested on the mixed OC and Swift repository with the largest code volume in the Douyin project:

  • OC incremental compilation: after modifying an OC interface-layer header file that Swift depends on, compilation time is reduced by 60%

  • Swift incremental compilation: after modifying a Swift public class that OC depends on, compilation time is essentially unchanged

  • Full compilation: after clearing the local compilation cache for a clean build, compilation time is reduced by 17%

The solution clearly improves compilation speed. Next, let us review the whole process from pre-research to launch.

How the solution works

The key to solving the problem is to reduce the time spent precompiling OC header files. There are two approaches:

  • Long-term: the root cause of slow module parsing is transitive dependencies. Because of how modules work, transitive dependencies between header files that belong to different modules greatly enlarge the set of modules affected by an incremental recompile. The business libraries already strictly control interface-layer transitive dependencies under the existing engineering architecture, so the long-term plan is to gradually clean up transitive dependencies in the basic libraries.

  • Short-term: convert OC header precompilation back to textual import, that is, remove the injected -fmodule-map-file flags, while still supporting OC calls into Swift code

22170fccac99ea80864c9442c5a5c88b.png

Swift declares the C/OC modules used by its own interface layer (that is, public/open declarations) and emits them in xxx-Swift.h in the form @import aaa. This requires those modules to be visible to the OC side as well whenever it uses that header. To achieve our goal, these declarations need to be trimmed, which requires a custom toolchain.

The results reported in this article measure the short-term approach.

By modifying the compiler, we trim the Clang header interface generated by Swift compilation, delete every @import other than those of system libraries, and manually add the missing dependencies wherever the OC side references the header. In other words, at the cost of temporarily giving up a self-contained interface, the OC side no longer needs to care about modules. To support finer-grained control, a compilation parameter is injected into the compiler to enable the feature per component and to specify exactly what is trimmed.

Trimming -fmodule-map-file is relatively easy: modifying what is injected into OTHER_CFLAGS is enough to turn it off.
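As a rough illustration of that flag trimming, the sketch below filters module-map flags out of an OTHER_CFLAGS value. It assumes the flags use Clang's joined form -fmodule-map-file=<path> and that the value can be tokenized with shell-like rules. The function name and the sample path are hypothetical, and the real change would live in whatever generates the build settings.

```python
import shlex
from typing import List

MODULE_MAP_PREFIX = "-fmodule-map-file="

def strip_module_map_flags(other_cflags: str) -> List[str]:
    """Return the OTHER_CFLAGS tokens with every module-map flag removed."""
    tokens = shlex.split(other_cflags)
    return [tok for tok in tokens if not tok.startswith(MODULE_MAP_PREFIX)]

# Hypothetical build-setting value; the path is made up for the example.
flags = strip_module_map_flags(
    '$(inherited) -fmodule-map-file="${PODS_ROOT}/Headers/Public/A/A.modulemap" -DDEBUG=1'
)
print(flags)
```

Returning a token list rather than a re-joined string avoids re-quoting build-setting variables such as $(inherited).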

Pre-research

Breaking down the solution

Let us first break the solution into parts. This makes the dependencies between the parts clear and saves time in the pre-research stage.

A solution that touches the toolchain must be stable, so it must be possible to switch it on and off externally in a simple way.

From a release perspective, a toolchain cannot be shipped as flexibly as business code or as configuration stored in the development repository. The toolchain code should therefore be kept as stable as possible and modified only when necessary.

Based on these two principles, we can decompose the work into:

1. Analyze the argument-parsing mechanism of swiftc and append a new custom parameter to the argument list at compile time to control the trimming. swiftc is an entry point to swift-frontend. As described below, the arguments passed to swiftc do not always reach every swift-frontend subtask as the same complete set, so how they take effect needs further analysis.

2. For fine-grained control, the parameter passes in a configuration file containing a whitelist that determines which @import Module statements are kept. We also considered a blacklist, but real project dependencies are complex. Both Cocoapods and seer can describe only project-level dependencies and cannot guarantee the actual compile-time dependencies, so a comprehensive blacklist of business modules is hard to build. A whitelist of system libraries is relatively fixed and does not need frequent maintenance.

3. Find the function that generates -Swift.h and the logic that writes @import Module; so that it can be trimmed.

4. Load the whitelist file and filter at the point where the imports are written.

5. Pass local verification, verify that the toolchain can be delivered without developers noticing, and release a test toolchain.

6. Gray-release verification.

7. Merge the code and release the official version.

Quick verification

To verify that the direction was right, and to give confidence to the business engineers suffering from slow compilation, we needed to find the most critical point and verify it quickly.

We therefore decided to directly turn off all the logic that generates @import Module; in -Swift.h. At that point our understanding of the Swift source code as a whole was still vague, but we only needed to find << "@import" or similar file-writing logic and filter it. Fortunately, this did not take long.

a91ec8c64ea7cbd876825dafdbe11b93.png

We quickly found this logic, commented out << "@import " << Name.str() << ";\n";, packaged the toolchain, and verified it successfully. This produced the data reported at the beginning of this article and reassured the business engineers.

Next, we could work through the remaining tasks steadily, step by step.

Development and debugging

Swift-frontend parameter parsing process

We turned our attention to the native parameters that take effect at the frontend level and used them as a reference. We soon settled on module-cache-path: it is a required parameter for Swift frontend compilation that specifies the location of the module cache and is followed by a path, which matches our requirements exactly.

By analyzing this parameter, we can trace the argument-parsing flow of the -frontend stage. We will not expand on the investigation itself and will simply walk through the flow.

daf2808c8bad11f267229ddb89b45e46.png

The simplified flow is shown in the figure above. The code locations modified in the argument-parsing flow are detailed below.

Definition

40f8429627bf8a4f02cf20f906ede745.png

The definitions use TableGen (https://llvm.org/docs/TableGen/), a rather Python-like language introduced by LLVM. Of its flags, the ones we need are:

  • FrontendOption: marks a frontend parameter. Only parameters with this flag enter the frontend argument-parsing flow, and Clang header generation happens in the frontend

  • ArgumentIsPath: marks the parameter as a path, telling the compiler that a path string follows this parameter as its argument

We defined our custom parameter on the same pattern:

20fade1b45c2bdf07889d9aca9cd4b21.png

The second definition, the one with the EQ suffix, is just an alias. It allows the parameter to be passed in the "flag=arg" form and has no other effect.
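To make the two spellings concrete, here is a small model of how a path-valued option and its "=" alias resolve to the same value. The option name -clang-header-allowlist-path is hypothetical (the real name appears only in the figure above), and this is plain argument scanning written for illustration, not the LLVM option parser that swift-frontend actually uses.

```python
from typing import List, Optional

# Hypothetical option name, used only for illustration.
FLAG = "-clang-header-allowlist-path"

def parse_allowlist_path(args: List[str]) -> Optional[str]:
    """Return the path given as `FLAG <path>` or `FLAG=<path>`, else None."""
    path = None
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == FLAG:
            # Separate form: the path is the next argument.
            if i + 1 >= len(args):
                raise ValueError(f"missing path after {FLAG}")
            path = args[i + 1]
            i += 2
            continue
        if arg.startswith(FLAG + "="):
            # Alias form: flag and path joined by "=".
            path = arg[len(FLAG) + 1:]
        i += 1
    return path

print(parse_allowlist_path(["-frontend", FLAG, "allow.txt"]))    # allow.txt
print(parse_allowlist_path(["-frontend", FLAG + "=allow.txt"]))  # allow.txt
```

In this sketch, a later occurrence of the option overrides an earlier one; that choice is an assumption of the example.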

The tablegen tool generates Options.inc from the contents of Options.td, as shown below

2638627fdbfc331ff2d90f4f5e1ec294.png

Together with the OPTION definition in Options.h in the Swift source code, the generated entries are included and made available to the C++ code

23c342540db30723734f8181224df75a.png

Parsing

Parsing takes place in the argument-parsing flow of CompilerInvocation

93075e3552ef75a2b213fa25185bb624.png

In ArgsToFrontendOptionsConverter, the required information is read from the argument list and assigned to Opts

d9ae6511f93bbaa3c478462c1953c101.png

Opts is an instance of the FrontendOptions type. We need to define a string member there to store the parameter we need

d871bac9ea67a7b08900842a11bdac08.png

Opts flows through the entire frontend process, providing the necessary parameters to each stage.

Clang Header generation process

The call flow is shown in the chart below. PrintAsClang is a relatively independent module, and we only need to pay attention to the two steps marked in red.

ea20cf938599f3e2a1408d4fb3062d3a.png

Add input parameters

Add two parameters to the original method definition: the whitelist file path we passed in, and the diagnostics object. The diagnostics object is covered later; it is used to report some custom errors.

32d941b1f589164c0ff43e21e6790518.png

The same applies here: add the two parameter definitions.

d48bac1b898e6bf585cd9262ce0b4ca4.png

Whitelist parsing

printAsClangHeader is one of our main modification points. Here, inside a function_ref, we parse the contents of the file that the allow list path points to, obtain the module names specified by the whitelist, and pass them to the next step as a parameter.
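A minimal sketch of that parsing step follows. The file format is an assumption (one module name per line, with blank lines and # comments ignored), since the real format is visible only in a figure, and parse_allowlist is a hypothetical name.

```python
from typing import Set

def parse_allowlist(text: str) -> Set[str]:
    """Collect module names from allow-list text, one name per line (assumed format)."""
    modules = set()
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if line:
            modules.add(line)
    return modules

allowed = parse_allowlist("Foundation\nUIKit  # system UI\n\n# comment only\n")
print(sorted(allowed))  # ['Foundation', 'UIKit']
```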

d1c157b076b2303d1f66c4644770b36c.png

The writeImports method gains a function_ref parameter in addition to its original ones. It can be thought of as a lambda expression, and it carries the whitelist parsing just described.

09c27b581abbae4c274d1a1ca9db655f.png

At the exact place where @import Module; is written, apply the whitelist filter: modules in the whitelist are written, and all others are skipped.
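The filtering at the write site can be modeled as below. This is a Python stand-in for the C++ change, not the compiler code itself: write_imports, is_allowed, and the module name BusinessKit are hypothetical, and the callable plays the role of the function_ref passed into writeImports.

```python
import io
from typing import Callable, Iterable, TextIO

def write_imports(out: TextIO, modules: Iterable[str],
                  is_allowed: Callable[[str], bool]) -> None:
    """Write `@import Name;` only for modules accepted by the predicate."""
    for name in modules:
        if not is_allowed(name):
            continue  # Not in the whitelist: skip the import.
        out.write(f"@import {name};\n")

allowlist = {"Foundation", "UIKit"}
buffer = io.StringIO()
write_imports(buffer, ["Foundation", "BusinessKit", "UIKit"],
              lambda name: name in allowlist)
print(buffer.getvalue(), end="")
```

Passing the check as a callable keeps the writer unaware of where the whitelist comes from, which mirrors why a function_ref is a convenient seam for this kind of change.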

d8830d380a6d21888be5002fe484b0ee.png

Custom diagnostic information

Two custom entries are added to DiagnosticsClangImporter.def: an error, which reports parsing failures, and a note, which only reports that the whitelist is empty. An empty whitelist is allowed, and in that case the behavior falls back to the default logic.

6e64d4afb24fd8ee9e1a8a916930757c.png

Since the Diags instance was passed in through the method definition earlier, reporting a message only takes a simple call. A note is only written to the log, whereas an error interrupts compilation.
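The difference between the two severities can be sketched as follows. Everything here is a simplified model under assumptions: the Diagnostics class, the message texts, and load_allowlist are hypothetical, and the real implementation reports through the compiler's own diagnostic engine.

```python
from typing import List, Optional, Set

class Diagnostics:
    """Toy diagnostic sink: notes are logged, errors also mark the build as failed."""
    def __init__(self) -> None:
        self.log: List[str] = []
        self.had_error = False

    def note(self, message: str) -> None:
        self.log.append(f"note: {message}")

    def error(self, message: str) -> None:
        self.log.append(f"error: {message}")
        self.had_error = True

def load_allowlist(path: str, diags: Diagnostics) -> Optional[Set[str]]:
    """Return allowed module names, or None when no filtering should be applied."""
    try:
        with open(path, encoding="utf-8") as handle:
            names = {line.strip() for line in handle if line.strip()}
    except (OSError, UnicodeDecodeError) as exc:
        # Unreadable input is an error: the caller should stop the build.
        diags.error(f"cannot read allow list '{path}': {exc}")
        return None
    if not names:
        # An empty list is allowed: log a note and fall back to default behavior.
        diags.note(f"allow list '{path}' is empty; keeping all imports")
        return None
    return names
```

A caller would check diags.had_error after loading and abort only in that case, so an empty file changes nothing beyond one log line.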

7490e57486bdf4e6f66ec922541df50e.png

Verification and launch

Build the test toolchain on a cloud build machine, download it locally, integrate it into Xcode, and verify it on Douyin.

Add the custom parameter to the compilation parameters of the specified mixed components, and the build succeeds.

Postscript

Swift toolchain customization is a direction with unlimited possibilities. It covers compilation optimization and other efficiency work, and it enables deep optimization at the bottom layer that is hard to achieve at the traditional architecture layer. There is much more follow-up work to do here, and we expect to have more experience to share.

Origin blog.csdn.net/ByteDanceTech/article/details/130120509