The first installment of Wang Zhongyang Go's book giveaway: "TVM Compiler Principles and Practice"

Preface

With the rapid development of artificial intelligence, demand in fields such as computer vision, natural language processing, and speech recognition keeps growing. To meet this demand, many deep learning frameworks have been developed, and among the supporting tools TVM (Tensor Virtual Machine) stands out: an excellent compiler that can compile deep learning models into efficient machine code. The core idea of the TVM compiler is to convert a deep learning model into an efficient computation graph and to optimize the computing nodes within that graph. This greatly reduces the model's running time and also improves its power efficiency. The implementation of the TVM compiler can be divided into three main parts: the front end, the middle layer, and the back end.
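To make "optimizing the computing nodes" concrete, here is a minimal sketch using TVM's tensor expression (te) API. It is not an example from the book: it declares a vector addition, creates a default schedule for it, and prints the lowered loop nest that scheduling decisions would transform. It assumes a TVM release that still ships the classic te schedule API (present throughout the 0.1x series).

```python
import tvm
from tvm import te

# Declare a symbolic computation: C[i] = A[i] + B[i].
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.placeholder((n,), name="B")
C = te.compute((n,), lambda i: A[i] + B[i], name="C")

# A schedule describes how the computation runs (loop order, tiling,
# vectorization, ...); this is where node-level optimization happens.
s = te.create_schedule(C.op)

# Inspect the loop nest TVM would generate for this schedule.
print(tvm.lower(s, [A, B, C], simple_mode=True))
```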

Implementation process of the TVM compiler

The implementation process of the TVM compiler has three core stages. First, the front end converts a model from a deep learning framework into an abstract computation graph. During this conversion, the front end can perform preprocessing based on the structure and characteristics of the model, such as graph optimization and pruning. Next, the middle layer receives the computation graph from the front end and applies a series of optimizations, including graph transformation, graph pruning, data layout, and memory optimization. Finally, the optimized computation graph is passed to the back end, which generates efficient machine code tailored to the target hardware.

Putting the TVM compiler into practice means pairing it with a specific deep learning framework and hardware platform. First, we choose a suitable deep learning framework and develop and train a model in it. Then we use the front-end interface provided by TVM to convert the model into a computation graph and apply a series of optimizations. Finally, we choose a suitable back end: TVM supports a variety of hardware platforms, including CPUs, GPUs, and FPGAs, and based on the characteristics of the target platform we use TVM's back-end interface to generate efficient machine code, then test and tune its performance. A minimal sketch of this flow follows.
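The sketch below walks through that front end, middle layer, and back end flow for a PyTorch model, assuming TVM (with the Relay front end), PyTorch, and torchvision are installed. The ResNet-18 model, the input name "input0", and the input shape are illustrative choices, not taken from the book.

```python
import torch
import torchvision
import tvm
from tvm import relay

# Front end: trace a PyTorch model and import it as a Relay computation graph.
model = torchvision.models.resnet18().eval()
example_input = torch.randn(1, 3, 224, 224)
scripted_model = torch.jit.trace(model, example_input)
mod, params = relay.frontend.from_pytorch(
    scripted_model, [("input0", (1, 3, 224, 224))]
)

# Middle layer and back end: optimize the graph (operator fusion, constant
# folding, etc. at opt_level=3) and generate code for a generic CPU target.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)
```

Changing the target string (for example to "cuda") retargets the same graph to a different back end without touching the model definition.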

In practice, the TVM compiler has many advantages. It can be tuned for a specific hardware platform, making full use of that hardware's computing power. It provides rich optimization functionality and allows flexible optimization of the computation graph, which effectively improves a model's running efficiency. It also supports a variety of deep learning frameworks and programming languages, which makes it convenient for developers, and its learning curve is gentle, so developers can quickly get started compiling and optimizing models. One way to see this flexibility is sketched below.
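As a hedged illustration of that flexibility, the snippet below composes a few graph-level Relay passes explicitly rather than relying on the defaults bundled into opt_level=3. It assumes the `mod` imported in the previous sketch; the pass names are standard Relay transforms.

```python
import tvm
from tvm import relay

# Build an explicit pipeline of graph-level optimizations.
seq = tvm.transform.Sequential([
    relay.transform.SimplifyInference(),        # e.g. fold BatchNorm for inference
    relay.transform.FoldConstant(),             # evaluate constant subgraphs
    relay.transform.FuseOps(fuse_opt_level=2),  # fuse neighboring operators
])

with tvm.transform.PassContext(opt_level=3):
    mod = seq(mod)

print(mod)  # inspect the optimized Relay IR
```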

About "TVM Compiler Principles and Practice"

Next, I would like to recommend an essential deep learning book: one about the principles and practice of the TVM compiler. The details are as follows.


Editor's Choice

Target audience: engineers and technical personnel working on AI algorithms, software, AI chips, and compiler development

Artificial intelligence (AI) is now used widely throughout the global information industry, and deep learning frameworks such as TensorFlow, PyTorch, MXNet, and Caffe have driven the AI technology revolution. Most existing frameworks, however, are optimized only for a narrow range of server-class GPUs, so deploying them to other platforms, such as automobiles, mobile phones, IoT devices, and specialized accelerators (FPGAs, ASICs), requires substantial optimization effort. As the number of deep learning models and hardware back ends grows, TVM offers a unified solution built on an intermediate representation (IR). TVM not only optimizes deep learning models automatically but also provides an efficient, cross-platform open source deployment framework. With large models growing in popularity, TVM is a good bridge between artificial intelligence theory and algorithm frameworks on one side and practical engineering on the other. This book should therefore appeal to a wide readership.

Brief Introduction

TVM (Tensor Virtual Machine) is an open source model compilation framework that automatically compiles machine learning models into machine code executable by the underlying hardware, thereby making use of many kinds of computing power. It works by first optimizing the deep learning model's inference, memory management, and thread scheduling, and then using the LLVM framework to deploy the model on hardware such as CPUs, GPUs, FPGAs, and ARM devices.
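As a sketch of what such deployment looks like in code, the snippet below runs a module compiled as in the earlier example; it assumes `lib` came from relay.build with target="llvm" and reuses the illustrative input name "input0".

```python
import numpy as np
import tvm
from tvm.contrib import graph_executor

# Load the compiled module onto a device (use tvm.cuda(0) for a GPU build).
dev = tvm.cpu(0)
module = graph_executor.GraphModule(lib["default"](dev))

# Feed an input, run inference, and read the result back as a NumPy array.
module.set_input("input0", np.random.rand(1, 3, 224, 224).astype("float32"))
module.run()
out = module.get_output(0).numpy()
print(out.shape)  # (1, 1000) for the illustrative ResNet-18
```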

This book comprehensively analyzes the main functions of TVM, helping readers understand how TVM works and how to use it to optimize and deploy deep learning and machine learning models.

This book draws on the author's many years of work and study, striving to explain TVM in detail by integrating basic theory with case practice. It has 9 chapters, covering basic knowledge of TVM, developing with TVM, operator fusion and graph optimization, TVM quantization technology, TVM optimization scheduling, Relay IR, code generation, back-end deployment and OpenCL (Open Computing Language), and automatic scheduling, automatic search, and cost models. In addition to key knowledge points and practical skills, each chapter includes carefully selected, typical cases.

This book is suitable for engineers, researchers, and technical managers working on AI algorithms, software, compiler development, or hardware development, and can also serve as a reference for university teachers and students in compiler-related courses.

About the Author

Wu Jianming received his PhD in pattern recognition and intelligent systems from Shanghai Jiao Tong University. He has long been engaged in artificial intelligence chip design and is especially strong in theoretical research and technological innovation in TVM/LLVM compilers, AI frameworks, autonomous driving, chip manufacturing, and embedded systems. He has worked on the front line for many years, from product design to code implementation, and has led or participated in the development of more than 30 products. He has also taken part in projects funded by the National Natural Science Foundation of China and the Shanghai Municipal Science and Technology Commission, and has published 8 papers in core journals, 6 of them as first author.

Book catalog

Chapter 1 Basic Knowledge of TVM
1.1 Basic principles of TVM
1.1.1 TVM overview
1.1.2 Overview of TVM model optimization and deployment
1.2 The TVM compilation process
1.2.1 Compilation process
1.2.2 TVM compilation data structures
1.2.3 TVM compilation data processing
1.2.4 The TVM Pass process
1.3 Logical architecture of the TVM open source project
1.3.1 Codebase structure
1.3.2 Automatic code kernel
1.4 TVM application support
1.4.1 The TVM workflow
1.4.2 Multi-language and multi-platform support
1.4.3 TVM application scenarios
1.4.4 Optimizing model inference with TVM
1.4.5 TVM compiler and runtime components
1.4.6 Main modules of the TVM runtime
1.4.7 A simple TVM code generation and compilation example
1.4.8 Relationships between TVM modules
1.5 TVM features and challenges
1.5.1 TVM features
1.5.2 Support for multiple back-end devices
1.5.3 Challenges faced by TVM

Chapter 2 Developing with TVM
2.1 Configuring the TVM environment
2.1.1 Downloading the Apache TVM source code
2.1.2 Configuring the TVM development environment
2.1.3 Using the TVM conda environment
2.1.4 Compilation and implementation
2.1.5 Model import methods
2.2 Compiling and optimizing a TVM YOLOv3 example in the conda environment
2.3 The calling relationship between Python and C++
2.3.1 Underlying C++ data structures in TVM
2.3.2 Function registration
2.3.3 Upper-layer Python calls
2.4 TVM custom code examples
2.4.1 How to add code to TVM
2.4.2 A TVM code generation implementation example
2.5 Implementing a complete algorithm flow with TVM
2.5.1 Configuring tensors and creating schedules
2.5.2 Lowering and optimizing operators
2.5.3 Building the host target program
2.5.4 Implementing back-end code generation

Chapter 3 Operator Fusion and Graph Optimization
3.1 Overview of operators
3.1.1 A TVM fusion component example
3.1.2 Optimizing the computation graph
3.2 Graph GCN fusion
3.2.1 The concept of a graph
3.2.2 New features of deep learning
3.3 A graph fusion GCN example
3.3.1 A PyTorch implementation of GCN
3.3.2 Fusing BN and Conv layers
3.4 TVM graph optimization and operator fusion
3.4.1 Graph and operator optimization
3.4.2 Custom operators
3.4.3 Operator fusion steps
3.4.4 Adding an operator to Relay
3.5 End-to-end optimization
3.5.1 Overview of AI frameworks
3.5.2 The computation graph optimization layer
3.5.3 Four methods of TVM operator fusion
3.5.4 Data layout conversion
3.5.5 The tensor expression language
3.5.6 Schedule space analysis
3.6 Analysis of TVM graph optimization and operator fusion schemes
3.6.1 Graph optimization framework analysis
3.6.2 Basic analysis of TVM optimization
3.6.3 TVM optimization parameters
3.6.4 Operator optimization diagrams
3.6.5 Custom graph-level optimization
3.7 Dominator tree techniques
3.7.1 Overview of dominator trees
3.7.2 Operator fusion schemes and examples
3.8 Control flow and the optimizer
3.8.1 Control flow
3.8.2 The optimizer
3.9 TVM storage and scheduling
3.9.1 TVM compiler optimization
3.9.2 Basic optimization of the graph structure
3.9.3 Tensor computation
3.10 The versatile tensor accelerator VTA
3.10.1 The VTA-TVM hardware-software stack
3.10.2 Main functions of VTA
3.10.3 A VTA example
3.10.4 The VTA compute module
3.10.5 VTA control
3.10.6 The microTVM model
3.11 TVM codebase structure and examples
3.11.1 Codebase structure
3.11.2 A tensor addition example
3.12 Host-driven execution
3.12.1 Firmware binaries
3.12.2 Computation declaration
3.12.3 Data tiling
3.12.4 The convolution operation
3.12.5 Spatial padding

Chapter 4 TVM Quantization Technology
4.1 Overview of TVM quantization
4.1.1 The state of TVM quantization
4.1.2 TVM quantization principles
4.2 int8 quantization and TVM execution
4.2.1 Two main quantization schemes
4.2.2 Analysis of int8 quantization principles
4.2.3 KL divergence calculation
4.2.4 Implementing int8 quantization
4.3 Low-precision training and inference
4.4 NN quantization
4.4.1 Overview of neural network quantization
4.4.2 Optimizing data and networks
4.4.3 Forward inference and backpropagation
4.5 An entropy calibration example
4.6 The TVM quantization process
4.6.1 Two parallel quantizations in Relay
4.6.2 The Relay optimization Pass method
4.6.3 Hardware description for quantization processing
4.6.4 Threshold estimation schemes
4.6.5 Simulated quantization error
4.6.6 Scale calculation
4.6.7 Data type assignment
4.6.8 Data type assignment logs
4.6.9 Low-precision quantization of neural networks
4.7 TVM quantization program analysis

Chapter 5 TVM Optimization Scheduling
5.1 The TVM runtime system
5.1.1 The TVM runtime system framework
5.1.2 PackedFunc compilation and deployment
5.1.3 Building a PackedFunc module
5.1.4 Remote deployment methods
5.1.5 TVM object and compiler analysis
5.2 Automatic differentiation with static and dynamic graphs
5.2.1 Computation graph classification
5.2.2 A dynamic graph implementation example
5.3 Automatic differentiation for machine learning
5.3.1 Differentiation methods
5.3.2 Manual differentiation
5.3.3 Numerical differentiation
5.3.4 Symbolic differentiation
5.3.5 Automatic differentiation
5.3.6 An automatic differentiation implementation example
5.4 Sparse matrix analysis
5.4.1 The sparse matrix concept
5.4.2 Sparse matrix optimization
5.4.3 Compressed storage for specific matrices
5.4.4 A sparse matrix implementation example
5.5 TVM tensor computation analysis
5.5.1 Generating tensor operations
5.5.2 Nested parallelism and cooperation
5.5.3 Tensorized computation
5.5.4 Explicit memory latency hiding

Chapter 6 Relay IR
6.1 Introduction to TVM data
6.1.1 Introduction to the TVM module framework
6.1.2 Introduction to Relay IR principles
6.1.3 Building a computation graph
6.1.4 let bindings and scope
6.2 IR code generation
6.2.1 Front-end optimization
6.2.2 Node optimization
6.2.3 Algebraic optimization
6.2.4 Dataflow-level optimization
6.3 Registering operators in Relay
6.3.1 Adding nodes and defining compilation parameters
6.3.2 Analysis of operation type relations
6.3.3 RELAY_REGISTER_OP macro registration in C++
6.3.4 Operator registration and scheduling
6.3.5 Registration function API analysis
6.3.6 Wrapping the Python API
6.3.7 Unit test analysis
6.4 IR examples in TVM
6.4.1 IRModule technical analysis
6.4.2 TVM Runtime analysis
6.4.3 Predictive deployment implementation
6.4.4 Dynamic graph implementation

Foreword to the book

Artificial intelligence (AI) is now used widely throughout the global information industry, and deep learning frameworks such as TensorFlow, PyTorch, MXNet, and Caffe have driven the AI technology revolution. Most existing frameworks, however, are optimized only for a narrow range of server-class GPUs, so deploying them to other platforms, such as automobiles, mobile phones, IoT devices, and specialized accelerators (FPGAs, ASICs), requires substantial optimization effort. As the number of deep learning models and hardware back ends grows, TVM builds a unified solution based on an intermediate representation (IR). TVM not only optimizes deep learning models automatically but also provides an efficient, cross-platform open source deployment framework.

With the help of TVM, deep learning models can run on mobile phones, embedded devices, and even browsers with very little extra customization. TVM also provides a unified optimization framework for deep learning computation on multiple hardware platforms, including dedicated accelerators with independently developed computing primitives. It is a deep learning compiler that everyone can use to learn and develop with an open source framework anytime, anywhere. A diverse community has formed around TVM; its members, including hardware vendors, compiler engineers, and machine learning researchers, work together to build a unified programmable software stack that enriches the whole machine learning technology ecosystem.

TVM is a new type of AI compiler that is widely used in product development and is influential in both industry and academic research. However, there are currently very few books about TVM on the market, and this book attempts to fill that gap. Its features can be summarized as follows:

First, it starts from the concept of TVM and analyzes TVM's basic principles and key supporting technologies.

Second, it proceeds step by step from setting up the TVM environment to case practice, showing how to use TVM in real development.

Third, it introduces TVM's key technologies, such as operator and graph fusion, quantization, Relay IR (intermediate representation), optimization scheduling, and compilation and deployment, analyzing both the theory and the case practice of these modules.

Fourth, it analyzes and practices TVM back-end technologies, including code generation, automatic scheduling, automatic search, and cost models.

During the writing of this book, I received the full support of my family, for which I am deeply grateful. I would also like to thank the editors at the Machinery Industry Press; thanks to their hard work and dedication, this book was published successfully. Given the author's limited ability, errors in the book are unavoidable, and readers' corrections and suggestions are welcome.

A quick overview of the book "TVM Compiler Principles and Practice"


The portal to "TVM Compiler Principles and Practice":

https://item.jd.com/13978563.html. I personally think this book is very good, especially for developers in the field of artificial intelligence; it is a rare gem worth owning and studying.

Conclusion

To sum up, the TVM compiler is an excellent deep learning model compilation tool that can optimize models into efficient machine code. Understanding its principles and practice can help us develop and optimize deep learning models quickly, improving both their running efficiency and their power efficiency. Going forward, TVM is expected to become an important tool in the field of deep learning and to make even greater contributions to the development of artificial intelligence. Friends who work in artificial intelligence, or plan to, should take the time to learn and understand the TVM compiler and keep pace with the technology to avoid being left behind.

Participate in the lottery

Add me on WeChat: wangzhongyang1993, with the note "CSDN lottery", and I will invite you to the giveaway group to take part in the drawing.
