0. Preface
- Brief introduction: the documentation has four parts.
  - User Guide: the user manual, mainly covering installation, migration, and some basic concepts
  - TensorRT API Reference
  - UFF Converter API Reference
  - GraphSurgeon API Reference
1. User Guide
- The user manual mainly covers:
  - Installation. This is essentially a direct link to the TensorRT documentation page, plus instructions for installing PyCUDA.
  - Migrating from TensorRT 4. I don't need this part.
  - Core concepts. This part matters most to me, so I will focus on it.
1.1. Core Concepts
- TensorRT workflow (three basic steps)
  - Step 1: Model parsing and network construction
    - Build a tensorrt.INetworkDefinition object.
    - It can be populated through a parser (such as the ONNX parser) or through the TensorRT Network API.
    - An empty tensorrt.INetworkDefinition can be created with tensorrt.Builder.
  - Step 2: Model optimization (engine building)
    - Use tensorrt.Builder and the populated tensorrt.INetworkDefinition to create a tensorrt.ICudaEngine.
    - The optimized engine can be serialized to memory or to a local file (.trt).
  - Step 3: Execution
    - Create a tensorrt.IExecutionContext from the tensorrt.ICudaEngine and use it to run inference.
    - Its main job seems to be holding the resources needed for a run?
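The first two steps of the workflow can be sketched roughly as follows. This is a minimal sketch, not the documentation's own example: it assumes a TensorRT 8.x-style Python API (create_builder_config and build_serialized_network; older releases used different build calls), and the function name and file paths are hypothetical.

```python
def build_serialized_engine(onnx_path, engine_path):
    """Parse an ONNX file, build an optimized engine, and save it to disk."""
    # Imported lazily so this module can be loaded on machines without TensorRT.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)

    # Step 1: create an empty network from the Builder, then fill it with a parser.
    builder = trt.Builder(logger)
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parsing failed: " + "; ".join(errors))

    # Step 2: optimize the network into a serialized engine and write it to a file.
    config = builder.create_builder_config()
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("engine build failed")
    with open(engine_path, "wb") as f:
        f.write(serialized)
    return engine_path
```

Calling build_serialized_engine("model.onnx", "model.trt") (both paths are placeholders) would leave a .trt file that can be loaded later for inference.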
- Classes overview
  - Logger: logging; nothing special to say.
  - Engine and Context: the tensorrt.ICudaEngine object and the tensorrt.IExecutionContext object. The former is, as I understand it, the optimized model (I may be wrong); the latter is the context the model needs in order to run (it can be thought of as the resources required for a run).
  - Builder: creates tensorrt.ICudaEngine objects and takes a tensorrt.INetworkDefinition as input.
  - Network: a tensorrt.INetworkDefinition object, which represents a computation graph. Models from other deep learning frameworks must be converted into this form.
  - Parsers: convert models in other formats into tensorrt.INetworkDefinition objects.
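To make the Engine and Context relationship concrete, here is a minimal sketch of loading a previously serialized engine and creating an execution context from it. It uses tensorrt.Runtime, which is not covered in the notes above; the function name and the engine path are hypothetical.

```python
def load_engine_and_context(engine_path):
    """Deserialize a saved engine and create an execution context for it."""
    # Imported lazily so this module can be loaded on machines without TensorRT.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)

    # The engine is the optimized model; it is rebuilt from the serialized bytes.
    with open(engine_path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
    if engine is None:
        raise RuntimeError("failed to deserialize engine from " + engine_path)

    # The context holds the per-run state; one engine can back several contexts.
    context = engine.create_execution_context()

    # Return the runtime too, so it stays alive as long as the engine is in use.
    return runtime, engine, context
```

Actually running inference additionally requires allocating input and output buffers on the GPU (for example with PyCUDA), which is omitted here.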
2. TensorRT API Reference
- Foundational Types: the basic data structures in TensorRT.
- Core: the core components, which appear to relate to the runtime process.
- Network: everything for building a network, including the various layers and tensors.
- Plugin: plugins, which I guess are mainly for custom ops.
- Int8: judging by the name, this covers model quantization, but I haven't read the content.
- UFF/Caffe/Onnx Parser: model parsers (converters).
3. UFF Converter API Reference
- What is UFF?
  - A model serialization format similar to ONNX; the API is mainly used for TensorFlow.
  - I haven't used it, and I have heard it is not that easy to use.
  - Doesn't TF already have TF-TRT? Why is UFF needed? I don't know, and I won't find out until I use it.
- The API covers two aspects
  - Conversion tools: convert TF models to UFF
  - Operators: similar to ONNX, a collection of layers.
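The conversion side of the API might be used along these lines. This is only a sketch, assuming the uff Python package that shipped with older TensorRT releases and a frozen TensorFlow graph; the function name, file paths, and output node names are hypothetical.

```python
def convert_frozen_graph_to_uff(frozen_pb_path, output_nodes, uff_path):
    """Convert a frozen TensorFlow graph (.pb) into a UFF file."""
    # Imported lazily so this module can be loaded without the uff package installed.
    import uff

    # output_nodes is a list of node names that mark the outputs of the graph.
    uff.from_tensorflow_frozen_model(
        frozen_pb_path,
        output_nodes,
        output_filename=uff_path,
    )
    return uff_path
```

For example, convert_frozen_graph_to_uff("model.pb", ["logits"], "model.uff") would write model.uff, where "logits" stands in for whatever the real output node is called.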
4. GraphSurgeon API Reference
- What is GraphSurgeon?
  - A tool specifically for processing and transforming TF computation graphs.
  - Put plainly, the API lets you add, delete, modify, and query nodes in TF computation graphs, covering both dynamic graphs and static graphs (the DynamicGraph and StaticGraph classes).
  - I don't use TF much now, so I don't know the details.
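As a rough illustration of the "query and delete" part, the sketch below loads a frozen TF graph, finds all nodes of a given op type, removes them, and writes the result back out. It assumes the graphsurgeon package from older TensorRT releases; the function name and paths are hypothetical, and removing nodes this way only makes sense for ops (such as Assert) that the rest of the graph does not depend on.

```python
def remove_nodes_by_op(pb_path, op_type, out_path):
    """Remove every node of the given op type from a frozen TF graph."""
    # Imported lazily so this module can be loaded without graphsurgeon installed.
    import graphsurgeon as gs

    # DynamicGraph is the editable graph class; here it is built from a .pb path.
    graph = gs.DynamicGraph(pb_path)

    # Query: collect all nodes whose op matches, e.g. "Assert".
    nodes = graph.find_nodes_by_op(op_type)

    # Delete: drop those nodes, then save the modified graph.
    graph.remove(nodes)
    graph.write(out_path)
    return len(nodes)
```

A call such as remove_nodes_by_op("model.pb", "Assert", "model_clean.pb") would return how many nodes were removed.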