All in one place! Deployment code for four model types on TensorRT (CNN, Transformer, detection, and BEV), plus CUDA acceleration!

On-vehicle deployment! This is surely the top priority for every major autonomous driving company in 2023. With deployment optimization, the floating-point model we trained can run faster on in-vehicle hardware while preserving its performance! Newcomers to the field will inevitably ask: What are parallel processing and CUDA? How is a CNN deployed? What should I do if the Transformer takes too long? What should I do if NMS is too slow? How do I optimize post-processing? How do I deploy a BEV model on the vehicle? The questions keep coming, and the answers are all in the details!

Autonomous driving involves many perception modules, such as classification, segmentation, 2D/3D detection, lane lines, key points, and tracking, but every one of them must go through model deployment and optimization before it can actually reach production. Model deployment and optimization is therefore one of the most challenging directions in autonomous driving and in computer vision, and also the one that best reflects engineering ability! Moreover, as the country vigorously supports the new energy vehicle industry, start-ups in CV, autonomous driving, and smart driving have appeared in clusters over the past two years.

With the economy expected to recover in 2023, major companies will also step up hiring for related positions. On a recruitment website I just checked, the average monthly salary for related positions has reached 40,000, with annual salaries of 600,000, and many senior positions pay an annual salary of one million!

[Image: screenshot from a recruitment website]

Is it hard to learn?

Recently, many friends have asked us about model deployment, and we are very interested in it ourselves. The learning materials on the market vary widely in quality, and learners run into plenty of pitfalls:

You don't understand GPU parallel processing thoroughly enough and don't know where to start with CUDA programming;

You try to understand the principles, but don't know which scenarios they apply to;

The measured acceleration and optimization results fall far short of what you expected, and you still can't find out what went wrong;

Self-study brings all kinds of problems: Transformer ONNX models that fail to convert to TensorRT, mismatched CUDA and TensorRT versions, assorted segmentation faults, and problems with open source code that have no solution to be found. Anyone who has been through it knows the feeling;

... ...

After analyzing the pain points that learners run into, the Heart of Autopilot and Dr. Han Jun from Waseda University jointly produced the course "CUDA Acceleration and TensorRT Deployment". If you want to learn model deployment, or you are struggling with a serious accuracy drop in the model you deployed, don't know how to cut its running time, or lack practical project experience, then this course is for you. The course covers parallel processing, GPUs, CUDA, and hands-on TensorRT deployment in detail.

The course starts from the basics of parallel processing and GPU architecture, moves on to an introduction to CUDA and cuDNN and writing your first CUDA program, and then covers the fundamentals of TensorRT and its API. The hands-on projects cover classifier deployment (CNN and Transformer), deployment of YOLO-series detectors, and a detailed walkthrough of deploying the heavyweight BEVFusion model! After the main course, there are also plans to add material on building TensorRT plugins, a detailed explanation of TensorRT's parser, TVM and other compilers, and deployment on edge devices! Packed with practical content, it helps students starting from zero learn efficiently and quickly master every topic. The course outline is as follows:

[Image: course outline]

Course features

  • Full coverage of CV topics

Tackles model deployment head-on for image classification, Transformer, object detection, and BEV perception, covering both computer vision and autonomous driving!

  • Combining theory with practice

The projects combine hands-on practice with theory, and each practical lesson comes with code to work through after class, so you can master the material quickly by learning and practicing together.

5 major hands-on projects in total

The course includes a complete [instructor teaching] + [teaching assistant Q&A] service, so that every student can enjoy learning.

  • Project 1: CUDA programming, optimization of matrix computation, optimization of pre-processing and post-processing, and a review of common pitfalls;

  • Project 2: Introduction to the TensorRT C++ API and model performance analysis with Nsight;

  • Project 3: Classifier deployment and optimization: CNN deployment, Transformer deployment and optimization;

  • Project 4: Deployment and optimization of YOLOv8: deployment of detection/segmentation, pre/post-processing optimization, model bottleneck analysis, and optimization strategies;

  • Project 5: Deployment and optimization of the open source project BEVFusion: a detailed explanation of the BEVFusion framework, and an analysis of the NVIDIA-AI-IOT deployment of BEVFusion!
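
To make the post-processing item in the detection project above more concrete: detector post-processing typically includes non-maximum suppression (NMS), which the opening of this article also asks about. The following is a minimal CPU reference sketch in plain Python, not the course's code. The function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Visit candidates from the highest score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(nms(demo_boxes, demo_scores))
```

The nested comparison against every kept box is quadratic in the number of candidates, which is one reason NMS can become a bottleneck and a target for GPU optimization. A simple reference like this can also be used to check the output of a faster implementation.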

Courseware and code are all provided

A detailed explanation has to cover not only the theory but also the code and the practice, thoroughly! A full set of video lectures helps you build a basic mental framework of the model and understand every topic thoroughly, so that you can write code more efficiently and quickly.

Instructors

Han Jun holds a Ph.D. from Waseda University, where he is currently a researcher and lecturer, and he is also affiliated with the deep learning R&D department of a leading autonomous driving company in Japan. During his doctoral studies, he focused on compiler optimization, parallel processing, logic programming, and mathematical verification. He currently works on high-performance deep learning deployment, Transformer hardware development, multi-task training, active learning, Apollo, Autoware, and related areas.

After completing the course

  1. Gain a deep understanding of TensorRT model deployment, and greatly improve your model deployment and optimization skills;

  2. Master model deployment and optimization for classification, detection, and BEV perception, and deeply understand the pain points and difficulties of deployment optimization;

  3. Reach the level of a model deployment engineer with about one year of experience;

  4. Meet many industry practitioners and fellow learners!

Who this course is for

  1. Undergraduate, master's, and Ph.D. students researching computer vision or autonomous driving perception;

  2. Algorithm engineers working on CV or autonomous driving 2D/3D perception;

  3. Algorithm engineers who need CUDA acceleration;

  4. Anyone who needs model deployment and optimization;

Prerequisites for this course

  1. Some background in Python, PyTorch, Makefile, and Docker; familiarity with C/C++ and with the basic algorithms commonly used in deep learning;

  2. Some understanding of the applications of, and basic approaches to, GPU, CUDA, object detection, segmentation, Transformer, and BEV perception;

  3. Some background in linear algebra and matrix theory;

  4. Your computer needs its own GPU that supports deployment through CUDA (the instructor's teaching setup is an RTX 3080 10G and a Jetson AGX Xavier);

Class time and learning style

The course officially starts on July 15, 2023. It runs for two months and is taught through offline (pre-recorded) video lectures. The lecturer answers questions in the WeChat learning group, resolving problems with algorithms, code, and environment configuration one by one!

Course consultation

Course launch discount! Scan the QR code to study the course together!

[Image: course QR code]

Scan the code to add the assistant and ask about the course!

(WeChat: AIDriver004)

[Image: assistant's WeChat QR code]

Origin blog.csdn.net/CV_Autobot/article/details/131238551