[PaddleInferenceSharp] Deploying PaddlePaddle models with C# and Paddle Inference


1. Project introduction

  Paddle Inference is PaddlePaddle's native inference library. It provides high-performance server-side inference and is built directly on Paddle's training operators, so it supports inference for every model trained with Paddle. Paddle Inference is feature-rich and performs well: it has been deeply adapted and optimized for different platforms and application scenarios to achieve high throughput and low latency, so PaddlePaddle models can move from training to server-side deployment quickly.

  However, Paddle Inference currently only provides Python, C++, C, and Go interfaces, so it cannot be called directly from C#. In recent years the C# language has developed rapidly and ranks among the most popular programming languages. To make the Paddle Inference library usable from C#, PaddleInferenceSharp was created: following the usual approach for calling a C++ dynamic link library, it lets the C# side invoke Paddle Inference and deploy deep learning models. The implementation principle is shown in the figure below, followed by a minimal illustrative sketch.

[Figure: PaddleInferenceSharp implementation principle — C# calls Paddle Inference through a C++ dynamic link library]
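
The bridge between the two languages follows the standard pattern for calling a native dynamic link library from C#: a thin C++ layer wraps the Paddle Inference API and exports plain C functions, which the C# side binds to with `DllImport`. The snippet below is only a minimal sketch of that principle, not the actual PaddleInferenceSharp source; the DLL name and exported function names are hypothetical placeholders.

```csharp
using System;
using System.Runtime.InteropServices;

// Minimal sketch of the C#-to-C++ bridging principle.
// The DLL name and function names below are hypothetical placeholders.
internal static class NativeMethods
{
    // Creates a native inference core from a static graph model file.
    [DllImport("paddle_inference_wrapper.dll", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    internal static extern IntPtr paddle_create(string modelPath, string paramsPath);

    // Runs inference on the native core created above.
    [DllImport("paddle_inference_wrapper.dll", CallingConvention = CallingConvention.Cdecl)]
    internal static extern void paddle_infer(IntPtr core);

    // Releases the native memory held by the core.
    [DllImport("paddle_inference_wrapper.dll", CallingConvention = CallingConvention.Cdecl)]
    internal static extern void paddle_delete(IntPtr core);
}
```

A managed class such as PaddleInfer then holds the IntPtr returned by the native side and exposes the C# methods described in section 4.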

2. Project environment configuration

 To make the code easier to reproduce, the development environment used here is listed below; adjust it to your own needs.

  • OS: Windows 11
  • CUDA: 11.4
  • cuDNN: 8.2.4
  • TensorRT: 8.4.0.6
  • OpenCV: 4.5.5
  • Visual Studio: 2022
  • C# framework: .NET 6.0
  • OpenCvSharp: OpenCvSharp4

 The most important step is installing the C++ version of Paddle Inference. For the installation method, refer to: Paddle Inference C++ Dependent Library Installation (Windows). For the other dependencies, refer to: NVIDIA TensorRT Installation (Windows C++) and OpenCV C++ Installation and Configuration.

3. Project download method

 The source code for this project has been open-sourced on GitHub and Gitee.

GitHub:

git clone https://github.com/guojin-yan/PaddleInferenceSharp.git

Gitee:

git clone https://gitee.com/guojin-yan/PaddleInferenceSharp.git

4. PaddleInfer class

4.1 API methods

| No. | Kind | API / Parameter | Description |
| --- | --- | --- | --- |
| 1 | method | `PaddleInfer()` | Constructor: initializes the inference core and reads the local model |
| | parameter | `string model_path` | Static graph model file |
| | parameter | `string params_path` | Model parameter file; empty by default |
| 2 | method | `void set_divice()` | Sets the inference device; supports CPU, GPU, ONNX Runtime, and oneDNN |
| | parameter | `Device device` | Device selection |
| | parameter | `int` | For CPU and ONNX Runtime, the number of threads (default 10); for GPU, the GPU card index (default 0); for oneDNN, the cache capacity (default 1) |
| | parameter | `memory_init_size` | Memory allocation size (effective when using the GPU); default 500 |
| | parameter | `int workspace_size` | GPU memory workspace (effective when using the GPU); default 30 |
| 3 | method | `List<string> get_input_names()` | Gets the input node names |
| 4 | method | `void set_input_shape()` | Sets an input node's shape according to the node's dimensions |
| | parameter | `int[] input_shape` | Shape array |
| | parameter | `string input_name` | Node name |
| 5 | method | `void load_input_data()` | Sets image or plain array input data (overloaded method) |
| | parameter | `string input_name` | Input node name |
| | parameter | `float[] input_data` | Input data |
| | parameter (overload) | `string input_name` | Input node name |
| | parameter | `byte[] image_data` | Image data |
| | parameter | `ulong image_size` | Image data length |
| | parameter | `int type` | Data processing type: 0 = mean-variance normalization with direct scaling; 1 = ordinary normalization with direct scaling; 2 = mean-variance normalization with affine transformation |
| 6 | method | `void infer()` | Runs model inference |
| 7 | method | `List<string> get_output_names()` | Gets the output node names |
| 8 | method | `List<int> get_shape()` | Gets the shape of a specified node |
| | parameter | `string node_name` | Node name |
| 9 | method | `T[] read_infer_result<T>()` | Reads inference result data; supports Float32, Int32, and Int64 data |
| | parameter | `string output_name` | Output node name |
| | parameter | `int data_size` | Output data length |
| 10 | method | `void delet()` | Frees the memory address held by the inference core |
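
Putting the methods above together, a typical inference flow might look like the following minimal sketch. The model path, image path, input shape, and output handling are placeholders, the argument order follows the table above, and OpenCvSharp is used only to obtain encoded image bytes; refer to the repository's examples for authoritative usage.

```csharp
using System;
using System.Collections.Generic;
using OpenCvSharp;

class InferenceDemo
{
    static void Main()
    {
        // Placeholder paths to an exported static graph model and its parameter file.
        string modelPath = "model.pdmodel";
        string paramsPath = "model.pdiparams";

        // 1. Initialize the inference core and read the local model.
        PaddleInfer predictor = new PaddleInfer(modelPath, paramsPath);

        // 2. Choose the inference device (see the Device enumeration in 4.2).
        predictor.set_divice(Device.CPU);

        // 3. Query the input node and set its shape, e.g. NCHW = 1 x 3 x 224 x 224.
        List<string> inputNames = predictor.get_input_names();
        predictor.set_input_shape(new int[] { 1, 3, 224, 224 }, inputNames[0]);

        // 4. Load an image as encoded bytes; type == 0 applies mean-variance
        //    normalization with direct scaling.
        Mat image = Cv2.ImRead("test.jpg");
        byte[] imageData = image.ImEncode(".png");
        predictor.load_input_data(inputNames[0], imageData, (ulong)imageData.Length, 0);

        // 5. Run model inference.
        predictor.infer();

        // 6. Read the result from the first output node.
        List<string> outputNames = predictor.get_output_names();
        List<int> outputShape = predictor.get_shape(outputNames[0]);
        int dataSize = 1;
        foreach (int dim in outputShape)
        {
            dataSize *= dim;
        }
        float[] result = predictor.read_infer_result<float>(outputNames[0], dataSize);
        Console.WriteLine($"Read {result.Length} output values.");

        // 7. Release the native memory held by the predictor.
        predictor.delet();
    }
}
```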

4.2 Enumeration

| No. | Name | Enum value | Meaning |
| --- | --- | --- | --- |
| 1 | Device (device name) | CPU | Infer on the CPU |
| | | GPU | Infer on the GPU |
| | | ONNX_runtime | Infer with ONNX Runtime |
| | | oneDNN | Infer with oneDNN |
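
As a small illustration of how the enumeration is used, switching backends only changes the Device value passed to set_divice; the helper below is a hedged sketch, and the optional tuning parameters listed in 4.1 (thread count, GPU index, memory_init_size, workspace_size) are assumed to keep their defaults.

```csharp
// Hedged sketch: the backend is chosen by the Device value passed to set_divice.
// The optional tuning parameters from 4.1 are left at their assumed defaults.
static void ConfigureBackend(PaddleInfer predictor, bool useGpu)
{
    if (useGpu)
    {
        predictor.set_divice(Device.GPU);  // infer on GPU card 0 by default
    }
    else
    {
        predictor.set_divice(Device.CPU);  // infer on the CPU, default 10 threads
        // Alternatives: Device.ONNX_runtime or Device.oneDNN.
    }
}
```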

 Example tutorials and detailed technical documentation for the methods above will be published later, so stay tuned.

Original article: https://blog.csdn.net/grape_yan/article/details/127689090