C++-based deep learning model deployment

1.1 Introduction

 As an end-to-end deep learning framework, PyTorch has been well suited to production deployment since version 1.0. Besides wrapping models in a REST API for web-side deployment, there is also extensive demand for deployment inside native software. In particular, the recently released version 1.5 provides a more stable C++ front-end API.

     The biggest difference between industry and academia is that industrial models must be deployed in real applications. Academia cares more about a model's accuracy and less about its deployment performance. Generally speaking, after we train a model with a deep learning framework, Python is enough for a simple inference demo. But in a production environment, Python's portability and speed are far inferior to C++'s. Therefore, deep learning algorithm engineers typically use Python for rapid prototyping and model training, and C++ as the production tool for models. PyTorch now combines the two well: the core components that enable PyTorch model deployment are TorchScript and libtorch.

     Therefore, the PyTorch-based deep learning engineering workflow is roughly as shown in the following figure:

[Figure: workflow from model training in Python, through TorchScript conversion, to C++ deployment with libtorch]

1.2 TorchScript

     TorchScript can be regarded as an intermediate representation of a PyTorch model; a model expressed as TorchScript can be read directly in C++. PyTorch has supported building serialized models via TorchScript since version 1.0. TorchScript provides two conversion methods: Tracing and Script.

     An example of the Tracing method is as follows:

import torch


class MyModel(torch.nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        new_h = torch.tanh(self.linear(x) + h)
        return new_h, new_h


# Create a model instance
my_model = MyModel()
# Example inputs
x, h = torch.rand(3, 4), torch.rand(3, 4)
# Build TorchScript from the model with torch.jit.trace
traced_model = torch.jit.trace(my_model, (x, h))
# Save the converted model
traced_model.save('model.pt')

     In this code, we first define a simple model and create an instance of it. Then, given an example input, the key step of the Tracing method is to call torch.jit.trace to convert the model to TorchScript. From the resulting traced_model object we can inspect its computational graph attribute and its code attribute. The graph attribute:

print(traced_model.graph)
graph(%self.1 : __torch__.torch.nn.modules.module.___torch_mangle_1.Module,
      %input : Float(3, 4),
      %h : Float(3, 4)):
  %19 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="linear"](%self.1)
  %21 : Tensor = prim::CallMethod[name="forward"](%19, %input)
  %12 : int = prim::Constant[value=1]() # /var/lib/jenkins/workspace/beginner_source/Intro_to_TorchScript_tutorial.py:188:0
  %13 : Float(3, 4) = aten::add(%21, %h, %12) # /var/lib/jenkins/workspace/beginner_source/Intro_to_TorchScript_tutorial.py:188:0
  %14 : Float(3, 4) = aten::tanh(%13) # /var/lib/jenkins/workspace/beginner_source/Intro_to_TorchScript_tutorial.py:188:0
  %15 : (Float(3, 4), Float(3, 4)) = prim::TupleConstruct(%14, %14)
  return (%15)

 The code attribute:

print(traced_model.code)
def forward(self,
    input: Tensor,
    h: Tensor) -> Tuple[Tensor, Tensor]:
  _0 = torch.add((self.linear).forward(input, ), h, alpha=1)
  _1 = torch.tanh(_0)
  return (_1, _1)

     In this way, we can save the entire model to disk, and a model saved in this form can be loaded in other language environments, such as C++.
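
     Before moving to the C++ side, we can sanity-check the conversion by loading the saved file back in Python. A minimal sketch, assuming the model.pt produced by the example above:

import torch

# Load the serialized TorchScript model back into Python
loaded_model = torch.jit.load('model.pt')

# Run it on new inputs just like the original module
x, h = torch.rand(3, 4), torch.rand(3, 4)
new_h, _ = loaded_model(x, h)
print(new_h.shape)  # torch.Size([3, 4])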

     The other TorchScript conversion method is Script, which can be regarded as a complement to Tracing. Tracing only records the operations executed for the given example input, so when the model code contains data-dependent control flow such as if statements or for loops, Tracing fails to capture it. In that case the Script method should be used to produce TorchScript. Its usage differs little from Tracing; the key is to replace torch.jit.trace with torch.jit.script, as in the following example.

scripted_model = torch.jit.script(MyModel())
scripted_model.save('model.pt')
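
     To see why this matters, consider a minimal sketch (a hypothetical module, not part of the original example) whose forward contains data-dependent control flow. Tracing would record only the branch taken for the example input, while torch.jit.script preserves both branches:

import torch


class GatedModel(torch.nn.Module):
    def __init__(self):
        super(GatedModel, self).__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        # Data-dependent branch: tracing would bake in only the branch
        # taken for the example input; scripting keeps both
        if x.sum() > 0:
            return self.linear(x)
        else:
            return -self.linear(x)


scripted = torch.jit.script(GatedModel())
scripted.save('gated_model.pt')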

      In addition to using Tracing and Script separately, the two methods can also be mixed, which will not be detailed here. In short, TorchScript gives us a representation that the compiler can optimize for more efficient execution.
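
      For reference, one common mixed form is to trace a control-flow-free submodule and call it from a scripted wrapper. A minimal sketch under hypothetical names:

import torch


class Backbone(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x)


class Wrapper(torch.nn.Module):
    def __init__(self):
        super(Wrapper, self).__init__()
        # Trace the submodule that has no control flow
        self.backbone = torch.jit.trace(Backbone(), torch.rand(3, 4))

    def forward(self, x):
        # Data-dependent control flow is handled by scripting
        if x.sum() > 0:
            return self.backbone(x)
        return x


mixed = torch.jit.script(Wrapper())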

1.3 libtorch

     After the trained model has been converted in the Python environment, we need PyTorch in a C++ environment to read the model, compile, and deploy it. This C++ flavor of PyTorch is libtorch. Because libtorch is usually used as the C++ interface to PyTorch models, it is also called the C++ front end of PyTorch.

     We can download a precompiled libtorch package directly from the PyTorch official website, or download the source code and compile it ourselves. Note that the installed libtorch version must match the PyTorch version in the Python environment.
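
     As a quick sanity check of version consistency, the PyTorch version in the Python environment can be printed and compared against the downloaded libtorch package; a minimal snippet:

import torch

# The libtorch download should match this version (e.g. 1.5.0)
print(torch.__version__)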

     After installing libtorch, we can run a simple test to verify that it works. For example, use TorchScript to convert a torchvision VGG16 model, as follows:

import torch
import torchvision.models as models

# Build the model and move it to the GPU in eval mode
vgg16 = models.vgg16()
model = vgg16.eval().cuda()
example = torch.rand(1, 3, 224, 224).cuda()
traced_script_module = torch.jit.trace(model, example)
output = traced_script_module(torch.ones(1, 3, 224, 224).cuda())
traced_script_module.save('vgg16-trace.pt')
print(output)

The output is: 

tensor([[ -0.8301, -35.6095, 12.4716]], device='cuda:0',
        grad_fn=<AddBackward0>)

     Then switch to the C++ environment and write the CMakeLists.txt file as follows:

cmake_minimum_required(VERSION 3.0.0 FATAL_ERROR)
project(libtorch_test)
find_package(Torch REQUIRED)
message(STATUS "PyTorch status:")
message(STATUS "libraries: ${TORCH_LIBRARIES}")
add_executable(libtorch_test test.cpp)
target_link_libraries(libtorch_test "${TORCH_LIBRARIES}")
set_property(TARGET libtorch_test PROPERTY CXX_STANDARD 14)

 Next, write the test.cpp code as follows:

#include "torch/script.h"
#include "torch/torch.h"
#include <iostream>
#include <memory>
using namespace std;
 
 
int main(int argc, const char* argv[]){
    if (argc != 2) {
        std::cerr << "usage: example-app <path-to-exported-script-module>\n";
        return -1;
    }
 
 
    // 读取TorchScript转化后的模型
    torch::jit::script::Module module;
    try {
        module = torch::jit::load(argv[1]);
    }
 
 
    catch (const c10::Error& e) {
        std::cerr << "error loading the model\n";
        return -1;
    }
 
 
    module->to(at::kCUDA);
    assert(module != nullptr);
    std::cout << "ok\n";
 
 
    // 构建示例输入
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::ones({1, 3, 224, 224}).to(at::kCUDA));
 
 
    // 执行模型推理并输出tensor
    at::Tensor output = module->forward(inputs).toTensor();
    std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';}

     Compile test.cpp and run it; the output is shown below. Comparing with the results in the Python environment, we can see they are essentially the same, which shows that libtorch is installed correctly in the current environment.

ok
-0.8297, -35.6048, 12.4823
[Variable[CUDAFloatType]{1,3}]

 1.4 Complete deployment process

     The preceding discussion of TorchScript and libtorch has essentially covered the C++ deployment of PyTorch models; here we walk through the entire process end to end. The C++-based deployment of a PyTorch model proceeds as follows.

Step 1:

     Use the torch.jit.trace method to convert the PyTorch model to TorchScript. An example is as follows:

import torch
from torchvision.models import resnet18

model = resnet18()
example = torch.rand(1, 3, 224, 224)
traced_script_module = torch.jit.trace(model, example)

Step 2:

     Serialize the TorchScript model to a .pt file:

traced_script_module.save("traced_resnet_model.pt")

Step 3:

     Import the serialized TorchScript model in C++. For this we need to write a cpp file containing the calling program and a CMakeLists.txt file for configuring the build. Sample CMakeLists.txt content:

cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(custom_ops)
find_package(Torch REQUIRED)
add_executable(example-app example-app.cpp)
target_link_libraries(example-app "${TORCH_LIBRARIES}")
set_property(TARGET example-app PROPERTY CXX_STANDARD 14)

      Example code for example-app.cpp, which contains the model-calling program, is as follows:

#include <torch/script.h> // torch头文件.
#include <iostream>#include <memory>
 
 
int main(int argc, const char* argv[]) {
  if (argc != 2) {
    std::cerr << "usage: example-app <path-to-exported-script-module>\n";
    return -1;
  }
 
 
  torch::jit::script::Module module;
  try {
    // 反序列化:导入TorchScript模型
    module = torch::jit::load(argv[1]);
  }
 
 
  catch (const c10::Error& e) {
    std::cerr << "error loading the model\n";
    return -1;
  }
  std::cout << "ok\n";}

     After the two files are written, they can be compiled:

mkdir example_test
cd example_test
cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
cmake --build . --config Release

Step 4:

Add model inference code to example-app.cpp and execute:

std::vector<torch::jit::IValue> inputs;inputs.push_back(torch::ones({1, 3, 224, 224}));
// 执行推理并将模型转化为Tensor
output = module.forward(inputs).toTensor();std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';

     The above is the whole process of deploying a PyTorch model in C++. For related tutorials, see the official PyTorch documentation: https://pytorch.org/tutorials/

Summary

     Model deployment matters a great deal to algorithm engineers: it determines whether your work can generate real value. Accordingly, you also need sufficient engineering skills, such as MySQL, Redis, C++, and some knowledge of front-end and back-end development; every algorithm engineer should be able to understand and use them.

Origin: blog.csdn.net/wzhrsh/article/details/109552923