C++-based PyTorch model deployment

Author: louwill

Machine Learning Lab

Introduction

As an end-to-end deep learning framework, PyTorch has been well suited to production deployment since version 1.0. Besides exposing a model through a REST API for web-side deployment, there is also broad demand for deploying models inside software applications. The recently released version 1.5 in particular provides a more stable C++ front-end API.

The biggest difference between industry and academia is that industrial models must actually be deployed. Academia cares more about model accuracy than about deployment performance. Generally speaking, once a model has been trained with a deep learning framework, Python is enough for a simple inference demo. In a production environment, however, Python is far less portable and much slower than C++. Deep learning engineers therefore usually use Python to prototype ideas and train models, and C++ to put models into production. PyTorch now combines the two well. The core components for deploying PyTorch models are TorchScript and libtorch.

The engineering workflow for a PyTorch-based deep learning algorithm therefore generally looks like the following figure:

TorchScript

TorchScript can be regarded as an intermediate representation of a PyTorch model, and a model represented in TorchScript can be loaded directly in C++. Since version 1.0, PyTorch can use TorchScript to build serialized models. TorchScript offers two ways to convert a model: tracing and scripting.

An example of tracing is as follows:

import torch


class MyModel(torch.nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        new_h = torch.tanh(self.linear(x) + h)
        return new_h, new_h


# Create a model instance
my_model = MyModel()
# Example inputs
x, h = torch.rand(3, 4), torch.rand(3, 4)
# Convert the model to TorchScript with torch.jit.trace
traced_model = torch.jit.trace(my_model, (x, h))
# Save the converted model
traced_model.save('model.pt')

In this code, we first define a simple model and create an instance of it. Then, given example inputs, the key step of the tracing method is to call torch.jit.trace to convert the model to TorchScript. The resulting traced_model object exposes its computational graph and its code as properties. The computational graph property:

print(traced_model.graph)

graph(%self.1 : __torch__.torch.nn.modules.module.___torch_mangle_1.Module,
      %input : Float(3, 4),
      %h : Float(3, 4)):
  %19 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="linear"](%self.1)
  %21 : Tensor = prim::CallMethod[name="forward"](%19, %input)
  %12 : int = prim::Constant[value=1]() # /var/lib/jenkins/workspace/beginner_source/Intro_to_TorchScript_tutorial.py:188:0
  %13 : Float(3, 4) = aten::add(%21, %h, %12) # /var/lib/jenkins/workspace/beginner_source/Intro_to_TorchScript_tutorial.py:188:0
  %14 : Float(3, 4) = aten::tanh(%13) # /var/lib/jenkins/workspace/beginner_source/Intro_to_TorchScript_tutorial.py:188:0
  %15 : (Float(3, 4), Float(3, 4)) = prim::TupleConstruct(%14, %14)
  return (%15)

The code property:

print(traced_model.code)



def forward(self,
    input: Tensor,
    h: Tensor) -> Tuple[Tensor, Tensor]:
  _0 = torch.add((self.linear).forward(input, ), h, alpha=1)
  _1 = torch.tanh(_0)
  return (_1, _1)

The converted model can then be saved to disk, and a model saved this way can be loaded from other language environments.
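Before moving to another language, it can be useful to check the saved file from Python itself. The following is a minimal sketch of such a round trip, assuming a toy model (the name TinyModel and the temporary file location are illustrative, not from the original article): it traces the model, saves it, reloads it with torch.jit.load, and compares the outputs.

```python
import os
import tempfile

import torch


class TinyModel(torch.nn.Module):
    """Hypothetical toy model with the same structure as MyModel above."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        new_h = torch.tanh(self.linear(x) + h)
        return new_h, new_h


model = TinyModel()
x, h = torch.rand(3, 4), torch.rand(3, 4)
traced = torch.jit.trace(model, (x, h))

# Save to a temporary file and load it back as a TorchScript module.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pt")
    traced.save(path)
    reloaded = torch.jit.load(path)

# The reloaded module should reproduce the eager model's output.
expected, _ = model(x, h)
actual, _ = reloaded(x, h)
print(torch.allclose(expected, actual))
```

If the comparison fails here, the problem lies in the conversion step rather than in the C++ side.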

The other way to produce TorchScript is the scripting method, which complements tracing. When the model code contains data-dependent control flow such as if statements or for loops, tracing is unreliable, because it records only the operations executed for the example input. In that case, scripting can be used instead. The usage differs little from tracing: the key is to replace torch.jit.trace with torch.jit.script. The example is as follows.

# Pass a module instance (not the class) to torch.jit.script
scripted_model = torch.jit.script(my_model)
scripted_model.save('model.pt')
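The difference between the two methods shows up as soon as the forward pass branches on its input. The sketch below uses a hypothetical toy module named Gate (not part of the original article) to illustrate it: the traced version keeps only the branch taken by the example input, while the scripted version keeps both.

```python
import warnings

import torch


class Gate(torch.nn.Module):
    """Hypothetical toy module with a data-dependent branch."""

    def forward(self, x):
        if x.sum() > 0:
            return x + 10
        return x - 10


model = Gate()
positive = torch.ones(2)
negative = -torch.ones(2)

# Tracing with a positive example records only the "x + 10" branch.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the tracer's warning about the branch
    traced = torch.jit.trace(model, positive)

# Scripting compiles the Python source, so both branches are preserved.
scripted = torch.jit.script(model)

print(traced(negative))    # tensor([9., 9.]): the recorded branch is replayed
print(scripted(negative))  # tensor([-11., -11.]): the real branch is taken
```

The warning that tracing emits in this situation is a useful signal that the model may need scripting instead.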

Tracing and scripting can also be mixed, which will not be described in detail here. In short, TorchScript gives us a representation of the model code that a compiler can optimize for more efficient execution.
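As a rough illustration of mixing the two methods, one common pattern is to trace a branch-free submodule and call it from a scripted module that contains the control flow. The sketch below assumes two hypothetical modules, Block and Repeat, invented here for illustration.

```python
import torch


class Block(torch.nn.Module):
    """Hypothetical branch-free layer, suitable for tracing."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return torch.tanh(self.linear(x))


class Repeat(torch.nn.Module):
    """Hypothetical wrapper whose loop is preserved by scripting."""

    def __init__(self, block, steps: int):
        super().__init__()
        self.block = block
        self.steps = steps

    def forward(self, x):
        for i in range(self.steps):
            x = self.block(x)
        return x


block = Block()
eager = Repeat(block, steps=3)

# Trace the inner block, then script the outer module that loops over it.
traced_block = torch.jit.trace(block, torch.rand(3, 4))
mixed = torch.jit.script(Repeat(traced_block, steps=3))

x = torch.rand(3, 4)
print(torch.allclose(eager(x), mixed(x), atol=1e-6))
```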

libtorch

After converting the trained model in the Python environment, we need a C++ counterpart of PyTorch to load the model and to compile and deploy it. That counterpart is libtorch. Because libtorch is commonly used as the C++ interface to PyTorch models, it is also known as the C++ front end of PyTorch.

A prebuilt libtorch package can be downloaded directly from the official PyTorch website, or libtorch can be built from source. Note that the installed libtorch version should match the PyTorch version used in the Python environment.
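To know which libtorch build to pick, it helps to print the version information of the Python installation first. A minimal sketch:

```python
import torch

# Version of the PyTorch package used for training and export.
print(torch.__version__)

# CUDA version this build was compiled against (None for CPU-only builds).
print(torch.version.cuda)
```

The libtorch download should then be chosen with the same release number and a compatible CUDA (or CPU-only) variant.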

After installing libtorch, you can run a simple test to check that it works. For example, we first use TorchScript to convert a pre-trained model:

import torch
import torchvision.models as models

# models.vgg16() builds the architecture only; load pre-trained weights as needed
model = models.vgg16().cuda()
model = model.eval()
example = torch.rand(1, 3, 224, 224).cuda()
traced_script_module = torch.jit.trace(model, example)
output = traced_script_module(torch.ones(1, 3, 224, 224).cuda())
traced_script_module.save('vgg16-trace.pt')
print(output)

The output is:

tensor([[ -0.8301, -35.6095, 12.4716]], device='cuda:0',
        grad_fn=<AddBackward0>)

Then switch to the C++ environment and write the CMakeLists.txt file as follows:

cmake_minimum_required(VERSION 3.0.0 FATAL_ERROR)
project(libtorch_test)
find_package(Torch REQUIRED)
message(STATUS "Pytorch status:")
message(STATUS "libraries: ${TORCH_LIBRARIES}")
add_executable(libtorch_test test.cpp)
target_link_libraries(libtorch_test "${TORCH_LIBRARIES}")
set_property(TARGET libtorch_test PROPERTY CXX_STANDARD 14)

Next, write test.cpp as follows:

#include "torch/script.h"
#include "torch/torch.h"
#include <iostream>
#include <memory>
#include <vector>

int main(int argc, const char* argv[]) {
    if (argc != 2) {
        std::cerr << "usage: example-app <path-to-exported-script-module>\n";
        return -1;
    }

    // Load the model converted by TorchScript
    torch::jit::script::Module module;
    try {
        module = torch::jit::load(argv[1]);
    } catch (const c10::Error& e) {
        std::cerr << "error loading the model\n";
        return -1;
    }

    // The module is held by value here, so use '.' rather than '->'
    module.to(at::kCUDA);
    std::cout << "ok\n";

    // Build an example input
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::ones({1, 3, 224, 224}).to(at::kCUDA));

    // Run inference and print the output tensor
    at::Tensor output = module.forward(inputs).toTensor();
    std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';
}

Compile and run test.cpp; the output is as follows. Compared with the result from the Python environment, the values are essentially the same, which indicates that libtorch is installed correctly in the current environment.

ok
-0.8297, -35.6048, 12.4823
[Variable[CUDAFloatType]{1,3}]
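"Essentially the same" can be made concrete with a tolerance check. The sketch below compares the two sets of printed values from this article; the relative tolerance of 1e-3 is an assumption chosen for illustration, not a value from the original text.

```python
import math

python_out = [-0.8301, -35.6095, 12.4716]  # values printed by the Python run
cpp_out = [-0.8297, -35.6048, 12.4823]     # values printed by the C++ run

REL_TOL = 1e-3  # assumed tolerance for this illustration

# Element-wise comparison within the relative tolerance.
matches = [
    math.isclose(p, c, rel_tol=REL_TOL)
    for p, c in zip(python_out, cpp_out)
]
max_abs_diff = max(abs(p - c) for p, c in zip(python_out, cpp_out))

print(all(matches))            # True
print(round(max_abs_diff, 4))  # 0.0107
```

In a real project the tolerance would depend on the model and hardware, so it is worth choosing it deliberately rather than comparing by eye.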

Complete deployment process

The descriptions of TorchScript and libtorch above already cover most of what is needed to deploy PyTorch in C++. Here we walk through the whole process from start to finish. The C++-based PyTorch model deployment process is as follows.

Step 1:

Convert the PyTorch model to TorchScript with the torch.jit.trace method. An example is as follows:

import torch
from torchvision.models import resnet18

model = resnet18()
example = torch.rand(1, 3, 224, 224)
# Trace the model with an example input to produce a TorchScript module
traced_script_module = torch.jit.trace(model, example)
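One detail worth keeping in mind at this step: layers such as dropout and batch normalization behave differently in training and evaluation mode, so for a model intended for inference it is usually safer to call eval() before tracing. The sketch below illustrates this with a small hypothetical model (a stand-in for resnet18, chosen so that it runs quickly) and checks that the traced module matches the eager model in evaluation mode.

```python
import torch

# Hypothetical small model containing a mode-dependent layer (Dropout).
model = torch.nn.Sequential(
    torch.nn.Linear(8, 8),
    torch.nn.Dropout(p=0.5),
    torch.nn.Linear(8, 2),
)

# Switch to evaluation mode before tracing so dropout is disabled.
model.eval()

example = torch.rand(1, 8)
with torch.no_grad():
    traced = torch.jit.trace(model, example)
    same = torch.allclose(traced(example), model(example))

print(same)
```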

Step 2:

Serialize the TorchScript module to a .pt model file:

traced_script_module.save("traced_resnet_model.pt")

Step 3:

Load the serialized TorchScript model in C++. This requires a .cpp file containing the calling program and a CMakeLists.txt file to configure the build. A sample CMakeLists.txt is as follows:

cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(custom_ops)
find_package(Torch REQUIRED)
add_executable(example-app example-app.cpp)
target_link_libraries(example-app "${TORCH_LIBRARIES}")
set_property(TARGET example-app PROPERTY CXX_STANDARD 14)

A sample example-app.cpp that loads the model is as follows:

#include <torch/script.h> // Main TorchScript header
#include <iostream>
#include <memory>

int main(int argc, const char* argv[]) {
  if (argc != 2) {
    std::cerr << "usage: example-app <path-to-exported-script-module>\n";
    return -1;
  }

  torch::jit::script::Module module;
  try {
    // Deserialize: load the TorchScript model from file
    module = torch::jit::load(argv[1]);
  } catch (const c10::Error& e) {
    std::cerr << "error loading the model\n";
    return -1;
  }

  std::cout << "ok\n";
}

Once the two files are written, the program can be built:

mkdir example_test
cd example_test
cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
cmake --build . --config Release

Step 4:

Add the model inference code to the main function of example-app.cpp, then rebuild and run:

// Build an example input
std::vector<torch::jit::IValue> inputs;
inputs.push_back(torch::ones({1, 3, 224, 224}));

// Run inference and convert the result to a Tensor
at::Tensor output = module.forward(inputs).toTensor();
std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';
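The slice call in the C++ code prints only the first five values along dimension 1 of the output. For readers more familiar with Python, the same selection can be sketched with ordinary indexing; the tensor below is a made-up stand-in for a model output, used only to show which elements are selected.

```python
import torch

# Hypothetical stand-in for a model output of shape (1, 10).
output = torch.arange(10.0).reshape(1, 10)

# Same selection as output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) in C++.
first_five = output[:, 0:5]

print(first_five)  # tensor([[0., 1., 2., 3., 4.]])
```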

That is the whole process of deploying a PyTorch model in C++. For related tutorials, see the official PyTorch tutorials:

https://pytorch.org/tutorials/

References:

https://pytorch.org/tutorials/

https://pytorch.org/features/
