LibTorch installation, model deployment, and common operations

Installation:

Environment requirements:

        Windows 10

        VS2019 (use 2019 or later; older versions may fail with "unsupported" errors)

        CUDA 11.1 (the CUDA version must match the downloaded LibTorch build; CUDA 10.2, 11.3, 11.6, and 11.7 are the other current mainstream versions)

1. Find the matching LibTorch version on the PyTorch official website.

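The download links are listed on the PyTorch "Get Started" page (https://pytorch.org/get-started/locally/); select LibTorch as the package, C++/Java as the language, and the CUDA version that matches your install.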

2. Create a new project in VS2019 and set the configuration to Release, x64.


3. Add the LibTorch include directory under VC++ Directories --- Include Directories.


4. Add the LibTorch library directory under VC++ Directories --- Library Directories.


5. Add the .lib dependencies under Linker --- Input --- Additional Dependencies (add every .lib file in D:\libtorch\libtorch-win-shared-with-deps-1.10.1+cu111\libtorch\lib). Different versions ship different sets of .lib files.

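For reference, a 1.10.1+cu111 build typically includes torch.lib, torch_cpu.lib, c10.lib and the CUDA-specific libraries (c10_cuda.lib and the torch_cuda*.lib files), among others; check the lib folder of your own download rather than copying any fixed list.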

6. Add the LibTorch DLL directory under Debugging --- Environment so the program can locate the runtime DLLs.

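Assuming the install path from step 5, the entry would look like:

PATH=D:\libtorch\libtorch-win-shared-with-deps-1.10.1+cu111\libtorch\lib;%PATH%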

At this point the environment is set up and you can check whether LibTorch runs. (The CPU version already works at this stage; CUDA does not yet.)

#include <iostream>
#include <torch/torch.h>
#include <torch/script.h>
using namespace std;

int main()
{
    cout << "CUDA available: " << torch::cuda::is_available() << endl;
    cout << "cuDNN available: " << torch::cuda::cudnn_is_available() << endl;
    cout << torch::cuda::device_count() << endl;

    // Build a 3x3 float tensor on the CPU and print it
    torch::Tensor tr = torch::arange(0, 9, torch::kFloat32).reshape({ 3, 3 });
    cout << tr << endl;
    return 0;
}

Result: the 3x3 tensor prints correctly, but CUDA is still reported as unavailable.

7. Enable CUDA

Add /INCLUDE:?warp_size@cuda@at@@YAHXZ under Linker --- Command Line --- Additional Options. This flag forces the linker to keep a symbol from the CUDA backend that would otherwise be stripped, since nothing in the program references it directly. The exact flags differ by version:

cuda10.2: /INCLUDE:?warp_size@cuda@at@@YAHXZ

cuda11.1: /INCLUDE:?warp_size@cuda@at@@YAHXZ /INCLUDE:?searchsorted_cuda@native@at@@YA?AVTensor@2@AEBV32@0_N1@Z

cuda11.3 / cuda11.6 / cuda11.7: /INCLUDE:?warp_size@cuda@at@@YAHXZ /INCLUDE:?_torch_cuda_cu_linker_symbol_op_cuda@native@at@@YA?AVTensor@2@AEBV32@@Z

Run the test again, this time moving the tensor to CUDA:

#include <iostream>
#include <torch/torch.h>
#include <torch/script.h>
using namespace std;

int main()
{
    cout << "CUDA available: " << torch::cuda::is_available() << endl;
    cout << "cuDNN available: " << torch::cuda::cudnn_is_available() << endl;
    cout << torch::cuda::device_count() << endl;

    // Same tensor as before, now moved to the GPU
    torch::Tensor tr = torch::arange(0, 9, torch::kFloat32).reshape({ 3, 3 }).to(torch::kCUDA);
    cout << tr << endl;
    return 0;
}

Result: torch::cuda::is_available() now returns 1 and the tensor prints from the GPU.

Model deployment:

1. Export the model to TorchScript with PyTorch's torch.jit.trace or torch.jit.script.

2. Load the exported model with LibTorch and run inference.

#include <iostream>
#include <torch/torch.h>
#include <torch/script.h>
using namespace std;

int main()
{
    cout << "CUDA available: " << torch::cuda::is_available() << endl;
    cout << "cuDNN available: " << torch::cuda::cudnn_is_available() << endl;
    cout << torch::cuda::device_count() << endl;

    // 1. Load the TorchScript model and move it to the GPU
    torch::jit::script::Module SeedModule = torch::jit::load("./model/model.pt");
    SeedModule.to(torch::kCUDA);
    SeedModule.eval();

    // 2. Image preprocessing (omitted here)
    // 3. Convert to a tensor; fabricate some input data for the example
    torch::Tensor tensor_img = torch::arange(0, 3 * 256 * 256).to(torch::kFloat32).reshape({ 1, 3, 256, 256 });
    tensor_img = tensor_img.to(torch::kCUDA);   // [b,c,y,x]

    // 4. Forward pass, then move the result back to the CPU
    auto seed_pred_501 = SeedModule.forward({ tensor_img }).toTensor().squeeze().to(torch::kCPU).detach();
    return 0;
}
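For pure inference it also helps to disable gradient tracking, which saves memory and time; a minimal sketch, reusing SeedModule and tensor_img from the code above:

// Sketch: declare the guard before the forward pass; autograd
// bookkeeping is disabled for the rest of the enclosing scope
torch::NoGradGuard no_grad;
auto seed_pred = SeedModule.forward({ tensor_img }).toTensor().squeeze().to(torch::kCPU);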


Common LibTorch operations

1. Convert a tensor to a plain value

torch::Tensor a = torch::rand({ 2,3 });
float bb = a[0][0].item().toFloat();  // convert a one-element tensor to a plain value, method 1
float bbb = a[0][0].item<float>();    // method 2

2. Convert a vector to a tensor

vector<float> vvf;
for (int i = 0; i < 30; i++)
{
    vvf.push_back(i);
}
// from_blob wraps the existing buffer without copying, so the vector must
// outlive the tensor (call .clone() on the result to take an owning copy)
torch::Tensor ttt_v = torch::from_blob(vvf.data(), { 10,3 });

3. Convert a tensor to a vector

auto v1 = torch::arange(0, 15, torch::kInt32).reshape({ 5,3 });
// requires a contiguous CPU tensor; call .contiguous() first if in doubt
vector<int> ggg(v1.data_ptr<int>(), v1.data_ptr<int>() + v1.numel());

4. Convert a raw pointer to a tensor

float* fff = new float[10 * 3];
for (int i = 0; i < 30; i++)
{
    fff[i] = i * 2.0f;
}
// again no copy is made: the buffer must stay alive while the tensor is
// used, and it is still the caller's job to delete[] it afterwards
torch::Tensor ttt_p = torch::from_blob(fff, { 10,3 });

5. Slicing

torch::Tensor aa = torch::arange(0, 125, 1).reshape({ 5,5,5 });
// get the aa[1:4, 1:4, 1:4] sub-block; Slice(start, end, step) mirrors Python's start:end:step
auto bb = aa.index({ torch::indexing::Slice(1,4,1), torch::indexing::Slice(1,4,1),
                     torch::indexing::Slice(1,4,1) });

Fixing the LibTorch memory leak

        In testing, LibTorch leaked memory during forward inference whenever the batch size changed between calls, so make sure every inference uses the same batch size. If the remaining data does not fill the last batch, either discard it or pad it with zeros, as sketched below.
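A minimal sketch of the zero-padding option; the helper name pad_to_batch and the fixed batch size are illustrative, and shapes follow the [b,c,y,x] example above:

// Sketch: pad a partial batch with zeros so every forward() call
// sees the same batch size
torch::Tensor pad_to_batch(const torch::Tensor& input, int64_t batch_size)
{
    int64_t missing = batch_size - input.size(0);
    if (missing <= 0)
        return input;
    auto pad_shape = input.sizes().vec();   // copy the shape, e.g. {n,3,256,256}
    pad_shape[0] = missing;                 // rows still needed to fill the batch
    torch::Tensor pad = torch::zeros(pad_shape, input.options());
    return torch::cat({ input, pad }, 0);
}
// After forward(), keep only the first input.size(0) predictions and
// discard the rows produced by the zero padding.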

Origin: blog.csdn.net/weixin_41202834/article/details/122295530