Libtorch Manual [Repost]

Examples of commonly used API functions in libtorch (the most complete and detailed in history)

libtorch offers essentially all the functionality of PyTorch, although the syntax differs in places.
For reference, see the official libtorch documentation, in particular the page for class Tensor.

The official documentation, however, reads like a list of function declarations. It does not say what each function does, so you have to guess from the function name. For example, I wanted a tensor with the same shape as an existing torch::Tensor variable but filled with a specified value. I remembered seeing a function starting with full somewhere, searched for full, and found full_like, which looked like what I needed (see 0).


Debugging tips:

torch::Tensor box_1 = torch::rand({5,4});
std::cout<<box_1<<std::endl; // prints the values
box_1.print();// prints the type and shape

CMakeLists.txt

cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(main)
SET(CMAKE_BUILD_TYPE "Debug")

set(CMAKE_PREFIX_PATH "/data_2/everyday/0429/pytorch/torch")
find_package(Torch REQUIRED)

set(CMAKE_PREFIX_PATH "/home/yhl/software_install/opencv3.2")
find_package(OpenCV REQUIRED)

add_executable(main main.cpp)
target_link_libraries(main "${TORCH_LIBRARIES}")

target_link_libraries(main ${OpenCV_LIBS})
set_property(TARGET main PROPERTY CXX_STANDARD 11)

0.torch::full_like

static Tensor at::full_like(const Tensor &self, Scalar fill_value, const TensorOptions &options = {}, c10::optional<MemoryFormat> memory_format = c10::nullopt)
Then try it yourself:

#include <iostream>
#include "torch/script.h"
#include "torch/torch.h"
using namespace std;

int main() {   
    torch::Tensor tmp_1 = torch::rand({2,3});
    torch::Tensor tmp_2 = torch::full_like(tmp_1,1);
    
    cout<<tmp_1<<endl;
    cout<<tmp_2<<endl;
}

The printed results are as follows:
0.8465 0.5771 0.4404
0.9805 0.8665 0.7807
[ Variable[CPUFloatType]{2,3} ]
1 1 1
1 1 1
[ Variable[CPUFloatType]{2,3} ]

1. Create and initialize tensors
1.1 torch::rand
1.2 torch::empty
1.3 torch::ones
1.4 torch::zeros
1.5 torch::full: creates a tensor of the specified shape filled with a given value
1.6 fill_
1.7 torch::full_like (see 0): creates a tensor with the same shape as a known tensor, filled with the specified value

1.1 torch::rand

torch::Tensor input = torch::rand({ 1,3,2,3 });

(1,1,.,.) =
0.5943 0.4822 0.6663
0.7099 0.0374 0.9833

(1,2,.,.) =
0.4384 0.4567 0.2143
0.3967 0.4999 0.9196

(1,3,.,.) =
0.2467 0.5066 0.8654
0.7873 0.4758 0.3718
[ Variable[CPUFloatType]{1,3,2,3} ]

1.2 torch::empty

   torch::Tensor a = torch::empty({2, 4});
    std::cout << a << std::endl;

7.0374e+22 5.7886e+22 6.7120e+22 6.7331e+22
6.7120e+22 1.8515e+28 7.3867e+20 9.2358e-01
[ Variable[CPUFloatType]{2,4} ]

1.3 torch::ones

    torch::Tensor a = torch::ones({2, 4});
    std::cout << a<< std::endl;

1 1 1 1
1 1 1 1
[ Variable[CPUFloatType]{2,4} ]

1.4 torch::zeros

 torch::Tensor scores = torch::rand({10}); // placeholder input; a tensor that is only declared would throw on size(0)
 torch::Tensor keep = torch::zeros({scores.size(0)}).to(torch::kLong).to(scores.device());

1.5 torch::full
inline at::Tensor full(at::IntArrayRef size, at::Scalar fill_value, c10::optional<at::DimnameList> names, const at::TensorOptions & options = {})
inline at::Tensor full(at::IntArrayRef size, at::Scalar fill_value, const at::TensorOptions & options = {})

    torch::Tensor num_out = torch::full({ 2,3 }, -2, torch::dtype(torch::kLong));
    std::cout<<num_out<<std::endl;

1.6 fill_ (fill in place, then move to the GPU)

    torch::Tensor a = torch::ones({3,2}).fill_(-8).to(torch::kCUDA);
    std::cout<<a<<std::endl;

-8 -8
-8 -8
-8 -8
[ Variable[CUDAFloatType]{3,2} ]

2. Concatenating tensors with torch::cat, and combining std::vector with cat

2.1 Concatenating by columns

    torch::Tensor a = torch::rand({2,3});
    torch::Tensor b = torch::rand({2,1});
    torch::Tensor cat_1 = torch::cat({a,b},1);// concatenate by columns; the row counts must match

    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;
    std::cout<<cat_1<<std::endl;

0.3551 0.7215 0.3603
0.1188 0.4577 0.2201
[ Variable[CPUFloatType]{2,3} ]
0.5876
0.3040
[ Variable[CPUFloatType]{2,1} ]
0.3551 0.7215 0.3603 0.5876
0.1188 0.4577 0.2201 0.3040
[ Variable[CPUFloatType]{2,4} ]
Note: if the row counts differ, the following error is reported:
terminate called after throwing an instance of 'std::runtime_error'
what(): invalid argument 0: Sizes of tensors must match except in dimension 1. Got 2 and 4 in dimension 0 at /data_2/everyday/0429/pytorch/aten/src/TH/generic/THTensor.cpp:689
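
The shape rule behind this error can be sketched in plain C++, without libtorch. The helper name cat_shape is hypothetical and is not a libtorch function; it only mirrors the rule stated in the message, namely that every dimension except the concatenation dimension must match.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of libtorch): computes the shape that
// concatenating tensors of the given shapes along `dim` would produce.
// Throws if any dimension other than `dim` differs between inputs.
std::vector<int64_t> cat_shape(const std::vector<std::vector<int64_t>>& shapes,
                               std::size_t dim) {
    assert(!shapes.empty());
    std::vector<int64_t> out = shapes[0];
    if (dim >= out.size()) throw std::invalid_argument("dim out of range");
    for (std::size_t k = 1; k < shapes.size(); ++k) {
        const std::vector<int64_t>& s = shapes[k];
        if (s.size() != out.size()) throw std::invalid_argument("rank mismatch");
        for (std::size_t d = 0; d < s.size(); ++d) {
            if (d == dim) {
                out[d] += s[d];  // sizes add up along the concatenation dimension
            } else if (s[d] != out[d]) {
                throw std::invalid_argument("sizes must match except in dim");
            }
        }
    }
    return out;
}
```

With this sketch, shapes {2,3} and {2,1} along dimension 1 give {2,4}, while {2,3} and {4,1} along dimension 1 throw, which corresponds to the "Got 2 and 4 in dimension 0" case.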

2.2 Concatenating by rows

    torch::Tensor a = torch::rand({2,3});
    torch::Tensor b = torch::rand({1,3});
    torch::Tensor cat_1 = torch::cat({a,b},0);

    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;
    std::cout<<cat_1<<std::endl;

0.0004 0.7852 0.4586
0.1612 0.6524 0.7655
[ Variable[CPUFloatType]{2,3} ]
0.5999 0.5445 0.2152
[ Variable[CPUFloatType]{1,3} ]
0.0004 0.7852 0.4586
0.1612 0.6524 0.7655
0.5999 0.5445 0.2152
[ Variable[CPUFloatType]{3,3} ]

2.3 Other examples

    torch::Tensor box_1 = torch::rand({5,4});
    torch::Tensor score_1 = torch::rand({5,1});
    torch::Tensor label_1 = torch::rand({5,1});
    torch::Tensor result_1 = torch::cat({box_1,score_1,label_1},1);
    result_1.print();

[Variable[CPUFloatType] [5, 6]]

2.4 Combining std::vector and cat

    torch::Tensor xs_t0 = xs - wh_0 / 2;
    torch::Tensor ys_t0 = ys - wh_1 / 2;
    torch::Tensor xs_t1 = xs + wh_0 / 2;
    torch::Tensor ys_t1 = ys + wh_1 / 2;
    xs_t0.print();
    ys_t0.print();
    xs_t1.print();
    ys_t1.print();
    std::vector<torch::Tensor> abce = {xs_t0,ys_t0,xs_t1,ys_t1};
    torch::Tensor bboxes = torch::cat(abce,2);
    std::cout<<"-----cat   shape---"<<std::endl;
    bboxes.print();

Print as follows:

[Variable[CUDAType] [1, 100, 1]]
[Variable[CUDAType] [1, 100, 1]]
[Variable[CUDAType] [1, 100, 1]]
[Variable[CUDAType] [1, 100, 1]]
[Variable[CUDAType] [1, 100, 4]]
-----cat   shape---

It can also be done in a single statement:

 torch::Tensor bboxes = torch::cat({xs_t0,ys_t0,xs_t1,ys_t1},2);

3. Tensor slicing: select (shallow copy), index_select (deep copy), index (deep copy), slice (shallow copy), narrow, narrow_copy

select [shallow copy]: takes a single row or column
index [deep copy]: takes whole rows only (in the usage shown here)
index_select [deep copy]: takes several rows or columns, chosen by index along a dimension
slice [shallow copy]: takes a contiguous range of rows or columns
narrow [shallow copy], narrow_copy [deep copy]

When an operation returns a shallow copy and you do not want later changes to affect the original tensor, add clone(), for example:

 torch::Tensor x1 = boxes.select(1,0).clone();
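
The difference between a shallow copy (a view) and clone() can be illustrated with a minimal plain C++ sketch. Matrix, ColumnView, select_column and clone_column are hypothetical names invented for this illustration; they are not libtorch types, and real tensors track strides and storage in a more general way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2-D tensor stored row-major in one buffer.
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> data;
};

// "Shallow copy": a view that points into the parent's buffer, the way
// select(1, col) shares storage with the tensor it came from.
struct ColumnView {
    float* first;        // address of element (0, col)
    std::size_t stride;  // distance between consecutive rows
    std::size_t size;    // number of rows
    float& operator[](std::size_t i) { return first[i * stride]; }
};

ColumnView select_column(Matrix& m, std::size_t col) {
    assert(col < m.cols);
    return ColumnView{m.data.data() + col, m.cols, m.rows};
}

// "Deep copy": the counterpart of select(1, col).clone(); owns its own buffer.
std::vector<float> clone_column(const Matrix& m, std::size_t col) {
    assert(col < m.cols);
    std::vector<float> out(m.rows);
    for (std::size_t r = 0; r < m.rows; ++r) out[r] = m.data[r * m.cols + col];
    return out;
}
```

Writing through the ColumnView changes the Matrix it was taken from, while writing to the vector returned by clone_column leaves the Matrix untouched.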

3.1 inline Tensor Tensor::select(int64_t dim, int64_t index);
select works on tensors of any rank and returns a view with the selected dimension removed. The first parameter is the dimension (for a 2-D tensor, 0 takes a row and 1 takes a column); the second parameter is the index along that dimension.
3.1.1 select: by row

    torch::Tensor a = torch::rand({2,3});
    std::cout<<a<<std::endl;
    torch::Tensor b = a.select(0,1);// take row 1
    std::cout<<b<<std::endl;

0.6201 0.7021 0.1975
0.3080 0.6304 0.1558
[ Variable[CPUFloatType]{2,3} ]
0.3080
0.6304
0.1558
[ Variable[CPUFloatType]{3} ]
3.1.2 select: by column

    torch::Tensor a = torch::rand({2,3});
    std::cout<<a<<std::endl;

    torch::Tensor b = a.select(1,1);
    std::cout<<b<<std::endl;

0.8295 0.9871 0.1287
0.8466 0.7719 0.2354
[ Variable[CPUFloatType]{2,3} ]
0.9871
0.7719
[ Variable[CPUFloatType]{2} ]
Note: select returns a shallow copy, so changing b also changes a.
3.1.3 select is a shallow copy

    torch::Tensor a = torch::rand({2,3});
    std::cout<<a<<std::endl;
    
    torch::Tensor b = a.select(1,1);
    std::cout<<b<<std::endl;
    
    b[0] = 0.0;
    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;

0.0938 0.2861 0.0089
0.3481 0.5806 0.3711
[ Variable[CPUFloatType]{2,3} ]
0.2861
0.5806
[ Variable[CPUFloatType]{2} ]
0.0938 0.0000 0.0089
0.3481 0.5806 0.3711
[ Variable[CPUFloatType]{2,3} ]
0.0000
0.5806
[ Variable[CPUFloatType]{2} ]
After b[0] = 0.0; the corresponding element of a is 0 as well, so select returns a shallow copy.

3.2 inline Tensor Tensor::index_select(int64_t dim, const Tensor & index)
As with select, dim 0 means by row and dim 1 means by column, and index holds the row or column numbers to take. Note that in this version index must be of type torch::kLong, hence the toType(torch::kLong). One more pitfall: when I built idx from an int array with torch::from_blob, it came out as all zeros. The cause is that from_blob treats the buffer as float32 unless a dtype is passed, so the int bytes were misread as floats.

    torch::Tensor a = torch::rand({2,6});
    std::cout<<a<<std::endl;

     torch::Tensor idx = torch::empty({4}).toType(torch::kLong);
     idx[0]=0;
     idx[1]=2;
     idx[2]=4;
     idx[3]=1;

//    int idx_data[4] = {1,3,2,4};
//    torch::Tensor idx = torch::from_blob(idx_data,{4}).toType(torch::kLong);// idx is all zeros: the int buffer is read as float32
//    torch::Tensor idx = torch::from_blob(idx_data,{4},torch::kInt).toType(torch::kLong);// passing the dtype fixes it

    std::cout<<idx<<std::endl;

    torch::Tensor b = a.index_select(1,idx);
    std::cout<<b<<std::endl;

0.4956 0.5028 0.0863 0.9464 0.6714 0.5348
0.3523 0.2245 0.0924 0.7088 0.6913 0.2237
[ Variable[CPUFloatType]{2,6} ]
0
2
4
1
[ Variable[CPULongType]{4} ]
0.4956 0.0863 0.6714 0.5028
0.3523 0.0924 0.6913 0.2245
[ Variable[CPUFloatType]{2,4} ]
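
Why the commented-out from_blob attempt above yields an all-zero idx can be reproduced in plain C++. The sketch below assumes the usual situation of a 32-bit int and an IEEE 754 float; the function name is hypothetical. It reads the bytes of a small int as a float, which is what happens when an int buffer is handed over without stating its element type, and then converts that float to a 64-bit integer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads the 4 bytes of an int as if they were a float32, then converts that
// float to a 64-bit integer the way a later cast to kLong would.
int64_t int_bits_read_as_float_then_long(int32_t value) {
    static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits");
    // Precondition for this sketch: small non-negative values, whose bit
    // patterns are denormal floats (far below 1.0).
    assert(value >= 0 && value < 0x00800000);
    float as_float;
    std::memcpy(&as_float, &value, sizeof(as_float));
    return static_cast<int64_t>(as_float);  // truncates toward zero
}
```

For index values such as 1, 3, 2 and 4 the reinterpreted floats are tiny denormal numbers, so the integer conversion truncates every one of them to 0.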

3.2.2 index_select deep copy

    torch::Tensor a = torch::rand({2,6});
    std::cout<<a<<std::endl;


     torch::Tensor idx = torch::empty({4}).toType(torch::kLong);
     idx[0]=0;
     idx[1]=2;
     idx[2]=4;
     idx[3]=1;

//    int idx_data[4] = {1,3,2,4};
//    torch::Tensor idx = torch::from_blob(idx_data,{4},torch::kInt).toType(torch::kLong);// the dtype must be passed for int data

    std::cout<<idx<<std::endl;

    torch::Tensor b = a.index_select(1,idx);
    std::cout<<b<<std::endl;

    b[0][0]=0.0;
    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;

0.6118 0.6078 0.5052 0.9489 0.6201 0.8975
0.0901 0.2040 0.1452 0.6452 0.9593 0.7454
[ Variable[CPUFloatType]{2,6} ]
0
2
4
1
[ Variable[CPULongType]{4} ]
0.6118 0.5052 0.6201 0.6078
0.0901 0.1452 0.9593 0.2040
[ Variable[CPUFloatType]{2,4} ]
0.6118 0.6078 0.5052 0.9489 0.6201 0.8975
0.0901 0.2040 0.1452 0.6452 0.9593 0.7454
[ Variable[CPUFloatType]{2,6} ]
0.0000 0.5052 0.6201 0.6078
0.0901 0.1452 0.9593 0.2040
[ Variable[CPUFloatType]{2,4} ]

3.3 index
inline Tensor Tensor::index(TensorList indices)
In my experiments, this function could only take whole rows, and the result is a deep copy.

    torch::Tensor a = torch::rand({2,6});
    std::cout<<a<<std::endl;


    torch::Tensor idx_1 = torch::empty({2}).toType(torch::kLong);
    idx_1[0]=0;
    idx_1[1]=1;


    torch::Tensor bb = a.index(idx_1);
    bb[0][0]=0;

    std::cout<<bb<<std::endl;
    std::cout<<a<<std::endl;

0.1349 0.8087 0.2659 0.3364 0.0202 0.4498
0.4785 0.4274 0.9348 0.0437 0.6732 0.3174
[ Variable[CPUFloatType]{2,6} ]
0.0000 0.8087 0.2659 0.3364 0.0202 0.4498
0.4785 0.4274 0.9348 0.0437 0.6732 0.3174
[ Variable[CPUFloatType]{2,6} ]
0.1349 0.8087 0.2659 0.3364 0.0202 0.4498
0.4785 0.4274 0.9348 0.0437 0.6732 0.3174
[ Variable[CPUFloatType]{2,6} ]
3.4 slice
inline Tensor Tensor::slice(int64_t dim, int64_t start, int64_t end, int64_t step)
dim 0 means taking rows and dim 1 means taking columns, from start up to end (exclusive).
As the result below shows, slice returns a shallow copy.

 torch::Tensor a = torch::rand({2,6});
    std::cout<<a<<std::endl;

    torch::Tensor b = a.slice(0,0,1);
    torch::Tensor c = a.slice(1,0,3);

    b[0][0]=0.0;

    std::cout<<b<<std::endl;
    std::cout<<c<<std::endl;

    std::cout<<a<<std::endl;

0.8270 0.7952 0.3743 0.7992 0.9093 0.5945
0.3764 0.8419 0.7977 0.4150 0.8531 0.9207
[ Variable[CPUFloatType]{2,6} ]
0.0000 0.7952 0.3743 0.7992 0.9093 0.5945
[ Variable[CPUFloatType]{1,6} ]
0.0000 0.7952 0.3743
0.3764 0.8419 0.7977
[ Variable[CPUFloatType]{2,3} ]
0.0000 0.7952 0.3743 0.7992 0.9093 0.5945
0.3764 0.8419 0.7977 0.4150 0.8531 0.9207
[ Variable[CPUFloatType]{2,6} ]
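
Which positions a slice keeps can be sketched in plain C++. The helper slice_indices is hypothetical; it covers only non-negative start and end with a positive step, and it assumes that an end beyond the dimension size is clipped to the size.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: lists the positions along one dimension that
// slice(dim, start, end, step) would keep. `end` is exclusive.
std::vector<int64_t> slice_indices(int64_t dim_size, int64_t start, int64_t end,
                                   int64_t step = 1) {
    assert(step > 0 && start >= 0 && end >= 0);
    if (end > dim_size) end = dim_size;  // assumption: an oversized end is clipped
    std::vector<int64_t> kept;
    for (int64_t i = start; i < end; i += step) kept.push_back(i);
    return kept;
}
```

For the 2 x 6 tensor above, slice(0,0,1) corresponds to slice_indices(2, 0, 1), which keeps row 0 only, and slice(1,0,3) corresponds to slice_indices(6, 0, 3), which keeps columns 0, 1 and 2.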

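3.5 narrow, narrow_copy
inline Tensor Tensor::narrow(int64_t dim, int64_t start, int64_t length) const
inline Tensor Tensor::narrow_copy(int64_t dim, int64_t start, int64_t length) const
Both take length elements along dim beginning at start. narrow returns a view that shares memory with the original; narrow_copy returns an independent copy.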

    torch::Tensor a = torch::rand({4,6});
    torch::Tensor b = a.narrow(0,1,2);
    torch::Tensor c = a.narrow_copy(0,1,2);

    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;
    std::cout<<c<<std::endl;

0.9812 0.4205 0.4169 0.2412 0.8769 0.9873
0.8052 0.0312 0.9901 0.5065 0.6344 0.3408
0.0182 0.6933 0.9375 0.8675 0.5201 0.9521
0.5119 0.3880 0.1117 0.5413 0.8203 0.4163
[ Variable[CPUFloatType]{4,6} ]
0.8052 0.0312 0.9901 0.5065 0.6344 0.3408
0.0182 0.6933 0.9375 0.8675 0.5201 0.9521
[ Variable[CPUFloatType]{2,6} ]
0.8052 0.0312 0.9901 0.5065 0.6344 0.3408
0.0182 0.6933 0.9375 0.8675 0.5201 0.9521
[ Variable[CPUFloatType]{2,6} ]

4.squeeze() unsqueeze()

inline Tensor Tensor::squeeze() const // no argument: removes every dimension of size 1
inline Tensor Tensor::squeeze(int64_t dim) const // with an argument: removes only the given dimension, and only if its size is 1
inline Tensor & Tensor::squeeze_() const // in-place version of squeeze(): modifies the tensor itself
inline Tensor & Tensor::squeeze_(int64_t dim) const // in-place version of squeeze(dim)
4.1 squeeze()

(1,.,.) = 
  0.5516  0.6561  0.3603
  0.7555  0.1048  0.2016
[ Variable[CPUFloatType]{1,2,3} ]
 0.5516  0.6561  0.3603
 0.7555  0.1048  0.2016
[ Variable[CPUFloatType]{2,3} ]
(1,.,.) = 
  0.7675  0.5439  0.5162

(2,.,.) = 
  0.6103  0.1925  0.1222
[ Variable[CPUFloatType]{2,1,3} ]
 0.7675  0.5439  0.5162
 0.6103  0.1925  0.1222
[ Variable[CPUFloatType]{2,3} ]
(1,1,.,.) = 
  0.9875
  0.1980

(2,1,.,.) = 
  0.6973
  0.3272
[ Variable[CPUFloatType]{2,1,2,1} ]
 0.9875  0.1980
 0.6973  0.3272
[ Variable[CPUFloatType]{2,2} ]

4.2 squeeze(int64_t dim) specifies which dimension to compress

    torch::Tensor a = torch::rand({1,1,3});
    std::cout<<a<<std::endl;
    
    torch::Tensor b = a.squeeze();
    std::cout<<b<<std::endl;
    
    torch::Tensor c = a.squeeze(0);
    std::cout<<c<<std::endl;
    
    torch::Tensor d = a.squeeze(1);
    std::cout<<d<<std::endl;
    
    torch::Tensor e = a.squeeze(2);
    std::cout<<e<<std::endl;

(1,.,.) =
0.8065 0.1287 0.8073
[ Variable[CPUFloatType]{1,1,3} ]
0.8065
0.1287
0.8073
[ Variable[CPUFloatType]{3} ]
0.8065 0.1287 0.8073
[ Variable[CPUFloatType]{1,3} ]
0.8065 0.1287 0.8073
[ Variable[CPUFloatType]{1,3} ]
(1,.,.) =
0.8065 0.1287 0.8073
[ Variable[CPUFloatType]{1,1,3} ]
4.3. unsqueeze

    torch::Tensor a = torch::rand({2,3});
    std::cout<<a<<std::endl;

    torch::Tensor b = a.unsqueeze(0);
    std::cout<<b<<std::endl;

    torch::Tensor bb = a.unsqueeze(1);
    std::cout<<bb<<std::endl;

    torch::Tensor bbb = a.unsqueeze(2);
    std::cout<<bbb<<std::endl;

0.7945 0.0331 0.1666
0.7821 0.3359 0.0663
[ Variable[CPUFloatType]{2,3} ]
(1,.,.) =
0.7945 0.0331 0.1666
0.7821 0.3359 0.0663
[ Variable[CPUFloatType]{1,2,3} ]
(1,.,.) =
0.7945 0.0331 0.1666

(2,.,.) =
0.7821 0.3359 0.0663
[ Variable[CPUFloatType]{2,1,3} ]
(1,.,.) =
0.7945
0.0331
0.1666

(2,.,.) =
0.7821
0.3359
0.0663
[ Variable[CPUFloatType]{2,3,1} ]
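
The effect of squeeze and unsqueeze on the shape alone can be summarized in a short plain C++ sketch. The names Shape, squeeze_all, squeeze_dim and unsqueeze_dim are hypothetical and work on shape vectors only, not on tensor data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Shape = std::vector<int64_t>;

// squeeze(): drop every dimension of size 1.
Shape squeeze_all(const Shape& s) {
    Shape out;
    for (int64_t d : s) {
        if (d != 1) out.push_back(d);
    }
    return out;
}

// squeeze(dim): drop dimension `dim` only if its size is 1; otherwise no change.
Shape squeeze_dim(const Shape& s, std::size_t dim) {
    assert(dim < s.size());
    Shape out = s;
    if (out[dim] == 1) out.erase(out.begin() + static_cast<std::ptrdiff_t>(dim));
    return out;
}

// unsqueeze(dim): insert a new dimension of size 1 at position `dim`.
Shape unsqueeze_dim(const Shape& s, std::size_t dim) {
    assert(dim <= s.size());
    Shape out = s;
    out.insert(out.begin() + static_cast<std::ptrdiff_t>(dim), int64_t{1});
    return out;
}
```

These reproduce the shapes printed in 4.1 to 4.3: {1,1,3} squeezed at dimension 2 stays {1,1,3} because that dimension has size 3, and {2,3} unsqueezed at dimension 1 becomes {2,1,3}.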

5. torch::nonzero returns the coordinates of the non-zero elements

    torch::Tensor a = torch::rand({2,3});
    a[0][1] = 0;
    a[1][2] = 0;
    std::cout<<a<<std::endl;
     torch::Tensor b = torch::nonzero(a);
     std::cout<<b<<std::endl;

0.4671 0.0000 0.3360
0.9320 0.9246 0.0000
[ Variable[CPUFloatType]{2,3} ]
0 0
0 2
1 0
1 1
[ Variable[CPULongType]{4,2} ]
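
The {4,2} result reads as one (row, column) pair per non-zero element. A minimal plain C++ sketch of the same idea for a 2-D, row-major buffer follows; nonzero_2d is a hypothetical helper, not a libtorch function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: returns the (row, column) pair of every non-zero
// element of a row-major rows x cols matrix, in row-major order.
std::vector<std::pair<int64_t, int64_t>> nonzero_2d(const std::vector<float>& data,
                                                    std::size_t rows,
                                                    std::size_t cols) {
    assert(data.size() == rows * cols);
    std::vector<std::pair<int64_t, int64_t>> coords;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            if (data[r * cols + c] != 0.0f) {
                coords.emplace_back(static_cast<int64_t>(r), static_cast<int64_t>(c));
            }
        }
    }
    return coords;
}
```

Fed with the values printed above, the sketch returns the pairs (0,0), (0,2), (1,0) and (1,1), matching the four rows of the output.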

6. Reading a value out of a tensor: a.item() converts a one-element tensor to a C++ scalar

To take a single value out of a tensor as an int or float, use auto bbb = a[1][1].item().toFloat();
Indexing with subscripts, for example a[0][1], gives you the element, but it is still a tensor. To get a C++ int or float, do the following:

    torch::Tensor a = torch::rand({2,3});
    std::cout<<a<<std::endl;
    std::cout<<a[1][1]<<std::endl; // still a zero-dimensional tensor
    auto bbb = a[1][1].item().toFloat();
    std::cout<<bbb<<std::endl;

0.7303 0.6608 0.0024
0.5917 0.0145 0.6472
[ Variable[CPUFloatType]{2,3} ]
0.014509
[ Variable[CPUFloatType]{} ]
0.014509

Additional examples:

    torch::Tensor scores = torch::rand({10});
    std::tuple<torch::Tensor,torch::Tensor> sort_ret = torch::sort(scores.unsqueeze(1), 0, 1);
    torch::Tensor v = std::get<0>(sort_ret).squeeze(1).to(scores.device());
    torch::Tensor idx = std::get<1>(sort_ret).squeeze(1).to(scores.device());
    std::cout<<scores<<std::endl;
    std::cout<<v<<std::endl;
    std::cout<<idx<<std::endl;

    for(int i=0;i<10;i++)
    {
         int idx_1 = idx[i].item<int>();
         float s = v[i].item<float>();

          std::cout<<idx_1<<"  "<<s<<std::endl;
    }

0.1125
0.9524
0.7033
0.3204
0.7907
0.8486
0.7783
0.3215
0.0378
0.7512
[ Variable[CPUFloatType]{10} ]
0.9524
0.8486
0.7907
0.7783
0.7512
0.7033
0.3215
0.3204
0.1125
0.0378
[ Variable[CPUFloatType]{10} ]
1
5
4
6
9
2
7
3
0
8
[ Variable[CPULongType]{10} ]
1 0.952351
5 0.848641
4 0.790685
6 0.778329
9 0.751163
2 0.703278
7 0.32146
3 0.320435
0 0.112517
8 0.0378203

7. Converting an OpenCV Mat, or vector/array data, to a tensor

7.1

   Mat m_out = imread(path);
 //[320,320,3]
    input_tensor = torch::from_blob(
                m_out.data, {m_SIZE_IMAGE, m_SIZE_IMAGE, 3}).toType(torch::kFloat32);// not torch::kByte: a big pitfall
    //[3,320,320]
    input_tensor = input_tensor.permute({2,0,1});
    input_tensor = input_tensor.unsqueeze(0);
    input_tensor = input_tensor.to(torch::kFloat).to(m_device);

Note that the image above had already been preprocessed (the mean was subtracted), so m_out contains negative pixel values. If the data were treated as torch::kByte, the negative numbers would turn into positive ones, so the torch::kFloat32 type is required.
About permute({2,0,1}): the OpenCV Mat has its dimensions in the order
0 1 2
[320,320,3]
After permute({2,0,1}) the dimensions are reordered accordingly and the shape becomes [3,320,320].
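
The reordering behind permute({2,0,1}) can be written out by hand in plain C++. In libtorch, permute only returns a view with reordered dimensions; the sketch below instead copies the elements into the new order, which is the memory layout you would get after also making the result contiguous. hwc_to_chw is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies an interleaved height x width x channels buffer
// (the layout of an OpenCV Mat) into channels x height x width order.
std::vector<float> hwc_to_chw(const std::vector<float>& hwc, std::size_t height,
                              std::size_t width, std::size_t channels) {
    assert(hwc.size() == height * width * channels);
    std::vector<float> chw(hwc.size());
    for (std::size_t h = 0; h < height; ++h) {
        for (std::size_t w = 0; w < width; ++w) {
            for (std::size_t c = 0; c < channels; ++c) {
                // element (h, w, c) of the source becomes element (c, h, w)
                chw[(c * height + h) * width + w] = hwc[(h * width + w) * channels + c];
            }
        }
    }
    return chw;
}
```

For a 1 x 2 image with 3 channels stored as {1,2,3, 4,5,6}, the result is {1,4, 2,5, 3,6}: each channel's values end up next to each other.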

7.2

std::vector<float> region_priors;
// filled with region_priors.push_back(num); region_priors holds 6375 x 4 values
torch::Tensor m_prior = torch::from_blob(region_priors.data(),{6375,4}).cuda();

8. Tensor size(), sizes(), numel()

    torch::Tensor a = torch::rand({2,3});
    std::cout<<a<<std::endl;

    auto aa = a.size(0);
    auto bb = a.size(1);
    auto a_size = a.sizes();
    std::cout<<aa<<std::endl;
    std::cout<<bb<<std::endl;
    std::cout<<a_size<<std::endl;

    int num_ = a.numel();
    std::cout<<num_<<std::endl;

0.6522 0.0480 0.0009
0.1185 0.4639 0.0386
[ Variable[CPUFloatType]{2,3} ]
2
3
[2, 3]
6

8.2
There is a pitfall: if a tensor is only declared with torch::Tensor a; and then accessed:

    torch::Tensor a;
     auto a_size = a.sizes();

an error is reported:
terminate called after throwing an instance of 'c10::Error'
what(): sizes() called on undefined Tensor (sizes at /data_2/everyday/0429/pytorch/c10/core/UndefinedTensorImpl.cpp:12)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x6a (0x7f83b563f0aa in /data_2/everyday/0429/pytorch/torch/lib/libc10.so)
frame #1: c10::UndefinedTensorImpl::sizes() const + 0x258 (0x7f83b56362b8 in /data_2/everyday/0429/pytorch/torch/lib/libc10.so)
frame #2: at::Tensor::sizes() const + 0x27 (0x405fc9 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #3: main + 0x30 (0x405d06 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #4: __libc_start_main + 0xf0 (0x7f83b4d12830 in /lib/x86_64-linux-gnu/libc.so.6)
frame #5: _start + 0x29 (0x405c09 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)

The program terminates abnormally.
Calling numel() instead causes no problem. You can also test a.defined() before touching a tensor that may not have been assigned.

    torch::Tensor a;
    int num_ = a.numel();
    std::cout<<num_<<std::endl;

8.3 Getting the number of dimensions: for a shape such as [1,5,8,2], I want the value 4

auto aaa = img_poly.sizes();
int len_ = aaa.size(); // img_poly.dim() returns the same number

9.torch::sort

static inline std::tuple<Tensor,Tensor> sort(const Tensor & self, int64_t dim, bool descending)
dim is the dimension to sort along: 0 sorts the entries of each column (across rows), 1 sorts the entries of each row.
descending=false sorts in ascending order, true in descending order.
Returns a tuple: the first element holds the sorted values, and the second holds the index each sorted value had in the original tensor.

    torch::Tensor scores = torch::rand({10});
    std::tuple<torch::Tensor,torch::Tensor> sort_ret = torch::sort(scores.unsqueeze(1), 0, /*descending=*/true);
    torch::Tensor v = std::get<0>(sort_ret).squeeze(1).to(scores.device());
    torch::Tensor idx = std::get<1>(sort_ret).squeeze(1).to(scores.device());
    std::cout<<scores<<std::endl;
    std::cout<<v<<std::endl;
    std::cout<<idx<<std::endl;

0.8355
0.1386
0.7910
0.0988
0.2607
0.7810
0.7855
0.5529
0.5846
0.1403
[ Variable[CPUFloatType]{10} ]
0.8355
0.7910
0.7855
0.7810
0.5846
0.5529
0.2607
0.1403
0.1386
0.0988
[ Variable[CPUFloatType]{10} ]
0
2
6
5
8
7
4
9
1
3
[ Variable[CPULongType]{10} ]
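
The pair of results (sorted values plus original indices) can be reproduced for a 1-D input with the standard library. This is only a sketch of the idea; sort_with_indices is a hypothetical helper and says nothing about how libtorch implements sorting. It assumes the input contains no NaN values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper mirroring what torch::sort returns for a 1-D input:
// the sorted values plus, for each of them, its index in the original data.
std::pair<std::vector<float>, std::vector<int64_t>> sort_with_indices(
        const std::vector<float>& scores, bool descending) {
    std::vector<int64_t> idx(scores.size());
    std::iota(idx.begin(), idx.end(), int64_t{0});  // 0, 1, 2, ...
    auto before = [descending](float x, float y) {
        return descending ? x > y : x < y;
    };
    // Sort the indices by the score each one refers to.
    std::stable_sort(idx.begin(), idx.end(), [&](int64_t a, int64_t b) {
        return before(scores[static_cast<std::size_t>(a)],
                      scores[static_cast<std::size_t>(b)]);
    });
    std::vector<float> values(scores.size());
    for (std::size_t i = 0; i < idx.size(); ++i) {
        values[i] = scores[static_cast<std::size_t>(idx[i])];
    }
    assert(std::is_sorted(values.begin(), values.end(), before));
    return std::make_pair(values, idx);
}
```

With the ten scores printed above and descending set to true, the index vector comes out as 0 2 6 5 8 7 4 9 1 3, the same as the idx tensor in the output.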

10. clamp limits values to the range between min and max: anything below min becomes min, and anything above max becomes max.

inline Tensor Tensor::clamp(c10::optional<Scalar> min, c10::optional<Scalar> max) const

    torch::Tensor a = torch::rand({2,3});
    a[0][0] = 20;
    a[0][1] = 21;
    a[0][2] = 22;
    a[1][0] = 23;
    a[1][1] = 24;
    std::cout<<a<<std::endl;

    torch::Tensor b = a.clamp(21,22);
    std::cout<<b<<std::endl;

20.0000 21.0000 22.0000
23.0000 24.0000 0.4792
[ Variable[CPUFloatType]{2,3} ]
21 21 22
22 22 21
[ Variable[CPUFloatType]{2,3} ]
In practice the bound is often a value read from another tensor, and sometimes only one side needs limiting. For example, to limit only the minimum:

 xx1 = xx1.clamp(x1[i].item().toFloat(),INT_MAX*1.0);
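
Both the two-sided and the one-sided case can be sketched element by element in plain C++. clamp_all and clamp_min_only are hypothetical helpers for illustration; the one-sided version uses the largest finite float as the upper bound, which plays the role INT_MAX*1.0 plays in the line above.

```cpp
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical helper: returns a copy of `values` with every element
// limited to the range [lo, hi].
std::vector<float> clamp_all(std::vector<float> values, float lo, float hi) {
    assert(lo <= hi);
    for (float& v : values) {
        if (v < lo) {
            v = lo;
        } else if (v > hi) {
            v = hi;
        }
    }
    return values;
}

// One-sided variant: only a lower bound is enforced.
std::vector<float> clamp_min_only(std::vector<float> values, float lo) {
    return clamp_all(std::move(values), lo, std::numeric_limits<float>::max());
}
```

Applied to the six values of the example with bounds 21 and 22, clamp_all gives 21 21 22 22 22 21, the same as the tensor output.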

11. Comparison operators: greater than >, less than <

    torch::Tensor a = torch::rand({2,3});
    std::cout<<a<<std::endl;
    torch::Tensor b = a > 0.5;
    std::cout<<b<<std::endl;

0.3526 0.0321 0.7098
0.9794 0.6531 0.9410
[ Variable[CPUFloatType]{2,3} ]
0 0 1
1 1 1
[ Variable[CPUBoolType]{2,3} ]

12. Transpose Tensor::transpose

inline Tensor Tensor::transpose(int64_t dim0, int64_t dim1) const

    torch::Tensor a = torch::rand({2,3});
    std::cout<<a<<std::endl;

    torch::Tensor b = a.transpose(1,0);
    std::cout<<b<<std::endl;

0.4039 0.3568 0.9978
0.6895 0.7258 0.5576
[ Variable[CPUFloatType]{2,3} ]
0.4039 0.6895
0.3568 0.7258
0.9978 0.5576
[ Variable[CPUFloatType]{3,2} ]

13.expand_as

inline Tensor Tensor::expand_as(const Tensor & other) const

    torch::Tensor a = torch::rand({2,3});
    //    torch::Tensor b = torch::ones({2,2});
    torch::Tensor b = torch::ones({2,1});
    torch::Tensor c = b.expand_as(a);
    
    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;
    std::cout<<c<<std::endl;

0.6063 0.4150 0.7665
0.8663 0.9563 0.7461
[ Variable[CPUFloatType]{2,3} ]
1
1
[ Variable[CPUFloatType]{2,1} ]
1 1 1
1 1 1
[ Variable[CPUFloatType]{2,3} ]

Note that the shapes must be compatible. Writing either torch::Tensor b = torch::ones({2,2}); or torch::Tensor b = torch::ones({2}); reports an error:
terminate called after throwing an instance of 'c10::Error'
what(): The expanded size of the tensor (3) must match the existing size (2) at non-singleton dimension 1. Target sizes: [2, 3]. Tensor sizes: [2, 2] (inferExpandGeometry at /data_2/everyday/0429/pytorch/aten/src/ATen/ExpandUtils.cpp:76)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x6a (0x7f6a488150aa in /data_2/everyday/0429/pytorch/torch/lib/libc10.so)
frame #1: at::inferExpandGeometry(c10::ArrayRef, c10::ArrayRef, c10::ArrayRef) + 0x76b (0x7f6a49df7a4b in /data_2/everyday/0429/pytorch/torch/lib/libtorch.so)
frame #2: at::native::expand(at::Tensor const&, c10::ArrayRef, bool) + 0x84 (0x7f6a4a1e4324 in /data_2/everyday/0429/pytorch/torch/lib/libtorch.so)
frame #3: + 0x1aeb9e1 (0x7f6a4a5189e1 in /data_2/everyday/0429/pytorch/torch/lib/libtorch.so)
frame #4: + 0x19e8a2e (0x7f6a4a415a2e in /data_2/everyday/0429/pytorch/torch/lib/libtorch.so)
frame #5: + 0x3509dee (0x7f6a4bf36dee in /data_2/everyday/0429/pytorch/torch/lib/libtorch.so)
frame #6: + 0x19e8a2e (0x7f6a4a415a2e in /data_2/everyday/0429/pytorch/torch/lib/libtorch.so)
frame #7: + 0x14e8a61 (0x7f6a49f15a61 in /data_2/everyday/0429/pytorch/torch/lib/libtorch.so)
frame #8: at::native::expand_as(at::Tensor const&, at::Tensor const&) + 0x39 (0x7f6a4a1e4d49 in /data_2/everyday/0429/pytorch/torch/lib/libtorch.so)
frame #9: + 0x1aece9f (0x7f6a4a519e9f in /data_2/everyday/0429/pytorch/torch/lib/libtorch.so)
frame #10: + 0x3680543 (0x7f6a4c0ad543 in /data_2/everyday/0429/pytorch/torch/lib/libtorch.so)
frame #11: + 0x19e6bb4 (0x7f6a4a413bb4 in /data_2/everyday/0429/pytorch/torch/lib/libtorch.so)
frame #12: at::Tensor c10::KernelFunction::callUnboxed<at::Tensor, at::Tensor const&, at::Tensor const&>(at::Tensor const&, at::Tensor const&) const + 0xb0 (0x433e06 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #13: at::Tensor c10::impl::OperatorEntry::callUnboxed<at::Tensor, at::Tensor const&, at::Tensor const&>(c10::TensorTypeId, at::Tensor const&, at::Tensor const&) const::{lambda(c10::DispatchTable const&)#1}::operator()(c10::DispatchTable const&) const + 0x79 (0x432525 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #14: std::result_of<at::Tensor c10::impl::OperatorEntry::callUnboxed<at::Tensor, at::Tensor const&, at::Tensor const&>(c10::TensorTypeId, at::Tensor const&, at::Tensor const&) const::{lambda(c10::DispatchTable const&)#1} (c10::DispatchTable const&)>::type c10::LeftRightc10::DispatchTable::read<at::Tensor c10::impl::OperatorEntry::callUnboxed<at::Tensor, at::Tensor const&, at::Tensor const&>(c10::TensorTypeId, at::Tensor const&, at::Tensor const&) const::{lambda(c10::DispatchTable const&)#1}>(at::Tensor c10::impl::OperatorEntry::callUnboxed<at::Tensor, at::Tensor const&, at::Tensor const&>(c10::TensorTypeId, at::Tensor const&, at::Tensor const&) const::{lambda(c10::DispatchTable const&)#1}&&) const + 0x11c (0x4340ba in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #15: at::Tensor c10::impl::OperatorEntry::callUnboxed<at::Tensor, at::Tensor const&, at::Tensor const&>(c10::TensorTypeId, at::Tensor const&, at::Tensor const&) const + 0x5f (0x4325a5 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #16: at::Tensor c10::Dispatcher::callUnboxed<at::Tensor, at::Tensor const&, at::Tensor const&>(c10::OperatorHandle const&, c10::TensorTypeId, at::Tensor const&, at::Tensor const&) const + 0x85 (0x42fd5d in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #17: at::Tensor::expand_as(at::Tensor const&) const + 0x1a5 (0x42ba47 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #18: main + 0xbd (0x427c97 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #19: __libc_start_main + 0xf0 (0x7f6a47ee8830 in /lib/x86_64-linux-gnu/libc.so.6)
frame #20: _start + 0x29 (0x426999 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
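
The shape requirement stated in the error message ("must match the existing size at non-singleton dimension") can be checked with a few lines of plain C++. can_expand is a hypothetical helper that works on shape vectors only; it follows the usual broadcasting convention of aligning shapes at their last dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: true if a tensor of shape `from` could be expanded to
// shape `to`. Shapes are aligned at their last dimension; each existing
// dimension must either equal the target size or be 1.
bool can_expand(const std::vector<int64_t>& from, const std::vector<int64_t>& to) {
    if (from.size() > to.size()) return false;
    const std::size_t offset = to.size() - from.size();
    for (std::size_t i = 0; i < from.size(); ++i) {
        assert(from[i] >= 0 && to[offset + i] >= 0);
        if (from[i] != 1 && from[i] != to[offset + i]) return false;
    }
    return true;
}
```

Under this rule {2,1} can expand to {2,3}, while {2,2} and {2} cannot, because their last dimension is 2 where the target has 3.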

14. Multiply mul_, divide div, and subtract sub_

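        // The trailing underscore marks an in-place operation. select() returns a view,
        // so these calls scale the columns of boxes_my itself.
        boxes_my.select(1,0).mul_(width);
        boxes_my.select(1,1).mul_(height);
        boxes_my.select(1,2).mul_(width);
        boxes_my.select(1,3).mul_(height);
        // div without the underscore returns a new tensor and leaves prediction unchanged;
        // keep the result, or use div_ to modify prediction in place.
        prediction.select(2, 3).div(2);
        input_tensor[0][0] = input_tensor[0][0].sub_(0.485).div_(0.229);
        input_tensor[0][1] = input_tensor[0][1].sub_(0.456).div_(0.224);
        input_tensor[0][2] = input_tensor[0][2].sub_(0.406).div_(0.225);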

15. Load the model

    torch::Device m_device(torch::kCUDA);
    torch::jit::script::Module m_model = torch::jit::load(path_pt);
    m_model.to(m_device);
    m_model.eval();

16.The result of model forward

When the model returns several outputs, unpack them from the returned tuple:

 auto output = m_model.forward({input_tensor});

    auto tpl = output.toTuple();
    auto arm_loc = tpl->elements()[0].toTensor();
    // arm_loc.print();
    //    std::cout<<arm_loc[0]<<std::endl;
    auto arm_conf = tpl->elements()[1].toTensor();
    //arm_conf.print();
    auto odm_loc = tpl->elements()[2].toTensor();
    //odm_loc.print();
    //     std::cout<<odm_loc[0]<<std::endl;
    auto odm_conf = tpl->elements()[3].toTensor();
    //    odm_conf.print();

17.resize_ zero_

Tensor & resize_(IntArrayRef size) const;
Tensor & zero_() const;

    torch::Tensor a = torch::rand({1,3,2,2});

    const int batch_size = a.size(0);
    const int depth = a.size(1);
    const int image_height = a.size(2);
    const int image_width = a.size(3);

    torch::Tensor crops = torch::rand({1,3,2,2});
    //    torch::Tensor crops;
    crops.resize_({ batch_size, depth, image_height, image_width });
    crops.zero_();

    std::cout<<a<<std::endl;
    std::cout<<crops<<std::endl;

(1,1,.,.) =
0.7889 0.3291
0.2541 0.8283

(1,2,.,.) =
0.0209 0.1846
0.2528 0.2755

(1,3,.,.) =
0.0294 0.6623
0.2736 0.3376
[ Variable[CPUFloatType]{1,3,2,2} ]
(1,1,.,.) =
0 0
0 0

(1,2,.,.) =
0 0
0 0

(1,3,.,.) =
0 0
0 0
[ Variable[CPUFloatType]{1,3,2,2} ]
Note: if crops is only declared with torch::Tensor crops; instead of being initialized with torch::Tensor crops = torch::rand({1,3,2,2});, an error is reported. It seems the tensor has to be initialized, so that memory is allocated, before resize_ is called:
terminate called after throwing an instance of 'c10::Error'
what(): There were no tensor arguments to this function (eg, you passed an empty list of Tensors), but no fallback function is registered for schema aten::resize_. This usually means that this function requires a non-empty list of Tensors. Available functions are [CUDATensorId, QuantizedCPUTensorId, CPUTensorId, VariableTensorId] (lookup_ at /data_2/everyday/0429/pytorch/torch/include/ATen/core/dispatch/DispatchTable .h:243)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x6a (0x7fa2f5f450aa in /data_2/everyday/0429/pytorch/torch/lib/libc10.so)
frame #1: c10::KernelFunction const& c10::DispatchTable::lookup_<c10::DispatchTable::lookup(c10::TensorTypeId) const::{lambda()#1}>(c10::DispatchTable::lookup(c10::TensorTypeId) const::{lambda()#1} const&) const + 0x1da (0x42eaa8 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #2: c10::DispatchTable::lookup(c10::TensorTypeId) const + 0x3a (0x42acf4 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #3: at::Tensor& c10::impl::OperatorEntry::callUnboxedOnly<at::Tensor&, at::Tensor&, c10::ArrayRef >(c10::TensorTypeId, at::Tensor&, c10::ArrayRef) const::{lambda(c10::DispatchTable const&)#1}::operator()(c10::DispatchTable const&) const + 0x51 (0x431543 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #4: std::result_of<at::Tensor& c10::impl::OperatorEntry::callUnboxedOnly<at::Tensor&, at::Tensor&, c10::ArrayRef >(c10::TensorTypeId, at::Tensor&, c10::ArrayRef) const::{lambda(c10::DispatchTable const&)#1} (c10::DispatchTable const&)>::type c10::LeftRightc10::DispatchTable::read<at::Tensor& c10::impl::OperatorEntry::callUnboxedOnly<at::Tensor&, at::Tensor&, c10::ArrayRef >(c10::TensorTypeId, at::Tensor&, c10::ArrayRef) const::{lambda(c10::DispatchTable const&)#1}>(at::Tensor& c10::impl::OperatorEntry::callUnboxedOnly<at::Tensor&, at::Tensor&, c10::ArrayRef >(c10::TensorTypeId, at::Tensor&, c10::ArrayRef) const::{lambda(c10::DispatchTable const&)#1}&&) const + 0x114 (0x4333c6 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #5: at::Tensor& c10::impl::OperatorEntry::callUnboxedOnly<at::Tensor&, at::Tensor&, c10::ArrayRef >(c10::TensorTypeId, at::Tensor&, c10::ArrayRef) const + 0x63 (0x4315c7 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #6: at::Tensor& c10::Dispatcher::callUnboxedOnly<at::Tensor&, at::Tensor&, c10::ArrayRef >(c10::OperatorHandle const&, c10::TensorTypeId, at::Tensor&, c10::ArrayRef) const + 0x7b (0x42eff5 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #7: at::Tensor::resize_(c10::ArrayRef) const + 0x1a1 (0x42af3f in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #8: main + 0x134 (0x42798f in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)
frame #9: __libc_start_main + 0xf0 (0x7fa2f5618830 in /lib/x86_64-linux-gnu/libc.so.6)
frame #10: _start + 0x29 (0x426719 in /data_2/everyday/0516/build-libtorch- syntax-unknown-Default/main)

18.meshgrid expands 1-D tensors into 2-D grids

static inline std::vector<Tensor> meshgrid(TensorList tensors)

    torch::Tensor scales = torch::ones({2});
    torch::Tensor ratios = torch::ones({2});
    ratios  += 2;

    std::cout<<scales<<std::endl;
    std::cout<<ratios<<std::endl;

    std::vector<torch::Tensor> mesh = torch::meshgrid({ scales, ratios });

    torch::Tensor scales_1 = mesh[0];
    torch::Tensor ratios_1 = mesh[1];

    std::cout<<scales_1<<std::endl;
    std::cout<<ratios_1<<std::endl;

1
1
[ Variable[CPUFloatType]{2} ]
3
3
[ Variable[CPUFloatType]{2} ]
1 1
1 1
[ Variable[CPUFloatType]{2,2} ]
3 3
3 3
[ Variable[CPUFloatType]{2,2} ]
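
Because both inputs above are constant, the printed grids do not show which input varies along which axis. The following is a minimal plain C++ sketch (no libtorch; meshgrid_ij and Matrix are names made up for illustration) of the row/column layout that the output above is consistent with, assuming the default "ij" indexing:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Both outputs have shape [x.size(), y.size()].
// The first grid repeats x down the rows, the second repeats y across the columns.
std::pair<Matrix, Matrix> meshgrid_ij(const std::vector<float>& x,
                                      const std::vector<float>& y) {
    Matrix gx(x.size(), std::vector<float>(y.size()));
    Matrix gy(x.size(), std::vector<float>(y.size()));
    for (std::size_t i = 0; i < x.size(); ++i) {
        for (std::size_t j = 0; j < y.size(); ++j) {
            gx[i][j] = x[i];  // constant along each row
            gy[i][j] = y[j];  // constant along each column
        }
    }
    return {gx, gy};
}
```

With x = {1, 2} and y = {10, 20, 30}, the first grid has rows {1,1,1} and {2,2,2}, and the second grid has the row {10,20,30} twice.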

19.flatten flattens a tensor

Tensor flatten(int64_t start_dim=0, int64_t end_dim=-1) const;
Tensor flatten(int64_t start_dim, int64_t end_dim, Dimname out_dim) const;
Tensor flatten(Dimname start_dim, Dimname end_dim, Dimname out_dim) const;
Tensor flatten(DimnameList dims, Dimname out_dim) const;

   torch::Tensor a = torch::rand({2,3});
    torch::Tensor b = a.flatten();
    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;

0.9953 0.1461 0.0084
0.6169 0.4037 0.7685
[ Variable[CPUFloatType]{2,3} ]
0.9953
0.1461
0.0084
0.6169
0.4037
0.7685

20.fill_ fills the current tensor with a value in place

Tensor & fill_(Scalar value) const;
Tensor & fill_(const Tensor & value) const;

    torch::Tensor a = torch::rand({2,3});
    torch::Tensor b = a.fill_(4);

    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;

4 4 4
4 4 4
[ Variable[CPUFloatType]{2,3} ]
4 4 4
4 4 4
[ Variable[CPUFloatType]{2,3} ]

21.torch::stack

static inline Tensor stack(TensorList tensors, int64_t dim)

    torch::Tensor a = torch::rand({3});
    torch::Tensor b = torch::rand({3});
    torch::Tensor c = torch::stack({a,b},1);

    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;
    std::cout<<c<<std::endl;

0.6776
0.5610
0.2835
[ Variable[CPUFloatType]{3} ]
0.6846
0.3753
0.3873
[ Variable[CPUFloatType]{3} ]
0.6776 0.6846
0.5610 0.3753
0.2835 0.3873
[ Variable[CPUFloatType]{3,2} ]

    torch::Tensor a = torch::rand({3});
    torch::Tensor b = torch::rand({3});
    torch::Tensor c = torch::stack({a,b},0);

    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;
    std::cout<<c<<std::endl;

0.7129
0.1650
0.6764
[ Variable[CPUFloatType]{3} ]
0.8035
0.1807
0.8100
[ Variable[CPUFloatType]{3} ]
0.7129 0.1650 0.6764
0.8035 0.1807 0.8100
[ Variable[CPUFloatType]{2,3} ]

22.reshape

inline Tensor Tensor::reshape(IntArrayRef shape) const

    torch::Tensor a = torch::rand({2,4});
    torch::Tensor b = a.reshape({-1,2});
    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;

0.3782 0.6390 0.6919 0.8298
0.3872 0.5923 0.4337 0.9634
[ Variable[CPUFloatType]{2,4} ]
0.3782 0.6390
0.6919 0.8298
0.3872 0.5923
0.4337 0.9634
[ Variable[CPUFloatType]{4,2} ]

23. view

inline Tensor Tensor::view(IntArrayRef size) const

view only works when the memory layout is compatible with the new shape, so call contiguous() first to be safe:
a.contiguous().view({-1, 4});

 torch::Tensor a = torch::rand({2,3});
    torch::Tensor b = a.contiguous().view({ -1, 6 });
    torch::Tensor c = a.contiguous().view({ 3, 2 });

    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;
    std::cout<<c<<std::endl;

0.2069 0.8814 0.8506
0.6451 0.0107 0.7591
[ Variable[CPUFloatType]{2,3} ]
0.2069 0.8814 0.8506 0.6451 0.0107 0.7591
[ Variable[CPUFloatType]{1,6} ]
0.2069 0.8814
0.8506 0.6451
0.0107 0.7591
[ Variable[CPUFloatType]{3,2} ]
Note that view keeps the row-major element order and only changes the shape, so this is different from transpose.
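
To make the difference concrete, here is a minimal plain C++ sketch (no libtorch; view_as and transpose are helper names made up for illustration) that treats a 2-D tensor as a flat row-major buffer. It only models the element order, not libtorch's actual stride handling:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<float>>;

// "view": reinterpret the same flat data with a new shape; element order is unchanged.
Grid view_as(const std::vector<float>& flat, std::size_t rows, std::size_t cols) {
    assert(flat.size() == rows * cols);
    Grid out(rows, std::vector<float>(cols));
    for (std::size_t k = 0; k < flat.size(); ++k)
        out[k / cols][k % cols] = flat[k];
    return out;
}

// "transpose": element (i, j) of the rows x cols source moves to (j, i).
Grid transpose(const std::vector<float>& flat, std::size_t rows, std::size_t cols) {
    assert(flat.size() == rows * cols);
    Grid out(cols, std::vector<float>(rows));
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            out[j][i] = flat[i * cols + j];
    return out;
}
```

For the 2x3 data {1,2,3,4,5,6}, view_as(flat, 3, 2) gives rows {1,2}, {3,4}, {5,6}, while transpose(flat, 2, 3) gives rows {1,4}, {2,5}, {3,6}. Both results are 3x2, but the elements land in different places.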

24.argmax argmin

static inline Tensor argmax(const Tensor & self, c10::optional<int64_t> dim=c10::nullopt, bool keepdim=false);
static inline Tensor argmin(const Tensor & self, c10::optional<int64_t> dim=c10::nullopt, bool keepdim=false);

    torch::Tensor a = torch::rand({2,3});
    auto b = torch::argmax(a, 0);

    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;

0.9337 0.7443 0.1323
0.6514 0.5068 0.5052
[ Variable[CPUFloatType]{2,3} ]
0
0
1
[ Variable[CPULongType]{3} ]

    torch::Tensor a = torch::rand({2,3});
    auto b = torch::argmax(a, 1);

    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;

0.0062 0.3846 0.4844
0.9555 0.2844 0.4025
[ Variable[CPUFloatType]{2,3} ]
2
0
[ Variable[CPULongType]{2} ]

25.where

static inline Tensor where(const Tensor & condition, const Tensor & self, const Tensor & other);
static inline std::vector<Tensor> where(const Tensor & condition);

torch::Tensor d = torch::where(a>0.5,b,c);
Description: d has the same shape as a. At every position where a is greater than 0.5, d takes the value of b at that position; at all other positions d takes the value of c.

     
    torch::Tensor a = torch::rand({2,3});
    torch::Tensor b = torch::ones({2,3});
    torch::Tensor c = torch::zeros({2,3});

    torch::Tensor d = torch::where(a>0.5,b,c);
    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;
    std::cout<<c<<std::endl;
    std::cout<<d<<std::endl;

0.7301 0.8926 0.9570
0.0979 0.5679 0.4473
[ Variable[CPUFloatType]{2,3} ]
1 1 1
1 1 1
[ Variable[CPUFloatType]{2,3} ]
0 0 0
0 0 0
[ Variable[CPUFloatType]{2,3} ]
1 1 1
0 1 0
[ Variable[CPUFloatType]{2,3} ]

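Another example: with only a condition, where returns one index tensor per dimension (row indices first, then column indices) for the positions where the condition holds.
auto b = torch::where(a>0.5);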


    torch::Tensor a = torch::rand({2,3});
    auto b = torch::where(a>0.5);

    std::cout<<a<<std::endl;
    std::cout<<b<<std::endl;

0.3439 0.1622 0.7149
0.4845 0.5982 0.9443
[ Variable[CPUFloatType]{2,3} ]
0
1
1
[ Variable[CPULongType]{3} ]
2
1
2
[ Variable[CPULongType]{3} ]
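
The two forms of where can be sketched in plain C++ as follows. This is only an illustration of the semantics on nested vectors (no libtorch; where_select, where_indices and Matrix are hypothetical names), not how libtorch implements it:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Three-argument form: take b where a > threshold, otherwise take c.
Matrix where_select(const Matrix& a, float threshold,
                    const Matrix& b, const Matrix& c) {
    Matrix d = c;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < a[i].size(); ++j)
            if (a[i][j] > threshold) d[i][j] = b[i][j];
    return d;
}

// One-argument form: row indices and column indices of the elements > threshold.
std::pair<std::vector<long>, std::vector<long>>
where_indices(const Matrix& a, float threshold) {
    std::vector<long> rows, cols;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < a[i].size(); ++j)
            if (a[i][j] > threshold) {
                rows.push_back(static_cast<long>(i));
                cols.push_back(static_cast<long>(j));
            }
    return {rows, cols};
}
```

Feeding where_indices the matrix printed above with threshold 0.5 yields rows {0, 1, 1} and columns {2, 1, 2}, matching the two index tensors in the output.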

26.accessor

TensorAccessor<T,N> accessor() const&
auto result_data = result.accessor<float, 2>(); // float is the element type, 2 is the number of dimensions
Example 1:

torch::Tensor one = torch::randn({9,6});
auto foo_one=one.accessor<float,2>();
float sum = 0; // accumulate as float; declaring sum as int in the for-init would truncate every value
for(int i=0;i<foo_one.size(0);i++)
 for(int j=0;j<foo_one.size(1);j++)
     sum+=foo_one[i][j];

Example 2:

 torch::Tensor result;
    for(int i=1;i<m_num_class;i++) 
    {
        //...
        if(0 == result.numel())
        {
            result = result_.clone();
        }else
        {
            result = torch::cat({result,result_},0);// concatenate along rows (dim 0)
        }
    }
    result =result.cpu();
    auto result_data = result.accessor<float, 2>();
    
    cv::Mat img_draw = img.clone();
    for(int i=0;i<result_data.size(0);i++)
    {
        float score = result_data[i][4];
        if(score < 0.4) { continue;}
        int x1 = result_data[i][0];
        int y1 = result_data[i][1];
        int x2 = result_data[i][2];
        int y2 = result_data[i][3];
        int id_label = result_data[i][5];
        
        cv::rectangle(img_draw,cv::Point(x1,y1),cv::Point(x2,y2),cv::Scalar(255,0,0),3);
        cv::putText(img_draw,label_map[id_label],cv::Point(x1,y2),CV_FONT_HERSHEY_SIMPLEX,1,cv::Scalar(255,0,55));
    }

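27. torch::max (torch::min works the same way)

static inline std::tuple<Tensor,Tensor> max(const Tensor & self, int64_t dim, bool keepdim=false);
static inline std::tuple<Tensor,Tensor> max(const Tensor & self, Dimname dim, bool keepdim=false);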
static inline Tensor max(const Tensor & self);

    torch::Tensor a = torch::rand({4,2});
    std::tuple<torch::Tensor, torch::Tensor> max_test = torch::max(a,1);

    auto max_val = std::get<0>(max_test);
    // index
    auto index = std::get<1>(max_test);

    std::cout<<a<<std::endl;
    std::cout<<max_val<<std::endl;
     std::cout<<index<<std::endl;

0.1082 0.7954
0.3099 0.4507
0.2447 0.5169
0.8210 0.3141
[ Variable[CPUFloatType]{4,2} ]
0.7954
0.4507
0.5169
0.8210
[ Variable[CPUFloatType]{4} ]
1
1
1
0
[ Variable[CPULongType]{4} ]

Another example: global max

    torch::Tensor a = torch::rand({4,2});
    torch::Tensor max_test = torch::max(a);

    std::cout<<a<<std::endl;
    std::cout<<max_test<<std::endl;

0.1904 0.9493
0.6521 0.5788
0.9216 0.5997
0.1758 0.7384
[ Variable[CPUFloatType]{4,2} ]
0.94929
[ Variable[CPUFloatType]{} ]

28.masked_select and masked_fill

28.1 Tensor masked_select(const Tensor & mask) const;

       torch::Tensor a = torch::rand({2,3});
    torch::Tensor c = (a>0.25);
    torch::Tensor d = a.masked_select(c);

    std::cout<<a<<std::endl;
    std::cout<<c<<std::endl;
    std::cout<<d<<std::endl;

0.0667 0.3812 0.3810
0.3558 0.8628 0.6329
[ Variable[CPUFloatType]{2,3} ]
0 1 1
1 1 1
[ Variable[CPUBoolType]{2,3} ]
0.3812
0.3810
0.3558
0.8628
0.6329
[ Variable[CPUFloatType]{5} ]

28.2 Tensor masked_fill(const Tensor & mask, Scalar value) const;

Tensor & masked_fill_(const Tensor & mask, const Tensor & value) const;
Tensor masked_fill(const Tensor & mask, const Tensor & value) const;

    torch::Tensor a = torch::rand({2,3});
    torch::Tensor aa = a.clone();
    aa.masked_fill_(aa>0.5,-2);

    std::cout<<a<<std::endl;
    std::cout<<aa<<std::endl;

0.8803 0.2387 0.8577
0.8166 0.0730 0.4682
[ Variable[CPUFloatType]{2,3} ]
-2.0000 0.2387 -2.0000
-2.0000 0.0730 0.4682
[ Variable[CPUFloatType]{2,3} ]

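28.3 masked_fill_ (every method whose name ends with an underscore is an in-place operation)

Here is a requirement: Tensor score holds the scores and Tensor label holds the labels, and both have the same size. In post-processing, wherever label equals 26 and the score at that position is less than 0.5, the corresponding position of label should be set to 1. The demo below uses the same pattern with different numbers: where the label equals 3 and the score is at least 0.9, the label is set to -1.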

 float index[] = {3,2,3,3,5,6,7,8,9,10,11,12,13,14,15,16};
    float score[] = {0.1,0.1,0.9,0.9,0.9,0.1,0.1,0.1,0.1,0.1,0.8,0.8,0.8,0.8,0.8,0.8};

    torch::Tensor aa = torch::from_blob(index, {4,4}).toType(torch::kFloat32);
    torch::Tensor bb = torch::from_blob(score, {4,4}).toType(torch::kFloat32);
    std::cout<<aa<<std::endl;
    std::cout<<bb<<std::endl;

    torch::Tensor tmp = (aa == 3);
    torch::Tensor tmp_2 = (bb >= 0.9);
    std::cout<<tmp<<std::endl;
    std::cout<<tmp_2<<std::endl;
    torch::Tensor condition_111 = tmp * tmp_2;

    std::cout<<condition_111<<std::endl;
    aa.masked_fill_(condition_111,-1);

     std::cout<<aa<<std::endl;

The output is as follows:
3 2 3 3
5 6 7 8
9 10 11 12
13 14 15 16
[ Variable[CPUFloatType]{4,4} ]
0.1000 0.1000 0.9000 0.9000
0.9000 0.1000 0.1000 0.1000
0.1000 0.1000 0.8000 0.8000
0.8000 0.8000 0.8000 0.8000
[ Variable[CPUFloatType]{4,4} ]
1 0 1 1
0 0 0 0
0 0 0 0
0 0 0 0
[ Variable[CPUByteType]{4,4} ]
0 0 1 1
1 0 0 0
0 0 0 0
0 0 0 0
[ Variable[CPUByteType]{4,4} ]
0 0 1 1
0 0 0 0
0 0 0 0
0 0 0 0
[ Variable[CPUByteType]{4,4} ]
3 2 -1 -1
5 6 7 8
9 10 11 12
13 14 15 16
[ Variable[CPUFloatType]{4,4} ]

29.libtorch comprehensive operation 1

   torch::jit::script::Module module = torch::jit::load(argv[1]);
    std::cout << "== Switch to GPU mode" << std::endl;
    // to GPU
    module.to(at::kCUDA);

    if (LoadImage(file_name, image)) {
            auto input_tensor = torch::from_blob(
                    image.data, {1, kIMAGE_SIZE, kIMAGE_SIZE, kCHANNELS});
            input_tensor = input_tensor.permute({0, 3, 1, 2});
            input_tensor[0][0] = input_tensor[0][0].sub_(0.485).div_(0.229);
            input_tensor[0][1] = input_tensor[0][1].sub_(0.456).div_(0.224);
            input_tensor[0][2] = input_tensor[0][2].sub_(0.406).div_(0.225);

            // to GPU
            input_tensor = input_tensor.to(at::kCUDA);

            torch::Tensor out_tensor = module.forward({input_tensor}).toTensor();

            auto results = out_tensor.sort(-1, true);
            auto softmaxs = std::get<0>(results)[0].softmax(0);
            auto indexs = std::get<1>(results)[0];

            for (int i = 0; i < kTOP_K; ++i) {
                auto idx = indexs[i].item<int>();
                std::cout << "    ============= Top-" << i + 1
                          << " =============" << std::endl;
                std::cout << "    Label:  " << labels[idx] << std::endl;
                std::cout << "    With Probability:  "
                          << softmaxs[i].item<float>() * 100.0f << "%" << std::endl;
            }

        }

30.pytorch nms <---------> libtorch nms

pytorch nms
for example:
boxes [1742,4]
scores [1742]

def nms(boxes, scores, overlap=0.5, top_k=200):
    """Apply non-maximum suppression at test time to avoid detecting too many
    overlapping bounding boxes for a given object.
    Args:
        boxes: (tensor) The location preds for the img, Shape: [num_priors,4].
        scores: (tensor) The class predscores for the img, Shape:[num_priors].
        overlap: (float) The overlap thresh for suppressing unnecessary boxes.
        top_k: (int) The Maximum number of box preds to consider.
    Return:
        The indices of the kept boxes with respect to num_priors.
    """
    keep = scores.new(scores.size(0)).zero_().long()
    if boxes.numel() == 0:
        return keep
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]
    area = torch.mul(x2 - x1, y2 - y1)
    v, idx = scores.sort(0)  # sort in ascending order
    # I = I[v >= 0.01]
    idx = idx[-top_k:]  # indices of the top-k largest vals
    xx1 = boxes.new()
    yy1 = boxes.new()
    xx2 = boxes.new()
    yy2 = boxes.new()
    w = boxes.new()
    h = boxes.new()

    # keep = torch.Tensor()
    count = 0
    while idx.numel() > 0:
        i = idx[-1]  # index of current largest val
        # keep.append(i)
        keep[count] = i
        count += 1
        if idx.size(0) == 1:
            break
        idx = idx[:-1]  # remove kept element from view
        # load bboxes of next highest vals
        torch.index_select(x1, 0, idx, out=xx1)
        torch.index_select(y1, 0, idx, out=yy1)
        torch.index_select(x2, 0, idx, out=xx2)
        torch.index_select(y2, 0, idx, out=yy2)
        # store element-wise max with next highest score
        xx1 = torch.clamp(xx1, min=x1[i])
        yy1 = torch.clamp(yy1, min=y1[i])
        xx2 = torch.clamp(xx2, max=x2[i])
        yy2 = torch.clamp(yy2, max=y2[i])
        w.resize_as_(xx2)
        h.resize_as_(yy2)
        w = xx2 - xx1
        h = yy2 - yy1
        # check sizes of xx1 and xx2.. after each iteration
        w = torch.clamp(w, min=0.0)
        h = torch.clamp(h, min=0.0)
        inter = w*h
        # IoU = i / (area(a) + area(b) - i)
        rem_areas = torch.index_select(area, 0, idx)  # load remaining areas)
        union = (rem_areas - inter) + area[i]
        IoU = inter/union  # store result in iou
        # keep only elements with an IoU <= overlap
        idx = idx[IoU.le(overlap)]
    return keep, count

libtorch nms

bool nms(const torch::Tensor& boxes, const torch::Tensor& scores, torch::Tensor &keep, int &count,float overlap, int top_k)
{
    count =0;
    keep = torch::zeros({scores.size(0)}).to(torch::kLong).to(scores.device());
    if(0 == boxes.numel())
    {
        return false;
    }

    torch::Tensor x1 = boxes.select(1,0).clone();
    torch::Tensor y1 = boxes.select(1,1).clone();
    torch::Tensor x2 = boxes.select(1,2).clone();
    torch::Tensor y2 = boxes.select(1,3).clone();
    torch::Tensor area = (x2-x1)*(y2-y1);
    //    std::cout<<area<<std::endl;

    std::tuple<torch::Tensor,torch::Tensor> sort_ret = torch::sort(scores.unsqueeze(1), 0, 0);
    torch::Tensor v = std::get<0>(sort_ret).squeeze(1).to(scores.device());
    torch::Tensor idx = std::get<1>(sort_ret).squeeze(1).to(scores.device());

    int num_ = idx.size(0);
    if(num_ > top_k) //python:idx = idx[-top_k:]
    {
        idx = idx.slice(0,num_-top_k,num_).clone();
    }
    torch::Tensor xx1,yy1,xx2,yy2,w,h;
    while(idx.numel() > 0)
    {
        auto i = idx[-1];
        keep[count] = i;
        count += 1;
        if(1 == idx.size(0))
        {
            break;
        }
        idx = idx.slice(0,0,idx.size(0)-1).clone();

        xx1 = x1.index_select(0,idx);
        yy1 = y1.index_select(0,idx);
        xx2 = x2.index_select(0,idx);
        yy2 = y2.index_select(0,idx);

        xx1 = xx1.clamp(x1[i].item().toFloat(),INT_MAX*1.0);
        yy1 = yy1.clamp(y1[i].item().toFloat(),INT_MAX*1.0);
        xx2 = xx2.clamp(INT_MIN*1.0,x2[i].item().toFloat());
        yy2 = yy2.clamp(INT_MIN*1.0,y2[i].item().toFloat());

        w = xx2 - xx1;
        h = yy2 - yy1;

        w = w.clamp(0,INT_MAX);
        h = h.clamp(0,INT_MAX);

        torch::Tensor inter = w * h;
        torch::Tensor rem_areas = area.index_select(0,idx);

        torch::Tensor union_ = (rem_areas - inter) + area[i];
        torch::Tensor Iou = inter * 1.0 / union_;
        torch::Tensor index_small = Iou <= overlap; // python: IoU.le(overlap)
        auto mask_idx = torch::nonzero(index_small).squeeze(1); // [n,1] -> [n]; stays 1-D even when n == 1
        idx = idx.index_select(0,mask_idx);//python: idx = idx[IoU.le(overlap)]
    }
    return true;
}
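
The core of the loop above is the IoU between the current best box and each remaining box. As a minimal plain C++ sketch for a single pair of boxes (no libtorch; Box and iou are names made up for illustration, and boxes are assumed to be given as corner coordinates with x1 <= x2 and y1 <= y2):

```cpp
#include <algorithm>
#include <cassert>

struct Box {
    float x1, y1, x2, y2;  // top-left and bottom-right corners
};

// Intersection over union, following the same steps as the tensor code:
// clamp the corners to get the overlap rectangle, clamp width and height at 0,
// then divide the intersection by (area_a + area_b - intersection).
float iou(const Box& a, const Box& b) {
    const float xx1 = std::max(a.x1, b.x1);
    const float yy1 = std::max(a.y1, b.y1);
    const float xx2 = std::min(a.x2, b.x2);
    const float yy2 = std::min(a.y2, b.y2);
    const float w = std::max(0.0f, xx2 - xx1);
    const float h = std::max(0.0f, yy2 - yy1);
    const float inter = w * h;
    const float area_a = (a.x2 - a.x1) * (a.y2 - a.y1);
    const float area_b = (b.x2 - b.x1) * (b.y2 - b.y1);
    const float uni = area_a + area_b - inter;
    return uni > 0.0f ? inter / uni : 0.0f;  // guard against degenerate boxes
}
```

For example, the boxes {0,0,2,2} and {1,1,3,3} overlap in a 1x1 square, so the IoU is 1 / (4 + 4 - 1) = 1/7. A box would be suppressed when this value exceeds the overlap threshold.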

31. Data type is important! .to(torch::kByte);

31.1

    //[128,512]
    torch::Tensor b = torch::argmax(output_1, 2).cpu();
    //    std::cout<<b<<std::endl;
    b.print();

    cv::Mat mask(T_height, T_width, CV_8UC1, (uchar*)b.data_ptr());
    imshow("mask",mask*255);
    waitKey(0);

[Variable[CPULongType] [128, 512]]

As above, b is the segmentation map of shape [128, 512], but no matter what I tried it would not display correctly. I printed the values of b, compared them with pytorch, and they matched. Yet the image shown was completely black, all 0, even though some of the printed values were not 0!
This is how the project had been written before. I then looked at a libtorch implementation of psenet on github and found that others wrote it in a similar way:

 cv::Mat tempImg = Mat::zeros(T_height, T_width, CV_8UC1);
 memcpy((void *) tempImg.data, b.data_ptr(), sizeof(torch::kU8) * b.numel());

I wrote it this way too, and it still did not work! Two hours had passed and I was out of ideas, so I was about to dump the 128*512 values to a file for inspection. While experimenting aimlessly I found that
cout<<b[0][0].item().toFloat()<<endl;
prints the value, but only with .toFloat() added. I even started writing the loop
for(int i=0;i<128;i++)
for(int j=0;j<512;j++)
{ }
but I was not willing to give up. Where was the problem? The values were all correct, so why could they not be displayed? Since b[0][0].item() needs .toFloat(), what type is b? It is a tensor, but of which element type? The printed [ Variable[CPULongType] [128, 512]] says long. So I tried changing the type with .to(...), as in the earlier type conversions. Because I wanted integers I first tried torch::Tensor b = torch::argmax(output_1, 2).cpu().to(torch::kInt); which did not work. .to(torch::kFloat32) did not work either. When I typed torch::k, the editor's auto-completion listed the names starting with k, and the first one was kByte. Then I tried:
torch::Tensor b = torch::argmax(output_1, 2).cpu().to(torch::kByte);

That was it! The segmentation map I wanted came out.
The cause is the element size: a CV_8UC1 Mat reads one byte per pixel, while a Long tensor stores 8 bytes per element (kInt and kFloat32 store 4), so the Mat was reading the wrong bytes. Only kByte matches CV_8UC1.
This data type issue cost me at least 2 hours!
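
Why the mask came out almost black can be reproduced without libtorch or OpenCV. The sketch below (plain C++; raw_bytes and to_bytes are names made up for illustration) contrasts reading the raw bytes of a 64-bit label buffer, which is roughly what wrapping a Long tensor's data_ptr() in a CV_8UC1 Mat does, with converting each label to one byte first:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Reinterpret: copy the first n raw bytes of the int64 buffer unchanged.
// Each label occupies 8 bytes, so most of the bytes read are zero padding.
std::vector<std::uint8_t> raw_bytes(const std::vector<std::int64_t>& labels,
                                    std::size_t n) {
    assert(n <= labels.size() * sizeof(std::int64_t));
    std::vector<std::uint8_t> out(n);
    std::memcpy(out.data(), labels.data(), n);
    return out;
}

// Convert: one output byte per label, comparable to converting to a byte tensor first.
std::vector<std::uint8_t> to_bytes(const std::vector<std::int64_t>& labels) {
    std::vector<std::uint8_t> out;
    out.reserve(labels.size());
    for (std::int64_t v : labels) out.push_back(static_cast<std::uint8_t>(v));
    return out;
}
```

For labels {1, 0, 1, 1}, to_bytes returns {1, 0, 1, 1}, while raw_bytes(labels, 4) only covers half of the first label: on a little-endian machine it returns {1, 0, 0, 0}, which is why most pixels looked like 0.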

31.2
Convert the intermediately processed images to tensor

 Mat m_tmp = grayMat.clone();
    torch::Tensor label_deal = torch::from_blob(
                m_tmp.data, {grayMat.rows, grayMat.cols}).toType(torch::kByte).to(m_device);
//    label_deal = label_deal.to(m_device);
    auto aaa = torch::max(label_deal);
    std::cout<<label_deal<<std::endl;
    std::cout<<aaa<<std::endl;
    while(1);

Another big pit! At first I thought this was fine, but the subsequent processing results were wrong, so I checked step by step and located the problem here. The pixel values of m_tmp did not match the values in the tensor at all! I knew the maximum pixel value of m_tmp was 34, but the maximum of the printed tensor was 255. Was it the torch::kByte type? Changing to kFloat32 did not help; the values were even more outrageous and included nan. I then noticed there are two spellings, .toType(torch::kByte) and .to(torch::kByte), and wondered whether they differ, but the experiment gave the same wrong result with either. Moving .to(m_device) to a separate statement did not help either (from earlier experience, torch::Tensor tmp = tmp.cpu(); seemed to need to be written separately). So what was wrong? The pixel values just would not go into the tensor correctly!
After being stuck for a long time I wondered: does the type of the Mat need to change as well?

 Mat m_tmp = grayMat.clone();
    m_tmp.convertTo(m_tmp,CV_32FC1);// another big pit: the image must be converted to float32 first
    torch::Tensor label_deal = torch::from_blob(
                m_tmp.data, {grayMat.rows, grayMat.cols}).toType(torch::kByte).to(m_device);

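That was it! The Mat has to be converted to CV_32FC1 because torch::from_blob, when no dtype is given, interprets the buffer as float32; the 8-bit pixels of the original Mat were being read four at a time as floats. Passing the dtype to from_blob (for example torch::kByte as a third argument) should be an alternative to converting the Mat.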

32. Pointer access to Tensor data

        torch::Tensor output = m_model->forward({input_tensor}).toTensor()[0];
        torch::Tensor output_cpu = output.cpu();
        //output_cpu     Variable[CPUFloatType] [26, 480, 480]]
        output_cpu.print();

        void *ptr = output_cpu.data_ptr();
        //std::cout<<(float*)ptr[0]<<std::endl;

The pointer can only be declared as void* or auto, otherwise an error is reported. For example, with float *ptr = output_cpu.data_ptr(); the compiler reports:
error: invalid conversion from 'void*' to 'float*' [-fpermissive]
float *ptr = output_cpu.data_ptr();
With void* it compiles. But I need to use the pointer to access the data in the tensor!

torch::Tensor output = m_model->forward({input_tensor}).toTensor()[0];
        torch::Tensor output_cpu = output.cpu();
        //output_cpu     Variable[CPUFloatType] [26, 480, 480]]
        output_cpu.print();
        void *ptr = output_cpu.data_ptr();
        std::cout<<(float*)ptr<<std::endl;

As written above, the output is:

[Variable[CPUFloatType] [26, 480, 480]]
0x7fab195ee040

The output is an address, so how do I access the data? Naturally I wrote:
std::cout<<(float*)ptr[0]<<std::endl;
and an error was reported:
error: 'void*' is not a pointer-to-object type
Then I wrote:
std::cout<<(float*)ptr[0][0][0]<<std::endl;
and got the same error. The reason is that [] binds more tightly than the cast, so ptr[0] tries to index the void* before it is converted. I Googled it, found someone with the same error and the solution: cast to float* first, then dereference or offset.

Solved!

        void *ptr = output_cpu.data_ptr();
//        std::cout<<*((float*)ptr[0][0][0])<<std::endl;
//        std::cout<<(float*)ptr[0][0][0]<<std::endl;

         std::cout<<*((float*)ptr + 2)<<std::endl; // cast first, then step 2 floats; (ptr+2) on a void* would step 2 bytes (GNU extension)

There is another way to write:

const float* result = reinterpret_cast<const float *>(output_cpu.data_ptr());

And the way I just wrote it:

 void *ptr = output_cpu.data_ptr();
 const float* result = (float*)ptr;
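
Once the pointer is a float*, an element (c, h, w) of a contiguous [C, H, W] tensor such as the [26, 480, 480] output above sits at a flat offset. A minimal plain C++ sketch (no libtorch; offset_chw and read_chw are hypothetical helper names, and the buffer is assumed to be contiguous and row-major):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flat offset of element (c, h, w) in a contiguous [C, H, W] buffer.
// C itself is not needed to compute the offset, only H and W.
std::size_t offset_chw(std::size_t c, std::size_t h, std::size_t w,
                       std::size_t H, std::size_t W) {
    return (c * H + h) * W + w;
}

// Read one element through an untyped pointer, as returned by data_ptr().
float read_chw(const void* raw, std::size_t c, std::size_t h, std::size_t w,
               std::size_t H, std::size_t W) {
    const float* data = static_cast<const float*>(raw);  // cast first
    return data[offset_chw(c, h, w, H, W)];               // then index
}
```

For a [2, 3, 4] buffer, element (1, 2, 3) is at offset (1*3 + 2)*4 + 3 = 23. This only holds for a contiguous tensor, so calling contiguous() before taking the pointer is the safe choice.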

33 Comparison of Tensor assignment methods by index in PyTorch

See the Jianshu article "Comparison of the methods of assigning Tensor values by index in PyTorch".

44 Output multiple tensors (pytorch side) and take out multiple tensors (libtorch side)

Output on pytorch side:

    def forward(self, x, batch=None):
        output, cnn_feature = self.dla(x)
        return (output['ct_hm'],output['wh'],cnn_feature)

The corresponding libtorch side:

    auto out = m_model->forward({input_tensor});
    auto tpl = out.toTuple();
    auto out_ct_hm = tpl->elements()[0].toTensor();
    out_ct_hm.print();
    auto out_wh = tpl->elements()[1].toTensor();
    out_wh.print();
    auto out_cnn_feature = tpl->elements()[2].toTensor();
    out_cnn_feature.print();

If a single tensor is output, it is

at::Tensor output = module->forward(inputs).toTensor();

45. torch::Tensor as a function parameter: whether or not it is declared as a reference, operations on the parameter inside the function affect the original tensor, so it behaves like a reference.

void test_tensor(torch::Tensor a)
{
    a[0][0] = -100;

}

int main(int argc, const char* argv[])
{

    torch::Tensor p = torch::rand({2,2});
    std::cout<<p<<std::endl;
    std::cout<<"~~~~#########~~~~~~~~~~~~~~~~~~~~~~~~~~"<<std::endl;
    test_tensor(p);
    std::cout<<p<<std::endl;
    while (1);
}

The output is as follows:

 0.0509  0.3509
 0.8019  0.1350
[ Variable[CPUType]{2,2} ]
~~~~#########~~~~~~~~~~~~~~~~~~~~~~~~~~
-100.0000    0.3509
   0.8019    0.1350
[ Variable[CPUType]{2,2} ]

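It can be seen that although the parameter of void test_tensor(torch::Tensor a) is not a reference, the value has changed after calling the function! A torch::Tensor is a handle to shared storage, so copying the handle does not copy the data; use clone() when an independent copy is needed.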
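
The behaviour can be imitated in plain C++ with a small handle type. This is only an analogy under the assumption that a tensor behaves like a reference-counted handle to its storage (Handle, set_first and deep_copy are names made up for illustration, not libtorch types):

```cpp
#include <cassert>
#include <memory>
#include <vector>

// A handle that shares its storage when copied, like passing a tensor by value.
struct Handle {
    std::shared_ptr<std::vector<float>> storage;
};

// The handle is passed by value, but the copy points at the same storage,
// so the write is visible to the caller.
void set_first(Handle h, float v) {
    (*h.storage)[0] = v;
}

// An independent copy, playing the role of clone().
Handle deep_copy(const Handle& h) {
    return Handle{std::make_shared<std::vector<float>>(*h.storage)};
}
```

If p holds four values of 0.5 and q = deep_copy(p), then after set_first(p, -100) the first element of p is -100 while q still holds 0.5.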

46. Implement pytorch subscript operation

For example, on the pytorch side, it is written as follows:

c=b[a]

Here the shape of a is [1,100] and the shape of b is [1,100,40,2], so guess what the shape of c is. One more known condition: a acts as a mask, and its values are only 0 or 1. Assuming the first five values of a are 1 and the rest are 0, the shape of c is [5,40,2]. As you can probably guess, the rows where the mask is 1 are taken out and the rest are dropped. So how can libtorch implement this elegantly?
I have not thought of a good solution yet, because the libtorch side does not support this kind of subscripting, which is troublesome. So I wrote the loop myself.
To make the values easier to read, the example below uses only 10 rows.

// aim [1,10,2,2], ind_mask_ [1,10]; e.g. the first 5 mask values are 1 and the rest are 0. For the [1,100,40,2] case the result shape is [5,40,2]. This is the pytorch operation aim = aim[ind_mask]
torch::Tensor deal_mask_index22(torch::Tensor aim_,torch::Tensor ind_mask_)
{
    torch::Tensor aim = aim_.clone().squeeze(0);// e.g. [1,100,40,2]  -->> [100,40,2]
    torch::Tensor ind_mask = ind_mask_.clone().squeeze(0);// e.g. [1,100]  -->> [100]
    int row = ind_mask.size(0);
    int cnt = 0;
    for(int i=0;i<row;i++)
    {
        if(ind_mask[i].item().toInt())
        {
            cnt += 1;
        }
    }
    torch::Tensor out = torch::zeros({cnt,aim.size(1),aim.size(2)});
    int index_ = 0;
    for(int i=0;i<row;i++)
    {
        if(ind_mask[i].item().toInt())
        {
            out[index_++] = aim[i];
//            std::cout<<i<<std::endl;
        }
    }

    std::cout<<"##############################################"<<std::endl;
    std::cout<<out<<std::endl;
    
    return out;
}

int main(int argc, const char* argv[])
{
    torch::Tensor ind_mask = torch::ones({1,10});
    ind_mask[0][0] = 0;
    ind_mask[0][1] = 0;
    ind_mask[0][2] = 0;
    ind_mask[0][4] = 0;

    torch::Tensor aim = torch::rand({1,10,2,2});
    std::cout<<aim<<std::endl;

    deal_mask_index22(aim,ind_mask);


    while (1);
}
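
Stripped of the tensor types, the loop above is a row filter. A minimal plain C++ sketch of the same idea (no libtorch; Row and select_rows are names made up for illustration) needs only one pass because the output can grow as rows are kept:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<float>;

// Keep rows[i] whenever mask[i] is non-zero, preserving the original order.
std::vector<Row> select_rows(const std::vector<Row>& rows,
                             const std::vector<int>& mask) {
    assert(rows.size() == mask.size());
    std::vector<Row> out;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        if (mask[i] != 0) out.push_back(rows[i]);
    }
    return out;
}
```

On the libtorch side, the NMS code in section 30 already performs this kind of selection without a loop by combining torch::nonzero with index_select, so the same pattern may be worth trying here instead of the two loops.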

47.pytorch libtorch tensor verification accuracy

See: Tensor verification accuracy of pytorch libtorch
https://www.cnblogs.com/yanghailin/p/13669046.html

48. Others--color mapping

    auto t1 = std::chrono::steady_clock::now();
//    static torch::Tensor tensor_m0 = torch::zeros({m_height,m_width}).to(torch::kByte).to(torch::kCPU);
//    static torch::Tensor tensor_m1 = torch::zeros({m_height,m_width}).to(torch::kByte).to(torch::kCPU);
//    static torch::Tensor tensor_m2 = torch::zeros({m_height,m_width}).to(torch::kByte).to(torch::kCPU);

    static torch::Tensor tensor_m0 = torch::zeros({m_height,m_width}).to(torch::kByte);
    static torch::Tensor tensor_m1 = torch::zeros({m_height,m_width}).to(torch::kByte);
    static torch::Tensor tensor_m2 = torch::zeros({m_height,m_width}).to(torch::kByte);
    tensor_m0 = tensor_m0.to(torch::kCUDA);
    tensor_m1 = tensor_m1.to(torch::kCUDA);
    tensor_m2 = tensor_m2.to(torch::kCUDA);
    for(int i=1;i<m_color_cnt;i++)
    {
        tensor_m0.masked_fill_(index==i,colormap[i * 3]);
        tensor_m1.masked_fill_(index==i,colormap[i * 3 + 1]);
        tensor_m2.masked_fill_(index==i,colormap[i * 3 + 2]);
    }
    torch::Tensor tensor_m00 = tensor_m0.cpu();
    torch::Tensor tensor_m11 = tensor_m1.cpu();
    torch::Tensor tensor_m22 = tensor_m2.cpu();
    cv::Mat m0 = cv::Mat(m_height, m_width, CV_8UC1, (uchar*)tensor_m00.data_ptr());
    cv::Mat m1 = cv::Mat(m_height, m_width, CV_8UC1, (uchar*)tensor_m11.data_ptr());
    cv::Mat m2 = cv::Mat(m_height, m_width, CV_8UC1, (uchar*)tensor_m22.data_ptr());
    std::vector<cv::Mat> channels = {m0,m1,m2};
    cv::Mat mergeImg;
    cv::merge(channels, mergeImg);
    mergeImg = mergeImg.clone();
    auto ttt1 = std::chrono::duration_cast<std::chrono::milliseconds>
            (std::chrono::steady_clock::now() - t1).count();
    std::cout << "merge time="<<ttt1<<"ms"<<std::endl;

The tensor version above takes about 35ms on the CPU and 2-3ms on the GPU. The following plain OpenCV loop implements the same function in 2-3ms.

 auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i<labelMat.rows; i++)
    {
        for (int j = 0; j<labelMat.cols; j++)
        {
            int id = labelMat.at<uchar>(i,j);
            if(0 == id)
            {
                continue;
            }
            colorMat.at<cv::Vec3b>(i, j)[0] = colormap[id * 3];
            colorMat.at<cv::Vec3b>(i, j)[1] = colormap[id * 3 + 1];
            colorMat.at<cv::Vec3b>(i, j)[2] = colormap[id * 3 + 2];
        }
    }
    auto ttt = std::chrono::duration_cast<std::chrono::milliseconds>
            (std::chrono::steady_clock::now() - t0).count();
    std::cout << "consume time="<<ttt<<"ms"<<std::endl;

49.torch.gather

Pure pytorch side: (reprinted from https://www.jianshu.com/p/5d1f8cd5fe31)
torch.gather(input, dim, index, out=None) → Tensor
Gathers values from input along the axis dim, at the positions given by the index tensor.
For a 3-dimensional tensor, the output is defined as:

out[i][j][k] = input[index[i][j][k]][j][k]  # if dim == 0
out[i][j][k] = input[i][index[i][j][k]][k]  # if dim == 1
out[i][j][k] = input[i][j][index[i][j][k]]  # if dim == 2

Parameters:
input (Tensor) – source tensor
dim (int) – the axis along which to index
index (LongTensor) – indices of the elements to gather (must be of type torch.LongTensor)
out (Tensor, optional) – target tensor

Example:
dim = 1

import torch
a = torch.randint(0, 30, (2, 3, 5))
print(a)
#tensor([[[ 18.,   5.,   7.,   1.,   1.],
#         [  3.,  26.,   9.,   7.,   9.],
#         [ 10.,  28.,  22.,  27.,   0.]],

#        [[ 26.,  10.,  20.,  29.,  18.],
#         [  5.,  24.,  26.,  21.,   3.],
#         [ 10.,  29.,  10.,   0.,  22.]]])
index = torch.LongTensor([[[0,1,2,0,2],
                          [0,0,0,0,0],
                          [1,1,1,1,1]],
                        [[1,2,2,2,2],
                         [0,0,0,0,0],
                         [2,2,2,2,2]]])
print(a.size()==index.size())
b = torch.gather(a, 1,index)
print(b)
#True
#tensor([[[ 18.,  26.,  22.,   1.,   0.],
#         [ 18.,   5.,   7.,   1.,   1.],
#         [  3.,  26.,   9.,   7.,   9.]],

#        [[  5.,  29.,  10.,   0.,  22.],
#         [ 26.,  10.,  20.,  29.,  18.],
#         [ 10.,  29.,  10.,   0.,  22.]]])

dim =2

c = torch.gather(a, 2,index)
print(c)
#tensor([[[ 18.,   5.,   7.,  18.,   7.],
#         [  3.,   3.,   3.,   3.,   3.],
#         [ 28.,  28.,  28.,  28.,  28.]],

#       [[ 10.,  20.,  20.,  20.,  20.],
#        [  5.,   5.,   5.,   5.,   5.],
#        [ 10.,  10.,  10.,  10.,  10.]]])

dim = 0

index2 = torch.LongTensor([[[0,1,1,0,1],
                          [0,1,1,1,1],
                          [1,1,1,1,1]],
                        [[1,0,0,0,0],
                         [0,0,0,0,0],
                         [1,1,0,0,0]]])
d = torch.gather(a, 0,index2)
print(d)
#tensor([[[ 18.,  10.,  20.,   1.,  18.],
#         [  3.,  24.,  26.,  21.,   3.],
#         [ 10.,  29.,  10.,   0.,  22.]],

#       [[ 26.,   5.,   7.,   1.,   1.],
#         [  3.,  26.,   9.,   7.,   9.],
#         [ 10.,  29.,  22.,  27.,   0.]]])

I had seen this before and was confused again when I came back to it, so I am recording it here. The key rule is:
out[i][j][k] = input[i][index[i][j][k]][k] # if dim == 1
But what does gather actually do? The output has the same shape as index (in this example index also has the same shape as input). You can work through one or two elements yourself. For example, with dim=1,
output[0][0][0] = input[0][index[0][0][0]][0]: first look up index[0][0][0], which is 0, and then read input[0][0][0]. The index tensor therefore holds subscripts along dim, and each of its
values must be smaller than the size of input along dim.
Intuitively, gather applies a new mapping rule along one dimension to produce the output, and index is that rule.
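
To make the dim == 1 rule above concrete, here is a minimal sketch written as plain loops over nested std::vector. It is not the libtorch implementation; the names Tensor3, Index3 and gather_dim1 are made up for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper types for this sketch: a 3-D "tensor" as nested vectors.
using Tensor3 = std::vector<std::vector<std::vector<float>>>;
using Index3 = std::vector<std::vector<std::vector<long>>>;

// Plain C++ illustration of gather along dim 1 (not the libtorch API):
//   out[i][j][k] = input[i][index[i][j][k]][k]
// The output takes its shape from index.
Tensor3 gather_dim1(const Tensor3& input, const Index3& index) {
    Tensor3 out(index.size());
    for (std::size_t i = 0; i < index.size(); ++i) {
        out[i].resize(index[i].size());
        for (std::size_t j = 0; j < index[i].size(); ++j) {
            out[i][j].resize(index[i][j].size());
            for (std::size_t k = 0; k < index[i][j].size(); ++k) {
                // at() throws std::out_of_range when an index value is not
                // smaller than the size of input along dim 1.
                out[i][j][k] = input.at(i).at(index[i][j][k]).at(k);
            }
        }
    }
    return out;
}
```

For an input {{{1, 2}, {3, 4}}} and an index {{{1, 0}, {0, 1}}}, the element out[0][0][0] is read from input[0][1][0], because index[0][0][0] is 1.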

50. torch::argsort (not available in libtorch 1.0): use torch::sort instead

I wrote a libtorch project against 1.1. Since the target project uses 1.0, I had to port the 1.1 code back to 1.0, and the compiler reported:
error: 'argsort' is not a member of 'torch'
I knew this was a version problem: the function does not exist in 1.0. But what could replace argsort? I remembered that max also returns indices, then noticed sort, tried it, and the indices it returns are the same as the result of argsort!
//pytorch1.1
torch::Tensor edge_idx_sort2 = torch::argsort(edge_num, 2, true);
//pytorch1.0
std::tuple<torch::Tensor,torch::Tensor> sort_ret = torch::sort(edge_num, 2 , true);
// torch::Tensor v = std::get<0>(sort_ret);
torch::Tensor edge_idx_sort = std::get<1>(sort_ret);
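
Why the second element of the sort result can stand in for argsort is easier to see on a 1-D example. The following is a minimal sketch in plain C++ (standard library only, not libtorch); the function name argsort_descending is made up for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Plain C++ sketch (not libtorch): returns the indices that would sort
// `values` in descending order. Sorting a list of indices by the values they
// point to yields the index half of a (values, indices) sort result.
std::vector<std::size_t> argsort_descending(const std::vector<float>& values) {
    std::vector<std::size_t> idx(values.size());
    std::iota(idx.begin(), idx.end(), 0);  // 0, 1, 2, ...
    std::stable_sort(idx.begin(), idx.end(),
                     [&values](std::size_t a, std::size_t b) {
                         return values[a] > values[b];
                     });
    return idx;
}
```

For the values {0.3, 0.9, 0.1, 0.5} this returns {1, 3, 0, 2}: the largest value sits at position 1, the next at position 3, and so on.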

51. Determine whether tensor is empty ind_mask.sizes().empty()

int row = ind_mask.size(0);
If ind_mask is empty (has no dimensions), this line crashes with the following error:

terminate called after throwing an instance of 'c10::Error'
  what():  dimension specified as 0 but tensor has no dimensions (maybe_wrap_dim at /data_1/leon_develop/pytorch/aten/src/ATen/core/WrapDimMinimal.h:9)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6a (0x7f4cf0a4af5a in /data_2/project_202009/chejian/3rdparty/libtorch/lib/libc10.so)
frame #1: <unknown function> + 0x48a74f (0x7f4d010af74f in /data_2/project_202009/chejian/3rdparty/libtorch/lib/libcaffe2.so)
frame #2: at::native::size(at::Tensor const&, long) + 0x20 (0x7f4d010afac0 in /data_2/project_202009/chejian/3rdparty/libtorch/lib/libcaffe2.so)
frame #3: at::Tensor::size(long) const + 0x36 (0x467fba in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #4: deal_mask_index(at::Tensor, at::Tensor) + 0x1a7 (0x45a83e in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #5: get_gcn_feature(at::Tensor, at::Tensor, at::Tensor, int, int) + 0x4f3 (0x45e092 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #6: init_poly(std::shared_ptr<torch::jit::script::Module> const&, std::shared_ptr<torch::jit::script::Module> const&, at::Tensor const&, std::tuple<at::Tensor, at::Tensor, at::Tensor> const&) + 0x168 (0x45e777 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #7: main + 0xaee (0x463ab5 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #8: __libc_start_main + 0xf0 (0x7f4ced29c840 in /lib/x86_64-linux-gnu/libc.so.6)
frame #9: _start + 0x29 (0x456b89 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)

Therefore, it is necessary to determine whether the tensor is empty, but:

ind_mask.numel() // returns the total number of elements, but returns 1 when the tensor is empty
ind_mask.sizes() // returns something like a Python list, e.g. [1, 100, 40, 2] or [1, 40, 2]

I then followed the definition of sizes() in libtorch. It returns an IntList, and tracing further shows using IntList = ArrayRef<int64_t>;. Looking at the ArrayRef class, I found:

  /// empty - Check if the array is empty.
  constexpr bool empty() const {
    return Length == 0;
  }

So there is a member function that checks for emptiness, and it can be called directly:
if(ind_mask.sizes().empty())
{ }
This was harder than I expected. I thought I was done, but I was not!



if(ind_mask.sizes().empty())
    {
        torch::Tensor tmp;
        return tmp;
    }

When the tensor is judged to be empty, I create a new tensor and return it, because the function's return type is torch::Tensor.
However, calling sizes() on a tensor created this way also throws an error,
as follows:

 torch::Tensor tmp;
 tmp.print(); // prints [UndefinedTensor]

if(tmp.sizes().empty())
{
}
[UndefinedTensor]
terminate called after throwing an instance of 'c10::Error'
  what():  sizes() called on undefined Tensor (sizes at /data_1/leon_develop/pytorch/aten/src/ATen/core/UndefinedTensorImpl.cpp:12)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6a (0x7f35f1b21f5a in /data_2/project_202009/chejian/3rdparty/libtorch/lib/libc10.so)
frame #1: at::UndefinedTensorImpl::sizes() const + 0x77 (0x7f360217d6b7 in /data_2/project_202009/chejian/3rdparty/libtorch/lib/libcaffe2.so)
frame #2: at::Tensor::sizes() const + 0x27 (0x45e921 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #3: main + 0x55 (0x45bcaa in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)
frame #4: __libc_start_main + 0xf0 (0x7f35ee373840 in /lib/x86_64-linux-gnu/libc.so.6)
frame #5: _start + 0x29 (0x44f889 in /data_2/project_202009/libtorch/snake_libtorch_cuda8/cmake-build-debug/example-app)

But numel() behaves differently:

 torch::Tensor tmp;
tmp.print();
std::cout<<tmp.numel()<<std::endl; // prints 0

So for a tensor that is only declared and never assigned, .numel() returns 0 instead of throwing.

52. The PyTorch code out = aim[ind_mask], written with libtorch

PyTorch code:
out = aim[ind_mask]
where the shapes are as follows:
aim [21, 40, 2]
ind_mask [21] # each element is either 0 or 1; suppose there are 12 ones
The output shape of out is then [12, 40, 2].
How can out = aim[ind_mask] from the PyTorch code above be expressed in libtorch?

    torch::Tensor a = torch::rand({5,3,2});
    torch::Tensor idx = torch::zeros({5}).toType(torch::kLong);
    idx[3] = 1;
    idx[1] = 1;

    torch::Tensor abc = torch::nonzero(idx);
    torch::Tensor b = a.index_select(0,abc.squeeze());

    std::cout<<a<<std::endl;
    std::cout<<abc<<std::endl;
    std::cout<<b<<std::endl;

The output is as follows:

(1,.,.) = 
  0.1767  0.8695
  0.3779  0.3531
  0.3413  0.3734

(2,.,.) = 
  0.9664  0.7723
  0.8640  0.7289
  0.8395  0.6344

(3,.,.) = 
  0.9043  0.2671
  0.9901  0.2966
  0.0347  0.1650

(4,.,.) = 
  0.1457  0.1169
  0.7983  0.5157
  0.6405  0.2213

(5,.,.) = 
  0.7977  0.4066
  0.6691  0.7191
  0.5897  0.7400
[ Variable[CPUFloatType]{5,3,2} ]
 1
 3
[ Variable[CPULongType]{2,1} ]
(1,.,.) = 
  0.9664  0.7723
  0.8640  0.7289
  0.8395  0.6344

(2,.,.) = 
  0.1457  0.1169
  0.7983  0.5157
  0.6405  0.2213
[ Variable[CPUFloatType]{2,3,2} ]
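
The two steps used above, nonzero followed by index_select along dim 0, can be sketched in plain C++ to show what each one contributes. This is only an illustration with the standard library, not libtorch code; nonzero_1d and select_rows are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Step 1: positions of the non-zero mask entries
// (the role torch::nonzero plays for a 1-D mask).
std::vector<std::size_t> nonzero_1d(const std::vector<long>& mask) {
    std::vector<std::size_t> positions;
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (mask[i] != 0) positions.push_back(i);
    }
    return positions;
}

// Step 2: keep only the listed rows, in order
// (the role index_select plays along dim 0).
template <typename Row>
std::vector<Row> select_rows(const std::vector<Row>& rows,
                             const std::vector<std::size_t>& positions) {
    std::vector<Row> out;
    out.reserve(positions.size());
    for (std::size_t p : positions) out.push_back(rows.at(p));
    return out;
}
```

With the mask {0, 1, 0, 1, 0}, step 1 gives the positions {1, 3}, and step 2 keeps rows 1 and 3, which matches the libtorch output above where slices (2,.,.) and (4,.,.) of a are selected.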

53. How to express the PyTorch code a4 = arr[...,3,0] in libtorch, using masked_select!

>>> import numpy as np
>>> arr = np.arange(40).reshape(1,5,4,2)
>>> arr
array([[[[ 0,  1],
         [ 2,  3],
         [ 4,  5],
         [ 6,  7]],

        [[ 8,  9],
         [10, 11],
         [12, 13],
         [14, 15]],

        [[16, 17],
         [18, 19],
         [20, 21],
         [22, 23]],

        [[24, 25],
         [26, 27],
         [28, 29],
         [30, 31]],

        [[32, 33],
         [34, 35],
         [36, 37],
         [38, 39]]]])
>>> a1 = arr[...,0,1]
>>> a2 = arr[...,1,0]
>>> a3 = arr[...,2,1]
>>> a4 = arr[...,3,0]
>>> print(a1)
[[ 1  9 17 25 33]]
>>> print(a2)
[[ 2 10 18 26 34]]
>>> print(a3)
[[ 5 13 21 29 37]]
>>> print(a4)
[[ 6 14 22 30 38]]
>>> 

I struggled for a long time at first, but there seemed to be no good way, and then I used a for loop to complete it.

//ex shape[1,5,4,2]      ex[..., 0, 1]  -->>[1,5]
torch::Tensor index_tensor_3(const torch::Tensor &ex,const int &idx1,const int &idx2)
{
//    ex.print();
    int dim_ = ex.size(1);
    torch::Tensor out = torch::empty({1,dim_}).to(ex.device());
    int size_ = ex.size(1);
    for(int i=0;i<size_;i++)
    {
        auto a = ex[0][i][idx1][idx2];
        out[0][i] = a;
        //        std::cout<<a<<std::endl;
    }
    
    return out;
}

Then I optimized it to use only libtorch functions:

//ex shape[1,5,4,2]      ex[..., 0, 1] -->>[1,5]
torch::Tensor index_tensor_3(const torch::Tensor &ex,const int &idx1,const int &idx2)
{
    const int dim0 = ex.size(0);
    const int dim1 = ex.size(1);
    const int dim2 = ex.size(2);
    const int dim3 = ex.size(3);

    std::vector<int> v_index(ex.numel()); // initialized to ex.numel() zeros
    int offset = dim2 * dim3;
    for(int i=0;i<dim1;i++)
    {
        int index_ = idx1 * dim3 + idx2;
        v_index[i * offset + index_] = 1;
    }

    torch::Tensor index = torch::tensor(v_index).to(ex.device());
    index = index.reshape(ex.sizes()).toType(torch::kByte); // kByte type is required here
//    std::cout<<index<<std::endl;

    torch::Tensor selected = ex.masked_select(index).unsqueeze(0);
    return selected;
}

Calling each version about 10 times in a row, the former (the for loop) takes 15 ms in total, while the latter (masked_select) takes 5 ms.
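
The offset arithmetic behind the mask (i * dim2 * dim3 + idx1 * dim3 + idx2) can be checked without libtorch. The following is a minimal sketch on a flat row-major buffer, assuming a leading dimension of 1 as in the function above; select_by_mask is a hypothetical name used only here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Plain C++ sketch (not libtorch). `flat` is a row-major buffer for a tensor
// of shape [1, dim1, dim2, dim3]; the result corresponds to ex[..., idx1, idx2].
std::vector<int> select_by_mask(const std::vector<int>& flat,
                                int dim1, int dim2, int dim3,
                                int idx1, int idx2) {
    // Build a 0/1 mask with exactly one 1 per block of dim2 * dim3 elements.
    std::vector<std::uint8_t> mask(flat.size(), 0);
    const int block = dim2 * dim3;
    const int inner = idx1 * dim3 + idx2;
    for (int i = 0; i < dim1; ++i) {
        mask.at(static_cast<std::size_t>(i * block + inner)) = 1;
    }
    // Keep the elements where the mask is set, in storage order.
    std::vector<int> out;
    for (std::size_t n = 0; n < flat.size(); ++n) {
        if (mask[n]) out.push_back(flat[n]);
    }
    return out;
}
```

With a buffer holding 0 to 39 and the shape [1, 5, 4, 2], calling select_by_mask(flat, 5, 4, 2, 3, 0) gives 6, 14, 22, 30, 38, the same values as a4 in the numpy session above.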

54. Again, type is important! Sometimes an explicit cast is needed: kernel = kernel.toType(torch::kByte);

Today's task was to use libtorch 1.8 to run a pt model that was built for libtorch 1.0. With slight syntax changes, the old code compiles and runs under the higher version, but the results are wrong. This is quite troublesome,
because I did not know where the problem was. The first suspect was that the higher version does not support the old model. To check this, I ran inference with the same input under the high and the old PyTorch versions to see whether the model outputs agree. This is also troublesome, because running the old code under the higher version of PyTorch
needs a lot of changes, and all kinds of errors were reported. The model is PSENet, and the original code runs on CUDA 8 and Python 2.7, so besides print there were various other problems, mostly from the libraries needed for data processing. In the end I deleted all of that,
because inference only needs the line
out = model(img)
so I only had to prepare the same img. I condensed the very long test.py file into the following:

#encoding=utf-8
import os
import cv2
import sys
import time
import collections
import torch
import argparse
import numpy as np


import models
#import util


def test(args):


    # Setup Model
    if args.arch == "resnet50":
        model = models.resnet50(pretrained=True, num_classes=7, scale=args.scale)
    elif args.arch == "resnet101":
        model = models.resnet101(pretrained=True, num_classes=7, scale=args.scale)
    elif args.arch == "resnet152":
        model = models.resnet152(pretrained=True, num_classes=7, scale=args.scale)
    
    for param in model.parameters():
        param.requires_grad = False

    model = model.cuda()
    
    if args.resume is not None:                                         
        if os.path.isfile(args.resume):
            print("Loading model and optimizer from checkpoint '{}'".format(args.resume))
            checkpoint = torch.load(args.resume)
            
            # model.load_state_dict(checkpoint['state_dict'])
            d = collections.OrderedDict()
            for key, value in checkpoint['state_dict'].items():
                tmp = key[7:]
                d[tmp] = value
            model.load_state_dict(d)

            print("Loaded checkpoint '{}' (epoch {})"
                  .format(args.resume, checkpoint['epoch']))
            sys.stdout.flush()
        else:
            print("No checkpoint found at '{}'".format(args.resume))
            sys.stdout.flush()

    model.eval()

    img_tmp = torch.rand(1, 3, 963, 1280).cuda()
    traced_script_module = torch.jit.trace(model, img_tmp)
    traced_script_module.save("./myfile/22.pt")

    init_seed = 1 # set the same seed to ensure the same random numbers are generated
    torch.manual_seed(init_seed)
    torch.cuda.manual_seed(init_seed)

    img_tmp = torch.rand(1, 3, 64, 64).cuda()
    out = model(img_tmp)
    print(img_tmp)
    print(out)


    print("save pt ok!")
    return 1


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Hyperparams')
    parser.add_argument('--arch', nargs='?', type=str, default='resnet50')
    parser.add_argument('--resume', nargs='?', type=str, default="./myfile/checkpoint.pth.tar",    
                        help='Path to previous saved model to restart from')
    parser.add_argument('--binary_th', nargs='?', type=float, default=1.0,
                        help='Path to previous saved model to restart from')
    parser.add_argument('--kernel_num', nargs='?', type=int, default=3,
                        help='Path to previous saved model to restart from')
    parser.add_argument('--scale', nargs='?', type=int, default=1,
                        help='Path to previous saved model to restart from')
    parser.add_argument('--long_size', nargs='?', type=int, default=1280,
                        help='Path to previous saved model to restart from')
    parser.add_argument('--min_kernel_area', nargs='?', type=float, default=10.0,
                        help='min kernel area')
    parser.add_argument('--min_area', nargs='?', type=float, default=300.0,
                        help='min area')
    parser.add_argument('--min_score', nargs='?', type=float, default=0.93,
                        help='min score')
    
    args = parser.parse_args()
    test(args)

This part is very important:
init_seed = 1 # set the same seed to ensure the same random numbers are generated
torch.manual_seed(init_seed)
torch.cuda.manual_seed(init_seed)
Because I need to verify the model's output under both torch 1.0 and torch 1.8, the inputs must be identical, and setting the same seed ensures the same random numbers are generated. Printing img_tmp verifies that they match.
I then found that out does differ between the two versions, but only around the third decimal place; the leading digits are the same. So loading the low version's weights in the high version seems to be fine. But the results in libtorch are very different. Why?
The libtorch code needs a careful look!
So I experimented and printed more or less aimlessly. Printing turned out to be what mattered here!
What my lower version of libtorch printed is as follows:

[ Variable[CPUByteType]{7,703,1280} ]
[Variable[CPUByteType] [7, 703, 1280]]
[Variable[CPUByteType] [3, 703, 1280]]
kernel_size=3
[Variable[CPUByteType] [3, 703, 1280]]

Then the higher version prints as follows:

[CUDAFloatType [1, 7, 703, 1280]]
[CPUFloatType [7, 703, 1280]]
[CPUFloatType [3, 703, 1280]]
kernel_size=3
[CPUFloatType [3, 703, 1280]]

Did you see that the data types are different? Why they differ I cannot say, but now I knew it was a data type problem.
Then I added this line:

kernel = kernel.toType(torch::kByte);

Problem solved!
In my case, some operations produced CPUByteType in the lower version but CPUFloatType in the higher version.
This seemingly simple line cost me most of a day!
To sum up, the above is my thought process for locating and fixing the problem: keep narrowing down where it is, and keep experimenting.

Next, I will share a data type problem with OpenCV's Mat that I ran into recently.

Mat convertTo3Channels_2(const Mat& binImg)
{
    Mat three_channel = Mat::zeros(binImg.rows,binImg.cols,CV_8UC3);
    vector<Mat> channels;
    for (int i=0;i<3;i++)
    {
        channels.push_back(binImg);
    }
    merge(channels,three_channel);
    three_channel.convertTo(three_channel,CV_8UC3); // Important: this must be written again!
    return three_channel;
}

Looking at the code,
I declared the type CV_8UC3 at the start, Mat three_channel = Mat::zeros(binImg.rows,binImg.cols,CV_8UC3);, because the function has to return a uint8 image.
three_channel.convertTo(three_channel,CV_8UC3); // Important: this must be written again!
The conversion has to be written again here, otherwise what is returned is not of this type. I did not know how to view or print a Mat's type, but I inspect images with gdb ImageWatch in my debugger, and it shows the type below the image.
I reproduced the problem and took a screenshot. I set a breakpoint
right after the line merge(channels,three_channel); and gdb ImageWatch showed the following type.
 


As you can see, I clearly initialized Mat three_channel = Mat::zeros(binImg.rows,binImg.cols,CV_8UC3);
as CV_8UC3, which should be uint8, but after merge it is float. The merge function changed the type: it allocates its output with the same depth as the input channels, so a float binImg gives a float result regardless of how three_channel was initialized.
As a result, some later operations on the returned Mat behaved strangely, and at first I did not know where the problem was.

So just convert the type explicitly once more.
three_channel.convertTo(three_channel,CV_8UC3); // Important: this must be written again!
 


Summary:
Type is important.
Type is important.
Type is important.
Important things are worth saying three times.


Origin blog.csdn.net/m0_72734364/article/details/133428012