Semantic segmentation with libtorch

First, my setup: Ubuntu 16.04; OpenCV 3.4.3; torch 1.3.1 installed inside Anaconda (the C++ code differs between torch versions: above v1.0.1 you need the new API, otherwise the old one); CUDA 10.0; cuDNN 7.6.5.

The semantic segmentation network used here is ESPNetv2. It is developed in PyTorch and supports PyTorch 1.0; 1.3 also works for me. It can do semantic segmentation in real time (the same network can also do object detection and classification). With images resized to 384 x 384, my computer needs only about 15 ms per image, whereas SegNet needs 60 ms per frame on the same machine, and the two have similar accuracy. My graphics card is a GTX 1650.

Next, how to use the pre-trained model for semantic segmentation in a C++ environment.


1. Converting the .pth file into a .pt file

The code downloaded from GitHub comes with segmentation_demo.py. This demo runs semantic segmentation on the images in the sample_images directory and saves the output to the segmentation_results directory. Look in its source to see how the model is built: the original network definition is needed, and here an args parameter must also be passed in. Some networks do not need any parameters; check the source code for the specifics.

First, here is the code (only part is posted; the complete code can be downloaded here):

    # Build the model and move it to the target device; some networks take no arguments
    model = ESPNetv2Segmentation(args).to(device='cuda')
    #model = espnetv2_seg(args)
    model.load_state_dict(torch.load('./model/segmentation/model_zoo/espnetv2/espnetv2_s_2.0_pascal_384x384.pth', map_location='cuda'))

    model.eval()
    # Random example input on the GPU (torch.Tensor(...) would leave it uninitialized)
    example = torch.rand(1, 3, 384, 384).cuda()
    out = model(example)
    print_info_message(out.size())
    # This line prints a warning when it runs; the warning is harmless
    traced_script_module = torch.jit.trace(model, example)
    traced_script_module.save("espnetv2_s_2.0_pascal_384x384.pt")
    print_info_message('Done')

The meaning of args is explained in the complete code; it is too long to post here. The following warning may appear when you run the program above:

I searched for this warning for a long time without finding a fix, but it does not affect the generated .pt file. Because this model was trained on the GPU and is loaded there, the input tensor has to be on the GPU as well, which is what .cuda() does, and the traced model must be on the GPU too. Other networks should work much the same way: find the code that implements the original network, check which parameters it needs, load the pre-trained weights, feed it a random tensor, and generate the .pt file.


2. Loading the model in C++ (libtorch) and running semantic segmentation

I studied this part for a long time. There is not much information online, and since everyone uses a different network the implementations differ somewhat, so you first need to understand how the Python demo performs semantic segmentation.

The general steps are as follows. First, create the model from the parsed arguments:

model = espnetv2_seg(args)  # pass in the various arguments

Then load the pre-trained weights:

weight_dict = torch.load(args.weights_test, map_location=torch.device('cuda'))
model.load_state_dict(weight_dict)

Then read an image and apply a series of processing steps: resize it, convert BGR to RGB, convert the image to a tensor, and normalize it. Pass the tensor through the model to get the output tensor, convert that to a numpy array and then to an image, restore the original size, apply the color palette, convert to RGB, and save. Our C++ program has to follow similar steps. First, the CMakeLists.txt:

cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(example-app)

SET(CMAKE_BUILD_TYPE Release)
MESSAGE("Build type: " ${CMAKE_BUILD_TYPE})

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wall -O3 -march=native ")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -O3 -march=native")
#set( CMAKE_CXX_FLAGS "-std=c++11 -O3" )
#Check C++11 or C++0x support
include(CheckCXXCompilerFlag)
CHECK_CXX_COMPILER_FLAG("-std=c++11" COMPILER_SUPPORTS_CXX11)
CHECK_CXX_COMPILER_FLAG("-std=c++0x" COMPILER_SUPPORTS_CXX0X)
if(COMPILER_SUPPORTS_CXX11)
   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11")
   add_definitions(-DCOMPILEDWITHC11)
   message(STATUS "Using flag -std=c++11.")
elseif(COMPILER_SUPPORTS_CXX0X)
   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++0x")
   add_definitions(-DCOMPILEDWITHC0X)
   message(STATUS "Using flag -std=c++0x.")
else()
   message(FATAL_ERROR "The compiler ${CMAKE_CXX_COMPILER} has no C++11 support. Please use a different C++ compiler.")
endif()

if( TORCH_PATH ) 
   message("TORCH_PATH set to: ${TORCH_PATH}")
   set(Torch_DIR ${TORCH_PATH})
else()
   message(FATAL_ERROR "Need to specify Torch path, e.g., pytorch/torch/share/cmake/Torch ")
endif()

find_package(OpenCV 3.4.3 REQUIRED)
find_package(Torch REQUIRED)
message(STATUS "Torch version is: ${Torch_VERSION}")
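if(Torch_VERSION VERSION_GREATER 1.0.1)   # VERSION_GREATER compares version strings component by component
   message(STATUS "Torch version is newer than v1.0.1, will use new api")
   add_definitions(-DTORCH_NEW_API)   # the C++ code checks whether TORCH_NEW_API is defined, because some features are used differently across versions
endif()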

add_executable(example-app example-app.cpp)
target_link_libraries(example-app ${TORCH_LIBRARIES} ${OpenCV_LIBS})
set_property(TARGET example-app PROPERTY CXX_STANDARD 11)


add_executable(example-c example-c.cpp)
target_link_libraries(example-c ${TORCH_LIBRARIES} ${OpenCV_LIBS})
set_property(TARGET example-c PROPERTY CXX_STANDARD 11)

build.sh

mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release  -DTORCH_PATH=/home/azs/anaconda3/lib/python3.6/site-packages/torch/share/cmake/Torch # use the PyTorch installed inside Anaconda
make -j8

At configure time, CMake finds Torch's CMake files through the TORCH_PATH variable passed on the command line with -D.

I wrote two programs. example-app.cpp reads the images in a folder of the TUM dataset and runs semantic segmentation on them one after another; example-c.cpp segments a single image. The single-image code, as an example:

#include <torch/script.h>
#include <torch/torch.h>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgcodecs.hpp>
#include <chrono>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
// Colorize the prediction map with a lookup table (LUT)
void Visualization(cv::Mat prediction_map, std::string LUT_file) {

  cv::cvtColor(prediction_map.clone(), prediction_map, cv::COLOR_GRAY2BGR);
  cv::Mat label_colours = cv::imread(LUT_file, cv::IMREAD_COLOR);  // palette image; cv::LUT expects 256 entries
  cv::cvtColor(label_colours, label_colours, cv::COLOR_RGB2BGR);
  cv::Mat output_image;
  cv::LUT(prediction_map, label_colours, output_image);  // map each class index to its color

  cv::imshow("Display window", output_image);

}
// Preprocess the image and return it as a tensor
torch::Tensor process(cv::Mat& image, torch::Device device, int img_size)
{
    cv::imshow("test1", image);
    // Convert the input image in place: BGR -> RGB
    cv::cvtColor(image, image, cv::COLOR_BGR2RGB);
    cv::Mat img_float;  // still 8-bit after resize; the float conversion happens on the tensor below
    // image.convertTo(img_float, CV_32F, 1.0 / 255);  // would normalize to [0,1] here instead
    cv::resize(image, img_float, cv::Size(img_size, img_size));

    std::vector<int64_t> dims = {1, img_size, img_size, 3};
    #if defined(TORCH_NEW_API)  // chosen at compile time by the CMake definition
    // Wrap the image bytes as a tensor, then copy it to the device
    torch::Tensor img_var = torch::from_blob(img_float.data, dims, torch::kByte).to(device);
    #else
    // These two lines are only for the old libtorch API.
    // The blob holds 8-bit pixels, so it is read as kByte rather than kFloat32.
    torch::Tensor img_tensor = torch::CPU(torch::kByte).tensorFromBlob(img_float.data, dims);
    torch::Tensor img_var = torch::autograd::make_variable(img_tensor, false).to(device);
    #endif
    img_var = img_var.permute({0, 3, 1, 2});  // reorder to the layout the network expects: 1,3,384,384
    img_var = img_var.toType(torch::kFloat);
    img_var = img_var.div(255);  // scale to [0,1]

    return img_var;

}
int main() {

    char path[] = "../h2.png";
    int img_size = 384;
    std::string LUT_file = "../pascal.png";
    // Set the device type
    torch::DeviceType device_type;
    device_type = torch::kCUDA;
    torch::Device device(device_type);
    std::cout << "CUDA support: " << (torch::cuda::is_available() ? "true" : "false") << std::endl;
    // Load the model
    // New API
    torch::jit::script::Module module = torch::jit::load("../espnetv2_s_2.0_pascal_384x384.pt");
    // Old API: there the module is a pointer, so use ->; in the new API it is an object, so use .
    //std::shared_ptr<torch::jit::script::Module> module = torch::jit::load("../espnetv2_s_2.0_pascal_384x384.pt");
    module.to(device);

    // Read the image
    cv::Mat image = cv::imread(path, cv::ImreadModes::IMREAD_COLOR);
    if (image.empty()) {  // imread returns an empty Mat when the file cannot be read
        std::cerr << "Could not read image: " << path << std::endl;
        return 1;
    }
    // Preprocess the image into a tensor
    torch::Tensor img_var = process(image, device, img_size);
    std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
    torch::Tensor result = module.forward({img_var}).toTensor();  // forward pass; the result is still a tensor
    std::chrono::steady_clock::time_point t2 = std::chrono::steady_clock::now();
    std::cout << "Processing time = " << (std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count()) / 1000000.0 << " sec" << std::endl;
    result = result.argmax(1);  // pick the most probable class for each pixel
    result = result.squeeze();  // drop the batch dimension
    result = result.to(torch::kU8);  // .mul(100) would make the regions easier to see in grayscale, but the LUT below already gives each mask its own color
    result = result.to(torch::kCPU);
    // Wrap the tensor data in a Mat (no copy, so result must outlive pts_mat)
    cv::Mat pts_mat(cv::Size(img_size, img_size), CV_8U, result.data_ptr());
    //cv::imshow("test", pts_mat);  // this would show the grayscale label map
    Visualization(pts_mat, LUT_file);  // colorize
    cv::waitKey(0);

   return 0;

}
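
To make the layout change in process() concrete, here is a minimal sketch in plain C++, without libtorch or OpenCV, of what permute({0,3,1,2}) followed by div(255) does to the pixel data for a single image. The helper name hwc_to_chw_normalized is hypothetical and only for illustration; it is not part of either library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of libtorch or OpenCV): converts an interleaved
// H x W x C uint8 image into a planar C x H x W float buffer scaled to [0, 1].
std::vector<float> hwc_to_chw_normalized(const std::vector<std::uint8_t>& hwc,
                                         std::size_t height,
                                         std::size_t width,
                                         std::size_t channels) {
    assert(hwc.size() == height * width * channels);
    std::vector<float> chw(hwc.size());
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            for (std::size_t c = 0; c < channels; ++c) {
                // Source index: pixel-major, all channels of one pixel are adjacent.
                const std::size_t src = (y * width + x) * channels + c;
                // Destination index: channel-major, one full plane per channel.
                const std::size_t dst = (c * height + y) * width + x;
                chw[dst] = static_cast<float>(hwc[src]) / 255.0f;
            }
        }
    }
    return chw;
}
```

For a 1 x 2 image with pixels (255, 0, 51) and (0, 255, 102), the output holds the two red values first, then the two green values, then the two blue values, each divided by 255.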

The comments are in the code. The final result:

Original:

Result:
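
The post-processing behind the colored result can be sketched in plain C++ as well. This is an illustration under assumptions, not the libtorch or OpenCV implementation: argmax_over_classes and colorize are hypothetical helpers, the scores are assumed to be stored plane by plane (class-major) for a batch of one, and ties go to the lowest class index in this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: per-pixel argmax over a planar C x H x W score buffer,
// the same idea as result.argmax(1) for a batch of one.
std::vector<std::uint8_t> argmax_over_classes(const std::vector<float>& scores,
                                              std::size_t num_classes,
                                              std::size_t num_pixels) {
    assert(num_classes > 0 && num_classes <= 256);
    assert(scores.size() == num_classes * num_pixels);
    std::vector<std::uint8_t> labels(num_pixels, 0);
    for (std::size_t p = 0; p < num_pixels; ++p) {
        float best = scores[p];  // score of class 0 at this pixel
        for (std::size_t c = 1; c < num_classes; ++c) {
            const float s = scores[c * num_pixels + p];
            if (s > best) {
                best = s;
                labels[p] = static_cast<std::uint8_t>(c);
            }
        }
    }
    return labels;
}

using Color = std::array<std::uint8_t, 3>;

// Hypothetical helper: replaces each class index by its palette color,
// the same idea as the lookup-table step in Visualization().
std::vector<Color> colorize(const std::vector<std::uint8_t>& labels,
                            const std::vector<Color>& palette) {
    std::vector<Color> image;
    image.reserve(labels.size());
    for (std::uint8_t label : labels) {
        assert(label < palette.size());
        image.push_back(palette[label]);
    }
    return image;
}
```

The label map is kept as 8-bit values because the lookup table has at most 256 entries, which is also why the program above converts the argmax output to kU8 before handing it to OpenCV.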

If you run example-app.cpp instead, the timing results are as follows:

As you can see, most frames take about 14 ms.

Download my code
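
When measuring per-frame time like this, the first few forward passes are often slower than the rest, so it can help to discard them and average over several runs. Below is a minimal sketch using only the standard library; mean_milliseconds is a hypothetical helper, and the work callable stands in for one forward pass. For a GPU model, work should probably include a step that waits for the result, such as the copy to the CPU that the program above does after forward, because CUDA kernels are launched asynchronously.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` a few times untimed, then returns the mean
// wall-clock time per call in milliseconds over `timed_runs` calls.
double mean_milliseconds(const std::function<void()>& work,
                         int warmup_runs,
                         int timed_runs) {
    assert(timed_runs > 0);
    for (int i = 0; i < warmup_runs; ++i) {
        work();  // warm-up calls are not measured
    }
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < timed_runs; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / timed_runs;
}
```

In the program above, this would be called with a lambda that runs module.forward on the preprocessed tensor and copies the output to the CPU.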


References:

https://blog.csdn.net/cp562090732/article/details/100172372?depth_1-utm_source=distribute.pc_relevant.none-task&utm_source=distribute.pc_relevant.none-task#4__142

https://blog.csdn.net/pplxlee/article/details/90445316

https://blog.csdn.net/u010397980/article/details/89437628

https://blog.csdn.net/IAMoldpan/article/details/85057238

https://www.cnblogs.com/geoffreyone/p/10827010.html

https://www.jianshu.com/p/aee6a3d72014

https://www.jianshu.com/p/7cddc09ca7a4

Reference for some problems encountered during compilation:

https://www.jianshu.com/p/186bcdfe9492

 


Origin blog.csdn.net/qq_35590091/article/details/104557020