Yolov5 to ONNX model + C++ deployment with ONNX Runtime (including an introduction to the official documentation and using different inference engines as the ONNX Runtime backend)

Added 2023-05-12: compile ONNX Runtime directly from source, and let ONNX Runtime use different inference engines (especially TensorRT) as its backend.

  • Documentation: TensorRT Execution Provider; note that the versions must correspond
  • Of course, other inference engines can also be used as the backend: ONNX Runtime Execution Providers
  • The currently supported backends are listed in that documentation.
  • Check the version compatibility yourself; I verified ONNX Runtime 1.12.1 successfully with TensorRT 8.4.3 | CUDA 11.4.4 | cuDNN 8.4.1.
  • I originally had driver version 525.60 but could not find it again after reinstalling, so I installed the 525.125 version instead. Note that the CUDA version shown here is the highest version supported by the driver!
  • Reference:
    TensorRT Execution Provider
    CUDA Execution Provider
    The TensorRT configuration code is as follows; it only enables FP16 quantization, not INT8 quantization. A CUDA Execution Provider sketch follows the code.
    OrtTensorRTProviderOptions trt_options{};
    trt_options.device_id = 0;
    trt_options.has_user_compute_stream = 1;
    trt_options.trt_max_partition_iterations = 1000;
    trt_options.trt_min_subgraph_size = 1;
    trt_options.trt_engine_decryption_enable = false;
    trt_options.trt_dla_core = 0;
    trt_options.trt_dla_enable = 0;
    trt_options.trt_fp16_enable = 1;
    trt_options.trt_int8_enable = 0;
    trt_options.trt_max_workspace_size = 2147483648;
    trt_options.trt_int8_use_native_calibration_table = 1;
    trt_options.trt_engine_cache_enable = 1;
    trt_options.trt_engine_cache_path = "./trtcache";
    trt_options.trt_dump_subgraphs = 1;
    session_options.AppendExecutionProvider_TensorRT(trt_options);
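  • For comparison, a minimal sketch of enabling the CUDA Execution Provider instead of TensorRT, using the OrtCUDAProviderOptions struct (the field value is illustrative):

    // Minimal sketch: use the CUDA Execution Provider as the backend instead.
    OrtCUDAProviderOptions cuda_options{};
    cuda_options.device_id = 0; // GPU 0 (illustrative)
    session_options.AppendExecutionProvider_CUDA(cuda_options);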

You can use the Linux (x86_64) version to install.
You can use the nvidia-smi command to view the current driver version and the highest supported CUDA version.
Note: Because of CUDA Minor Version Compatibility, ONNX Runtime built with CUDA 11.4 should be compatible with any CUDA 11.x version. Please refer to Nvidia CUDA Minor Version Compatibility.

  • Note:
    To compile ONNX Runtime with TensorRT as the backend on Ubuntu, you need to install the cuDNN developer package, because it contains the cuDNN header files and dynamic libraries used to compile and link code that uses cuDNN.
    Note that cuDNN is installed here from deb packages. The cuDNN developer package is different from the cuDNN runtime package: the former contains the files required for development, while the latter contains only the runtime library files. Make sure to select the correct one when installing; to install the dev package, you must first install the runtime package. After installing cuDNN from deb packages on Ubuntu, the cuDNN library files usually end up under /usr/lib/x86_64-linux-gnu/ (specifically, libcudnn.so is installed into /usr/lib/x86_64-linux-gnu/), and the header files under /usr/include/.
    The version-correspondence table in the onnxruntime official docs below is not accurate! For matching ONNX Runtime / TensorRT / CUDA versions, it is best to refer to the TensorRT documentation and the ONNX Runtime GitHub repository: TensorRT Release Notes, ONNX Runtime Releases

- Supplementary knowledge

1 Export and verification of ONNX model

The model used in this article is described below.
According to the Python code and environment, the opset version I use is 12:

parser.add_argument('--opset', type=int, default=12, help='ONNX: opset version')

The version of onnx is 1.12.0; onnxruntime in Python is 1.13.1, and version 1.13.1 will also be used here.
Versions of CUDA, TensorRT, etc.: (Windows 10) TensorRT acceleration of the Yolov5-5.0 model + C++ deployment + VS2019 DLL packaging (CMake) + Qt calls

CUDA 11.1
cuDNN 8.5.0
TensorRT 8.2.1.8
OpenCV 4.5.5
CMake 3.24.2

2 ONNX Runtime (C++) reads the model in ONNX format

2.1 Check the version compatibility of ONNX Runtime (C++)

In Python you can simply pip install onnxruntime, but the C++ version has to be set up by yourself, and compatibility must also be considered.

2.1.1 Enter the Docs page of ONNX Runtime

Welcome to ONNX Runtime

2.1.2 View the compatibility of the C++ version of ONNX Runtime with different systems

From the How to use ONNX Runtime option, enter Get Started, and then Get Started / C++. Compatibility can be checked under Builds.

2.2 Installation of ONNX Runtime (C++)

My main purpose here is to learn how to load and run the model with ONNX Runtime.
For the C++ version of ONNX Runtime there are two options:

  1. download a prebuilt package : download the prebuilt package directly
  2. build from source : build the required version from the source code, i.e. build a version matching your own machine's environment, such as a specific CUDA version
  • Choose between the two methods according to your own needs. The following is a brief introduction to both options.

2.2.1 Select the prebuilt version you need in GitHub

Option 1: download a prebuilt package
The version I downloaded is onnxruntime-win-x64-1.13.1.zip; after downloading, simply unzip it and use it.

2.2.2 build from source

Option 2: build from source
Just follow the guidance for Option 2 in the documentation.

2.3 ONNX Runtime (C++) reads the model

#include <fstream>
#include <iostream>
#include <sstream>
#include <cmath>
#include <ctime>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <onnxruntime_cxx_api.h>
#include <opencv2/core/utils/logger.hpp>
#include <opencv2/opencv.hpp>
using namespace cv;

Mat resize_image(Mat srcimg, int* newh, int* neww, int* top, int* left)
{
    int srch = srcimg.rows, srcw = srcimg.cols;
    int inpHeight = 640;
    int inpWidth = 640;
    *newh = inpHeight;
    *neww = inpWidth;
    bool keep_ratio = true;
    Mat dstimg;
    if (keep_ratio && srch != srcw) {
        float hw_scale = (float)srch / srcw;
        if (hw_scale > 1) {
            *newh = inpHeight;
            *neww = int(inpWidth / hw_scale);
            resize(srcimg, dstimg, Size(*neww, *newh), INTER_AREA);
            *left = int((inpWidth - *neww) * 0.5);
            copyMakeBorder(dstimg, dstimg, 0, 0, *left, inpWidth - *neww - *left, BORDER_CONSTANT, 114);
        }
        else {
            *newh = (int)(inpHeight * hw_scale);
            *neww = inpWidth;
            resize(srcimg, dstimg, Size(*neww, *newh), INTER_AREA);
            *top = (int)((inpHeight - *newh) * 0.5);
            copyMakeBorder(dstimg, dstimg, *top, inpHeight - *newh - *top, 0, 0, BORDER_CONSTANT, 114);
        }
    }
    else {
        resize(srcimg, dstimg, Size(*neww, *newh), INTER_AREA);
    }
    return dstimg;
}

int main(int argc, char* argv[])
{
    //std::string imgpath = "images/bus.jpg";
    std::string imgpath = "images/real.jpg";
    utils::logging::setLogLevel(utils::logging::LOG_LEVEL_ERROR); // let OpenCV log errors only
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "yolov5s-5.0");
    Ort::SessionOptions session_options;
    session_options.SetIntraOpNumThreads(1);

    session_options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_EXTENDED);

#ifdef _WIN32
    //const wchar_t* model_path = L"yolov5s.onnx";
    const wchar_t* model_path = L"sim_best20221027.onnx";
#else
    const char* model_path = "yolov5s.onnx";
#endif

    std::vector<std::string> class_names;
    //std::string classesFile = "class.names";
    std::string classesFile = "myclass.txt";
    std::ifstream ifs(classesFile.c_str());
    std::string line;
    while (getline(ifs, line)) class_names.push_back(line);
    Ort::Session session(env, model_path, session_options);
    // print model input layer (node names, types, shape etc.)
    Ort::AllocatorWithDefaultOptions allocator;

    // number of model input nodes
    size_t num_input_nodes = session.GetInputCount();
    std::vector<const char*> input_node_names = { "images" };
    std::vector<const char*> output_node_names = { "output0" };

    size_t input_tensor_size = 3 * 640 * 640;
    std::vector<float> input_tensor_values(input_tensor_size);
    cv::Mat srcimg = cv::imread(imgpath);
    int newh = 0, neww = 0, padh = 0, padw = 0;

    Mat dstimg = resize_image(srcimg, &newh, &neww, &padh, &padw); // padded resize

    // HWC (BGR, uchar) -> CHW (RGB, float in [0,1])
    //resizedImage.convertTo(floatImage, CV_32FC3, 1 / 255.0);
    for (int c = 0; c < 3; c++)
    {
        for (int i = 0; i < 640; i++)
        {
            for (int j = 0; j < 640; j++)
            {
                float pix = dstimg.ptr<uchar>(i)[j * 3 + 2 - c]; // 2 - c flips BGR to RGB
                input_tensor_values[c * 640 * 640 + i * 640 + size_t(j)] = pix / 255.0;
            }
        }
    }
    // create input tensor object from data values
    std::vector<int64_t> input_node_dims = { 1, 3, 640, 640 };
    auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(memory_info, input_tensor_values.data(), input_tensor_size, input_node_dims.data(), input_node_dims.size());

    std::vector<Ort::Value> ort_inputs;
    ort_inputs.push_back(std::move(input_tensor));
    // score the model with the input tensor, get back the output tensor
    std::vector<Ort::Value> output_tensors = session.Run(Ort::RunOptions{ nullptr }, input_node_names.data(), ort_inputs.data(), input_node_names.size(), output_node_names.data(), output_node_names.size());

    // get a pointer to the output tensor float values
    const float* rawOutput = output_tensors[0].GetTensorData<float>();
    // generate proposals
    std::vector<int64_t> outputShape = output_tensors[0].GetTensorTypeAndShapeInfo().GetShape();
    size_t count = output_tensors[0].GetTensorTypeAndShapeInfo().GetElementCount();
    std::vector<float> output(rawOutput, rawOutput + count);

    std::vector<cv::Rect> boxes;
    std::vector<float> confs;
    std::vector<int> classIds;
    int numClasses = (int)outputShape[2] - 5;
    int elementsInBatch = (int)(outputShape[1] * outputShape[2]);

    float confThreshold = 0.5;
    for (auto it = output.begin(); it != output.begin() + elementsInBatch; it += outputShape[2])
    {
        float clsConf = *(it + 4); // objectness score
        if (clsConf > confThreshold)
        {
            int centerX = (int)(*it);
            int centerY = (int)(*(it + 1));
            int width = (int)(*(it + 2));
            int height = (int)(*(it + 3));
            int x1 = centerX - width / 2;
            int y1 = centerY - height / 2;
            boxes.emplace_back(cv::Rect(x1, y1, width, height));

            // the first 5 elements are x, y, w, h and objectness confidence
            int bestClassId = -1;
            float bestConf = 0.0;

            for (int i = 5; i < numClasses + 5; i++)
            {
                if ((*(it + i)) > bestConf)
                {
                    bestConf = it[i];
                    bestClassId = i - 5;
                }
            }

            //confs.emplace_back(bestConf * clsConf);
            confs.emplace_back(clsConf);
            classIds.emplace_back(bestClassId);
        }
    }

    float iouThreshold = 0.5;
    std::vector<int> indices;
    // perform non-maximum suppression to eliminate redundant overlapping boxes with lower confidences
    cv::dnn::NMSBoxes(boxes, confs, confThreshold, iouThreshold, indices);

    // random seed
    RNG rng((unsigned)time(NULL));
    for (size_t i = 0; i < indices.size(); ++i)
    {
        int index = indices[i];
        int colorR = rng.uniform(0, 255);
        int colorG = rng.uniform(0, 255);
        int colorB = rng.uniform(0, 255);

        // keep two decimal places
        float scores = round(confs[index] * 100) / 100;
        std::ostringstream oss;
        oss << scores;

        rectangle(dstimg, Point(boxes[index].tl().x, boxes[index].tl().y), Point(boxes[index].br().x, boxes[index].br().y), Scalar(colorR, colorG, colorB), 1.5);
        putText(dstimg, class_names[classIds[index]] + " " + oss.str(), Point(boxes[index].tl().x, boxes[index].tl().y - 5), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(colorR, colorG, colorB), 2);
    }
    imshow("Detection result", dstimg);
    cv::waitKey();
}

There is a big pitfall here: for the node names, the ONNX Runtime API allows direct assignment:

    std::vector<const char*> input_node_names = { "images" };
    std::vector<const char*> output_node_names = { "output0" };

However, if input_node_names is instead filled through the ONNX Runtime name API as below, using it throws an exception. On inspection, both contain "images", but the const char* __ptr64 values (the address of the 'i') differ: the API-returned name is owned by a temporary smart pointer, so the stored pointer ends up dangling (see 2.3.1.11).

    // print number of model input nodes
    size_t num_input_nodes = session.GetInputCount();
    for (int i = 0; i < num_input_nodes; i++)
    {
        Ort::AllocatedStringPtr input_name_Ptr = session.GetInputNameAllocated(i, allocator);
        input_node_names.push_back(input_name_Ptr.get()); // "images", char* __ptr64 -- dangles once input_name_Ptr goes out of scope!
        Ort::TypeInfo input_type_info = session.GetInputTypeInfo(i);
        auto input_tensor_info = input_type_info.GetTensorTypeAndShapeInfo(); // get OrtTensorTypeAndShapeInfo from an OrtTypeInfo
        auto input_dims = input_tensor_info.GetShape(); // uses GetDimensionsCount & GetDimensions to return a std::vector of the shape
        input_node_dims.push_back(input_dims); // input_node_dims[0] = vector<int64_t>{1, 3, 640, 640}
    }


2.3.1 API Learning of ONNX Runtime (C++)

  • The C++ API is a thin wrapper of the C API; please refer to the C API for more details. You can also open C/C++ API Docs from the official API Docs.
  • Below are some important APIs from the official documentation. The official C/C++ docs feel rather messy, so we learn directly from the MNIST demo combined with the official C & C++ APIs.
  • Reference: Ort Namespace Reference for C++
  • The following introduction is organized around the Ort Namespace Reference, grouped by the classes in it, and then covers the member functions each class uses.
  • According to the Python onnxruntime introduction, InferenceSession is the main class of ONNX Runtime; it is used to load and run ONNX models and to specify environment and application configuration options.
  • Functions marked deprecated should not be used; use their suggested alternatives instead.

2.3.1.1 (Classes)Ort::MemoryInfo

  1. (Static Public Member Functions) CreateCpu()
    static MemoryInfo Ort::MemoryInfo::CreateCpu(OrtAllocatorType type, OrtMemType mem_type1)
  • Function input: an OrtAllocatorType and an OrtMemType; see the two enums in the official docs.
  • Function return: an OrtMemoryInfo, which describes where the p_data buffer (pointer to the data) resides (CPU vs GPU etc.); functions such as CreateTensor() require an OrtMemoryInfo as input.

2.3.1.2 (Classes)Ort::Value

  1. (Static Public Member Functions) CreateTensor()
    There are four overloaded forms; see CreateTensor() [1/4]. MNIST uses the first one.
  • Input:
    p_data: the data buffer; you can obtain it with the .data() function
    p_data_element_count: the element count; you can obtain it with the .size() function
    shape: pointer to the dimensions, e.g. (1, 3, 640, 640)
    shape_len: the number of dimensions, e.g. 4
  • Return: an Ort::Value (see Ort::Value Struct Reference). It "Creates a tensor with a user supplied buffer. Wraps OrtApi::CreateTensorWithDataAsOrtValue". Following that link, the wrapped C function returns nullptr if there is no error; on error it returns a pointer to an OrtStatus containing the error details, which should be freed with OrtApi::ReleaseStatus.
  2. (Public Member Functions) GetTensorMutableData() or GetTensorData(); the difference is that the latter's input and return are const.

The C function that GetTensorMutableData() wraps is GetTensorMutableData(): get a pointer to the raw data inside the tensor, used to directly read/write/modify the internal tensor data.
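A minimal sketch tying these together (the buffer names come from the full example above): wrap an existing float buffer in an Ort::Value with CreateTensor(), then read it back with GetTensorMutableData() / GetTensorData().

    std::vector<float> input_tensor_values(3 * 640 * 640);
    std::vector<int64_t> input_node_dims = { 1, 3, 640, 640 };
    auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value tensor = Ort::Value::CreateTensor<float>(
        memory_info,
        input_tensor_values.data(),  // p_data
        input_tensor_values.size(),  // p_data_element_count
        input_node_dims.data(),      // shape
        input_node_dims.size());     // shape_len
    float* mutable_data = tensor.GetTensorMutableData<float>(); // read/write access
    const float* const_data = tensor.GetTensorData<float>();    // read-only access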

2.3.1.3 (Classes)Ort::Session

  1. (Public Member Functions): link
    size_t GetInputCount () const
    size_t GetOutputCount () const
    Note that these are the numbers of input/output nodes the model requires, not the number of images in a batch. For example, an unprocessed yolov5 model has one input and three outputs; that is what these counts refer to.
  2. (Public Member Functions) Run()
    There are three overloaded forms; see Run() [1/3]. MNIST uses the second one.
  • Input parameters:
    input_names: the array of input names
    input_values: the input data as Ort::Value objects; to construct them, refer to the first form of CreateTensor()
  • The first form of the function returns:
    A std::vector of Value objects that directly maps to the output_count (e.g. output_name[0] is the first entry of the returned vector)
  3. (Public Member Functions) Session()
    There are 5 overloaded forms; see Session() [1/5]. This article uses the second one, which wraps OrtApi::CreateSession.
  4. (Public Member Functions) GetInputName() / GetOutputName()
    char * GetInputName (size_t index, OrtAllocator *allocator) const
    char * GetOutputName (size_t index, OrtAllocator *allocator) const
    Both are marked Deprecated, so these two member functions have been abandoned and should be replaced with GetInputNameAllocated() and GetOutputNameAllocated(). The C functions they wrap are SessionGetInputName() and SessionGetOutputName().
  • The replacements return a copy of the input/output name at the specified index (corresponding to the ONNX model's input and output names). The return value is an AllocatedStringPtr, a unique_ptr smart-pointer instance; see 2.3.1.11 for details.
  5. (Public Member Functions): link
    TypeInfo GetInputTypeInfo (size_t index) const
    TypeInfo GetOutputTypeInfo (size_t index) const
    The usage of their input and output parameters can be seen from the C functions they wrap: SessionGetInputTypeInfo() and SessionGetOutputTypeInfo().

2.3.1.4 (Classes)Ort::Env

  • Reference: Ort::Env Struct Reference
  • The Env holds the logging state used by all other objects. Note: one Env must be created before using any other ONNX Runtime functionality.
  1. (Public Member Functions) Env()
    There are 6 overloaded forms; see Env() [1/6]. This article uses the second one; its OrtLoggingLevel parameter specifies the lowest severity of log message that will be displayed.

2.3.1.5 (Classes)Ort::SessionOptions

  1. (Public Member Functions) SessionOptions()
    There are 3 overloaded forms; see SessionOptions() [1/3]. This article uses the second one, which wraps CreateSessionOptions().
    On execution providers: call the corresponding append method first for your preferred execution provider, then for the less preferred ones. If none is called, Ort will use its internal CPU execution provider.
  2. (Public Member Functions) SetGraphOptimizationLevel()
    Introduction: Graph Optimizations in ONNX Runtime.
    Each level of optimization is applied on top of the previous level. The levels are defined by the GraphOptimizationLevel enum.
  3. (Public Member Functions) SetIntraOpNumThreads()
    Introduction: SetIntraOpNumThreads(), which wraps OrtApi::SetIntraOpNumThreads.
    Sets the number of threads used for parallel execution within a node. When running single-node operations, e.g. add, this sets the maximum number of threads to use.

2.3.1.6 (Classes)Ort::TypeInfo

  1. (Public Member Functions)
    Unowned< TensorTypeAndShapeInfo > GetTensorTypeAndShapeInfo () const
    See GetTensorTypeAndShapeInfo(); it wraps CastTypeInfoToTensorInfo(): get an OrtTensorTypeAndShapeInfo from an OrtTypeInfo.

2.3.1.7 (Classes)Ort::TensorTypeAndShapeInfo

  1. (Public Member Functions) std::vector< int64_t > GetShape () const
    See GetShape(); it uses GetDimensionsCount & GetDimensions to return a std::vector of the shape.
  2. (Public Member Functions) size_t GetElementCount () const
    See GetElementCount(); it wraps GetTensorShapeElementCount(), which returns the total number of elements (all dimensions multiplied together); it returns 1 for 0 dimensions and -1 if any dimension is less than 0.

2.3.1.8 (Classes)Ort::AllocatorWithDefaultOptions / Ort::Allocator

2.3.1.10 (Classes)Ort::RunOptions

2.3.1.11 (Classes)AllocatedStringPtr

AllocatedStringPtr: note that there is a pitfall here. AllocatedStringPtr is a smart pointer, so you must pay attention to its lifetime.
This unique_ptr typedef is used to own strings allocated by OrtAllocators and free them when the scope ends. The given allocator must outlive the lifetime of the AllocatedStringPtr instance.
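A minimal sketch of a lifetime-safe way to collect the node names (assuming the session and allocator from the main example): copy each name into std::string storage that outlives the const char* vector passed to Run().

    // Keep the names alive in std::string storage; the const char* vector
    // then points into memory we own, not into a destroyed AllocatedStringPtr.
    std::vector<std::string> input_names_storage;
    std::vector<const char*> input_node_names;
    for (size_t i = 0; i < session.GetInputCount(); i++) {
        Ort::AllocatedStringPtr name_ptr = session.GetInputNameAllocated(i, allocator);
        input_names_storage.push_back(name_ptr.get()); // copy before name_ptr is freed
    }
    for (const auto& s : input_names_storage)
        input_node_names.push_back(s.c_str()); // pointers into memory we own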

2.3.2 Basic learning of OpenCV (C++)

To run inference on images, you need OpenCV to process them.

2.3.2.1 Important library functions

1. Use the function cv::imread() to read images
2. Use the function cv::imshow() to display images
  • When imshow() is used, it must be followed by a waitKey() call, otherwise the image cannot be displayed normally.
    waitKey() explanation:
    @brief : Polls for a pressed key.
    The function pollKey polls for key events without waiting. It returns the code of the pressed key or -1 if no key was pressed since the last call. To wait until a key is pressed, use waitKey.
    @note : The functions waitKey and pollKey are the only methods in HighGUI that can fetch and handle GUI events, so one of them needs to be called periodically for normal event processing, unless HighGUI is used within an environment that takes care of event processing.
    @note : The function only works if there is at least one HighGUI window created and the window is active. If there are several HighGUI windows, any of them can be made active.
3. Use the function cv::dnn::NMSBoxes for non-maximum suppression

NMSBoxes() [1/3]

4. Use the function cv::rectangle() to draw rectangles

rectangle

5. Use the function cv::putText() to write text

putText()

2.3.2.2 Basic data structure Mat

Official documentation: cv::Mat Class Reference

1. Use the member function channels() to get the number of channels of the matrix

Color channel conversion: cvtColor(cv_image, cv_image, cv::COLOR_BGR2RGB);
See: [OpenCV3] Color space conversion - cv::cvtColor() detailed explanation

2. Use the member variable dims to get the dimensionality of the matrix

Tips: the difference between dims and channels()

3. Use the member function size() to get the size of the matrix
4. Use the member function convertTo() to convert the format of the matrix

cv::Mat::convertTo() can perform data-type conversion, bit-depth conversion, scaling, and so on; search for the details yourself.

5. Use the member function ptr() / at<T>() / the raw element address to get the value of a pixel in the matrix

Official docs: ptr() [1/20]
uchar* cv::Mat::ptr ( int i0 = 0 ) Returns a pointer to the specified matrix row.

  • So how do you get the value at a given position of the image in CHW format?
    For the i-th row, j-th column, c-th channel: the at operation is simple and convenient but inefficient; the ptr operation is recommended.
    at method : float pix = img.at<Vec3b>(i, j)[c];
    matrix element address : float pix = (int)(*(img.data + img.step[0] * i + img.step[1] * j + c));
    pointer ptr : float pix = img.ptr<uchar>(i)[j * channel + channel - 1 - c]; // channel - 1 because array indices start at 0
    iterator : difficult, not recommended for novices
  • Among them, cout << img.at<Vec3b>(i, j) << endl; prints the values of all three channels, e.g. [230 222 102], while cout << img.at<Vec3b>(i, j)[c] << endl; prints an unreadable uchar character, which needs a data-type conversion to show the numeric value.
  • The above access methods are referenced from: C++ version OpenCv tutorial (four) 4 methods of reading Mat class elements and [C++ Opencv] read and write grayscale images, a certain pixel of RGB image, modify pixel value, image inversion (source code + API); you can also find other methods.
  • For whether to use int, uchar, Vec3b or another data type, refer to the second table of opencv cv::Mat data type summary.
  • Note that OpenCV stores color channels in BGR order rather than RGB. To read a pixel out in RGB order, use float pix = img.ptr<uchar>(i)[j * 3 + 2 - c]; (a sketch follows this list).
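A small sketch contrasting the access methods, assuming an 8-bit, 3-channel BGR image such as the one read in the main example (row/column values are illustrative):

    cv::Mat img = cv::imread("images/real.jpg"); // 8-bit, 3-channel, BGR order
    int i = 10, j = 20; // row i, column j

    // at<Vec3b>: simple but slower
    cv::Vec3b bgr = img.at<cv::Vec3b>(i, j);
    int blue = bgr[0], green = bgr[1], red = bgr[2];

    // ptr: faster row-pointer access; 2 - c flips BGR to RGB
    for (int c = 0; c < 3; c++) {
        uchar value = img.ptr<uchar>(i)[j * 3 + 2 - c]; // c = 0 -> R, 1 -> G, 2 -> B
    }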

2.4 Running inference with ONNX Runtime (C++)

The steps of model inference:
OpenCV reads the image -> preprocess the image to a suitable size -> run inference -> retrieve the inference results

2.4.1 Processing before data input

The input data is best preprocessed with the Padded Resize method proposed by yolov5.

2.4.1.1 Padded Resize Method Introduction

  • Padded Resize: maintains the aspect ratio of the image and fills the remaining area with gray; the original aspect ratio is preserved by padding the border (usually in gray) while still satisfying the model's square input requirement.
  • Its Python source is the letterbox() method in yolov5's dataset.py. Of its return values, only the first, img (the image after padded resize), is needed here.
    The source code is as follows; once you know the principle you can reproduce it in C++ (a sketch follows the listing):
def letterbox(img, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = img.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better test mAP)
        r = min(r, 1.0)

    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return img, ratio, (dw, dh)
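A minimal C++ sketch of the same letterbox logic, under the simplifying assumptions of a fixed 640x640 target, no auto/stride or scaleFill handling, and the same gray value 114 (the resize_image() function in the full example plays the same role; assumes the OpenCV headers plus <algorithm> and <cmath>):

cv::Mat letterbox(const cv::Mat& img, int new_w = 640, int new_h = 640)
{
    // Scale ratio (new / old), keeping the aspect ratio
    float r = std::min((float)new_h / img.rows, (float)new_w / img.cols);
    int unpad_w = (int)std::round(img.cols * r);
    int unpad_h = (int)std::round(img.rows * r);
    float dw = (new_w - unpad_w) / 2.0f; // split padding onto both sides
    float dh = (new_h - unpad_h) / 2.0f;

    cv::Mat resized;
    cv::resize(img, resized, cv::Size(unpad_w, unpad_h), 0, 0, cv::INTER_LINEAR);

    int top = (int)std::round(dh - 0.1f), bottom = (int)std::round(dh + 0.1f);
    int left = (int)std::round(dw - 0.1f), right = (int)std::round(dw + 0.1f);
    cv::Mat out;
    cv::copyMakeBorder(resized, out, top, bottom, left, right,
                       cv::BORDER_CONSTANT, cv::Scalar(114, 114, 114));
    return out;
}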

2.4.1.2 Learning the cv::resize() function

Reference: official documentation for cv::resize()
First look at the declaration in imgproc.hpp:

CV_EXPORTS_W void resize( InputArray src, OutputArray dst,
                          Size dsize, double fx = 0, double fy = 0,
                          int interpolation = INTER_LINEAR );

Parameter introduction:

  • src: input image
  • dst: output image; it has the size dsize and the same type as src
  • dsize: output size; dsize = Size(round(fx*src.cols), round(fy*src.rows)). For the C++ round() function and the OpenCV Size class, see: C++: round function usage and Opencv's Size class - size class. Note that Size takes the width first and then the height.
  • interpolation: interpolation method, bilinear by default; see enum cv::InterpolationFlags.
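A quick usage sketch (target size values are illustrative):

    cv::Mat src = cv::imread("images/real.jpg");
    cv::Mat dst;
    cv::resize(src, dst, cv::Size(640, 480), 0, 0, cv::INTER_AREA); // width first, then height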

2.4.1.3 Learning the cv::copyMakeBorder() function

Reference: official documentation: copyMakeBorder(); the function forms a border around an image.

  • The function copies the source image into the middle of the destination image. The areas to the left, right, above, and below the copied source image are filled with extrapolated pixels. This is not what the filtering functions based on it do (they extrapolate pixels on the fly); it exists so that other, more complex functions (including your own) can simplify image boundary handling. The function also supports the mode where src is already in the middle of dst; in this case the function does not copy src itself but simply constructs the border.
  • When the source image is part of a larger image (ROI: region of interest), the function will try to use pixels outside the ROI to form the border. To disable this and always extrapolate as if src were not an ROI, use borderType | BORDER_ISOLATED.
CV_EXPORTS_W void copyMakeBorder(InputArray src, OutputArray dst,
                                 int top, int bottom, int left, int right,
                                 int borderType, const Scalar& value = Scalar() );

Parameter introduction:

  • src: input image
  • dst: output image; it has the same type as src and size Size(src.cols+left+right, src.rows+top+bottom)
  • top, bottom, left, right: how far to extend the border on each side of the original image
  • borderType: the type of border to extend; see borderInterpolate for details
  • value: the border value if borderType == BORDER_CONSTANT; otherwise the border depends on the original image, and value is ignored
  • In the official introduction of BorderTypes you can see the various border types; the image border is represented by |, the middle is the input image, and the two sides show how the extended border relates to the image content. For details, see OpenCV library members - BorderTypes.
  • Scalar() is used in OpenCV to express a color as Scalar(B,G,R). If you pass a single number X, it is equivalent to Scalar(X,0,0). Reference: Scalar() function in opencv, and the official typedef Scalar_ cv::Scalar. A usage sketch follows this list.
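A quick usage sketch (the padding widths are illustrative; the gray value matches the letterbox fill):

    cv::Mat padded;
    cv::copyMakeBorder(dstimg, padded, 0, 0, 20, 20,
                       cv::BORDER_CONSTANT, cv::Scalar(114, 114, 114)); // 20 px left/right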

2.4.1.4 Pixel normalization

Background knowledge: What are the data types CV_8U, CV_16U, CV_16S, CV_32F and CV_64F in Opencv? Mine is CV_8UC3 and needs converting. Note: the channel order of images read by OpenCV is BGR, while the normalized input should be in RGB order; and since my model's input type is float32, the data also needs to be converted to CV_32FC3.
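A minimal sketch of that conversion with OpenCV calls, as an alternative to the manual triple loop in the full example:

    cv::Mat rgb, floatImage;
    cv::cvtColor(dstimg, rgb, cv::COLOR_BGR2RGB);      // BGR -> RGB
    rgb.convertTo(floatImage, CV_32FC3, 1.0 / 255.0);  // CV_8UC3 -> CV_32FC3 in [0, 1]
    // floatImage is still HWC; it must additionally be rearranged to CHW
    // before being fed to the model (the triple loop above does both at once).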

3 ONNX Runtime (C++) inference on the ONNX-format model

Function used to construct the input data: the static member function CreateTensor() of Ort::Value
Main inference function used: the member function Run() of Ort::Session

3.1 Model input data specification

Carefully study the inputs of the function, construct your own data into the format it needs and into your model's input data type and layout, and then feed it in.

For my model, the input should be float32 with shape (1, 3, 640, 640).

  • Since OpenCV is used here, OpenCV's data-type definitions apply; they are located in its headers.
  • For the C/C++ data-type definitions you can look at the headers themselves; you will see that the same underlying type is given many aliases, e.g. typedef unsigned short wchar_t; in another header.

Looking at the stdint.h source code, you can see that besides the type aliases, many data-related macros are defined; they can be used to check value ranges and avoid problems such as overflow.

//
// stdint.h
//
//      Copyright (c) Microsoft Corporation. All rights reserved.
//
// The C Standard Library <stdint.h> header.
//
#pragma once
#define _STDINT

#include <vcruntime.h>

#if _VCRT_COMPILER_PREPROCESSOR

#pragma warning(push)
#pragma warning(disable: _VCRUNTIME_DISABLED_WARNINGS)

typedef signed char        int8_t;
typedef short              int16_t;
typedef int                int32_t;
typedef long long          int64_t;
typedef unsigned char      uint8_t;
typedef unsigned short     uint16_t;
typedef unsigned int       uint32_t;
typedef unsigned long long uint64_t;

typedef signed char        int_least8_t;
typedef short              int_least16_t;
typedef int                int_least32_t;
typedef long long          int_least64_t;
typedef unsigned char      uint_least8_t;
typedef unsigned short     uint_least16_t;
typedef unsigned int       uint_least32_t;
typedef unsigned long long uint_least64_t;

typedef signed char        int_fast8_t;
typedef int                int_fast16_t;
typedef int                int_fast32_t;
typedef long long          int_fast64_t;
typedef unsigned char      uint_fast8_t;
typedef unsigned int       uint_fast16_t;
typedef unsigned int       uint_fast32_t;
typedef unsigned long long uint_fast64_t;

typedef long long          intmax_t;
typedef unsigned long long uintmax_t;

// These macros must exactly match those in the Windows SDK's intsafe.h.
#define INT8_MIN         (-127i8 - 1)
#define INT16_MIN        (-32767i16 - 1)
#define INT32_MIN        (-2147483647i32 - 1)
#define INT64_MIN        (-9223372036854775807i64 - 1)
#define INT8_MAX         127i8
#define INT16_MAX        32767i16
#define INT32_MAX        2147483647i32
#define INT64_MAX        9223372036854775807i64
#define UINT8_MAX        0xffui8
#define UINT16_MAX       0xffffui16
#define UINT32_MAX       0xffffffffui32
#define UINT64_MAX       0xffffffffffffffffui64

#define INT_LEAST8_MIN   INT8_MIN
#define INT_LEAST16_MIN  INT16_MIN
#define INT_LEAST32_MIN  INT32_MIN
#define INT_LEAST64_MIN  INT64_MIN
#define INT_LEAST8_MAX   INT8_MAX
#define INT_LEAST16_MAX  INT16_MAX
#define INT_LEAST32_MAX  INT32_MAX
#define INT_LEAST64_MAX  INT64_MAX
#define UINT_LEAST8_MAX  UINT8_MAX
#define UINT_LEAST16_MAX UINT16_MAX
#define UINT_LEAST32_MAX UINT32_MAX
#define UINT_LEAST64_MAX UINT64_MAX

#define INT_FAST8_MIN    INT8_MIN
#define INT_FAST16_MIN   INT32_MIN
#define INT_FAST32_MIN   INT32_MIN
#define INT_FAST64_MIN   INT64_MIN
#define INT_FAST8_MAX    INT8_MAX
#define INT_FAST16_MAX   INT32_MAX
#define INT_FAST32_MAX   INT32_MAX
#define INT_FAST64_MAX   INT64_MAX
#define UINT_FAST8_MAX   UINT8_MAX
#define UINT_FAST16_MAX  UINT32_MAX
#define UINT_FAST32_MAX  UINT32_MAX
#define UINT_FAST64_MAX  UINT64_MAX

#ifdef _WIN64
    #define INTPTR_MIN   INT64_MIN
    #define INTPTR_MAX   INT64_MAX
    #define UINTPTR_MAX  UINT64_MAX
#else
    #define INTPTR_MIN   INT32_MIN
    #define INTPTR_MAX   INT32_MAX
    #define UINTPTR_MAX  UINT32_MAX
#endif

#define INTMAX_MIN       INT64_MIN
#define INTMAX_MAX       INT64_MAX
#define UINTMAX_MAX      UINT64_MAX

#define PTRDIFF_MIN      INTPTR_MIN
#define PTRDIFF_MAX      INTPTR_MAX

#ifndef SIZE_MAX
    // SIZE_MAX definition must match exactly with limits.h for modules support.
    #ifdef _WIN64
        #define SIZE_MAX 0xffffffffffffffffui64
    #else
        #define SIZE_MAX 0xffffffffui32
    #endif
#endif

#define SIG_ATOMIC_MIN   INT32_MIN
#define SIG_ATOMIC_MAX   INT32_MAX

#define WCHAR_MIN        0x0000
#define WCHAR_MAX        0xffff

#define WINT_MIN         0x0000
#define WINT_MAX         0xffff

#define INT8_C(x)    (x)
#define INT16_C(x)   (x)
#define INT32_C(x)   (x)
#define INT64_C(x)   (x ## LL)

#define UINT8_C(x)   (x)
#define UINT16_C(x)  (x)
#define UINT32_C(x)  (x ## U)
#define UINT64_C(x)  (x ## ULL)

#define INTMAX_C(x)  INT64_C(x)
#define UINTMAX_C(x) UINT64_C(x)

#pragma warning(pop) // _VCRUNTIME_DISABLED_WARNINGS

#endif // _VCRT_COMPILER_PREPROCESSOR

3.2 Model input data and inference with ONNX Runtime (C++)

Set up the input according to the chosen inference function, inline std::vector<Value> Session::Run(). Of its three overloaded forms I choose the first one: Run() [1/3]

std::vector< Value > Ort::Session::Run(const RunOptions& run_options,
                                       const char* const* input_names,
                                       const Value* input_values,
                                       size_t input_count,
                                       const char* const* output_names,
                                       size_t output_count)

The caller provides a list of inputs and a list of desired outputs to be returned.

  • Internally, this overload creates a std::vector<Ort::Value> output_values; as the output.
  • run_options: set as needed.
  • input_names: stores the input names. For example, with vector<const char*> input_names = { "images" }; you pass input_names.data() (since raw pointers are used); multi-input models can of course have several entries.
  • input_values: this is the key point, and there are pitfalls! Ort::Value is move-only, so rvalue references must be used; this parameter is described in detail later (see the sketch after this list).
  • input_count: the number of input names, which is also the number of model input nodes, e.g. input_names.size().
  • output_names: the names of the model outputs; see input_names.
  • output_count: the number of output names, which is also the number of model output nodes; see input_count.
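A minimal call sketch, reusing the names from the full example above (Ort::Value cannot be copied, hence std::move):

std::vector<Ort::Value> ort_inputs;
ort_inputs.push_back(std::move(input_tensor)); // Ort::Value is move-only
std::vector<Ort::Value> outputs = session.Run(
    Ort::RunOptions{ nullptr },
    input_node_names.data(), ort_inputs.data(), input_node_names.size(),
    output_node_names.data(), output_node_names.size());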

The call then enters the following function:

inline void Session::Run(const RunOptions& run_options, const char* const* input_names, const Value* input_values, size_t input_count,
                         const char* const* output_names, Value* output_values, size_t output_count) {
  static_assert(sizeof(Value) == sizeof(OrtValue*), "Value is really just an array of OrtValue* in memory, so we can reinterpret_cast safely");
  auto ort_input_values = reinterpret_cast<const OrtValue**>(const_cast<Value*>(input_values));
  auto ort_output_values = reinterpret_cast<OrtValue**>(output_values);
  ThrowOnError(GetApi().Run(p_, run_options, input_names, ort_input_values, input_count, output_names, output_count, ort_output_values));
}

From this we can see:

  1. For input_values, the operation auto ort_input_values = reinterpret_cast<const OrtValue**>(const_cast<Value*>(input_values)) first removes the const attribute of the input and then reinterprets it as const OrtValue**.
  2. The same is done for output_values, where reinterpret_cast<OrtValue**> yields ort_output_values.
  3. Finally the C API Run is called with these OrtValue** values, and the results land in output_values as wrapped Ort::Value objects.

4 Displaying the inference results of ONNX Runtime (C++)

(Figure: the detection result rendered by the code above.)

Tip: deploying the ONNX Runtime inference framework with Qt (MinGW):

The prebuilt ONNX Runtime is compiled with MSVC and cannot be used with the MinGW compiler. Typical errors:
error: unknown type name '_Frees_ptr_opt_'
error: '_Frees_ptr_opt_' has not been declared


Origin blog.csdn.net/qq_22487889/article/details/128133933