Writing a YOLOv8-Seg instance segmentation inference program with the OpenVINO C++ API

Author: Intel Innovation Ambassador Zhan Pengzhou

1.1  Introduction

This article shows how to use the OpenVINO™ 2023.0 C++ API to develop an AI inference program for the YOLOv8-Seg instance segmentation model. The C++ sample program in this article was developed on Windows with Visual Studio Community 2022. Readers should first set up a Visual Studio-based OpenVINO C++ development environment.

Please clone the code repository for this article: git clone https://gitee.com/ppov-nuc/yolov8_openvino_cpp.git

1.2  Export the YOLOv8-Seg OpenVINO IR model

YOLOv8 is a set of SOTA models and tools released by Ultralytics, built on the YOLO framework, for object detection and tracking, instance segmentation, image classification, and pose estimation.

First, install ultralytics and openvino-dev with the command: pip install -r requirements.txt

Then use the command: yolo export model=yolov8n-seg.pt format=openvino half=True to export the OpenVINO IR model with FP16 precision, as shown in the figure below.

Then use the command: benchmark_app -m yolov8n-seg.xml -d GPU.1 to measure the asynchronous inference performance of the yolov8n-seg.xml model on the Intel Arc A770M discrete GPU, as shown in the figure below.

1.3  Writing the YOLOv8-Seg instance segmentation inference program with the OpenVINO C++ API

Writing the YOLOv8-Seg instance segmentation inference program with the OpenVINO C++ API typically takes five steps:

  1. Capture and decode the image
  2. Preprocess the image data
  3. Run AI inference (based on the OpenVINO C++ API)
  4. Post-process the inference results
  5. Visualize the processed results

The image data preprocessing and AI inference code of the YOLOv8-Seg instance segmentation program is almost identical to that of the YOLOv8 object detection program and can be reused directly.

1.3.1  Image data preprocessing

Open yolov8n-seg.onnx with Netron, as shown in the figure below. You can see:

  1. Input node: "images"; data: float32[1,3,640,640].
  2. Output node 1: "output0"; data: float32[1,116,8400]. Of the 116 fields, the first 84 are identical to the output of the YOLOv8 object detection model: cx, cy, w, h, and the scores of the 80 classes. The last 32 fields are the mask coefficients, which are used to compute the mask data.
  3. Output node 2: "output1"; data: float32[1,32,160,160]. Multiplying the last 32 fields of output0 (a 1x32 row) by the data of output1 (reshaped to 32x25600) gives the mask data of the corresponding object.
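As a minimal sketch of the matrix multiplication described in item 3, the function below multiplies one detection's 32 mask coefficients by the prototype data and turns the result into a binary mask. The helper name decode_mask, the sigmoid activation, and the 0.5 threshold are assumptions for illustration (they follow common YOLOv8-Seg post-processing practice and are not taken from this article's code). Only the C++ standard library is used so that the arithmetic is easy to follow.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the article's repository.
// coeffs: mask coefficients of one detection (fields 84..115 of output0)
// proto:  output1 flattened row-major as [coeffs.size(), mask_pixels]
// Returns mask_pixels values: 1 = object, 0 = background.
std::vector<unsigned char> decode_mask(const std::vector<float>& coeffs,
                                       const std::vector<float>& proto,
                                       std::size_t mask_pixels,
                                       float threshold = 0.5f)
{
    std::vector<unsigned char> mask(mask_pixels, 0);
    for (std::size_t p = 0; p < mask_pixels; ++p) {
        // Dot product of the coefficient row with column p of the prototypes
        float logit = 0.0f;
        for (std::size_t k = 0; k < coeffs.size(); ++k)
            logit += coeffs[k] * proto[k * mask_pixels + p];
        // Assumption: sigmoid followed by a fixed threshold
        float prob = 1.0f / (1.0f + std::exp(-logit));
        mask[p] = prob > threshold ? 1 : 0;
    }
    return mask;
}
```

For the model above, mask_pixels would be 160*160 = 25600. The resulting 160x160 mask still has to be cropped to the bounding box and resized to the original image, which this sketch does not show.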

The goal of image data preprocessing is to convert an image of any size into a tensor with shape [1,3,640,640] and FP32 precision. The input of the YOLOv8-Seg model is square. To avoid the distortion caused by scaling an image of arbitrary size directly into a square, the letterbox algorithm first pads the image to a square so that its aspect ratio is preserved, as shown in the figure below, and the cv::dnn::blobFromImage function then scales the padded image to 640x640.
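To illustrate how letterboxing affects coordinates, here is a small sketch that assumes the image is padded only on the right and bottom (so the original stays in the top-left corner) and that the model input is 640x640. Under those assumptions a single scale factor maps model coordinates back to image pixels. Box, letterbox_scale, and to_image_box are hypothetical names used only for this example.

```cpp
#include <algorithm>

struct Box { int left, top, width, height; };

// Ratio between the padded square (side = longer image edge) and the model input.
float letterbox_scale(int cols, int rows, int input_size = 640)
{
    return static_cast<float>(std::max(cols, rows)) / input_size;
}

// Convert a (cx, cy, w, h) box in model-input pixels to a top-left based
// box in original image pixels. No offset is needed because the padding
// is assumed to lie to the right of and below the image.
Box to_image_box(float cx, float cy, float w, float h, float scale)
{
    return Box{ static_cast<int>((cx - 0.5f * w) * scale),
                static_cast<int>((cy - 0.5f * h) * scale),
                static_cast<int>(w * scale),
                static_cast<int>(h * scale) };
}
```

For an 810x1080 image, for example, the padded square is 1080x1080 and the scale is 1080 / 640 = 1.6875, so a 64x128 box centered at (320, 320) in model coordinates maps to a 108x216 box whose top-left corner is (486, 432) in the image.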

A sample program for image data preprocessing is shown below:

    // Pad the image to a square whose side is the longer edge; the original
    // image stays in the top-left corner and the padding is black.
    Mat letterbox(const Mat& source)
    {
        int col = source.cols;
        int row = source.rows;
        int _max = MAX(col, row);
        Mat result = Mat::zeros(_max, _max, CV_8UC3);
        source.copyTo(result(Rect(0, 0, col, row)));
        return result;
    }

    Mat img = cv::imread("bus.jpg");
    Mat letterbox_img = letterbox(img);
    // Scale pixel values to [0,1], resize to 640x640, and swap BGR to RGB
    Mat blob = blobFromImage(letterbox_img, 1.0/255.0, Size(640,640), Scalar(), true);

1.3.2  Synchronous AI inference

Synchronous inference with the OpenVINO C++ API takes seven main steps:

  1. Instantiate the Core object: ov::Core core;
  2. Compile and load the model: core.compile_model();
  3. Create an inference request: infer_request = compiled_model.create_infer_request();
  4. Read the image data and complete preprocessing;
  5. Pass the input data into the model: infer_request.set_input_tensor(input_tensor);
  6. Start inference: infer_request.infer();
  7. Get the inference results: output0 = infer_request.get_output_tensor(0); output1 = infer_request.get_output_tensor(1);

The sample code is as follows:

    // -------- Step 1. Initialize OpenVINO Runtime Core --------
    ov::Core core;

    // -------- Step 2. Compile the Model --------
    auto compiled_model = core.compile_model("yolov8n-seg.xml", "CPU");

    // -------- Step 3. Create an Inference Request --------
    ov::InferRequest infer_request = compiled_model.create_infer_request();

    // -------- Step 4. Read a picture file and do the preprocess --------
    Mat img = cv::imread("bus.jpg");
    // Preprocess the image
    Mat letterbox_img = letterbox(img);
    // Ratio between the padded square and the 640x640 model input
    float scale = letterbox_img.size[0] / 640.0;
    Mat blob = blobFromImage(letterbox_img, 1.0 / 255.0, Size(640, 640), Scalar(), true);

    // -------- Step 5. Feed the blob into the input node of the Model -------
    // Get input port for model with one input
    auto input_port = compiled_model.input();
    // Create tensor from external memory (blob must stay alive during inference)
    ov::Tensor input_tensor(input_port.get_element_type(), input_port.get_shape(), blob.ptr(0));
    // Set input tensor for model with one input
    infer_request.set_input_tensor(input_tensor);

    // -------- Step 6. Start inference --------
    infer_request.infer();

    // -------- Step 7. Get the inference result --------
    auto output0 = infer_request.get_output_tensor(0); // output0
    auto output1 = infer_request.get_output_tensor(1); // output1
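The flat buffer behind output0 is stored field by field: with shape [1,116,8400], field f of candidate i sits at offset f*8400 + i. As a hedged sketch of that layout (best_class and ClassScore are hypothetical names, not part of the article's code), the best class of one candidate can be read from the buffer directly, without the transpose used in the post-processing code:

```cpp
#include <cstddef>

struct ClassScore { int class_id; float score; };

// data points to a tensor flattened as [num_fields, num_candidates], where
// fields 0..3 are cx, cy, w, h and fields 4..4+num_classes-1 are class scores.
ClassScore best_class(const float* data, std::size_t candidate,
                      std::size_t num_candidates = 8400,
                      std::size_t num_classes = 80)
{
    ClassScore best{ -1, 0.0f };
    for (std::size_t c = 0; c < num_classes; ++c) {
        // Class scores start at field 4; consecutive fields are num_candidates apart
        float s = data[(4 + c) * num_candidates + candidate];
        if (best.class_id < 0 || s > best.score)
            best = ClassScore{ static_cast<int>(c), s };
    }
    return best;
}
```

Reading with a stride like this avoids copying the tensor, while transposing first, as the article's code does, makes each candidate a contiguous row that is convenient to slice with OpenCV.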

1.3.3  Post-processing the inference results

Post-processing in the instance segmentation inference program extracts the predicted class (class_id), class score (class_score), bounding box (box), and mask coefficients (mask_conf) of each object from the inference results. The sample code is as follows:

    // -------- Step 8. Postprocess the result --------
    auto output0_shape = output0.get_shape(); // [1,116,8400]
    Mat output_buffer((int)output0_shape[1], (int)output0_shape[2], CV_32F, output0.data<float>()); // [116,8400]
    Mat proto(32, 25600, CV_32F, output1.data<float>()); // [32,25600], 25600 = 160*160
    transpose(output_buffer, output_buffer); // [8400,116]
    float score_threshold = 0.25;
    float nms_threshold = 0.5;
    std::vector<int> class_ids;
    std::vector<float> class_scores;
    std::vector<Rect> boxes;
    std::vector<Mat> mask_confs;

    // Figure out the bbox, class_id and class_score
    for (int i = 0; i < output_buffer.rows; i++) {
        // Columns 4..83 hold the scores of the 80 classes
        Mat classes_scores = output_buffer.row(i).colRange(4, 84);
        Point class_id;
        double maxClassScore;
        minMaxLoc(classes_scores, 0, &maxClassScore, 0, &class_id);

        if (maxClassScore > score_threshold) {
            class_scores.push_back(maxClassScore);
            class_ids.push_back(class_id.x);
            float cx = output_buffer.at<float>(i, 0);
            float cy = output_buffer.at<float>(i, 1);
            float w = output_buffer.at<float>(i, 2);
            float h = output_buffer.at<float>(i, 3);
            // Convert center/size to top-left/size in original image pixels
            int left = int((cx - 0.5 * w) * scale);
            int top = int((cy - 0.5 * h) * scale);
            int width = int(w * scale);
            int height = int(h * scale);
            // Columns 84..115 hold the 32 mask coefficients
            cv::Mat mask_conf = output_buffer.row(i).colRange(84, 116);
            mask_confs.push_back(mask_conf);
            boxes.push_back(Rect(left, top, width, height));
        }
    }

    // NMS: indices receives the positions of the boxes that survive
    std::vector<int> indices;
    NMSBoxes(boxes, class_scores, score_threshold, nms_threshold, indices);
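The NMS step keeps the highest-scoring box among heavily overlapping candidates. To make that idea concrete, here is a simplified greedy NMS sketch on plain integer boxes. Box, iou, and greedy_nms are hypothetical names, and OpenCV's NMSBoxes may differ in details such as tie handling and optional parameters, so treat this as an illustration of the technique, not a description of the library's implementation.

```cpp
#include <algorithm>
#include <vector>

struct Box { int left, top, width, height; };

// Intersection over union of two axis-aligned boxes.
float iou(const Box& a, const Box& b)
{
    int x1 = std::max(a.left, b.left);
    int y1 = std::max(a.top, b.top);
    int x2 = std::min(a.left + a.width, b.left + b.width);
    int y2 = std::min(a.top + a.height, b.top + b.height);
    int inter = std::max(0, x2 - x1) * std::max(0, y2 - y1);
    int uni = a.width * a.height + b.width * b.height - inter;
    return uni > 0 ? static_cast<float>(inter) / uni : 0.0f;
}

// Returns the indices of the boxes that survive, highest score first.
std::vector<int> greedy_nms(const std::vector<Box>& boxes,
                            const std::vector<float>& scores,
                            float score_threshold, float nms_threshold)
{
    // Keep only candidates above the score threshold, sorted by score
    std::vector<int> order;
    for (int i = 0; i < static_cast<int>(boxes.size()); ++i)
        if (scores[i] > score_threshold) order.push_back(i);
    std::sort(order.begin(), order.end(),
              [&scores](int a, int b) { return scores[a] > scores[b]; });

    std::vector<int> keep;
    for (int idx : order) {
        bool suppressed = false;
        // Drop the box if it overlaps too much with an already kept box
        for (int k : keep) {
            if (iou(boxes[idx], boxes[k]) > nms_threshold) { suppressed = true; break; }
        }
        if (!suppressed) keep.push_back(idx);
    }
    return keep;
}
```

The returned indices play the same role as indices in the code above: they select the surviving entries of boxes, class_ids, class_scores, and mask_confs.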

For the complete example, see yolov8_seg_ov_infer.cpp. The running result is shown in the figure below:

1.4  Conclusion

The OpenVINO C++ API is simple, clear, and easy to learn and use. This article uses fewer than 100 lines of C++ code (excluding visualization of the detection results) to implement an OpenVINO-based inference program for the YOLOv8-Seg instance segmentation model, and achieves good inference performance on the Intel Arc A770M discrete GPU.


Origin blog.csdn.net/gc5r8w07u/article/details/131303130