[Matting] MODNet: Real-time portrait matting model - NCNN C++ quantized deployment

Related Links:

[Matting] MODNet: Real-time portrait matting model - onnx python deployment

[Matting] MODNet: Real-time portrait matting model - notes

[Matting] MODNet: Real-time portrait matting model - onnx C++ deployment

MODNet is a lightweight matting model. A previous article deployed MODNet's onnx model with Python; in this one, MODNet is deployed with NCNN, and the model is additionally quantized statically, shrinking it to about 1/4 of its original size. The matting effect is as follows:

The full code and required weights are linked at the end of the article.


1. NCNN compilation

For specific steps, please refer to: Official Compilation Tutorial

1. Compile protobuf

Download protobuf: https://github.com/google/protobuf/archive/v3.4.0.zip

Open the x64 Native Tools Command Prompt for VS 2017 command-line tool from the Start menu (newer versions also work; I succeeded with 2022) and compile protobuf:

cd <protobuf-root-dir>
mkdir build
cd build
cmake -G"NMake Makefiles" -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=%cd%/install -Dprotobuf_BUILD_TESTS=OFF -Dprotobuf_MSVC_STATIC_RUNTIME=OFF ../cmake
nmake
nmake install

2. Compile NCNN

Clone the NCNN repository:

git clone https://github.com/Tencent/ncnn.git

Compile NCNN (I don't use Vulkan here; refer to the official tutorial if you need it), replacing the paths in the command with your own:

cd <ncnn-root-dir>
mkdir -p build
cd build
cmake -G"NMake Makefiles" -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=%cd%/install -DProtobuf_INCLUDE_DIR=<protobuf-root-dir>/build/install/include -DProtobuf_LIBRARIES=<protobuf-root-dir>/build/install/lib/libprotobuf.lib -DProtobuf_PROTOC_EXECUTABLE=<protobuf-root-dir>/build/install/bin/protoc.exe -DNCNN_VULKAN=OFF.. -DOpenCV_DIR=C:/opencv/opencv/build
nmake
nmake install
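
After nmake install, the NCNN headers and libraries live under <ncnn-root-dir>/build/install. For the deployment project in the next section you can either add the include/lib paths to Visual Studio by hand, or drive the build with a minimal CMakeLists.txt like the sketch below (the two *_DIR paths are assumptions for my machine, substitute your own):

cmake_minimum_required(VERSION 3.10)
project(modnet_demo)

set(CMAKE_CXX_STANDARD 11)

# Assumed install locations -- replace with your own paths.
set(ncnn_DIR "C:/ncnn/build/install/lib/cmake/ncnn")
set(OpenCV_DIR "C:/opencv/opencv/build")

find_package(ncnn REQUIRED)
find_package(OpenCV REQUIRED)

add_executable(modnet_demo main.cpp MODNet.cpp)
target_link_libraries(modnet_demo ncnn ${OpenCV_LIBS})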

2. NCNN deployment

1. Environment

Windows (CPU only)

OpenCV 4.5.5

Visual Studio 2019

2. onnx to ncnn model conversion

First convert the simplified onnx model obtained earlier to the ncnn format (this produces two files, with .param and .bin suffixes). Make sure the conversion completes without reporting errors, otherwise the subsequent loading will fail:

The converted model can be downloaded here: Portal

../ncnn/build/tools/onnx/onnx2ncnn simple_modnet.onnx simple_modnet.param simple_modnet.bin 
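
Before writing any inference code, it is worth checking that the two files actually load; both ncnn loaders return 0 on success. A minimal sketch (file names as above):

#include <iostream>
#include "net.h"

int main() {
	ncnn::Net net;
	// load_param and load_model return 0 on success, non-zero on failure
	if (net.load_param("simple_modnet.param") != 0 ||
		net.load_model("simple_modnet.bin") != 0) {
		std::cout << "Failed to load the converted model!" << std::endl;
		return -1;
	}
	std::cout << "Model loaded OK." << std::endl;
	return 0;
}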

3. C++ code

Code structure: the project consists of MODNet.h, MODNet.cpp, and main.cpp.

MODNet.h code:

#pragma once
#include <string>
#include "net.h"
#include <opencv2/opencv.hpp>
#include "time.h"


class MODNet
{

private:
	std::string param_path;
	std::string bin_path;
	std::vector<int> input_shape;
	ncnn::Net net;

	// MODNet preprocessing is (x - 127.5) / 127.5, mapping pixels to [-1, 1]
	const float mean_vals[3] = { 127.5f, 127.5f, 127.5f };
	const float norm_vals[3] = { 1 / 127.5f, 1 / 127.5f, 1 / 127.5f };

	cv::Mat normalize(cv::Mat& image);
public:
	MODNet() = delete;
	MODNet(const std::string param_path, const std::string bin_path, std::vector<int> input_shape);
	~MODNet();

	cv::Mat predict_image(cv::Mat& image);
	void predict_image(const std::string& src_image_path, const std::string& dst_path);

	void predict_camera();
};

MODNet.cpp code:

#include "MODNet.h"


MODNet::MODNet(const std::string param_path, const std::string bin_path, std::vector<int> input_shape)
	:param_path(param_path), bin_path(bin_path), input_shape(input_shape) {
	net.load_param(param_path.c_str());
	net.load_model(bin_path.c_str());
}


MODNet::~MODNet() {
	net.clear();
}


cv::Mat MODNet::normalize(cv::Mat& image) {
	// Unused helper kept for reference: predict_image uses ncnn's
	// substract_mean_normalize instead. Maps a BGR uchar image to an
	// RGB float image in [-1, 1].
	cv::Mat float_image;
	image.convertTo(float_image, CV_32FC3);  // arithmetic on CV_8U would saturate

	std::vector<cv::Mat> channels, normalized_image;
	cv::split(float_image, channels);

	cv::Mat b{ channels.at(0) }, g{ channels.at(1) }, r{ channels.at(2) };
	b = (b / 255. - 0.5) / 0.5;
	g = (g / 255. - 0.5) / 0.5;
	r = (r / 255. - 0.5) / 0.5;

	// RGB channel order to match the network input
	normalized_image.push_back(r);
	normalized_image.push_back(g);
	normalized_image.push_back(b);

	cv::Mat out;
	cv::merge(normalized_image, out);
	return out;
}


cv::Mat MODNet::predict_image(cv::Mat& image) {
	// BGR -> RGB, then resize to the network input size while packing into an ncnn::Mat
	cv::Mat rgbImage;
	cv::cvtColor(image, rgbImage, cv::COLOR_BGR2RGB);
	ncnn::Mat in = ncnn::Mat::from_pixels_resize(rgbImage.data, ncnn::Mat::PIXEL_RGB, image.cols, image.rows, input_shape[3], input_shape[2]);
	in.substract_mean_normalize(mean_vals, norm_vals);

	ncnn::Extractor ex = net.create_extractor();
	ex.set_num_threads(4);
	ex.input("input", in);
	ncnn::Mat out;
	ex.extract("output", out);

	// threshold the predicted alpha into a binary foreground mask
	cv::Mat mask(out.h, out.w, CV_8UC1);
	const float* probMap = out.channel(0);

	for (int i{ 0 }; i < out.h; ++i) {
		for (int j{ 0 }; j < out.w; ++j) {
			mask.at<uchar>(i, j) = probMap[i * out.w + j] > 0.5 ? 255 : 0;
		}
	}
	cv::resize(mask, mask, cv::Size(image.cols, image.rows), 0, 0);

	// keep only the foreground pixels of the original image
	cv::Mat segFrame;
	cv::bitwise_and(image, image, segFrame, mask);
	return segFrame;
}


void MODNet::predict_image(const std::string& src_image_path, const std::string& dst_path) {
	cv::Mat image = cv::imread(src_image_path);
	cv::Mat segFrame = predict_image(image);
	cv::imwrite(dst_path, segFrame);
}


void MODNet::predict_camera() {
	cv::Mat frame;
	cv::VideoCapture cap;
	int deviceID{ 0 };
	int apiID{ cv::CAP_ANY };
	cap.open(deviceID, apiID);
	if (!cap.isOpened()) {
		std::cout << "Error, cannot open camera!" << std::endl;
		return;
	}
	//--- GRAB AND WRITE LOOP
	std::cout << "Start grabbing" << std::endl << "Press any key to terminate" << std::endl;
	int count{ 0 };
	clock_t start{ clock() }, end{ 0 };
	double fps{ 0 };
	for (;;)
	{
		// wait for a new frame from camera and store it into 'frame'
		cap.read(frame);
		// check if we succeeded
		if (frame.empty()) {
			std::cout << "ERROR! blank frame grabbed" << std::endl;
			break;
		}
		cv::Mat segFrame = predict_image(frame);

		// fps
		++count;
		end = clock();
		fps = count / (float(end - start) / CLOCKS_PER_SEC);
		if (count >= 50) {
			count = 0;  // reset to prevent the counter from overflowing
			start = clock();
		}
		std::cout << "FPS: " << fps << "  Seg Image Number: " << count << "   time consume:" << (float(end - start) / CLOCKS_PER_SEC) << std::endl;
		// parameters for drawing the FPS text
		std::string text{ std::to_string(fps) };
		int font_face = cv::FONT_HERSHEY_COMPLEX;
		double font_scale = 1;
		int thickness = 2;
		int baseline;
		cv::Size text_size = cv::getTextSize(text, font_face, font_scale, thickness, &baseline);

		// position the text at the top-left corner
		cv::Point origin;
		origin.x = 20;
		origin.y = 20;
		cv::putText(segFrame, text, origin, font_face, font_scale, cv::Scalar(0, 255, 255), thickness, 8, 0);

		// show live and wait for a key with timeout long enough to show images
		imshow("Live", segFrame);
		if (cv::waitKey(5) >= 0)
			break;

	}
	cap.release();
	cv::destroyWindow("Live");
	return;
}

main.cpp code:

#include <opencv2/opencv.hpp>
#include <iostream>
#include "MODNet.h"
#include <vector>
#include "net.h"
#include "time.h"


int main() {
	std::string param_path{ "onnx_model\\simple_modnet.param" };
	std::string bin_path{ "onnx_model\\simple_modnet.bin" };
	std::vector<int> input_shape{ 1, 3, 512, 512 };
	MODNet model(param_path, bin_path, input_shape);


	// predict and display
	cv::Mat image = cv::imread("C:\\Users\\langdu\\Pictures\\test.png");
	cv::Mat segFrame = model.predict_image(image);
	cv::imshow("1", segFrame);
	cv::waitKey(0);

	// camera demo
	//model.predict_camera();
	return 0;
}

3. NCNN quantization

On mobile devices, model size is a hard constraint, so an effective way to reduce storage is needed. Quantization is such a method. For background on quantization, please refer to: [Deep Learning] Model Quantization - Notes/Experiments

Static quantization is used here. Model size comparison before and after quantization:

                      bin (KB)   param (KB)
Before quantization   25236      22
After quantization    6442       24

As the table shows, the quantized model is only about 1/4 of the original size (6442 / 25236 ≈ 0.26), at the cost of some prediction accuracy. The steps below follow the official tutorial.
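
Before the individual steps, a quick look at what the tools do: each fp32 weight tensor is mapped to 8-bit integers with one scale per layer (hence the roughly 4x smaller bin file), and the calibration images are used to estimate scales for the activations (method=kl below picks them by minimizing KL divergence). A toy sketch of the principle, not ncnn's actual implementation:

#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iostream>
#include <vector>

// Symmetric int8 quantization: choose the scale so the largest observed
// magnitude maps to 127, then round every value to the nearest step.
int8_t quantize(float x, float scale) {
	int q = static_cast<int>(std::round(x / scale));
	return static_cast<int8_t>(std::max(-127, std::min(127, q)));
}

int main() {
	std::vector<float> weights{ -0.8f, 0.05f, 0.4f, 1.2f };

	float abs_max = 0.f;
	for (float w : weights) abs_max = std::max(abs_max, std::fabs(w));
	float scale = abs_max / 127.f;  // one fp32 scale per layer

	for (float w : weights) {
		int8_t q = quantize(w, scale);
		std::cout << w << " -> " << static_cast<int>(q)
			<< "  (dequantized: " << q * scale << ")" << std::endl;
	}
	return 0;
}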

1. Model optimization

Use ncnnoptimize (in the build\tools directory) to optimize the model; the final argument 0 keeps the weights stored as fp32:

./ncnnoptimize simple_modnet.param simple_modnet.bin quantize_modnet.param quantize_modnet.bin 0

2. Create a calibration table

When ncnn creates the calibration table, the mean and norm parameters can be modified here. Also note that the pixel setting matches the MODNet official repo, namely BGR. The images used for calibration are stored in the images folder. Note that find is a Unix command; on Windows you can generate imagelist.txt with dir /b /s images > imagelist.txt, or run the commands below in Git Bash.

find images/ -type f > imagelist.txt
./ncnn2table quantize_modnet.param quantize_modnet.bin imagelist.txt modnet.table mean=[127.5,127.5,127.5] norm=[0.00784,0.00784,0.00784] shape=[512,512,3] pixel=BGR thread=8 method=kl

3. Quantization

Use the following command to get the int8 model.

./ncnn2int8 quantize_modnet.param quantize_modnet.bin modnet_int8.param modnet_int8.bin modnet.table

4. Use

Using the int8 model is very simple: just replace the bin and param paths in the code above with the paths of the generated int8 model.
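
For example, in main.cpp (the onnx_model directory is just my assumption about where the int8 files were saved):

std::string param_path{ "onnx_model\\modnet_int8.param" };
std::string bin_path{ "onnx_model\\modnet_int8.bin" };
std::vector<int> input_shape{ 1, 3, 512, 512 };
MODNet model(param_path, bin_path, input_shape);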

5. Prediction results after quantization

First, the prediction result of the unquantized model:

The result after quantization (some accuracy is lost: the shoes are no longer fully predicted and more of the floor is wrongly included):

4. Related Links

1. Full NCNN deployment code + all model weights


Origin: blog.csdn.net/qq_40035462/article/details/123902107