1. Introduction to MNN
MNN is a lightweight deep neural network engine open sourced by Alibaba. It supports both inference and training of deep learning models, and runs on servers, personal computers, mobile phones, and embedded devices. At present, MNN is used in more than 30 apps such as Alibaba's mobile Taobao, mobile Tmall, and Youku, covering live streaming, short video, search recommendation, product image search, interactive marketing, rights distribution, security risk control, and other scenarios. GitHub address: https://github.com/alibaba/MNN
2. MNN compilation
The compilation process on the Linux platform is as follows.
1. Dependency installation
- cmake (3.10 or above)
- protobuf (3.0 or above)
  - This refers to both the protobuf library and the protobuf compiler. The version number is printed by protoc --version.
  - On some Linux distributions these two packages are released separately and need to be installed manually.
  - On Ubuntu, the two packages libprotobuf-dev and protobuf-compiler must be installed separately:
    - sudo apt install libprotobuf-dev
    - sudo apt install protobuf-compiler
  - On macOS, install with brew install protobuf.
- C++ compiler
  - Either GCC or Clang can be used (on macOS nothing needs to be installed separately, since Xcode comes with a compiler).
  - GCC: version 4.9 or above is recommended.
    - On some distributions gcc (the GNU C compiler) and g++ (the GNU C++ compiler) are installed separately.
    - On Ubuntu, for example, you need to install both gcc and g++.
  - Clang: version 3.9 or above is recommended.
- zlib
2. Download the MNN source code, unpack it, and run the following commands from the source root
mkdir -p build/install
cd build
cmake .. -DMNN_OPENCL=true -DMNN_SEP_BUILD=false -DMNN_BUILD_CONVERTER=true -DMNN_BUILD_TORCH=true -DMNN_BUILD_DEMO=true -DMNN_BUILD_BENCHMARK=true -DMNN_BUILD_TOOLS=true -DCMAKE_INSTALL_PREFIX=./install
make -j4
make install
The compiled header files and libraries are installed into the build/install directory.
3. Deploying the PINet model with MNN
First, the model needs to be converted to the format defined by MNN. The conversion path is PyTorch -> ONNX -> MNN.
PyTorch to ONNX
Download the PINet code and set up the PyTorch environment. In the code base, onnx_converter.py provides the functions that export the ONNX model, and onnx_inference.py provides a demo of inference with ONNX. Run onnx_converter.py to get the ONNX model.
ONNX to MNN
./MNNConvert -f ONNX --modelFile pinet_v2.onnx --MNNModel pinet_v2.mnn --bizCode biz
This results in the pinet_v2.mnn model.
MNN deployment
The deployment follows onnx_inference.py and the MNN API. Pay attention to the following points.
1) The format of the input image and whether it is normalized. According to the demo, the image is in BGR order and is normalized to [0,1].
2) Use Netron to inspect the network and find the names of the input and output nodes, then fetch the corresponding input and output tensors by name.
3) The data layout of the input and output tensors. According to the official MNN documentation, if the internal layout is not clear, it is recommended to explicitly convert the input and output to tensors of a specified layout before accessing the data.
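Points 1) and 3) can be illustrated with a minimal, library-free sketch. It assumes 8-bit interleaved pixels (height x width x channels, as OpenCV stores a BGR image) and shows the [0,1] normalization and the difference between the interleaved (NHWC) and planar (NCHW) layouts. The function names normalizeToUnitRange and hwcToChw are hypothetical helpers for illustration only. They are not part of MNN or OpenCV, and the full program below lets MNN do the layout conversion through copyFromHostTensor instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scale 8-bit pixel values to floats in [0, 1]. The element order is unchanged.
std::vector<float> normalizeToUnitRange(const std::vector<std::uint8_t>& pixels)
{
    std::vector<float> out(pixels.size());
    for (std::size_t i = 0; i < pixels.size(); ++i)
        out[i] = pixels[i] / 255.0f;
    return out;
}

// Reorder interleaved HWC data (B,G,R,B,G,R,...) into planar CHW data
// (all B values, then all G values, then all R values).
std::vector<float> hwcToChw(const std::vector<float>& hwc, int height, int width, int channels)
{
    assert(hwc.size() == static_cast<std::size_t>(height) * width * channels);
    std::vector<float> chw(hwc.size());
    for (int h = 0; h < height; ++h)
        for (int w = 0; w < width; ++w)
            for (int c = 0; c < channels; ++c)
                chw[(static_cast<std::size_t>(c) * height + h) * width + w] =
                    hwc[(static_cast<std::size_t>(h) * width + w) * channels + c];
    return chw;
}
```

The same planar indexing, c * H * W + h * W + w, is what the post-processing code relies on when it reads the network outputs after converting them to the CAFFE (NCHW) layout.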
#include <algorithm>
#include <cstdio>
#include <cstring>   // memcpy
#include <fstream>
#include <functional>
#include <iostream>
#include <memory>
#include <sstream>
#include <vector>

#include <opencv2/opencv.hpp>

#define MNN_OPEN_TIME_TRACE   // define before AutoTime.hpp to enable its timing macros
#include <MNN/AutoTime.hpp>
#include <MNN/ImageProcess.hpp>
#include <MNN/Interpreter.hpp>
#include <MNN/MNNDefine.h>
#include <MNN/expr/Expr.hpp>
#include <MNN/expr/ExprCreator.hpp>

using namespace MNN;
using namespace MNN::CV;
using namespace MNN::Express;

int main(int argc, char** argv)
{
    // Load the converted model.
    std::shared_ptr<Interpreter> net(Interpreter::createFromFile("pinet_v2.mnn"));
    if (!net)
    {
        MNN_ERROR("failed to load pinet_v2.mnn\n");
        return 1;
    }
    //net->setCacheFile(".tempcache");
    //net->setSessionMode(Interpreter::Session_Debug);
    //net->setSessionMode(Interpreter::Session_Resize_Defer);

    // Run on the CPU with one thread; OpenCL is configured as the backup backend.
    ScheduleConfig config;
    config.numThread = 1;
    config.type = MNN_FORWARD_CPU;
    config.backupType = MNN_FORWARD_OPENCL;
    BackendConfig backendConfig;
    backendConfig.precision = BackendConfig::Precision_Low;
    config.backendConfig = &backendConfig;
    auto session = net->createSession(config);

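    // Fetch the input tensor. The code assumes its shape is {N, C, H, W}.
    auto input = net->getSessionInput(session, nullptr);
    std::vector<int> shape = input->shape();
    // Host-side staging tensor in NHWC layout, which matches OpenCV's interleaved pixels.
    auto nhwc_tensor = new Tensor(input, MNN::Tensor::TENSORFLOW);

    // Read the image (BGR), resize it to the network input size and normalize it to [0, 1].
    cv::Mat img = cv::imread("3.jpg");
    if (img.empty())
    {
        MNN_ERROR("failed to read 3.jpg\n");
        delete nhwc_tensor;
        return 1;
    }
    cv::Mat resized_img;
    cv::resize(img, resized_img, cv::Size(shape[3], shape[2]));
    resized_img.convertTo(resized_img, CV_32FC3);
    resized_img = resized_img / 255.f;
    memcpy(nhwc_tensor->host<float>(), resized_img.data, nhwc_tensor->size());
    // MNN converts from NHWC to its internal layout during the copy.
    input->copyFromHostTensor(nhwc_tensor);
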
    // Run inference and report the elapsed time.
    MNN::Timer time;
    time.reset();
    net->runSession(session);
    MNN_PRINT("use time %f ms\n", time.durationInUs() / 1000.f);

    // Copy each output into a host tensor with an explicit NCHW (CAFFE) layout.
    // The node names below belong to this particular ONNX export; check yours in Netron.
    auto offset_output = net->getSessionOutput(session, "2830");
    auto nchw_offset_output = new Tensor(offset_output, Tensor::CAFFE);
    offset_output->copyToHostTensor(nchw_offset_output);
    auto feature_output = net->getSessionOutput(session, "2841");
    auto nchw_feature_output = new Tensor(feature_output, Tensor::CAFFE);
    feature_output->copyToHostTensor(nchw_feature_output);
    auto confidence_output = net->getSessionOutput(session, "input.1560");
    auto nchw_confidence_output = new Tensor(confidence_output, Tensor::CAFFE);
    confidence_output->copyToHostTensor(nchw_confidence_output);
    // From here on, shape describes the output grid: shape[2] rows, shape[3] columns.
    shape = confidence_output->shape();

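    // get lines
    std::vector<std::vector<cv::Point>> lines_predicted, lines_final;
    std::vector<std::vector<float>> line_features;
    // Scale from output-grid coordinates to image coordinates.
    // The cast forces floating-point division; integer division would truncate the factor.
    float width_scale_factor = static_cast<float>(img.cols) / shape[3];
    float height_scale_factor = static_cast<float>(img.rows) / shape[2];
    float* confidence_buf = nchw_confidence_output->host<float>();
    float* feature_buf = nchw_feature_output->host<float>();
    float* offset_buf = nchw_offset_output->host<float>();
    float point_threshold = 0.96f;     // minimum confidence for a grid cell to count as a lane point
    float instance_threshold = 0.08f;  // maximum squared embedding distance to join an existing line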
    const int plane = shape[2] * shape[3];  // number of elements in one channel plane
    for (int h = 0; h < shape[2]; h++)
    {
        for (int w = 0; w < shape[3]; w++)
        {
            int idx = h * shape[3] + w;
            float confidence = confidence_buf[idx];
            if (confidence < point_threshold)
                continue;
            // Channel 0 holds the x offset, channel 1 the y offset.
            float offset_x = offset_buf[idx];
            float offset_y = offset_buf[plane + idx];
            // 4-dimensional instance feature of this grid cell.
            std::vector<float> feature(4);
            for (int k = 0; k < 4; k++)
                feature[k] = feature_buf[plane * k + idx];
            cv::Point2f pt;
            pt.x = (offset_x + w) * width_scale_factor;
            pt.y = (offset_y + h) * height_scale_factor;
            if (pt.x > img.cols - 1 || pt.x < 0 || pt.y > img.rows - 1 || pt.y < 0)
                continue;

            // Find the existing line whose mean feature is closest (squared Euclidean distance).
            int min_feature_idx = -1;
            float min_feature_dis = 10000;
            for (size_t n = 0; n < line_features.size(); n++)
            {
                float dis = 0;
                for (int k = 0; k < 4; k++)
                    dis += (feature[k] - line_features[n][k]) * (feature[k] - line_features[n][k]);
                if (min_feature_dis > dis)
                {
                    min_feature_dis = dis;
                    min_feature_idx = static_cast<int>(n);
                }
            }
            if (min_feature_idx >= 0 && min_feature_dis < instance_threshold)
            {
                // Same line: update the running mean of its feature, then add the point.
                const float count = static_cast<float>(lines_predicted[min_feature_idx].size());
                for (int k = 0; k < 4; k++)
                    line_features[min_feature_idx][k] =
                        (line_features[min_feature_idx][k] * count + feature[k]) / (count + 1);
                lines_predicted[min_feature_idx].push_back(pt);
            }
            else
            {
                // No line is close enough (this also covers the very first point): start a new line.
                line_features.push_back(feature);
                std::vector<cv::Point> line;
                line.push_back(pt);
                lines_predicted.push_back(line);
            }
        }
    }
    // The host tensors are no longer needed.
    delete nchw_confidence_output;
    delete nchw_feature_output;
    delete nchw_offset_output;
    delete nhwc_tensor;

    // draw points
    cv::Mat draw_lines;
    img.copyTo(draw_lines);
    cv::RNG rng(cv::getTickCount());
    for (size_t n = 0; n < lines_predicted.size(); n++)
    {
        // Skip lines with fewer than 3 points.
        if (lines_predicted[n].size() < 3)
            continue;
        cv::Scalar color = cv::Scalar(rng.uniform(0, 255), rng.uniform(0, 255), rng.uniform(0, 255));
        for (size_t i = 0; i < lines_predicted[n].size(); i++)
            cv::circle(draw_lines, lines_predicted[n][i], 5, color, 3);
        lines_final.push_back(lines_predicted[n]);
    }
    for (size_t n = 0; n < lines_final.size(); n++)
    {
        // Robust line fit. param = (vx, vy, x0, y0): a unit direction vector and a point on the line.
        cv::Vec4f param;
        cv::fitLine(lines_final[n], param, cv::DIST_HUBER, 0, 0.01, 0.01);
        float vx, vy, x0, y0;
        vx = param[0];
        vy = param[1];
        x0 = param[2];
        y0 = param[3];
        // Extend the line 1000 pixels in both directions; cv::line clips it to the image.
        float x1 = x0 + 1000 * vx;
        float y1 = y0 + 1000 * vy;
        x0 = x0 - 1000 * vx;
        y0 = y0 - 1000 * vy;
        cv::line(draw_lines, cv::Point(cvRound(x0), cvRound(y0)), cv::Point(cvRound(x1), cvRound(y1)), cv::Scalar(0, 0, 255), 2);
    }
    cv::imwrite("result.jpg", draw_lines);
    return 0;
}
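The grouping step in the program above (assign each point to the line whose mean feature is nearest, or start a new line) can also be read in isolation. The following is a minimal sketch of that idea as a standalone function, assuming squared Euclidean distance and a running mean, as in the program. The Cluster struct and the assignToCluster function are hypothetical names introduced here for illustration; they are not part of PINet or MNN.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// One line instance: the running mean of its members' features and the member count.
struct Cluster {
    std::vector<float> meanFeature;
    std::size_t count = 0;
};

// Assign `feature` to the nearest cluster if the squared distance to that cluster's
// mean is below `threshold`; otherwise start a new cluster.
// Returns the index of the cluster the feature ended up in.
std::size_t assignToCluster(std::vector<Cluster>& clusters,
                            const std::vector<float>& feature,
                            float threshold)
{
    std::size_t best = clusters.size();
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t n = 0; n < clusters.size(); ++n) {
        assert(clusters[n].meanFeature.size() == feature.size());
        float dist = 0.0f;
        for (std::size_t k = 0; k < feature.size(); ++k) {
            const float d = feature[k] - clusters[n].meanFeature[k];
            dist += d * d;
        }
        if (dist < bestDist) {
            bestDist = dist;
            best = n;
        }
    }
    if (best < clusters.size() && bestDist < threshold) {
        // Update the running mean incrementally, then count the new member.
        Cluster& c = clusters[best];
        for (std::size_t k = 0; k < feature.size(); ++k)
            c.meanFeature[k] = (c.meanFeature[k] * c.count + feature[k]) / (c.count + 1);
        ++c.count;
        return best;
    }
    Cluster fresh;
    fresh.meanFeature = feature;
    fresh.count = 1;
    clusters.push_back(fresh);
    return clusters.size() - 1;
}
```

Because points are processed in scan order and the means drift as members are added, the result of this greedy scheme depends on the order of the points and on the threshold (0.08 in the program above).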
Inference result
The inference result is shown in the figure below.