[NCNN] Hardware-accelerated inference sample code for ARM CPUs

Introduction to the NCNN Acceleration Framework

NCNN is a lightweight, high-performance deep learning inference framework open-sourced by Tencent and developed primarily by nihui of Tencent Youtu Lab. The framework enables efficient inference of deep learning models on mobile and embedded devices, with a low memory footprint and highly optimized compute performance.

Features

Lightweight: NCNN's core code is very streamlined and has no third-party dependencies, which makes it easy to port to a wide range of platforms.

High performance: NCNN achieves efficient model inference by optimizing the computation pipeline and making full use of the hardware platform's compute capability. It obtains good speedups on both multi-core CPUs and GPUs.

Multi-platform support: NCNN supports multiple operating systems and hardware platforms, including Android, iOS, and Linux. It also provides model conversion tools for Caffe and TensorFlow, making it easy to convert models from other frameworks into a format NCNN can load.
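As a rough sketch of how those converters are typically invoked (the tools are built from ncnn's `tools/` directory; exact binaries and flags depend on the ncnn version, and all model file names below are placeholders):

```shell
# Convert a Caffe model (prototxt + caffemodel) to ncnn's param/bin format:
./caffe2ncnn deploy.prototxt snapshot.caffemodel model.param model.bin

# Convert an ONNX model (e.g. exported from PyTorch) the same way:
./onnx2ncnn model.onnx model.param model.bin
```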

Low memory usage: NCNN adopts memory sharing and reuse strategies in its design, which effectively reduces memory consumption. This matters on mobile and embedded devices, where it lowers energy use and improves overall system performance.

Ease of use: NCNN provides a concise API for loading models and running inference, along with rich sample code and documentation to help users get started quickly.

NCNN makes full use of the instruction sets of various hardware platforms, such as SIMD instruction sets (ARM's NEON, x86's SSE, etc.) and the parallel computing capability of GPUs. With these instruction sets, NCNN can execute computation in parallel and improve efficiency.

Sample code

The sample below compares the similarity of every pair of images in a folder; the model used is ArcFace.


#ifndef Retinaface_RetinafacePostSelfPlug_H
#define Retinaface_RetinafacePostSelfPlug_H
#include <iostream>
#include <vector>
#include <cmath>
#include <algorithm>
#include <net.h>
#include <mat.h>
#include <opencv2/opencv.hpp>
#include <chrono>
// Compute the unit (L2-normalized) vector of a feature vector
std::vector<float> calculateUnitVector(const std::vector<float>& feature)
{
    std::vector<float> unitVector(feature.size());
    float norm = 0.0f;
    for (float value : feature)
    {
        norm += value * value;
    }
    norm = std::sqrt(norm);
    std::transform(feature.begin(), feature.end(), unitVector.begin(), [norm](float value) {
        return value / norm;
    });
    return unitVector;
}
std::vector<float> extractFeatureVector(const std::string& imagePath)
{
    std::string modelPath = "/home/kylin/ncnnbuild2/ncnnwork/model/iresnetface1s.param";
    std::string weightPath = "/home/kylin/ncnnbuild2/ncnnwork/model/iresnetbin1s.bin";

    // Read the image
    cv::Mat image = cv::imread(imagePath);
    if (image.empty())
    {
        std::cerr << "Failed to read image: " << imagePath << std::endl;
        return {};
    }

    // Load the model. Set every option on opt before assigning it to net.opt;
    // assigning opt afterwards would otherwise overwrite the fp16 settings.
    ncnn::Net net;
    ncnn::Option opt;
    opt.use_fp16_packed = false;
    opt.use_fp16_storage = false;
    opt.use_fp16_arithmetic = false;
    opt.num_threads = 4;
    opt.use_vulkan_compute = false;
    net.opt = opt;
    net.load_param(modelPath.c_str());
    net.load_model(weightPath.c_str());

    // Convert the image to an ncnn::Mat (this model takes BGR input)
    //ncnn::Mat input = ncnn::Mat::from_pixels(image.data, ncnn::Mat::PIXEL_BGR2RGB, image.cols, image.rows);
    ncnn::Mat input = ncnn::Mat::from_pixels(image.data, ncnn::Mat::PIXEL_BGR, image.cols, image.rows);

    // Subtract the mean and divide by the standard deviation (disabled here)
    // const float mean_vals[3] = { 127.5f, 127.5f, 127.5f };
    // const float norm_vals[3] = { 1.0 / 127.5, 1.0 / 127.5, 1.0 / 127.5 };
    // input.substract_mean_normalize(mean_vals, norm_vals);

    // Extra transform needed if the model comes from Keras with NHWC input:
    // ncnn::Mat oo;
    // ncnn::convert_packing(input, oo, 3);

    // Create an extractor and run inference
    ncnn::Extractor ex = net.create_extractor();
    ex.input("data", input);

    ncnn::Mat feature;
    ex.extract("fc1", feature);

    // Copy the feature vector out of the ncnn::Mat
    std::vector<float> featureVector((float*)(feature.data), (float*)(feature.data) + feature.w);

    // L2-normalize the feature vector
    std::vector<float> unitVector = calculateUnitVector(featureVector);

    return unitVector;
}
float dotProduct(const std::vector<float>& vec1, const std::vector<float>& vec2)
{
    if (vec1.size() != vec2.size())
    {
        std::cerr << "Vector sizes do not match." << std::endl;
        return 0.0f;
    }
    float result = 0.0f;
    for (size_t i = 0; i < vec1.size(); i++)
    {
        result += vec1[i] * vec2[i];
    }
    return result;
}
void fun(const std::string& path1, const std::string& path2) {
    auto v1 = extractFeatureVector(path1);
    auto v2 = extractFeatureVector(path2);
    std::cout << "Comparing " << path1 << " and " << path2 << ": " << dotProduct(v1, v2) << std::endl;
}
int main() {
    using namespace std;
    string folder = "/home/kylin/ncnnbuild2/ncnnwork/pic";
    vector<string> imagePaths;
    // Walk the folder and collect the paths of all images
    cv::glob(folder, imagePaths);
    // Compare every pair of image paths
    for (size_t i = 0; i < imagePaths.size(); ++i) {
        for (size_t j = i + 1; j < imagePaths.size(); ++j) {
            fun(imagePaths[i], imagePaths[j]);
        }
    }
    return 0;
}
#endif // Retinaface_RetinafacePostSelfPlug_H

Origin blog.csdn.net/hh1357102/article/details/131918201