NCNN Overview
ncnn is a high-performance neural-network inference framework optimized for mobile platforms. From its initial design, ncnn has focused on deployment and use on mobile devices: it has no third-party dependencies, it is cross-platform, and on mobile CPUs it is faster than all currently known open-source frameworks. With ncnn, developers can easily port deep-learning algorithms to run efficiently on phones and build AI apps, bringing AI to your fingertips. ncnn is currently used in many Tencent applications, such as QQ, Qzone, WeChat, and Pitu.
I will not go into installation, compilation, and usage here; the official website has very detailed documentation.
| | Windows | Linux | MacOS | Android | iOS |
|---|---|---|---|---|---|
| intel-cpu | ✔️ | ✔️ | ✔️ | ❔ | / |
| intel-gpu | ✔️ | ✔️ | ❔ | ❔ | / |
| amd-cpu | ✔️ | ✔️ | ✔️ | ❔ | / |
| amd-gpu | ✔️ | ✔️ | ❔ | ❔ | / |
| nvidia-gpu | ✔️ | ✔️ | ❔ | ❔ | / |
| qcom-cpu | ❔ | ✔️ | / | ✅ | / |
| qcom-gpu | ❔ | ✔️ | / | ✔️ | / |
| arm-cpu | ❔ | ❔ | / | ✅ | / |
| arm-gpu | ❔ | ❔ | / | ✔️ | / |
| apple-cpu | / | / | / | / | ✅ |
| apple-gpu | / | / | / | / | ✔️ |
NCNN Notes
In fact, ncnn is already a complete library; very few people need to go in and change the source code, though of course your project may require it.
The main problems come from mismatched handling of inputs and outputs. Below are the issues I ran into while using it.
- Network issue 1
When using a Caffe model, the input section must be written in the canonical layer format:

```
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 1 dim: 256 dim: 512 } }
}
```
Do not take the shortcut of writing it in the legacy format below. Caffe runs it without problems, but the converter does not recognize it, and the resulting ncnn data structure will be wrong!

```
input: "data"
input_dim: 1
input_dim: 1
input_dim: 256
input_dim: 512
```
- Network issue 2
Layer output names defined in the network must not be duplicated; each layer must be given its own output name, as in this canonical form:

```
layer {
  name: "AAAA"
  type: "Concat"
  bottom: "box_softmax"
  bottom: "conv6_2"
  top: "concat_out1"
  concat_param {
    axis: 2
  }
}
layer {
  name: "BBBB"
  type: "Concat"
  bottom: "box_softmax"
  bottom: "concat_out1"
  top: "concat_out2"
  concat_param {
    axis: 2
  }
}
```
Do not write the network as below. It will still run in Caffe, but ncnn only keeps the first occurrence of a top name. Here the first layer's output is concat_out1 and the second layer's output is also concat_out1, so an error occurs at ncnn.extract!

```
layer {
  name: "AAAA"
  type: "Concat"
  bottom: "box_softmax"
  bottom: "conv6_2"
  top: "concat_out1"
  concat_param {
    axis: 2
  }
}
layer {
  name: "BBBB"
  type: "Concat"
  bottom: "box_softmax"
  bottom: "concat_out1"
  top: "concat_out1"
  concat_param {
    axis: 2
  }
}
```
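One way to catch this before conversion is to scan the prototxt for top names that appear in more than one layer. A minimal stdlib-only sketch (the function name and the simplistic regex are mine, not part of ncnn; note that legitimate in-place layers such as ReLU also reuse a top name, so treat matches as warnings rather than errors):

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

// Collect every top: name that appears in more than one place in a prototxt.
// The line-based regex is deliberately simplistic; it does not parse layers.
std::vector<std::string> duplicate_tops(const std::string& prototxt) {
    std::map<std::string, int> counts;
    const std::regex top_re("top:\\s*\"([^\"]+)\"");
    for (auto it = std::sregex_iterator(prototxt.begin(), prototxt.end(), top_re);
         it != std::sregex_iterator(); ++it)
        counts[it->str(1)]++;
    std::vector<std::string> dups;
    for (const auto& kv : counts)
        if (kv.second > 1) dups.push_back(kv.first);  // suspect: reused top name
    return dups;
}
```

Running it over the bad example above would flag `concat_out1`.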
- Network issue 3 (ncnn)
This arguably counts as a Caffe problem, and it is one I used to overlook; ncnn simply resolves it internally.
The use_global_stats parameter of a Batch Normalization layer controls whether the layer uses the mean and variance stored inside the Caffe model. In other words:
- true: use Caffe's stored mean and variance. These are fixed: once the model is trained, the two values do not change.
- false: compute the mean and variance from the current layer's batch. These are not fixed; they keep changing during training until training converges.
ncnn defaults to true, and whether the prototxt says false or true, it is ultimately treated as true. When testing in Caffe, you must set it to true manually.
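The two modes are easy to illustrate outside of any framework. A minimal sketch (function names are mine, not the Caffe or ncnn API):

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// use_global_stats == true: normalize with the stored (training-time) statistics.
std::vector<float> bn_global(const std::vector<float>& x, float mean, float var,
                             float eps = 1e-5f) {
    std::vector<float> y;
    for (float v : x) y.push_back((v - mean) / std::sqrt(var + eps));
    return y;
}

// use_global_stats == false: compute the statistics from the current batch.
std::vector<float> bn_batch(const std::vector<float>& x, float eps = 1e-5f) {
    float mean = std::accumulate(x.begin(), x.end(), 0.0f) / x.size();
    float var = 0.0f;
    for (float v : x) var += (v - mean) * (v - mean);
    var /= x.size();
    return bn_global(x, mean, var, eps);
}
```

With fixed stored statistics the output depends only on each element; with batch statistics the same element normalizes differently depending on what else is in the batch, which is why the two modes disagree at test time.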
- ncnn data input 1
Normally, ncnn and the original Caffe model differ by about 0.001. My data wanders around the 0.000X range, so if your data needs accuracy beyond the third significant figure, you must check the precision of the network input.
Make the values passed to substract_mean_normalize as accurate as possible, especially the normalization scale!
Suppose a 0-255 image needs to be normalized:

```cpp
const float norm_vals[1] = { 0.0078431372549019607843137254902f }; // 1/127.5
```

Do not write it like this; readers can test it themselves, the accuracy differs greatly:

```cpp
const float norm_vals[1] = { 0.0078f };
```
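The size of the error is easy to verify without ncnn. A small stdlib sketch (the helper name is mine) that measures the worst-case drift of the truncated constant over all 8-bit pixel values:

```cpp
#include <cmath>

// Worst-case absolute difference between scaling with the full-precision
// normalization constant and with a truncated one, over all 8-bit pixels.
float max_norm_error(float full, float truncated) {
    float worst = 0.0f;
    for (int px = 0; px <= 255; ++px) {
        float d = std::fabs(px * full - px * truncated);
        if (d > worst) worst = d;
    }
    return worst;
}
```

For 1/127.5 versus 0.0078 the worst case (at pixel value 255) is about 0.011, an order of magnitude larger than the usual 0.001 ncnn-vs-Caffe gap, so the truncation alone can dominate your error budget.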
- ncnn data input 2
No pitfalls here, just two points from experience:
- If the input is an OpenCV Mat object, it can only be of type CV_8U. Do not try other types such as CV_32F; they will not produce correct results.
- Preprocessing an image with OpenCV has the same effect as doing it in ncnn: OpenCV's resize, normalize, cvtColor, and similar functions behave essentially the same as ncnn's from_pixels_resize and substract_mean_normalize. I have tested this.
NCNN Experience
Tips
- Multi-channel output
The official ncnn example flattens the output into one row of data and then processes it element by element:

```cpp
ncnn::Mat out_flatterned = out.reshape(out.w * out.h * out.c);
std::vector<float> scores;
scores.resize(out_flatterned.w);
for (int j = 0; j < out_flatterned.w; j++)
{
    scores[j] = out_flatterned[j];
}
```
Personally I feel this is fine for small outputs, but my network's output is 100×100×10. How should that case be handled?
- You could save it into an array of some kind, but that wastes a bit of time.
- And what if you need to post-process the result, for example simply finding the maximum of each channel and, above all, its coordinates?
I use the following procedure:
```cpp
for (int i = 0; i < out.c; i++)
{
    cv::Mat cv_mat = cv::Mat::zeros(cv::Size(100, 100), CV_8UC1);
    ncnn::Mat ppp = out.channel(i);
    // Convert to an OpenCV Mat; its many matrix routines make this convenient.
    ppp.to_pixels(cv_mat.data, ncnn::Mat::PIXEL_GRAY);
    double max_c = 0, min_c = 0;
    cv::Point min_loc, max_loc;
    cv::minMaxLoc(cv_mat, &min_c, &max_c, &min_loc, &max_loc);
    /* --------------- subsequent processing --------------- */
}
```
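One caveat with the loop above: to_pixels quantizes the floats to 8-bit, so minMaxLoc only sees rounded values. If the exact float maximum matters, you can scan the channel's raw data directly. A dependency-free sketch, assuming the channel's w×h floats are laid out row-major and contiguously (which holds within a single ncnn channel):

```cpp
struct MaxLoc { float value; int x; int y; };

// Exact float maximum and its location in one h x w channel stored row-major,
// avoiding the 8-bit rounding introduced by to_pixels().
MaxLoc channel_max(const float* data, int w, int h) {
    MaxLoc best{ data[0], 0, 0 };
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            float v = data[y * w + x];
            if (v > best.value) best = { v, x, y };
        }
    return best;
}
```

With ncnn you would call it roughly as `channel_max((const float*)out.channel(i), out.w, out.h)`.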
Small ideas
The official ncnn documentation does not say whether a single input or output blob can carry multiple channels. Multi-channel output has already been handled above; for input, see below.
ncnn's input call is Extractor.input(const char* blob_name, const Mat& in), where in is of type ncnn::Mat, so apparently multiple channels can be fed in.
You could create a 100×100×10 ncnn::Mat and then fill each channel, for example via from_pixels.
I have not implemented this myself, and the official site does not explain it, so I do not know whether it works; readers can try the idea above themselves.
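For what it's worth, the channel-by-channel filling can be sketched without ncnn by treating the input as a CHW float buffer (with a real ncnn::Mat you would copy into mat.channel(c) instead; this only illustrates the layout, it is not the ncnn API):

```cpp
#include <cstring>
#include <vector>

// Copy one h x w plane into channel c of a CHW buffer
// (channels * h * w floats, channel-major).
void set_channel(std::vector<float>& chw, int w, int h, int c,
                 const std::vector<float>& plane) {
    std::memcpy(chw.data() + static_cast<std::size_t>(c) * w * h,
                plane.data(), sizeof(float) * w * h);
}
```

Filling a 100×100×10 input would then be ten such copies, one per channel.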