ncnn source code reading (4) ---- model inference process

Model inference process

The inference process mainly involves two classes: Net and Extractor. The example program uses them as follows:

    ncnn::Net squeezenet;
    squeezenet.load_param("squeezenet_v1.1.param");
    squeezenet.load_model("squeezenet_v1.1.bin");

    ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data, ncnn::Mat::PIXEL_BGR, bgr.cols, bgr.rows, 227, 227);

    const float mean_vals[3] = {104.f, 117.f, 123.f};
    in.substract_mean_normalize(mean_vals, 0);

    // obtain an Extractor instance from the Net
    ncnn::Extractor ex = squeezenet.create_extractor();
    // set the mode and the input data for the Extractor
    ex.set_light_mode(true);
    ex.input("data", in);

    // run inference via extract to get the result
    ncnn::Mat out;
    ex.extract("prob", out);

    cls_scores.resize(out.c);
    for (int j=0; j<out.c; j++)
    {
        const float* prob = out.data + out.cstep * j;
        cls_scores[j] = prob[0];
    }

    return 0;
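The loop above copies the first element of each output channel into cls_scores; the predicted class is then just the index of the largest score. A minimal helper for that last step (argmax is our own name, not part of the ncnn example):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Return the index of the highest score, i.e. the predicted class.
static int argmax(const std::vector<float>& scores)
{
    return (int)(std::max_element(scores.begin(), scores.end()) - scores.begin());
}
```

Applied to the filled cls_scores vector, this yields the class id with the highest probability.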

Call logic in ncnn

  • Create an Extractor instance from the Net instance and the number of blobs:
Extractor Net::create_extractor() const
{
    return Extractor(this, blobs.size());
}

Extractor's constructor:

Extractor::Extractor(const Net* _net, int blob_count) : net(_net)
{
    blob_mats.resize(blob_count);
    lightmode = false;
    num_threads = 0;
}

blob_mats stores the data content of all blobs; it is defined as std::vector<Mat> blob_mats;
The set_light_mode method of the Extractor class simply sets the lightmode member variable, and the input method assigns the input data to the corresponding blob:

int Extractor::input(const char* blob_name, const Mat& in)
{
    // look up the blob's index in the blob table by its name
    int blob_index = net->find_blob_index_by_name(blob_name);
    if (blob_index == -1)
        return -1;
    // assign the input data to the Mat at the corresponding index
    blob_mats[blob_index] = in;
    return 0;
}
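find_blob_index_by_name is essentially a linear scan over the Net's blob table, returning -1 when the name is unknown, which matches the error convention seen in input and extract. A standalone sketch of that idea (Blob here is a simplified stand-in for ncnn's struct, not the real type):

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Blob { std::string name; };  // simplified stand-in for ncnn::Blob

// Linear scan over the blob table; returns -1 when the name is not found,
// matching the error convention used by Extractor::input / extract.
static int find_blob_index_by_name(const std::vector<Blob>& blobs, const char* name)
{
    for (size_t i = 0; i < blobs.size(); i++)
        if (blobs[i].name == name)
            return (int)i;
    return -1;
}
```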

blob_mats is owned by the Extractor, while the blobs in the Net class are the data nodes of the graph: blob_mats[i] holds the concrete value of the data node blobs[i], so the two arrays correspond one to one.
The extract method of the Extractor class returns the data for a given blob name. Its logic is to look up the index for the name, run the forward pass if the entry at that index has not been computed yet, and then return the Mat at that index in blob_mats.

int Extractor::extract(const char* blob_name, Mat& feat)
{
    // look up the index corresponding to the name
    int blob_index = net->find_blob_index_by_name(blob_name);
    if (blob_index == -1)
        return -1;

    int ret = 0;

    // check whether the data at this index in blob_mats is valid;
    // if not, run inference to fill blob_mats
    if (blob_mats[blob_index].dims == 0)
    {
        int layer_index = net->blobs[blob_index].producer;
        ret = net->forward_layer(layer_index, blob_mats, lightmode);
    }
    // return the Mat at the corresponding index
    feat = blob_mats[blob_index];

    return ret;
}

Recursive calls realize layer-by-layer execution of the network

From the extract method above we can see that filling blob_mats is done by the forward_layer method of the Net class, so the inference process we need to focus on is the concrete implementation behind the following code:

 if (blob_mats[blob_index].dims == 0)
 {
     int layer_index = net->blobs[blob_index].producer;
     ret = net->forward_layer(layer_index, blob_mats, lightmode);
 }

The branch above means: the index is obtained from the blob name, and the data at that index in blob_mats is found to be invalid (dims == 0), so a forward pass of the network must be run to produce the concrete value of this data node.
Each blob records its producer, the index of the layer that generates it. To reduce computation, there is no need to run the whole network through inference; it is enough to compute up to the layer that produces the blob we currently need. The forward pass therefore only needs to know the index of the target layer and the operating mode.
Accordingly, int layer_index = net->blobs[blob_index].producer; obtains the index of the target layer, and ret = net->forward_layer(layer_index, blob_mats, lightmode); passes that index, the blob_mats table holding the data of all data nodes, and the operating mode lightmode into the forward pass. The forward_layer method of the Net class is analyzed in detail below.
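This producer-driven recursion can be captured in miniature: a blob is computed only when someone demands it, and demanding a blob runs its producer layer, which in turn demands its own inputs. MiniBlob and MiniLayer below are toy stand-ins (not ncnn's real types), and the "layer computation" is reduced to an integer addition:

```cpp
#include <cassert>
#include <vector>

// Demand-driven evaluation in miniature: each blob records the layer that
// produces it, and a blob is computed only when it is asked for -- the same
// idea as the dims == 0 check plus recursion in Net::forward_layer.
struct MiniBlob  { int producer; };              // index of the layer producing this blob
struct MiniLayer { int bottom, top, addend; };   // "forward": top = bottom + addend

static void forward_layer(int layer_index,
                          const std::vector<MiniLayer>& layers,
                          const std::vector<MiniBlob>& blobs,
                          std::vector<int>& blob_vals,
                          std::vector<bool>& valid)
{
    const MiniLayer& layer = layers[layer_index];
    // recurse into the producer if the input blob has not been computed yet
    if (layer.bottom >= 0 && !valid[layer.bottom])
        forward_layer(blobs[layer.bottom].producer, layers, blobs, blob_vals, valid);
    int in = layer.bottom >= 0 ? blob_vals[layer.bottom] : 0;
    blob_vals[layer.top] = in + layer.addend;
    valid[layer.top] = true;
}

// Build a 3-layer chain (input -> +1 -> +2 -> +3) and "extract" the last blob:
// demanding blob 2 pulls the whole chain through, yielding 0 + 1 + 2 + 3 = 6.
static int run_chain_demo()
{
    std::vector<MiniLayer> layers = { {-1, 0, 1}, {0, 1, 2}, {1, 2, 3} };
    std::vector<MiniBlob>  blobs  = { {0}, {1}, {2} };
    std::vector<int>  blob_vals(3, 0);
    std::vector<bool> valid(3, false);
    forward_layer(blobs[2].producer, layers, blobs, blob_vals, valid);
    return blob_vals[2];
}
```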

  • Obtain the layer instance according to layer_index, the index of the layer:
const Layer* layer = layers[layer_index];
  • Branch according to whether the layer has a single input and a single output;
    • For a layer with a single input and a single output:
      1. Obtain the indices of the input and output blobs;
      2. Look up the input blob in blob_mats by its index;
      3. Check whether the input data is valid via the dims of the input blob;
      4. If invalid, recursively call forward_layer on the producer of the input blob;
      5. If valid, take the input blob of the current layer out of blob_mats;
      6. If lightmode is false, the execution logic is simple: call the layer's forward function on the input blob to get the output blob, then assign the output blob to the corresponding position of blob_mats;
      7. If lightmode is true (the lightweight mode keeps reclaiming memory during inference): release the Mat at the input blob's position in blob_mats as soon as it is taken, and if the layer computes inplace while the input data is still shared elsewhere (refcount != 1), deep-copy the input first, so that each Mat is used independently and its resources can be freed safely.
if (layer->one_blob_only)
{
    // load bottom blob
    int bottom_blob_index = layer->bottoms[0];
    int top_blob_index = layer->tops[0];

    if (blob_mats[bottom_blob_index].dims == 0)
    {
        int ret = forward_layer(blobs[bottom_blob_index].producer, blob_mats, lightmode);
        if (ret != 0)
            return ret;
    }

    Mat bottom_blob = blob_mats[bottom_blob_index];

    if (lightmode)
    {
        // delete after taken in light mode
        blob_mats[bottom_blob_index].release();
        // deep copy for inplace forward if data is shared
        if (layer->support_inplace && *bottom_blob.refcount != 1)
        {
            bottom_blob = bottom_blob.clone();
        }
    }

    // forward
    if (lightmode && layer->support_inplace)
    {
        Mat& bottom_top_blob = bottom_blob;
        int ret = layer->forward_inplace(bottom_top_blob);
        if (ret != 0)
            return ret;

        // store top blob
        blob_mats[top_blob_index] = bottom_top_blob;
    }
    else
    {
        Mat top_blob;
        int ret = layer->forward(bottom_blob, top_blob);
        if (ret != 0)
            return ret;

        // store top blob
        blob_mats[top_blob_index] = top_blob;
    }

}
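The lightmode branch above is subtle: blob_mats[bottom_blob_index].release() drops the table's reference, and the clone guards inplace layers against writing into memory still shared elsewhere. The refcount logic can be illustrated with std::shared_ptr, whose use_count plays the role of Mat's manual refcount (this is an analogy, not ncnn's actual Mat implementation):

```cpp
#include <cassert>
#include <memory>
#include <vector>

// FakeMat stands in for ncnn::Mat; shared_ptr's use_count replaces *refcount.
using FakeMat = std::shared_ptr<std::vector<float> >;

// Mirror of the lightmode take/release/clone sequence for an inplace layer.
static FakeMat take_for_inplace(std::vector<FakeMat>& blob_mats, int index)
{
    FakeMat bottom = blob_mats[index];   // take a local reference first
    blob_mats[index].reset();            // "release": drop the table's reference
    // if anyone else still holds the data, deep-copy before writing in place
    if (bottom.use_count() != 1)
        bottom = std::make_shared<std::vector<float> >(*bottom);
    return bottom;
}

// Case 1: an external alias still holds the data, so a clone must happen.
static bool clone_happened_when_shared()
{
    std::vector<FakeMat> blob_mats(1);
    blob_mats[0] = std::make_shared<std::vector<float> >(3, 1.f);
    FakeMat alias = blob_mats[0];                  // someone else holds the data
    FakeMat taken = take_for_inplace(blob_mats, 0);
    return taken != alias;                         // true: a deep copy was made
}

// Case 2: the table was the sole owner, so ownership simply moves, no copy.
static bool no_clone_when_sole_owner()
{
    std::vector<FakeMat> blob_mats(1);
    blob_mats[0] = std::make_shared<std::vector<float> >(3, 2.f);
    std::vector<float>* raw = blob_mats[0].get();
    FakeMat taken = take_for_inplace(blob_mats, 0);
    return taken.get() == raw;                     // true: same buffer, no clone
}
```

The release-before-clone order matters: dropping the table's reference first means that when the Extractor was the only owner, the inplace forward can safely reuse the buffer without any copy.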
    • For a layer that is not single-input, single-output (multiple inputs and/or outputs):

1. According to the number of inputs of the current layer, define a vector that stores the Mats of the input blobs;
2. Traverse the inputs; each iteration performs the same operations as in the single-input, single-output case:

 std::vector<Mat> bottom_blobs;
 bottom_blobs.resize(layer->bottoms.size());
 for (size_t i=0; i<layer->bottoms.size(); i++)
 {
     int bottom_blob_index = layer->bottoms[i];

     if (blob_mats[bottom_blob_index].dims == 0)
     {
         int ret = forward_layer(blobs[bottom_blob_index].producer, blob_mats, lightmode);
         if (ret != 0)
             return ret;
     }

     bottom_blobs[i] = blob_mats[bottom_blob_index];

     if (lightmode)
     {
         // delete after taken in light mode
         blob_mats[bottom_blob_index].release();
         // deep copy for inplace forward if data is shared
         if (layer->support_inplace && *bottom_blobs[i].refcount != 1)
         {
             bottom_blobs[i] = bottom_blobs[i].clone();
         }
     }
 }

3. The forward step is similar to the single-input, single-output case; when storing the outputs, the top blobs are assigned back to blob_mats in a loop:

 // forward
 if (lightmode && layer->support_inplace)
  {
      std::vector<Mat>& bottom_top_blobs = bottom_blobs;
      int ret = layer->forward_inplace(bottom_top_blobs);
      if (ret != 0)
          return ret;

      // store top blobs
      for (size_t i=0; i<layer->tops.size(); i++)
      {
          int top_blob_index = layer->tops[i];

          blob_mats[top_blob_index] = bottom_top_blobs[i];
      }
  }
  else
  {
      std::vector<Mat> top_blobs;
      top_blobs.resize(layer->tops.size());
      int ret = layer->forward(bottom_blobs, top_blobs);
      if (ret != 0)
          return ret;

      // store top blobs
      for (size_t i=0; i<layer->tops.size(); i++)
      {
          int top_blob_index = layer->tops[i];

          blob_mats[top_blob_index] = top_blobs[i];
      }
  }

The complete multi-input, multi-output code is as follows:

 // load bottom blobs
 std::vector<Mat> bottom_blobs;
 bottom_blobs.resize(layer->bottoms.size());
 for (size_t i=0; i<layer->bottoms.size(); i++)
 {
     int bottom_blob_index = layer->bottoms[i];

     if (blob_mats[bottom_blob_index].dims == 0)
     {
         int ret = forward_layer(blobs[bottom_blob_index].producer, blob_mats, lightmode);
         if (ret != 0)
             return ret;
     }

     bottom_blobs[i] = blob_mats[bottom_blob_index];

     if (lightmode)
     {
         // delete after taken in light mode
         blob_mats[bottom_blob_index].release();
         // deep copy for inplace forward if data is shared
         if (layer->support_inplace && *bottom_blobs[i].refcount != 1)
         {
             bottom_blobs[i] = bottom_blobs[i].clone();
         }
     }
 }

 // forward
 if (lightmode && layer->support_inplace)
 {
     std::vector<Mat>& bottom_top_blobs = bottom_blobs;
     int ret = layer->forward_inplace(bottom_top_blobs);
     if (ret != 0)
         return ret;

     // store top blobs
     for (size_t i=0; i<layer->tops.size(); i++)
     {
         int top_blob_index = layer->tops[i];

         blob_mats[top_blob_index] = bottom_top_blobs[i];
     }
 }
 else
 {
     std::vector<Mat> top_blobs;
     top_blobs.resize(layer->tops.size());
     int ret = layer->forward(bottom_blobs, top_blobs);
     if (ret != 0)
         return ret;

     // store top blobs
     for (size_t i=0; i<layer->tops.size(); i++)
     {
         int top_blob_index = layer->tops[i];

         blob_mats[top_blob_index] = top_blobs[i];
     }
 }

Origin blog.csdn.net/qq_25105061/article/details/131761864