CAFFE: A Walkthrough of the classification.cpp C++ Interface

Once a caffemodel has been trained and we want to use it to classify a single image, Caffe provides the following command-line interface (the C++ interface):

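./build/examples/cpp_classification/classification.bin \
  deploy.prototxt \
  caffemodel \
  mean.binaryproto \
  synset_words.txt \
  test.jpg

The arguments, in order:
deploy.prototxt: the network description file;
caffemodel: the trained model;
mean.binaryproto: the mean file loaded during training;
synset_words.txt: the class name for each label, one per line; for a two-class cat/dog problem the first line could be "0 cat" and the second "1 dog";
test.jpg: the image to classify.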

Running the command prints the top-5 labels and their confidences. To see how the interface works, we read its source file, examples/cpp_classification/classification.cpp under the Caffe root directory.

Start with the main function:

int main(int argc, char** argv) {
  if (argc != 6) {
    std::cerr << "Usage: " << argv[0]
              << " deploy.prototxt network.caffemodel"
              << " mean.binaryproto labels.txt img.jpg" << std::endl;
    return 1;
  }

  ::google::InitGoogleLogging(argv[0]);

  string model_file   = argv[1];             // network description file
  string trained_file = argv[2];             // trained model
  string mean_file    = argv[3];             // mean file loaded during training
  string label_file   = argv[4];             // label file
  Classifier classifier(model_file, trained_file, mean_file, label_file);          // the core classifier, built from the four files above

  string file = argv[5];                     // path of the image to classify

  std::cout << "---------- Prediction for "<< file << " ----------" << std::endl;

  cv::Mat img = cv::imread(file, -1);
  CHECK(!img.empty()) << "Unable to decode image " << file;
  std::vector<Prediction> predictions = classifier.Classify(img);            // run the prediction; returns the top 5 as <label text, confidence> pairs

  /* Print the top N predictions. */                                                           
  for (size_t i = 0; i < predictions.size(); ++i) {                                        // print each result
    Prediction p = predictions[i];
    std::cout << std::fixed << std::setprecision(4) << p.second << " - \""
              << p.first << "\"" << std::endl;
  }
}

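The code above feeds one image into the network through classifier.Classify, runs a forward pass, and prints what the network returns. In other words, the image is mapped to a vector, and this vector is exactly the output of the network's final softmax layer, with one probability per class. For example, in a two-class cat/dog problem where cat is label 0 and dog is label 1, a cat image should yield a vector close to [1, 0].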
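To make "close to [1, 0]" concrete, here is a minimal standalone sketch of what a softmax layer computes. It does not use Caffe; the function name Softmax and the input scores are assumptions chosen for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Turn raw class scores into probabilities that sum to 1.
// Softmax is a hypothetical helper for this sketch, not a Caffe function.
std::vector<float> Softmax(const std::vector<float>& scores) {
  assert(!scores.empty());
  // Subtract the largest score first so that exp() cannot overflow.
  const float max_score = *std::max_element(scores.begin(), scores.end());
  std::vector<float> probs(scores.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < scores.size(); ++i) {
    probs[i] = std::exp(scores[i] - max_score);
    sum += probs[i];
  }
  for (float& p : probs) p /= sum;  // normalize
  return probs;
}
```

With a raw cat score well above the dog score, for example Softmax({4.0f, 0.0f}), the first probability is about 0.98, so the result is near [1, 0] without being exactly equal to it.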

Next, the rest of the file.

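classification.cpp defines a Classifier class, which contains:
Classifier constructor: builds the model from the network configuration file (.prototxt) and the trained model file (.caffemodel), giving net_;
processes the mean file, giving mean_;
reads the labels file, giving labels_;
Classify function: calls Predict to classify the image img and returns the predictions as a vector of std::pair<std::string, float>.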
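The label loading and the top-N selection described above can be sketched without Caffe as follows. This is an illustration under assumptions, not the file's exact code: ReadLabels and TopPredictions are hypothetical helper names, and the scores are passed in directly instead of coming from Predict.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying ReadLabels on a string
#include <string>
#include <utility>
#include <vector>

using Prediction = std::pair<std::string, float>;  // <label text, confidence>

// Read one label per line; the whole line (e.g. "0 cat") becomes the label text.
std::vector<std::string> ReadLabels(std::istream& in) {
  std::vector<std::string> labels;
  std::string line;
  while (std::getline(in, line)) labels.push_back(line);
  return labels;
}

// Pair the n highest scores with their labels, highest score first.
std::vector<Prediction> TopPredictions(const std::vector<std::string>& labels,
                                       const std::vector<float>& scores,
                                       std::size_t n) {
  assert(labels.size() == scores.size());  // one label per output value
  n = std::min(n, scores.size());
  std::vector<std::pair<float, std::size_t>> ranked;
  for (std::size_t i = 0; i < scores.size(); ++i) ranked.emplace_back(scores[i], i);
  // Only the first n entries need to be in sorted order.
  std::partial_sort(ranked.begin(),
                    ranked.begin() + static_cast<std::ptrdiff_t>(n),
                    ranked.end(),
                    std::greater<std::pair<float, std::size_t>>());
  std::vector<Prediction> top;
  for (std::size_t i = 0; i < n; ++i)
    top.emplace_back(labels[ranked[i].second], ranked[i].first);
  return top;
}
```

Asking for 5 predictions from a two-class model simply returns both classes, because n is clamped to the number of scores.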

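Private functions, used only by the constructor and the Classify function:
SetMean function: reads the mean file and converts it into a mean image mean_.
Predict function: calls Preprocess to write the image into the network, runs the prediction with net_->Forward(), then copies the output layer into a vector and returns it.
Preprocess function: converts the image's channel count, size, and data type, subtracts the mean mean_, and then writes the result into the input layer of net_.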
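The last steps of Preprocess (convert to float, subtract the mean, split into channels) can be sketched in plain C++ without OpenCV. This is a simplified illustration: ToPlanarMinusMean is a hypothetical helper, the mean is assumed to be a single value per channel, and the channel conversion and resizing steps are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert an interleaved 8-bit image (channels values per pixel) to float,
// subtract a per-channel mean, and store the result in planar order:
// all of channel 0, then all of channel 1, and so on.
std::vector<float> ToPlanarMinusMean(const std::vector<std::uint8_t>& interleaved,
                                     std::size_t channels,
                                     const std::vector<float>& channel_mean) {
  assert(channels > 0 && channel_mean.size() == channels);
  assert(interleaved.size() % channels == 0);
  const std::size_t pixels = interleaved.size() / channels;
  std::vector<float> planar(interleaved.size());
  for (std::size_t p = 0; p < pixels; ++p) {
    for (std::size_t c = 0; c < channels; ++c) {
      planar[c * pixels + p] =
          static_cast<float>(interleaved[p * channels + c]) - channel_mean[c];
    }
  }
  return planar;
}
```

The planar order matters because the network's input layer stores each channel as one contiguous block, while a decoded image usually stores the channel values of each pixel next to each other.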

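Private variables:
net_: the network;
input_geometry_: the image size of the input layer;
num_channels_: the number of channels of the input layer;
mean_: the mean image computed from the mean file;
labels_: the contents of the label file, which give the meaning of each output value.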

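WrapInputLayer: a private function that wraps the input layer's data in cv::Mat objects, one per channel, so that later steps can write into the layer directly.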
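The idea behind WrapInputLayer can be shown with raw pointers in place of cv::Mat headers. This is a sketch under assumptions: WrapChannels is a hypothetical name, and a std::vector<float> stands in for the input blob's memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return one pointer per channel into a contiguous channels*height*width buffer.
// Nothing is copied: writing through a returned pointer writes into the buffer.
std::vector<float*> WrapChannels(std::vector<float>& buffer,
                                 std::size_t channels,
                                 std::size_t height,
                                 std::size_t width) {
  assert(buffer.size() == channels * height * width);
  std::vector<float*> views;
  float* data = buffer.data();
  for (std::size_t c = 0; c < channels; ++c) {
    views.push_back(data);   // start of channel c
    data += height * width;  // advance to the next channel plane
  }
  return views;
}
```

Because the views alias the buffer, a step that splits an image into these channels has already filled the input layer, and no separate copy into the network is needed.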

Here we focus on the Predict function:

std::vector<float> Classifier::Predict(const cv::Mat& img) {
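  Blob<float>* input_layer = net_->input_blobs()[0];                      // the network's data input layer
  // Reshape the input layer to hold a single image, e.g. 1*3*227*227: one image, three channels, 227*227 pixels
  input_layer->Reshape(1, num_channels_, input_geometry_.height, input_geometry_.width);
  /* Forward dimension change to all layers. */
  /* Starting from the input layer and the parameters in the deploy.prototxt passed to the constructor (number of kernels,
  kernel size, and so on), the whole network allocates space layer by layer, ready for the computation that follows */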
  net_->Reshape();
  
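  /* The next three lines write the test image img into the network's input layer after some preprocessing
  (channel conversion, resizing, mean subtraction, and so on); see the two functions' implementations for the details */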
  std::vector<cv::Mat> input_channels;
  WrapInputLayer(&input_channels);
  Preprocess(img, &input_channels);

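  // Feed the image through the network: one forward pass
  net_->Forward();
  
  /* Copy the output layer to a std::vector */
  // The network's output layer; for ImageNet with 1000 classes, its shape is 1*1000
  Blob<float>* output_layer = net_->output_blobs()[0];
  // The next three lines take the N-dimensional output vector (1000-dimensional for ImageNet) and return it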
  const float* begin = output_layer->cpu_data();
  const float* end = begin + output_layer->channels();
  return std::vector<float>(begin, end);
}


Those are the key functions; the remaining functions in classification.cpp are easy to follow.

Hu_sin
