Caffe source code: the Layer class

Layer class overview

Layer is the basic building block of a Caffe network. The Caffe code base contains a large number of layer types, all derived from the Layer base class; each implements its own behavior through the virtual functions Forward() and Backward(). Forward() performs the forward pass: it computes the top blobs (and, for loss layers, the loss) from the bottom blobs, passing data from the shallow layers to the deep ones. Backward() performs the back-propagation pass: it computes the gradients of the bottom blobs from the gradients of the top blobs, propagating the network's prediction error back to the shallower layers so that the network parameters can be updated.
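To make this contract concrete, here is a minimal sketch of a derived layer (the DoubleLayer name and its scale-by-two behavior are invented for illustration; it is not a layer from the Caffe code base). It forwards the constructor to the base class and overrides Reshape(), Forward_cpu() and Backward_cpu(), which are exactly the hooks discussed in the header below.

// Hypothetical example: a layer whose single top blob is 2 * its single bottom blob.
template <typename Dtype>
class DoubleLayer : public Layer<Dtype> {
 public:
  explicit DoubleLayer(const LayerParameter& param) : Layer<Dtype>(param) {}

  virtual inline const char* type() const { return "Double"; }
  virtual inline int ExactNumBottomBlobs() const { return 1; }  // enforced by CheckBlobCounts()
  virtual inline int ExactNumTopBlobs() const { return 1; }

  // Give the top blob the same shape as the bottom blob.
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    top[0]->ReshapeLike(*bottom[0]);
  }

 protected:
  // Forward pass: top = 2 * bottom.
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    const Dtype* bottom_data = bottom[0]->cpu_data();
    Dtype* top_data = top[0]->mutable_cpu_data();
    for (int i = 0; i < bottom[0]->count(); ++i) {
      top_data[i] = Dtype(2) * bottom_data[i];
    }
  }

  // Backward pass: d(loss)/d(bottom) = 2 * d(loss)/d(top).
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) {
    if (!propagate_down[0]) { return; }
    const Dtype* top_diff = top[0]->cpu_diff();
    Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
    for (int i = 0; i < bottom[0]->count(); ++i) {
      bottom_diff[i] = Dtype(2) * top_diff[i];
    }
  }
};

Net::Init() would call SetUp() on such a layer, and the Forward()/Backward() wrappers defined at the bottom of layer.hpp would dispatch to these *_cpu() implementations (or to *_gpu() ones, if provided).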

layer.hpp source

/**
 * @brief An interface for the units of computation which can be composed into a
 *        Net.
 *
 * Layer%s must implement a Forward function, in which they take their input
 * (bottom) Blob%s (if any) and compute their output Blob%s (if any).
 * They may also implement a Backward function, in which they compute the error
 * gradients with respect to their input Blob%s, given the error gradients with
 * their output Blob%s.
 */
template <typename Dtype>
class Layer {
 public:
  /**
   * You should not implement your own constructor. Any set up code should go
   * to SetUp(), where the dimensions of the bottom blobs are provided to the
   * layer.
   */
  explicit Layer(const LayerParameter& param)       // constructor: initialize this layer from the LayerParameter
    : layer_param_(param) {                         // store param first
      // Set phase and copy blobs (if there are any).
      phase_ = param.phase();                       // set the layer's phase (TRAIN or TEST)
      if (layer_param_.blobs_size() > 0) {          // the layer has blob parameters (learnable parameters)
        blobs_.resize(layer_param_.blobs_size());   // resize blobs_
        for (int i = 0; i < layer_param_.blobs_size(); ++i) {
          blobs_[i].reset(new Blob<Dtype>());       // point each entry at a newly created Blob
          blobs_[i]->FromProto(layer_param_.blobs(i));    // read the BlobProto data stored in layer_param_ into the blob
        }
      }
    }
  virtual ~Layer() {}

  /**
   * @brief Implements common layer setup functionality.
   *
   * @param bottom the preshaped input blobs
   * @param top
   *     the allocated but unshaped output blobs, to be shaped by Reshape
   *
   * Checks that the number of bottom and top blobs is correct.
   * Calls LayerSetUp to do special layer setup for individual layer types,
   * followed by Reshape to set up sizes of top blobs and internal buffers.
   * Sets up the loss weight multiplier blobs for any non-zero loss weights.
   * This method may not be overridden.
   */
  void SetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    CheckBlobCounts(bottom, top);   // check that the numbers of bottom and top blobs are correct
    LayerSetUp(bottom, top);        // virtual; one-time initialization when the layer is created, implemented by each subclass as needed
    Reshape(bottom, top);           // virtual; adjust the shapes of the top blobs and the layer's internal buffers
    SetLossWeights(top);            // set the loss weights of the top blobs
  }

  /**
   * @brief Does layer-specific setup: your layer should implement this function
   *        as well as Reshape.
   *
   * @param bottom
   *     the preshaped input blobs, whose data fields store the input data for
   *     this layer
   * @param top
   *     the allocated but unshaped output blobs
   *
   * This method should do one-time layer specific setup. This includes reading
   * and processing relevent parameters from the <code>layer_param_</code>.
   * Setting up the shapes of top blobs and internal buffers should be done in
   * <code>Reshape</code>, which will be called before the forward pass to
   * adjust the top blob sizes.
   */
  // Each layer type has its own implementation; typically used for one-time work when the layer is created, such as reading parameters from layer_param_
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {}

  /**
   * @brief Adjust the shapes of top blobs and internal buffers to accommodate
   *        the shapes of the bottom blobs.
   *
   * @param bottom the input blobs, with the requested input shapes
   * @param top the top blobs, which should be reshaped as needed
   *
   * This method should reshape top blobs as needed according to the shapes
   * of the bottom (input) blobs, as well as reshaping any internal buffers
   * and making any other necessary adjustments so that the layer can
   * accommodate the bottom blobs.
   */
  // Adjust the shapes of the top blobs and the internal buffers according to the shapes of the bottom blobs; called before every forward pass
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) = 0;

  /**
   * @brief Given the bottom blobs, compute the top blobs and the loss.
   *
   * @param bottom
   *     the input blobs, whose data fields store the input data for this layer
   * @param top
   *     the preshaped output blobs, whose data fields will store this layers'
   *     outputs
   * \return The total loss from the layer.
   *
   * The Forward wrapper calls the relevant device wrapper function
   * (Forward_cpu or Forward_gpu) to compute the top blob values given the
   * bottom blobs.  If the layer has any non-zero loss_weights, the wrapper
   * then computes and returns the loss.
   *
   * Your layer should implement Forward_cpu and (optionally) Forward_gpu.
   */
  // Forward pass: given the bottom blobs, compute the top blobs and the loss (when the loss weight is non-zero).
  inline Dtype Forward(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  /**
   * @brief Given the top blob error gradients, compute the bottom blob error
   *        gradients.
   *
   * @param top
   *     the output blobs, whose diff fields store the gradient of the error
   *     with respect to themselves
   * @param propagate_down
   *     a vector with equal length to bottom, with each index indicating
   *     whether to propagate the error gradients down to the bottom blob at
   *     the corresponding index
   * @param bottom
   *     the input blobs, whose diff fields will store the gradient of the error
   *     with respect to themselves after Backward is run
   *
   * The Backward wrapper calls the relevant device wrapper function
   * (Backward_cpu or Backward_gpu) to compute the bottom blob diffs given the
   * top blob diffs.
   *
   * Your layer should implement Backward_cpu and (optionally) Backward_gpu.
   */
  // Given the error gradients of the top blobs, compute the error gradients of the bottom blobs
  inline void Backward(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,   // same length as bottom.size(); indicates for each bottom blob whether to back-propagate to it
      const vector<Blob<Dtype>*>& bottom);

  /**
   * @brief Returns the vector of learnable parameter blobs.
   */
  vector<shared_ptr<Blob<Dtype> > >& blobs() {    // return the layer's learnable parameter blobs
    return blobs_;
  }

  /**
   * @brief Returns the layer parameter.
   */
  const LayerParameter& layer_param() const { return layer_param_; }      // return the layer's configuration parameters

  /**
   * @brief Writes the layer parameter to a protocol buffer
   */
  virtual void ToProto(LayerParameter* param, bool write_diff = false);   // write the layer's blob data into param

  /**
   * @brief Returns the scalar loss associated with a top blob at a given index.
   */
  inline Dtype loss(const int top_index) const {
    return (loss_.size() > top_index) ? loss_[top_index] : Dtype(0);    // return the loss weight of the top_index-th top blob
  }

  /**
   * @brief Sets the loss associated with a top blob at a given index.
   */
  inline void set_loss(const int top_index, const Dtype value) {    // set the loss weight of the top_index-th top blob to value
    if (loss_.size() <= top_index) {
      loss_.resize(top_index + 1, Dtype(0));    // too small: resize and initialize new entries with Dtype(0)
    }
    loss_[top_index] = value;     // set the corresponding loss weight to value
  }

  /**
   * @brief Returns the layer type.
   */
  virtual inline const char* type() const { return ""; }    // return the layer type

  /**
   * @brief Returns the exact number of bottom blobs required by the layer,
   *        or -1 if no exact number is required.
   *
   * This method should be overridden to return a non-negative value if your
   * layer expects some exact number of bottom blobs.
   */
  virtual inline int ExactNumBottomBlobs() const { return -1; }   // exact number of bottom blobs required by the layer; -1 means no requirement
  /**
   * @brief Returns the minimum number of bottom blobs required by the layer,
   *        or -1 if no minimum number is required.
   *
   * This method should be overridden to return a non-negative value if your
   * layer expects some minimum number of bottom blobs.
   */
  virtual inline int MinBottomBlobs() const { return -1; }        // minimum number of bottom blobs required; -1 means no requirement
  /**
   * @brief Returns the maximum number of bottom blobs required by the layer,
   *        or -1 if no maximum number is required.
   *
   * This method should be overridden to return a non-negative value if your
   * layer expects some maximum number of bottom blobs.
   */
  virtual inline int MaxBottomBlobs() const { return -1; }        // maximum number of bottom blobs required; -1 means no requirement
  /**
   * @brief Returns the exact number of top blobs required by the layer,
   *        or -1 if no exact number is required.
   *
   * This method should be overridden to return a non-negative value if your
   * layer expects some exact number of top blobs.
   */
  virtual inline int ExactNumTopBlobs() const { return -1; }      // exact number of top blobs required; -1 means no requirement
  /**
   * @brief Returns the minimum number of top blobs required by the layer,
   *        or -1 if no minimum number is required.
   *
   * This method should be overridden to return a non-negative value if your
   * layer expects some minimum number of top blobs.
   */
  virtual inline int MinTopBlobs() const { return -1; }           // minimum number of top blobs required; -1 means no requirement
  /**
   * @brief Returns the maximum number of top blobs required by the layer,
   *        or -1 if no maximum number is required.
   *
   * This method should be overridden to return a non-negative value if your
   * layer expects some maximum number of top blobs.
   */
  virtual inline int MaxTopBlobs() const { return -1; }           // maximum number of top blobs required; -1 means no requirement
  /**
   * @brief Returns true if the layer requires an equal number of bottom and
   *        top blobs.
   *
   * This method should be overridden to return true if your layer expects an
   * equal number of bottom and top blobs.
   */
  virtual inline bool EqualNumBottomTopBlobs() const { return false; }  // whether the layer requires equal numbers of bottom and top blobs

  /**
   * @brief Return whether "anonymous" top blobs are created automatically
   *        by the layer.
   *
   * If this method returns true, Net::Init will create enough "anonymous" top
   * blobs to fulfill the requirement specified by ExactNumTopBlobs() or
   * MinTopBlobs().
   */
  // Whether anonymous top blobs may be created automatically; if so, Net::Init() creates anonymous top blobs until their number reaches the larger of ExactNumTopBlobs() and MinTopBlobs()
  virtual inline bool AutoTopBlobs() const { return false; }    // TODO: the purpose of anonymous blobs is not yet clear to me

  /**
   * @brief Return whether to allow force_backward for a given bottom blob
   *        index.
   *
   * If AllowForceBackward(i) == false, we will ignore the force_backward
   * setting and backpropagate to blob i only if it needs gradient information
   * (as is done when force_backward == false).
   */
  // If allowed, then when the net enables force_backward, back-propagation to this bottom blob must follow the net's setting. If not allowed,
  // whether to back-propagate to the bottom blob is still decided by the layer's own needs, regardless of the net's setting. See net.cpp -> Init() -> if (param.force_backward())...
  virtual inline bool AllowForceBackward(const int bottom_index) const {    // whether the layer allows forced back-propagation for this bottom blob
    return true;
  }

  /**
   * @brief Specifies whether the layer should compute gradients w.r.t. a
   *        parameter at a particular index given by param_id.
   *
   * You can safely ignore false values and always compute gradients
   * for all parameters, but possibly with wasteful computation.
   */
  inline bool param_propagate_down(const int param_id) {  // return whether the param_id-th parameter blob needs gradient computation
    return (param_propagate_down_.size() > param_id) ?
        param_propagate_down_[param_id] : false;          // return false if param_id is out of range
  }
  /**
   * @brief Sets whether the layer should compute gradients w.r.t. a
   *        parameter at a particular index given by param_id.
   */
  inline void set_param_propagate_down(const int param_id, const bool value) {  // set whether the param_id-th parameter blob needs gradient computation
    if (param_propagate_down_.size() <= param_id) {
      param_propagate_down_.resize(param_id + 1, true);   // resize
    }
    param_propagate_down_[param_id] = value;    // set the flag to value
  }

 protected:
  /** The protobuf that stores the layer parameters */
  // layer_param_ holds the layer's various parameters in protobuf form; for example, the layer's blobs_ are stored in layer_param_ as BlobProto messages
  LayerParameter layer_param_;      // the layer's configuration parameters, a protobuf message
  /** The phase: TRAIN or TEST */
  Phase phase_;                     // the layer's phase: training (TRAIN) or testing (TEST)
  /** The vector that stores the learnable parameters as a set of blobs. */
  vector<shared_ptr<Blob<Dtype> > > blobs_;   // the layer's learnable parameters (e.g. a convolution layer's weights and biases are stored this way)
  /** Vector indicating whether to compute the diff of each param blob. */
  vector<bool> param_propagate_down_;         // indicates whether each parameter blob needs gradient computation

  /** The vector that indicates whether each top blob has a non-zero weight in
   *  the objective function. */
  vector<Dtype> loss_;    // same length as top.size(); the weight of each top blob in the loss computation, 0 by default in non-loss layers

  /** @brief Using the CPU device, compute the layer output. */
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) = 0;     // forward computation on the CPU
  /**
   * @brief Using the GPU device, compute the layer output.
   *        Fall back to Forward_cpu() if unavailable.
   */
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {      // forward computation on the GPU
    // LOG(WARNING) << "Using CPU code as backup.";
    return Forward_cpu(bottom, top);          // fall back to Forward_cpu() if the subclass does not implement it
  }

  /**
   * @brief Using the CPU device, compute the gradients for any parameters and
   *        for the bottom blobs if propagate_down is true.
   */
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) = 0;    // backward computation on the CPU
  /**
   * @brief Using the GPU device, compute the gradients for any parameters and
   *        for the bottom blobs if propagate_down is true.
   *        Fall back to Backward_cpu() if unavailable.
   */
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) {       // backward computation on the GPU
    // LOG(WARNING) << "Using CPU code as backup.";
    Backward_cpu(top, propagate_down, bottom);    // fall back to Backward_cpu() if not implemented
  }

  /**
   * Called by the parent Layer's SetUp to check that the number of bottom
   * and top Blobs provided as input match the expected numbers specified by
   * the {ExactNum,Min,Max}{Bottom,Top}Blobs() functions.
   */
  virtual void CheckBlobCounts(const vector<Blob<Dtype>*>& bottom,
                               const vector<Blob<Dtype>*>& top) {   // check that the numbers of bottom and top blobs meet the layer's requirements
    if (ExactNumBottomBlobs() >= 0) {
      CHECK_EQ(ExactNumBottomBlobs(), bottom.size())    // an exact bottom blob count is specified; check for equality
          << type() << " Layer takes " << ExactNumBottomBlobs()
          << " bottom blob(s) as input.";
    }
    if (MinBottomBlobs() >= 0) {
      CHECK_LE(MinBottomBlobs(), bottom.size())         // a minimum bottom blob count is specified; check that bottom.size() is at least that
          << type() << " Layer takes at least " << MinBottomBlobs()
          << " bottom blob(s) as input.";
    }
    if (MaxBottomBlobs() >= 0) {
      CHECK_GE(MaxBottomBlobs(), bottom.size())         // a maximum bottom blob count is specified; check that bottom.size() does not exceed it
          << type() << " Layer takes at most " << MaxBottomBlobs()
          << " bottom blob(s) as input.";
    }
    if (ExactNumTopBlobs() >= 0) {
      CHECK_EQ(ExactNumTopBlobs(), top.size())          // an exact top blob count is specified; check for equality
          << type() << " Layer produces " << ExactNumTopBlobs()
          << " top blob(s) as output.";
    }
    if (MinTopBlobs() >= 0) {
      CHECK_LE(MinTopBlobs(), top.size())               // a minimum top blob count is specified; check that top.size() is at least that
          << type() << " Layer produces at least " << MinTopBlobs()
          << " top blob(s) as output.";
    }
    if (MaxTopBlobs() >= 0) {
      CHECK_GE(MaxTopBlobs(), top.size())               // a maximum top blob count is specified; check that top.size() does not exceed it
          << type() << " Layer produces at most " << MaxTopBlobs()
          << " top blob(s) as output.";
    }
    if (EqualNumBottomTopBlobs()) {
      CHECK_EQ(bottom.size(), top.size())               // equal numbers of bottom and top blobs are required; check for equality
          << type() << " Layer produces one top blob as output for each "
          << "bottom blob input.";
    }
  }

  /**
   * Called by SetUp to initialize the weights associated with any top blobs in
   * the loss function. Store non-zero loss weights in the diff blob.
   */
  inline void SetLossWeights(const vector<Blob<Dtype>*>& top) {     // set the loss weight associated with each top blob
    const int num_loss_weights = layer_param_.loss_weight_size();   // number of loss weights specified in layer_param_
    if (num_loss_weights) {
      CHECK_EQ(top.size(), num_loss_weights) << "loss_weight must be "
          "unspecified or specified once per top blob.";            // check that the number of loss weights equals the number of top blobs
      for (int top_id = 0; top_id < top.size(); ++top_id) {
        const Dtype loss_weight = layer_param_.loss_weight(top_id); // loss weight of the top_id-th top blob
        if (loss_weight == Dtype(0)) { continue; }                  // skip if the loss weight is 0
        this->set_loss(top_id, loss_weight);                        // store the weight in loss_: loss_[top_id] = loss_weight
        const int count = top[top_id]->count();                     // number of elements in the top_id-th top blob
        Dtype* loss_multiplier = top[top_id]->mutable_cpu_diff();   // pointer to the diff data of the top_id-th top blob
        caffe_set(count, loss_weight, loss_multiplier);   // fill the top blob's diff with the weight: loss_multiplier[i] = loss_weight for i in [0, count)
      }
    }
  }

 private:
  DISABLE_COPY_AND_ASSIGN(Layer);
};  // class Layer

// Forward and backward wrappers. You should implement the cpu and
// gpu specific implementations instead, and should not change these
// functions.
template <typename Dtype>
inline Dtype Layer<Dtype>::Forward(const vector<Blob<Dtype>*>& bottom,    // forward pass: compute the top blobs and the corresponding loss from the bottom blobs
    const vector<Blob<Dtype>*>& top) {
  Dtype loss = 0;
  Reshape(bottom, top);         // first adjust the shapes of the top blobs and the internal buffers
  switch (Caffe::mode()) {      // the mode Caffe is currently running in
  case Caffe::CPU:
    Forward_cpu(bottom, top);   // in CPU mode, call Forward_cpu() to run the forward computation
    for (int top_id = 0; top_id < top.size(); ++top_id) {   // for each top blob
      if (!this->loss(top_id)) { continue; }                // skip if the loss weight is 0
      const int count = top[top_id]->count();               // number of elements in the blob
      const Dtype* data = top[top_id]->cpu_data();          // CPU pointer to the blob's data
      const Dtype* loss_weights = top[top_id]->cpu_diff();  // the blob's loss weights
      // loss_weights is non-zero only in loss layers, where the top blob's data_ holds the error values; so data here contains the layer's error, and the dot product gives the total loss
      loss += caffe_cpu_dot(count, data, loss_weights);     // dot product of data and loss_weights gives the loss value
    }
    break;
  case Caffe::GPU:              // Caffe is running in GPU mode
    Forward_gpu(bottom, top);   // call Forward_gpu() to run the forward computation
#ifndef CPU_ONLY
    for (int top_id = 0; top_id < top.size(); ++top_id) {   // analogous to the CPU case: use the top blobs' GPU data to compute the total loss
      if (!this->loss(top_id)) { continue; }
      const int count = top[top_id]->count();
      const Dtype* data = top[top_id]->gpu_data();
      const Dtype* loss_weights = top[top_id]->gpu_diff();
      Dtype blob_loss = 0;
      caffe_gpu_dot(count, data, loss_weights, &blob_loss);
      loss += blob_loss;
    }
#endif
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
  return loss;    // return the loss value
}

template <typename Dtype>
inline void Layer<Dtype>::Backward(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) {           // run the back-propagation pass
  switch (Caffe::mode()) {      // dispatch to the matching implementation according to the mode
  case Caffe::CPU:
    Backward_cpu(top, propagate_down, bottom);
    break;
  case Caffe::GPU:
    Backward_gpu(top, propagate_down, bottom);
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
}

// Serialize LayerParameter to protocol buffer
// Write the layer's parameter blobs into BlobProto messages inside the LayerParameter
template <typename Dtype>
void Layer<Dtype>::ToProto(LayerParameter* param, bool write_diff) {
  param->Clear();         // clear all message data in param
  param->CopyFrom(layer_param_);    // copy the layer's layer_param_ message into param
  param->clear_blobs();   // clear the BlobProto messages in param
  for (int i = 0; i < blobs_.size(); ++i) {
    // call Blob<double/float>::ToProto() in blob.cpp to write the blob data into param as BlobProto messages
    blobs_[i]->ToProto(param->add_blobs(), write_diff);
  }
}
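One detail of the Forward() wrapper above is easy to miss: SetLossWeights() has already filled the diff of every loss-weighted top blob with its loss weight, so caffe_cpu_dot(count, data, loss_weights) is just that weight times the sum of the blob's output values. The following standalone sketch (using a plain std::vector in place of a Blob; the AccumulateLoss name is illustrative only, not a Caffe function) spells this out:

#include <vector>

// Sketch of what loss += caffe_cpu_dot(count, data, loss_weights) amounts to
// when every element of loss_weights holds the same loss_weight value.
template <typename Dtype>
Dtype AccumulateLoss(const std::vector<Dtype>& top_data, Dtype loss_weight) {
  Dtype loss = 0;
  for (size_t i = 0; i < top_data.size(); ++i) {
    loss += top_data[i] * loss_weight;  // data . diff, with diff[i] == loss_weight
  }
  return loss;  // for a one-element loss layer this is simply loss_weight * top_data[0]
}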

Summary

  1. The Layer class member vector<Dtype> loss_ stores, for each top blob, the corresponding loss weight rather than the actual loss value; the variable name seems a bit misleading.
  2. ToProto() first calls param->clear_blobs() and only then blobs_[i]->ToProto(). When a Layer is created, its parameter blobs are read from the BlobProto messages inside the incoming layer_param_ (blobs_[i]->FromProto(layer_param_.blobs(i))); at that point the BlobProto data in the LayerParameter and the contents of the Blob variables are identical. As the network is trained, however, the Blob variables (the learnable parameters) are continually updated while the BlobProto data in the LayerParameter is not, so the two drift apart. ToProto() therefore has to discard the stale BlobProto data and re-serialize the current Blob data into BlobProto form, as shown in the sketch below.
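A small sketch of point 2 (the layer and snapshot variable names are illustrative; only the ToProto() call itself comes from layer.hpp): after some training, serializing the layer again is what brings the BlobProto data back in sync with the learned Blob contents.

// Assuming `layer` points to a trained Layer<float> object.
LayerParameter snapshot;
layer->ToProto(&snapshot, /*write_diff=*/false);
// snapshot now holds a copy of layer_param_ whose old BlobProto entries have been
// cleared and replaced with the current contents of blobs_, i.e. the up-to-date
// learnable parameters.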

This is my first pass through the Caffe source code, and I am recording notes as I read, so my understanding and analysis of the code may contain errors or omissions. I welcome readers' corrections, and thank you for your support!

Origin: www.cnblogs.com/Relu110/p/11986637.html