[Notes] Deep learning - using the Caffe tool

This article summarizes what I have learned about using caffe for deep learning, and includes links to other people's study notes.

Introduction to caffe

Caffe was written by Yangqing Jia at UC Berkeley. It is a C++/CUDA framework that offers command-line, Python, and MATLAB interfaces and can run on CPU or GPU.

File structure of a Caffe project

caffe (the source tree to be compiled; it contains the core C++ code for data input, network layers, computation, etc.)
data (the data to process or transform)

models:

train.prototxt (the network model)
solver.prototxt (the training parameters)
xxx.caffemodel (initial weights used when fine-tuning; not needed when training a new model from scratch)

scripts (code for training the network, which can be python files, shell files, etc.)

caffe installation reference: https://www.jianshu.com/p/9e0a18608527

caffe network structure

Caffe uses the Blob array structure to store, exchange, and process network data (much as numpy uses ndarray), and uses Layers to define the network structure. Layer types include data layers, vision layers, and others.
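As a rough illustration of the Blob layout: for image data a blob is conventionally a 4-D array of shape N x C x H x W (batch, channels, height, width) stored in row-major order. The helper names `blob_count` and `blob_offset` below are hypothetical; they only mimic the index arithmetic in plain Python and are not Caffe's own API.

```python
def blob_count(shape):
    """Total number of elements in a blob of the given (N, C, H, W) shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


def blob_offset(shape, n, c, h, w):
    """Flat row-major index of element (n, c, h, w) in an N x C x H x W blob."""
    _, channels, height, width = shape
    return ((n * channels + c) * height + h) * width + w


# A batch of 64 three-channel 224 x 224 images (illustrative shape).
shape = (64, 3, 224, 224)
print(blob_count(shape))               # number of values in the whole batch
print(blob_offset(shape, 1, 0, 0, 0))  # where the second image starts
```

The last dimension (width) varies fastest, so neighbouring pixels in a row sit next to each other in memory.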

The common layers are explained below, using code as examples.

train.prototxt (the following is part of the code of the VGG16 model)

data layer

name: "VGG16"   
layer {   
  name: "data"   
  type: "Data"  # input data type
  top: "data"   
  top: "label"   
  include {   
    phase: TRAIN   
  }
  # data preprocessing and augmentation
  transform_param {   
    mirror: true   
    crop_size: 224   
    mean_value: 103.939   
    mean_value: 116.779   
    mean_value: 123.68   
  }   
  data_param {   
    source: "data/ilsvrc12_shrt_256/ilsvrc12_train_leveldb"  # path to the database
    batch_size: 64  # number of samples fed to the network per iteration
    backend: LEVELDB  # database backend: LEVELDB or LMDB
  }   
}
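To make the transform_param options more concrete, here is a simplified sketch of a random crop, an optional horizontal mirror, and per-channel mean subtraction. The function `transform` and its arguments are hypothetical and work on nested Python lists; Caffe's own data transformer is written in C++ and differs in detail.

```python
import random


def transform(image, crop_size, mean_values, mirror=True, rng=random):
    """Crop, optionally mirror, and mean-subtract a C x H x W nested list.

    Assumes crop_size is no larger than the image height and width.
    """
    height, width = len(image[0]), len(image[0][0])
    # Training-style random crop position.
    top = rng.randint(0, height - crop_size)
    left = rng.randint(0, width - crop_size)
    # Flip horizontally with probability 0.5 when mirroring is enabled.
    flip = mirror and rng.random() < 0.5
    output = []
    for channel, mean in zip(image, mean_values):
        rows = []
        for r in range(top, top + crop_size):
            row = [channel[r][c] - mean for c in range(left, left + crop_size)]
            if flip:
                row.reverse()
            rows.append(row)
        output.append(rows)
    return output


# Toy 3-channel 5 x 5 image, cropped to 3 x 3 with the mean values used above.
image = [[[float(r * 5 + c) for c in range(5)] for r in range(5)] for _ in range(3)]
patch = transform(image, 3, [103.939, 116.779, 123.68], rng=random.Random(0))
print(len(patch), len(patch[0]), len(patch[0][0]))
```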

Input data types supported by Caffe:

type	data source
Data	LMDB/LevelDB database
MemoryData	in-memory data
HDF5Data	HDF5 files
ImageData	image files
WindowData	windows (regions of images)

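top: where the layer's output goes; bottom: where the layer's input comes from (usually named after the layer that produced it). A layer can have several tops and bottoms.

Note: a data layer has at least one top named data. If there is a second top, it is usually named label.

LMDB reference: https://zhuanlan.zhihu.com/p/23485774

convolution layer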

layer {  
  bottom: "data"  
  top: "conv1_1"  
  name: "conv1_1"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 64  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "gaussian"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}

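parameter	description
num_output	number of convolution kernels (filters)
kernel_size	kernel height/width (height and width can also be set separately)
weight_filler	weight initialization scheme
bias_term	whether to add a bias term to the convolution output
pad	number of zero pixels padded around the image
stride	sliding step
group	number of groups for grouped convolution
lr_mult	learning-rate multiplier (the effective learning rate is lr_mult times base_lr from solver.prototxt)
decay_mult	weight-decay multiplier (applied to weight_decay from solver.prototxt)
dropout_ratio	probability of dropping an activation (a parameter of the Dropout layer, not of Convolution)

dropout_ratio and decay_mult are set to prevent overfitting.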
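A small worked example of how the multipliers combine with the solver settings. In the convolution layer above, the first param block applies to the weights and the second to the bias. The helper `effective_rates` is a hypothetical name, and the solver values (base_lr: 1e-5, weight_decay: 0.02) are taken from the solver file shown later in these notes.

```python
def effective_rates(base_lr, weight_decay, lr_mult, decay_mult):
    """Return the (learning rate, weight decay) applied to one parameter blob."""
    return base_lr * lr_mult, weight_decay * decay_mult


# Weights: lr_mult 1, decay_mult 1. Bias: lr_mult 2, decay_mult 0.
weights_lr, weights_decay = effective_rates(1e-5, 0.02, lr_mult=1, decay_mult=1)
bias_lr, bias_decay = effective_rates(1e-5, 0.02, lr_mult=2, decay_mult=0)
print(weights_lr, weights_decay)
print(bias_lr, bias_decay)
```

So the bias learns at twice the base rate and is not regularized, while the weights use the solver values unchanged.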

pooling layer

layer {
  bottom: "conv1_2"
  top: "pool1"
  name: "pool1"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

parameter	description
pool	pooling method: MAX (max pooling), AVE (average pooling), STOCHASTIC (stochastic pooling)
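The MAX and AVE methods can be sketched in a few lines of plain Python. The function `pool2d` is a hypothetical helper operating on a single 2-D channel; it assumes the windows tile the input exactly and leaves out STOCHASTIC pooling.

```python
def pool2d(image, kernel_size, stride, method="MAX"):
    """Pool a 2-D list of numbers; windows are assumed to fit exactly."""
    height, width = len(image), len(image[0])
    output = []
    for top in range(0, height - kernel_size + 1, stride):
        row = []
        for left in range(0, width - kernel_size + 1, stride):
            # Collect the values covered by the current window.
            window = [image[top + i][left + j]
                      for i in range(kernel_size)
                      for j in range(kernel_size)]
            if method == "MAX":
                row.append(max(window))
            elif method == "AVE":
                row.append(sum(window) / len(window))
            else:
                raise ValueError("unsupported pooling method: " + method)
        output.append(row)
    return output


image = [[1, 2, 5, 6],
         [3, 4, 7, 8],
         [9, 10, 13, 14],
         [11, 12, 15, 16]]
print(pool2d(image, kernel_size=2, stride=2, method="MAX"))  # [[4, 8], [12, 16]]
print(pool2d(image, kernel_size=2, stride=2, method="AVE"))  # [[2.5, 6.5], [10.5, 14.5]]
```

With a 2 x 2 window and stride 2, each spatial dimension is halved.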

activation layer

layer {  
  bottom: "conv2_2"  
  top: "conv2_2"  
  name: "relu2_2"  
  type: "ReLU"  # a common activation function; Sigmoid is also available
}

loss layer

layer {  
  bottom: "fc8"  
  bottom: "label"  
  top: "loss"  
  name: "loss"  
  type: "SoftmaxWithLoss"  
}

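type	description
SoftmaxWithLoss	softmax followed by cross-entropy loss
Softmax	softmax output only (class probabilities for multi-class classification, no loss)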
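The relationship between the two types can be illustrated for a single sample. The functions below are hypothetical stand-ins written in plain Python, not Caffe code: `softmax` turns raw scores (such as the fc8 output) into probabilities, and `softmax_with_loss` takes the negative log of the probability assigned to the true label. In a real network the loss is also averaged over the batch, which this sketch leaves out.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_with_loss(scores, label):
    """Cross-entropy loss of one sample: -log(probability of the true class)."""
    return -math.log(softmax(scores)[label])


scores = [2.0, 1.0, 0.1]  # made-up scores for three classes
print(softmax(scores))
print(softmax_with_loss(scores, label=0))
```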

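Detailed information on each layer of a caffe network model: https://wenku.baidu.com/view/f77c73d02f60ddccdb38a025.html

Caffe model training

Network visualization

Once you have written your own prototxt file and want to check that the network is built correctly, you can use Netscope (an online caffe net visualization tool): http://ethereon.github.io/netscope/#/editor

Training parameter settings

The training parameters of a caffe model are set in solver.prototxt. The solver is the core of caffe: it alternates between the forward pass and backpropagation to update the parameters and drive the loss to a minimum.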

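net: "train_val.prototxt"
# number of test batches per test pass
test_iter: 833
# run a test pass every 1000 training iterations
test_interval: 1000
# print the training loss every 200 iterations, averaged over the last 100
display: 200
average_loss: 100
# step policy: multiply the learning rate by gamma every stepsize iterations
base_lr: 1e-5
lr_policy: "step"
gamma: 0.1
stepsize: 5000
momentum: 0.9
# clip gradients whose L2 norm exceeds this value
clip_gradients: 10000
# no gradient accumulation
iter_size: 1
max_iter: 80000
weight_decay: 0.02
# save a snapshot every 4000 iterations
snapshot: 4000
snapshot_prefix: "weight/VGG_item"
# skip the test pass before the first training iteration
test_initialization: false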

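parameter	description
train_net	network model used for training (an absolute path is best)
test_net	network model used for testing
test_iter	number of test iterations per test pass (test_iter * batch_size = number of test samples)
base_lr	base learning rate
lr_policy	how the learning rate changes during training
weight_decay	weight decay coefficient
momentum	weight given to the previous gradient update
max_iter	maximum number of iterations
snapshot	interval (in iterations) between saved models
snapshot_prefix	path and filename prefix for saved models
solver_mode	whether to run on CPU or GPU
average_loss	number of forward passes whose loss is averaged for display
type	optimization algorithm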
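For the "step" lr_policy used in the solver file above, the learning rate is base_lr multiplied by gamma once for every stepsize iterations completed. A minimal sketch of that schedule, using a hypothetical helper name `step_lr` and the values base_lr 1e-5, gamma 0.1, stepsize 5000:

```python
def step_lr(iteration, base_lr, gamma, stepsize):
    """Learning rate at a given iteration under the "step" policy."""
    return base_lr * gamma ** (iteration // stepsize)


# Print the rate at a few iterations up to max_iter (80000).
for iteration in (0, 4999, 5000, 10000, 80000):
    print(iteration, step_lr(iteration, base_lr=1e-5, gamma=0.1, stepsize=5000))
```

With these settings the rate shrinks tenfold every 5000 iterations, so by the end of 80000 iterations it is many orders of magnitude below base_lr; a larger stepsize may be worth considering if training stalls early.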

train the network

1. Command line

We can train the network from the command line:

./build/tools/caffe train -solver solver.prototxt

(For details, please refer to: "caffe command line analysis" https://www.cnblogs.com/denny402/p/5076285.html )
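When the training command is launched from a script, it can help to assemble the arguments programmatically. The sketch below builds the argument list for the caffe tool, optionally adding -weights (initial weights for fine-tuning) and -gpu. The function name `caffe_train_command` and the file names are placeholders, and the binary path assumes the command is run from the Caffe root directory.

```python
def caffe_train_command(solver, weights=None, gpu=None,
                        caffe_bin="./build/tools/caffe"):
    """Build the argument list for `caffe train`."""
    command = [caffe_bin, "train", "-solver", solver]
    if weights is not None:
        # Initialize from an existing .caffemodel (fine-tuning).
        command += ["-weights", weights]
    if gpu is not None:
        # Select the GPU device id to train on.
        command += ["-gpu", str(gpu)]
    return command


command = caffe_train_command("solver.prototxt", weights="xxx.caffemodel", gpu=0)
print(" ".join(command))
# To actually launch training: import subprocess; subprocess.run(command, check=True)
```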

2. Python

We can also use caffe's Python interface to write a program that trains the network.
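A minimal sketch of such a program, assuming Caffe was built with its Python module (pycaffe) and that it is importable. The function name `train_with_pycaffe` and the file names are placeholders; a real script would usually step the solver in a loop and log the loss rather than call solve() once.

```python
def train_with_pycaffe(solver_path, weights_path=None, use_gpu=False, gpu_id=0):
    """Train a network through pycaffe and return the solver object."""
    import caffe  # imported here so the file loads even without Caffe installed

    if use_gpu:
        caffe.set_device(gpu_id)
        caffe.set_mode_gpu()
    else:
        caffe.set_mode_cpu()

    # Build the solver (and its network) from the solver definition file.
    solver = caffe.get_solver(solver_path)
    if weights_path is not None:
        # Start from pretrained weights, as when fine-tuning.
        solver.net.copy_from(weights_path)
    # Run the optimization until max_iter is reached.
    solver.solve()
    return solver
```

For example, train_with_pycaffe("solver.prototxt", weights_path="xxx.caffemodel", use_gpu=True) would fine-tune from an existing model on GPU 0.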

Symbols used when computing the output size of a convolution layer:

parameter	description
W	input image size
F	convolution kernel size (kernel_size)
S	stride (stride)
P	padding size (pad)
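These symbols are the ones used in the usual output-size relation for a convolution layer, output = floor((W - F + 2P) / S) + 1. The formula itself is not written out in these notes, so treat it as the standard relation the table appears to refer to. A short sketch with a hypothetical helper name:

```python
def conv_output_size(w, f, s=1, p=0):
    """Output width (or height) of a convolution: floor((W - F + 2P) / S) + 1."""
    return (w + 2 * p - f) // s + 1


# conv1_1 of VGG16: 224-pixel input, 3 x 3 kernel, stride 1, pad 1.
print(conv_output_size(224, 3, s=1, p=1))  # 224
```

With kernel_size 3, pad 1, and stride 1, the spatial size is unchanged, which is why the VGG16 convolution layers keep the feature map size and only the pooling layers reduce it.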