[Notes] Deep learning - caffe tool usage

This article summarizes the main points I have collected while using caffe for deep learning, with links to other people's study notes.

Introduction to caffe

Caffe was created by Yangqing Jia at UC Berkeley. It is a C++/CUDA framework that provides command-line, Python, and MATLAB interfaces and can run on either CPU or GPU.

File structure of a Caffe project

  • caffe (the source to compile, containing the core C++ code for data input, network layers, computation, etc.)
  • data (the data to process or transform)
  • models:
    train.prototxt (the network model definition)
    solver.prototxt (the set of training parameters)
    xxx.caffemodel (initialization weights used for fine-tuning; not needed when training a new model)

  • scripts (code for training the network, e.g. Python or shell scripts)

caffe installation reference: https://www.jianshu.com/p/9e0a18608527

caffe network structure

Caffe uses the Blob structure to store, exchange, and process the data flowing through the network (much as numpy's storage structure is the ndarray), and uses Layers to define the network structure; layer types include data layers, vision layers, and others.
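To make the Blob/ndarray analogy concrete, here is a minimal pycaffe sketch that loads a network and prints the shape of every blob; the deploy.prototxt path is a placeholder for your own model definition:

    # Minimal sketch: inspect Blobs through pycaffe (the path is a placeholder).
    import caffe

    net = caffe.Net('deploy.prototxt', caffe.TEST)  # model definition only, no weights
    for name, blob in net.blobs.items():
        print(name, blob.data.shape)  # blob.data is an N-dimensional numpy array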

caffe model

The common layer types are explained below using code examples.

train.prototxt (the following is an excerpt from the VGG16 model)

  • data layer
    name: "VGG16"   
    layer {   
      name: "data"   
      type: "Data" #Input data type 
      top: "data"   
      top: "label"   
      include {   
        phase: TRAIN   
      }   #Data 
      preprocessing to enhance data 
      transform_param {   
        mirror: true   
        crop_size: 224   
        mean_value: 103.939   
        mean_value: 116.779   
        mean_value: 123.68   
      }   
      data_param {   
        source: "data/ilsvrc12_shrt_256/ilsvrc12_train_leveldb" #Database file path 
        batch_size: 64 #Number of network single input data 
        backend: LEVELDB #Select whether to use LevelDB or LMDB 
      }   
    }
    

Caffe supports the following input data types:

type         data source
Data         LMDB/LevelDB
MemoryData   in-memory data
HDF5Data     HDF5 files
ImageData    image files
WindowData   image windows

top indicates a layer's output, and bottom indicates the source of its input (the name of the producing layer); a layer can have multiple tops and bottoms.

Note: a data layer has at least one top named data. If there is a second top, it is usually named label.

LMDB reference: https://zhuanlan.zhihu.com/p/23485774
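As a companion to that reference, below is a rough sketch of packing images into LMDB for the Data layer, using the lmdb package and caffe's Datum protobuf; the images list and the output path are placeholders:

    # Rough sketch: write (image, label) pairs into LMDB for a Data layer.
    import lmdb
    import numpy as np
    from caffe.proto import caffe_pb2

    # Placeholder data: HxWx3 uint8 arrays with integer labels.
    images = [(np.zeros((256, 256, 3), dtype=np.uint8), 0)]

    env = lmdb.open('ilsvrc12_train_lmdb', map_size=1 << 40)
    with env.begin(write=True) as txn:
        for i, (img, label) in enumerate(images):
            datum = caffe_pb2.Datum()
            datum.channels, datum.height, datum.width = 3, img.shape[0], img.shape[1]
            datum.data = img.transpose(2, 0, 1).tobytes()  # HWC -> CHW bytes
            datum.label = int(label)
            txn.put('{:08d}'.format(i).encode(), datum.SerializeToString())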

  • Convolution layer
    layer {  
      bottom: "data"  
      top: "conv1_1"  
      name: "conv1_1"  
      type: "Convolution"  
      param {  
        lr_mult: 1  
        decay_mult: 1  
      }  
      param {  
        lr_mult: 2  
        decay_mult: 0  
      }  
      convolution_param {  
        num_output: 64  
        pad: 1  
        kernel_size: 3  
        weight_filler {  
          type: "gaussian"  
          std: 0.01  
        }  
        bias_filler {  
          type: "constant"  
          value: 0  
        }  
      }  
    }
    
Parameters:
num_output      number of convolution kernels
kernel_size     kernel height/width (height and width can also be set separately)
weight_filler   weight initialization scheme
bias_term       whether to add a bias term to the convolution output
pad             number of zero-padding pixels around the image border
stride          sliding stride
group           number of groups for grouped convolution
lr_mult         learning-rate multiplier (the effective rate is base_lr from solver.prototxt times lr_mult)
decay_mult      weight-decay multiplier
dropout_ratio   probability of dropping a unit (used by Dropout layers)

dropout_ratio and decay_mult are set to help prevent overfitting.
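The same convolution layer can also be generated from Python with pycaffe's NetSpec, which emits prototxt text. A minimal sketch, with an illustrative Input shape standing in for the real data layer:

    # Minimal NetSpec sketch that emits prototxt like the block above.
    import caffe
    from caffe import layers as L

    n = caffe.NetSpec()
    n.data = L.Input(input_param=dict(shape=dict(dim=[64, 3, 224, 224])))
    n.conv1_1 = L.Convolution(n.data, num_output=64, pad=1, kernel_size=3,
                              param=[dict(lr_mult=1, decay_mult=1),
                                     dict(lr_mult=2, decay_mult=0)],
                              weight_filler=dict(type='gaussian', std=0.01),
                              bias_filler=dict(type='constant', value=0))
    print(n.to_proto())  # prints the equivalent prototxt text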

  • Pooling layer
    layer {
      bottom: "conv1_2"
      top: "pool1"
      name: "pool1"
      type: "Pooling"
      pooling_param {
        pool: MAX         # max pooling
        kernel_size: 2
        stride: 2
      }
    }
    
Parameter:
pool   pooling method: MAX (max pooling), AVE (average pooling), or STOCHASTIC (stochastic pooling)
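A similar NetSpec sketch for pooling, showing where the pool mode goes (the input shape is illustrative):

    # Minimal NetSpec sketch of a max-pooling layer.
    import caffe
    from caffe import layers as L, params as P

    n = caffe.NetSpec()
    n.data = L.Input(input_param=dict(shape=dict(dim=[64, 64, 224, 224])))
    n.pool1 = L.Pooling(n.data, pool=P.Pooling.MAX,  # or P.Pooling.AVE / P.Pooling.STOCHASTIC
                        kernel_size=2, stride=2)
    print(n.to_proto())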
  • Activation layer

    layer {
      bottom: "conv2_2"
      top: "conv2_2"      # same top and bottom: ReLU is applied in place
      name: "relu2_2"
      type: "ReLU"        # commonly used activation; Sigmoid is another option
    }
    
  • Loss layer

    layer {  
      bottom: "fc8"  
      bottom: "label"  
      top: "loss"  
      name: "loss"  
      type: "SoftmaxWithLoss"  
    }
    
type
SoftmaxWithLoss   softmax followed by multinomial cross-entropy loss
Softmax           multi-class softmax probabilities only (no loss term)
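To make the distinction concrete, here is a small numpy sketch of what SoftmaxWithLoss computes for one sample, with made-up logits:

    # Numeric sketch of SoftmaxWithLoss for a single sample.
    import numpy as np

    scores = np.array([2.0, 1.0, 0.1])  # made-up fc8 outputs (logits)
    label = 0
    p = np.exp(scores) / np.exp(scores).sum()  # softmax probabilities
    loss = -np.log(p[label])                   # multinomial cross-entropy
    print(p, loss)  # p[0] is about 0.659, loss about 0.417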

Detailed notes on each caffe layer type: https://wenku.baidu.com/view/f77c73d02f60ddccdb38a025.html

Caffe model training

  • Network visualization

Once you have written your own prototxt file and want to check that the network is assembled correctly, you can use Netscope, an online caffe net visualization tool: http://ethereon.github.io/netscope/#/editor

(Figure: the VGG16 model visualized in Netscope)

  • Training parameter settings

A caffe model's training parameters live in solver.prototxt. This file is the core of caffe training: the solver alternately invokes the forward pass and back-propagation to update the parameters and drive the loss to a minimum.

net: "train_val.prototxt"
test_iter: 833
# make test net, but don't invoke it from the solver itself
test_interval: 1000
display: 200
average_loss: 100
base_lr: 1e-5
lr_policy: "step"
gamma: 0.1
stepsize: 5000
# lr for unnormalized softmax -- see train_val definition
# high momentum
momentum: 0.9
# no gradient accumulation
clip_gradients: 10000
iter_size: 1
max_iter: 80000
weight_decay: 0.02
snapshot: 4000
snapshot_prefix: "weight/VGG_item"
test_initialization: false

Parameters:
train_net         network definition used for training (an absolute path is safest)
test_net          network definition used for testing
test_iter         number of test iterations (test_iter × batch_size should equal the size of the test set)
base_lr           base learning rate
lr_policy         learning-rate decay policy
weight_decay      weight decay
momentum          weight given to the previous gradient update
max_iter          maximum number of iterations
snapshot          interval (in iterations) between saved model snapshots
snapshot_prefix   path + prefix for saved snapshots
solver_mode       whether to run on CPU or GPU
average_loss      average the loss over this many forward passes when displaying output
type              optimization algorithm
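For lr_policy: "step", the effective learning rate at iteration t is base_lr × gamma^⌊t/stepsize⌋. With the settings above, the rate therefore drops tenfold every 5000 iterations: 1e-5 until iteration 4999, 1e-6 from 5000, 1e-7 from 10000, and so on.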
  • Training the network

1. Command line

We can train the network directly from the command line:

    ./build/tools/caffe train -solver solver.prototxt

(For details, see "caffe command line analysis": https://www.cnblogs.com/denny402/p/5076285.html)

2. Python

We can also write programs against caffe's Python interface (pycaffe) to train the network.
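A minimal training sketch through pycaffe; the solver file name and GPU id are assumptions, so adjust them to your project:

    # Minimal pycaffe training sketch (file name and GPU id are assumptions).
    import caffe

    caffe.set_device(0)   # GPU 0; use caffe.set_mode_cpu() instead if needed
    caffe.set_mode_gpu()

    solver = caffe.SGDSolver('solver.prototxt')  # loads net and solver settings

    # Either run the full schedule from solver.prototxt...
    solver.solve()

    # ...or step manually to watch the loss:
    # for it in range(1000):
    #     solver.step(1)
    #     if it % 100 == 0:
    #         print('iter', it, 'loss', float(solver.net.blobs['loss'].data))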

Other problems

  • I/O size

When training or fine-tuning a network on your own data, the sizes of img and label_img may differ. In that case, carefully check the input and output sizes of each layer printed during training, and adjust the corresponding parameters until the network's final output has the same size as label_img.

Output size formula for convolution and pooling layers:

(W - F + 2P)/S + 1

parameter   meaning
W           input image size
F           kernel size (kernel_size)
S           stride (stride)
P           padding (pad)
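As a sanity check, conv1_1 in the VGG16 excerpt above uses kernel_size 3, pad 1, and the default stride 1 on a 224×224 crop: (224 - 3 + 2×1)/1 + 1 = 224, so the convolution preserves the spatial size.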
