Caffe network configuration: each computing layer
convolutional layer
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1  # learning-rate multiplier: effective lr = base_lr (from the solver) * lr_mult
  }
  param {
    lr_mult: 2  # with two lr_mult entries, the first applies to the weights and the
                # second to the bias; by convention the bias learning rate is twice
                # the weight learning rate
  }
  convolution_param {
    num_output: 32  # number of convolution kernels (filters)
    pad: 2          # padding; default is 0
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"  # weight initialization; default is "constant" (all zeros),
      std: 0.0001       # "xavier" is also a common starting choice
    }
    bias_filler {
      type: "constant"  # bias initialization; "constant" defaults to all zeros
    }
  }
}
Input: n*c0*w0*h0
Output: n*c1*w1*h1
c1 = num_output, the number of feature maps
w1 = (w0 + 2*pad - kernel_size)/stride + 1
h1 = (h0 + 2*pad - kernel_size)/stride + 1
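As a sanity check, the output-size formula above can be evaluated directly (a minimal sketch; the function name is mine, not part of Caffe):

```python
def conv_output_size(in_size, pad, kernel_size, stride):
    # w1 = (w0 + 2*pad - kernel_size) / stride + 1  (integer division)
    return (in_size + 2 * pad - kernel_size) // stride + 1

# conv1 above: 28x28 input, pad=2, kernel_size=5, stride=1
# (28 + 2*2 - 5)/1 + 1 = 28, so the spatial size is preserved
print(conv_output_size(28, 2, 5, 1))  # -> 28
```

With pad=2 and kernel_size=5 the padding exactly compensates for the kernel, which is why conv1 keeps the input resolution.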
accuracy layer
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST  # accuracy is only computed during the test phase
  }
}
classification layer
layer {
  name: "loss"
  type: "SoftmaxWithLoss"  # outputs the loss value (softmax followed by multinomial logistic loss)
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
layer {
  name: "prob"
  type: "Softmax"  # outputs class probabilities; takes only one bottom (no label)
  bottom: "ip2"
  top: "prob"
}
dimensional transformation layer
layer {
  name: "reshape"
  type: "Reshape"  # changes the dimensions (shape) without changing the data values
  bottom: "input"
  top: "output"
  reshape_param {
    shape {
      # 0: keep that dimension unchanged; -1: infer the dimension automatically;
      # any other value: set the dimension to that value
      dim: 0
      dim: 0
      dim: 0
      dim: -1
    }
  }
}
For example, data of shape (32*3*28*28) with shape{0, 0, 14, -1} becomes (32*3*14*56).
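The same transformation can be reproduced with NumPy to verify the inferred dimension (a sketch; Caffe itself is not required):

```python
import numpy as np

data = np.zeros((32, 3, 28, 28))
# shape { dim: 0  dim: 0  dim: 14  dim: -1 }:
# keep the first two dims, set the third to 14, infer the last with -1
reshaped = data.reshape(data.shape[0], data.shape[1], 14, -1)
print(reshaped.shape)  # -> (32, 3, 14, 56)
```

The inferred dimension is 28*28/14 = 56, so the total number of elements is unchanged.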
dropout layer
# dropout randomly zeroes activations during training to help prevent overfitting
layer {
  name: "drop"
  type: "Dropout"
  bottom: "ip2"  # in-place: bottom and top are the same blob
  top: "ip2"
  dropout_param {
    dropout_ratio: 0.5  # probability of dropping each unit
  }
}
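The mechanism behind the layer can be sketched in NumPy (a minimal illustration of inverted dropout, not Caffe's implementation; the function name is mine):

```python
import numpy as np

def dropout_forward(x, ratio=0.5, train=True, rng=None):
    # During training, zero each unit with probability `ratio` and rescale the
    # survivors by 1/(1 - ratio) so the expected activation is unchanged;
    # at test time the layer is a no-op.
    if not train:
        return x
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= ratio
    return x * mask / (1.0 - ratio)

x = np.ones((2, 4))
print(dropout_forward(x, train=False))  # unchanged at test time
```

Because dropout is applied in place on "ip2" in the config above, no extra memory is needed for a separate output blob.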
activation function layer
These layers are simple: they apply an element-wise nonlinearity, so usually only the layer type needs to be specified.
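For example, a typical ReLU layer (applied in place, following the same convention as the dropout layer above):

```
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"  # in-place: bottom and top are the same blob
  top: "conv1"
}
```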
pooling layer
The configuration is similar to the convolutional layer (kernel_size, stride, pad), and the output size follows the same kind of formula.
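A typical max-pooling configuration (a representative example, not taken from the network above):

```
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX        # MAX, AVE, or STOCHASTIC
    kernel_size: 3
    stride: 2
  }
}
```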