In the previous post, Caffe: Caffe Dependencies, we covered Caffe's dependencies. In this one we use Caffe to build a LeNet network and train it to classify the MNIST dataset.
Preprocessing
Downloading the data
Caffe ships a script for downloading MNIST, located at data/mnist/get_mnist.sh; running it produces four files:
- train-images-idx3-ubyte: training images
- train-labels-idx1-ubyte: labels for the training images
- t10k-images-idx3-ubyte: test images
- t10k-labels-idx1-ubyte: labels for the test images
Converting the data format
The four files above are raw binary and need to be converted into a key-value store, i.e. the LMDB format. Run the script ./examples/mnist/create_mnist.sh to perform the conversion, which produces the following dataset layout:
.
├── mnist_test_lmdb
│   ├── data.mdb
│   └── lock.mdb
└── mnist_train_lmdb
    ├── data.mdb
    └── lock.mdb
The conversion source code lives at caffe/examples/mnist/convert_mnist_data.cpp.
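For reference, each of the four downloaded files is a simple IDX binary: a magic number encoding the element type and the number of dimensions, big-endian dimension sizes, then the raw bytes. A minimal sketch of parsing that header (this is just the file format, not Caffe's converter code):

```python
import struct

def read_idx_header(buf):
    """Parse an MNIST IDX header (big-endian).

    Layout: two zero bytes, one dtype code (0x08 = unsigned byte),
    one byte giving the number of dimensions, then one uint32 per
    dimension; the raw pixel/label bytes follow.
    """
    _, dtype, ndim = struct.unpack_from(">HBB", buf, 0)
    dims = struct.unpack_from(">" + "I" * ndim, buf, 4)
    return dtype, dims

# Fake header imitating train-images-idx3-ubyte: 60000 images of 28x28
header = struct.pack(">HBBIII", 0, 0x08, 3, 60000, 28, 28)
print(read_idx_header(header))  # (8, (60000, 28, 28))
```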
The LeNet-5 model
With the data ready we can define the LeNet-5 model. Caffe models are described in .prototxt files, so create a new file lenet_trainval.prototxt and name the network LeNet:
name: "LeNet" # network name
Data layers
First we define the data layers, which read the data and output the images together with their labels.
Training and testing use different data sources and batch sizes, and the training phase may additionally apply transformations or augmentation, so the two phases need separate data layers.
The training-phase data layer first:
layer { # define a layer; the syntax is much like a struct
name: "mnist" # layer name
type: "Data" # this is a data layer
top: "data" # "top" is the layer's output
top: "label" # it outputs both data and label
include {
phase: TRAIN # used in the training phase only
}
transform_param {
scale: 0.00390625 # scaling factor applied to the input data
}
data_param { # data-layer parameters
source: "examples/mnist/mnist_train_lmdb" # path to the training set
batch_size: 64
backend: LMDB # storage backend
}
}
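The scale value above is not arbitrary: 0.00390625 is exactly 1/256, so it maps raw pixel intensities in [0, 255] into [0, 1), a range that trains well. A quick check:

```python
# transform_param's scale is 1/256: raw MNIST pixels are bytes in
# [0, 255], and scaling maps them into [0, 1).
scale = 0.00390625
assert scale == 1 / 256
print(0 * scale, 255 * scale)  # 0.0 0.99609375
```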
Next, the test-phase data layer:
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_test_lmdb"
batch_size: 100
backend: LMDB
}
}
Convolution layers
LeNet has two convolution layers. Only the first is shown here; the second differs slightly in its parameters, and the complete definition appears at the end:
layer {
name: "conv1"
type: "Convolution" # convolution layer
bottom: "data" # takes data as input
top: "conv1" # outputs conv1
param {
lr_mult: 1 # learning-rate multiplier for the weights
}
param {
lr_mult: 2 # learning-rate multiplier for the bias
}
convolution_param {
num_output: 20 # number of output channels
kernel_size: 5 # kernel size
stride: 1 # stride
weight_filler {
type: "xavier" # weight initialization scheme
}
bias_filler {
type: "constant" # bias initialization scheme
}
}
}
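As for the "xavier" filler: by default Caffe draws each weight uniformly from [-sqrt(3/n), +sqrt(3/n)], where n is the fan-in. A sketch of that rule (the function name here is ours, not Caffe's):

```python
import math
import random

def xavier_uniform(fan_in, count, rng=random.Random(0)):
    """Sketch of an "xavier"-style filler: uniform samples in
    [-sqrt(3/fan_in), +sqrt(3/fan_in)]."""
    bound = math.sqrt(3.0 / fan_in)
    return [rng.uniform(-bound, bound) for _ in range(count)]

# conv1: 1 input channel, 5x5 kernels -> fan_in = 1 * 5 * 5 = 25
weights = xavier_uniform(25, 20 * 25)
bound = math.sqrt(3.0 / 25)
assert all(-bound <= w <= bound for w in weights)
```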
Pooling layers
Likewise there are two pooling layers; only the first is shown:
layer {
name: "pool1"
type: "Pooling" # pooling layer
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX # max pooling
kernel_size: 2 # window size
stride: 2 # stride
}
}
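With no padding, each of these layers shrinks the spatial size to (in − kernel)/stride + 1. Tracing MNIST's 28×28 input through conv1, pool1, conv2, and pool2 shows why the first fully connected layer ends up seeing 50 × 4 × 4 = 800 inputs (a quick sanity check, not Caffe code):

```python
def out_size(n, k, s=1, p=0):
    # Spatial output size, no padding by default:
    # floor((n + 2p - k) / s) + 1
    return (n + 2 * p - k) // s + 1

n = 28                   # MNIST images are 28x28
n = out_size(n, 5)       # conv1, 5x5 stride 1 -> 24
n = out_size(n, 2, s=2)  # pool1, 2x2 stride 2 -> 12
n = out_size(n, 5)       # conv2 -> 8
n = out_size(n, 2, s=2)  # pool2 -> 4
print(n, 50 * n * n)     # 4 800
```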
Fully connected layers
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500 # number of neurons in this layer
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
ReLU layer
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
Accuracy layer
This layer takes the output of the last fully connected layer and the labels as input, and outputs the classification accuracy:
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
Loss layer
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
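For one sample, SoftmaxWithLoss turns the 10 logits from ip2 into probabilities with a softmax and returns the negative log-probability of the true class. A plain-Python sketch of that computation:

```python
import math

def softmax_with_loss(logits, label):
    """Softmax over the logits, then cross-entropy against the label."""
    m = max(logits)                        # subtract max for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return -math.log(exps[label] / total)

# Ten logits, one per digit class; the loss is small when the true
# class's logit dominates, large when it does not.
confident = softmax_with_loss([0.0] * 9 + [5.0], label=9)
wrong = softmax_with_loss([5.0] + [0.0] * 9, label=9)
print(confident < 0.1 < wrong)  # True
```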
Complete network definition
name: "LeNet"
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_train_lmdb"
batch_size: 64
backend: LMDB
}
}
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_test_lmdb"
batch_size: 100
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
Visualization
Caffe ships a script for visualizing network structures at caffe/python/draw_net.py. Run:
python caffe/python/draw_net.py lenet_trainval.prototxt mlp_train.png --rankdir BT
You may need pydot installed:
conda install -c https://conda.binstar.org/sstromberg pydot
Training hyperparameters
With the network defined we still need to specify the training hyperparameters. Create a new file lenet_solver.prototxt:
# Point at the network definition file, so training only needs the
# solver file path rather than the model path as well
net: "examples/mnist/lenet_trainval.prototxt"
# number of forward iterations per test pass
test_iter: 100
# run a test pass every 500 training iterations
test_interval: 500
# base learning rate, momentum, and weight decay
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# learning-rate decay policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# print a log line every 100 iterations
display: 100
# maximum number of iterations
max_iter: 10000
# write a snapshot every 5000 iterations
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# Caffe solver mode: CPU
solver_mode: CPU
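Under the "inv" policy, the effective learning rate at iteration t is base_lr × (1 + gamma·t)^(−power), so with the values above it decays smoothly from 0.01 down to roughly 0.006 by iteration 10000. A quick check of the schedule (not Caffe code):

```python
def inv_lr(base_lr, gamma, power, t):
    # "inv" learning-rate policy: base_lr * (1 + gamma * t) ** (-power)
    return base_lr * (1.0 + gamma * t) ** (-power)

# Values from lenet_solver.prototxt above
for t in (0, 5000, 10000):
    print(t, inv_lr(0.01, 0.0001, 0.75, t))
```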
Training and testing
Training
./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt
Training produces the weight file lenet_iter_10000.caffemodel and the solver-state file lenet_iter_10000.solverstate; both are binary protobuf files.
Testing
./build/tools/caffe test \
--model examples/mnist/lenet_trainval.prototxt \
--weights examples/mnist/lenet_iter_10000.caffemodel \
-iterations 100
Because the weight file is a binary blob, the matching model definition file must be loaded along with it. There are 10,000 test samples and batch_size is 100, so setting iterations to 100 covers the entire test set exactly once.
Usage of build/tools/caffe
caffe: command line brew
usage: caffe <command> <args>
commands:
train train or finetune a model
test score a model
device_query show GPU diagnostic information
time benchmark model execution time
Flags from /home/server2/yz/workspace/code/person_search/caffe/tools/caffe.cpp:
-gpu (Run in GPU mode on given device ID.) type: int32 default: -1
-iterations (The number of iterations to run.) type: int32 default: 50
-model (The model definition protocol buffer text file..) type: string
default: ""
-snapshot (Optional; the snapshot solver state to resume training.)
type: string default: ""
-solver (The solver definition protocol buffer text file.) type: string
default: ""
-weights (Optional; the pretrained weights to initialize finetuning. Cannot
be set simultaneously with snapshot.) type: string default: ""