Caffe training

1. train.bat

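..\caffe-windows-master\bin\caffe.exe // path to the caffe.exe produced when Caffe was built and configured

train // run the training command
--solver=.\model\solver.prototxt // solver configuration file (training hyperparameters)

--weights=.\model\bvlc_reference_caffenet.caffemodel // existing weights file; training fine-tunes starting from these weights

The // annotations are explanatory only. In the actual train.bat the pieces go on a single line without them.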

The available <args> options are:

-solver
-gpu
-snapshot
-weights
-iterations
-model
-sighup_effect
-sigint_effect
Note the leading - on each option. Their functions are:

-solver: required. A protocol buffer text file containing the solver configuration. For example:

# ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt
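-gpu: optional. Selects which GPU to run on by device id; '-gpu all' runs on every available GPU. GPU ids start at 0, so the following runs on the GPU with id 2 (the third device):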

# ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt -gpu 2
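-snapshot: optional. Resumes training from a snapshot. The snapshot interval is set in the solver configuration file, and each snapshot saves a solverstate file. For example: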

# ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt -snapshot examples/mnist/lenet_iter_5000.solverstate
-weights: optional. Fine-tunes the model starting from pretrained weights. It requires a caffemodel file and cannot be combined with -snapshot. For example:

# ./build/tools/caffe train -solver examples/finetuning_on_flickr_style/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
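-iterations: optional. Number of iterations to run, default 50. It applies to the test and time commands; for train, the iteration count comes from max_iter in the solver file.

-model: optional. The model definition, given as a protocol buffer text file. It can also be specified in the solver configuration file.

-sighup_effect: optional. Action to take when the process receives a hangup signal (SIGHUP): snapshot, stop or none. The default is snapshot.

-sigint_effect: optional. Action to take when the process receives a keyboard interrupt (Ctrl+C, SIGINT): snapshot, stop or none. The default is stop.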
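
The rules above (a solver is required for train, and -snapshot and -weights are mutually exclusive) can be sketched as a small command builder. This is only an illustration and not part of Caffe: the function name build_train_command and its parameters are hypothetical, and it assembles the argument list without running anything.

```python
def build_train_command(caffe_bin, solver, gpu=None, snapshot=None, weights=None):
    """Assemble the argument list for `caffe train` (hypothetical helper)."""
    if not solver:
        raise ValueError("-solver is required for training")
    if snapshot and weights:
        raise ValueError("use -snapshot to resume or -weights to fine-tune, not both")
    cmd = [caffe_bin, "train", "-solver", solver]
    if gpu is not None:          # gpu may be 0, "0,1" or "all"
        cmd += ["-gpu", str(gpu)]
    if snapshot:
        cmd += ["-snapshot", snapshot]
    if weights:
        cmd += ["-weights", weights]
    return cmd


cmd = build_train_command(
    "./build/tools/caffe",
    "examples/mnist/lenet_solver.prototxt",
    gpu=0,
    snapshot="examples/mnist/lenet_iter_5000.solverstate",
)
print(" ".join(cmd))
```

Passing both snapshot and weights raises an error here, mirroring the restriction described for -weights.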

 

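Those were examples of the train options. Now let's look at the other three <command> values:

test runs the test phase and prints the final results. In the model definition file we can choose whether to output accuracy or loss. To evaluate an already trained model on the validation set, we can write: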

# ./build/tools/caffe test -model examples/mnist/lenet_train_test.prototxt -weights examples/mnist/lenet_iter_10000.caffemodel -gpu 0 -iterations 100
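This example is fairly long: besides the test command, it uses the four options -model, -weights, -gpu and -iterations. It loads the trained weights (-weights) into the test model (-model) and runs 100 test iterations (-iterations) on the GPU with id 0 (-gpu).

time prints the program's running time to the screen. For example: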

# ./build/tools/caffe time -model examples/mnist/lenet_train_test.prototxt -iterations 10
This example prints the time the LeNet model takes for 10 iterations, including the forward and backward time of each iteration and the average forward and backward time of each layer.

# ./build/tools/caffe time -model examples/mnist/lenet_train_test.prototxt -gpu 0
This example prints the time the LeNet model takes for 50 iterations (the default) on the GPU.

# ./build/tools/caffe time -model examples/mnist/lenet_train_test.prototxt -weights examples/mnist/lenet_iter_10000.caffemodel -gpu 0 -iterations 10
This times 10 iterations of the LeNet model using the given weights on the first GPU.

device_query prints diagnostic information about a GPU.

# ./build/tools/caffe device_query -gpu 0
Finally, let's look at two GPU examples:

# ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt -gpu 0,1
# ./build/tools/caffe train -solver examples/mnist/lenet_solver.prototxt -gpu all
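These two examples run the computation in parallel on two or more GPUs, which is much faster. If you have only one GPU, or none, leave the -gpu option out: adding it does not help and can make the run slower.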

2. solver

# The train/test net protocol buffer definition
net: "train_res34.prototxt"
# test_iter specifies how many forward passes each test run carries out.
# For example, with a test batch size of 100, 100 test iterations
# cover 10,000 test images (the MNIST case).
test_iter: 100
# Carry out testing every 300 training iterations.
test_interval: 300
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.002
# Momentum speeds up gradient descent.
momentum: 0.95
weight_decay: 0.0001
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 300 iterations
display: 300
# The maximum number of iterations
max_iter: 1000000
# snapshot intermediate results
snapshot: 10000
snapshot_prefix: "save_path/"
# solver mode: CPU or GPU
solver_mode: GPU
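
To make the schedule implied by these settings concrete, the following sketch does the arithmetic. The test batch size of 100 is an assumption (it is set in the net definition, not in the solver), and the variable names are illustrative only.

```python
# Values copied from the solver file above.
test_iter = 100
test_interval = 300
max_iter = 1000000
snapshot = 10000

# Assumption: batch size of the TEST-phase data layer in the net prototxt.
test_batch_size = 100

# Images seen in one test run.
images_per_test = test_iter * test_batch_size
# Multiples of test_interval in (0, max_iter], i.e. test runs after iteration 0.
test_runs = max_iter // test_interval
# Multiples of the snapshot interval in (0, max_iter].
snapshots = max_iter // snapshot

print(images_per_test, test_runs, snapshots)
```

If the real test batch size differs, only images_per_test changes; the other two counts depend on the solver file alone.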







Regularization
regularization_type: "L2"
weight_decay: 0.0008  # weight_decay is the regularization coefficient lambda
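
As a rough illustration of what weight_decay does under L2 regularization: the penalty (lambda / 2) * w^2 contributes lambda * w to each weight's gradient. The sketch below is plain Python, not Caffe code; the function name and the sample numbers are made up for the example.

```python
def l2_regularized_gradient(weights, grads, weight_decay):
    """Add the L2 penalty term weight_decay * w to each data-loss gradient."""
    return [g + weight_decay * w for w, g in zip(weights, grads)]


weights = [1.0, -2.0, 0.5]   # illustrative weights
grads = [0.1, 0.1, 0.1]      # illustrative data-loss gradients
regularized = l2_regularized_gradient(weights, grads, weight_decay=0.0008)
print(regularized)
```

Larger weights receive a proportionally larger pull toward zero, which is why the coefficient is usually kept small.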
 
Learning rate decay settings
# The base learning rate
base_lr: 0.000002
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
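
For reference, Caffe's "inv" policy computes the learning rate as base_lr * (1 + gamma * iter) ^ (-power). A minimal sketch of that formula with the values above (the function name inv_lr is illustrative, not a Caffe API):

```python
def inv_lr(base_lr, gamma, power, iteration):
    """Learning rate under the "inv" policy at a given iteration."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


# Print the rate at a few iterations to see how slowly it decays.
for it in (0, 5000, 10000, 20000):
    print(it, inv_lr(0.000002, 0.0001, 0.75, it))
```

At iteration 0 the rate equals base_lr, and it decreases smoothly from there.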
 
Display settings
# Display every 500 iterations
display: 500
Number of iterations
# The maximum number of iterations
max_iter: 20000
 
Saving snapshots
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "save_path/"
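
Caffe builds snapshot file names as snapshot_prefix + "_iter_" + iteration + extension, so a prefix ending in "/" should yield files named _iter_N.* inside that directory. The sketch below lists the names these settings would produce, assuming the max_iter of 20000 given above; the helper name snapshot_files is hypothetical.

```python
def snapshot_files(prefix, snapshot, max_iter):
    """List the (.caffemodel, .solverstate) pairs written at each snapshot interval."""
    files = []
    for it in range(snapshot, max_iter + 1, snapshot):
        stem = f"{prefix}_iter_{it}"
        files.append((stem + ".caffemodel", stem + ".solverstate"))
    return files


for model, state in snapshot_files("save_path/", 5000, 20000):
    print(model, state)
```

A prefix such as "save_path/res34" would give more descriptive names like save_path/res34_iter_5000.caffemodel.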
 
Optimization
# solver mode: CPU or GPU
solver_mode: GPU
 
1.SGD
momentum: 0.9
type: "SGD"  # the default, so this line may be omitted
 
2.AdaDelta
momentum: 0.9
type: "AdaDelta"
delta: 1e-6

3.Adam
momentum: 0.9
momentum2: 0.999
type: "Adam"

4.AdaGrad
type: "AdaGrad"  
 
5.NAG
momentum: 0.95
type: "Nesterov"   
 
6.RMSprop
momentum: 0.0  # Caffe's RMSProp solver rejects a nonzero momentum
type: "RMSProp"  
rms_decay: 0.98
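
To see what momentum does in the plain SGD case, here is a minimal sketch of the update V = momentum * V - lr * grad, W = W + V on a toy one-dimensional loss. It is an illustration in plain Python, not Caffe's implementation; the loss function, step count and names are assumptions chosen for the example.

```python
def sgd_momentum(grad_fn, w, lr, momentum, steps):
    """Run SGD with momentum on a scalar parameter and return the final value."""
    v = 0.0
    for _ in range(steps):
        v = momentum * v - lr * grad_fn(w)   # velocity accumulates past gradients
        w = w + v                            # move along the velocity
    return w


def grad(w):
    """Gradient of the toy loss L(w) = 0.5 * w^2, whose minimum is at w = 0."""
    return w


without_momentum = sgd_momentum(grad, w=1.0, lr=0.01, momentum=0.0, steps=100)
with_momentum = sgd_momentum(grad, w=1.0, lr=0.01, momentum=0.9, steps=100)
print(without_momentum, with_momentum)
```

With the same learning rate and number of steps, the run with momentum ends much closer to the minimum on this toy problem.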
