Learning Caffe: Finetuning

Everyone knows the benefits of finetuning, so I won't dwell on them here. How is it actually done in Caffe? Here is the command:

 ./build/tools/caffe train -solver xxx.prototxt -weights xxx.caffemodel

This means: the network referenced by the solver file xxx.prototxt is initialized with the trained weights stored in xxx.caffemodel.
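The same thing can also be done from Python. Below is a minimal pycaffe sketch (the file names my_solver.prototxt and pretrained.caffemodel are placeholders, not files from this post); caffe.SGDSolver plus Net.copy_from is the usual pycaffe way to start training from existing weights:

import caffe

caffe.set_mode_cpu()  # or caffe.set_mode_gpu()

# Placeholder file names -- substitute your own solver and pretrained weights.
solver = caffe.SGDSolver('my_solver.prototxt')

# Copy parameters for every layer whose name matches a layer in the caffemodel
# (and whose parameter shapes agree); all other layers keep the initialization
# given by their weight_filler / bias_filler.
solver.net.copy_from('pretrained.caffemodel')

solver.solve()  # run the finetuning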

So how do you apply the parameters in xxx.caffemodel to your own model? The following points need attention:

1. xxx.caffemodel holds parameters that someone else has already trained. You either download it from the web (for example, here) or save it during your own training.

2. In your own model, each layer you want to finetune must have the same name and the same type as the corresponding layer in xxx.caffemodel. Whether the bottom and top names match does not matter.

3. Matching names alone are not enough: the bottom and top shapes of a layer being finetuned must also agree with those of the corresponding layer in the caffemodel, otherwise an error is reported (see the shape-checking sketch after this list).

4. If a layer's bottom/top in xxx.caffemodel does not match your own model, just rename that layer so its name differs from the one in xxx.caffemodel. For example, when using the model obtained by training LeNet on MNIST to finetune a 2-class model of your own, the output of the last fully connected layer obviously changes from 10 to 2, so that fully connected layer must not share its name with the one in LeNet.
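Before launching a finetune run it can help to check which layers will actually receive weights. The pycaffe sketch below loads both networks and compares parameter shapes layer by layer; the file names lenet_train_test.prototxt, lenet_iter_10000.caffemodel and my_finetune_net.prototxt are just placeholders for this example:

import caffe

caffe.set_mode_cpu()

# Placeholder file names -- adjust to your own setup.
old_net = caffe.Net('lenet_train_test.prototxt', 'lenet_iter_10000.caffemodel', caffe.TEST)
new_net = caffe.Net('my_finetune_net.prototxt', caffe.TEST)

for name, blobs in new_net.params.items():
    if name not in old_net.params:
        print(name, ': not in the pretrained model, will be freshly initialized')
        continue
    old_shapes = [b.data.shape for b in old_net.params[name]]
    new_shapes = [b.data.shape for b in blobs]
    if old_shapes == new_shapes:
        print(name, ':', new_shapes, '-> will be copied')
    else:
        print(name, ':', old_shapes, 'vs', new_shapes, '-> shape mismatch, rename this layer')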

To illustrate (setting aside whether this particular finetune is sensible), here is an example: using the caffemodel trained from lenet_train_test.prototxt to finetune a model of our own.

The lenet_train_test network is as follows:

name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

The network to be finetuned is as follows:

name: "MyFinetuneNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  param {
    lr_mult: 10
  }
  param {
    lr_mult: 20
  }
  inner_product_param {
    num_output: 800
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "fc1"
  top: "fc1"
}


layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "fc1"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}


layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

In short, the LeNet network is:

mnist-conv1-pool1-conv2-pool2-ip1-relu1-ip2-loss

and our own model is:

mnist-fc1-relu3-ip1-relu4-ip2-loss

Here the layers being finetuned are ip1 and ip2.

To make the point, our model's ip1 is deliberately finetuned from LeNet's ip1. Without any further adjustment, however, the input shapes of these two layers would not match, so finetuning would be impossible and a runtime error would be reported.

Therefore the layer fc1 is added so that the shape of fc1's top equals that of LeNet's ip1 bottom: in LeNet, pool2 outputs 50 channels of 4x4 feature maps, i.e. 50 * 4 * 4 = 800 values per image, which is why fc1 has num_output: 800. With the input dimension matched, ip1 (and likewise ip2) can be finetuned.
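One way to confirm that the shapes really line up is to load the new network with pycaffe and print the shapes around ip1 (my_finetune_net.prototxt is again just a placeholder name for the MyFinetuneNet definition above):

import caffe

caffe.set_mode_cpu()

# Placeholder prototxt name for the MyFinetuneNet definition above.
net = caffe.Net('my_finetune_net.prototxt', caffe.TEST)

# fc1's top is the bottom of ip1; with num_output: 800 it is (batch, 800),
# the same 50 * 4 * 4 = 800 inputs that LeNet's ip1 expects.
print('fc1 top shape   :', net.blobs['fc1'].data.shape)
print('ip1 weight shape:', net.params['ip1'][0].data.shape)  # (500, 800), as in LeNet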

This shows that as long as the name, type and shapes agree, any two layers from two different networks can be finetuned from one another (whether doing so actually makes sense is of course another question).
