Learning Caffe: Finetuning

Everyone knows the benefits of finetuning, so I won't dwell on them here. How is it actually done in Caffe? Here is the command:

 ./build/tools/caffe train -solver xxx.prototxt -weights xxx.caffemodel

This means: the network referenced by the solver file xxx.prototxt is initialized with the trained weights stored in xxx.caffemodel.
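The same thing can also be done from Python. Below is a minimal pycaffe sketch (the file names my_solver.prototxt and pretrained.caffemodel are placeholders, not files from this post); caffe.SGDSolver plus Net.copy_from is the usual pycaffe way to start training from existing weights:

import caffe

caffe.set_mode_cpu()  # or caffe.set_mode_gpu()

# Placeholder file names -- substitute your own solver and pretrained weights.
solver = caffe.SGDSolver('my_solver.prototxt')

# Copy parameters for every layer whose name matches a layer in the caffemodel
# (and whose parameter shapes agree); all other layers keep the initialization
# given by their weight_filler / bias_filler.
solver.net.copy_from('pretrained.caffemodel')

solver.solve()  # run the finetuning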

So how do you apply the parameters in xxx.caffemodel to your own model? The following points need attention:

1. xxx.caffemodel holds parameters that someone else has already trained. You either download it from the web (for example, here) or save it during your own training.

2. In your own model, each layer you want to finetune must have the same name and the same type as the corresponding layer in xxx.caffemodel. Whether the bottom and top names match does not matter.

3. Matching names alone are not enough: the bottom and top shapes of a layer being finetuned must also agree with those of the corresponding layer in the caffemodel, otherwise an error is reported (see the shape-checking sketch after this list).

4. If a layer's bottom/top in xxx.caffemodel does not match your own model, just rename that layer so its name differs from the one in xxx.caffemodel. For example, when using the model obtained by training LeNet on MNIST to finetune a 2-class model of your own, the output of the last fully connected layer obviously changes from 10 to 2, so that fully connected layer must not share its name with the one in LeNet.
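Before launching a finetune run it can help to check which layers will actually receive weights. The pycaffe sketch below loads both networks and compares parameter shapes layer by layer; the file names lenet_train_test.prototxt, lenet_iter_10000.caffemodel and my_finetune_net.prototxt are just placeholders for this example:

import caffe

caffe.set_mode_cpu()

# Placeholder file names -- adjust to your own setup.
old_net = caffe.Net('lenet_train_test.prototxt', 'lenet_iter_10000.caffemodel', caffe.TEST)
new_net = caffe.Net('my_finetune_net.prototxt', caffe.TEST)

for name, blobs in new_net.params.items():
    if name not in old_net.params:
        print(name, ': not in the pretrained model, will be freshly initialized')
        continue
    old_shapes = [b.data.shape for b in old_net.params[name]]
    new_shapes = [b.data.shape for b in blobs]
    if old_shapes == new_shapes:
        print(name, ':', new_shapes, '-> will be copied')
    else:
        print(name, ':', old_shapes, 'vs', new_shapes, '-> shape mismatch, rename this layer')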

To illustrate (setting aside whether this particular finetune is sensible), here is an example: using the caffemodel trained from lenet_train_test.prototxt to finetune a model of our own.

The lenet_train_test network is as follows:

name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

The network to be finetuned is as follows:

name: "MyFinetuneNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  param {
    lr_mult: 10
  }
  param {
    lr_mult: 20
  }
  inner_product_param {
    num_output: 800
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "fc1"
  top: "fc1"
}


layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "fc1"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}


layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

In short, the LeNet network is:

mnist-conv1-pool1-conv2-pool2-ip1-relu1-ip2-loss

and our own model is:

mnist-fc1-relu3-ip1-relu4-ip2-loss

Here the layers being finetuned are ip1 and ip2.

To make the point, our model's ip1 is deliberately finetuned from LeNet's ip1. Without any further adjustment, however, the input shapes of these two layers would not match, so finetuning would be impossible and a runtime error would be reported.

Therefore the layer fc1 is added so that the shape of fc1's top equals that of LeNet's ip1 bottom: in LeNet, pool2 outputs 50 channels of 4x4 feature maps, i.e. 50 * 4 * 4 = 800 values per image, which is why fc1 has num_output: 800. With the input dimension matched, ip1 (and likewise ip2) can be finetuned.
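One way to confirm that the shapes really line up is to load the new network with pycaffe and print the shapes around ip1 (my_finetune_net.prototxt is again just a placeholder name for the MyFinetuneNet definition above):

import caffe

caffe.set_mode_cpu()

# Placeholder prototxt name for the MyFinetuneNet definition above.
net = caffe.Net('my_finetune_net.prototxt', caffe.TEST)

# fc1's top is the bottom of ip1; with num_output: 800 it is (batch, 800),
# the same 50 * 4 * 4 = 800 inputs that LeNet's ip1 expects.
print('fc1 top shape   :', net.blobs['fc1'].data.shape)
print('ip1 weight shape:', net.params['ip1'][0].data.shape)  # (500, 800), as in LeNet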

This shows that as long as the name, type and shapes agree, any two layers from two different networks can be finetuned from one another (whether doing so actually makes sense is of course another question).
