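VGG_face Caffe Fine-tuning: A Detailed Tutorial (Part 1)

Introduction

VGG Face Descriptor is the work of the VGG group at the University of Oxford, which has released the trained network definition and model parameters. This post fine-tunes that model in Caffe on a different face dataset and then checks the result.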

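1. The additional face dataset and the pre-trained model

To fine-tune a trained model, we first need to prepare additional face data ourselves. Here I use the IMM face database, which contains 240 face images of 40 people (7 women, 33 men), 6 images per person, each annotated with 58 landmark points.
Download link: http://www.imm.dtu.dk/~aam/
The link above may be slow, so I have also uploaded the dataset for download:
https://download.csdn.net/download/huxiny/10751345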

[Figure: sample face images from the IMM database]

Since we are fine-tuning, we also need the trained caffemodel, which can be downloaded from:
http://www.robots.ox.ac.uk/~vgg/software/vgg_face/

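2. Data processing

The images in the IMM database cannot be used directly; we first need to convert the image data to LMDB.
The conversion needs a txt file that lists the images, one per line.
[Figure: sample contents of the txt file]
The format is simple: file name, a space, then the class the file belongs to.
We have 40 people here, so there are 40 classes. The class numbers must start from 0 (Caffe requires this), that is, 0 to 39.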

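Generating this file is easy, but we will come back to it in a moment. First we need to distribute the 240 images across three folders: train, test, and val.
The train folder holds the face data used to fine-tune the model;
the test folder holds the face data used for testing during fine-tuning;
the val folder holds the face data used to validate the model after fine-tuning has finished.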

We can simply use the following commands to move the images:

mkdir -p train test val
mv *-6*.jpg test
mv *-5*.jpg val
mv *.jpg train

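There are 40 people with 6 photos each. We put everyone's 6th photo in the test folder and, likewise, everyone's 5th photo in the val folder; the rest go into the train folder for training.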
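
The same split can be sketched in Python for readers who prefer not to rely on shell globbing. This is only a minimal sketch: it assumes file names of the form <person>-<shot>...jpg (which is what the mv patterns above imply), and the function name split_images and the demo file names are made up for illustration.

```python
import os
import shutil
import tempfile

def split_images(src_dir):
    """Move shot 6 to test/, shot 5 to val/, everything else to train/."""
    for sub in ("train", "test", "val"):
        os.makedirs(os.path.join(src_dir, sub), exist_ok=True)
    for name in sorted(os.listdir(src_dir)):
        if not name.lower().endswith(".jpg"):
            continue  # skip the sub-folders and any non-image files
        # Assumed naming: "<person>-<shot><suffix>.jpg", e.g. "01-6m.jpg".
        shot = name.split("-")[1][0]
        target = {"6": "test", "5": "val"}.get(shot, "train")
        shutil.move(os.path.join(src_dir, name),
                    os.path.join(src_dir, target, name))

# Demo on a throwaway folder with made-up file names for two people.
demo_dir = tempfile.mkdtemp()
for person in ("01", "02"):
    for shot in range(1, 7):
        open(os.path.join(demo_dir, "%s-%dm.jpg" % (person, shot)), "w").close()
split_images(demo_dir)
counts = {s: len(os.listdir(os.path.join(demo_dir, s)))
          for s in ("train", "test", "val")}
print(counts)
```

With two people in the demo folder, four shots per person end up in train and one each in test and val, which matches the 4/1/1 split described above.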

Next we generate the train.txt and test.txt files (there is no need to generate val.txt). I used a Python script for this; it is very simple, but here is the code anyway:

import os

def file_name(file_dir):
    # Return the names of the files directly under file_dir (no recursion).
    for root, dirs, files in os.walk(file_dir):
        return sorted(files)
    return []    # file_dir does not exist or cannot be read

def writeToFile(fileName, lines):
    with open(fileName, 'w') as f:    # the file is created if it does not exist
        f.writelines(lines)

lines = []
fileNames = file_name('yourPath/train')
for name in fileNames:
    # File names start with the person number; Caffe labels must start at 0.
    num = str(int(name.split('-')[0]) - 1)
    line = name + ' ' + num + '\n'
    lines.append(line)
writeToFile('yourPath/train.txt', lines)

Change the paths and it is ready to use. Run it a second time with test in place of train to produce test.txt.
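
Because both train.txt and test.txt are needed, the labelling step can also be wrapped in a function and called once per folder. The following is a minimal sketch under the same file-naming assumption as the script above; label_lines is a hypothetical helper and the sample file names are made up.

```python
def label_lines(file_names):
    """Turn image file names into 'name label' lines with 0-based labels."""
    lines = []
    for name in sorted(file_names):
        person = int(name.split("-")[0])   # e.g. "07-3m.jpg" -> 7
        lines.append("%s %d\n" % (name, person - 1))
    return lines

# Made-up sample names; in practice pass os.listdir() of train/ or test/.
sample = ["02-1m.jpg", "01-1m.jpg", "40-2f.jpg"]
lines = label_lines(sample)
labels = [int(line.split()[1]) for line in lines]

# With 40 people every label must fall in 0..39.
assert all(0 <= label <= 39 for label in labels)
print("".join(lines), end="")
```

The range check is a cheap way to catch a stray file or an off-by-one in the labels before building the LMDB.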

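Now for the final goal of this step: converting the face data in the train and test folders to LMDB.
Here we use the convert_imageset tool that ships with Caffe, located under caffe_root/build/tools/.
The script below does the conversion (you can also invoke the commands directly if you prefer):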

#!/usr/bin/env sh
MY=/home/pzs/husin/caffePython/husin_download/VGG_face

echo "Create train lmdb..."
rm -rf $MY/img_train_lmdb
/home/pzs/caffe/caffe/build/tools/convert_imageset --shuffle --resize_height=224 --resize_width=224 $MY/train/ $MY/train.txt $MY/img_train_lmdb

echo "Create test lmdb..."
rm -rf $MY/img_test_lmdb
/home/pzs/caffe/caffe/build/tools/convert_imageset --shuffle --resize_height=224 --resize_width=224 $MY/test/ $MY/test.txt $MY/img_test_lmdb


echo "All Done"

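Again, change the paths and it is ready to use.

Note that vgg_face expects 224×224 input images, so the resize_height and resize_width arguments are used to resize the images to 224×224.

Running the script produces two folders, img_train_lmdb and img_test_lmdb.
Each folder contains two files, data.mdb and lock.mdb.

We do not need to care about what is inside these folders; having them is enough. I only mention it here, and interested readers can dig deeper.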

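3. Generating the mean file

I will not go into what the Caffe mean file is for (a quick search will explain it); this section only shows how to generate one.
Caffe provides a program for computing the mean, compute_image_mean.cpp, which we can use directly: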

    sudo build/tools/compute_image_mean yourPath/img_train_lmdb yourPath/mean.binaryproto

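Of course, you can put this in a script for reuse.

compute_image_mean takes two arguments: the first is the location of the training LMDB, and the second is the name and save path of the mean file.
On success, a mean file named mean.binaryproto is generated under yourPath/.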
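
For intuition, the mean file holds the per-pixel average over all training images, which Caffe then subtracts from every input. The sketch below shows that computation in plain Python on two tiny made-up single-channel 2×2 "images"; it illustrates the idea only and is not how compute_image_mean is implemented.

```python
def mean_image(images):
    """Per-pixel mean over a list of equally sized 2-D images (lists of rows)."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / n for c in range(width)]
            for r in range(height)]

def subtract_mean(image, mean):
    """What a mean file does to one input: pixel-wise subtraction."""
    return [[p - m for p, m in zip(row, mrow)]
            for row, mrow in zip(image, mean)]

# Two made-up 2x2 grayscale images.
images = [[[0, 10], [20, 30]],
          [[10, 30], [20, 50]]]
mean = mean_image(images)
print(mean)                            # [[5.0, 20.0], [20.0, 40.0]]
print(subtract_mean(images[0], mean))  # [[-5.0, -10.0], [0.0, -10.0]]
```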

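4. Modifying the Caffe training configuration files

Next we create a train_test.prototxt file, which describes the network structure. You can write it yourself or copy the code below; the places that need changing are marked.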

name: "VGG_FACE_16_Net"
layer {
  name: "data"
  type: "Data"           
  top: "data"
  top: "label"
  data_param {
    source: "$/train_lmdb"                     # change this: path to the training LMDB
    backend: LMDB
    batch_size: 100                               # change this to suit your setup; I used 4
  }
  transform_param {
    mean_file: "$/mean.binaryproto"    # change this: path to the mean file
     mirror: true
  }
  include: { phase: TRAIN }
}

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  data_param {
    source: "$/test_lmdb"                      # change this: path to the test LMDB
    backend: LMDB
    batch_size: 25                               # change this to suit your setup; I used 4
  }
  transform_param {
    mean_file: "$/mean.binaryproto"   # change this: path to the mean file
    mirror: true
  }
  include: { 
    phase: TEST 
  }
}
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param {
    lr_mult: 1 
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  } 
}
layer {
  name: "relu1_2"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2_1"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2_1"
  param {
    lr_mult: 1
    decay_mult: 1
  } 
  param {
    lr_mult: 2
    decay_mult: 0
  } 
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    } 
    bias_filler {
      type: "constant"
      value: 0
    } 
  } 
}
layer {
  name: "relu2_1"
  type: "ReLU"
  bottom: "conv2_1"
  top: "conv2_1"
}
layer { 
  name: "conv2_2"
  type: "Convolution"
  bottom: "conv2_1"
  top: "conv2_2"
  param {
    lr_mult: 1
    decay_mult: 1
  } 
  param {
    lr_mult: 2
    decay_mult: 0
  } 
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01 
    } 
    bias_filler {
      type: "constant"
      value: 0
    }
  } 
}
layer {
  name: "relu2_2"
  type: "ReLU"
  bottom: "conv2_2"
  top: "conv2_2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2_2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv3_1"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3_1"
  type: "ReLU"
  bottom: "conv3_1"
  top: "conv3_1"
}
layer {
  name: "conv3_2"
  type: "Convolution"
  bottom: "conv3_1"
  top: "conv3_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3_2"
  type: "ReLU"
  bottom: "conv3_2"
  top: "conv3_2"
}
layer {
  name: "conv3_3"
  type: "Convolution"
  bottom: "conv3_2"
  top: "conv3_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3_3"
  type: "ReLU"
  bottom: "conv3_3"
  top: "conv3_3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3_3"
  top: "pool3"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv4_1"
  type: "Convolution"
  bottom: "pool3"
  top: "conv4_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu4_1"
  type: "ReLU"
  bottom: "conv4_1"
  top: "conv4_1"
}
layer {
  name: "conv4_2"
  type: "Convolution"
  bottom: "conv4_1"
  top: "conv4_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu4_2"
  type: "ReLU"
  bottom: "conv4_2"
  top: "conv4_2"
}
layer {
  name: "conv4_3"
  type: "Convolution"
  bottom: "conv4_2"
  top: "conv4_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu4_3"
  type: "ReLU"
  bottom: "conv4_3"
  top: "conv4_3"
}
layer {
  name: "pool4"
  type: "Pooling"
  bottom: "conv4_3"
  top: "pool4"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv5_1"
  type: "Convolution"
  bottom: "pool4"
  top: "conv5_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu5_1"
  type: "ReLU"
  bottom: "conv5_1"
  top: "conv5_1"
}
layer {
  name: "conv5_2"
  type: "Convolution"
  bottom: "conv5_1"
  top: "conv5_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu5_2"
  type: "ReLU"
  bottom: "conv5_2"
  top: "conv5_2"
}
layer {
  name: "conv5_3"
  type: "Convolution"
  bottom: "conv5_2"
  top: "conv5_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu5_3"
  type: "ReLU"
  bottom: "conv5_3"
  top: "conv5_3"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5_3"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  # Note that lr_mult can be set to 0 to disable any fine-tuning of this, and any other, layer
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8_flickr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_flickr"
  # No lr_mult is set for this layer, so it uses the default multiplier of 1.
  # propagate_down: false stops the loss gradient from flowing back into fc7 and below,
  # so the pretrained layers receive no gradient from the loss.
  propagate_down: false
  inner_product_param {
    num_output: 356                         # change this to the number of classes; with 40 people it should be 40
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8_flickr"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8_flickr"
  bottom: "label"
  top: "loss"
}
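
The layer sizes in the network above can be sanity-checked with a little arithmetic: every 3×3 convolution with pad 1 keeps the spatial size, and each of the five 2×2, stride-2 pooling layers halves it. The sketch below walks a 224×224 input through the five blocks. The helper names are hypothetical, and the formulas are the usual output-size formulas (pooling rounds up, which makes no difference here because every size is even).

```python
import math

def conv_out(size, kernel=3, pad=1, stride=1):
    """Spatial output size of a convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a pooling layer (rounded up)."""
    return int(math.ceil((size - kernel) / stride)) + 1

# (number of conv layers, output channels) for blocks conv1_x .. conv5_x
blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

size = 224
for n_convs, channels in blocks:
    for _ in range(n_convs):
        size = conv_out(size)
    size = pool_out(size)
    print("after pool: %d x %d x %d" % (channels, size, size))

fc6_inputs = channels * size * size
print("fc6 inputs:", fc6_inputs)   # 512 * 7 * 7 = 25088
```

This is also why the images are resized to 224×224 when the LMDB is built: the pretrained fc6 weights expect exactly 25088 inputs.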

Next we create a solver.prototxt file, which holds the hyperparameters for training the network:

net: "/home/pzs/husin/caffePython/husin_download/VGG_face/train_test.prototxt"   # change this
test_iter: 40
test_interval: 10
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 20000
display: 20
max_iter: 1000
momentum: 0.9
weight_decay: 0.0005
snapshot: 20
snapshot_prefix: "/home/pzs/husin/caffePython/husin_download/VGG_face/snapshot"    # change this
# uncomment the following to default to CPU mode solving
solver_mode: CPU

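Nothing else in this file needs changing (apart from the paths), so you can copy it as is. If you understand what the parameters mean, feel free to tune them to your needs.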
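
A few consequences of these solver settings are easy to work out. The sketch below assumes the batch size of 4 mentioned in the network comments and the split described earlier (4 training images and 1 test image for each of the 40 people). step_lr is a hypothetical helper that mirrors Caffe's "step" policy, base_lr * gamma ^ floor(iter / stepsize).

```python
base_lr, gamma, stepsize = 0.001, 0.1, 20000
max_iter, snapshot, test_iter = 1000, 20, 40
batch_size = 4                        # assumption: the value the author used
train_images, test_images = 160, 40   # 40 people x 4 shots, 40 people x 1 shot

def step_lr(iteration):
    """Learning rate under lr_policy: "step"."""
    return base_lr * gamma ** (iteration // stepsize)

epochs = max_iter * batch_size / train_images
test_passes = test_iter * batch_size / test_images
n_snapshots = max_iter // snapshot

print("lr at last iteration:", step_lr(max_iter - 1))
print("epochs over the training set:", epochs)
print("passes over the test set per test run:", test_passes)
print("scheduled snapshots:", n_snapshots)
```

Since stepsize (20000) is larger than max_iter (1000), the learning rate stays at base_lr for the whole run. At batch size 4, test_iter: 10 would already cover the 40 test images once.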

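5. Training

Before training, create a snapshot folder to hold the caffemodel files saved during training. The snapshot_prefix entry in solver.prototxt sets the save location, and snapshot: 20 means a caffemodel is saved every 20 iterations.

Start training: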

caffe_root/build/tools/caffe train -solver yourPath/solver.prototxt -weights yourPath/VGG_FACE.caffemodel 

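Running the command above starts the training. It uses the caffe tool that ships with Caffe: the -solver argument takes the path of the solver file, and the -weights argument takes the caffemodel to fine-tune.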
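
If you would rather launch training from Python, the same command line can be assembled and run with subprocess. This is a minimal sketch: the paths are the placeholders from the command above, train_command is a hypothetical helper, and -gpu is the caffe tool's flag for selecting a GPU device (it only helps if Caffe was built with GPU support).

```python
import subprocess

def train_command(caffe_bin, solver, weights, gpu=None):
    """Build the argument list for `caffe train`."""
    cmd = [caffe_bin, "train", "-solver", solver, "-weights", weights]
    if gpu is not None:
        cmd += ["-gpu", str(gpu)]
    return cmd

cmd = train_command("caffe_root/build/tools/caffe",
                    "yourPath/solver.prototxt",
                    "yourPath/VGG_FACE.caffemodel")
print(" ".join(cmd))

# Uncomment to actually start training (requires a built Caffe at that path):
# subprocess.check_call(cmd)
```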

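Training result:
I ran 1000 iterations on a single CPU core, which took nearly 8 hours, so use a GPU if you have one.
[Figure: training result]

The accuracy actually reached 100%. Can you believe it?

I will follow up with a post on validating the model, covering how to use the trained Caffe model.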

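Closing remarks

This post summarizes one stage of my own study of Caffe. If you find any mistakes, please point them out so we can learn together.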

And let us not be weary in well-doing, for in due season, we shall reap, if we faint not

Reposted from blog.csdn.net/HUXINY/article/details/83510523