I have used this for a long time without writing it up, so here it is, following the usual convention. The overall procedure is just like the flow chart above: install, configure, download a pre-trained model, add your own data, modify the model structure, then train and test. Every model follows this same flow.
Installation
1. Download the source code
https://github.com/Orpine/py-R-FCN
The "py" prefix indicates the Python version of R-FCN.
2. Install Caffe and its dependencies
1) Install the dependencies:
pip install cython
pip install easydict
sudo apt-get install python-opencv
2) Download Caffe (the Microsoft fork, which py-R-FCN depends on):
git clone https://github.com/Microsoft/caffe.git
3) Configure Caffe: copy Makefile.config.example to Makefile.config and make sure WITH_PYTHON_LAYER := 1 is set (R-FCN uses Python layers), then run make in the caffe directory.
4) Open a terminal, cd to your-RFCN-path/lib, and run make.
5) Build Caffe's Python interface: make pycaffe
Installation is complete.
Next, download a pre-trained model and give it a quick test.
The official download is hosted on a site that requires a VPN to reach from China; it contains ImageNet pre-trained models for ResNet-50 and ResNet-101. I downloaded them and re-uploaded them to Baidu Cloud:
链接:https://pan.baidu.com/s/1-M0r13ULm-8qdq34qfHoPQ
提取码:a72o
Testing
Place the models at the corresponding paths in the RFCN project:
$RFCN_ROOT/data/rfcn_models/resnet50_rfcn_final.caffemodel
$RFCN_ROOT/data/rfcn_models/resnet101_rfcn_final.caffemodel
Open a terminal and run:
cd $RFCN_ROOT
./tools/demo_rfcn.py --net ResNet-50
Training on your own data
Place your dataset under the data folder, in the following layout:
VOCdevkit/VOC2007
Inside VOC2007 is your own data, mainly three folders:
JPEGImages, Annotations, ImageSets
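Before touching any prototxt, it is worth confirming the folder layout is what the training scripts expect. A minimal sketch (the helper name and the root path are my own, not part of the repo):

```python
import os

def check_voc_layout(rfcn_root):
    """Return the list of required VOC2007 folders missing under rfcn_root/data."""
    voc = os.path.join(rfcn_root, "data", "VOCdevkit", "VOC2007")
    required = ["JPEGImages", "Annotations", "ImageSets"]
    return [d for d in required if not os.path.isdir(os.path.join(voc, d))]

# An empty list means the layout matches what pascal_voc.py will look for.
missing = check_voc_layout("/path/to/py-R-FCN")
```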
1) Modify the model's structural parameters
Because your dataset's classes differ from the pre-training classes, the number of output neurons differs and must be set yourself. This touches on the difference between a pre-trained model and a trained test model.
A pre-trained model is intermediate network weights saved by someone else; the neuron counts of the final output layers are not included, so with a little fine-tuning it works for any number of classes. A trained test model, by contrast, saves all of the parameters, so you can reuse it directly only if your classes happen to match the original ones, which is unlikely. Either way, it comes down to changing a few numbers.
Seven files need to be modified in total:
<1> class-aware/train_ohem.prototxt
<2> class-aware/test.prototxt
<3> train_agnostic.prototxt
<4> train_agnostic_ohem.prototxt
<5> test_agnostic.prototxt
<6> $RFCN_ROOT/lib/datasets/pascal_voc.py
<7> $RFCN_ROOT/lib/datasets/imdb.py
The prototxt files all live under models/pascal_voc, with separate folders for ResNet-50 and ResNet-101; edit the one you actually use. Here we take ResNet-50 with the end2end setup as the example:
Open $RFCN_ROOT/models/pascal_voc/ResNet-50/rfcn_end2end
cls_num = number of classes in your dataset + 1 (background)
e.g. for a 15-class dataset, adding 1 background class gives cls_num = 16.
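All the numbers edited below are simple functions of cls_num, so it is easy to derive them once instead of recomputing per file. A hypothetical helper (not part of py-R-FCN) using the group_size of 7 that these prototxt files use:

```python
def rfcn_param_values(cls_num, group_size=7):
    """Derive the prototxt values that depend on the class count.

    cls_num already includes the background class, e.g. 15 + 1 = 16.
    """
    k2 = group_size * group_size  # position-sensitive score map bins, 7*7 = 49
    return {
        "num_classes": cls_num,                    # param_str in the Python layers
        "rfcn_cls num_output": cls_num * k2,       # cls_num * score_maps_size^2
        "rfcn_bbox num_output": 4 * cls_num * k2,  # 4 * cls_num * score_maps_size^2
        "psroipooled_cls output_dim": cls_num,
        "psroipooled_loc output_dim": 4 * cls_num,
    }

# For a 15-class dataset (cls_num = 16) this reproduces the numbers below:
# 784 for rfcn_cls, 3136 for rfcn_bbox, 16 and 64 for the PSROIPooling layers.
values = rfcn_param_values(16)
```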
<1> Modify class-aware/train_ohem.prototxt
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" #cls_num
  }
}
layer {
  name: 'roi-data'
  type: 'Python'
  bottom: 'rpn_rois'
  bottom: 'gt_boxes'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'rpn.proposal_target_layer'
    layer: 'ProposalTargetLayer'
    param_str: "'num_classes': 16" #cls_num
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_bbox"
  name: "rfcn_bbox"
  type: "Convolution"
  convolution_param {
    num_output: 3136 #4*cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num
    group_size: 7
  }
}
layer {
  bottom: "rfcn_bbox"
  bottom: "rois"
  top: "psroipooled_loc_rois"
  name: "psroipooled_loc_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 64 #4*cls_num
    group_size: 7
  }
}
<2> Modify class-aware/test.prototxt
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_bbox"
  name: "rfcn_bbox"
  type: "Convolution"
  convolution_param {
    num_output: 3136 #4*cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num
    group_size: 7
  }
}
layer {
  bottom: "rfcn_bbox"
  bottom: "rois"
  top: "psroipooled_loc_rois"
  name: "psroipooled_loc_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 64 #4*cls_num
    group_size: 7
  }
}
layer {
  name: "cls_prob_reshape"
  type: "Reshape"
  bottom: "cls_prob_pre"
  top: "cls_prob"
  reshape_param {
    shape {
      dim: -1
      dim: 16 #cls_num
    }
  }
}
layer {
  name: "bbox_pred_reshape"
  type: "Reshape"
  bottom: "bbox_pred_pre"
  top: "bbox_pred"
  reshape_param {
    shape {
      dim: -1
      dim: 64 #4*cls_num
    }
  }
}
<3> Modify train_agnostic.prototxt
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" #cls_num
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2) ###
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num ###
    group_size: 7
  }
}
<4> Modify train_agnostic_ohem.prototxt
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" #cls_num ###
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2) ###
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num ###
    group_size: 7
  }
}
<5> Modify test_agnostic.prototxt
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2) ###
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num ###
    group_size: 7
  }
}
layer {
  name: "cls_prob_reshape"
  type: "Reshape"
  bottom: "cls_prob_pre"
  top: "cls_prob"
  reshape_param {
    shape {
      dim: -1
      dim: 16 #cls_num ###
    }
  }
}
2) Modify parts of the code
Because the label names in your dataset are specific to you, they also have to be set by hand.
<1> $RFCN_ROOT/lib/datasets/pascal_voc.py
class pascal_voc(imdb):
    def __init__(self, image_set, year, devkit_path=None):
        imdb.__init__(self, 'voc_' + year + '_' + image_set)
        self._year = year
        self._image_set = image_set
        self._devkit_path = self._get_default_path() if devkit_path is None \
            else devkit_path
        self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
        self._classes = ('__background__',  # always index 0
                         'your_label_1', 'your_label_2', 'your_label_3', 'your_label_4')
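Whatever labels you fill in, the length of this tuple is exactly the cls_num used throughout the prototxt edits above: '__background__' plus your own labels. A quick check with placeholder names:

```python
# Placeholder label names standing in for your real ones;
# '__background__' must stay at index 0.
classes = ('__background__',
           'your_label_1', 'your_label_2', 'your_label_3', 'your_label_4')

cls_num = len(classes)  # must equal num_classes / cls_num in the prototxt files
```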
<2> $RFCN_ROOT/lib/datasets/imdb.py
An error will be raised here; for the fix, see:
http://blog.csdn.net/xzzppp/article/details/52036794
The number of iterations is modified in lib/datasets/pascal_voc.py.
3) Training
cd to the RFCN root directory and run:
./experiments/scripts/rfcn_end2end_ohem.sh 0 ResNet-50 pascal_voc
4) Testing
cd to the RFCN root directory and run:
./tools/demo_rfcn.py --net ResNet-50
Reference:
https://blog.csdn.net/sinat_30071459/article/details/53202977