I have used this for a long time without writing it up, so here it is, following the usual convention. The overall procedure is just like the flow chart above: install, configure, download a pre-trained model, add your own data, modify the model structure, then train and test. Every model follows this same flow.
Installation
1. Download the source code
https://github.com/Orpine/py-R-FCN
The "py" prefix indicates the Python version of R-FCN.
2. Install Caffe and its dependencies
1) Install the dependencies:
pip install cython
pip install easydict
sudo apt-get install python-opencv
2) Download Caffe (the Microsoft fork, which py-R-FCN depends on):
git clone https://github.com/Microsoft/caffe.git
3) Configure Caffe: copy Makefile.config.example to Makefile.config and make sure WITH_PYTHON_LAYER := 1 is set (R-FCN uses Python layers), then run make in the caffe directory.
4) Open a terminal, cd to your-RFCN-path/lib, and run make.
5) Build Caffe's Python interface: make pycaffe
Installation is complete.
Next, download a pre-trained model and give it a quick test.
The official download is hosted on a site that requires a VPN to reach from China; it contains ImageNet pre-trained models for ResNet-50 and ResNet-101. I downloaded them and re-uploaded them to Baidu Cloud:
链接:https://pan.baidu.com/s/1-M0r13ULm-8qdq34qfHoPQ
提取码:a72o
Testing
Place the models at the corresponding paths in the RFCN project:
$RFCN_ROOT/data/rfcn_models/resnet50_rfcn_final.caffemodel
$RFCN_ROOT/data/rfcn_models/resnet101_rfcn_final.caffemodel
Open a terminal and run:
cd $RFCN_ROOT
./tools/demo_rfcn.py --net ResNet-50
Training on your own data
Place your dataset under the data folder, in the following layout:
VOCdevkit/VOC2007
Inside VOC2007 is your own data, mainly three folders:
JPEGImages, Annotations, ImageSets
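Before touching any prototxt, it is worth confirming the folder layout is what the training scripts expect. A minimal sketch (the helper name and the root path are my own, not part of the repo):

```python
import os

def check_voc_layout(rfcn_root):
    """Return the list of required VOC2007 folders missing under rfcn_root/data."""
    voc = os.path.join(rfcn_root, "data", "VOCdevkit", "VOC2007")
    required = ["JPEGImages", "Annotations", "ImageSets"]
    return [d for d in required if not os.path.isdir(os.path.join(voc, d))]

# An empty list means the layout matches what pascal_voc.py will look for.
missing = check_voc_layout("/path/to/py-R-FCN")
```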
1) Modify the model's structural parameters
Because your dataset's classes differ from the pre-training classes, the number of output neurons differs and must be set yourself. This touches on the difference between a pre-trained model and a trained test model.
A pre-trained model is intermediate network weights saved by someone else; the neuron counts of the final output layers are not included, so with a little fine-tuning it works for any number of classes. A trained test model, by contrast, saves all of the parameters, so you can reuse it directly only if your classes happen to match the original ones, which is unlikely. Either way, it comes down to changing a few numbers.
Seven files need to be modified in total:
<1> class-aware/train_ohem.prototxt
<2> class-aware/test.prototxt
<3> train_agnostic.prototxt
<4> train_agnostic_ohem.prototxt
<5> test_agnostic.prototxt
<6> $RFCN_ROOT/lib/datasets/pascal_voc.py
<7> $RFCN_ROOT/lib/datasets/imdb.py
The prototxt files all live under models/pascal_voc, with separate folders for ResNet-50 and ResNet-101; edit the one you actually use. Here we take ResNet-50 with the end2end setup as the example:
Open $RFCN_ROOT/models/pascal_voc/ResNet-50/rfcn_end2end
cls_num = number of classes in your dataset + 1 (background)
e.g. for a 15-class dataset, adding 1 background class gives cls_num = 16.
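All the numbers edited below are simple functions of cls_num, so it is easy to derive them once instead of recomputing per file. A hypothetical helper (not part of py-R-FCN) using the group_size of 7 that these prototxt files use:

```python
def rfcn_param_values(cls_num, group_size=7):
    """Derive the prototxt values that depend on the class count.

    cls_num already includes the background class, e.g. 15 + 1 = 16.
    """
    k2 = group_size * group_size  # position-sensitive score map bins, 7*7 = 49
    return {
        "num_classes": cls_num,                    # param_str in the Python layers
        "rfcn_cls num_output": cls_num * k2,       # cls_num * score_maps_size^2
        "rfcn_bbox num_output": 4 * cls_num * k2,  # 4 * cls_num * score_maps_size^2
        "psroipooled_cls output_dim": cls_num,
        "psroipooled_loc output_dim": 4 * cls_num,
    }

# For a 15-class dataset (cls_num = 16) this reproduces the numbers below:
# 784 for rfcn_cls, 3136 for rfcn_bbox, 16 and 64 for the PSROIPooling layers.
values = rfcn_param_values(16)
```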
<1> Modify class-aware/train_ohem.prototxt
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" #cls_num
  }
}
layer {
  name: 'roi-data'
  type: 'Python'
  bottom: 'rpn_rois'
  bottom: 'gt_boxes'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'rpn.proposal_target_layer'
    layer: 'ProposalTargetLayer'
    param_str: "'num_classes': 16" #cls_num
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_bbox"
  name: "rfcn_bbox"
  type: "Convolution"
  convolution_param {
    num_output: 3136 #4*cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num
    group_size: 7
  }
}
layer {
  bottom: "rfcn_bbox"
  bottom: "rois"
  top: "psroipooled_loc_rois"
  name: "psroipooled_loc_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 64 #4*cls_num
    group_size: 7
  }
}
<2> Modify class-aware/test.prototxt
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_bbox"
  name: "rfcn_bbox"
  type: "Convolution"
  convolution_param {
    num_output: 3136 #4*cls_num*(score_maps_size^2)
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num
    group_size: 7
  }
}
layer {
  bottom: "rfcn_bbox"
  bottom: "rois"
  top: "psroipooled_loc_rois"
  name: "psroipooled_loc_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 64 #4*cls_num
    group_size: 7
  }
}
layer {
  name: "cls_prob_reshape"
  type: "Reshape"
  bottom: "cls_prob_pre"
  top: "cls_prob"
  reshape_param {
    shape {
      dim: -1
      dim: 16 #cls_num
    }
  }
}
layer {
  name: "bbox_pred_reshape"
  type: "Reshape"
  bottom: "bbox_pred_pre"
  top: "bbox_pred"
  reshape_param {
    shape {
      dim: -1
      dim: 64 #4*cls_num
    }
  }
}
<3> Modify train_agnostic.prototxt
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" #cls_num
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2) ###
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num ###
    group_size: 7
  }
}
<4> Modify train_agnostic_ohem.prototxt
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 16" #cls_num ###
  }
}
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2) ###
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num ###
    group_size: 7
  }
}
<5> Modify test_agnostic.prototxt
layer {
  bottom: "conv_new_1"
  top: "rfcn_cls"
  name: "rfcn_cls"
  type: "Convolution"
  convolution_param {
    num_output: 784 #cls_num*(score_maps_size^2) ###
    kernel_size: 1
    pad: 0
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
}
layer {
  bottom: "rfcn_cls"
  bottom: "rois"
  top: "psroipooled_cls_rois"
  name: "psroipooled_cls_rois"
  type: "PSROIPooling"
  psroi_pooling_param {
    spatial_scale: 0.0625
    output_dim: 16 #cls_num ###
    group_size: 7
  }
}
layer {
  name: "cls_prob_reshape"
  type: "Reshape"
  bottom: "cls_prob_pre"
  top: "cls_prob"
  reshape_param {
    shape {
      dim: -1
      dim: 16 #cls_num ###
    }
  }
}
2) Modify parts of the code
Because the label names in your dataset are specific to you, they also have to be set by hand.
<1> $RFCN_ROOT/lib/datasets/pascal_voc.py
class pascal_voc(imdb):
    def __init__(self, image_set, year, devkit_path=None):
        imdb.__init__(self, 'voc_' + year + '_' + image_set)
        self._year = year
        self._image_set = image_set
        self._devkit_path = self._get_default_path() if devkit_path is None \
            else devkit_path
        self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
        self._classes = ('__background__',  # always index 0
                         'your_label_1', 'your_label_2', 'your_label_3', 'your_label_4')
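Whatever labels you fill in, the length of this tuple is exactly the cls_num used throughout the prototxt edits above: '__background__' plus your own labels. A quick check with placeholder names:

```python
# Placeholder label names standing in for your real ones;
# '__background__' must stay at index 0.
classes = ('__background__',
           'your_label_1', 'your_label_2', 'your_label_3', 'your_label_4')

cls_num = len(classes)  # must equal num_classes / cls_num in the prototxt files
```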
<2> $RFCN_ROOT/lib/datasets/imdb.py
An error will be raised here; for the fix, see:
http://blog.csdn.net/xzzppp/article/details/52036794
The number of iterations is modified in lib/datasets/pascal_voc.py.
3) Training
cd to the RFCN root directory and run:
./experiments/scripts/rfcn_end2end_ohem.sh 0 ResNet-50 pascal_voc
4) Testing
cd to the RFCN root directory and run:
./tools/demo_rfcn.py --net ResNet-50
Reference:
https://blog.csdn.net/sinat_30071459/article/details/53202977