Semantic image segmentation: fixing loss oscillation when training DeepLabV3+ on a custom dataset

Problem Description:

    When training DeepLabV3+ on my own dataset, the loss kept oscillating around 0.4 and the test-set MIOU stayed around 0.55 (poor results). After a long time of debugging it finally improved. Most recent result: test-set MIOU > 0.8, and overfitting is not obvious.

Reference links:

1. https://blog.csdn.net/u011974639/article/details/80948990

2. https://blog.csdn.net/qq_32799915/article/details/80070711

3. https://github.com/tensorflow/models/issues/3730

Data Set Description:

    The dataset has 2 classes. Images: [256, 256, 3], jpg format; Labels: [256, 256, 1], png format.
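    Before training, it is worth a quick sanity check that the label PNGs really store class indices (0 for background, 1 for the foreground class, 255 for ignore if used) rather than 0/255 grayscale values. A minimal sketch; the label directory path is a placeholder for my own layout:

import glob
import numpy as np
from PIL import Image

# Collect every distinct pixel value that appears across the label PNGs.
values = set()
for path in glob.glob('./labels/*.png'):      # placeholder path
    label = np.array(Image.open(path))        # shape [256, 256], single channel
    values.update(np.unique(label).tolist())
print('label values found:', sorted(values))  # expect {0, 1} (plus 255 for ignore, if present)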

Specific modification steps:

1. Change the image size

    Write a script to enlarge the images so that Images become [512, 512, 3] and Labels become [512, 512, 1]. Why the image size needs to be changed is explained later (see the crop_size discussion).
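    A minimal resizing sketch, assuming Pillow and placeholder directory names; note that labels must be resized with nearest-neighbor interpolation so that no new class values are introduced:

import glob
import os
from PIL import Image

IMG_DIR, LBL_DIR = './JPEGImages', './SegmentationClass'  # placeholder directories
OUT_SIZE = (512, 512)

for path in glob.glob(os.path.join(IMG_DIR, '*.jpg')):
    img = Image.open(path)
    img.resize(OUT_SIZE, Image.BILINEAR).save(path)        # images: smooth interpolation is fine

for path in glob.glob(os.path.join(LBL_DIR, '*.png')):
    lbl = Image.open(path)
    lbl.resize(OUT_SIZE, Image.NEAREST).save(path)         # labels: nearest-neighbor only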

2. Run datasets/build_voc2012_data.py

    This generates the .tfrecord data. For convenience, I directly replaced the VOC2012 data with my own dataset, because I had already run the VOC2012 pipeline successfully before. If you do not know how to run the VOC2012 dataset, please refer to: training DeepLabV3+ on the VOC2012 dataset.
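    For reference, an example invocation along the lines used for VOC2012 (the directory names here are placeholders for my own layout):

python ./datasets/build_voc2012_data.py \
  --image_folder="./datasets/VOCdevkit/VOC2012/JPEGImages" \
  --semantic_segmentation_folder="./datasets/VOCdevkit/VOC2012/SegmentationClassRaw" \
  --list_folder="./datasets/VOCdevkit/VOC2012/ImageSets/Segmentation" \
  --image_format="jpg" \
  --output_dir="./datasets/tfrecord"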

3. Modify datasets/segmentation_dataset.py

       Modify it according to your own dataset; since my data has only 2 classes, num_classes is set to 2.

_PASCAL_VOC_SEG_INFORMATION = DatasetDescriptor(
    splits_to_sizes={
        'train': 1689,  # changed to my own data on top of the PASCAL entry
        # 'train_aug': 10582,
        # 'trainval': 2913,
        'val': 564,  # number of samples: 1689 + 564 = 2253
    },
    num_classes=2,  # 2 classes in total, 0: background, 1: **
    ignore_label=255,  # ignore_label is used to pad when cropping; the default is 255
)

4. Modify utils/train_utils.py

     Since the dataset is imbalanced, the loss weight coefficients are modified. After counting, the pixel ratio is px(0) : px(1) = 15 : 1, so I take label0_weight = 1 and label1_weight = 15 (a counting sketch is given at the end of this step).

    # Training on my own dataset: modified here to handle class imbalance
    ignore_weight = 0
    label0_weight = 1   # weight coefficient for the background class
    label1_weight = 15  # weight coefficient for the ** class

    not_ignore_mask = tf.to_float(tf.equal(scaled_labels, 0)) * label0_weight + \
                      tf.to_float(tf.equal(scaled_labels, 1)) * label1_weight + \
                      tf.to_float(tf.equal(scaled_labels, ignore_label)) * ignore_weight

    tf.losses.softmax_cross_entropy(
        one_hot_labels,
        tf.reshape(logits, shape=[-1, num_classes]),
        weights=not_ignore_mask,
        scope=loss_scope)

    At the same time, modify exclude_list so that the final logits layer is not restored from the pre-trained checkpoint.

exclude_list = ['global_step', 'logits']  # modified here when training on your own dataset
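    For reference, the 15 : 1 pixel ratio can be estimated by counting label pixels over the training labels; a minimal counting sketch (the label directory is a placeholder, not part of the repo):

import glob
import numpy as np
from PIL import Image

counts = np.zeros(2, dtype=np.int64)                 # counts[0]: background, counts[1]: foreground
for path in glob.glob('./SegmentationClass/*.png'):  # placeholder label directory
    label = np.array(Image.open(path))
    counts[0] += np.sum(label == 0)
    counts[1] += np.sum(label == 1)
print('px(0) : px(1) =', counts[0] / counts[1])      # roughly 15 here, hence label1_weight = 15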

5. Modify train.py

    Set the parameters according to your own machine. I set num_clones to 2 because my machine has two 1080 Ti GPUs.

    If you need to train the BN (batch normalization) layers, batch_size must be greater than 12; if GPU memory is not enough, reduce crop_size, but not below [321, 321]. My earlier poor results came precisely from setting crop_size too small.

    So far, after changing crop_size from [256, 256] to [321, 321], the model's MIOU went from 0.55 to > 0.8, and it is still being optimized. (The inline comments in the command below are annotations; remove them before running.)

python train.py \
  --logtostderr \
  --num_clones=2 \  # number of GPUs to use, default is 1
  --train_split="train" \  # dataset split used for training
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --train_crop_size=321 \  # the minimum value is [321, 321]
  --train_crop_size=321 \
  --train_batch_size=12 \
  --initialize_last_layer=False \
  --last_layers_contain_logits_only=True \
  --training_number_of_steps=30000 \
  --fine_tune_batch_norm=True \  # set to True when batch_size is greater than 12
  --tf_initial_checkpoint='./weights/deeplabv3_pascal_train_aug/model.ckpt' \
  --train_logdir='./checkpoint' \  # directory where intermediate training results (checkpoints) are saved
  --dataset_dir='./datasets/tfrecord'  # path to the generated tfrecord files

 Other things I tried along the way:

1. Modifying the pre-trained weights

    I downloaded different pre-trained weights from the official site to initialize from, but the results did not change.

2. Modifying the values of initialize_last_layer and last_layers_contain_logits_only

   I ran the three meaningful combinations of these two flags; the results changed very little.

3. Adjusting batch_size and learning_rate

    I tried different batch_size and learning_rate values, training for at most 100K steps; the MIOU difference never exceeded 0.1.

4. Modifying crop_size

    Requirements for setting crop_size:

    1. Not less than [321, 321]

    2. (crop_size - 1) / 4 must be an integer

  If crop_size is set to [256, 256], the results will not be good: because of the ASPP (atrous spatial pyramid pooling) module, a feature map that is too small leaves no room for the atrous convolutions with large dilation rates, so there is a minimum size requirement. That is why the images were enlarged at the very beginning.
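    A toy check of the rule above (plain Python; nothing from the repo is assumed):

def crop_size_ok(size):
    # Valid per the two requirements above: at least 321, and (size - 1) divisible by 4.
    return size >= 321 and (size - 1) % 4 == 0

for size in (256, 321, 480, 481, 513):
    print(size, crop_size_ok(size))  # 256 -> False, 321 -> True, 480 -> False, 481 -> True, 513 -> True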

    

One last word:

    Because a single parameter (crop_size) was set incorrectly, the results stayed unsatisfactory for a very long time. Thinking back on it now...

    Sigh...

 

 


Original post: https://blog.csdn.net/weixin_41713230/article/details/81937763