Semantic Segmenter implemented in Pytorch


Example output of *e2e_mask_rcnn-R-101-FPN_2x* using Detectron pretrained weights


Corresponding example output from Detectron


Example output of *e2e_keypoint_rcnn-R-50-FPN_s1x* using Detectron pretrained weights

This code follows Detectron's implementation architecture and supports only part of its functionality; see the Detectron project for more information.

With this code, you can...

  1. Train a model from scratch;

  2. Run inference using pretrained weight files (*.pkl) obtained from Detectron;

This repository was originally built on jwyang/faster-rcnn.pytorch, but after many revisions the structure has changed a lot and is now much more similar to Detectron. In order to reproduce results directly from the official pretrained weight files, I deliberately made everything as similar or identical to Detectron as possible.

This tool has the following features:

  • It is written entirely in Pytorch, apart from some CUDA code.

  • It supports multi-image batch training.

  • It supports training on multiple GPUs.

  • It supports three pooling methods (roi pooling, roi crop and roi align). Note that only roi align has been revised to match the Caffe2 implementation, so just use roi align.

  • It makes efficient use of memory. For data batching, there are two alternative techniques to reduce memory usage: 1) Aspect grouping: images in the same batch have similar aspect ratios; 2) Aspect cropping: images that are too long are cropped. Aspect grouping is implemented in Detectron, so it is used by default; aspect cropping comes from jwyang/faster-rcnn.pytorch and is not used by default. (A sketch of aspect grouping follows below.)
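As a rough illustration of aspect grouping, here is a minimal sketch; this is not the repository's actual batching code, and the function name is hypothetical:

def group_by_aspect_ratio(image_sizes, batch_size):
    # image_sizes: list of (height, width) tuples.
    # Sort indices by aspect ratio so that each batch holds similarly
    # shaped images, which keeps zero-padding (and thus memory) small.
    order = sorted(range(len(image_sizes)),
                   key=lambda i: image_sizes[i][1] / image_sizes[i][0])
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

# e.g. group_by_aspect_ratio([(480, 640), (640, 480), (500, 700)], 2)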

In addition, I implement a customized nn.DataParallel module that allows different batch blob sizes on different GPUs. You can find more details in the My nn.DataParallel section; the gist is sketched below.
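A minimal sketch of the idea, assuming the inputs are already split into one blob per GPU; this is illustrative only, not the repository's actual code:

import torch.nn as nn
from torch.nn.parallel import replicate, parallel_apply, gather

class BlobDataParallel(nn.Module):
    # Accepts a list of pre-built inputs, one per GPU; batch sizes may differ.
    def __init__(self, module, device_ids):
        super().__init__()
        self.module = module
        self.device_ids = device_ids

    def forward(self, inputs_per_gpu):
        devices = self.device_ids[:len(inputs_per_gpu)]
        replicas = replicate(self.module, devices)          # copy model to each GPU
        inputs = [(x.cuda(d),) for x, d in zip(inputs_per_gpu, devices)]
        outputs = parallel_apply(replicas, inputs, devices=devices)
        return gather(outputs, target_device=devices[0])    # collect on first GPU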

Supported Network Models

  • Main Architecture:

ResNet series: ResNet50_conv4_body, ResNet50_conv5_body, ResNet101_conv4_body, ResNet101_conv5_body, ResNet152_conv5_body

FPN: fpn_ResNet50_conv5_body, fpn_ResNet50_conv5_P2only_body, fpn_ResNet101_conv5_body, fpn_ResNet101_conv5_P2only_body, fpn_ResNet152_conv5_body, fpn_ResNet152_conv5_P2only_body

  • ResNeXt: also implemented, but not yet tested.

  • Box head: ResNet_roi_conv5_head, roi_2mlp_head

  • Mask head: mask_rcnn_fcn_head_v0upshare, mask_rcnn_fcn_head_v0up, mask_rcnn_fcn_head_v1up4convs, mask_rcnn_fcn_head_v1up

  • Keypoints head: roi_pose_head_v1convX

Note: This naming corresponds to the one used in Detectron; just remove any prepended add_.

Supported datasets

Currently only COCO is supported. However, the dataset library is written to be almost identical to Detectron's, so adding other datasets that Detectron supports is straightforward.

Configuration options

Architecture-specific configuration files are placed under configs. The general configuration file lib/core/config.py has essentially the same options and default values as Detectron's, so it is very easy to convert configuration files from Detectron.

How to convert configuration files from Detectron

1. Remove MODEL.NUM_CLASSES. It is set during the initialization of JsonDataset.

2. Delete TRAIN.WEIGHTS, TRAIN.DATASETS and TEST.DATASETS.

3. In the model-type options (e.g. MODEL.CONV_BODY, FAST_RCNN.ROI_BOX_HEAD ...), remove any add_ prefix from the string.

4. If you want to load ImageNet pretrained weights for the model, add RESNETS.IMAGENET_PRETRAINED_WEIGHTS pointing to the pretrained weight file. If not, set MODEL.LOAD_IMAGENET_PRETRAINED_WEIGHTS to False. (A sketch of these steps follows below.)
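Taken together, the four steps above amount to something like the following sketch. The helper function is hypothetical and uses PyYAML; it is not part of the repository:

import yaml

def convert_detectron_cfg(src, dst, imagenet_weights=None):
    with open(src) as f:
        cfg = yaml.safe_load(f)
    cfg.get('MODEL', {}).pop('NUM_CLASSES', None)        # step 1
    for sec, key in (('TRAIN', 'WEIGHTS'), ('TRAIN', 'DATASETS'), ('TEST', 'DATASETS')):
        cfg.get(sec, {}).pop(key, None)                  # step 2
    model = cfg.setdefault('MODEL', {})
    if 'CONV_BODY' in model:                             # step 3 (likewise for FAST_RCNN.ROI_BOX_HEAD etc.)
        model['CONV_BODY'] = model['CONV_BODY'].replace('add_', '')
    if imagenet_weights:                                 # step 4
        cfg.setdefault('RESNETS', {})['IMAGENET_PRETRAINED_WEIGHTS'] = imagenet_weights
    else:
        model['LOAD_IMAGENET_PRETRAINED_WEIGHTS'] = False
    with open(dst, 'w') as f:
        yaml.safe_dump(cfg, f)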

more details

Some options have no effect because the corresponding functionality is not implemented yet; others are unused because I implemented the program in a different way.

Here are some options that have no effect but are worth noting:

  • SOLVER.LR_POLICY, SOLVER.MAX_ITER, SOLVER.STEPS, SOLVER.LRS: For now, the training schedule is controlled by these command line arguments instead (see the example command below):

--epochs: how many epochs to train. An epoch means one traversal of the entire training set. Defaults to 6.

--lr_decay_epochs: the epochs at which to decay the learning rate. Decay happens at the beginning of an epoch; epochs are 0-indexed. Defaults to [4, 5].

For more command line arguments, please refer to python train_net.py --help
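For example, a run that reproduces the defaults above might look like this (illustrative; check --help for the exact argument format):

python tools/train_net.py --dataset coco2017 --cfg configs/e2e_mask_rcnn_R-50-C4.yml --epochs 6 --lr_decay_epochs 4 5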

  • SOLVER.WARM_UP_ITERS, SOLVER.WARM_UP_FACTOR, SOLVER.WARM_UP_METHOD: The learning-rate warm-up described in the paper Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour is not implemented.

  • OUTPUT_DIR: Use the command line argument --output_base_dir instead to specify the output directory.

Some additional options are provided:

  • MODEL.LOAD_IMAGENET_PRETRAINED_WEIGHTS = True: Whether to load ImageNet pretrained weights.

    RESNETS.IMAGENET_PRETRAINED_WEIGHTS = '': Path to the pretrained network weights file. If it starts with '/', it is treated as an absolute path; otherwise, it is treated as a path relative to ROOT_DIR.

  • TRAIN.ASPECT_CROPPING = False, TRAIN.ASPECT_HI = 2, TRAIN.ASPECT_LO = 0.: Options for aspect cropping, which restricts the range of image aspect ratios.

  • RPN.OUT_DIM_AS_IN_DIM = True, RPN.OUT_DIM = 512, RPN.CLS_ACTIVATION = 'sigmoid': The official RPN implementation keeps the same number of input and output feature channels and uses sigmoid as the activation function for fg/bg class prediction. In jwyang's implementation, the output channel number is fixed at 512 and softmax is used as the activation function. (See the illustrative contrast below.)
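The difference can be sketched roughly as follows; the layer shapes are illustrative only, not the repository's code:

import torch.nn as nn

in_dim, num_anchors = 256, 15

# Official style: output channels equal input channels; a sigmoid is
# applied to one fg/bg logit per anchor.
rpn_official_cls = nn.Sequential(
    nn.Conv2d(in_dim, in_dim, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(in_dim, num_anchors, 1))      # followed by sigmoid

# jwyang style: intermediate channels fixed at 512; a softmax is applied
# over 2 classes per anchor.
rpn_jwyang_cls = nn.Sequential(
    nn.Conv2d(in_dim, 512, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(512, num_anchors * 2, 1))     # followed by softmax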

My nn.DataParallel

TBA

Getting started

Clone this repository:

git clone https://github.com/roytseng-tw/mask-rcnn.pytorch.git

Requirements

Tested under python3.

  • Python packages:

pytorch==0.3.1 (cuda80, cudnn7.1.2)

torchvision==0.2.0

numpy

scipy

opencv

pyyaml

pycocotools — for the COCO dataset; also available from pip.

tensorboardX — for logging losses to Tensorboard.

  • An NVIDIA GPU and CUDA 8.0 or higher. Some operations only have GPU implementations.

  • Note: different versions of the Pytorch package have different memory usage.

Compilation

Compile the CUDA code:

cd lib # please change to this directory

sh make.sh

If you are using Volta GPUs, uncomment this line in the lib/make.sh file and remember to append a backslash to the end of the line above it. CUDA_PATH defaults to /usr/local/cuda. If your CUDA library is installed in a different path, change this line accordingly.

This will compile all the modules you need, including the NMS, ROI_Pooling, ROI_Crop and ROI_Align modules. (In fact, the GPU NMS module is never used...)

In particular, if you use CUDA_VISIBLE_DEVICES to set the GPU, make sure at least one GPU is visible when compiling the code.
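For example, to compile with only GPU 0 visible:

CUDA_VISIBLE_DEVICES=0 sh make.sh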

Data preparation

Create a data folder under the repo:

cd {repo_root}

mkdir data

  • COCO: Download the COCO images and annotations from the coco website.

Make sure to place the files according to the following file structure:

coco
├── annotations
│   ├── instances_minival2014.json
│   ├── instances_train2014.json
│   ├── instances_train2017.json
│   ├── instances_val2014.json
│   ├── instances_val2017.json
│   ├── instances_valminusminival2014.json
│   ├── person_keypoints_train2014.json
│   ├── person_keypoints_train2017.json
│   ├── person_keypoints_val2014.json
│   └── person_keypoints_val2017.json
└── images
    ├── train2014
    ├── train2017
    ├── val2014
    └── val2017

Links to download instances_minival2014.json and instances_valminusminival2014.json

Just put the dataset anywhere you want, and then soft-link the dataset to the data/ folder:

ln -s path/to/coco data/coco

It is recommended to put the images on an SSD (solid-state drive) for better training performance.

In my experience, COCO2014 has some mask annotations whose sizes (h, w) do not match the corresponding images. Possibly instances_minival2014.json and instances_valminusminival2014.json contain corrupted mask annotations; the COCO2017 dataset does not have this problem. COCO train2017 is said to equal COCO train2014 plus COCO valminusminival2014, and COCO val2017 to equal COCO minival2014. Therefore, the results should be reproducible using the COCO 2017 train-validation split.
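A quick way to scan for such mismatched annotations is sketched below, using pycocotools; the annotation file path is illustrative:

from pycocotools.coco import COCO

coco = COCO('data/coco/annotations/instances_minival2014.json')
for ann in coco.loadAnns(coco.getAnnIds()):
    img = coco.loadImgs(ann['image_id'])[0]
    h, w = coco.annToRLE(ann)['size']  # size recorded with the mask annotation
    if (h, w) != (img['height'], img['width']):
        print('mismatch:', ann['id'], (h, w), (img['height'], img['width']))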

Pretrained models

ImageNet pretrained weights from Caffe are used for the backbone networks.

  • ResNet50, ResNet101, ResNet152

  • VGG16 (the VGG backbone is not finished yet)

Download them and put them under the path {repo_root}/data/pretrained_model.

You can download them all with the following command:

python tools/download_imagenet_weights.py

Additional required packages: argparse_color_formater, colorama.

Note: Caffe pretrained weights perform slightly better than Pytorch pretrained ones. To reproduce the results, use the Caffe pretrained models from the links above. Incidentally, Detectron (an open source object detection library) also uses weights pretrained with Caffe.

If you want to use Pytorch pretrained models instead, remember to convert the images from BGR to RGB, and use the same data preprocessing (mean subtraction and normalization) as in the Pytorch pretraining process.
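A minimal sketch of that preprocessing, assuming cv2-loaded images and the standard torchvision ImageNet statistics; this is illustrative, not the repository's code:

import cv2
import numpy as np
import torch

def preprocess_rgb(path):
    img = cv2.imread(path)                        # OpenCV loads BGR, HWC, uint8
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)    # BGR -> RGB
    img = img.astype(np.float32) / 255.0
    mean = np.array([0.485, 0.456, 0.406], np.float32)  # standard ImageNet stats
    std = np.array([0.229, 0.224, 0.225], np.float32)
    img = (img - mean) / std                      # mean subtraction and normalization
    return torch.from_numpy(img.transpose(2, 0, 1))     # HWC -> CHW tensor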

Training

  • Train a mask-rcnn network from scratch with a res50 backbone:

python tools/train_net.py --dataset coco2017 --cfg configs/e2e_mask_rcnn_R-50-C4.yml --use_tfboard --bs {batch_size} --nw {num_workers}

Use --bs to overwrite the default batch size (e.g. 8) with a value that fits on your GPUs. Similarly, you may want to set --nw; the number of data loader threads defaults to 4 in config.py.

Specify --use_tfboard to log the losses to Tensorboard.

  • Resume training from the end of an epoch with exactly the same settings:

python tools/train_net.py --dataset coco2017 --cfg configs/e2e_mask_rcnn_R-50-C4.yml --resume --load_ckpt {path/to/the/checkpoint} --bs {batch_size}

Difference between with and without --resume: if --resume is specified, the optimizer state will be loaded from the checkpoint file; otherwise it will not.
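In other words, resuming behaves roughly like the sketch below. The checkpoint keys and function name are assumptions for illustration, not the repository's actual training loop:

import torch

def load_checkpoint(model, optimizer, path, resume):
    # Keys 'model' and 'optimizer' are assumed checkpoint layout.
    ckpt = torch.load(path, map_location='cpu')
    model.load_state_dict(ckpt['model'])              # weights are always restored
    if resume:                                        # optimizer state only with --resume
        optimizer.load_state_dict(ckpt['optimizer'])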

  • Train a keypoint-rcnn network:

python tools/train_net.py --dataset keypoints_coco2017 ...

  • Fine-tune from Detectron pretrained weights:

python train_net.py --dataset coco2017 --cfg cfgs/e2e_mask_rcnn_R-50-C4.yml --load_detectron {path/to/detectron/weight} --bs {batch_size}

Note: The optimizer state (i.e. the SGD momentum) is not loaded (loading it is not implemented).

Inference results

python tools/infer_simple.py --dataset coco --cfg cfgs/e2e_mask_rcnn_R-50-C4.yml --load_detectron {path/to/detectron/weight} --image_dir {dir/of/input/images} --output_dir {dir/to/save/visualizations}

--output_dir defaults to infer_outputs.

Metrics

keypoint_rcnn

  • e2e_keypoint_rcnn_R-50-FPN

Training command line: python tools/train_net.py --dataset keypoints_coco2017 --cfg configs/e2e_keypoint_rcnn_R-50-FPN.yaml --bs 8

Trained with a batch size of 8 images for 6 epochs over the dataset, decaying the learning rate by a factor of 0.1 at the beginning of the 5th and 6th epochs. Iterations per epoch: floor(113198 / 8) = 14149.

Dataset: keypoints_coco_2017_val

Task: Box detection

[table image: box detection AP results]

Task: Keypoints

The values in the table are the APs of Detectron's e2e_keypoint_rcnn_R-50-FPN_1x network, trained for 90,000 iterations with a batch size of 16 images, decaying the learning rate by a factor of 0.1 after the 60,000th and 80,000th iterations.

Visualization

An e2e_mask_rcnn_R-50_C4 network trained from scratch on coco_train_2017, with batches of 4 images, for 1 epoch:

[example visualization images]

Original GitHub README: https://github.com/roytseng-tw/Detectron.pytorch/blob/master/README.md
