PyTorch deep learning case (2) - semantic segmentation of aerial street imagery

data set

The dataset used is "Semantic segmentation of aerial imagery" from Kaggle. Its data is organized as follows:
[image: dataset directory structure]

project structure

[image: project structure]

utils

dataConvert.py

dataConvert.py mainly contains the functions that transform the data

function              effect
loadColorMap          Loads the colormap for the labels
voc_colormap2label    Builds the mapping from label colors to numeric labels
voc_rand_crop         Randomly crops the data
voc_label_indices     Converts RGB labels to numeric labels
one hot               Converts labels to one-hot encoding
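
The label conversion described in the table can be sketched roughly as follows. This is a minimal NumPy sketch, not the project's actual code: the function names colormap2label, label_indices and one_hot are stand-ins chosen for illustration, and the packed-RGB lookup table is an assumed approach. Only the colormap values come from this project.

```python
import numpy as np

# Colormap as listed in this project's parameters.
VOC_COLORMAP = [[226, 169, 41], [132, 41, 246], [110, 193, 228],
                [60, 16, 152], [254, 221, 58], [155, 155, 155]]

def colormap2label(colormap):
    """Build a lookup table from a packed RGB value to a class index."""
    # -1 marks colors that do not appear in the colormap (assumed convention).
    table = np.full(256 ** 3, -1, dtype=np.int8)
    for idx, (r, g, b) in enumerate(colormap):
        table[(r * 256 + g) * 256 + b] = idx
    return table

def label_indices(rgb_label, table):
    """Convert an (H, W, 3) RGB label image into (H, W) class indices."""
    rgb = rgb_label.astype(np.int64)
    packed = (rgb[..., 0] * 256 + rgb[..., 1]) * 256 + rgb[..., 2]
    return table[packed].astype(np.int64)

def one_hot(indices, num_classes):
    """Convert (H, W) class indices into a (num_classes, H, W) one-hot array."""
    return np.eye(num_classes, dtype=np.float32)[indices].transpose(2, 0, 1)

# Usage on a tiny 2x2 label image.
table = colormap2label(VOC_COLORMAP)
rgb_label = np.array([[[226, 169, 41], [60, 16, 152]],
                      [[110, 193, 228], [155, 155, 155]]], dtype=np.uint8)
indices = label_indices(rgb_label, table)
encoded = one_hot(indices, len(VOC_COLORMAP))
```

Packing each color into a single integer lets the whole image be converted with one array lookup instead of a per-pixel loop.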

dataLoader.py

dataLoader.py contains the data loading process

class/function     effect
SemanticDataset    Dataset class used to load the data; handles normalization and cropping
load_data_voc      Uses SemanticDataset to load the training and test sets in batches
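
The normalization and cropping that SemanticDataset is described as doing might look like the sketch below. It is written with NumPy for brevity and is only an assumption about the approach: rand_crop and normalize are hypothetical helper names, and the mean and standard deviation values in the usage lines are placeholders, not the project's statistics.

```python
import numpy as np

def rand_crop(image, label, height, width, rng):
    """Crop the same random (height, width) window from an image and its label."""
    h, w = image.shape[:2]
    if h < height or w < width:
        raise ValueError("crop size exceeds image size")
    top = int(rng.integers(0, h - height + 1))
    left = int(rng.integers(0, w - width + 1))
    window = (slice(top, top + height), slice(left, left + width))
    return image[window], label[window]

def normalize(image, means, stds):
    """Scale a uint8 (H, W, 3) image to [0, 1], then standardize each channel."""
    scaled = image.astype(np.float32) / 255.0
    means = np.asarray(means, dtype=np.float32)
    stds = np.asarray(stds, dtype=np.float32)
    return (scaled - means) / stds

# Usage with random stand-in data (placeholder statistics).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 10, 3), dtype=np.uint8)
label = rng.integers(0, 6, size=(8, 10))
crop_img, crop_lbl = rand_crop(image, label, 4, 4, rng)
x = normalize(crop_img, means=[0.5, 0.5, 0.5], stds=[0.25, 0.25, 0.25])
```

The image and the label must be cropped with the same window, otherwise the pixels and their classes no longer line up.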

losses.py

Defines the loss function. This project uses the sum of Focal loss and Dice loss as its loss function
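
A sum of Focal loss and Dice loss could be written as in the sketch below. This is not the code in losses.py: the function names, gamma=2.0, the smoothing term eps, the absence of class weights and the unweighted sum are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, target, gamma=2.0):
    """Multi-class focal loss. logits: (N, C, H, W); target: (N, H, W) int64."""
    log_p = F.log_softmax(logits, dim=1)
    # Log-probability assigned to the true class of every pixel.
    log_pt = log_p.gather(1, target.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    # Down-weight pixels that are already classified confidently.
    return (-((1.0 - pt) ** gamma) * log_pt).mean()

def dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss, averaged over classes."""
    num_classes = logits.shape[1]
    probs = F.softmax(logits, dim=1)
    one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).to(probs.dtype)
    dims = (0, 2, 3)
    intersection = (probs * one_hot).sum(dims)
    cardinality = probs.sum(dims) + one_hot.sum(dims)
    dice = (2.0 * intersection + eps) / (cardinality + eps)
    return 1.0 - dice.mean()

def focal_dice_loss(logits, target, gamma=2.0):
    """Unweighted sum of the two losses (assumed combination)."""
    return focal_loss(logits, target, gamma) + dice_loss(logits, target)

# Usage with random stand-in data: 2 images, 6 classes, 4x4 pixels.
torch.manual_seed(0)
logits = torch.randn(2, 6, 4, 4, requires_grad=True)
target = torch.randint(0, 6, (2, 4, 4))
loss = focal_dice_loss(logits, target)
loss.backward()
```

Focal loss focuses training on hard pixels, while Dice loss measures region overlap per class, so the two are often combined when classes are imbalanced.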

model.py

Contains two models, U-net and deeplabv3+. The model to use is selected through a parameter during training and testing

prepare module

This module runs before training and prepares the data for the entire project

function

function             effect
semantic2dataset     Converts the aerial dataset into a semantic segmentation dataset
trainValSplit        Splits the data into training and test sets
getMeanStd           Computes the image mean and standard deviation
writeColorClasses    Saves the colors and classes
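
The statistics that getMeanStd produces are the ones later used to normalize the images. A minimal sketch of computing a per-channel mean and standard deviation is shown below. It assumes uint8 RGB images scaled to [0, 1]; get_mean_std is a hypothetical name, and the project's own function may work differently.

```python
import numpy as np

def get_mean_std(images):
    """Per-channel mean and standard deviation over uint8 (H, W, 3) images."""
    pixel_count = 0
    channel_sum = np.zeros(3, dtype=np.float64)
    channel_sq_sum = np.zeros(3, dtype=np.float64)
    for image in images:
        pixels = image.reshape(-1, 3).astype(np.float64) / 255.0
        pixel_count += pixels.shape[0]
        channel_sum += pixels.sum(axis=0)
        channel_sq_sum += (pixels ** 2).sum(axis=0)
    mean = channel_sum / pixel_count
    # Var[x] = E[x^2] - E[x]^2; clamp at 0 to guard against rounding error.
    variance = np.maximum(channel_sq_sum / pixel_count - mean ** 2, 0.0)
    return mean, np.sqrt(variance)

# Usage: one all-black and one all-white image.
black = np.zeros((2, 2, 3), dtype=np.uint8)
white = np.full((2, 2, 3), 255, dtype=np.uint8)
mean, std = get_mean_std([black, white])
```

Accumulating sums instead of stacking all images keeps memory use flat, which matters when the images have different sizes or the dataset is large.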

parameter

There are only two parameters: the colormap and the classes. In this project they are

VOC_COLORMAP = [[226, 169, 41], [132, 41, 246], [110, 193, 228], [60, 16, 152], [254, 221, 58], [155, 155, 155]]
VOC_CLASSES = ['Water', 'Land (unpaved area)', 'Road', 'Building', 'Vegetation', 'Unlabeled']

train module

function

function    effect
train       Trains the model according to the parameters passed in

parameter

parameter       effect
batch_size      Batch size; it can be set fairly small for semantic segmentation
crop_size       Size of the cropped image
model_choice    Model selection: U-net or deeplabv3+
in_channels     Number of input image channels: 3 for RGB images, 1 for grayscale images
out_channels    Number of output label classes: 6 in this project
num_epochs      Total number of training epochs
auto_save       Interval, in epochs, between automatic weight saves
lr              Learning rate
device          Device used for training: set to cuda automatically when CUDA is available, otherwise to cpu
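
The parameters above could be collected in one place as in the following sketch. It is an illustration only: TrainConfig, pick_device and should_save are hypothetical names, crop_size being a (height, width) tuple is an assumption, and the values passed in the usage lines are placeholders rather than the project's settings. Only in_channels=3 and out_channels=6 come from the table.

```python
from dataclasses import dataclass, field
from typing import Tuple

import torch

def pick_device() -> torch.device:
    """Use CUDA when it is available, otherwise fall back to the CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")

@dataclass
class TrainConfig:
    batch_size: int
    crop_size: Tuple[int, int]  # assumed to be (height, width)
    model_choice: str
    num_epochs: int
    auto_save: int
    lr: float
    in_channels: int = 3   # RGB input
    out_channels: int = 6  # six classes in this project
    device: torch.device = field(default_factory=pick_device)

    def should_save(self, epoch: int) -> bool:
        """True when the 1-based epoch number is a multiple of auto_save."""
        return epoch % self.auto_save == 0

# Usage with placeholder values.
cfg = TrainConfig(batch_size=4, crop_size=(256, 256), model_choice="U-net",
                  num_epochs=20, auto_save=5, lr=1e-3)
save_epochs = [e for e in range(1, cfg.num_epochs + 1) if cfg.should_save(e)]
```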

predict module

The predict module is only a quick check of the model's accuracy and output quality. If needed, an application can call the predict function directly and integrate its results

function

function           effect
label2image        Converts numeric labels to RGB labels
predict            Predicts a single image
read_voc_images    Reads the images
plotPredictAns     Plots the test results
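
Turning a prediction back into a color image, as label2image is described as doing, can be sketched as follows. This is a NumPy illustration and not the project's implementation: the helper names scores_to_indices and indices_to_rgb are made up, and the (C, H, W) score layout is an assumption. The colormap is the one listed in this project.

```python
import numpy as np

# Colormap as listed in this project's parameters.
VOC_COLORMAP = [[226, 169, 41], [132, 41, 246], [110, 193, 228],
                [60, 16, 152], [254, 221, 58], [155, 155, 155]]

def scores_to_indices(scores):
    """Pick the highest-scoring class per pixel. scores: (C, H, W)."""
    return scores.argmax(axis=0)

def indices_to_rgb(indices, colormap=VOC_COLORMAP):
    """Map (H, W) class indices to an (H, W, 3) uint8 RGB image."""
    palette = np.array(colormap, dtype=np.uint8)
    return palette[indices]

# Usage: fake scores for a 2x2 image with 6 classes.
scores = np.zeros((6, 2, 2), dtype=np.float32)
scores[0, 0, 0] = 1.0  # top-left pixel     -> class 0
scores[3, 0, 1] = 1.0  # top-right pixel    -> class 3
scores[2, 1, 0] = 1.0  # bottom-left pixel  -> class 2
scores[5, 1, 1] = 1.0  # bottom-right pixel -> class 5
pred = scores_to_indices(scores)
rgb = indices_to_rgb(pred)
```

Indexing the palette array with the whole index map colors every pixel in a single step.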

parameter

parameter       effect
you_dir         Path to the test data
means           Image mean
stds            Image standard deviation
device          Device used for prediction: set to cuda automatically when CUDA is available, otherwise to cpu
batch_size      Batch size
model_choice    Model selection: U-net or deeplabv3+

download link

GitHub download address: Semantic-segmentation-for-aerial

Detailed explanation

Semantic Segmentation Project (1) - Data Overview and Preprocessing

Semantic Segmentation Project (2) - Label Conversion and Data Loading

Semantic Segmentation Project (3) - Semantic Segmentation Models (U-net and deeplabv3+)

Semantic Segmentation Project (4) - Model Training and Prediction


Origin blog.csdn.net/DuLNode/article/details/129118733