data set

The dataset is "Semantic segmentation of aerial imagery" from Kaggle.
Its data is organized as follows:
[Image: directory layout of the dataset]

project structure

[Image: project structure]

utils

dataConvert.py

dataConvert.py mainly contains the data transformation functions

function	effect
loadColorMap	Loads the colormap of the labels
voc_colormap2label	Builds the mapping from RGB color labels to numeric class labels
voc_rand_crop	Randomly crops the data
voc_label_indices	Converts RGB labels to numeric labels
one hot	Converts labels to one-hot encoding
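The color-to-index conversion described in this table can be sketched as follows. This is a minimal NumPy sketch, not the project's actual code: the helper names `build_color_lookup`, `rgb_to_indices`, and `to_one_hot` are hypothetical, and the lookup-table approach (packing each RGB triple into a single integer) is an assumption based on the `voc_` naming. The colormap is the one listed in the prepare module.

```python
import numpy as np

# Colormap from the prepare module; the position in the list is the class index.
VOC_COLORMAP = [[226, 169, 41], [132, 41, 246], [110, 193, 228],
                [60, 16, 152], [254, 221, 58], [155, 155, 155]]


def build_color_lookup(colormap):
    """Return a table of size 256**3 mapping a packed RGB value to a class index."""
    lookup = np.full(256 ** 3, -1, dtype=np.int8)  # -1 marks colors not in the colormap
    for class_index, (r, g, b) in enumerate(colormap):
        lookup[(r * 256 + g) * 256 + b] = class_index
    return lookup


def rgb_to_indices(label_rgb, lookup):
    """Convert an (H, W, 3) uint8 RGB label image to an (H, W) array of class indices."""
    rgb = label_rgb.astype(np.int64)  # widen first so the packing cannot overflow uint8
    packed = (rgb[..., 0] * 256 + rgb[..., 1]) * 256 + rgb[..., 2]
    return lookup[packed].astype(np.int64)


def to_one_hot(indices, num_classes):
    """Convert an (H, W) index array to a (num_classes, H, W) one-hot array."""
    classes = np.arange(num_classes)[:, None, None]
    return (classes == indices[None]).astype(np.float32)


# Usage: a 1x2 label image containing one Water pixel and one Building pixel.
lookup = build_color_lookup(VOC_COLORMAP)
label = np.array([[[226, 169, 41], [60, 16, 152]]], dtype=np.uint8)
indices = rgb_to_indices(label, lookup)
one_hot = to_one_hot(indices, len(VOC_COLORMAP))
```

Pixels whose color is not in the colormap come out as -1 in this sketch, which makes stray colors easy to detect before training.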

dataLoader.py

dataLoader.py contains the data loading process

class/function	effect
SemanticDataset	Dataset class that loads the data, including normalization and cropping
load_data_voc	Call SemanticDataset to load the training set and test set in batches
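The two preprocessing steps named for SemanticDataset, cropping and normalization, can be sketched as below. This is only an illustration under stated assumptions: the function names `random_crop_pair` and `normalize` are hypothetical, images are assumed to be (H, W, C) uint8 arrays, and scaling to [0, 1] before standardizing is an assumption about how the mean and standard deviation were computed.

```python
import numpy as np


def random_crop_pair(image, label, crop_h, crop_w, rng):
    """Crop the same random window from an (H, W, C) image and its (H, W) label."""
    h, w = label.shape[:2]
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    window = (slice(top, top + crop_h), slice(left, left + crop_w))
    return image[window], label[window]


def normalize(image, means, stds):
    """Scale a uint8 (H, W, C) image to [0, 1], then standardize each channel."""
    scaled = image.astype(np.float32) / 255.0
    means = np.asarray(means, dtype=np.float32)
    stds = np.asarray(stds, dtype=np.float32)
    return (scaled - means) / stds
```

Using one window for both arrays is what keeps each pixel aligned with its label after cropping.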

losses.py

Defines the loss function. This project uses the sum of Focal loss and Dice loss as the loss function
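As a rough illustration of what summing the two losses means, here is a NumPy sketch for a single image with class scores of shape (C, H, W). It is not the project's losses.py: the function names, `gamma=2.0`, the absence of class weights, and the equal weighting of the two terms are all assumptions.

```python
import numpy as np


def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def focal_loss(probs, one_hot, gamma=2.0, eps=1e-7):
    """Mean over pixels of -(1 - p_t)**gamma * log(p_t), p_t = true-class probability."""
    p_t = (probs * one_hot).sum(axis=0)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t + eps)))


def dice_loss(probs, one_hot, eps=1e-7):
    """One minus the soft Dice coefficient, averaged over classes."""
    intersection = (probs * one_hot).sum(axis=(1, 2))
    totals = probs.sum(axis=(1, 2)) + one_hot.sum(axis=(1, 2))
    dice = (2.0 * intersection + eps) / (totals + eps)
    return float(1.0 - dice.mean())


def combined_loss(logits, one_hot):
    """Focal loss plus Dice loss, computed from raw (C, H, W) class scores."""
    probs = softmax(logits, axis=0)
    return focal_loss(probs, one_hot) + dice_loss(probs, one_hot)
```

The focal term down-weights pixels that are already classified confidently, while the Dice term measures region overlap per class, so the sum responds to both per-pixel errors and class imbalance.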

model.py

Contains two models, U-net and deeplabv3+, which can be selected by changing a parameter during training and testing

prepare module

This module runs before training and prepares the data for the entire project

function

function	effect
semantic2dataset	Converts the aerial dataset into a semantic segmentation dataset
trainValSplit	Splits the data into training and test sets
getMeanStd	Computes the image mean and standard deviation
writeColorClasses	Saves the colors and class names
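The statistics step can be sketched as follows. This is a hedged illustration rather than the project's getMeanStd: the name `channel_mean_std` is hypothetical, and it assumes the statistics are taken per channel over all pixels after scaling to [0, 1]. It accumulates sums so the images never have to be held in one large array.

```python
import numpy as np


def channel_mean_std(images):
    """Per-channel mean and standard deviation over (H, W, C) uint8 images scaled to [0, 1]."""
    pixel_count = 0
    channel_sum = 0.0
    channel_sq_sum = 0.0
    for image in images:
        scaled = image.astype(np.float64) / 255.0
        pixels = scaled.reshape(-1, scaled.shape[-1])
        pixel_count += pixels.shape[0]
        channel_sum = channel_sum + pixels.sum(axis=0)
        channel_sq_sum = channel_sq_sum + (pixels ** 2).sum(axis=0)
    mean = channel_sum / pixel_count
    # Var[x] = E[x^2] - E[x]^2; clamp at 0 to guard against rounding error.
    variance = np.maximum(channel_sq_sum / pixel_count - mean ** 2, 0.0)
    return mean, np.sqrt(variance)
```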

parameter

There are only two parameters: the colormap and the class names. In this project they are

VOC_COLORMAP = [[226, 169, 41], [132, 41, 246], [110, 193, 228], [60, 16, 152], [254, 221, 58], [155, 155, 155]]
VOC_CLASSES = ['Water', 'Land (unpaved area)', 'Road', 'Building', 'Vegetation', 'Unlabeled']

train module

function

function	effect
train	Trains the model according to the parameters passed in

parameter

parameter	effect
batch_size	Batch size; it can be set fairly small for semantic segmentation
crop_size	Size of the cropped image
model_choice	Model selection: U-net or deeplabv3+
in_channels	Number of input image channels: 3 for RGB images, 1 for grayscale images
out_channels	Number of output label classes; 6 in this project
num_epochs	Total number of training epochs
auto_save	Interval, in epochs, at which weights are saved automatically
lr	Learning rate
device	Device used for training; set automatically to cuda when cuda is available, otherwise to cpu
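One way to group these parameters is a small configuration object, sketched below. Everything here is illustrative: `TrainConfig`, `pick_device`, and `should_save` are hypothetical names, the default values are placeholders rather than the author's settings (only `out_channels=6` comes from the text), and the use of PyTorch for the cuda check is an assumption.

```python
from dataclasses import dataclass, field


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: the project is built on PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


@dataclass
class TrainConfig:
    # Placeholder defaults, not the author's settings (except out_channels).
    batch_size: int = 4
    crop_size: tuple = (256, 256)
    model_choice: str = "U-net"
    in_channels: int = 3
    out_channels: int = 6
    num_epochs: int = 100
    auto_save: int = 10
    lr: float = 1e-3
    device: str = field(default_factory=pick_device)

    def __post_init__(self):
        # Only the two models named in the text are accepted.
        if self.model_choice not in ("U-net", "deeplabv3+"):
            raise ValueError(f"unknown model_choice: {self.model_choice!r}")


def should_save(epoch, auto_save):
    """True on every auto_save-th epoch, with epochs counted from 1."""
    return epoch % auto_save == 0
```

Validating `model_choice` up front means a typo fails immediately instead of after the data has been loaded.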

predict module

The predict module is only a simple check of the model's accuracy and output quality. If needed, an application can call the predict function and integrate its predictions into the actual use case

function

function	effect
label2image	Converts numeric labels to RGB labels
predict	Predicts a single image
read_voc_images	Reads images
plotPredictAns	Plots the prediction results
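The step from model output back to a color image can be sketched as below. This is a NumPy illustration under assumptions, not the project's label2image: the names `scores_to_indices` and `indices_to_rgb` are hypothetical, and the model output is assumed to be a (C, H, W) array of class scores. The colormap is the one listed in the prepare module.

```python
import numpy as np

# Colormap from the prepare module; the position in the list is the class index.
VOC_COLORMAP = [[226, 169, 41], [132, 41, 246], [110, 193, 228],
                [60, 16, 152], [254, 221, 58], [155, 155, 155]]


def scores_to_indices(scores):
    """Pick the highest-scoring class per pixel from a (C, H, W) score array."""
    return scores.argmax(axis=0)


def indices_to_rgb(indices, colormap):
    """Map an (H, W) array of class indices to an (H, W, 3) uint8 RGB image."""
    palette = np.asarray(colormap, dtype=np.uint8)
    return palette[indices]  # integer-array indexing looks up one color per pixel
```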

parameter

parameter	effect
you_dir	Path to the test data
means	Image mean
stds	Image standard deviation
device	Device used to run the model; set automatically to cuda when cuda is available, otherwise to cpu
batch_size	Batch size
model_choice	Model selection: U-net or deeplabv3+