Dataset
The dataset used is the Kaggle dataset "Semantic segmentation of aerial imagery", whose data is organized in the form
project structure
utils
dataConvert.py
dataConvert.py mainly contains the data transformation functions.
function | effect |
---|---|
loadColorMap | Loads the color map of the labels |
voc_colormap2label | Builds the mapping from label colors to numeric labels |
voc_rand_crop | Randomly crops the data |
voc_label_indices | Converts RGB labels to numeric labels |
one hot | Converts labels to one-hot encoding |
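
The label conversion described in the table can be sketched as follows. This is a minimal NumPy sketch, not the repository's code: the function names `colormap2label`, `label_indices` and `one_hot` are illustrative, and the packed-RGB lookup table is an assumed technique. Only the color map values come from this project.

```python
import numpy as np

# Color map as listed in this project's prepare module.
VOC_COLORMAP = [[226, 169, 41], [132, 41, 246], [110, 193, 228],
                [60, 16, 152], [254, 221, 58], [155, 155, 155]]


def colormap2label(colormap):
    """Build a lookup table from a packed RGB value to a class index."""
    table = np.full(256 ** 3, -1, dtype=np.int16)  # -1 marks unknown colors
    for idx, (r, g, b) in enumerate(colormap):
        table[(r * 256 + g) * 256 + b] = idx
    return table


def label_indices(rgb_label, table):
    """Convert an (H, W, 3) RGB label image into an (H, W) index map."""
    rgb = rgb_label.astype(np.int64)  # avoid uint8 overflow when packing
    packed = (rgb[..., 0] * 256 + rgb[..., 1]) * 256 + rgb[..., 2]
    return table[packed]


def one_hot(indices, num_classes):
    """Convert an (H, W) index map into an (H, W, num_classes) one-hot array.

    Assumes every pixel has a known color (no -1 entries).
    """
    return np.eye(num_classes, dtype=np.float32)[indices]
```

Building the table once and indexing it with the packed pixel values converts a whole label image in a single vectorized lookup.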
dataLoader.py
dataLoader.py contains the data loading code.
class/function | effect |
---|---|
SemanticDataset | Dataset class that loads the data, including normalization and cropping |
load_data_voc | Calls SemanticDataset to load the training set and test set in batches |
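
The normalization and cropping steps that the dataset class performs might look like the sketch below. It is written with NumPy under stated assumptions: `rand_crop` and `normalize` are hypothetical helper names, images are assumed to be (H, W, C) `uint8` arrays, and the means and standard deviations are assumed to be computed on pixel values scaled to [0, 1].

```python
import numpy as np


def rand_crop(image, label, height, width, rng=None):
    """Cut the same random window out of an image and its label."""
    rng = np.random.default_rng() if rng is None else rng
    max_top = image.shape[0] - height
    max_left = image.shape[1] - width
    if max_top < 0 or max_left < 0:
        raise ValueError("crop size exceeds image size")
    top = int(rng.integers(0, max_top + 1))
    left = int(rng.integers(0, max_left + 1))
    return (image[top:top + height, left:left + width],
            label[top:top + height, left:left + width])


def normalize(image, means, stds):
    """Scale pixel values to [0, 1], then standardize each channel."""
    scaled = image.astype(np.float32) / 255.0
    means = np.asarray(means, dtype=np.float32)
    stds = np.asarray(stds, dtype=np.float32)
    return (scaled - means) / stds
```

Cropping the image and the label with the same offsets keeps every pixel aligned with its class.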
losses.py
Defines the loss function. This project uses the sum of Focal loss and Dice loss as the loss function.
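
To illustrate what summing the two losses means, here is a minimal NumPy sketch. It is not the code in losses.py: the function names, `gamma=2.0`, `smooth=1.0` and the flattened (N, C) layout are all assumptions made for the example, and the project itself may weight or reduce the terms differently.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def focal_loss(probs, target, gamma=2.0, eps=1e-7):
    """Mean focal loss; probs and one-hot target have shape (N, C)."""
    p_t = np.clip((probs * target).sum(axis=-1), eps, 1.0)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))


def dice_loss(probs, target, smooth=1.0):
    """One minus the mean per-class soft Dice coefficient."""
    intersection = (probs * target).sum(axis=0)
    denom = probs.sum(axis=0) + target.sum(axis=0)
    dice = (2.0 * intersection + smooth) / (denom + smooth)
    return float(1.0 - dice.mean())


def focal_dice_loss(logits, target, gamma=2.0, smooth=1.0):
    """Sum of focal loss and Dice loss on (N, C) logits."""
    probs = softmax(logits)
    return focal_loss(probs, target, gamma) + dice_loss(probs, target, smooth)
```

Focal loss down-weights pixels that are already classified well, while Dice loss measures region overlap per class, so the sum is a common choice when classes are imbalanced.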
model.py
Contains two models, U-Net and DeepLabv3+. The model is selected by changing a parameter during training and testing.
prepare module
This module is run before training and prepares the data for the entire project.
function
function | effect |
---|---|
semantic2dataset | Converts the aerial dataset into a semantic segmentation dataset |
trainValSplit | Splits the data into training and test sets |
getMeanStd | Computes the mean and standard deviation |
writeColorClasses | Saves the colors and categories |
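
Two of these preparation steps, computing the statistics and splitting the data, can be sketched as below. The helper names `get_mean_std` and `train_val_split`, the `val_ratio=0.2` default and the fixed seed are assumptions for illustration; the repository's own functions may work on files rather than in-memory arrays.

```python
import random

import numpy as np


def get_mean_std(images):
    """Per-channel mean and standard deviation of (H, W, C) uint8 images.

    Pixel values are scaled to [0, 1] before the statistics are computed.
    """
    channels = images[0].shape[-1]
    total = np.zeros(channels)
    total_sq = np.zeros(channels)
    count = 0
    for image in images:
        pixels = image.reshape(-1, channels).astype(np.float64) / 255.0
        total += pixels.sum(axis=0)
        total_sq += (pixels ** 2).sum(axis=0)
        count += pixels.shape[0]
    mean = total / count
    # Var[x] = E[x^2] - E[x]^2; clamp tiny negatives caused by rounding.
    std = np.sqrt(np.maximum(total_sq / count - mean ** 2, 0.0))
    return mean, std


def train_val_split(names, val_ratio=0.2, seed=0):
    """Shuffle file names reproducibly and split them into two lists."""
    names = sorted(names)
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_ratio)
    return names[n_val:], names[:n_val]
```

Accumulating sums image by image avoids holding the whole dataset in memory at once.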
parameter
There are only two parameters: the color map and the class names. In this project they are
VOC_COLORMAP = [[226, 169, 41], [132, 41, 246], [110, 193, 228], [60, 16, 152], [254, 221, 58], [155, 155, 155]]
VOC_CLASSES = ['Water', 'Land (unpaved area)', 'Road', 'Building', 'Vegetation', 'Unlabeled']
train module
function
train
Trains the model according to the parameters passed in.
parameter
parameter | effect |
---|---|
batch_size | Batch size; a small value is usually enough for semantic segmentation |
crop_size | Size of the cropped image |
model_choice | Model selection: U-net or deeplabv3+ |
in_channels | Number of input image channels: 3 for RGB images, 1 for grayscale images |
out_channels | Number of output label classes; 6 in this project |
num_epochs | Total number of training epochs |
auto_save | Interval, in epochs, at which weights are saved automatically |
lr | Learning rate |
device | Device used for training; set to cuda automatically when cuda is available, otherwise to cpu |
predict module
The predict module is only a quick check of the model's accuracy and output quality. If needed, an application can call the predict function directly and integrate its predictions.
function
function | effect |
---|---|
label2image | Converts numeric labels to RGB labels |
predict | Predicts a single image |
read_voc_images | Reads the images |
plotPredictAns | Plots the test results |
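
The step from model output to a color image can be sketched as follows. This is an assumed NumPy illustration rather than the repository's code: `logits_to_labels` is a hypothetical helper, the (C, H, W) layout of the model output is an assumption, and `label2image` here only mirrors the name in the table. The color map is the one listed in the prepare module.

```python
import numpy as np

# Color map as listed in this project's prepare module.
VOC_COLORMAP = [[226, 169, 41], [132, 41, 246], [110, 193, 228],
                [60, 16, 152], [254, 221, 58], [155, 155, 155]]


def logits_to_labels(logits):
    """Pick the highest-scoring class per pixel: (C, H, W) -> (H, W)."""
    return logits.argmax(axis=0)


def label2image(labels, colormap=VOC_COLORMAP):
    """Map an (H, W) array of class indices to an (H, W, 3) RGB image."""
    palette = np.asarray(colormap, dtype=np.uint8)
    return palette[labels]
```

Indexing the palette with the label map colors every pixel in one vectorized operation, which is the inverse of converting RGB labels to numeric labels.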
parameter
parameter | effect |
---|---|
you_dir | Path to the test data |
means | Image mean |
stds | Image standard deviation |
device | Device used for prediction; set to cuda automatically when cuda is available, otherwise to cpu |
batch_size | Batch size |
model_choice | Model selection: U-net or deeplabv3+ |
Download link
GitHub repository: Semantic-segmentation-for-aerial
Detailed explanations
Semantic Segmentation Project (1) - Data Overview and Preprocessing
Semantic Segmentation Project (2) - Label Conversion and Data Loading
Semantic Segmentation Project (3) - Semantic Segmentation Model (U-net and deeplabv3+)
Semantic Segmentation Project (4) - Model Training and Prediction