Remote Sensing Resources (Part 1): Using Open Source Code to Train a Land Classification Model

Summary of content: Land classification is one of the important application scenarios of remote sensing imagery. This article introduces several common methods of land classification, and uses open source semantic segmentation code to create a land classification model.

Original: HyperAI Super Nerve

Keywords: remote sensing data set, semantic segmentation, machine vision

Remote sensing images are essential data for surveying, mapping, and geographic information. They are of great significance for monitoring national geographic conditions and updating geographic information databases, and they play an increasingly important role in military, commercial, and civil applications.

In recent years, as national satellite image acquisition capabilities have improved, remote sensing data has been collected far more efficiently, and many kinds of sensors now coexist: low spatial resolution, high spatial resolution, wide field of view, multi-angle, and radar.

Landsat 2 in orbit is collecting remote sensing data from the Earth


This satellite, the second in NASA's Landsat program, was launched in 1975 to obtain global seasonal data at medium resolution.

This broad range of sensors meets the needs of earth observation for different purposes, but it also causes problems such as inconsistent image data formats and heavy storage consumption, which make image processing more challenging.

Take land classification as an example. In the past, classifying land from remote sensing images relied on a large amount of manual labeling and counting, which took months or even a year. Given the complexity and diversity of land types, human statistical errors were inevitable.

With the development of artificial intelligence technology, the acquisition, processing, and analysis of remote sensing images have become more intelligent and efficient.

Common land classification methods

Commonly used land classification methods are basically divided into three categories: traditional classification methods based on GIS, classification methods based on machine learning algorithms, and classification methods using neural network semantic segmentation.

Traditional method: classification using GIS

GIS is a tool often needed to process remote sensing images. Its full name is Geographic Information System.

It integrates advanced technologies such as relational database management, efficient graphics algorithms, interpolation, zoning, and network analysis to make spatial analysis simple and easy.

Spatial analysis of the eastern tributary area of the Elizabeth River using GIS


Using the spatial analysis technology of GIS, the spatial location, distribution, form, formation and evolution of the corresponding land type can be obtained, and the land characteristics can be identified and judged.

Machine learning: classification using algorithms 

Machine learning approaches to land classification include supervised classification and unsupervised classification.

Supervised classification, also called the training classification method, uses training sample pixels whose categories have been confirmed as references: pixels of unknown category are compared with them and assigned a category, until the whole land area is classified.

When the training samples are not accurate enough, the training areas are usually re-selected or corrected by visual inspection to ensure the accuracy of the training sample pixels.

Remote sensing image after supervised classification (left): red is construction land, green is non-construction land
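The idea of comparing unknown pixels with labeled training pixels can be sketched with a minimum-distance-to-mean classifier, one of the simplest supervised schemes. This is only an illustration: the pixel values are hypothetical, the two class names are borrowed from the figure caption, and `class_means` and `classify` are names made up for this sketch.

```python
from math import dist

def class_means(samples):
    """Average the labeled training pixels of each class, band by band."""
    means = {}
    for label, pixels in samples.items():
        n_bands = len(pixels[0])
        means[label] = tuple(
            sum(p[b] for p in pixels) / len(pixels) for b in range(n_bands)
        )
    return means

def classify(pixel, means):
    """Assign the pixel to the class whose mean spectrum is closest."""
    return min(means, key=lambda label: dist(pixel, means[label]))

# Hypothetical training pixels (red, green, blue) picked from known areas.
training = {
    "construction": [(180, 170, 160), (200, 190, 185), (170, 165, 150)],
    "non-construction": [(40, 110, 50), (60, 130, 70), (50, 120, 55)],
}
means = class_means(training)
labels = [classify(p, means) for p in [(190, 180, 170), (55, 125, 60)]]
print(labels)
```

If the training pixels are poorly chosen, the class means shift and the assignments change with them, which is why re-selecting the training areas matters.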


Unsupervised classification requires no prior classification criteria. Instead, it groups pixels statistically based on their spectral characteristics in the remote sensing image. This method is highly automated and needs little human intervention.
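As a rough illustration of grouping pixels without any labels, the sketch below runs plain k-means on single-band brightness values. K-means is used here only as a common example of unsupervised clustering; the article does not name a specific algorithm, and the pixel values and the `kmeans_1d` helper are hypothetical.

```python
def kmeans_1d(values, k, iterations=10):
    """Cluster scalar pixel values into k groups with plain k-means."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the value range.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    assign = [min(range(k), key=lambda i: abs(v - centers[i])) for v in values]
    return centers, assign

# Hypothetical single-band brightness values for eight pixels.
brightness = [12, 15, 11, 14, 200, 210, 205, 198]
centers, assign = kmeans_1d(brightness, k=2)
print(centers, assign)
```

The clusters come out without names: deciding which cluster corresponds to which land type is still left to a person.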

With the help of machine learning algorithms such as support vector machines and maximum likelihood methods, the efficiency and accuracy of supervised classification and unsupervised classification can be greatly improved.

Neural networks: classification using semantic segmentation

Semantic segmentation is an end-to-end pixel-level classification method that can strengthen the machine's understanding of environmental scenes and is widely used in fields such as autonomous driving and land planning.

Semantic segmentation technology based on deep neural networks performs better than traditional machine learning methods when processing pixel-level classification tasks.
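Pixel-level classification means that the network outputs one score per class for every pixel, and the predicted mask keeps the highest-scoring class at each position. A minimal sketch of that last step, with hypothetical scores and a made-up helper name `scores_to_mask`:

```python
def scores_to_mask(scores):
    """Pick the highest-scoring class index for every pixel."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]

# Hypothetical 2x2 image with scores for 3 classes at each pixel.
scores = [
    [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]],
    [[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]],
]
mask = scores_to_mask(scores)
print(mask)
```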

Semantic segmentation applied to a remote sensing image of an area


High-resolution remote sensing images have complex scenes, rich details, and uncertain spectral differences between ground objects, which can easily lead to low segmentation accuracy and even invalid segmentation.

Using semantic segmentation to process high-resolution and ultra-high-resolution remote sensing images makes it possible to extract pixel-level features more accurately and identify specific land types quickly and accurately, which speeds up the processing of remote sensing images.

Commonly used open source models for semantic segmentation

Commonly used pixel-level semantic segmentation open source models include FCN, SegNet and DeepLab.

1. Fully Convolutional Network (FCN)

Features: End-to-end semantic segmentation
Advantages: no limitation on image size, versatility and high efficiency

Disadvantages: cannot run inference in real time, results are not fine enough, and not sensitive to image details

2. SegNet

Features: Passes the max-pooling indices to the decoder to improve the segmentation resolution.
Advantages: fast training, high efficiency, and low memory usage.
Disadvantages: it is not feed-forward during testing and needs to be optimized to determine the MAP label


3. DeepLab

DeepLab was released by Google AI and advocates using deep convolutional neural networks (DCNNs) to solve semantic segmentation tasks. It comes in four versions: v1, v2, v3, and v3+.

To reduce the information loss caused by pooling, DeepLab-v1 adopts atrous convolution (also called dilated or "hole" convolution), which enlarges the receptive field without increasing the number of parameters and so preserves more information.

DeepLab-v1 model process demonstration
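The claim that this kind of convolution widens the receptive field without adding parameters can be checked on a toy 1-D signal. The sketch below is not DeepLab code; `atrous_conv1d` and `receptive_field` are illustrative helpers, and the same three weights are applied once with adjacent taps (rate 1) and once with taps two samples apart (rate 2).

```python
def atrous_conv1d(signal, weights, rate):
    """1-D convolution whose taps are spaced `rate` samples apart (no padding)."""
    span = (len(weights) - 1) * rate + 1  # receptive field of one output value
    return [
        sum(w * signal[start + i * rate] for i, w in enumerate(weights))
        for start in range(len(signal) - span + 1)
    ]

def receptive_field(kernel_size, rate):
    """Input samples covered by one output of a single atrous layer."""
    return (kernel_size - 1) * rate + 1

signal = list(range(10))   # toy input: 0, 1, ..., 9
weights = [1, 1, 1]        # the same three parameters in both cases

dense = atrous_conv1d(signal, weights, rate=1)
dilated = atrous_conv1d(signal, weights, rate=2)
print(receptive_field(3, 1), dense)
print(receptive_field(3, 2), dilated)
```

With rate 2, each output value covers five input samples instead of three, while the kernel still has only three weights.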


Building on v1, DeepLab-v2 adds parallel multi-scale branches (atrous spatial pyramid pooling, ASPP), which addresses the problem of segmenting objects of different sizes at the same time.
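The multi-scale parallel idea can be sketched as several atrous convolutions with different rates applied to the same input, with their outputs stacked per position. This is a simplified 1-D illustration and an assumption about how to show the principle: a real ASPP module learns separate 2-D kernels for each branch and fuses them afterwards, and the helper names below are hypothetical.

```python
def atrous_same(signal, weights, rate):
    """Zero-padded 1-D atrous convolution that keeps the input length."""
    half = (len(weights) // 2) * rate
    padded = [0] * half + list(signal) + [0] * half
    return [
        sum(w * padded[pos + i * rate] for i, w in enumerate(weights))
        for pos in range(len(signal))
    ]

def parallel_branches(signal, weights, rates):
    """Apply the same kernel at several rates and stack the results per position."""
    branches = [atrous_same(signal, weights, r) for r in rates]
    return [list(features) for features in zip(*branches)]

signal = [0, 0, 0, 1, 0, 0, 0]   # a single spike in the middle
features = parallel_branches(signal, [1, 1, 1], rates=(1, 2, 3))
print(features)
```

Each position ends up with one feature per rate, so nearby and distant context are both available to later layers, which is what helps with objects of different sizes.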

DeepLab-v3 applies atrous convolution in cascaded modules and improves the ASPP module.

DeepLab-v3+ uses the ASPP module in an encoder-decoder structure to recover fine object edges and refine the segmentation results.

Model training preparation

Purpose: Based on DeepLab-v3+, develop a 7-class model for land classification

Data: 304 remote sensing images of an area from Google Earth. In addition to the original images, the data includes professionally annotated, matching 7-category maps, 7-category masks, 25-category maps, and 25-category masks. The image size is 560×560 pixels, and the spatial resolution is 1.2 m.
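These figures give a sense of how much ground the data set covers. The arithmetic below assumes that the 1.2 m figure is the ground distance per pixel and that the 304 images do not overlap, so the total is an upper bound rather than a measured area.

```python
IMAGE_SIDE_PX = 560        # image width and height in pixels (from the article)
METERS_PER_PIXEL = 1.2     # ground distance per pixel (assumed meaning of the 1.2 m figure)
NUM_IMAGES = 304           # number of images (from the article)

side_m = IMAGE_SIDE_PX * METERS_PER_PIXEL
area_km2 = side_m ** 2 / 1_000_000
total_km2 = area_km2 * NUM_IMAGES   # upper bound: assumes no overlap between images
print(f"{side_m:.0f} m per side, {area_km2:.3f} km2 per image, {total_km2:.1f} km2 in total")
```

Each image covers a square of roughly 672 m per side, so the whole data set covers at most about 137 square kilometers, which is small compared with large public remote sensing data sets.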

Example of an original image stitched with its 7-category map: the original image is on top and the 7-category map is below


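The training setup code is as follows:

import torch.optim as optim
from torch.nn import CrossEntropyLoss
from torch.optim import lr_scheduler
# DeepLabV3Plus is defined in the tutorial's own code (it is not part of PyTorch)
net = DeepLabV3Plus(backbone='xception')
criterion = CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.05, momentum=0.9, weight_decay=0.00001)
# Polynomial ("poly") decay of the learning rate over 400,000 scheduler steps
lr_fc = lambda iteration: (1 - iteration / 400000) ** 0.9
exp_lr_scheduler = lr_scheduler.LambdaLR(optimizer, lr_fc, -1)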
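LambdaLR multiplies the initial learning rate by the value returned by lr_fc, so the schedule above can be worked out by hand. The sketch below reproduces that arithmetic in plain Python, without PyTorch. The constant names are made up for the illustration, and it assumes the scheduler is stepped once per training iteration, which the article does not state.

```python
BASE_LR = 0.05            # initial learning rate from the optimizer above
MAX_ITERATIONS = 400_000  # decay horizon used in lr_fc
POWER = 0.9

def poly_lr(iteration):
    """Learning rate after `iteration` scheduler steps under poly decay."""
    return BASE_LR * (1 - iteration / MAX_ITERATIONS) ** POWER

for it in (0, 100_000, 200_000, 300_000, 400_000):
    print(f"iteration {it:>7}: lr = {poly_lr(it):.5f}")
```

The rate falls smoothly from 0.05 to 0 at step 400,000. Beyond that point the base of the power becomes negative and the expression no longer yields a real number, so the total number of scheduler steps should stay within the horizon.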

Training details

Compute: NVIDIA T4

Training framework: PyTorch 1.2

Number of iterations: 600 epochs

Training duration: about 50 hours

IoU: 0.8285 (training data)

AC: 0.7838 (training data)
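The article does not say how IoU and AC were computed. One common way, sketched below as an assumption, is to build a confusion matrix from the label mask and the predicted mask, then take the mean intersection over union across classes and the overall pixel accuracy. The toy masks and function names are hypothetical, and AC may refer to a different accuracy measure in the tutorial.

```python
def confusion_matrix(truth, pred, num_classes):
    """Count pixels by (true class, predicted class)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix

def pixel_accuracy(matrix):
    """Share of pixels whose predicted class matches the label."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    return correct / sum(map(sum, matrix))

def mean_iou(matrix):
    """Average of intersection / union over the classes that occur."""
    ious = []
    for c in range(len(matrix)):
        intersection = matrix[c][c]
        union = sum(matrix[c]) + sum(row[c] for row in matrix) - intersection
        if union:
            ious.append(intersection / union)
    return sum(ious) / len(ious)

# Hypothetical flattened label mask and prediction for 8 pixels, 3 classes.
truth = [0, 0, 1, 1, 1, 2, 2, 2]
pred = [0, 1, 1, 1, 2, 2, 2, 0]
cm = confusion_matrix(truth, pred, num_classes=3)
print(pixel_accuracy(cm), mean_iou(cm))
```

Mean IoU penalizes both missed and spurious pixels of each class, so it is usually the stricter of the two numbers.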

Data set link:

Direct link to detailed training process:

Details of the public tutorial on seven-class land classification of remote sensing images


 Tutorial use 

The demo notebook in the tutorial is predict.ipynb. Running it installs the environment and shows the recognition results of the existing model.

 Project path 

- Test image path:


- Mask image path:


- Prediction image path:


- Training data list: train.csv

- Test data list: test.csv

 Instructions for use 

To train the model, enter semantic_pytorch; the trained model is saved in model/

The model uses DeepLabV3plus. In the training parameters, the loss is cross entropy (CrossEntropyLoss in the code above), the number of epochs is 600, and the initial learning rate is 0.05.

 Training instructions:


If you use the model we have already trained, use saved in the model folder and call it directly in

 Prediction instructions:


Tutorial address:

Model author

Question 1: In order to develop this model, what channels and information did you use?

Wang Yanxin: Mainly through technical communities, GitHub, and similar channels. I read DeepLab-v3+ papers and related project cases and learned in advance about the pitfalls and how to overcome them. This was fairly thorough preparation and made it possible to look things up and solve problems at any time during the later model development.

Question 2: What obstacles did you encounter in the process? How did you overcome them?

Wang Yanxin: There was not enough data, which led to mediocre IoU and AC. Next time, I could try a public remote sensing data set with richer data.

Question 3: What other directions do you want to try regarding remote sensing?

Wang Yanxin: This time I classified land. Next, I would like to combine machine learning and remote sensing to analyze ocean landscapes and ocean elements, or try acoustic techniques to identify and assess seabed topography.

The amount of data used in this training is small, and the IoU and AC on the training set are average. You can also try existing public remote sensing data sets for model training. In general, the more thorough the training and the richer the training data, the better the model performs.

In the next article of this series, we compile 11 mainstream public remote sensing data sets and categorize them. You can pick one and train a more complete model based on the training approach described in this article.


