High-Resolution Land Cover Mapping Using Deep Learning in ArcGIS Pro

This article, by Amin Tayyebi, explains in detail how deep learning can be applied to high-resolution land cover mapping, walking through the workflow from data preparation to training a U-Net model and deploying it.

Applying a deep learning model built with Keras and ArcGIS to produce a high-resolution land cover map of Alabama.

01 Land Cover Mapping

Global land cover maps are used in a wide variety of applications, including ecosystem services, climate change, hydrological processes, and policy development at local and regional scales. Although various agencies (e.g., USGS, USDA, NASA) have developed land cover maps covering Europe and the United States, these maps have low spatial (e.g., 30 m) and temporal (e.g., every 5 years) resolution, and high-resolution (e.g., 1 m) spatiotemporal land cover maps produced in near real time are still lacking at regional scales. The land change science community has been pursuing this goal since the early 2000s without widespread success.

In this blog post, I'll introduce the model we developed to create pixel-level land cover maps at 1-meter resolution from National Agriculture Imagery Program (NAIP) imagery (Figure 1). The trained model classifies NAIP images into six land cover categories: 1) buildings, 2) roads or parking lots, 3) water, 4) harvested, open or bare land, 5) forest, and 6) planted or dark cropland.

Figure 1. Original NAIP image and classified image

In addition to the land cover classification methods you can find in other blogs, I also want to show how to leverage the ArcGIS API for Python and ArcGIS Pro and integrate them with deep learning tools such as Keras. This lets you prepare geospatial data (raster or vector) faster by taking advantage of the geoprocessing tools available through Python in ArcGIS Pro, and visualize the results in ArcGIS Pro (Figure 2).

Figure 2. Integrating ArcGIS Pro, Python API, and deep learning

02 Image Segmentation

Image segmentation is one of the key problems in computer vision: it divides an image into multiple segments. In other words, it is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics. Image segmentation is well suited to land cover classification because, within each land cover class, pixels have similar characteristics across multiple bands. Its importance has been demonstrated in applications such as self-driving cars, human-computer interaction, and virtual reality.

03 Data Sources

Our study area is part of the state of Alabama. Using ArcGIS Online [Ref 2], we collected 12 NAIP images [Ref 1]: 8 in the north and 4 in the south of the study area. NAIP imagery is acquired at a ground sampling distance of 1 m with corresponding horizontal accuracy, and it has four spectral bands: natural color (red, green, and blue, or RGB) plus near-infrared.

First, I created an empty mosaic dataset in a geodatabase using the Create Mosaic Dataset tool in ArcGIS Pro [Ref 3]. Second, I added the NAIP imagery to the empty mosaic dataset using the Add Rasters to Mosaic Dataset tool in ArcGIS Pro [Ref 4]. This tool creates two feature classes, a boundary layer showing the extent of the study area and a footprint layer showing the extent of each NAIP image, along with an image layer containing the mosaic of NAIP images (Figure 3).

Figure 3. Geoprocessing tools, NAIP imagery, and study area
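
As a minimal sketch of this step, the two geoprocessing tools above can also be run from Python with arcpy. The geodatabase path, tile folder, and spatial reference below are placeholders, not the ones used in the article.

```python
# Hypothetical sketch: building the NAIP mosaic with arcpy geoprocessing tools.
import arcpy

gdb = r"C:\projects\landcover\landcover.gdb"        # assumed file geodatabase
naip_folder = r"C:\projects\landcover\naip_tiles"   # assumed folder holding the 12 NAIP tiles
sr = arcpy.SpatialReference(26916)                  # assumed UTM zone covering Alabama

# Create an empty mosaic dataset (4 bands: R, G, B, NIR; 8-bit NAIP pixels).
arcpy.management.CreateMosaicDataset(
    in_workspace=gdb,
    in_mosaicdataset_name="naip_mosaic",
    coordinate_system=sr,
    num_bands=4,
    pixel_type="8_BIT_UNSIGNED",
)

# Add the NAIP tiles; this also builds the boundary and footprint feature classes.
arcpy.management.AddRastersToMosaicDataset(
    in_mosaic_dataset=f"{gdb}\\naip_mosaic",
    raster_type="Raster Dataset",
    input_path=naip_folder,
)
```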

04 Data Preparation with ArcGIS Pro and the ArcGIS API for Python

As with other deep learning models, the training run requires labeled data for each land cover class. I annotated the six land cover classes on the NAIP imagery using the Training Samples Manager in ArcGIS Pro [Ref 5]. Labels were collected randomly across the study area (Figure 4).

Figure 4. Training sample manager and sampling labels

The input and output of the image segmentation model must be in raster format for the training run. Since the label data is in feature class (vector) format, I used the Feature to Raster tool in ArcGIS Pro [Ref 6] to convert it to raster format.

Since labeling an entire NAIP image is time consuming, there are unlabeled regions in the NAIP imagery. ArcGIS Pro assigns NoData to the areas I did not label. I used the Reclassify tool in ArcGIS Pro [Ref 7] to convert NoData to zero and keep the values of the other land cover classes the same. You can think of regions with a value of 0 as a background class that should have no effect on the training run. I explain later how I minimize the influence of the background class in the loss function.
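
A minimal sketch of these two conversions with arcpy follows. The layer names, the class field, and the cell size are assumptions for illustration, not the article's values.

```python
# Hypothetical sketch: rasterizing the labels and recoding NoData to 0 (background).
import arcpy
from arcpy.sa import Reclassify, RemapValue

arcpy.CheckOutExtension("Spatial")

# Convert the labeled polygons (training samples) to a 1 m raster of class codes 1-6.
arcpy.conversion.FeatureToRaster(
    in_features="training_samples",   # assumed feature class from the Training Samples Manager
    field="classvalue",               # assumed field holding the land cover code
    out_raster="labels_raw",
    cell_size=1,
)

# Keep classes 1-6 as-is and send NoData (unlabeled areas) to 0, the background class.
remap = RemapValue([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 6], ["NODATA", 0]])
labels = Reclassify("labels_raw", "Value", remap, "DATA")
labels.save("labels_reclass")
```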

Since each NAIP image covers a large area (~7.5 km × ~6.5 km) and cannot be fed directly into the model, I converted the NAIP images and the corresponding label rasters into smaller image chips. To do this, I used Export Training Data For Deep Learning in ArcGIS Pro [Ref 8], which takes the NAIP mosaic raster (the model input) and the corresponding label raster and cuts them into chips. The tool lets you choose the chip size and the stride along the X and Y axes; I chose a chip size of 256 and a stride of 64 in both directions. It only exports chips that contain both NAIP and label data (Figure 5; Python #1). I exported both images and labels in TIFF format. The total number of chips per NAIP image depends on how much of that image was labeled.

Figure 5. Exporting training data for deep learning
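
The chip size (256) and stride (64) below come from the article; the paths and the metadata format are my assumptions in this minimal sketch of the export call.

```python
# Hypothetical sketch of the chip export step with arcpy.
import arcpy

arcpy.ia.ExportTrainingDataForDeepLearning(
    in_raster="naip_mosaic",                  # mosaicked NAIP imagery
    out_folder=r"C:\projects\landcover\chips",
    in_class_data="labels_reclass",           # rasterized, reclassified labels
    image_chip_format="TIFF",
    tile_size_x=256,
    tile_size_y=256,
    stride_x=64,
    stride_y=64,
    metadata_format="Classified_Tiles",       # pixel-level labels for segmentation
)
```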

Chips with more than 50% background class (value 0) were removed from further analysis. Because of the large volume and number of chips in each folder (12 folders corresponding to the 12 NAIP images), I stacked the chips and converted them into a separate HDF5 file for each land cover class (Python #2). This allows me to track the number of chips per land cover class.
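
A minimal sketch of this filtering and stacking step is below. The grouping rule (assigning each chip to its most frequent non-background class), the images/labels folder layout, and the use of the tifffile package are my assumptions, not the article's code.

```python
# Hypothetical sketch: drop mostly-background chips, stack the rest into one HDF5 per class.
import glob
import os

import h5py
import numpy as np
import tifffile

CLASS_NAMES = {1: "building", 2: "road", 3: "water", 4: "harvested", 5: "forest", 6: "planted"}
chips = {name: ([], []) for name in CLASS_NAMES.values()}

chip_dir = r"C:\projects\landcover\chips"
for img_path in sorted(glob.glob(os.path.join(chip_dir, "images", "*.tif"))):
    lbl_path = os.path.join(chip_dir, "labels", os.path.basename(img_path))
    image = tifffile.imread(img_path)   # 256 x 256 x 4 NAIP chip
    label = tifffile.imread(lbl_path)   # 256 x 256 class codes, 0 = background

    # Skip chips that are more than 50% background.
    if (label == 0).mean() > 0.5:
        continue

    # Group the chip by its most frequent non-background class (my assumption).
    values, counts = np.unique(label[label > 0], return_counts=True)
    dominant = CLASS_NAMES[int(values[np.argmax(counts)])]
    chips[dominant][0].append(image)
    chips[dominant][1].append(label)

for name, (images, labels) in chips.items():
    if not images:
        continue
    with h5py.File(os.path.join(chip_dir, f"{name}.h5"), "w") as f:
        f.create_dataset("images", data=np.stack(images))
        f.create_dataset("labels", data=np.stack(labels))
```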

05 Data Augmentation

Deep learning models require large amounts of data for training. From the training data, I generated 420, 438, 702, 1008, 837, and 891 chips for buildings, roads, water, harvested land, forest, and planted land, respectively. The main remedy for a lack of training data is data augmentation. For augmentation, I used only the HDF5 files of the rare land cover classes and doubled or tripled their number of chips.

I used three common data augmentation methods to increase the amount of training data for the rare land cover classes (buildings, roads or parking lots, and water): 1) Shifting: Export Training Data For Deep Learning in ArcGIS Pro [Ref 8] has stride options (the distance to move when creating the next image chip) in the X and Y directions. The exported chips are 256 pixels along X and Y; I set a stride of 64 in both directions to generate more, overlapping chips. This happens in the data processing step, so I did not repeat it here. 2) Rotation: In each data augmentation run, one of four values [-180, -90, 90, 180] is randomly chosen per chip to create a new chip. 3) Scaling: In each data augmentation run, a scaling factor is randomly selected from the range [0.05, 0.45] for each chip to create a new chip (Figure 6; Python #3). I then merged the newly generated data with the existing common land cover classes (harvested or bare land, forest, planted or dark cropland). Finally, I normalized each band of the NAIP imagery and shuffled the training samples.

Figure 6. Data augmentation example
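
A minimal sketch of the rotation and scaling augmentations, applied jointly to a NAIP chip and its label raster, is shown below. How the scaling factor from [0.05, 0.45] is applied (here: zoom in by 1 + factor, then crop back to 256 x 256) is my interpretation, and the function names are placeholders.

```python
# Hypothetical sketch of the per-chip rotation and scaling augmentations.
import numpy as np
from scipy import ndimage


def rotate_pair(image, label, rng):
    """Rotate chip and label by one of the four angles used in the article."""
    angle = rng.choice([-180, -90, 90, 180])
    k = int(angle // 90)   # multiples of 90 degrees -> lossless np.rot90
    return np.rot90(image, k, axes=(0, 1)), np.rot90(label, k, axes=(0, 1))


def scale_pair(image, label, rng, size=256):
    """Zoom into the chip by a random factor and crop back to the original size."""
    factor = 1.0 + rng.uniform(0.05, 0.45)
    zoomed_img = ndimage.zoom(image, (factor, factor, 1), order=1)   # bilinear for imagery
    zoomed_lbl = ndimage.zoom(label, (factor, factor), order=0)      # nearest for labels
    r0 = (zoomed_img.shape[0] - size) // 2
    c0 = (zoomed_img.shape[1] - size) // 2
    return (zoomed_img[r0:r0 + size, c0:c0 + size],
            zoomed_lbl[r0:r0 + size, c0:c0 + size])


rng = np.random.default_rng(0)
# image: 256 x 256 x 4 chip, label: 256 x 256 class codes loaded from the HDF5 files, e.g.:
# aug_img, aug_lbl = scale_pair(*rotate_pair(image, label, rng), rng)
```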

06 Modifying and Training the U-Net Model

The U-Net architecture is an encoder-decoder architecture. U-Net is a fully convolutional network consisting of three parts: 1) a contracting path that acts as an encoder, 2) a symmetric expanding path that acts as a decoder, and 3) skip connections that pass feature maps from the encoder to the decoder (similar to residual neural networks).

Since I did not have a large dataset, I modified U-Net into a new structure with fewer parameters (Figure 7). The new U-Net model has 1,941,351 parameters (Python #3). As discussed, each chip contains cells with value 0, the background class, which is not of interest. I have to generate this class because labeling all cells in an image is usually not feasible. To overcome this during the training run, I wrote a custom loss function that ignores zeros when calculating the loss. It does this by defining a weight for each land cover class, and I set the weight of the background class close to zero. In the training run, I used 90% of the data for training and kept 10% for validation. I used Intersection over Union (IoU) to measure the accuracy of the model on the validation data during the training run. I ran the model for 30 epochs; training stopped at epoch 20 because the validation loss was no longer improving significantly.

Figure 7. Modified structure of U-Net
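
Below is a minimal Keras sketch of the custom pieces described above: a class-weighted categorical cross-entropy that nearly ignores the background class, a simple IoU metric over the non-background channels, and a training call with a 90/10 split and early stopping. The modified U-Net builder (build_unet), the 7-channel one-hot layout (background plus six classes), and the exact weights are placeholders, not the author's code.

```python
# Hypothetical sketch (Keras) of the weighted loss, IoU metric, and training call.
from tensorflow import keras
from tensorflow.keras import backend as K

# Index 0 is the background class; a near-zero weight keeps it out of the loss.
CLASS_WEIGHTS = K.constant([1e-4, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])


def weighted_categorical_crossentropy(y_true, y_pred):
    """Cross-entropy where each one-hot class channel is scaled by its weight."""
    y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
    loss = -K.sum(y_true * K.log(y_pred) * CLASS_WEIGHTS, axis=-1)
    return K.mean(loss)


def mean_iou(y_true, y_pred, smooth=1e-6):
    """Mean intersection-over-union over the non-background channels."""
    y_pred = K.one_hot(K.argmax(y_pred, axis=-1), K.int_shape(y_pred)[-1])
    inter = K.sum(y_true[..., 1:] * y_pred[..., 1:], axis=(1, 2))
    union = K.sum(y_true[..., 1:], axis=(1, 2)) + K.sum(y_pred[..., 1:], axis=(1, 2)) - inter
    return K.mean((inter + smooth) / (union + smooth))


# model = build_unet(input_shape=(256, 256, 4), n_classes=7)   # modified U-Net, ~1.9M params
# model.compile(optimizer="adam", loss=weighted_categorical_crossentropy, metrics=[mean_iou])
# model.fit(x_train, y_train_onehot, validation_split=0.1, epochs=30,
#           callbacks=[keras.callbacks.EarlyStopping(monitor="val_loss", patience=5)])
```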

07 Deploying the Model in ArcGIS Pro and Running Inference on NAIP Images

The U-Net model is saved in HDF5 format. ArcGIS Pro has a convenient way to deploy models and run them at scale [Ref 8]. I used Python raster functions in ArcGIS Pro to deploy the model; raster functions in ArcGIS Pro use parallel processing to run models faster. ArcGIS Pro has two geoprocessing tools that can run deep learning models: Detect Objects Using Deep Learning and Classify Pixels Using Deep Learning. Since I am running a segmentation model, I used Classify Pixels Using Deep Learning. The integration of external deep learning frameworks currently works with any framework, provided you can supply a raster function; out of the box, raster functions are provided for the TensorFlow Object Detection API and a few other frameworks.

After training the model, you use an Esri model definition file (.emd) to run the detection or classification geoprocessing tools in ArcGIS Pro. You also need the appropriate deep learning framework and supporting Python libraries (TensorFlow, CNTK, PyTorch, or Keras) installed in the ArcGIS Pro Python environment; otherwise, you will get an error when adding the .emd file to the tool. An .emd file is a JSON file that describes the trained deep learning model. It contains the model definition parameters required to run the inference tools and should be modified by the data scientist who trained the model.

Figure 8. EMD file structure of U-Net model
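
As a minimal sketch, an .emd file of this kind can be written directly from Python. The field names below follow the documented Esri model definition format as I understand it; the specific values (model file, inference function, band list, class colors) are placeholders, not the article's file shown in Figure 8.

```python
# Hypothetical sketch: writing an .emd (Esri model definition) JSON file from Python.
import json

emd = {
    "Framework": "Keras",
    "InferenceFunction": "UnetClassifier.py",   # assumed custom Python raster function
    "ModelFile": "unet_landcover.h5",           # trained U-Net weights in HDF5
    "ModelType": "ImageClassification",         # pixel classification (segmentation)
    "ImageHeight": 256,
    "ImageWidth": 256,
    "ExtractBands": [0, 1, 2, 3],               # R, G, B, NIR
    "Classes": [
        {"Value": 1, "Name": "Building", "Color": [230, 25, 75]},
        {"Value": 2, "Name": "Road/Parking", "Color": [128, 128, 128]},
        {"Value": 3, "Name": "Water", "Color": [0, 130, 200]},
        {"Value": 4, "Name": "Harvested/Open/Bare", "Color": [255, 225, 25]},
        {"Value": 5, "Name": "Forest", "Color": [60, 180, 75]},
        {"Value": 6, "Name": "Planted/Dark Cropland", "Color": [0, 100, 0]},
    ],
}

with open("unet_landcover.emd", "w") as f:
    json.dump(emd, f, indent=2)
```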

After creating the .emd file (Figure 8), I performed inference on 12 NAIP images (Figure 9).
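
The inference step can also be scripted; the sketch below calls the Classify Pixels Using Deep Learning tool named above, but the raster name, .emd path, and output location are placeholders.

```python
# Hypothetical sketch: running pixel classification with the trained model from arcpy.
import arcpy

arcpy.CheckOutExtension("ImageAnalyst")

classified = arcpy.ia.ClassifyPixelsUsingDeepLearning(
    in_raster="naip_mosaic",
    in_model_definition=r"C:\projects\landcover\unet_landcover.emd",
)
classified.save(r"C:\projects\landcover\landcover.gdb\landcover_unet")
```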

Figure 9. Raw NAIP imagery and classified land cover maps from U-Net

08 Accuracy Evaluation

I set aside one NAIP image that was labeled but not used in the training run and used it for testing. I ran inference on this image and compared the model's output to the rasterized version of its labeled data. The result of this comparison is a contingency table (confusion matrix), commonly used in remote sensing for accuracy assessment. I calculated precision and recall for each land cover class. For this region, the overall accuracy of the U-Net model is about 85%. Not surprisingly, the model performs better on common classes than on rare classes.

Table 1. Accuracy evaluation of U-Net model (Precision and Recall in %)
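
A minimal sketch of this accuracy assessment follows: it builds a confusion matrix between the rasterized reference labels and the predicted raster, ignores unlabeled (0) pixels, and derives per-class precision and recall plus overall accuracy. Array names and the assumption that both inputs are integer class rasters are mine.

```python
# Hypothetical sketch: per-class precision/recall from a confusion matrix.
import numpy as np


def per_class_precision_recall(reference, predicted, n_classes=6):
    """Confusion matrix and precision/recall for classes 1..n_classes (0 = unlabeled)."""
    mask = reference > 0                 # evaluate only pixels with reference labels
    ref = reference[mask].ravel()
    pred = predicted[mask].ravel()

    # Matrix over codes 0..n_classes so stray background predictions are still counted.
    cm = np.zeros((n_classes + 1, n_classes + 1), dtype=np.int64)
    np.add.at(cm, (ref, pred), 1)        # rows = reference, columns = predicted

    diag = np.diag(cm)[1:]
    precision = diag / np.maximum(cm[:, 1:].sum(axis=0), 1)   # correct / predicted as class
    recall = diag / np.maximum(cm[1:, :].sum(axis=1), 1)      # correct / reference pixels
    overall = diag.sum() / cm.sum()
    return cm, precision, recall, overall
```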

09 GeoAI Cookiecutter Data Science Template

Sharing data science projects with other data scientists is always challenging because everyone has their own template for structuring a project. A common format enables data scientists to know what to expect when sharing projects or receiving them from others. Here, I used the Cookiecutter Data Science template, a logical, reasonably standardized, yet flexible project structure for doing and sharing data science work. You can set up the template for your project with a few command lines. Our team (the Esri GeoAI team) implemented a new cookiecutter template for geospatial projects based on the Cookiecutter Data Science template.



Source: blog.csdn.net/hu397313168/article/details/129853090