2022 Remote Sensing Image Change Detection: Papers with Code

1. Remote Sensing Change Detection using Denoising Diffusion Probabilistic Models
paper code 2022-6

Motivation: Annotated training images for CD models are scarce, so attention should be paid to mining as much information as possible from the millions of freely available, unlabeled, and uncurated remote sensing images to improve CD accuracy and robustness.
Current pre-training methods either require aerial scene classification datasets for supervised pre-training or paired multi-temporal images for self-supervised pre-training, which limits their ability to utilize information from millions of readily available unlabeled remote sensing images.

Introduction: A novel approach is proposed to incorporate millions of off-the-shelf, unlabeled remote sensing images acquired through different Earth observation programs into the training process by denoising diffusion probabilistic models.
Diffusion models are trained on millions of off-the-shelf remote sensing images to learn the key semantics of aerial images, and then multi-scale features from the Diffusion Decoder are used as input to train a CD classifier with limited availability of pixel-level labels.
Since the diffusion model learns strong semantics of the input image, it can also extract hierarchical feature representations from it. The article uses the multi-scale semantics (i.e., deep feature representations) of the diffusion model as input, followed by a channel-spatial attention module and a convolutional classifier to obtain the final change prediction map.

Unlike other deep learning architectures, augmented versions of the feature representations can also be obtained by varying the amount of noise added to the input, which helps train a robust, well-generalizing CD model under limited pixel-level change labels.
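
The overall data flow can be sketched as follows. This is a minimal, self-contained illustration of the idea only (frozen feature extractor plus a small trainable change head, with the noise level varied as augmentation); `add_noise`, `FrozenBackbone`, and `ChangeClassifier` are simplified stand-ins for the authors' pre-trained diffusion decoder, not their actual implementation:

```python
import torch
import torch.nn as nn

def add_noise(x: torch.Tensor, t: float) -> torch.Tensor:
    """Simplified forward-diffusion step: blend the image with Gaussian noise."""
    return (1.0 - t) ** 0.5 * x + t ** 0.5 * torch.randn_like(x)

class FrozenBackbone(nn.Module):
    """Stand-in for the frozen, pre-trained diffusion decoder."""
    def __init__(self, c_in: int = 3, c_feat: int = 32):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_feat, 3, padding=1)
        for p in self.parameters():
            p.requires_grad = False  # only the CD head is trained

    def forward(self, x):
        return torch.relu(self.conv(x))

class ChangeClassifier(nn.Module):
    """Trainable head on concatenated bi-temporal features."""
    def __init__(self, c_feat: int = 32, n_classes: int = 2):
        super().__init__()
        self.head = nn.Conv2d(2 * c_feat, n_classes, 1)

    def forward(self, f_a, f_b):
        return self.head(torch.cat([f_a, f_b], dim=1))

backbone, head = FrozenBackbone(), ChangeClassifier()
img_a, img_b = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
# Different noise levels t yield augmented feature views of the same pair.
for t in (0.1, 0.3):
    logits = head(backbone(add_noise(img_a, t)),
                  backbone(add_noise(img_b, t)))  # (1, 2, 64, 64) change logits
```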

2. IDET: Iterative Difference-Enhanced Transformers for High-Quality Change Detection
paper code 2022-7

Motivation: Most existing works focus on designing advanced network architectures to map feature differences to the final change map, while ignoring the impact of the quality of the feature differences themselves.

Introduction: This paper studies CD from a different perspective: how to optimize feature differences so as to highlight changes and suppress invariant regions.

IDET designs three transformers: two extract long-range information from the two images, and the third uses their outputs to iteratively guide the enhancement of the feature differences (sketched below).
To achieve more effective refinement, a multi-scale IDET-based change detection method is further proposed, using a UNet to extract multi-scale convolutional features for multiple feature-difference refinements, and a coarse-to-fine fusion strategy to combine all the refined feature differences.
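
The iterative guidance step can be sketched roughly as follows, with a single attention layer per role and token sequences standing in for feature maps (dimensions, iteration count, and module layout are illustrative, not the paper's exact configuration):

```python
import torch
import torch.nn as nn

class DifferenceRefiner(nn.Module):
    """Iteratively refine the feature difference, guided by both images' context."""
    def __init__(self, dim: int = 32, iters: int = 3):
        super().__init__()
        self.enc_a = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.enc_b = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.refine = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.iters = iters

    def forward(self, f_a, f_b):
        # f_a, f_b: (B, N, C) token sequences from the two images
        ctx = torch.cat([self.enc_a(f_a), self.enc_b(f_b)], dim=1)  # long-range context
        diff = f_a - f_b                          # initial feature difference
        for _ in range(self.iters):
            upd, _ = self.refine(diff, ctx, ctx)  # difference queries both contexts
            diff = diff + upd                     # residual refinement
        return diff

tokens_a = torch.rand(1, 64, 32)  # e.g. an 8x8 feature map flattened to 64 tokens
tokens_b = torch.rand(1, 64, 32)
refined = DifferenceRefiner()(tokens_a, tokens_b)  # (1, 64, 32)
```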

3. Transition Is a Process: Pair-to-Video Change Detection Networks for Very High Resolution Remote Sensing Images
paper code IEEE Transactions on Image Processing 2022-7

Motivation: 1) temporal modeling in existing methods is incomplete; 2) spatio-temporal coupling makes it difficult for the network to focus on one aspect at a time.

Introduction: We propose a more explicit and complete approach to temporal modeling, and accordingly build a pair-to-video change detection framework (P2V-CD).
Inspired by the fact that video recognition operates on sequences of frames and that optical flow acts in both time and space, we interpret CD as a video understanding problem. From the input image pair, a pseudo-transition video carrying rich temporal information is constructed as the input to the temporal encoder.
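
One plausible reading of the pseudo-transition video is a sequence of frames interpolated between the two images; the sketch below illustrates that reading only, as the released code may construct the frames differently:

```python
import torch

def pseudo_video(img_a: torch.Tensor, img_b: torch.Tensor, n_frames: int = 8):
    """Build a (B, T, C, H, W) clip that gradually morphs img_a into img_b."""
    alphas = torch.linspace(0.0, 1.0, n_frames, device=img_a.device)
    frames = [(1 - a) * img_a + a * img_b for a in alphas]
    return torch.stack(frames, dim=1)

img_a = torch.rand(2, 3, 256, 256)  # pre-change image
img_b = torch.rand(2, 3, 256, 256)  # post-change image
video = pseudo_video(img_a, img_b)  # (2, 8, 3, 256, 256) temporal-encoder input
```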

In terms of space, the concatenation of the bi-temporal images is used as input to a spatial encoder built from a sequence of spatial blocks (s-blocks) to capture the spatial context that helps localize changed regions.

In terms of time, a sequence of pseudo-video frames is constructed to obtain a finer-grained view of the temporal data. A temporal encoder consisting of a stem and four temporal blocks (t-blocks) is then assigned to mine temporal information about the changes.

In the backbone, the constructed video is spatially downsampled to reduce spatial information so that the temporal encoder can focus more on the temporal dimension. The output of the fourth t-block is deeply supervised to enhance the discriminative ability of temporal features.

4. TINYCD: A (Not So) Deep Learning Model For Change Detection
paper code 2022-7

Motivation: Existing deep learning-based change detection models are too complex and bulky to be applied to industrial scenarios and edge applications.

Introduction: A new model, TinyCD, is proposed and demonstrated to be both lightweight and effective, capable of achieving comparable or even better performance than the state-of-the-art with 13-150 times fewer parameters.
A PW-MLP (pixel-wise MLP) block acts as the mask generator: it processes the mixed bi-temporal features and produces a per-pixel score used as spatio-temporal attention, yielding a mask tensor M_k ∈ R^(H×W).
In the skip connections, multiplying by the mask M_k reweights each pixel to mitigate misleading information introduced by upsampling.
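
The mask-and-reweight step reads roughly as follows; the PW-MLP is realized with 1×1 convolutions (an MLP applied independently at every pixel), and the channel sizes are illustrative rather than TinyCD's actual configuration:

```python
import torch
import torch.nn as nn

class PWMLPMask(nn.Module):
    """Pixel-wise MLP producing a sigmoid attention mask M_k of shape (B, 1, H, W)."""
    def __init__(self, c_in: int, c_hidden: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(c_in, c_hidden, 1),  # 1x1 conv == per-pixel linear layer
            nn.ReLU(inplace=True),
            nn.Conv2d(c_hidden, 1, 1),
            nn.Sigmoid(),
        )

    def forward(self, mixed: torch.Tensor) -> torch.Tensor:
        return self.mlp(mixed)

mask_gen = PWMLPMask(c_in=64)
mixed_feats = torch.rand(1, 64, 32, 32)  # mixed bi-temporal features
skip = torch.rand(1, 32, 32, 32)         # upsampled skip-connection features
mask = mask_gen(mixed_feats)             # M_k, per-pixel scores in [0, 1]
reweighted = skip * mask                 # reweight each pixel on the skip path
```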

5. A Transformer-Based Siamese Network for Change Detection
IGARSS 2022-7
paper code

Motivation: The latest CD research mainly focuses on increasing the receptive field of CD models, and Transformer networks have relatively larger effective receptive fields than deep ConvNets.

Introduction: This method unifies a hierarchically structured transformer encoder with a multi-layer perceptron (MLP) decoder in a Siamese network architecture to efficiently capture the multi-scale long-range details required for accurate CD.
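The Siamese pattern described here, one weight-shared hierarchical encoder feeding a light decoder that fuses multi-scale feature differences, can be sketched as below. Plain convolution stages stand in for the transformer blocks; only the weight sharing and multi-scale fusion are illustrated, not the paper's exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySiameseCD(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU())   # 1/2
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, 2, 1), nn.ReLU())  # 1/4
        self.proj1 = nn.Conv2d(16, 8, 1)  # per-scale MLP-style projection
        self.proj2 = nn.Conv2d(32, 8, 1)
        self.head = nn.Conv2d(16, n_classes, 1)

    def encode(self, x):
        f1 = self.stage1(x)
        return f1, self.stage2(f1)

    def forward(self, img_a, img_b):
        a1, a2 = self.encode(img_a)  # the same encoder (shared weights)
        b1, b2 = self.encode(img_b)  # processes both images
        d1 = self.proj1(torch.abs(a1 - b1))
        d2 = self.proj2(torch.abs(a2 - b2))
        d2 = F.interpolate(d2, size=d1.shape[2:], mode="bilinear",
                           align_corners=False)
        return self.head(torch.cat([d1, d2], dim=1))

logits = TinySiameseCD()(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
# logits: (1, 2, 32, 32) fused multi-scale change prediction
```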

6. Fully Transformer Network for Change Detection of Remote Sensing Images
ACCV 2022-12 paper code

Motivation: Due to the limited representation ability of the extracted visual features, current methods usually obtain only incomplete CD regions and irregular CD boundaries. Existing methods also fail to fully exploit the ability of transformers in multi-level feature learning.
1) As the resolution of remote sensing images increases, the rich semantic information contained in high-resolution images is not fully utilized.
2) Boundary information is often missing in complex remote sensing images.
3) The temporal information contained in bi-temporal remote sensing images has not been fully utilized.

Introduction: This paper proposes a novel CD learning framework for remote sensing images, the Fully Transformer Network (FTN), which improves feature extraction from a global perspective and combines multi-level visual features in a pyramidal manner.
The framework consists of three key parts, namely Siamese Feature Extraction (SFE), Deep Feature Enhancement (DFE), and Progressive Change Prediction (PCP).
1) SFE takes the bi-temporal remote sensing images as input and first extracts multi-level visual features through two weight-sharing Swin Transformers.
At each stage of the original Swin Transformer, the feature resolution is halved while the channel dimension is doubled. More specifically, the feature resolution is reduced from (H/4) × (W/4) to (H/32) × (W/32), and the channel size is increased from C to 8C. In addition, in order to reduce the amount of computation, we uniformly reduce the channel dimension to C.
High-level features capture more global semantic information, while low-level features retain more local detail information. They both help to detect regions of change.
It can help learn more discriminative global-level features and obtain complete CD regions.
To exploit global-level information, we introduce an additional Swin Transformer block to widen the receptive field of feature maps.
2) DFE utilizes the multi-level visual features to generate summation features and difference features, using temporal information to highlight the changed regions (see the sketch after this list).
3) PCP integrates the above features through a pyramid structure grafted with a Progressive Attention Module (PAM), improving the feature representation ability via channel attention for the final CD prediction.
4) To better train the framework, we leverage deep supervised learning with multiple boundary-aware loss functions.
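
The feature algebra behind DFE, a summation stream for shared semantics plus a difference stream for temporal change, can be sketched as follows. Each stream is followed by a small convolution here; FTN's actual DFE blocks are richer:

```python
import torch
import torch.nn as nn

class DFEBlock(nn.Module):
    """Per-level deep feature enhancement: summation and difference streams."""
    def __init__(self, c: int):
        super().__init__()
        self.sum_conv = nn.Sequential(nn.Conv2d(c, c, 3, padding=1), nn.ReLU())
        self.diff_conv = nn.Sequential(nn.Conv2d(c, c, 3, padding=1), nn.ReLU())

    def forward(self, f_a, f_b):
        f_sum = self.sum_conv(f_a + f_b)               # shared scene semantics
        f_diff = self.diff_conv(torch.abs(f_a - f_b))  # change-highlighting stream
        return f_sum, f_diff

f_a = torch.rand(1, 96, 64, 64)  # a Swin feature level of image A (C = 96)
f_b = torch.rand(1, 96, 64, 64)  # the same level of image B
f_sum, f_diff = DFEBlock(96)(f_a, f_b)
```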

7. SARAS-Net: Scale and Relation Aware Siamese Network for Change Detection
2022-12 paper code

Motivation: Existing deep learning methods ignore spatial information and scale changes between objects, resulting in blurred or wrong boundaries and lower performance. Besides, the interaction information between the two input images is ignored.

Introduction: We propose a Siamese network that operates on two input images before and after feature subtraction to detect regions of change and achieve state-of-the-art performance on remote sensing datasets.
The network performs two operations before and after feature subtraction, using a relation-aware module before subtraction, and a scale-aware module and a cross-transformer module after subtraction.
1) A ResNet backbone extracts N levels of features from each image separately.
2) The relation-aware module performs cross-attention and cross self-attention operations on the features extracted at each level, aiming to enhance the interaction between the feature maps of the two input images and to improve the features' ability to discriminate changes (cross-attention is sketched after this list).
3) Feature subtraction.
4) After feature subtraction, a scale-aware attention module computes cross-scale attention on the subtraction map to handle scene changes caused by objects of multiple sizes.
The scale-aware module computes attention not only on feature maps of the same scale but also across other scales, addressing scale variation in change detection.
5) Finally, the cross-transformer module incorporates multi-level features, aiming to pay more attention to spatial information and easily separate foreground and background, thereby reducing false positives.
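
The cross-attention in the relation-aware module can be sketched as below: each image's features query the other's before subtraction, so the difference is computed on interaction-enhanced features. The single attention layer and flattened token shapes are illustrative; SARAS-Net applies this at every backbone level:

```python
import torch
import torch.nn as nn

class CrossRelation(nn.Module):
    """Mutual cross-attention between the two images' feature tokens."""
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn_ab = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_ba = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, f_a, f_b):
        # f_a, f_b: (B, N, C) flattened feature maps
        a2b, _ = self.attn_ab(f_a, f_b, f_b)  # A queries B
        b2a, _ = self.attn_ba(f_b, f_a, f_a)  # B queries A
        return f_a + a2b, f_b + b2a           # residual interaction-enhanced features

f_a = torch.rand(1, 256, 64)  # a 16x16 feature map flattened to 256 tokens
f_b = torch.rand(1, 256, 64)
g_a, g_b = CrossRelation()(f_a, f_b)
diff = g_a - g_b              # feature subtraction follows the relation module
```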

Origin: blog.csdn.net/qq_40994007/article/details/129107465