Segment Anything for high-resolution remote sensing image segmentation: a comparison of results

1. Introduction to the SAM model

  1. The Segment Anything Model (SAM) was released by Meta in early April 2023 and presented as the first foundation model for image segmentation. It is built from three interrelated elements: Task, Model and Data. The composition of the Task is shown in the figure below.
    [Figure: composition of the SAM Task]
    Given an image and a segmentation prompt, the model generates a mask. The prompt can be marked points, a regular or irregular frame boundary (box), or text: if you enter "Cat", the model is meant to identify the cat and generate its mask. Text prompts are described in the SAM paper; the released segment_anything code accepts points, boxes and masks.
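To make the idea of a point prompt concrete, here is a minimal sketch. It assumes `predictor` is a `segment_anything.SamPredictor` (or any object with the same `set_image` and `predict` methods); the helper name `segment_at_point` is made up for this illustration and is not part of the library.

```python
import numpy as np

def segment_at_point(predictor, image, x, y):
    """Return the highest-scoring mask for one foreground click at (x, y).

    `predictor` is assumed to behave like segment_anything.SamPredictor;
    `image` is an H x W x 3 uint8 RGB array.
    """
    predictor.set_image(image)  # computes the image embedding once
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[x, y]]),  # (x, y) pixel coordinates
        point_labels=np.array([1]),       # 1 = foreground, 0 = background
        multimask_output=True,            # ask for several candidate masks
    )
    best = int(np.argmax(scores))         # keep the best-scoring candidate
    return masks[best], float(scores[best])
```

With the model loaded as in the set-up section, this could be called as `segment_at_point(SamPredictor(sam), image, 500, 375)`, where the click coordinates are arbitrary example values.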

The Model takes two inputs, the prompt and the image. Its architecture is shown in the figure below:
[Figure: SAM model architecture]

Finally, there is Data, which is used to train the model. During training, the data is annotated and fed back to improve the model. SAM's description of Data is as follows:
[Figure: SAM's description of Data]

2. Model use

  1. Set-up
import numpy as np
import torch
import matplotlib.pyplot as plt
import cv2

# Draw the masks as a semi-transparent colored overlay on the current axes
def show_anns(anns):
    if len(anns) == 0:
        return
    sorted_anns = sorted(anns, key=(lambda x: x['area']), reverse=True)
    ax = plt.gca()
    ax.set_autoscale_on(False)

    img = np.ones((sorted_anns[0]['segmentation'].shape[0], sorted_anns[0]['segmentation'].shape[1], 4))
    img[:,:,3] = 0
    for ann in sorted_anns:
        m = ann['segmentation']
        color_mask = np.concatenate([np.random.random(3), [0.65]])
        img[m] = color_mask
    ax.imshow(img)
  1. Example image: import the picture
image = cv2.imread('images/lzu.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
  1. Automatic mask generation
import sys
sys.path.append("..")
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator, SamPredictor

sam_checkpoint = "sam_vit_b_01ec64.pth" # model checkpoint file
model_type = "vit_b"

device = "cuda" # run on the GPU

sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)
sam.to(device=device)

mask_generator = SamAutomaticMaskGenerator(sam)

Note: SamAutomaticMaskGenerator has several adjustable parameters. They control the sampling density, remove low-quality and duplicate masks, run the generator again on image crops to improve performance on smaller objects, and post-process the masks to remove stray pixels and holes. The parameters can be set as follows:

# SamAutomaticMaskGenerator with custom parameters
mask_generator_2 = SamAutomaticMaskGenerator(
    model=sam,
    points_per_side=32,
    pred_iou_thresh=0.86,
    stability_score_thresh=0.92,
    crop_n_layers=1,
    crop_n_points_downscale_factor=2,
    min_mask_region_area=100,  # Requires open-cv to run post-processing
)
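To see what the sampling-density parameter means in practice, the sketch below builds an evenly spaced grid of prompt points in normalized coordinates. It mirrors the idea behind `points_per_side` but is an illustration written for this article, not the library's own function; the name `point_grid` is hypothetical.

```python
import numpy as np

def point_grid(points_per_side):
    """Evenly spaced prompt points in normalized [0, 1] x [0, 1] coordinates."""
    offset = 1.0 / (2 * points_per_side)  # keeps points off the image border
    side = np.linspace(offset, 1.0 - offset, points_per_side)
    xs, ys = np.meshgrid(side, side)
    return np.stack([xs.ravel(), ys.ravel()], axis=1)  # shape (n * n, 2)

grid = point_grid(32)
print(grid.shape)  # (1024, 2)
```

So `points_per_side=32` corresponds to 1024 point prompts on the full image. According to the library's docstring, the number of points per side in crop layer n is scaled down by `crop_n_points_downscale_factor**n`, which gives 16 per side for the single crop layer configured above.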
  1. Call the generate method to generate the masks. To use the custom-parameter generator mask_generator_2 instead, replace the call with masks = mask_generator_2.generate(image).
masks = mask_generator.generate(image)
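The generator returns a list of dictionaries, one per mask, with keys such as 'segmentation' (a boolean H x W array), 'area', 'bbox', 'predicted_iou' and 'stability_score'. For remote sensing work it is often handier to have a single label raster. The following is a minimal sketch of that conversion; the helper name `masks_to_label_map` is an assumption made for this example.

```python
import numpy as np

def masks_to_label_map(anns):
    """Merge SAM-style mask records into one integer label raster.

    Each record is assumed to hold a boolean 'segmentation' array and an
    'area' value. Label 0 means "no mask". Larger masks are written first
    so that smaller ones stay visible on top, as in show_anns.
    """
    if len(anns) == 0:
        raise ValueError("no masks to merge")
    ordered = sorted(anns, key=lambda a: a["area"], reverse=True)
    labels = np.zeros(ordered[0]["segmentation"].shape, dtype=np.int32)
    for i, ann in enumerate(ordered, start=1):
        labels[ann["segmentation"]] = i  # later (smaller) masks overwrite
    return labels
```

Calling `masks_to_label_map(masks)` gives an array that can be saved or vectorized with the usual GIS tools.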
  1. Show image: display the picture with the segmentation masks overlaid
plt.figure(figsize=(20,20))
plt.imshow(image)
show_anns(masks)
plt.axis('off')
plt.show() 

3. Comparison of results

1. Segmentation of a complex urban image: the goal is building extraction

Fig1. Segmentation results with default parameters (mask_generator)
Fig2. Segmentation results with custom parameters (mask_generator_2)
[Figure]

2. Segmentation of high-resolution remote sensing images with simple categories: the goal is water extraction

Fig1. Segmentation results with default parameters (mask_generator)
[Figure]

Fig2. Segmentation results with custom parameters (mask_generator_2)
[Figure]

4. Conclusion

It can be seen that:
(1) For complex urban images, the default parameters may miss some buildings, while the custom parameters identify them more completely;
(2) For images with simple content, the default and custom parameters give much the same results;
(3) Comparing Example 1 and Example 2, the model segments complex images (such as cities) less accurately than simple images;
(4) In short, this model can be used for automatic segmentation, with the results then processed manually. This improves the efficiency of image segmentation and recognition, and it can also improve accuracy compared with manual digitization;
(5) The images in the official SAM examples are also relatively complex, yet our segmentation results are not as good as theirs. The difference is that the official examples are all ordinary photographs, while here we use high-resolution remote sensing images. Combining the results of Example 1 and Example 2, we guess that the gap is related to pixel-level differences between the two kinds of image. A later experiment could compare the segmentation of remote sensing images directly with that of ordinary photographs.
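The comparison above is visual. One rough way to put a number on how completely each generator covers a scene is the fraction of pixels that fall inside at least one mask. The sketch below assumes the mask records have the same structure as the generator output; the helper name `coverage` is hypothetical, and the figure it returns is not an accuracy measure, which would need ground-truth labels.

```python
import numpy as np

def coverage(anns, image_shape):
    """Fraction of image pixels that lie inside at least one mask."""
    covered = np.zeros(image_shape[:2], dtype=bool)
    for ann in anns:
        covered |= ann["segmentation"]  # union of all boolean masks
    return float(covered.mean())
```

For example, `coverage(mask_generator.generate(image), image.shape)` can be compared with the same call on `mask_generator_2` to check whether the custom parameters leave fewer pixels unsegmented.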

Origin blog.csdn.net/qq_29517595/article/details/131188625