Facebook's new model, SAM (Segment Anything Model), has the potential to drive positive change in the computer vision industry. This breakthrough model is unlike any image segmentation model that came before it.
Traditionally, separate models are trained for different types of images, such as people or cars, but SAM removes the need for separate models by providing a general segmentation solution. SAM could be for computer vision what GPT was for natural language processing, and it can be applied to a wide range of image segmentation tasks, such as medical or satellite image segmentation.
Facebook has generously released SAM under the permissive Apache 2.0 license, along with the dataset used to train it, which contains over 11 million images and 1.1 billion masks. This open-source initiative is a significant contribution to the computer vision community, and SAM's potential impact on image segmentation is self-evident.
If you are interested in using SAM in your local environment, the GitHub repository documentation provides detailed information on getting started.
- Code repository: https://github.com/facebookresearch/segment-anything
- Blog: https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/
- Paper: https://arxiv.org/abs/2304.02643
KadirNar created a repository that provides a simplified (packaged) version of SAM, called metaseg, to make it easier to use: https://github.com/kadirnar/segment-anything-video
Later, we will run SAM in a local environment using this repository.
However, before diving into the technical details, let's take a look at the online demo provided by Facebook: https://segment-anything.com/demo. There you can upload any image you like or choose one from their dataset. We'll pick a random image for the demo.
Once an image is selected or uploaded, SAM starts working. It takes a few seconds to process the image, but when it's done, you'll see your selected object perfectly segmented.
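Under the hood, the demo highlights the selected object by blending a semi-transparent color over the pixels inside its mask. A minimal sketch of that overlay step (the function name and colors here are illustrative, not the demo's actual code) looks like this:

```python
import numpy as np

def overlay_mask(image, mask, color=(30, 144, 255), alpha=0.6):
    # Blend a semi-transparent color over the masked pixels,
    # similar to how the demo highlights a selected object.
    # image: HxWx3 uint8 array, mask: HxW boolean array.
    out = image.astype(np.float32).copy()
    out[mask] = (1 - alpha) * out[mask] + alpha * np.asarray(color, np.float32)
    return out.astype(np.uint8)

# Toy example: highlight the top-left quadrant of a gray image.
img = np.full((4, 4, 3), 128, np.uint8)
m = np.zeros((4, 4), bool)
m[:2, :2] = True
result = overlay_mask(img, m)
print(result[0, 0])  # tinted pixel
print(result[3, 3])  # untouched pixel
```

In a real pipeline, `mask` would be one of the boolean masks returned by the model rather than a hand-built array.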
Here are the images we use:
Let's see the result of image segmentation:
It segmented our image perfectly. The accuracy of SAM is truly impressive, and it's easy to see why this model is considered revolutionary.
Once you're ready to try it yourself, the GitHub repository documentation provides clear steps on how to use SAM in your local environment.
I recommend using the KadirNar repository to try out SAM. It is also worth using Google Colab, as it avoids possible issues with mismatched library versions.
Let's see how it works! First, we need to install metaseg using pip.
!pip install metaseg -q
Let's take a look at the image we are going to process:
from IPython.display import Image
Image("image.jpg")
output:
Next, we need to import SegAutoMaskGenerator from metaseg, which runs a SAM model (selected via model_type) over the image and saves the automatically generated segmentation masks.
from metaseg import SegAutoMaskGenerator

autoseg_image = SegAutoMaskGenerator().save_image(
    source="image.jpg",
    model_type="vit_l",  # vit_l, vit_h, vit_b
    points_per_side=16,
    points_per_batch=64,
    min_area=0,
)
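The min_area argument presumably discards masks smaller than a given pixel count. Conceptually, that filtering step looks like the sketch below (assuming masks are returned as boolean arrays; the function name is illustrative, and metaseg's internals may differ):

```python
import numpy as np

def filter_small_masks(masks, min_area):
    # Keep only masks covering at least `min_area` pixels;
    # a conceptual stand-in for the min_area argument above.
    return [m for m in masks if int(m.sum()) >= min_area]

masks = [
    np.ones((8, 8), bool),               # 64 foreground pixels
    np.pad(np.ones((2, 2), bool), 3),    # 4 foreground pixels in an 8x8 frame
]
print(len(filter_small_masks(masks, min_area=10)))  # 1
```

With min_area=0 (as in the call above), every detected mask is kept, including tiny speckles.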
The parameter "model_type" determines the type of model we will use.
Facebook offers three different models:
- default or vit_h: ViT-H SAM model
- vit_l: ViT-L SAM model
- vit_b: ViT-B SAM model
Each model has its advantages: vit_h is the largest and generally the most accurate, while vit_b is the smallest and fastest. Their official blog provides a detailed description of each model.
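The points_per_side parameter, meanwhile, controls how densely the automatic generator seeds point prompts across the image. SAM's official code builds an evenly spaced grid of normalized coordinates, roughly like this (a simplified sketch, not the exact library code):

```python
import numpy as np

def build_point_grid(points_per_side):
    # Evenly spaced (x, y) prompt points in normalized [0, 1] coordinates,
    # one row/column per side; 16 per side -> 256 prompts total.
    offset = 1 / (2 * points_per_side)
    coords = np.linspace(offset, 1 - offset, points_per_side)
    xs, ys = np.meshgrid(coords, coords)
    return np.stack([xs.ravel(), ys.ravel()], axis=-1)

grid = build_point_grid(16)
print(grid.shape)  # (256, 2)
```

A larger points_per_side finds more small objects but costs quadratically more prompts; points_per_batch then controls how many of these prompts the model processes at once.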
Once the above code is run, the segmented image will be saved in your current directory. Let's see the result.
from IPython.display import Image
Image("output.jpg")
output:
For example, every vegetable in the output image has been given its own segmentation mask by SAM.