The SAM semantic segmentation model is open source. In the AIGC era, will image matting tools be unified by large models? (Final Part)

Hello everyone, I'm Spirited Away, and I'm happy to share more of my AI learning notes with you today!

This is the final installment of the series "SAM Semantic Segmentation Model Open Source: In the AIGC Era, Will Image Matting Tools Be Unified by Large Models?"

In the previous two installments, we introduced Segment Anything, the model that segments everything, and Segment-and-Track Anything, which extends segmentation to video. The SAM family is improving remarkably fast!

Before revealing today's project, a question: if you were handed an image segmentation model for matting, how would you optimize it?

1. Optimize the model's application scenario

The most obvious idea is to extend the application scenario from images to video, which is how the Segment-and-Track Anything model was born.

The application scenario is one axis of optimization, so what other directions are there?

2. Optimize the structure of the model itself

The model itself, of course! That brings us to today's project. As 5G and mobile devices continue to spread, the demands on inference speed keep rising, so today's project compresses the model into a lightweight version.


1. The birth of the MobileSAM model

I'm sure everyone remembers the large SAM model we used to segment everything. Below is the home page of its demo site.


2. The structure of the MobileSAM model

The original SAM model consists of two main parts: an image encoder and a prompt-guided mask decoder.

The image encoder uses a ViT backbone with a very large parameter count; the ViT-H variant has 632M parameters.


The second part, the prompt encoder and mask decoder, is very lightweight, with no more than 4M parameters.
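To put those numbers in perspective, here is a quick back-of-the-envelope sketch. Note that the ~5.8M-parameter figure for MobileSAM's distilled TinyViT encoder comes from the MobileSAM paper, not this article; treat all figures as approximate:

```python
# Rough fp32 size estimate: 4 bytes per parameter.
def fp32_size_mb(num_params: int) -> float:
    """Model size in MB if every parameter is stored as a 32-bit float."""
    return num_params * 4 / 1e6

sam_encoder = 632_000_000      # SAM's ViT-H image encoder (per the text)
mobile_encoder = 5_800_000     # MobileSAM's TinyViT encoder (paper figure)
decoder = 4_000_000            # prompt encoder + mask decoder (< 4M, per the text)

print(f"SAM ViT-H encoder:   {fp32_size_mb(sam_encoder):>7.1f} MB")
print(f"MobileSAM encoder:   {fp32_size_mb(mobile_encoder):>7.1f} MB")
print(f"Encoder compression: {sam_encoder / mobile_encoder:.0f}x")
```

In other words, almost all of SAM's bulk is in the image encoder, which is exactly the component MobileSAM replaces; the decoder side is already small enough to leave alone.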


3. Deployment of the MobileSAM model

The key deployment difference between MobileSAM and SAM is that MobileSAM can run on a CPU.

Because of its size, the original SAM model can only be deployed on a cloud GPU server; here we deploy the MobileSAM model on a local PC.

1. Create a new conda environment named "mobileSAM" with Python 3.8

(PyTorch 1.7+ and Torchvision 0.8+ are also required)

conda create -n mobileSAM python=3.8

2. Activate the newly created environment

conda activate mobileSAM

3. After installation, check the versions of Python, PyTorch, and Torchvision in your environment with the following commands:

python --version
python -c "import torch; print(torch.__version__)"
python -c "import torchvision; print(torchvision.__version__)"

4. Install the Mobile Segment Anything package:

pip install git+https://github.com/ChaoningZhang/MobileSAM.git

5. Run the MobileSAM demo locally

cd app
python app.py

6. Install the latest versions of the gradio and timm packages

pip3 install gradio timm

7. Modify part of the code to deploy a shareable service link

Here the project runs in a Google Colab cloud notebook, using the gradio package to quickly build a web UI that serves as the deployment and invocation interface.

    def clear():
        # Reset the point-prompt panels (input image, segmented image)
        return None, None

    def clear_text():
        # Reset the text-prompt panels (input image, segmented image, text box)
        return None, None, None

    # clear_btn_e.click(clear, outputs=[cond_img_e, segm_img_e])
    clear_btn_p.click(clear, outputs=[cond_img_p, segm_img_p])

demo.queue()
# share=True publishes the demo through a public gradio.live URL
demo.launch(share=True)

Terminal log output:


Click the public URL to access the deployed service; the interface is shown in the figure below:


If this page appears, the project's inference service has been deployed successfully!
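Beyond the web demo, the pip-installed package also exposes a Python API. Below is a minimal point-prompt sketch adapted from the MobileSAM project README; the checkpoint path and the `segment_point` helper are my own illustrative choices, and it assumes the package from step 4 and the `mobile_sam.pt` weights are in place:

```python
import numpy as np

try:
    import torch
    from mobile_sam import sam_model_registry, SamPredictor
except ImportError:
    # Package not installed in this environment; shown for reference only.
    SamPredictor = None

def segment_point(image: np.ndarray, x: int, y: int,
                  checkpoint: str = "./weights/mobile_sam.pt"):
    """Return the best mask for a single foreground click at (x, y)."""
    device = "cuda" if torch.cuda.is_available() else "cpu"  # CPU works too
    model = sam_model_registry["vit_t"](checkpoint=checkpoint)
    model.to(device).eval()
    predictor = SamPredictor(model)
    predictor.set_image(image)                   # HWC uint8 RGB image
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),              # 1 = foreground click
        multimask_output=True,                   # return candidate masks
    )
    return masks[int(scores.argmax())]           # keep highest-scoring mask
```

Passing `point_labels=[1]` marks the click as foreground; with `multimask_output=True` the predictor returns several candidate masks, and we keep the one with the highest predicted score.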

4. Testing the MobileSAM model

In addition to the Google Colab service deployment above, the project has also published a demo on the HuggingFace community.

1. Upload an image and click the region you want to segment


2. Click the restart button to reset the segmentation labels


3. The HuggingFace demo runs on a CPU platform


This project can run on a CPU platform. The original SAM model, because of its size, runs only on GPU platforms, whereas MobileSAM also supports CPUs and other mobile/edge platforms.

5. MobileSAM performance comparison

That wraps up the introduction to MobileSAM. Looking back, we went from the original SAM segment-anything model, to Segment-and-Track Anything for video segmentation, and now to MobileSAM, a segment-anything model that deploys on CPUs and mobile devices. Large models are on an absolute tear; the pace is astonishing!

So how does the mobile version compare with the original SAM segmentation model?


As you can see, MobileSAM's segmentation results hold up quite well against the original SAM model.
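If you want to go beyond eyeballing the comparison, a common way to quantify how closely two segmentation masks agree is mask IoU (intersection over union). A self-contained sketch, with toy masks made up purely for illustration:

```python
import numpy as np

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks of the same shape."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    # Two empty masks agree perfectly by convention.
    return float(np.logical_and(a, b).sum() / union) if union else 1.0

# Toy example: one mask fully contained in a larger one.
m1 = np.zeros((4, 4), dtype=bool); m1[:2, :2] = True   # 4 pixels
m2 = np.zeros((4, 4), dtype=bool); m2[:2, :4] = True   # 8 pixels, contains m1
print(mask_iou(m1, m2))  # 4 / 8 = 0.5
```

Running SAM and MobileSAM on the same prompt and computing the IoU of their output masks gives a single agreement score per image.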

Here is the link to the deployed MobileSAM lightweight segment-anything demo (clickable):

https://huggingface.co/spaces/dhkim2810/MobileSAM

Click through and give it a try; the segmentation quality is genuinely good. Let's keep up with the development of large models together!

I'm Spirited Away, a coder who shares only practical content. See you next time!


Origin blog.csdn.net/baidu_39629638/article/details/131887737