Reproducing Segment-and-Track-Anything

Segment and Track Anything is an open-source project for segmenting and tracking any object in video, using either automatic or interactive methods. It relies on two main algorithms: SAM (Segment Anything Model) for automatic/interactive key-frame segmentation, and DeAOT (Decoupling Features in Associating Objects with Transformers, NeurIPS 2022) for efficient multi-object tracking and propagation. In the SAM-Track pipeline, SAM dynamically and automatically detects and segments new objects, while DeAOT tracks all identified objects.
Source code location: https://github.com/z-x-yang/Segment-and-Track-Anything

1. Download the code

git clone https://github.com/z-x-yang/Segment-and-Track-Anything.git
cd Segment-and-Track-Anything

2. Installation

Requirements: python>=3.9, pytorch>=1.10, torchvision>=0.11
1. Install dependencies

bash script/install.sh
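
Once the dependencies are installed, the version requirements listed above can be checked with a short script. This is only a minimal sketch: the helper names (parse_version, meets_minimum, check_environment) are made up for illustration and are not part of the project, and a missing package is simply reported as not meeting the requirement.

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn a string such as '1.10.0+cu113' into a tuple such as (1, 10, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first component with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


# Minimum versions copied from the requirements line of this post.
MINIMUMS = {"torch": "1.10", "torchvision": "0.11"}


def check_environment():
    """Return a dict mapping each requirement to True/False."""
    report = {"python": sys.version_info[:2] >= (3, 9)}
    for package, minimum in MINIMUMS.items():
        try:
            report[package] = meets_minimum(metadata.version(package), minimum)
        except metadata.PackageNotFoundError:
            report[package] = False  # package is not installed
    return report


print(check_environment())
```

Any entry reported as False points to a package that should be installed or upgraded before running the demo.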

2. Download the model
Create a new ckpt folder and put the model files into it:

https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
https://drive.google.com/file/d/1QoChMkTVxdYZ_eBlZhK2acq9KMQZccPJ/view

Alternatively, download the default weights with the provided script:

bash script/download_ckpt.sh
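
For the manual route, creating the ckpt folder and fetching the SAM checkpoint can also be sketched in Python. This is an illustration using only the standard library, not the project's own download logic; checkpoint_path and download_checkpoint are hypothetical helper names. The Google Drive link opens a viewer page rather than the file itself, so that file is easier to fetch through the browser.

```python
import os
import urllib.request

# SAM ViT-B checkpoint URL, as listed in this post.
SAM_VIT_B_URL = "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth"


def checkpoint_path(url, ckpt_dir="ckpt"):
    """Local path for a checkpoint: <ckpt_dir>/<file name taken from the URL>."""
    return os.path.join(ckpt_dir, url.rsplit("/", 1)[-1])


def download_checkpoint(url, ckpt_dir="ckpt"):
    """Create ckpt_dir if needed and download url into it, skipping existing files."""
    os.makedirs(ckpt_dir, exist_ok=True)
    target = checkpoint_path(url, ckpt_dir)
    if not os.path.exists(target):
        urllib.request.urlretrieve(url, target)
    return target


# Example call (needs network access and downloads a large file):
# download_checkpoint(SAM_VIT_B_URL)
```

The function returns the local path of the checkpoint, which is expected to sit inside the ckpt folder next to the tracking weights.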

3. Run Demo

Put the video to be processed into the ./assets folder.
Run demo.ipynb with Jupyter Notebook.
The generated output files are also saved in the ./assets folder.
The parameters of SAM-Track, DeAOT and SAM can be modified manually in model_args.py.

Alternatively, use the web UI:

python app.py

Open the link printed in the terminal in a browser.
Click input-video to upload a video.
Adjust the SAM-Track parameters.
Click Seg and Track to get the result.

Parameters:

  • aot_model : Selects the DeAOT/AOT version used for tracking and propagation.
  • sam_gap : Controls the frame interval at which SAM is run to add newly appearing objects.
    Increasing it means new objects are discovered less often, but inference becomes significantly faster.
  • points_per_side : Controls the number of points per side of the grid sampled over the image to generate masks.
    Increasing it improves the detection of small objects, but larger objects may be segmented at a finer granularity.
  • max_obj_num : Limits the maximum number of objects that SAM-Track can detect and track. The more objects, the higher the memory usage;
    about 16GB of memory can handle up to 255 objects.
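
The effect of sam_gap and points_per_side can be made concrete with a little arithmetic. The sketch below is not the project's code: it assumes that SAM is run on every sam_gap-th frame starting from frame 0, and it uses the fact that a square grid with points_per_side points on each side contains points_per_side squared prompt points. The function names and the parameter values are illustrative only.

```python
def sam_frames(num_frames, sam_gap):
    """Frame indices on which SAM would look for new objects (assumed schedule)."""
    return [i for i in range(num_frames) if i % sam_gap == 0]


def grid_points(points_per_side):
    """Total number of prompt points in a points_per_side x points_per_side grid."""
    return points_per_side ** 2


# Illustrative values only, not the project's defaults.
params = {"sam_gap": 5, "points_per_side": 16, "max_obj_num": 255}

# A larger sam_gap means fewer SAM runs over the same 20-frame clip.
print(sam_frames(20, params["sam_gap"]))  # [0, 5, 10, 15]
print(sam_frames(20, 10))                 # [0, 10]

# Doubling points_per_side quadruples the number of prompt points.
print(grid_points(16), grid_points(32))   # 256 1024
```

This is why raising sam_gap speeds up inference, while raising points_per_side gives finer coverage of the image at a higher per-frame cost.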


Origin blog.csdn.net/koukutou_mikiya/article/details/130283937