Facebook open-sources PySlowFast, a video understanding code library

In recent years, Facebook Artificial Intelligence Research (FAIR) has made significant contributions to video understanding research. At ICCV in October 2019, the team introduced PySlowFast, a Python-based code library. FAIR has now open-sourced PySlowFast together with a library of pre-trained models, and promises to keep adding cutting-edge resources to the project.

The name "PySlowFast" comes from the model's two-pathway design: a Slow pathway that runs at a low frame rate to capture spatial semantics, and a lightweight Fast pathway that runs at a high frame rate to capture motion at fine temporal resolution. Together, the two pathways learn useful temporal information for video recognition.
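The two frame rates can be illustrated with a small sketch of how frame indices might be chosen for each pathway. This is not PySlowFast's own sampling code; the function name `sample_pathways` and the values `slow_stride=16` and `alpha=8` (the ratio between the Fast and Slow frame rates) are assumptions chosen for illustration.

```python
def sample_pathways(num_frames, slow_stride=16, alpha=8):
    """Return (slow_indices, fast_indices) for a clip of num_frames frames.

    Hypothetical helper: the Slow pathway keeps one frame every
    `slow_stride` frames, and the Fast pathway samples `alpha` times
    more densely from the same clip.
    """
    if slow_stride % alpha != 0:
        raise ValueError("slow_stride must be divisible by alpha")
    fast_stride = slow_stride // alpha
    slow_indices = list(range(0, num_frames, slow_stride))
    fast_indices = list(range(0, num_frames, fast_stride))
    return slow_indices, fast_indices


slow_idx, fast_idx = sample_pathways(num_frames=64)
print(len(slow_idx), len(fast_idx))  # 4 32
```

With these assumed values, a 64-frame clip gives the Slow pathway 4 frames and the Fast pathway 32 frames, so the Fast pathway sees 8 times as many time steps.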

The introduction of PySlowFast addresses two needs of ML researchers. First, the open source community has lacked a concise, efficient, and easily modifiable codebase for video understanding. Second, reproducing today's state-of-the-art deep learning models can be a headache, as such models typically require tens of GFLOPs of computation, days of training, and repeated experimental tweaks to get every detail right. For many researchers, this is very time-consuming and resource-intensive.

PySlowFast enables researchers to easily reproduce video classification and action detection algorithms, from basic baselines to cutting-edge models. FAIR has also open-sourced some pre-trained models, saving researchers the trouble of repeating the training.

Performance of PySlowFast on the Kinetics-400 video classification dataset

PySlowFast also provides interfaces that, with simple edits, support tasks such as multimodal video understanding and self-supervised video learning. FAIR said it will keep PySlowFast up to date with cutting-edge algorithms, so that it remains a current and reliable benchmark in the field of video understanding.

After installation, users can download the pre-trained models and corresponding configuration files listed in MODEL_ZOO, and run the following command to test performance on different video datasets:

python tools/run_net.py \
  --cfg configs/Kinetics/C2D_8x8_R50.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_dataset \
  NUM_GPUS 2
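The command takes a configuration file via --cfg followed by KEY VALUE pairs that override individual settings. As a minimal sketch, the same argument list could be assembled from Python when scripting several runs. The helper `build_run_net_command` is hypothetical and not part of PySlowFast; it uses only the script path, flag, and keys shown in the command above.

```python
import shlex


def build_run_net_command(cfg_path, overrides):
    """Assemble the argument list for tools/run_net.py.

    Hypothetical helper: `overrides` maps config keys (for example
    "NUM_GPUS") to values, which are appended as KEY VALUE pairs.
    """
    cmd = ["python", "tools/run_net.py", "--cfg", cfg_path]
    for key, value in overrides.items():
        cmd.extend([key, str(value)])
    return cmd


cmd = build_run_net_command(
    "configs/Kinetics/C2D_8x8_R50.yaml",
    {"DATA.PATH_TO_DATA_DIR": "path_to_your_dataset", "NUM_GPUS": 2},
)
# Print the command as a single shell-safe line for inspection.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root, assuming PySlowFast and its dependencies are installed.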

The PySlowFast codebase is available on GitHub. The related papers, both on arXiv, are "SlowFast Networks for Video Recognition" and "Non-local Neural Networks".

Origin blog.csdn.net/virone/article/details/131977140