Progressive Sparse Local Attention for Video Object Detection

Category: Other | 2019-04-14 12:11:22

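Motivation:

Earlier methods that rely on FlowNet have several drawbacks.

1. Adding an optical flow network to the detection framework greatly increases the detector's parameter count, which makes the model unsuitable for mobile devices.

2. Optical flow was originally designed to describe the displacement of pixels between two images. Applying it directly to high-level feature maps introduces artifacts. In particular, a shift of one cell on a high-level feature map can correspond to a displacement of 10-20 pixels in the image, and optical flow estimation is error-prone for large displacements.

The paper therefore drops the optical flow network and proposes a new module, Progressive Sparse Local Attention (PSLA), to replace it and propagate features between high-level semantic feature maps.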

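Specifically, let \(F_t,F_{t+\epsilon}\) be the features of frames \(I_t,I_{t+\epsilon}\). PSLA first computes correspondence weights between the two feature maps, then aligns the features by convolving the features with these weights. The mechanism resembles attention but differs from it in ways described later.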
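The two steps above (compute correspondence weights, then aggregate features with them) can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: it assumes a dense square window of hypothetical size `radius`, dot-product similarity, and softmax normalization, and it does not model the progressive sparse sampling pattern, which this post does not describe. The function name `local_attention_align` is made up for illustration.

```python
import numpy as np

def local_attention_align(feat_src, feat_dst, radius=2):
    """Align feat_src (C, H, W) to feat_dst (C, H, W) with a dense local window.

    Assumptions (not from the post): dot-product similarity, softmax weights,
    and a dense (2*radius+1)^2 neighborhood instead of a sparse one.
    """
    c, h, w = feat_src.shape
    k = 2 * radius + 1

    # Zero-pad the source so every position has a full window.
    padded = np.pad(feat_src, ((0, 0), (radius, radius), (radius, radius)))
    # Track which window entries fall inside the original map.
    valid = np.pad(np.ones((h, w), dtype=bool), radius)

    offsets = [(dy, dx) for dy in range(k) for dx in range(k)]
    # One shifted copy of the source per offset: (k*k, C, H, W).
    shifted = np.stack([padded[:, dy:dy + h, dx:dx + w] for dy, dx in offsets])
    inside = np.stack([valid[dy:dy + h, dx:dx + w] for dy, dx in offsets])

    # Step 1: correspondence weights between each target cell and its
    # neighboring source cells, normalized over the window.
    scores = np.einsum("kchw,chw->khw", shifted, feat_dst)
    scores = np.where(inside, scores, -np.inf)  # ignore out-of-map cells
    scores = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)

    # Step 2: weighted sum of source features gives the aligned feature.
    aligned = np.einsum("khw,kchw->chw", weights, shifted)
    return aligned, weights
```

With `radius=0` the window contains only the cell itself, so the function returns `feat_src` unchanged. Larger radii let each target cell pull features from nearby source cells according to their similarity.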

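As in earlier video object detection methods, the paper extracts features only on sparse key frames and uses PSLA to obtain the features of non-key frames. PSLA is used in two places:

1. Propagating key-frame features to non-key frames. In addition, a lightweight quality network is applied on non-key frames so that the low-level features of the non-key frame complement the propagated high-level features. The paper calls this Dense Feature Transforming (DFT).

2. Propagating features between key frames. In addition, an update network recursively updates the features on key frames. The paper calls this Recursive Feature Updating (RFU).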

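Overview of the proposed framework:

Figure 1. The framework is illustrated with two key frames \(I^{k1},I^{k2}\) and one non-key frame \(I^i\). Key frames are first fed to \(N_f\) to obtain high-level features \(F_h^k\), while non-key frames are fed to a lightweight network \(N_l\) to extract low-level features \(F_l^i\).

RFU uses the temporal feature \(F_t\) to enhance the high-level features, where \(F_t\) is obtained by the update network, which recursively updates it with the high-level features. At the same time, DFT propagates features between key frames and non-key frames.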
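The control flow described above can be sketched as a short loop. This is a rough illustration, not the paper's code: every callable (`n_f`, `n_l`, `psla`, `update_net`, `quality_net`) is a hypothetical placeholder supplied by the caller, the fixed `key_interval` schedule is an assumption, and the exact way each network combines its inputs is guessed from the description rather than taken from the source.

```python
def propagate_features(frames, n_f, n_l, psla, update_net, quality_net,
                       key_interval=10):
    """Return one feature per frame, following the key / non-key split.

    All callables are hypothetical stand-ins:
      n_f(frame)             -> high-level feature F_h (key frames)
      n_l(frame)             -> low-level feature F_l (non-key frames)
      psla(source, guide)    -> source feature aligned to the guide
      update_net(aligned, h) -> new temporal feature F_t (RFU)
      quality_net(aligned, l)-> refined non-key feature (DFT)
    """
    temporal = None  # F_t, carried across key frames
    outputs = []
    for idx, frame in enumerate(frames):
        if idx % key_interval == 0:
            # Key frame: extract high-level features, then RFU.
            high = n_f(frame)
            if temporal is None:
                temporal = high  # first key frame has nothing to align
            else:
                temporal = update_net(psla(temporal, high), high)
            outputs.append(temporal)
        else:
            # Non-key frame: cheap low-level features, then DFT.
            low = n_l(frame)
            outputs.append(quality_net(psla(temporal, low), low))
    return outputs
```

Passing simple string-building lambdas for the five callables is a quick way to print which networks touch each frame. Only key frames pay the cost of `n_f`; every other frame goes through the lightweight `n_l` path.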

PSLA:

Reposted from www.cnblogs.com/hf19950918/p/10704500.html
