Paper reading notes (21) [CVPR2017]: Deep Spatial-Temporal Fusion Network for Video-Based Person Re-Identification

Introduction

(1) Motivation:

CNNs currently cannot capture the relationships between the frames of an image sequence; RNNs tend to neglect the information in the early frames of a video sequence and also fail to extract specific information such as gait; the Siamese loss and the Triplet loss do not take label information into account (???).

 

(2) Contribution:

Proposes a new end-to-end network framework called CNN and RNN Fusion (CRF), trained with a joint Siamese and Softmax loss function. Models are trained on features of both the whole body and body parts, yielding a more discriminative representation.

 

Method

(1) Framework:

 

(2) Input:

The input consists of two parts: the original image information and the optical flow information (which makes motion such as pedestrian gait clearer).

 

(3) CNN layers:

This layer uses the same CNN as the referenced work; for details, see [Paper reading notes (X) [CVPR2016]: Recurrent Convolutional Network for Video-based Person Re-Identification]

The CNN consists of three convolution modules, each comprising a convolution layer (kernel size 5 × 5), a max-pooling layer, and a ReLU layer. The input sequence is defined as: where T = 16, and the CNN layer is defined as:

where the result is expressed as:

 
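To get a feel for how three such modules shrink a frame, here is a minimal sketch that only tracks the spatial size. The stride of 1, the absence of padding, the 2 × 2 non-overlapping max pooling, and the 64 × 48 input size are all assumptions made for illustration; the notes only state the 5 × 5 kernel and the three-module structure.

```python
def conv_module_output(size, kernel=5, pool=2):
    """Spatial size after one module: convolution, then max pooling.

    Assumes stride 1 and no padding for the convolution, and
    non-overlapping pooling (neither is stated in the notes).
    """
    after_conv = size - kernel + 1
    return after_conv // pool


def cnn_output_size(height, width, modules=3):
    """Pass a (height, width) frame through the stacked convolution modules."""
    for _ in range(modules):
        height = conv_module_output(height)
        width = conv_module_output(width)
    return height, width


# Hypothetical 64 x 48 input frame.
print(cnn_output_size(64, 48))  # (4, 2)
```

The ReLU layer is left out because it does not change the spatial size.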

(4) Temporal pooling layer:

Mean pooling is used, defined as:

 
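Mean pooling over time can be sketched as follows: the T per-frame feature vectors are averaged element-wise into a single sequence-level vector. The function name and the toy features are made up for illustration.

```python
def temporal_mean_pool(frame_features):
    """Average T per-frame feature vectors into one sequence-level vector."""
    num_frames = len(frame_features)
    dim = len(frame_features[0])
    return [
        sum(frame[d] for frame in frame_features) / num_frames
        for d in range(dim)
    ]


# Three toy frames with 2-dimensional features.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(temporal_mean_pool(features))  # [3.0, 4.0]
```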

(5) RNN layer:

Each node is calculated as follows:

Temporal pooling layer:

 
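The node update itself is not reproduced in these notes, so the sketch below assumes a plain recurrent unit with a tanh non-linearity: each step mixes the current frame feature with the previous hidden state. The weight matrices `W_in` and `W_rec` are hypothetical names, and the per-step outputs would then be pooled over time.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def rnn_forward(frame_features, W_in, W_rec):
    """Run a plain tanh RNN over the sequence and return every step's output."""
    hidden = [0.0] * len(W_rec)  # initial state is all zeros
    outputs = []
    for feature in frame_features:
        pre_activation = [
            a + b
            for a, b in zip(matvec(W_in, feature), matvec(W_rec, hidden))
        ]
        hidden = [math.tanh(p) for p in pre_activation]
        outputs.append(hidden)
    return outputs


# Toy example: 2-dimensional features, identity input weights, weak recurrence.
W_in = [[1.0, 0.0], [0.0, 1.0]]
W_rec = [[0.5, 0.0], [0.0, 0.5]]
outputs = rnn_forward([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], W_in, W_rec)
print(len(outputs), len(outputs[0]))  # 3 2
```

Because an early frame only reaches the final state through repeated squashing and re-weighting, its contribution tends to fade, which is the weakness the fusion step is meant to address.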

(6) Temporal feature fusion:

Since the RNN tends to neglect the early frames, the lost information needs to be compensated for, so the outputs of the CNN and the RNN are combined, calculated as follows:

 
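The exact fusion formula is not reproduced in these notes. As one possible reading, the sketch below mean-pools the per-frame CNN features and the per-step RNN outputs separately and then concatenates the two pooled vectors. Concatenation is an assumption here, not the confirmed operation, and the function names are hypothetical.

```python
def mean_pool(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def fuse_temporal_features(cnn_frame_features, rnn_step_outputs):
    """Pool each stream over time, then concatenate (assumed fusion operator)."""
    x_cnn = mean_pool(cnn_frame_features)  # keeps early frames at full weight
    x_rnn = mean_pool(rnn_step_outputs)    # carries the temporal context
    return x_cnn + x_rnn                   # list concatenation


cnn_features = [[1.0], [3.0]]
rnn_outputs = [[0.0, 2.0], [2.0, 4.0]]
print(fuse_temporal_features(cnn_features, rnn_outputs))  # [2.0, 1.0, 3.0]
```

Whatever the actual operator is, the point is that the pooled CNN feature weights every frame equally, so the early frames are represented in the fused vector.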

(7) Multi-layer loss:

The loss function combines a Siamese loss and a Softmax loss:

 
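A joint loss of this kind can be sketched as follows. The Siamese term here uses the standard contrastive form (pull same-identity pairs together, push different-identity pairs beyond a margin), and the Softmax term is the usual cross-entropy on identity logits. The margin value, the unweighted sum, and all function names are assumptions, since the formula is not reproduced in these notes.

```python
import math


def contrastive_loss(feat_a, feat_b, same_identity, margin=2.0):
    """Standard contrastive (Siamese) loss on a pair of feature vectors."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)))
    if same_identity:
        return 0.5 * dist ** 2
    return 0.5 * max(margin - dist, 0.0) ** 2


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against the true identity label."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(feat_a, feat_b, logits_a, logits_b, label_a, label_b):
    """Siamese term on the pair plus one Softmax term per sequence."""
    return (
        contrastive_loss(feat_a, feat_b, label_a == label_b)
        + softmax_cross_entropy(logits_a, label_a)
        + softmax_cross_entropy(logits_b, label_b)
    )


# Same-identity pair whose features are 5 units apart.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_identity=True))  # 12.5
```

The Softmax term is what brings the identity labels into training, which addresses the label-information concern raised in the motivation.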

(8) Local/global feature fusion:

The pedestrian body is divided into an upper half and a lower half, features are extracted from each part, and these are then fused with the whole-body features:

 
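A minimal sketch of the part-based idea, assuming the split is a simple cut at the middle row and that the three feature vectors are fused by concatenation (the notes do not give the fusion formula). `mean_intensity` is a hypothetical stand-in for the trained network, and a single extractor is reused for all three regions only for brevity.

```python
def split_body(frame):
    """Split a frame (list of pixel rows) into upper and lower halves."""
    middle = len(frame) // 2
    return frame[:middle], frame[middle:]


def mean_intensity(region):
    """Stand-in feature extractor: the mean pixel value as a 1-D feature."""
    pixels = [p for row in region for p in row]
    return [sum(pixels) / len(pixels)]


def fuse_local_global(frame, extract):
    """Concatenate whole-body, upper-body and lower-body features."""
    upper, lower = split_body(frame)
    return extract(frame) + extract(upper) + extract(lower)


frame = [[1.0, 1.0], [1.0, 1.0], [3.0, 3.0], [3.0, 3.0]]
print(fuse_local_global(frame, mean_intensity))  # [2.0, 1.0, 3.0]
```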

 

Experiments

(1) Experimental setup:

① Datasets: PRID-2011, iLIDS-VID, MARS;

② Parameter settings: epochs > 10, video sequence length = 16, W1 = W2 = W3 = 1.

 

(2) Experimental results:

 

Origin www.cnblogs.com/orangecyh/p/12304250.html