CVPR 2016 video object segmentation: paper study notes

 

Abstract: Video object segmentation, a binary labelling problem, is vital in various applications including object tracking, action recognition, video summarization, video editing, object-based encoding and video retrieval. This paper presents an overview of recent strategies in video object segmentation, focusing on the techniques for solving challenges like complex moving backgrounds, illumination changes, occlusions, motion blur, shadow effects and viewpoint variation. Significant works that have evolved in this research field over recent years are categorized based on the challenges solved by the researchers. A list of challenging datasets and evaluation metrics available for video object segmentation is presented. Finally, research gaps in this domain are discussed.
Abstract (in my own words): Video object segmentation, a binary labelling problem, is an important technique used in many fields, such as object tracking, action recognition, video encoding and video retrieval.
The paper reviews recent video segmentation strategies, mainly those that deal with complex moving backgrounds, illumination changes, occlusion, motion blur, shadow changes and viewpoint changes.
It groups the important works by the challenges they address, lists datasets and evaluation metrics for study, and finally discusses the research gaps in this field.
The internet today carries a massive amount of video data, thanks to developments in storage devices and handheld imaging systems. Terabytes of video are regularly generated for various useful applications such as surveillance, news broadcasting, telemedicine, etc. According to Cisco's ‘Visual Networking Index (VNI)’, internet video traffic will grow threefold from 2015 to 2020. Manually extracting semantic information from this enormous amount of internet video is infeasible, which creates a need for automated methods to annotate or derive useful information from the video data for video management and retrieval [1]. Hence, one of the essential steps for video processing and retrieval is video object segmentation, a binary labelling problem for differentiating the foreground objects accurately from the background. Video object segmentation aims at partitioning every frame in a video into meaningful objects by grouping pixels along the spatio-temporal direction that exhibit coherency in appearance and motion [2]. The video object segmentation task is highly challenging for the following reasons: (i) the unknown number of objects in a video, (ii) the varying background in a video and (iii) the occurrence of multiple objects in a video [3].
Existing approaches in video segmentation can be broadly classified into two categories, viz. interactive methods and unsupervised methods. Interactive object segmentation methods require human intervention in the initialization process, while unsupervised approaches can perform object segmentation automatically. In semi-supervised approaches, user intervention is required for annotating the initial frames, and these annotations are propagated to the remaining frames in the video. Automated object segmentation approaches [7][8][9] can segment any video data into meaningful objects without user interaction, based on object proposals and motion cues from the video. Most automated methods assume that only a single object is moving throughout the video, and they use only the motion information for segmenting the object from the background. This assumption leads to poor segmentation when the motion of the object is discontinuous [11].
The existing surveys on video object segmentation [12] [13] [14] [15] mostly describe techniques available for image segmentation rather than for video data. In [15] the authors classified the approaches in video segmentation as inference and feature modes. The segmentation techniques proposed so far to improve the segmentation results are grouped as inference modes, and methods that depend on features like depth, motion and histogram are termed feature modes. From this observation, it is evident that none of these surveys discusses the segmentation approaches from the perspective of the challenges solved by the algorithm. Hence this paper categorizes the significant work contributed by researchers in video object segmentation based on the issues resolved by the respective authors. Several issues degrade the segmentation performance: moving background, moving camera, illumination variation, occlusion, shadow effect, viewpoint variation, etc. Moreover, a proposed algorithm should provide a tradeoff between segmentation accuracy and complexity. As depicted in fig. 1, this paper classifies the video object segmentation task as:
1. Issue tackling mode
2. Complexity reduction mode and
3. Inference mode
The main contributions of this paper are:
- Summarizing the recent activities in the video object segmentation domain.
- Categorizing the significant works in this research field meaningfully and
- Presenting a list of datasets and evaluation metrics needed for developing an efficient video object segmentation framework.
Organization of this paper: Section II describes the algorithms that contribute significantly to tackling the issues (discussed earlier) involved in video object segmentation. Section III presents an overview of segmentation approaches with reduced complexity available in the literature. Section IV provides a gist of object segmentation techniques that fall under inference mode. Section V lists the datasets and the evaluation metrics used in these segmentation approaches and discusses research gaps in the video object segmentation field. Section VI concludes this study.
 
The online world is now filled with all kinds of video data. In short, the paper spends a long passage telling you how important this is. The goal of video object segmentation is to separate, in every frame of the video, the objects whose appearance and motion are coherent. However, video segmentation faces the following difficulties:
1. The number of target objects in the video is unknown
2. The background changes
3. There are multiple target objects
There are currently two main kinds of method: interactive and unsupervised. Naturally, the focus here is on the unsupervised methods, which segment video objects automatically.
In semi-supervised methods, the initial frames have to be annotated with the object we want to segment, whereas unsupervised methods do not need this. Many current video segmentation algorithms assume that only a single object is moving, which gives poor results when the target does not move continuously. The authors of [15] classify video segmentation approaches into inference modes and feature modes, while references [12] to [15] mostly summarize image segmentation methods. From these observations, it is clear that hardly anyone has surveyed video segmentation algorithms from the point of view of the challenges they solve. The paper therefore organizes its survey around the issues that can degrade segmentation accuracy, such as moving backgrounds, a moving camera, illumination changes, occlusion, shadow effects and viewpoint changes.
Furthermore, a proposed algorithm should trade off accuracy against complexity.
So the paper is organized into three modes:
1. Issue tackling mode
2. Complexity reduction mode
3. Inference mode
The paper makes three main contributions: 1. it summarizes recent work in this area; 2. it categorizes the methods currently in use; 3. it provides a list of datasets and evaluation metrics for readers to practise with.
II. ISSUE TACKLING MODE
This section details the ‘issue tackling mode’, the first category of video object segmentation approaches. Though several issues (as discussed earlier) affect the performance of segmentation approaches, the commonly occurring problems are moving background, occlusion, shadow, rain, moving camera, illumination and viewpoint variation.
A. Surveillance video systems
Traffic surveillance systems detect and recognize moving vehicles (objects) in traffic video sequences. For any traffic surveillance system, vehicle segmentation is the fundamental step and the basis for tracking vehicle movements. However, vehicle segmentation in traffic video is still challenging due to moving objects and illumination variations. To solve this issue, an unsupervised neural-network-based background model has been proposed for real-time object segmentation. In this work, the neural network serves both as an adaptive model of the background in a video sequence and as a classifier of pixels as background or foreground. The segmentation time of the neural network is reduced by implementing it on an FPGA kit.
Though this neural-network-based background subtraction method achieves good segmentation accuracy, it works well only under slightly varying illumination and a moving background, and reducing the time complexity comes at a high cost [16]. Following this, Appiah et al. [17] proposed an integrated hardware implementation of moving object segmentation in real-time video streams under varying lighting conditions. Two algorithms, for multimodal background modelling and connected component analysis, are implemented on a single-chip FPGA. This method segments objects under varying illumination at high processing speed. The two methods described so far do not take rain into account. In rainy conditions, shadows and colour reflections are the major problems to be tackled. A conventional video object segmentation algorithm that combines background-construction-based and foreground-extraction-based video object segmentation has been proposed. The foreground is separated from the background using a histogram-based change detection technique, and object regions are segmented accurately by detecting the initial moving object masks based on a frame difference mask. Shadow and colour reflection regions are removed by a diamond window mask and colour analysis of the moving object, respectively. The segmentation of moving objects is refined by morphological operations. The segmentation results for moving objects in rainy conditions using [18] are shown in fig. 2. As future work, the threshold is to be obtained adaptively and adjusted to the content of the video automatically. Later, Chien et al. [19] proposed a video object segmentation and tracking technique for smart cameras in visual surveillance networks. A multi-background model based on a threshold decision algorithm has been developed for video object segmentation under drastic changes in illumination and background clutter. In this method, the threshold is selected robustly without user input and, unlike a per-pixel background model, it avoids possible error propagation. Another algorithm for extracting objects from videos captured by a static camera has been proposed to solve issues like waving trees, camouflaged regions and sleeping foreground objects [20]. In this method, the reference background is obtained by averaging some initial frames. Temporal processing for object extraction does not consider spatial correlation amongst the moving objects across frames. Hence, an approximate motion field is derived using background subtraction and a temporal difference mechanism. The background model adapts to temporal changes (swaying trees, rippling water, etc.), which allows the complementary object in the scene to be extracted.
For traffic detection, the most important thing is to segment the wide variety of vehicles, but because the objects are always moving, this is still difficult. To solve this problem, an unsupervised neural network is used both as an adaptive model of the video background and as a classifier of pixels into foreground and background. The computation time of the neural network can be reduced by running it on an FPGA. Although this neural network achieves good results as a background subtraction method, it only works under slight changes in illumination and a moving background, and reducing the time complexity is expensive. Appiah et al. therefore presented an integrated hardware implementation in which both algorithms run on a single-chip FPGA and which copes well with lighting changes. However, it does not handle rain, where shadows and colour reflections are the main problems. A conventional algorithm combines background-construction-based and foreground-extraction-based segmentation. The foreground is separated using histogram-based change detection, and the object regions are segmented by detecting the initial moving object masks from a frame difference mask (I do not yet understand what this means). In any case, shadow regions and colour reflection regions are removed by a diamond window mask and by colour analysis of the moving object, respectively. The segmented moving objects are then refined by morphological operations, and the results are shown in fig. 2. As future work, the algorithm should obtain the threshold adaptively and adjust to the video content automatically.
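To make the "frame difference mask" and the "morphological operations" steps less abstract, here is a minimal sketch in plain Python. It is not the implementation from [18]: the function names, the 3x3 window, the threshold of 30 and the toy 6x6 frames are all assumptions chosen for illustration, and the histogram-based change detection and diamond window mask are left out. A pixel is marked as moving when its grey level changes by more than the threshold between two frames, and a morphological opening (erosion followed by dilation) then removes isolated noise pixels.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold):
    """Return a binary mask: 1 where the grey level changed by more than threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def _window(mask, y, x):
    """Values in the 3x3 window centred on (y, x); outside the image counts as 0."""
    h, w = len(mask), len(mask[0])
    return [
        mask[j][i] if 0 <= j < h and 0 <= i < w else 0
        for j in (y - 1, y, y + 1)
        for i in (x - 1, x, x + 1)
    ]


def erode(mask):
    """Keep a pixel only if its whole 3x3 window is foreground."""
    return [[int(all(_window(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 window is foreground."""
    return [[int(any(_window(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(mask))


# Toy 6x6 example (assumed data): a 3x3 object appears, plus one noisy pixel.
prev_frame = [[0] * 6 for _ in range(6)]
curr_frame = [[0] * 6 for _ in range(6)]
for y in range(1, 4):
    for x in range(1, 4):
        curr_frame[y][x] = 200
curr_frame[5][5] = 200  # isolated noise pixel

raw_mask = frame_difference_mask(prev_frame, curr_frame, threshold=30)
clean_mask = opening(raw_mask)
```

In this toy case the raw mask contains the 3x3 object and the noise pixel, while the opened mask keeps only the object. On real frames one would normally use an image library rather than nested lists; the lists are only here to keep the sketch self-contained.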
Later, Chien et al. proposed an algorithm for smart cameras in visual surveillance networks. In this method, the threshold is chosen robustly without any input from the user and, unlike a per-pixel model, it avoids possible error propagation (I do not understand what this means). Another method works with a static camera and is designed to handle waving trees and camouflaged regions. It obtains the reference background by averaging some initial frames (what does that mean?), but the temporal processing does not consider the spatial correlation among moving objects across frames.
 
My own reading, looking at the result figure together with the algorithm, is that it filters out the light and shadow effects and keeps only the real target.
 
I really did not understand the last few sentences, so here is a direct translation:
 
Hence, an approximate motion field is derived using background subtraction and a temporal difference mechanism.
The background model adapts to temporal changes (swaying trees, rippling water, etc.) in order to extract the complementary objects in the scene.
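A small sketch may help with the part about "averaging some initial frames" and combining background subtraction with temporal differencing. This is only my reading of the description of [20], not the authors' method. The assumptions are: the function names are made up, the two masks are combined with a logical AND, the background is updated with a running average whose rate `alpha` is arbitrary, and frames are plain lists of grey levels.

```python
def mean_background(frames):
    """Reference background: per-pixel mean over a list of initial frames."""
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(w)] for y in range(h)]


def diff_mask(a, b, threshold):
    """Binary mask: 1 where two images differ by more than threshold."""
    return [[1 if abs(pa - pb) > threshold else 0 for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def moving_object_mask(background, prev_frame, curr_frame, threshold):
    """Combine background subtraction and temporal difference (assumed: logical AND)."""
    bg_sub = diff_mask(curr_frame, background, threshold)    # differs from background
    temporal = diff_mask(curr_frame, prev_frame, threshold)  # changed since last frame
    return [[b & t for b, t in zip(rb, rt)] for rb, rt in zip(bg_sub, temporal)]


def update_background(background, frame, mask, alpha=0.05):
    """Running-average update, applied only where the mask says 'background' (0)."""
    return [[(1 - alpha) * b + alpha * f if m == 0 else b
             for b, f, m in zip(rb, rf, rm)]
            for rb, rf, rm in zip(background, frame, mask)]


# Toy 1x4 example (assumed data): a bright object moves from x=1 to x=2.
initial_frames = [[[10, 10, 10, 10]], [[12, 12, 12, 12]], [[8, 8, 8, 8]]]
background = mean_background(initial_frames)
prev_frame = [[10, 200, 10, 10]]
curr_frame = [[10, 10, 200, 10]]
mask = moving_object_mask(background, prev_frame, curr_frame, threshold=30)
background = update_background(background, curr_frame, mask)
```

The temporal difference alone would also flag the old position of the object (x=1), because the pixel changed there too. Requiring a difference from the averaged background as well keeps only the current position. The running-average update is one simple way to let the model follow slow changes such as swaying trees; the paper does not say which update rule [20] uses.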
 
B. Generic video sequences
Moving foreground object extraction from a given generic video shot is one of the vital tasks for content representation and retrieval in many computer vision applications. An iterative method based on energy minimization has been proposed for efficiently segmenting the primary moving object from moving-camera video sequences. The initial object segmentation obtained using graph cut is improved repeatedly using features extracted over a set of neighbouring frames [21]. Thus, this iterative method can efficiently segment the objects in video shots captured by a moving camera. A video object segmentation system based on a conditional random field model, capable of segmenting multiple moving objects from a complex background, has been proposed [22]. In this work, the complementary properties of point and region trajectories are exploited by transferring the labels of sparse point trajectories to region trajectories. Region trajectories based on shape consistency provide a robust way to segment spatially overlapping region trajectories. As the region trajectories are extracted from a hierarchical image over-segmentation, the method segments meaningful regions over time.
[...] time and computational complexity. Unsupervised segmentation of moving-camera video sequences using inter-frame change detection has also been proposed [23].
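For readers unfamiliar with the energy-minimization view behind the graph-cut method mentioned at the start of this subsection, the generic form of a binary labelling energy is written out below. This is the standard textbook formulation, not the specific energy of [21], whose data and smoothness terms are not given here. The symbols are assumptions: l_p is the label (0 for background, 1 for foreground) of pixel p, D_p is a data term measuring how badly label l_p fits pixel p, V_pq is a smoothness term penalizing different labels on neighbouring pixels, N is the set of neighbouring pixel pairs and lambda weights the two terms.

```latex
E(L) = \sum_{p} D_p(l_p) \;+\; \lambda \sum_{(p,q) \in \mathcal{N}} V_{pq}(l_p, l_q),
\qquad l_p \in \{0, 1\}
```

A graph cut finds the labelling L that minimizes E(L) exactly when V_pq is submodular. An iterative method of the kind described above would re-estimate the data term from the current segmentation and neighbouring frames, then minimize again.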
Generic video sequences
The paper mentions an iterative algorithm: it starts from an initial segmentation and then refines it using features extracted from neighbouring frames, so it can segment objects in video shot by a moving camera (I am not sure I have understood this correctly). Paper [22] proposes the following:
A video object segmentation system based on a conditional random field model, which can segment multiple moving objects from a complex background (Google translation).
Is the main idea of [22] to transfer labels from sparse point trajectories to dense region trajectories?
Does [23] describe an unsupervised method?
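On the question about [22]: the paper text says that labels of sparse point trajectories are transferred to region trajectories. One very simple way to picture such a transfer is a majority vote, sketched below. This is a deliberately simplified stand-in and not the conditional random field model of [22]. The function name, the dictionary layout and the example labels are all hypothetical.

```python
from collections import Counter


def transfer_labels(point_labels, point_to_region):
    """Give each region the most common label among the point trajectories inside it.

    point_labels:    {point_id: label} for the sparse point trajectories
    point_to_region: {point_id: region_id} saying which region each point falls in
    """
    votes = {}
    for point_id, label in point_labels.items():
        region_id = point_to_region.get(point_id)
        if region_id is None:
            continue  # this point does not fall inside any region trajectory
        votes.setdefault(region_id, Counter())[label] += 1
    return {region_id: counts.most_common(1)[0][0]
            for region_id, counts in votes.items()}


# Hypothetical example: four labelled point trajectories, two region trajectories.
point_labels = {"p1": "car", "p2": "car", "p3": "background", "p4": "background"}
point_to_region = {"p1": "r1", "p2": "r1", "p3": "r1", "p4": "r2"}
region_labels = transfer_labels(point_labels, point_to_region)
```

Here region r1 receives the label "car" (two votes against one) and r2 receives "background". So the intuition is indeed sparse to dense: a few reliable point tracks decide the label of a whole region. In [22] this decision is made inside a conditional random field rather than by a plain vote.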
 
 
 
 
 
 
 

Origin www.cnblogs.com/coolwx/p/11462639.html