Hello everyone, this is [Come to a Scallion Cake]. This time I'm sharing paper notes on object tracking~
I spent some time researching object tracking algorithms (mainly single object tracking, SOT) and studied more than 40 top-conference papers along the way. I have therefore started a new column, Object Tracking (SOT) | Top Conference Papers | Study Notes, to share my paper notes so that you can quickly catch up on progress in object tracking and pick up the ideas behind different algorithms. You are welcome to discuss and leave your own thoughts in the comment area~
This article covers my notes on the 6 ICCV object tracking papers. Notes on other top-conference papers are in the other articles of the column; you are welcome to follow along. The links are as follows:
Object Tracking | Last Three Years | 45 Top Conference Papers Organized
Object Tracking | Seven Datasets | Organized
Object Tracking | Paper Notes Sharing | ICCV-6 Papers
Object Tracking | Paper Notes Sharing | ICCV-2 Papers
Object Tracking | Paper Notes Sharing | ECCV-6 Papers
Object Tracking | Paper Notes Sharing | CVPR-12 Papers
Object Tracking | Paper Notes Sharing | CVPR-10 Papers (1)
Object Tracking | Paper Notes Sharing | CVPR-10 Papers (2)
Article directory
- 1. Paper titles
- 2. Main idea
- 3. Specific articles
  - Learning Spatio-Temporal Transformer for Visual Tracking
  - Learning to Track Objects from Unlabeled Videos
  - Saliency-Associated Object Tracking
  - HiFT: Hierarchical Feature Transformer for Aerial Tracking
  - Learn to Match: Automatic Matching Network Design for Visual Tracking
  - Learning to Adversarially Blur Visual Object Tracking
1. Paper titles
Paper title |
---|
Learning to Track Objects from Unlabeled Videos |
Learning Spatio-Temporal Transformer for Visual Tracking |
Learning to Adversarially Blur Visual Object Tracking |
HiFT: Hierarchical Feature Transformer for Aerial Tracking |
Learn to Match: Automatic Matching Network Design for Visual Tracking |
Saliency-Associated Object Tracking |
2. Main idea
The main ideas across these papers are: using transformers to exploit spatio-temporal features, unsupervised tracking, focusing on locally salient regions, automatically designing the matching network to improve the Siamese matching operator, and adversarial attacks on trackers.
3. Specific articles
Learning Spatio-Temporal Transformer for Visual Tracking
This paper designs a tracker around a transformer and combines spatial and temporal features.
It is a very effective method and well worth a look!
Earlier Siamese-style trackers used only spatial features, so they cope poorly with scenes where the target disappears or its appearance changes a lot.
Transformers handle long-range interactions well, as they do in sequence modeling.
Spatial information carries the object's appearance, which is used for target localization; temporal information carries the changes in the object's state between frames. Because transformers are strong at modeling global dependencies, the tracker uses one to integrate spatial and temporal information and produce discriminative spatio-temporal features for object localization.
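To make the "global dependencies" point concrete, here is a minimal sketch of how one attention pass can mix spatial and temporal cues. It assumes (this is not stated in the notes above) that the temporal cue enters as a second, more recent template alongside the initial one; the function names, the identity projections, and the toy 2-D features are all hypothetical and only for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(tokens):
    """Single-head self-attention with identity projections (no learned weights)."""
    scale = math.sqrt(len(tokens[0]))
    out = []
    for q in tokens:
        weights = softmax([dot(q, k) / scale for k in tokens])
        out.append([sum(w * v[d] for w, v in zip(weights, tokens))
                    for d in range(len(q))])
    return out

def fuse_spatio_temporal(initial_template, dynamic_template, search_region):
    """Concatenate all token sets, attend globally, return the search-region tokens."""
    tokens = initial_template + dynamic_template + search_region
    fused = self_attention(tokens)
    return fused[-len(search_region):]

# Toy 2-D features: hypothetical values, not taken from the paper.
initial_template = [[1.0, 0.0]]           # appearance in the first frame
dynamic_template = [[0.8, 0.2]]           # assumed: a more recent view of the target
search_region = [[0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

fused = fuse_spatio_temporal(initial_template, dynamic_template, search_region)
print(fused)
```

Every search-region token attends to both templates and to every other search token in a single step, which is the kind of long-range interaction a cross-correlation between one template and one search region cannot express.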
Learning to Track Objects from Unlabeled Videos
This paper proposes an unsupervised tracker.
It identifies three challenges in earlier unsupervised tracking: moving object discovery, exploiting rich temporal variation, and online update.
To address these challenges, a new unsupervised tracking method is proposed. First, unsupervised optical flow and dynamic programming are used to sequentially sample moving objects; then a naive Siamese tracker is trained from scratch using single-frame pairs; finally, a cycle memory learning scheme is used to continue training the tracker so that it supports online updates.
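The dynamic programming step can be pictured with a small sketch. It assumes each frame already has a few candidate boxes with a motion score (for instance derived from optical flow); the tuple layout, the `motion_score` values, and the `smooth_weight` penalty are hypothetical and not the paper's exact formulation.

```python
def select_box_sequence(candidates, smooth_weight=1.0):
    """Pick one candidate per frame by dynamic programming.

    candidates[t] is a list of (cx, cy, motion_score) tuples for frame t.
    The chosen path maximizes the summed motion_score minus smooth_weight
    times the center displacement between consecutive frames.
    """
    n_frames = len(candidates)
    best = [[c[2] for c in candidates[0]]]   # best[t][j]: best path score ending at j
    back = [[None] * len(candidates[0])]     # back[t][j]: index of the predecessor
    for t in range(1, n_frames):
        row_best, row_back = [], []
        for cx, cy, score in candidates[t]:
            options = []
            for i, (px, py, _) in enumerate(candidates[t - 1]):
                jump = ((cx - px) ** 2 + (cy - py) ** 2) ** 0.5
                options.append((best[t - 1][i] - smooth_weight * jump, i))
            prev_score, prev_idx = max(options)
            row_best.append(prev_score + score)
            row_back.append(prev_idx)
        best.append(row_best)
        back.append(row_back)
    # Backtrack from the best-scoring candidate in the last frame.
    j = max(range(len(best[-1])), key=lambda k: best[-1][k])
    path = [j]
    for t in range(n_frames - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    return path[::-1]

# Hypothetical candidates: (center_x, center_y, motion_score) per frame.
frames = [
    [(10, 10, 0.9), (50, 50, 0.8)],
    [(12, 11, 0.7), (52, 50, 0.9)],
    [(14, 12, 0.8), (90, 90, 1.0)],
]
print(select_box_sequence(frames, smooth_weight=0.1))
print(select_box_sequence(frames, smooth_weight=0.0))
```

With the smoothness penalty switched on, the path stays on the slowly moving object instead of jumping to whichever box scores highest in each frame, which is why a sequence-level choice is preferable to a per-frame one.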
The paper also highlights an important idea for object tracking: feeding the tracker a dynamic, continuously updated template to keep long-term tracking reliable.
In summary, there are two key points: unsupervised tracking and online update.
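As a rough illustration of what a continuously updated template means, the sketch below blends the template toward the newest target feature with a running average. This is a generic update rule swapped in for illustration, not the paper's memory scheme; `rate`, `min_conf`, and the confidence gate are assumptions.

```python
def update_template(template, new_feature, confidence, rate=0.1, min_conf=0.5):
    """Blend the template toward new_feature when the tracker is confident."""
    if confidence < min_conf:
        return list(template)  # skip the update on unreliable frames
    return [(1 - rate) * t + rate * f for t, f in zip(template, new_feature)]

template = [1.0, 0.0]
# A confident frame pulls the template halfway toward the new appearance.
print(update_template(template, [0.0, 1.0], confidence=0.9, rate=0.5))
# An unreliable frame (e.g. occlusion) leaves the template unchanged.
print(update_template(template, [0.0, 1.0], confidence=0.2, rate=0.5))
```

Gating on confidence matters because updating from a bad frame contaminates the template and makes drift worse over a long sequence.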
Saliency-Associated Object Tracking
Most current trackers track and recognize the target as a whole, which makes it hard to follow targets whose shape changes a lot. Another approach divides the target into equally sized small patches and tracks all of them locally in parallel.
In practice, however, many of these local patches are useless and hurt the result.
This paper therefore restricts local tracking to salient regions: a local saliency mining module captures the locally salient parts, and a saliency association modeling module then associates the captured saliencies with each other for tracking and state estimation.
The method tracks objects with large shape changes better.
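The idea of keeping only the salient patches, rather than every patch, can be sketched as a greedy peak selection on a response map. The response values, `k`, and the `min_distance` suppression radius are hypothetical; the paper's mining module is learned and is not reproduced here.

```python
def mine_salient_locations(response, k=3, min_distance=1):
    """Return up to k (row, col) peaks, picked greedily with suppression around each pick."""
    cells = sorted(
        ((response[r][c], r, c)
         for r in range(len(response))
         for c in range(len(response[0]))),
        reverse=True,
    )
    picked = []
    for _, r, c in cells:
        # Keep a cell only if it is far enough (Chebyshev distance) from earlier picks.
        if all(max(abs(r - pr), abs(c - pc)) > min_distance for pr, pc in picked):
            picked.append((r, c))
        if len(picked) == k:
            break
    return picked

# Hypothetical 4x4 response map with two distinct salient regions.
response = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.7],
    [0.0, 0.1, 0.6, 0.8],
]
print(mine_salient_locations(response, k=2))
```

Only the selected locations would be passed on to the association step, so weak or redundant patches never get a chance to pull the state estimate off target.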
HiFT: Hierarchical Feature Transformer for Aerial Tracking
This paper improves on Siamese-style trackers and proposes an efficient and effective Hierarchical Feature Transformer (HiFT) for aerial tracking. Hierarchical similarity maps generated from multi-level convolutional layers are fed into the feature transformer, which interactively fuses spatial cues (shallow layers) with semantic cues (deep layers).
Thanks to the transformer, global context is captured better, which helps locate the target; the multi-level feature learning yields a tracking-oriented feature space with strong discriminability.
Note that the target scenario is mainly low-resolution objects captured from the air by drones.
Learn to Match: Automatic Matching Network Design for Visual Tracking
Earlier Siamese-style trackers work well, but they compare the template and the search region using cross-correlation alone. This has two drawbacks: 1. Heuristic matching-network design relies heavily on expert experience. 2. A single matching operator can hardly guarantee stable tracking in every challenging environment.
This paper introduces six other matching operators to explore the feasibility of matching-operator selection, which is a really good idea!
The operators can be combined to exploit their complementary features and obtain better matching results.
In addition, binary channel manipulation (BCM) is used to search for the optimal combination of these operators automatically, yielding a general tracking model that copes with a wide variety of complex tracking scenes.
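A toy sketch of why combining operators can beat cross-correlation alone is given below. The three operators here are simple stand-ins chosen for illustration, not the six from the paper, and the brute-force search over on/off gates is only a placeholder for BCM, whose actual search procedure these notes do not detail.

```python
from itertools import product

def correlation(z, x):
    """Dot-product similarity, as in plain cross-correlation at one location."""
    return sum(a * b for a, b in zip(z, x))

def negative_l1(z, x):
    return -sum(abs(a - b) for a, b in zip(z, x))

def negative_l2(z, x):
    return -(sum((a - b) ** 2 for a, b in zip(z, x)) ** 0.5)

# Hypothetical operator pool; the paper's operators are different.
OPERATORS = [correlation, negative_l1, negative_l2]

def combined_response(template, search, gates):
    """Sum the responses of the operators whose binary gate is 1."""
    return [sum(g * op(template, x) for g, op in zip(gates, OPERATORS))
            for x in search]

def search_gates(template, search, target_index):
    """Brute-force the gate vector that best separates the target from distractors."""
    best_gates, best_margin = None, float("-inf")
    for gates in product([0, 1], repeat=len(OPERATORS)):
        if not any(gates):
            continue  # at least one operator must be active
        resp = combined_response(template, search, gates)
        distractors = [r for i, r in enumerate(resp) if i != target_index]
        margin = resp[target_index] - max(distractors)
        if margin > best_margin:
            best_gates, best_margin = gates, margin
    return best_gates, best_margin

template = [1.0, 1.0]
# Location 0 is the true target; location 1 is a high-magnitude distractor.
search = [[1.0, 1.0], [3.0, 3.0], [0.0, 0.0]]
print(combined_response(template, search, (1, 0, 0)))  # correlation alone prefers the distractor
print(search_gates(template, search, target_index=0))
```

In this toy case correlation alone scores the distractor higher than the target, while the selected combination of the other two operators ranks the target first, which mirrors the argument that no single operator is reliable in every situation.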
Learning to Adversarially Blur Visual Object Tracking
Earlier object tracking work rarely discussed how robust trackers are on blurred images, so this paper proposes the Adversarial Blur Attack (ABA).
The authors study how motion blur is formed and design a blur-generation method for visual tracking.
That said, do we really need this research scenario? Perhaps the method could later be used to generate a matching dataset for training, to improve model robustness.
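For intuition about how motion blur arises, a common simplified model averages several copies of the image translated along the motion direction, as if the object moved during the exposure. The sketch below implements only that generic model on a tiny grayscale grid; it is not the ABA attack itself, and `dx`, `dy`, and `steps` are illustrative parameters.

```python
def shift(image, dx, dy):
    """Translate a 2-D image by integer (dx, dy), filling vacated pixels with 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            sr, sc = r - dy, c - dx
            if 0 <= sr < h and 0 <= sc < w:
                out[r][c] = image[sr][sc]
    return out

def motion_blur(image, dx, dy, steps):
    """Average `steps` copies of the image translated along (dx, dy)."""
    h, w = len(image), len(image[0])
    acc = [[0.0] * w for _ in range(h)]
    for s in range(steps):
        moved = shift(image, dx * s, dy * s)
        for r in range(h):
            for c in range(w):
                acc[r][c] += moved[r][c] / steps
    return acc

# A single bright pixel smeared one step to the right.
image = [[0.0, 0.0, 1.0, 0.0, 0.0]]
print(motion_blur(image, dx=1, dy=0, steps=2))
```

Generating blurred frames this way is also what the dataset-augmentation idea above would need: the same routine could produce blurred training samples with controllable direction and strength.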
Going forward, I will share detailed notes on more than 40 top-conference papers from the past three years in the column Object Tracking (SOT) | Top Conference Papers | Study Notes, so that everyone can get started quickly.
If you're interested, please like, bookmark, and follow, then head to the column to keep reading ~ Your support is my biggest motivation ~