CVPR 2023 Object Detection Paper Collection

The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) is one of the top conferences in computer science, bringing together image processing, machine learning, artificial intelligence, and related fields.

Every year, CVPR attracts a large number of paper submissions and academic exchange activities, covering research directions that include image processing, computer vision, pattern recognition, machine learning, deep learning, and artificial intelligence. It is one of the most influential and representative academic conferences in the field.

AMiner uses AI technology to classify and organize the papers accepted at CVPR 2023. Today we share 49 papers on object detection, and present the ten most popular ones here. Feel free to download and save them!

1. Detecting Everything in the Open World: Towards Universal Object Detection paper details page
Authors: Zhenyu Wang, Yali Li, Xi Chen, Ser-Nam Lim, Antonio Torralba, Hengshuang Zhao, Shengjin Wang
Link: https://www.aminer.cn/pub/641a71fb90e50fcafd7200d8/
AI Review (Large Model Driven): This paper formally addresses universal object detection, which aims to detect every scene and predict every category. The reliance on human annotations, limited visual information, and novel categories in the open world severely restrict the universality of traditional detectors.

2. Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images paper details page
Authors: Bowei Du, Yecheng Huang, Jiaxin Chen, Di Huang
Link: https://www.aminer.cn/pub/64225b7590e50fcafde11f47/
AI Review (Large Model Driven): This paper investigates optimizing the detection head with sparse convolution for efficient, accurate object detection on UAV imagery. Nevertheless, sparse convolution suffers from inadequate integration of the contextual information of tiny objects and from coarse control of the mask ratio when the foreground varies in scale. To address these issues, the paper proposes a novel global context-enhanced adaptive sparse convolutional network (CEASC). It first develops a context-enhanced group normalization (CE-GN) layer by replacing the statistics based on sparsely sampled features with those of global contextual features, and then designs an adaptive multi-layer masking strategy that produces optimal mask ratios at different scales for compact foreground coverage. Experimental results confirm that the network significantly reduces GFLOPs and speeds up inference.

3. Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding paper details page
Authors: Lingchen Meng, Xiyang Dai, Yinpeng Chen, Pengchuan Zhang, Dongdong Chen, Mengchen Liu, Jianfeng Wang, Zuxuan Wu, Lu Yuan, Yu-Gang Jiang
Link: https://www.aminer.cn/pub/62a0137a5aee126c0ff6a08a/
AI Review (Large Model Driven): Training on large-scale data can significantly improve performance on many computer vision tasks. However, when a single detector is trained on multiple datasets together, joint training faces two major obstacles: taxonomy differences and inconsistent bounding-box annotations. This paper shows that both challenges can be effectively addressed by adapting object queries on the language embedding of the categories in each dataset. The authors design a detection hub that dynamically adapts queries on category embeddings according to the distribution of each dataset. Unlike previous methods, this adaptation approach uses the language embedding as the semantic center for common categories, while learning a semantic bias toward dataset-specific categories to handle annotation differences and bridge domain gaps. These improvements make it possible to train a single detector on multiple datasets simultaneously and take full advantage of all of them.

4. What Can Human Sketches Do for Object Detection? paper details page
Authors: Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song
Link: https://www.aminer.cn/pub/64225b7d90e50fcafde14e97/
AI Review (Large Model Driven): The expressiveness of human sketches has so far been explored mainly for image retrieval. This paper applies it to object detection for the first time. The result is a sketch-enabled object detection framework. The framework works without (i) knowing which category to expect at test time and (ii) requiring additional bounding boxes or class labels. Instead of building a model from scratch, the authors show that an intuitive combination of a foundation model (CLIP) and existing sketch-based image retrieval (SBIR) models can already solve the task, with CLIP providing model generalization. Evaluated on standard object detection datasets, the framework outperforms both supervised and weakly supervised object detectors.

5. Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation paper details page
Authors: Feng Li, Hao Zhang, Huaizhe Xu, Shilong Liu, Lei Zhang, Lionel M. Ni, Heung-Yeung Shum
Link: https://www.aminer.cn/pub/629ec2145aee126c0fb7a260/
AI Review (Large Model Driven): This paper proposes Mask DINO, a unified Transformer-based framework for object detection and segmentation. The framework outperforms all existing specialized segmentation methods with both a ResNet-50 backbone and a SwinL backbone. The experiments show that a unified framework can significantly advance the existing state of the art in segmentation.

6. Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection paper details page
Authors: Xinjiang Wang, Xingyi Yang, Shilong Zhang, Yijiang Li, Litong Feng, Shijie Fang, Chengqi Lyu, Kai Chen, Wayne Zhang
Link: https://www.aminer.cn/pub/63180be590e50fcafded435a/
AI Review (Large Model Driven): This paper delves into the challenges faced by semi-supervised object detection (SSOD). The authors observe two problems: 1) assignment inconsistency, in that the conventional assignment policy of current detectors is sensitive to label noise; and 2) task inconsistency, in that classification and regression predictions are made at the same feature point yet can disagree. These issues lead to inconsistent optimization objectives for the student network, which degrades performance and slows model convergence. The authors propose a solution called Consistent Teacher, which achieves 41.0 mAP across a wide range of SSOD evaluations.

7. Phase-Shifting Coder: Predicting Accurate Orientation in Oriented Object Detection paper details page
Authors: Yi Yu, Feipeng Da
Link: https://www.aminer.cn/pub/6371b1a790e50fcafdb2e958/
AI Review (Large Model Driven): This paper proposes a novel differentiable angle coder named the phase-shifting coder (PSC), together with a dual-frequency version (PSCD). By mapping the rotational periodicity of different cycles to the phase of different frequencies, it provides a unified framework for the periodic ambiguity problems caused by rotational symmetry in oriented object detection. Within this framework, common problems of oriented object detection, such as boundary discontinuity and the square-like problem, are solved in a unified form. Analysis and experiments on three datasets demonstrate the effectiveness and potential of the method.
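
As a rough illustration of the phase-shifting idea named in the title of paper 7, the sketch below encodes an angle as a few phase-shifted cosine samples and recovers it with an arctangent. This is a minimal Python sketch under our own assumptions, not the authors' implementation: the function names psc_encode and psc_decode, the default of three steps, and the use of plain Python floats are all hypothetical choices made for illustration.

```python
import math


def psc_encode(angle, num_steps=3):
    """Encode an angle (radians) as num_steps phase-shifted cosine samples.

    Hypothetical helper for illustration; at least 3 steps are needed so
    that the phase can be recovered unambiguously.
    """
    if num_steps < 3:
        raise ValueError("num_steps must be at least 3")
    return [
        math.cos(angle + 2.0 * math.pi * n / num_steps)
        for n in range(1, num_steps + 1)
    ]


def psc_decode(code):
    """Recover the angle in (-pi, pi] from its phase-shifted samples."""
    num_steps = len(code)
    # Project the samples onto sine and cosine at the known phase offsets.
    sin_sum = sum(
        x * math.sin(2.0 * math.pi * n / num_steps)
        for n, x in enumerate(code, start=1)
    )
    cos_sum = sum(
        x * math.cos(2.0 * math.pi * n / num_steps)
        for n, x in enumerate(code, start=1)
    )
    # sin_sum equals -(N/2)*sin(angle) and cos_sum equals (N/2)*cos(angle).
    return math.atan2(-sin_sum, cos_sum)


if __name__ == "__main__":
    theta = 0.7
    code = psc_encode(theta)
    print("code:", code)
    print("decoded:", psc_decode(code))
```

Because each sample varies continuously as the angle wraps around, angles just below π and just above −π receive nearly identical codes, which is the property that makes this kind of encoding attractive when a regression target would otherwise jump at the boundary.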

8. CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection paper details page
Authors: Shuailei Ma, Yuefeng Wang, Jiaqi Fan, Ying Wei, Thomas H. Li, Hongli Liu, Fanbing Lv
Link: https://www.aminer.cn/pub/640015f390e50fcafdcf9dd5/
AI Review (Large Model Driven): This paper proposes CAT, a localization and identification cascade detection Transformer for open-world object detection (OWOD), which decouples the detection process through cascade decoding. It also proposes a self-adaptive pseudo-labeling mechanism that combines model-driven and input-driven pseudo-labeling and adaptively generates robust pseudo-labels for unknown objects. Extensive experiments on two benchmark datasets show that the model outperforms the state of the art on all metrics used in OWOD tasks.

9. MixTeacher: Mining Promising Labels with Mixed Scale Teacher for Semi-Supervised Object Detection paper details page
Authors: Liang Liu, Boshen Zhang, Jiangning Zhang, Wuhao Zhang, Zhenye Gan, Guanzhong Tian, Wenbing Zhu, Yabiao Wang, Chengjie Wang
Link: https://www.aminer.cn/pub/6413dabe90e50fcafd3cd802/
AI Review (Large Model Driven): This paper proposes a new framework for semi-supervised object detection. The framework introduces a mixed-scale teacher to improve pseudo-label generation and scale-invariant learning. Furthermore, it mines pseudo-labels using the score promotion of predictions across scales, which benefits from the better predictions obtained with mixed-scale features. Extensive experiments show that the method achieves state-of-the-art performance.

10. DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment paper details page
Authors: Lewei Yao, Jianhua Han, Xiaodan Liang, Dan Xu, Wei Zhang, Zhenguo Li, Hang Xu
Link: https://www.aminer.cn/pub/6434cfd690e50fcafd7a476f/
AI Review (Large Model Driven): This paper introduces DetCLIPv2, an efficient and scalable training framework that incorporates large-scale image-text pairs to achieve open-vocabulary object detection (OVD). Unlike previous OVD frameworks, DetCLIPv2 directly learns fine-grained word-region alignment from a large number of image-text pairs, which improves the localization ability of the model. The model makes effective use of image-text pair data through joint training and low-resolution input, so that DetCLIPv2 uses more image-text pairs than DetCLIP in a similar training time and improves performance.

——————————————————————————————————————

To view all object detection papers, click here:

https://www.aminer.cn/conf/5eba43d8edb6e7d53c0fb8a1/CVPR2023

Origin blog.csdn.net/AI_Conf/article/details/130829259