ECCV 2022 latest paper summary (object detection, image segmentation, supervised learning, GAN, etc.)

Many thanks to the Jishi platform for providing the paper resources.

The ECCV 2022 acceptance results have been released: a total of 1629 papers were accepted, and the acceptance rate is less than 20%. To help readers pick up cutting-edge computer vision techniques faster, the author is tracking the latest ECCV 2022 papers and collecting paper and code links by research direction.

This update of the ECCV 2022 paper list covers detection, segmentation, image processing, video understanding, neural network structure design, unsupervised learning, self-supervised learning, transfer learning, and other directions. Project address: https://github.com/extreme-assistant/ECCV2022-Paper-Code-Interpretation

  • Detection
  • Segmentation
  • Image Processing
  • Video Processing
  • Image and video retrieval and understanding
  • Estimation
  • Object Tracking
  • Text Detection and Recognition
  • GAN/Generative/Adversarial
  • Neural Network Structure Design
  • Data Processing
  • Model Training/Generalization
  • Model Compression
  • Model Evaluation
  • Semi-supervised learning/Unsupervised learning/Self-supervised learning
  • Multimodal Learning/Cross-Modal
  • Few-shot Learning
  • Transfer Learning/Domain Adaptation
  • Reinforcement Learning

Detection

2D object detection

[1] Point-to-Box Network for Accurate Object Detection via Single Point Supervision

paper: https://arxiv.org/abs/2207.06827
code: https://github.com/ucas-vg/p2bnet

[2] You Should Look at All Objects

paper: https://arxiv.org/abs/2207.07889
code: https://github.com/charlespikachu/yslao

[3] Adversarially-Aware Robust Object Detector

paper: https://arxiv.org/abs/2207.06202
code: https://github.com/7eu7d7/robustdet

3D object detection

[1] Rethinking IoU-based Optimization for Single-stage 3D Object Detection

paper: https://arxiv.org/abs/2207.09332

Human-Object Interaction Detection

[1] Towards Hard-Positive Query Mining for DETR-based Human-Object Interaction Detection

paper: https://arxiv.org/abs/2207.05293
code: https://github.com/muchhair/hqm

Image Anomaly Detection

[1] DICE: Leveraging Sparsification for Out-of-Distribution Detection

paper: https://arxiv.org/abs/2111.09805
code: https://github.com/deeplearning-wisc/dice

Segmentation

instance segmentation

[1] Box-supervised Instance Segmentation with Level Set Evolution

paper: https://arxiv.org/abs/2207.09055

[2] OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers

paper: https://arxiv.org/abs/2207.02255
code: https://github.com/pjlallen/osformer

semantic segmentation

[1] 2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds

paper: https://arxiv.org/abs/2207.04397
code: https://github.com/yanx27/2dpass

Video Object Segmentation

[1] Learning Quality-aware Dynamic Memory for Video Object Segmentation

paper: https://arxiv.org/abs/2207.07922
code: https://github.com/workforai/qdmn

Image Processing

super resolution

[1] Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks

paper: https://arxiv.org/abs/2203.03844
code: https://github.com/zysxmu/ddtb

Image Denoising

[1] Deep Semantic Statistics Matching (D2SM) Denoising Network

paper: https://arxiv.org/abs/2207.09302

Image Restoration/Image Enhancement/Image Reconstruction

[1] Semantic-Sparse Colorization Network for Deep Exemplar-based Colorization

paper: https://arxiv.org/abs/2112.01335

[2] Geometry-aware Single-image Full-body Human Relighting

paper: https://arxiv.org/abs/2207.04750

[3] Multi-Modal Masked Pre-Training for Monocular Panoramic Depth Completion

paper: https://arxiv.org/abs/2203.09855

[4] PanoFormer: Panorama Transformer for Indoor 360 Depth Estimation

paper: https://arxiv.org/abs/2203.09283

[5] SESS: Saliency Enhancing with Scaling and Sliding

paper: https://arxiv.org/abs/2207.01769

[6] RigNet: Repetitive Image Guided Network for Depth Completion

paper: https://arxiv.org/abs/2107.13802

Image Outpainting

[1] Outpainting by Queries

paper: https://arxiv.org/abs/2207.05312
code: https://github.com/kaiseem/queryotr

Style Transfer

[1] CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer

paper: https://arxiv.org/abs/2207.04808
code: https://github.com/JarrentWu1031/CCPL

Video Processing

[1] Improving the Perceptual Quality of 2D Animation Interpolation

paper: https://arxiv.org/abs/2111.12792
code: https://github.com/shuhongchen/eisai-anime-interpolator

[2] Real-Time Intermediate Flow Estimation for Video Frame Interpolation

paper: https://arxiv.org/abs/2011.06294
code: https://github.com/MegEngine/arXiv2020-RIFE

Image and video retrieval and understanding

Action recognition

[1] ReAct: Temporal Action Detection with Relational Queries

paper: https://arxiv.org/abs/2207.07097
code: https://github.com/sssste/react

[2] Hunting Group Clues with Transformers for Social Group Activity Recognition

paper: https://arxiv.org/abs/2207.05254

video understanding

[1] GraphVid: It Only Takes a Few Nodes to Understand a Video

paper: https://arxiv.org/abs/2207.01375

[2] Deep Hash Distillation for Image Retrieval

paper: https://arxiv.org/abs/2112.08816
code: https://github.com/youngkyunjang/deep-hash-distillation

Video Retrieval

[1] TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval

paper: https://arxiv.org/abs/2207.07852
code: https://github.com/yuqi657/ts2_net

[2] Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval

paper: https://arxiv.org/abs/2112.01832

Estimation

Pose Estimation

[1] Category-Level 6D Object Pose and Size Estimation using Self-Supervised Deep Prior Deformation Networks

paper: https://arxiv.org/abs/2207.05444
code: https://github.com/jiehonglin/self-dpdn

depth estimation

[1] Physical Attack on Monocular Depth Estimation with Optimal Adversarial Patches

paper: https://arxiv.org/abs/2207.04718

Object Tracking

[1] Towards Grand Unification of Object Tracking

paper: https://arxiv.org/abs/2207.07078
code: https://github.com/masterbin-iiau/unicorn

Text Detection and Recognition

[1] Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting

paper: https://arxiv.org/abs/2207.06694
code: https://github.com/hikopensource/davar-lab-ocr

GAN/Generative/Adversarial

[1] Eliminating Gradient Conflict in Reference-based Line-Art Colorization

paper: https://arxiv.org/abs/2207.06095
code: https://github.com/kunkun0w0/sga

[2] WaveGAN: Frequency-aware GAN for High-Fidelity Few-shot Image Generation

paper: https://arxiv.org/abs/2207.07288
code: https://github.com/kobeshegu/eccv2022_wavegan

[3] FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs

paper: https://arxiv.org/abs/2207.08630
code: https://github.com/iceli1007/fakeclr

[4] UniCR: Universally Approximate Certified Robustness via Randomized Smoothing

paper: https://arxiv.org/abs/2207.02152

Neural Network Structure Design

Neural Network Architecture Search (NAS)

[1] ScaleNet: Searching for the Model to Scale

paper: https://arxiv.org/abs/2207.07267
code: https://github.com/luminolx/scalenet

[2] Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning

paper: https://arxiv.org/abs/2203.02651
code: https://github.com/sseung0703/ekg

[3] EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs

paper: https://arxiv.org/abs/2111.15097
code: https://github.com/marsggbo/EAGAN

Data Processing

Normalization

[1] Fine-grained Data Distribution Alignment for Post-Training Quantization

paper: https://arxiv.org/abs/2109.04186
code: https://github.com/zysxmu/fdda

Model Training/Generalization

Noisy labels

[1] Learning with Noisy Labels by Efficient Transition Matrix Estimation to Combat Label Miscorrection

paper: https://arxiv.org/abs/2111.14932

Model Compression

knowledge distillation

[1] Knowledge Condensation Distillation

paper: https://arxiv.org/abs/2207.05409
code: https://github.com/dzy3/kcd

Model Evaluation

[1] Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting

paper: https://arxiv.org/abs/2207.04624
code: https://github.com/d1024choi/hlstrajforecast

Semi-supervised learning/Unsupervised learning/Self-supervised learning

[1] FedX: Unsupervised Federated Learning with Cross Knowledge Distillation

paper: https://arxiv.org/abs/2207.09158

[2] Synergistic Self-supervised and Quantization Learning

paper: https://arxiv.org/abs/2207.05432
code: https://github.com/megvii-research/ssql-eccv2022

[3] Contrastive Deep Supervision

paper: https://arxiv.org/abs/2207.05306
code: https://github.com/archiplab-linfengzhang/contrastive-deep-supervision

[4] Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection

paper: https://arxiv.org/abs/2207.02541

[1] Image Coding for Machines with Omnipotent Feature Learning

paper: https://arxiv.org/abs/2207.01932

Multimodal Learning/Cross-Modal

Vision-language

[1] Contrastive Vision-Language Pre-training with Limited Resources

paper: https://arxiv.org/abs/2112.09331
code: https://github.com/zerovl/zerovl

cross-modal

[1] Cross-modal Prototype Driven Network for Radiology Report Generation

paper: https://arxiv.org/abs/2207.04818v1
code: https://github.com/markin-wang/xpronet

Few-shot Learning

[1] Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning

paper: https://arxiv.org/abs/2112.03494

Transfer Learning/Domain Adaptation

[1] Factorizing Knowledge in Neural Networks

paper: https://arxiv.org/abs/2207.03337
code: https://github.com/adamdad/knowledgefactor

[2] CycDA: Unsupervised Cycle Domain Adaptation from Image to Video

paper: https://arxiv.org/abs/2203.16244

Reinforcement Learning

[1] Target-absent Human Attention

paper: https://arxiv.org/abs/2207.01166
code: https://github.com/neouyghur/sess

Origin blog.csdn.net/qq_45368632/article/details/125926564