List of CVPR2020 papers (Chinese-English bilingual)

Conditional Channel Gated Networks for Task-Aware Continual Learning
Multimodal Categorization of Crisis Events in Social Media
Counterfactual Vision and Language Learning
Gold Seeker Information Gain From Policy Distributions for Goal-Oriented Vision-and-Language
Image2StyleGAN How to Edit the Embedded Images
Cross-Modal Deep Face Normals With Deactivable Skip Connections
Correction Filter for Single Image Super-Resolution Robustifying Off-the-Shelf Deep Super-Resolvers
Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit
Deep White-Balance Editing
Towards Causal VQA Revealing and Reducing Spurious Correlations by Invariant
Scale-Space Flow for End-to-End Optimized Video Compression
Camera On-Boarding for Person Re-Identification Using Hypothesis Transfer Learning
Density-Based Clustering for 3D Object Detection in Point Clouds
Non-Adversarial Video Synthesis With Learned Priors
Fast Soft Color Segmentation
From Two Rolling Shutters to One Global Shutter
Active Speakers in Context
From Paris to Berlin Discovering Fashion Style Influences Around the
Disentangled Image Generation Through Structured Noise Injection
A Stochastic Conditioning Scheme for Diverse Human Motion Prediction
High-Resolution Daytime Translation Without Domain Labels
A Characteristic Function Approach to Deep Implicit Generative Modeling
Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
Single-Stage Semantic Segmentation From Image Labels
UniPose Unified Human Pose Estimation in Single Images and Videos
SAL Sign Agnostic Learning of Shapes From Raw Data
Single-Step Adversarial Training With Dropout Scheduling
TESA Tensor Element Self-Attention via Matricization
Bi3D Stereo Depth Estimation via Binary Classifications
Meshlet Priors for 3D Mesh Reconstruction
Weakly-Supervised Domain Adaptation via GAN and Mesh Model for Estimating
Explorable Super Resolution
Exploring Unlabeled Faces for Novel Attribute Discovery
Adaptive Dilated Network With Self-Correction Supervision for Counting
D3Feat Joint Learning of Dense Detection and Description of 3D Local Features
Deep Facial Non-Rigid Multi-View Stereo
Learning to Forget for Meta-Learning
Event Probability Mask EPM and Event Denoising Convolutional Neural Network
An Adaptive Neural Network for Unsupervised Mosaic Consistency Analysis in
Novel Object Viewpoint Estimation Through Reconstruction Alignment
4D Visualization of Dynamic Events From Unconstrained Multi-View Videos
SAM The Sensitivity of Attribution Methods to Hyperparameters
Height and Uprightness Invariance for 3D Prediction From a Single
MAGSAC a Fast Reliable and Accurate Robust Estimator
ScopeFlow Dynamic Scene Scoping for Optical Flow
Improved Few-Shot Visual Classification
Shape Reconstruction by Learning Differentiable Surface Representations
Context R-CNN Long Term Temporal Context for Per-Camera Object Detection
SpeedNet Learning the Speediness in Videos
Can Weight Sharing Outperform Random Architecture Search An Investigation With
PandaNet Anchor-Based Single-Shot Multi-Person 3D Pose Estimation
Uninformed Students Student-Teacher Anomaly Detection With Discriminative Latent Embeddings
AOWS Adaptive and Optimal Network Width Search With Latency Constraints
MINA Convex Mixed-Integer Programming for Non-Rigid Shape Alignment
Classifying Segmenting and Tracking Object Instances in Video with Mask
Making Better Mistakes Leveraging Class Hierarchies With Deep Networks
DUNIT Detection-Based Unsupervised Image-to-Image Translation
Regression Prior Normalized Flow
A Sparse Resultant Based Method for Efficient Minimal Solvers
Reinforced Feature Points Optimizing Feature Detection and Description for a
Sketch Less for More On-the-Fly Fine-Grained Sketch-Based Image Retrieval
Deep 3D Capture Geometry and Reflectance From Sparse Multi-View Images
ENSEI Efficient Secure Inference via Frequency-Domain Homomorphic Convolution for Privacy-Preserving
Seeing Through Fog Without Seeing Fog Deep Multimodal Sensor Fusion
Synchronizing Probability Measures on Rotations via Optimal Transport
Defending Against Universal Attacks Through Selective Feature Regeneration
Two-Shot Spatially-Varying BRDF and Shape Estimation
DeepDeform Learning Non-Rigid RGB-D Reconstruction With Semi-Supervised Data
Learning a Neural Solver for Multiple Object Tracking
Rethinking Zero-Shot Video Classification End-to-End Training for Realistic Applications
Solving Jigsaw Puzzles With Eroded Boundaries
3FabRec Fast Few-Shot Face Alignment by Reconstruction
Neural Head Reenactment with Latent Pose Descriptors
nuScenes A Multimodal Dataset for Autonomous Driving
Generalizing Hand Segmentation in Egocentric Videos With Uncertainty-Guided Model Adaptation
Learning a Unified Sample Weighting Network for Object Detection
Reconstruct Locally Localize Globally A Model Free Method for Object
Rethinking Differentiable Search for Mixed-Precision Neural Networks
ZeroQ A Novel Zero Shot Quantization Framework
Appearance Shock Grammar for Fast Medial Axis Extraction From Real
Sign Language Transformers Joint End-to-End Sign Language Recognition and Translation
D2Det Towards High Quality Object Detection and Instance Segmentation
Domain Balancing Face Recognition on Long-Tailed Domains
Few-Shot Video Classification via Temporal Alignment
Prime Sample Attention in Object Detection
Stereoscopic Flash and No-Flash Photography for Shape and Albedo Recovery
Scalable Uncertainty for Computer Vision With Functional Variational Inference
Modeling the Background for Incremental Learning in Semantic Segmentation
What It Thinks Is Important Is Important Robustness Transfers Through
Data Uncertainty Learning in Face Recognition
Synthetic Learning Learn From Distributed Asynchronized Discriminator GAN Without Sharing
Weakly-Supervised Semantic Segmentation via Sub-Category Exploration
Neural Topological SLAM for Visual Navigation
JA-POLS A Moving-Camera Background Model via Joint Alignment and Partially-Overlapping
3D Sketch-Aware Semantic Scene Completion via Semi-Supervised Structure Prior
A Hierarchical Graph Network for 3D Object Detection on Point
A Multi-Task Mean Teacher for Semi-Supervised Shadow Detection
A Neural Rendering Framework for Free-Viewpoint Relighting
Action Segmentation With Joint Self-Supervised Temporal Domain Adaptation
Adaptive Fractional Dilated Convolution Network for Image Aesthetics Assessment
AdderNet Do We Really Need Multiplications in Deep Learning
Adversarial Robustness From Self-Supervised Pre-Training to Fine-Tuning
Auto-Tuning Structured Light by Optical Stochastic Gradient Descent
BANet Bidirectional Aggregation Network With Occlusion Handling for Panoptic Segmentation
Better Captioning With Sequence-Level Exploration
BlendMask Top-Down Meets Bottom-Up for Instance Segmentation
BSP-Net Generating Compact Meshes via Binary Space Partitioning
Camera Trace Erasing
Cops-Ref A New Dataset and Task on Compositional Referring Expression
Counterfactual Samples Synthesizing for Robust Visual Question Answering
Cross-View Tracking for Multi-Human 3D Pose Estimation at Over 100
Data-Efficient Semi-Supervised Learning by Reliable Edge Mining
Domain Adaptive Image-to-Image Translation
DSGN Deep Stereo Geometry Network for 3D Object Detection
Dynamic Convolution Attention Over Convolution Kernels
End-to-End Learnable Geometric Vision by Backpropagating PnP Optimization
Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning
Frequency Domain Compact 3D Convolutional Neural Networks
G2L-Net Global to Local Network for Real-Time 6D Pose Estimation
Harmonizing Transferability and Discriminability for Adapting Object Detectors
Image Search With Text Feedback by Visiolinguistic Attention Learning
IMRAM Iterative Matching With Recurrent Attention Memory for Cross-Modal Image-Text
Intelligent Home 3D Automatic 3D-House Design From Linguistic Descriptions Only
Label Distribution Learning on Auxiliary Label Space Graphs for Facial
Learning a Weakly-Supervised Video Actor-Action Segmentation Model With a Wise
Learning Canonical Shape Space for Category-Level 6D Object Pose and
Memory Enhanced Global-Local Aggregation for Video Object Detection
MnasFPN Learning Latency-Aware Pyramid Architecture for Object Detection on Mobile
MonoPair Monocular 3D Object Detection Using Pairwise Spatial Relationships
Network Adjustment Channel Search Guided by FLOPs Utilization Ratio
Norm-Aware Embedding for Efficient Person Search
OASIS A Large-Scale Dataset for Single Image 3D in the
One-Shot Adversarial Attacks on Visual Tracking With Dual Attention
PuppeteerGAN Arbitrary Portrait Animation With Semantic-Aware Appearance Transformation
Reusing Discriminators for Encoding Towards Unsupervised Image-to-Image Translation
Salience-Guided Cascaded Suppression Network for Person Re-Identification
Say As You Wish Fine-Grained Control of Image Caption Generation
Selective Transfer With Reinforced Transfer Network for Partial Domain Adaptation
Siamese Box Adaptive Network for Visual Tracking
SLV Spatial Likelihood Voting for Weakly Supervised Object Detection
State-Aware Tracker for Real-Time Video Object Segmentation
Stochastic Sparse Subspace Clustering
Unsupervised Learning of Intrinsic Structural Representation Points
CascadePSP Toward Class-Agnostic and Very High-Resolution Segmentation via Global and
Deep Stereo Using Adaptive Thin Volume Representation With Uncertainty Awareness
Explaining Knowledge Distillation by Quantifying the Knowledge
HigherHRNet Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation
Inter-Task Association Critic for Cross-Resolution Person Re-Identification
Learned Image Compression With Discretized Gaussian Mixture Likelihoods and Attention
Panoptic-DeepLab A Simple Strong and Fast Baseline for Bottom-Up Panoptic
RiFeGAN Rich Feature Generation for Text-to-Image Synthesis From Prior Knowledge
Skeleton-Based Action Recognition With Shift Graph Convolutional Network
Time Flies Animating a Still Image With Time-Lapse Video As
Non-Local Neural Networks With Grouped Bilinear Attentional Transforms
Implicit Functions in Feature Space for 3D Shape Reconstruction and
Towards Efficient Model Compression via Learned Global Ranking
Agriculture-Vision A Large Aerial Image Database for Agricultural Pattern Analysis
Assessing Image Quality Issues for Real-World Problems
When to Use Convolutional Neural Networks for Inverse Problems
Evaluating Weakly Supervised Object Localization Methods Right
Cars Can't Fly Up in the Sky Improving Urban-Scene Segmentation
Hi-CMD Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification
Scene-Adaptive Video Frame Interpolation via Meta-Learning
StarGAN v2 Diverse Image Synthesis for Multiple Domains
Task Agnostic Robust Learning on Corrupt Outputs by Correlation-Guided Mixture
Detecting Attended Visual Targets in Video
Effectively Unbiased FID and Inception Score and Where to Find
Deep Non-Line-of-Sight Reconstruction
Deep Global Registration
High-Dimensional Convolutional Networks for Geometric Pattern Recognition
Learning Geocentric Object Pose in Oblique Monocular Images
P-nets Deep Polynomial Neural Networks
Detection in Crowded Scenes One Proposal Multiple Predictions
A Context-Aware Loss Function for Action Spotting in Soccer Videos
Bodies at Rest 3D Human Pose and Shape Estimation From
Detecting Adversarial Samples Using Influence Functions and Nearest Neighbors
Editing in Style Uncovering the Local Semantics of GANs
DoveNet Deep Image Harmonization via Domain Verification
Attention-Based Context Aware Reasoning for Situation Recognition
Computing the Testing Error Without a Testing Set
Meshed-Memory Transformer for Image Captioning
Context-Aware Human Motion Prediction
GanHand Predicting Human Grasp Affordances in Multi-Object Scenes
Estimating Low-Rank Region Likelihood Maps
Gradually Vanishing Bridge for Adversarial Domain Adaptation
Learning Dynamic Relationships for 3D Human Motion Prediction
Towards Discriminability and Diversity Batch Nuclear-Norm Maximization Under Label Insufficient
Exploiting Joint Robustness to Adversarial Perturbations
High-Performance Long-Term Tracking With Meta-Updater
Neural Point Cloud Rendering via Multi-Plane Projection
SG-NN Sparse Generative Neural Networks for Self-Supervised Scene Completion of
Probabilistic Regression for Visual Tracking
Multi-Scale Fusion Subspace Clustering Using Similarity Constraint
On the Detection of Digital Face Manipulation
Your Local GAN Designing Two Dimensional Local Attention Mechanisms for
Sequential Mastery of Multiple Visual Tasks Networks Naturally Learn to
Unsupervised Model Personalization While Preserving Privacy and Scalability An Open
RoboTHOR An Open Simulation-to-Real Embodied AI Platform
Optimal least-squares solution to the hand-eye calibration problem
CvxNet Learnable Convex Decomposition
Detail-recovery Image Deraining via Context Aggregation Networks
Disentangled and Controllable Face Image Generation via 3D Imitative-Contrastive Learning
RetinaFace Single-Shot Multi-Level Face Localization in the Wild
Semantic Image Manipulation Using Scene Graphs
Guided Variational Autoencoder for Disentanglement Learning
Learning Depth-Guided Convolutions for Monocular 3D Object Detection
Minimal Solutions to Relative Pose Estimation From Two Views Sharing
Robust Homography Estimation via Dual Principal Component Pursuit
Learning to Observe Approximating Human Perceptual Thresholds for Detection of
Deep Geometric Functional Maps Robust Feature Learning for Shape Correspondence
Benchmarking Adversarial Robustness on Image Classification
Bi-Directional Interaction Network for Person Search
CentripetalNet Pursuing High-Quality Keypoint Pairs for Object Detection
Fashion Editing With Adversarial Parsing Learning
Instance Guided Proposal Network for Person Search
Multi-Scale Boosted Dehazing Network With Dense Feature Fusion
Robust Superpixel-Guided Attentional Adversarial Attack
Self-Robust 3D Point Recognition via Gather-Vector Guidance
What Can Be Transferred Unsupervised Domain Adaptation for Endoscopic Lesions
HOPE-Net A Graph-Based Model for Hand-Object Pose Estimation
Unsupervised Magnification of Posture Deviations Across Subjects
The GAN That Warped Semantic Attribute Editing With Unpaired Data
Action Modifiers Learning From Adverbs in Instructional Videos
Associate-3Ddet Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection
Correlation-Guided Attention for Corner Detection Based Visual Tracking
Learning Invariant Representation for Unsupervised Image Restoration
SpineNet Learning Scale-Permuted Backbone for Recognition and Localization
Adversarial Camouflage Hiding Physical-World Attacks With Natural Styles
Cross-Spectral Face Hallucination via Disentangling Independent Factors
Varicolored Image De-Hazing
Panoptic-Based Image Synthesis
Vec2Face Unveil Human Faces From Their Blackbox Features in Face
Watch Your Up-Convolution CNN Based Generative Deep Neural Networks Are
Learning User Representations for Open Vocabulary Image Hashtag Prediction
Counting Out Time Class Agnostic Video Repetition Counting in the
Structured Multi-Hashing for Model Compression
Tangent Images for Mitigating Spherical Distortion
Use the Force Luke Learning to Predict Physical Forces by
Smooth Shells Multi-Scale Shape Registration With Functional Maps
Uncertainty-Aware CNNs for Depth Completion Uncertainty from Beginning to End
Fast Sparse ConvNets
Meta-Learning of Neural Architectures for Few-Shot Learning
3D-MPA Multi-Proposal Aggregation for 3D Semantic Instance Segmentation
Photometric Stereo via Discrete Hypothesis-and-Test Search
Oops Predicting Unintentional Action in Video
A Disentangling Invertible Interpretation Network for Explaining Latent Representations
Learning to Discriminate Information for Online Action Detection
Differentiable Adaptive Computation Time for Visual Reasoning
Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation
TRPLP - Trifocal Relative Pose From Lines at Points
Camouflaged Object Detection
Few-Shot Object Detection With Attention-RPN and Multi-Relation Detector
FGN Fully Guided Network for Few-Shot Instance Segmentation
GaitPart Temporal Part-Based Model for Gait Recognition
Learning Integral Objects With Intra-Class Discriminator for Weakly-Supervised Semantic Segmentation
Learning Longterm Representations for Person Re-Identification Using Radio Signals
Taking a Deeper Look at Co-Salient Object Detection
Connect-and-Slice An Hybrid Approach for Reconstructing 3D Objects
Densely Connected Search Space for More Flexible Neural Architecture Search
GraspNet-1Billion A Large-Scale Benchmark for General Object Grasping
Perceptual Quality Assessment of Smartphone Photography
TPNet Trajectory Proposal Network for Motion Prediction
SCT Set Constrained Temporal Transformer for Set Supervised Action Segmentation
X3D Expanding Architectures for Efficient Video Recognition
Three-Dimensional Reconstruction of Human Interactions
ScrabbleGAN Semi-Supervised Varying Length Handwritten Text Generation
Information-Driven Direct RGB-D Odometry
How Much Time Do You Have Modeling Multi-Duration Saliency
gDLS Generalized Pose-and-Scale Estimation Given Scale and Gravity Priors
JL-DCF Joint Learning and Densely-Cooperative Fusion Framework for RGB-D Salient
Joint Texture and Geometry Optimization for RGB-D Reconstruction
MCEN Bridging Cross-Modal Gap between Cooking Recipes and Dish Images
Neural Implicit Embedding for Point Cloud Analysis
Learning Generative Models of Shape Handles
Wish You Were Here Context-Aware Human Generation
Music Gesture for Visual Sound Separation
AdversarialNAS Adversarial Neural Architecture Search for GANs
Discrete Model Compression With Resource Constraint for Deep Neural Networks
Flow Contrastive Estimation of Energy-Based Models
GraphTER Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding
Learning to Optimize on SPD Manifolds
Listen to Look Action Recognition by Previewing Audio
MTL-NAS Task-Agnostic Neural Architecture Search Towards General-Purpose Multi-Task Learning
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and
Pose-Guided Visible Part Matching for Occluded Person ReID
Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking
SketchyCOCO Image Generation From Freehand Scene Sketches
VectorNet Encoding HD Maps and Agent Dynamics From Vectorized Representation
Satellite Image Time Series Classification With Pixel-Set Encoders and Temporal
Actor-Transformers for Group Activity Recognition
Video to Events Recycling Video Datasets for Event Cameras
Averaging Essential and Fundamental Matrices in Collinear Camera Settings
Local Deep Implicit Functions for 3D Shape
Learning Representations by Predicting Bags of Visual Words
Learning Multiview 3D Point Cloud Registration
Eternal Sunshine of the Spotless Net Selective Forgetting in Deep
ReSprop Reuse Sparsified Backpropagation
A Quantum Computational Approach to Correspondence Problems on Point Sets
Geometrically Principled Connections in Graph Neural Networks
Learning Temporal Co-Attention Models for Unsupervised Video Action Localization
Achieving Robustness in the Wild via Adversarial Mixing With Disentangled
Dynamic Neural Relational Inference
Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching
Image Processing Using Multi-Code GAN Prior
Improving the Robustness of Capsule Networks to Image Affine Transformations
Spherical Space Domain Adaptation With Robust Pseudo-Label Loss
Generative Hybrid Representations for Activity Forecasting With No-Regret Learning
Minimal Solutions for Relative Pose With a Single Affine Correspondence
Through Fog High-Resolution Imaging Using Millimeter Wave Radar
Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision
FeatureFlow Robust Video Interpolation via Structure-to-Texture Generation
3D Packing for Self-Supervised Monocular Depth Estimation
A Spatiotemporal Volumetric Interpolation Network for 4D Dynamic Medical Image
Attentive Weights Generation for Few Shot Learning via Information Maximization
AugFPN Improving Multi-Scale Feature Learning for Object Detection
Closed-Loop Matters Dual Regression Networks for Single Image Super-Resolution
Density-Aware Feature Embedding for Face Clustering
DMCP Differentiable Markov Channel Pruning for Neural Networks
Hit-Detector Hierarchical Trinity Architecture Search for Object Detection
Iterative Context-Aware Graph Inference for Visual Dialog
Learning Meta Face Recognition in Unseen Domains
Multi-Dimensional Pruning A Unified Framework for Model Compression
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
On Positive-Unlabeled Classification in GAN
Online Knowledge Distillation via Collaborative Learning
Organ at Risk Segmentation for Head and Neck Cancer Using
SiamCAR Siamese Fully Convolutional Classification and Regression for Visual Tracking
When NAS Meets Robustness In Search of Robust Architectures Against
Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement
PatchVAE Learning Local Latent Codes for Recognition
Rethinking Depthwise Separable Convolutions How Intra-Kernel Correlations Lead to Improved
DeepCap Monocular Human Performance Capture Using Weak Supervision
HOnnotate A Method for 3D Annotation of Hand and Object
GhostNet More Features From Cheap Operations
Joint Training of Variational Auto-Encoder and Latent Energy-Based Model
Learning the Redundancy-Free Features for Generalized Zero-Shot Object Recognition
Neuromorphic Camera Guided High Dynamic Range Imaging
OccuSeg Occupancy-Aware 3D Instance Segmentation
RMP-SNN Residual Membrane Potential Neuron for Enabling Deeper High-Accuracy and
SPARE3D A Dataset for SPatial REasoning on Three-View Line Drawings
DualSDF Semantic Shape Manipulation Using a Two-Level Representation
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-Training
ILFO Adversarial Attack on Adaptive Neural Networks
Space-Time-Aware Multi-Resolution Video Enhancement
The Knowledge Within Methods for Data-Free Model Compression
Multi-scale Domain-adversarial Multiple-instance CNN for Cancer Subtype Classification with Unannotated
Leveraging Photometric Consistency Over Time for Sparsely Supervised Hand-Object Reconstruction
MPM Joint Representation of Motion and Position Map for Cell
Nonparametric Object and Parts Modeling With Lie Group Dynamics
Defending and Harnessing the Bit-Flip Based Adversarial Weight Attack
Epipolar Transformers
Incremental Learning in Online Scenario
Learning Filter Pruning Criteria for Deep Convolutional Neural Networks Acceleration
MiLeNAS Efficient Neural Architecture Search via Mixed-Level Reformulation
Momentum Contrast for Unsupervised Visual Representation Learning
PVN3D A Deep Point-Wise 3D Keypoints Voting Network for 6DoF
Structure Aware Single-Stage 3D Object Detection From Point Cloud
A Lighting-Invariant Point Processor for Shading
Leveraging 2D Data to Learn Textured 3D Mesh Generation
Learning a Neural 3D Texture Space From 2D Exemplars
A Multi-Hypothesis Approach to Color Constancy
Learning to Autofocus
PointGMM A Neural GMM Network for Point Clouds
Exploit Clues From Views Self-Supervised and Regularized Learning for Multiview
EPOS Estimating 6D Pose of Objects With Symmetries
Augment Your Batch Improving Generalization Through Instance Repetition
Distilling Image Dehazing With Heterogeneous Task Imitation
Learning to Detect Important People in Unlabelled Images for Semi-Supervised
Composed Query Image Retrieval Using Locally Bounded Features
Inter-Region Affinity Distillation for Road Marking Segmentation
Learning to Structure an Image With Few Colors
Real-Time Panoptic Segmentation From Dense Detections
RevealNet Seeing Behind Objects in RGB-D Scans
Strip Pooling Rethinking Spatial Pooling for Scene Parsing
ViBE Dressing for Diverse Body Shapes
Generalized ODIN Detecting Out-of-Distribution Image Without Learning From Out-of-Distribution Data
Bi-Directional Relationship Inferring Network for Referring Image Segmentation
Collaborative Motion Prediction via Neural Motion Message Passing
Creating Something From Nothing Unsupervised Knowledge Distillation for Cross-Modal Hashing
DSNAS Direct Neural Architecture Search Without Parameter Retraining
Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA
Learning to Segment the Tail
Progressive Relation Learning for Group Activity Recognition
RandLA-Net Efficient Semantic Segmentation of Large-Scale Point Clouds
Single-Stage 6D Object Pose Estimation
Temporally Distributed Networks for Fast Video Semantic Segmentation
Unsupervised Domain Adaptation With Hierarchical Gradient Synchronization
What You See is What You Get Exploiting Visibility for
Adversarial Texture Optimization From RGB-D Scans
An Internal Covariate Shift Bounding Algorithm for Deep Neural Networks
An Investigation Into the Stochasticity of Batch Whitening
ARCH Animatable Reconstruction of Clothed Humans
ClusterVO Clustering Moving Instances and Estimating Visual Odometry for Self
Controllable Orthogonalization in Training DNNs
CurricularFace Adaptive Curriculum Learning Loss for Deep Face Recognition
Deep Semantic Clustering by Partition Confidence Maximisation
Fast Video Object Segmentation With Temporal Aggregation Network and Dynamic
Feature-Metric Registration A Fast Semi-Supervised Approach for Robust Point Cloud
Temporal Reasoning Improves Action Segmentation
Interpretable and Accurate Fine-grained Recognition via Region Grouping
Learning Identity-Invariant Motion Representations for Cross-ID Face Reenactment
NMS by Representative Region Towards Crowded Pedestrian Detection by Proposal
OctSqueeze Octree-Structured Entropy Model for LiDAR Compression
PF-Net Point Fractal Network for 3D Point Cloud Completion
Probability Weighted Compact Feature for Domain Adaptive Retrieval
PropagationNet Propagate Points to Curve to Learn Structure Information
Real-World Person Re-Identification via Degradation Invariance Learning
Referring Image Segmentation via Cross-Modal Progressive Comprehension
SQE A Self Quality Evaluation Metric for Parameters Optimization in
The Devil Is in the Details Delving Into Unbiased Data
Universal Physical Camouflage Attacks on Object Detectors
Self-Supervised Monocular Scene Flow Estimation
A Shared Multi-Attention Framework for Multi-Label Zero-Shot Learning
Fine-Grained Generalized Zero-Shot Learning via Dense Attribute-Based Attention
Interactive Multi-Label CNN Learning With Partial Labels
Learning to Super Resolve Intensity Images From Events
Semi-Supervised Semantic Image Segmentation With Self-Correcting Networks
Low-Rank Compression of Neural Nets Learning the Rank of Each
Global Optimality for Point Set Registration Using Semidefinite Programming
Weakly-Supervised 3D Human Pose Learning via Multi-View Images in the
Enhancing Generic Segmentation With Learned Region Representations
DOA-GAN Dual-Order Attentive Generative Adversarial Network for Image Copy-Move Forgery
Video Super-Resolution With Temporal Group Attention
Optical Non-Line-of-Sight Physics-Based 3D Human Pose Estimation
Scene Recomposition by Learning-Based ICP
Can Deep Learning Recognize Subtle Human Activities
ActionBytes Learning From Trimmed Videos to Localize Actions
Self-Supervised Learning of Interpretable Keypoints From Unlabelled Videos
Attack to Explain Deep Representation
Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition From a Domain
Generalized Product Quantization Network for Semi-Supervised Image Retrieval
xMUDA Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation
Learn2Perturb An End-to-End Feature Perturbation Learning to Improve Adversarial Robustness
Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics
Sparse Layered Graphs for Multi-Object Segmentation
Action Genome Actions As Compositions of Spatio-Temporal Scene Graphs
Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization
Revisiting Saliency Metrics Farthest-Neighbor Area Under Curve
Single-Side Domain Generalization for Face Anti-Spoofing
Attention Scaling for Crowd Counting
Coherent Reconstruction of Multiple Humans From a Single Image
DeeperForensics-1.0 A Large-Scale Dataset for Real-World Face Forgery Detection
End-to-End 3D Point Cloud Instance Segmentation Without Detection
Fantastic Answers and Where to Find Them Immersive Question-Directed Visual
In Defense of Grid Features for Visual Question Answering
Learning Event-Based Motion Deblurring
Local Implicit Grid Representations for 3D Scenes
Multi-Scale Progressive Fusion Network for Single Image Deraining
Peek-a-Boo Occlusion Reasoning in Indoor Scenes With Plane Representations
PointGroup Dual-Set Point Grouping for 3D Instance Segmentation
PSGAN Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup
SDFDiff Differentiable Rendering of Signed Distance Fields for 3D Shape
SP-NAS Serial-to-Parallel Backbone Search for Object Detection
AdaBits Neural Network Quantization With Adaptive Bit-Widths
Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction
Geometric Structure Based and Regularized Depth Estimation From 360 Indoor
Light Field Spatial Super-Resolution via Deep Combinatorial Geometry Embedding and
Style Normalization and Restitution for Generalizable Person Re-Identification
Cross-Modal Cross-Domain Moment Alignment Network for Person Search
Self-Supervised Monocular Trained Depth Estimation Using Self-Attention and Discrete Disparity
Select to Better Learn Fast and Accurate Deep Learning Using
Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation
MMTM Multimodal Transfer Module for CNN Fusion
Deep Polarization Cues for Transparent Object Segmentation
Benchmarking the Robustness of Semantic Segmentation Models
Noise Robust Generative Adversarial Networks
Defending Against Model Stealing Attacks With Adaptive Misinformation
MSG-GAN Multi-Scale Gradients for Generative Adversarial Networks
Analyzing and Improving the Image Quality of StyleGAN
Deblurring Using Analysis-Synthesis Networks Pair
On Translation Invariance in CNNs Convolutional Layers Can Exploit Absolute
Multiple Anchor Learning for Visual Object Detection
RGBD-Dog Predicting Canine Pose from RGBD Sensors
RankMI A Mutual Information Maximizing Ranking Loss
Generalized Zero-Shot Learning via Over-Complete Distribution
AnimalWeb A Large-Scale Hierarchical Dataset of Annotated Animal Faces
Hyperbolic Image Embeddings
ActiveMoCap Optimized Viewpoint Selection for Active Human Motion Capture
A Programmatic and Semantic Approach to Explaining and Debugging Neural
Advisable Learning for Self-Driving Vehicles by Internalizing Observation-to-Action Rules
GroupFace Learning Latent Groups and Constructing Group-Based Representations for Face
Hypergraph Attention Networks for Multimodal Learning
Learning Texture Invariant Representation for Domain Adaptation of Semantic Segmentation
Learning to Simulate Dynamic Environments With GameGAN
M2m Imbalanced Classification via Major-to-Minor Translation
Modality Shifting Attention Network for Multi-Modal Video Question Answering
Modeling Biological Immunity to Adversarial Examples
Proxy Anchor Loss for Deep Metric Learning
Regularization on Spatio-Temporally Smoothed Feature for Action Recognition
Single Image Reflection Removal With Physically-Based Training Images
Spatially Attentive Output Layer for Image Classification
Transfer Learning From Synthetic to Real-Noise Denoising With Adaptive Instance
Video Panoptic Segmentation (video panoptic segmentation 4700, panoptic segmentation 10000)
PointRend Image Segmentation As Rendering
CONSAC Robust Multi-Model Fitting by Conditional Sample Consensus
Belief Propagation Reloaded Learning BP-Layers for Labeling Problems
Embedding Expansion Augmentation in Embedding Space for Deep Metric Learning
Total Deep Variation for Linear Inverse Problems
VIBE Video Inference for Human Body Pose and Shape Estimation
Universal Litmus Patterns Revealing Backdoor Attacks in CNNs
PhysGAN Generating Physical-World-Resilient Adversarial Examples for Autonomous Driving
Compositional Convolutional Neural Networks A Deep Architecture With Innate Robustness
Factorized Higher-Order CNNs With an Application to Spatio-Temporal Emotion Estimation
DeepFaceFlow In-the-Wild Dense 3D Facial Motion Estimation
Learning Interactions and Relationships Between Movie Characters
Instance Segmentation of Biological Images Using Harmonic Embeddings
Articulation-Aware Canonical Surface Mapping
Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild
LUVLi Face Alignment Estimating Landmarks Location Uncertainty and Visibility Likelihood
Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image
Towards Inheritable Models for Open-Set Domain Adaptation
Universal Source-Free Domain Adaptation
Normal Assisted Stereo Depth Estimation
Structured Compression by Weight Encryption for Unstructured Pruning and Quantization
Blur Aware Calibration of Multi-Focus Plenoptic Camera
Prior Guided GAN Based Semantic Inpainting
MAST A Memory-Augmented Self-Supervised Tracker
MSeg A Composite Dataset for Multi-Domain Semantic Segmentation
SaccadeNet A Fast and Accurate Object Detector
SampleNet Differentiable Point Cloud Sampling
Which Is Plagiarism Fashion Image Retrieval Based on Regional Representation
AvatarMe Realistically Renderable 3D Facial Reconstruction In-the-Wild
Learning Instance Occlusion for Panoptic Segmentation
A Graduated Filter Method for Large Scale Robust Estimation
Deep Homography Estimation for Dynamic Scenes
Going Deeper With Lean Point Networks
Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction
Hierarchical Conditional Relation Networks for Video Question Answering
AdaCoF Adaptive Collaboration of Flows for Video Frame Interpolation
Adversarial Vertex Mixup Toward Better Adversarially Robust Generalization
CenterMask Real-Time Anchor-Free Instance Segmentation
Continual Learning With Extended Kronecker-Factored Approximate Curvature
Large Scale Video Representation Learning via Relational Graph Clustering
Learning Augmentation Network via Influence Functions
MaskGAN Towards Diverse and Interactive Facial Image Manipulation
NeuralScale Efficient Scaling of Neurons for Resource-Constrained Deep Neural Networks
Reference-Based Sketch Image Colorization Using Augmented-Self Reference and Dense Semantic
Structure Boundary Preserving Segmentation for Medical Image With Ambiguous Boundary
TextureFusion High-Quality Texture Acquisition for Real-Time RGB-D Scanning
Uncertainty-Aware Mesh Decoder for High Fidelity 3D Face Reconstruction
Warping Residual Based Image Stitching for Large Parallax
Polarized Reflection Removal With Perfect Alignment in the Wild
SegGCN Efficient 3D Point Cloud Segmentation With Fuzzy Spherical Kernel
Deep Iterative Surface Normal Estimation
Adaptive Interaction Modeling via Graph Operations Search
Advancing High Fidelity Identity Swapping for Forgery Detection
Adversarial Feature Hallucination Networks for Few-Shot Learning
All in One Bad Weather Removal Using Architectural Search
Anisotropic Convolutional Networks for 3D Semantic Scene Completion
Approximating shapes in images with low-complexity polygons
AutoTrack Towards High-Performance Visual Tracking for UAV With Automatic Spatio-Temporal
BachGAN High-Resolution Image Synthesis From Salient Object Layout
Background Data Resampling for Outlier-Aware Classification
Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation
Boosting Few-Shot Learning With Adaptive Margin Loss
Cascaded Deep Monocular 3D Human Pose Estimation With Evolutionary Training
Category-Level Articulated Object Pose Estimation
Celeb-DF A Large-Scale Challenging Dataset for DeepFake Forensics
Composing Good Shots by Exploiting Mutual Relations
Context-Aware Group Captioning via Self-Attention and Contrastive Features
Correspondence Networks With Adaptive Neighbourhood Consensus
Cross-Domain Document Object Detection Benchmark Suite and Method
Deep Fair Clustering for Visual Learning
Deep Grouping Model for Unified Perceptual Parsing
Deformation-Aware Unpaired Image Translation for Pose Estimation on Laboratory Animals
Density-Aware Graph for Deep Semi-Supervised Visual Recognition
Detailed 2D-3D Joint Representation for Human-Object Interaction
Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives
Multiscale Graph Neural Networks for 3D Skeleton Based Human
End-to-End Learning Local Multi-View Descriptors for 3D Point Clouds
Enhanced Blind Face Restoration With Multi-Exemplar Images and Adaptive Spatial
Enhanced Transport Distance for Unsupervised Domain Adaptation
Enhanced Intrinsic Adversarial Robustness via Feature Pyramid Decoder
Face X-Ray for More General Face Forgery Detection
FALCON A Fourier Transform Based Approach for Fast and Secure
Few Sample Knowledge Distillation for Efficient Network Compression
FSS-1000 A 1000-Class Dataset for Few-Shot Segmentation
Gait Recognition via Semi-supervised Disentangled Representation Learning to Identity and
GAN Compression Efficient Architectures for Interactive Conditional GANs
GP-NAS Gaussian Process Based Neural Architecture Search
Group Sparsity The Hinge Between Filter Pruning and Decomposition for
Hierarchical Scene Coordinate Classification and Regression for Visual Localization
Improving Confidence Estimates for Unfamiliar Examples
Improving One-Shot NAS by Suppressing the Posterior Fading
Inverse Rendering for Complex Indoor Scenes Shape Spatially-Varying Lighting and
Joint Spatial-Temporal Optimization for Stereo 3D Object Tracking
Learning Dynamic Routing for Semantic Segmentation
Learning Formation of Physically-Based Face Attributes
Learning From Noisy Anchors for One-Stage Object Detection
Learning to Learn Cropping Models for Different Aspect Ratio Requirements
Learning to Optimize Non-Rigid Tracking
ManiGAN Text-Guided Image Manipulation
MixNMatch Multifactor Disentanglement and Encoding for Conditional Image Generation
Model Adaptation Unsupervised Domain Adaptation Without Source Data
NETNet Neighbor Erasing and Transferring Network for Better Single Shot
Neural Architecture Search for Lightweight Non-Local Networks
Overcoming Classifier Imbalance for Long-Tail Object Detection With Balanced Group
PaStaNet Toward Human Activity Knowledge Engine
Perspective Plane Program Induction From a Single Image
PointAugment An Auto-Augmentation Framework for Point Cloud Classification
Projection Probability-Driven Black-Box Attack
QEBA Query-Efficient Boundary-Based Blackbox Attack
Recurrent Feature Reasoning for Image Inpainting
Robust 3D Self-Portraits in Seconds
Screencast Tutorial Video Understanding
Self-Learning With Rectification Strategy for Human Parsing
Self-Supervised Deep Visual Odometry With Online Adaptation
Set-Constrained Viterbi for Set-Supervised Action Segmentation
SGAS Sequential Greedy Architecture Search
Shape correspondence using anisotropic Chebyshev spectral CNNs
Single Image Reflection Removal Through Cascaded Refinement
SmallBigNet Integrating Core and Contextual Views for Video Classification
Spatial Pyramid Based Graph Reasoning for Semantic Segmentation
Symmetry and Group in Attribute-Object Compositions
TEA Temporal Excitation and Aggregation for Action Recognition
Through the Looking Glass Neural 3D Reconstruction of Transparent Shapes
Towards Transferable Targeted Attack
Training a Steerable CNN for Guidewire Detection
Transferring Cross-Domain Knowledge for Video Sign Language Recognition
Unifying Training and Inference for Panoptic Segmentation
Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation
Visual-Semantic Matching by Exploring High-Order Attention and Distraction
Wavelet Integrated CNNs for Noise-Robust Image Classification
PnPNet End-to-End Perception and Prediction With Tracking in the Loop
PolyTransform Deep Polygon Transformer for Instance Segmentation
The Garden of Forking Paths Towards Multi-Future Trajectory Prediction
A Real-Time Cross-Modality Correlation Filtering Method for Referring Expression Comprehension
Iteratively-Refined Interactive 3D Medical Image Segmentation With Multi-Agent Reinforcement Learning
PPDM Parallel Point Detection and Matching for Real-Time Human-Object Interaction
Towards Unsupervised Learning of Generative Models for 3D Controllable Image
A Spatial RNN Codec for End-to-End Image Compression
BEDSR-Net A Deep Shadow Removal Network From a Single Document
Convolution in the Cloud Learning Deformable Kernels in 3D Graph
Fashion Outfit Complementary Item Retrieval
FPConv Learning Local Flattening for Point Convolution
GPS-Net Graph Property Sensing Network for Scene Graph Generation
Graph-Guided Architecture Search for Real-Time Semantic Segmentation
HRank Filter Pruning Using High-Rank Feature Map
Interactive Image Segmentation With First Click Attention
M-LVC Multiple Frames Prediction for Learned Video Compression
Progressive Mirror Detection
Regularizing Neural Networks via Minimizing Hyperspherical Energy
Shoestring Graph-Based Semi-Supervised Classification With Severely Limited Labeled Data
Sketch-BERT Learning Sketch Bidirectional Encoder Representation From Transformers by Self-Supervised
Towards High-Fidelity 3D Face Reconstruction From In-the-Wild Images Using Graph
Unsupervised Person Re-Identification via Softened Similarity Learning
Video Instance Segmentation Tracking With a Modified VAE Architecture
Visual Chirality
Few-Shot Pill Recognition
SCATTER Selective Context Attentional Scene Text Recognizer
3D Part Guided Image Editing for Fine-Grained Object Understanding
A Novel Recurrent Encoder-Decoder Structure for Large-Scale Multi-View Stereo Reconstruction
ABCNet Real-Time Scene Text Spotting With Adaptive Bezier-Curve Network
ARShadowGAN Shadow Generative Adversarial Network for Augmented Reality in Single
Attention Mechanism Exploits Temporal Contexts Real-Time 3D Human Pose Reconstruction
Beyond Short-Term Snippet Video Relation Detection With Spatio-Temporal Global Context
BFBox Searching Face-Appropriate Backbone and Feature Pyramid Network for Face
Boosting Semantic Human Matting With Coarse Annotations
CARP Compression Through Adaptive Recursive Partitioning for Multi-Dimensional Images
CRNet Cross-Reference Networks for Few-Shot Segmentation
Decoupled Representation Learning for Skeleton-Based Gesture Recognition
Deep Representation Learning on Long-Tailed Data A Learnable Embedding Augmentation
Deep Shutter Unrolling Network
Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition
DIST Rendering Deep Implicit Signed Distance Function With Differentiable Sphere Function
Diverse Image Generation via Self-Conditioned GANs
Extremely Dense Point Correspondences Using a Learned Feature Descriptor
Few-Shot Open-Set Recognition Using Meta-Learning
Flow2Stereo Effective Self-Supervised Learning of Optical Flow and Stereo Matching
Global Texture Enhancement for Fake Face Detection in the Wild
Globally Optimal Contrast Maximisation for Event-Based Motion Estimation
Graph Structured Network for Image-Text Matching
HAMBox Delving Into Mining High-Quality Anchors on Face Detection
How Does Noise Help Robustness Explanation and Exploration under the Noise
Hyperbolic Visual Embedding Learning for Zero-Shot Recognition
Improving Convolutional Networks With Self-Calibrated Convolutions
Joint Demosaicing and Denoising With Self Guidance
KeyPose Multi-View 3D Labeling and Keypoint Estimation for Transparent Objects
Learning by Analogy Reliable Supervision From Transformations for Unsupervised Optical
Learning Selective Self-Mutual Attention for RGB-D Saliency Detection
Learning to See Through Obstructions
MemNAS Memory-Efficient Neural Architecture Search With Grow-Trim Learning
Mnemonics Training Multi-Class Incremental Learning Without Forgetting
Neural Contours Learning to Draw Lines From 3D Shapes
Open Compound Domain Adaptation
Recognizing Objects From Any View With Object and Viewer-Centered Representations
Regularizing Discriminative Capability of CGANs for Semi-Supervised Generative Learning
Residual Feature Aggregation Network for Image Super-Resolution
Rethinking Computer-Aided Tuberculosis Diagnosis
Search to Distill Pearls Are Everywhere but Not the Eyes
Semantic Correspondence as an Optimal Transport Problem
Severity-Aware Semantic Segmentation With Reinforced Wasserstein Training
Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline
StereoGAN Bridging Synthetic-to-Real Domain Gap by Joint Optimization of Domain
Towards Visually Explaining Variational Autoencoders
Understanding Road Layout From Videos as a Whole
Unity Style Transfer for Person Re-Identification
Unsupervised Instance Segmentation in Microscopy Images via Panoptic Domain Adaptation
Unsupervised Learning for Intrinsic Image Decomposition From a Single Image
Violin A Large-Scale Dataset for Video-and-Language Inference
Visually Imbalanced Stereo Matching
When2com Multi-Agent Perception via Communication Graph Grouping
Generating Accurate Pseudo-Labels in Semi-Supervised Learning and Avoiding Overconfident Predictions
Searching for Actions on the Hyperbole
UnrealText Synthesizing Realistic Scene Text Images From the Unreal World
12-in-1 Multi-Task Vision and Language Representation Learning
Cross-Modality Person Re-Identification With Shared-Specific Feature Transfer
Cross-Task Black-Box Transferability of Adversarial Examples With Dispersion Reduction
From Depth What Can You See Depth Completion via Auxiliary
Geometry-Aware Satellite-to-Ground Image Synthesis for Urban Areas
Learning Video Object Segmentation From Unlabeled Videos
MUXConv Information Multiplexing in Convolutional Neural Networks
Predicting Cognitive Declines Using Longitudinally Enriched Representations for Imaging Biomarkers
RetinaTrack Online Single Stage Joint Detection and Tracking
Stochastic Classifiers for Unsupervised Domain Adaptation
D3S - A Discriminative Single Shot Segmentation Tracker
ASLFeat Learning Local Features of Accurate Shape and Localization
Attention-Aware Multi-View Stereo

End-to-End Optimization of Scene Layout
Learn to Augment Joint Data Augmentation and Network Optimization for
Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Neural Network Pruning With Residual-Connections and Limited-Data
Wavelet Synthesis Net for Disparity Estimation to Synthesize DSLR Caliber
Where What Whether Multi-Modal Learning Meets Pedestrian Detection
Cross-Domain Semantic Segmentation via Domain-Invariant Interactive Relation Transfer
Learning to Segment 3D Point Clouds in 2D Image Space
Deep Face Super-Resolution With Iterative Collaboration Between Attentive Recovery and
Learning to Dress 3D People in Generative Clothing
Structure-Preserving Super Resolution With Gradient Guidance
Unpaired Image Super-Resolution Using Pseudo-Supervision
Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation
Boundary-Aware 3D Building Reconstruction From a Single Overhead Image
Erasing Integrated Learning A Simple Yet Effective Approach for Weakly
Multimodal Future Localization and Emergence Prediction for Objects in Egocentric
SOS Selective Objective Switch for Rapid Immunofluorescence Whole Slide Image
HandVoxNet Deep Voxel-Based Network for 3D Hand Shape and Pose
Sideways Depth-Parallel Training of Video Models
TITAN Future Forecast Using Action Priors
LiDARsim Realistic LiDAR Simulation by Leveraging the Real World
MANTRA Memory Augmented Networks for Multiple Trajectory Prediction
Graph Embedded Pose Clustering for Anomaly Detection
Towards Learning Structure via Consensus for Face Segmentation and Parsing
Something-Else Compositional Action Recognition With Spatial-Temporal Interaction Networks
Minimal Solvers for 3D Scan Alignment With Pairs of Intersecting
Augmenting Colonoscopy Using Extended and Directional CycleGAN for Lossy Image Translation
CIAGAN Conditional Identity Anonymization Generative Adversarial Networks
Focus on Defocus Bridging the Synthetic to Real Domain Gap
Visual-Textual Capsule Routing for Text-Based Video Segmentation
Don't Hit Me Glass Detection in Real-World Scenes
Image Super-Resolution With Cross-Scale Non-Local Attention and Exhaustive Self-Exemplars Mining
Learning to Have an Ear for Face Super-Resolution
Controllable Person Image Synthesis With Attribute-Decomposed GAN
ADINet Attribute Driven Incremental Network for Retinal Image Classification
Filter Grafting for Deep Neural Networks
Parsing-Based View-Aware Embedding Network for Vehicle Re-Identification
PULSE Self-Supervised Photo Upsampling via Latent Space Exploration of Generative
Learning Better Lossless Compression Using Lossy Compression
Can We Learn Heuristics for Graphical Model Inference Using Reinforcement
Deep Optics for Single-Shot High-Dynamic-Range Imaging
Single-Shot Monocular RGB-D Imaging Using Uneven Double Refraction
Hierarchical Graph Attention Network for Visual Relationship Detection
SSRNet Scalable 3D Surface Reconstruction Network
Memory Aggregation Networks for Efficient Interactive Video Object Segmentation
End-to-End Learning of Visual Representations From Uncurated Instructional Videos
An Efficient PointLSTM for Point Clouds Based Gesture Recognition
Domain-Aware Visual Bias Eliminating for Generalized Zero-Shot Learning
VOLDOR Visual Odometry From Log-Logistic Dense Optical Flow Residuals
Learning Weighted Submanifolds With Variational Autoencoders and Riemannian Variational Autoencoders
Learning to Transfer Texture From Clothing Images to 3D Humans
Self-Supervised Learning of Pretext-Invariant Representations
Multiview-Consistent Semi-Supervised Learning for 3D Human Pose Estimation
Learning Visual Motion Segmentation Using Event Surfaces
EmotiCon Context-Aware Multimodal Emotion Recognition Using Frege's Principle
HyperSTAR Task-Aware Hyperparameters for Deep Networks
Just Go With the Flow Self-Supervised Scene Flow Estimation
StructEdit Learning Structural Shape Variations
Social-STGCNN A Social Spatio-Temporal Graph Convolutional Neural Network for Human
Moving in the Right Direction A Regularization for Deep Metric
Towards Verifying Robustness of Neural Networks Against A Family of
Fast Symmetric Diffeomorphic Image Registration with Convolutional Neural Networks
DeepLPF Deep Local Parametric Filters for Image Enhancement
Noisier2Noise Learning to Denoise From Unpaired Noisy Data
Hardware-in-the-Loop End-to-End Optimization of Camera Image Processing Pipelines
Learning From Synthetic Animals
Local-Global Video-Text Interactions for Temporal Grounding
Multi-Modal Domain Adaptation for Fine-Grained Action Recognition
Dataless Model Selection With the Deep Frame Potential
Self-Supervised Viewpoint Learning From Image Collections
Ego-Topo Environment Affordances From Egocentric Video
Speech2Action Cross-Modal Supervision for Action Recognition
DOPS Learning to Detect 3D Objects and Predict Their 3D
Deep Learning for Handling Kernel/model Uncertainty in Image Deconvolution
Variational-EM-Based Deep Learning for Noise-Blind Image Deblurring
A Self-supervised Approach for Adversarial Robustness
From Image Collections to Point Clouds With Self-Supervised Shape and
Learning Physics-Guided Face Relighting Under Directional Light
Image Based Virtual Try-On Network From Unpaired Data
How Useful Is Self-Supervised Pretraining for Visual Tasks
Adaptive Hierarchical Down-Sampling for Point Cloud Classification
You2Me Inferring Body Pose in Egocentric Video via First and
Total3DUnderstanding Joint Layout Object Pose and Mesh Reconstruction for Indoor
Differentiable Volumetric Rendering Learning Implicit 3D Representations Without 3D Supervision
Softmax Splatting for Video Frame Interpolation
Breaking the Cycle - Colleagues Are All You Need
HCNAF Hyper-Conditioned Neural Autoregressive Flow and its Application for Probabilistic
Learning Situational Driving
Intuitive Interactive Beard and Hair Synthesis With Generative Models
TetraTSDF 3D Human Reconstruction From a Single Image With a
A Unified Optimization Framework for Low-Rank Inducing Penalties
Bundle Adjustment on a Graph Processor
Local Context Normalization Revisiting Local Normalization
Semi-Supervised Semantic Segmentation With Cross-Consistency Training
Efficient Neural Vision Systems Based on Convolutional Image Acquisition
3DRegNet A Deep Neural Network for 3D Point Registration
Faster Reconstruction of Shredded Text Documents via Self-Supervised Deep Asymmetric
Looking at the Right Stuff - Guided Semantic-Gaze for Autonomous
On the Regularization Properties of Structured Dropout
Cascaded Deep Video Deblurring Using Temporal Sharpness Prior
Dynamic Refinement Network for Oriented and Densely Packed Object Detection
Exploring Category-Agnostic Clusters for Open-Set Domain Adaptation
Single Image Optical Flow Estimation With an Event Camera
Spatio-Temporal Graph for Video Captioning With Knowledge Distillation
Unsupervised Intra-Domain Adaptation for Semantic Segmentation Through Self-Supervision
X-Linear Attention Networks for Image Captioning
BidNet Binocular Image Dehazing Without Explicit Disparity Estimation
Multi-Scale Interactive Network for Salient Object Detection
Self-Trained Deep Ordinal Regression for End-to-End Video Anomaly Detection
Solving Mixed-Modal Jigsaw Puzzle for Fine-Grained Sketch-Based Image Retrieval
TubeTK Adopting Tubes to Track Multi-Object in a One-Step Training
Local Non-Rigid Structure-From-Motion From Diffeomorphic Mappings
LatentFusion End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose
Learning Memory-Guided Normality for Anomaly Detection
Seeing the World in a Bag of Chips
Learning Unsupervised Hierarchical Part Decomposition of 3D Objects From a
Heterogeneous Knowledge Distillation Using Information Flow Modeling
TailorNet Predicting Clothing in 3D as a Function of Human
An End-to-End Edge Aggregation Network for Moving Object Segmentation
Seeing without Looking Contextual Rescoring of Object Detections for AP
3D-ZeF A 3D Zebrafish Tracking Benchmark Dataset
Deep Snake for Real-Time Instance Segmentation
IDA-3D Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous
Large-Scale Object Detection in the Wild From Imbalanced Multi-Labels
SAINT Spatially Aware Interpolation NeTwork for Medical Slice Synthesis
Generative-Discriminative Feature Representations for Open-Set Recognition
Incremental Few-Shot Object Detection
Binarizing MobileNet via Evolution-Based Searching
CoverNet Multimodal Behavior Prediction Using Trajectory Sets
Learning to Evaluate Perception Models Using Planner-Centric Metrics
A2dele Adaptive and Attentive Depth Distiller for Efficient RGB-D Salient
Adversarial Latent Autoencoders
Evolving Losses for Unsupervised Video Representation Learning
SharinGAN Combining Synthetic and Real Data for Unsupervised Geometry Estimation
On the Uncertainty of Self-Supervised Monocular Depth Estimation
Uncertainty Based Camera Model Selection
Learning Multi-Object Tracking and Segmentation From Automatic Annotations
Embodied Language Grounding With 3D Visual Feature Representations
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis
Exploring Data Aggregation in Policy Learning for Vision-Based Urban Autonomous
C-Flow Conditional Generative Flow Models for Images and 3D Point
Imitative Non-Autoregressive Modeling for Trajectory Forecasting and Imputation
ImVoteNet Boosting 3D Object Detection in Point Clouds With Image
P2B Point-to-Box Network for 3D Object Tracking in Point Clouds
REVERIE Remote Embodied Visual Referring Expression in Real Indoor Environments
Two Causal Principles for Improving Visual Dialog
DR Loss Improving Object Detection by Distributional Ranking
End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection
Hierarchically Robust Representation Learning
Attention-Guided Hierarchical Structure Aggregation for Image Matting
Learning to Learn Single Domain Generalization
SEED Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition
Forward and Backward Information Retention for Accurate Binary Neural Networks
Adaptive Loss-Aware Quantization for Multi-Bit Networks
Self2Self With Dropout Learning Self-Supervised Denoising From Single Image
Designing Network Design Spaces
GeoDA A Geometric Framework for Black-Box Adversarial Attacks
Robust Design of Deep Neural Networks Against Adversarial Attacks Based
iTAML An Incremental Task-Agnostic Meta-learning Approach
TBT Targeted Neural Network Attack With Bit Trojan
Predicting Sharp and Accurate Occlusion Boundaries in Monocular Depth Estimation
DLWL Improving Detection for Lowshot Classes With Weakly Labeled Data
What's Hidden in a Randomly Weighted Neural Network
Straight to the Point Fast-Forwarding Videos via Reinforcement Learning Using
A Local-to-Global Approach to Multi-Modal Movie Scene Segmentation
Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point
Reinforcement Learning Aware Simulation-to-Real
Learning to Measure the Static Friction Coefficient in Cloth Contact
There and Back Again Revisiting Backpropagation Saliency Methods
Neural Voxel Renderer Learning an Accurate and Controllable Rendering Tool
Lightweight Multi-View 3D Pose Estimation Through Camera-Disentangled Representation
Deep Image Spatial Transformation for Person Image Generation
Instance-Aware Context-Focused and Memory-Efficient Weakly Supervised Object Detection
Neural Blind Deconvolution Using Deep Priors
Sketchformer Transformer-Based Representation for Sketched Structure
McFlow Monte Carlo Flow Models for Data Imputation
Learning Fast and Robust Target Models for Video Object Segmentation
Predicting Semantic Map Representations From Images Using Pyramid Occupancy Networks
Optimizing Rank-Based Metrics With Blackbox Differentiation
Joint Graph-Based Depth Refinement and Normal Estimation
PADS Policy-Adapted Sampling for Visual Similarity Learning
STEFANN Scene Text Editor Using Font Adaptive Neural Network
Sub-Frame Appearance and 6D Pose Estimation of Fast Moving Objects
Cloth in the Wind A Case Study of Physical Measurement
FroDO From Detections to 3D Objects
Video Object Grounding Using Semantic Roles in Language Description
PIFuHD Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization
Active 3D Motion Visualization Based on Spatiotemporal Light-Ray Integration Active 3D Motion Visualization Based on Spatiotemporal Light-Ray Integration 3D Motion Visualization
Learning a Dynamic Map of Visual Appearance Learning a Dynamic Map of Visual Appearance
Show Edit and Tell A Framework for Editing Image Captions Show Edit and Tell A Framework for Editing Image Captions
Transferring Dense Pose to Proximal Animal Classes Transferring Dense Pose to Proximal Animal Classes Telezoan
Separating Particulate Matter From a Single Microscopic Image
Warp to the Future Joint Forecasting of Features and Feature Warp to the Future Joint Forecasting Features and Features
Can Facial Pose and Expression Be Separated With Weak Perspective Amblyopia Separate facial poses from expressions?
Discovering Synchronized Subsets of Sequences A Large Scale Solution Discovering Synchronized Subsets of Sequences A Large Scale Solution
SuperGlue Learning Feature Matching With Graph Neural Networks SuperGlue Learning Feature Matching and Graph Neural Networks
Seeing Around Street Corners Non-Line-of-Sight Detection and Tracking In -the-Wild Using Looking Around Street Corners: On
Joint Estimation of Pose Geometry and svBRDF From
A U-Net Based Discriminator for Generative Adversarial Networks
Why Having 10000 Parameters in Your Camera Model Is Better
DualConvMesh-Net Joint Geodesic and Euclidean Convolutions on 3D Meshes
Learning Nanoscale Motion Patterns of Vesicles in Living Cells
SQuINTing at VQA Models Introspecting VQA Models With Sub-Questions
Background Matting The World Is Your Green Screen
End-to-End Camera Calibration for Broadcast Videos
ColorFool Semantic Adversarial Colorization
Understanding Human Hands in Contact at Internet Scale
Domain Adaptation for Image Dehazing
FineGym A Hierarchical Video Dataset for Fine-Grained Action Understanding
Intra- and Inter-Action Understanding via Temporal Action Parsing
PFRL Pose-Free Reinforcement Learning for 6D Pose Estimation
Auto-Encoding Twin-Bottleneck Hashing
Blurry Video Frame Interpolation
Interpreting the Latent Space of GANs for Semantic Face Editing
Noise-Aware Fully Webly Supervised Object Detection
Towards Backward-Compatible Representation Learning
Fast Texture Synthesis via Pseudo Optimizer
Learning Fused Pixel and Feature-Based View Reconstructions for Light Fields
Point-GNN Graph Neural Network for 3D Object Detection in a
Polishing Decision-Based Adversarial Noise With a Customized Sampling
PV-RCNN Point-Voxel Feature Set Abstraction for 3D Object Detection
SpSequenceNet Semantic Segmentation Network on 4D Point Clouds
Towards Universal Representation Learning for Deep Face Recognition
Unsupervised Deep Shape Descriptor With Point Distribution Learning
Weakly-Supervised Action Localization by Generative Attention Modeling
Where Am I Looking At Joint Location and Orientation Estimation
3D Photography Using Context-Aware Layered Depth Inpainting
Robust Reference-Based Super-Resolution With Similarity-Aware Deformable Convolution
Semantic Pyramid for Image Generation
ALFRED A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
ViewAL Active Learning With Viewpoint Entropy for Semantic Segmentation
Visual Grounding in Video for Unsupervised Word Translation
Adaptive Subspaces for Few-Shot Learning
Barycenters of Natural Images Constrained Wasserstein Barycenters for Image
Don't Judge an Object by Its Context Learning to Overcome
Filter Response Normalization Layer Eliminating Batch Dependence in the Training
Inferring Attention Shift Ranks of Objects for Image Saliency
Deep Parametric Shape Predictions Using Distance Fields
A Morphable Face Albedo Model
15 Keypoints Is All You Need
F-BRS Rethinking Backpropagating Refinement for Interactive Segmentation
Meta-Transfer Learning for Zero-Shot Super-Resolution
Efficient Derivative Computation for Cumulative B-Splines on Lie Groups
Channel Attention Based Iterative Residual Learning for Depth Map Super-Resolution
DEPARA Deep Attribution Graph for Deep Knowledge Transferability
HybridPose 6D Object Pose Estimation Under Hybrid Representations
Revisiting the Sibling Head in Object Detector
DeFeat-Net General Monocular Depth via Simultaneous Unsupervised Representation Learning
Same Features Different Day Weakly Supervised Feature Learning for Seasonal
Lighthouse Predicting Lighting Volumes for Spatially-Coherent Illumination
GrappaNet Combining Parallel Imaging With Deep Learning for Multi-Coil MRI
Noise Modeling Synthesis and Classification for Generic Object Anti-Spoofing
Where Does It End - Reasoning About Hidden Surfaces by
Blindly Assess Image Quality in the Wild Guided by a
Instance-Aware Image Colorization
PREDICT CLUSTER Unsupervised Skeleton Based Action Recognition
Gate-Shift Networks for Video Action Recognition
Spatially-Attentive Patch-Hierarchical Network for Adaptive Motion Deblurring
ACNe Attentive Context Normalization for Robust Permutation-Equivariant Learning
Circle Loss A Unified Perspective of Pair Similarity Optimization
Conditional Gaussian Distribution Learning for Open Set Recognition
Disp R-CNN Stereo 3D Object Detection via Shape Prior Guided
Fast Template Matching and Update for Video Object Tracking and
Learning Rank-1 Diffractive Optics for Single-Shot High Dynamic Range Imaging
Reciprocal Learning Networks for Human Trajectory Prediction
Recursive Social Behavior Graph for Trajectory Prediction
Scalability in Perception for Autonomous Driving Waymo Open Dataset
Multi-Path Learning for Object Pose Estimation Across Domains
Two-Stage Peer-Regularized Feature Recombination for Arbitrary Image Style Transfer
EfficientDet Scalable and Efficient Object Detection
Equalization Loss for Long-Tailed Object Recognition
Self-Supervised Human Depth Estimation From Monocular Videos
VecRoad Point-Based Iterative Graph Exploration for Road Graphs Extraction
Polarized Non-Line-of-Sight Imaging
StegaStamp Invisible Hyperlinks in Physical Photographs
A Semi-Supervised Assessor of Neural Architectures
Deep Implicit Volume Compression
Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided
LSM Learning Subspace Minimization for Low-Level Vision
Revisiting Pose-Normalization for Fine-Grained Few-Shot Recognition
Unbiased Scene Graph Generation From Biased Training
Uncertainty-Aware Score Distribution Learning for Action Quality Assessment
Unsupervised Domain Adaptation via Structurally Regularized Deep Clustering
Computing Valid P-Values for Image Segmentation by Selective Inference
Alleviation of Gradient Exploding in GANs Fake Can Be Real
Few-Shot Class-Incremental Learning
FastDVDnet Towards Real-Time Deep Video Denoising Without Flow Estimation
SER-FIQ Unsupervised Estimation of Face Image Quality Based on Stochastic
StyleRig Rigging StyleGAN for 3D Control Over Portrait Images
Dynamic Fluid Surface Reconstruction Using Deep Neural Network
TDAN Temporally-Deformable Alignment Network for Video Super-Resolution
End-to-End Model-Free Reinforcement Learning for Urban Driving Using Implicit Affordances
Distilled Semantics for Comprehensive Scene Understanding from Videos
Transform and Tell Entity-Aware News Image Captioning
GLU-Net Global-Local Universal Network for Dense Flow and Correspondences
Self-Supervised Learning of Video-Induced Visual Invariances
STAViS Spatio-Temporal AudioVisual Saliency Network
Learning From Web Data With Self-Organizing Memory Module
Physically Realizable Adversarial Examples for LiDAR Object Detection
Single-View View Synthesis With Multiplane Images
VSGNet Spatial Attention Network for Detecting Human Object Interactions Using
Learning When and Where to Zoom With Deep Reinforcement Learning
UNAS Differentiable Architecture Search Meets Reinforcement Learning
Butterfly Transform An Efficient FFT Based Neural Architecture Design
Mixture Dense Regression for Object Detection and Human Pose Estimation
VQA With No Questions-Answers Training
ProAlignNet Unsupervised Learning for Progressively Aligning Noisy Contours
Toward a Universal Model for Shape From Texture
Dynamic Convolutions Exploiting Spatial Sparsity for Faster Inference
Siam R-CNN Visual Tracking by Re-Detection
PointPainting Sequential Fusion for 3D Object Detection
NestedVAE Isolating Common Factors via Weak Supervision
MoreFusion Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion
Learning 3D Semantic Scene Graphs From 3D Indoor Reconstructions
Bringing Old Photos Back to Life
FBNetV2 Differentiable Neural Architecture Search for Spatial and Channel Dimensions
On Vocabulary Reliance in Scene Text Recognition
Reflection Scene Separation From a Single Image
Super-BPD Super Boundary-to-Pixel Direction for Fast Image Segmentation
3DV 3D Dynamic Voxel for Action Recognition in Depth Video
A Model-Driven Deep Neural Network for Single Image Rain Removal
Active Vision for Early Recognition of Human Actions
Affinity Graph Supervision for Visual Recognition
APQ Joint Search for Network Architecture Pruning and Quantization Policy
Attentive Normalization for Conditional Image Generation
BiDet An Efficient Binarized Object Detector
BiFuse Monocular 360 Depth Estimation via Bi-Projection Fusion
Cascaded Refinement Network for Point Cloud Completion
CenterMask Single Shot Instance Segmentation With Point Representation
CNN-Generated Images Are Surprisingly Easy to Spot… for Now
Collaborative Distillation for Ultra-Resolution Universal Style Transfer
Combining Detection and Tracking for Human Pose Estimation in Videos
ContourNet Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text
Cross-Batch Memory for Embedding Learning
Cross-Domain Face Presentation Attack Detection via Multi-Domain Disentangled Representation Learning
Cross-Modal Pattern-Propagation for RGB-T Tracking
Deep Degradation Prior for Low-Quality Image Classification
Deep Distance Transform for Tubular Structure Segmentation in CT Scans
Deep Generative Model for Robust Imbalance Classification
Deep Spatial Gradient and Temporal Depth Learning for Face Anti-Spoofing
DeepFLASH An Efficient Network for Learning-Based Medical Image Registration
Differential Treatment for Stuff and Things A Simple Unsupervised Domain
Discovering Human Interactions With Novel Objects via Zero-Shot Learning
Diversified Arbitrary Style Transfer via Deep Feature Perturbation
DNU Deep Non-Local Unrolling for Computational Spectral Imaging
Dual Super-Resolution Learning for Semantic Segmentation
Dynamic Face Video Segmentation via Reinforcement Learning
ECA-Net Efficient Channel Attention for Deep Convolutional Neural Networks
EventSR From Asynchronous Events to Image Reconstruction Restoration and Super-Resolution
Few-Shot Learning of Part-Specific Probability Space for 3D Shape Segmentation
FM2u-Net Face Morphological Multi-Branch Network for Makeup-Invariant Face Verification
FocalMix Semi-Supervised Learning for 3D Medical Image Detection
G3AN Disentangling Appearance and Motion for Video Generation
Hierarchical Human Parsing With Typed Part-Relation Reasoning
Hierarchical Pyramid Diverse Attention Networks for Face Recognition
High-Frequency Component Helps Explain the Generalization of Convolutional Neural Networks
High-Order Information Matters Learning Relation and Topology for Occluded Person
Instance Credibility Inference for Few-Shot Learning
Instance Shadow Detection
Joint Filtering of Intensity Images and Neuromorphic Events for High-Resolution
Learning a Reinforced Agent for Flexible Exposure Bracketing Selection
Learning Combinatorial Solver for Graph Matching
Learning Human-Object Interaction Detection Using Interaction Points
Learning to Cartoonize Using White-Box Cartoon Representations
Lightweight Photometric Stereo for Facial Details Recovery
LT-Net Label Transfer by Learning Reversible Voxel-Wise Correspondence for One-Shot
Mesh-Guided Multi-View Stereo With Pyramid Architecture
MineGAN Effective Knowledge Transfer From GANs to Target Domains With
Mitigating Bias in Face Recognition Using Skewness-Aware Reinforcement Learning
NAS-FCOS Fast Neural Architecture Search for Object Detection
Neural Networks Are More Productive Teachers Than Human Raters Active
Neural Pose Transfer by Spatially Adaptive Instance Normalization
On the General Value of Evidence and Bilingual Scene-Text Visual Question Answering
Orthogonal Convolutional Neural Networks
PANDA A Gigapixel-Level Human-Centric Video Dataset
Pixel Consensus Voting for Panoptic Segmentation
Probabilistic Video Prediction From Noisy Data With a Posterior Confidence
Progressive Adversarial Networks for Fine-Grained Domain Adaptation
Robust Object Detection Under Occlusion With Context-Aware CompositionalNets
Scale-Equalizing Pyramid Convolution for Object Detection
SCOUT Self-Aware Discriminant Counterfactual Explanations
SDC-Depth Semantic Divide-and-Conquer Network for Monocular Depth Estimation
Self-Supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation
Semi-Supervised Learning for Few-Shot Image-to-Image Translation
Sequential 3D Human Pose and Shape Estimation From Point Clouds
Smoothing Adversarial Domain Attack and P-Memory Reconsolidation for Cross-Domain Person
Suppressing Uncertainties for Large-Scale Facial Expression Recognition
TCTS A Task-Consistent Two-Stage Framework for Person Search
Towards Fairness in Visual Recognition Effective Strategies for Bias Mitigation
Tracking by Instance Detection A Meta-Learning Approach
Train in Germany Test in the USA Making 3D Object
Training Noise-Robust Deep Neural Networks via Meta-Learning
Transferable Controllable and Inconspicuous Adversarial Attacks on Person Re-identification With
Transformation GAN for Unsupervised Image Synthesis and Representation Learning
Unsupervised Person Re-Identification via Multi-Label Classification
Video Modeling With Correlation Networks
Visual Commonsense R-CNN
VPLNet Deep Single View Normal Estimation With Vanishing Points and
Weakly Supervised Fine-Grained Image Classification via Gaussian Mixture Model Oriented
What Deep CNNs Benefit From Global Covariance Pooling An Optimization
What Makes Training Multi-Modal Classification Networks Hard
Zero-Assignment Constraint for Graph Matching With Outliers
Probabilistic Pixel-Adaptive Refinement Networks
Mapillary Street-Level Sequences A Dataset for Lifelong Place Recognition
Footprints and Free Space From a Single Color Image
RoutedFusion Learning Real-Time Depth Map Fusion
A Physics-Based Noise Formation Model for Extreme Low-Light Raw Denoising
Combating Noisy Labels by Agreement A Joint Training Method with
Label Decoupling Framework for Salient Object Detection
Learning Visual Emotion Representations From Web Data
Multi-Modality Cross Attention Network for Image and Sentence Matching
Multi-Path Region Mining for Weakly Supervised 3D Semantic Segmentation on
Universal Weighting Metric Learning for Cross-Modal Matching
View-GCN View-Based Graph Convolutional Network for 3D Shape Analysis
Correspondence-Free Material Reconstruction using Sparse Surface Constraints
Point Cloud Completion by Skip-Attention Network With Hierarchical Folding
GNN3DMOT Graph Neural Network for 3D Multi-Object Tracking With 2D-3D
MISC Multi-Condition Injection and Spatially-Adaptive Compositing for Conditional Person Image
Relative Interior Rule in Block-Coordinate Descent
Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level
SynSin End-to-End View Synthesis From a Single Image
On the Distribution of Minima in Intrinsic-Metric Rotation Averaging
Dynamic Traffic Modeling From Overhead Imagery
A Multigrid Method for Efficiently Training Video Models
Bidirectional Graph Reasoning Network for Panoptic Segmentation
Boosting the Transferability of Adversarial Samples via Attention
Cascade EF-GAN Progressive Facial Expression Editing With Local Focuses
Exploring Bottom-Up and Top-Down Cues With Attentive Learning for Webly
Future Video Synthesis With Object Motion Prediction
MEBOW Monocular Estimation of Body Orientation in the Wild
MotionNet Joint Perception and Motion Prediction for Autonomous Driving Based
Multi-View Neural Human Rendering
PhraseCut Language-Based Image Segmentation in the Wild
PQ-NET A Generative Part Seq2Seq Network for 3D Shapes
Rethinking Classification and Localization for Object Detection
Robustness Guarantees for Deep Neural Networks on Videos
Rotation Consistent Margin Loss for Efficient Low-Bit Face Recognition
Self-Supervised Domain-Aware Generative Network for Generalized Zero-Shot Learning
Temporal-Context Enhanced Detection of Heavily Occluded Pedestrians
Towards Global Explanations of Convolutional Neural Networks With Concept Attribution
Unsupervised Learning of Probably Symmetric Deformable 3D Objects From Images
Basis Prediction Networks for Effective Burst Denoising With Large Kernels
Generating and Exploiting Probabilistic Monocular Depth Estimates
Structure Preserving Generative Cross-Domain Learning
Structure-Guided Ranking Loss for Single Image Depth Prediction
Efficient and Robust Shape Correspondence via Sparsity-Enforced Quadratic Assignment
SAPIEN A SimulAted Part-Based Interactive ENvironment
Zooming Slow-Mo Fast and Accurate One-Stage Space-Time Video Super-Resolution
Evade Deep Image Retrieval by Stashing Private Images in the
Multi-Domain Learning for Accurate and Few-Shot Color Constancy in the
One Man's Trash Is Another Man's Treasure Resisting Adversarial Examples
Adversarial Examples Improve Image Recognition
MetaFuse A Pre-trained Fusion Model for Human Pose Estimation
MLCVNet Multi-Level Context VoteNet for 3D Object Detection
Partial Weight Adaptation for Robust DNN Inference
PolarMask Single Shot Instance Segmentation With Polar Representation
Self-Training With Noisy Student Improves ImageNet Classification
Inducing Hierarchical Compositional Model by Sparsifying Generator Network
Fine-Grained Image-to-Image Transformation Towards Visual Recognition
TA-Student VQA Multi-Agents Training by Self-Questioning
Variational Context-Deformable ConvNets for Indoor Scene Parsing
AANet Adaptive Aggregation Network for Efficient Stereo Matching
Attribution in Scale and Space
Cross-Domain Detection via Graph-Induced Prototype Alignment
Deep 3D Portrait From a Single Image
Deep Kinematics Analysis for Monocular 3D Human Pose Estimation
Discriminative Multi-Modality Speech Recognition
End-to-End Illuminant Estimation Based on Deep Metric Learning
EventCap Monocular 3D Capture of High-Speed Human Motions Using an
Explainable Object-Induced Action Decision for Autonomous Vehicles
Exploring Categorical Regularization for Domain Adaptive Object Detection
Fast MSER
GHUM GHUML Generative 3D Human Shape and Articulated Pose
Grid-GCN for Fast and Scalable Point Cloud Learning
G-TAD Sub-Graph Localization for Temporal Action Detection
How to Train Your Deep Multi-Object Tracker
Learning in the Frequency Domain
Learning to Restore Low-Light Images via Decomposition-and-Enhancement
MARMVS Matching Ambiguity Reduced Multiple View Stereo for Efficient Large
On the Acceleration of Deep Learning Model Parallelism With Staleness
Reliable Weighted Optimal Transport for Unsupervised Domain Adaptation
Stylization-Based Architecture for Fast Deep Exemplar Colorization
Unified Dynamic Convolutional Network for Super-Resolution With Variational Degradations
Weakly Supervised Semantic Point Cloud Segmentation Towards 10x Fewer Labels
What Machines See Is Not What They Get Fooling Scene
Holistically-Attracted Wireframe Parsing
Learning Multi-View Camera Relocalization With Graph Neural Networks
Assessing Eye Aesthetics for Automatic Multi-Reference Eye In-Painting
ClusterFit Improving Generalization of Visual Representations
Cooling-Shrinking Attack Blinding the Tracker With Imperceptible Noises
Disparity-Aware Domain Adaptation in Stereo Image Restoration
Learning Multi-Granular Hypergraphs for Video-Based Person Re-Identification
Neural Data Server A Large-Scale Search Engine for Transfer Learning
Optical Flow in Dense Foggy Scenes Using Semi-Supervised Learning
PointASNL Robust Point Clouds Processing Using Nonlocal Neural Networks With
3DSSD Point-Based 3D Single Stage Object Detector
Automatic Neural Network Compression by Sparsity-Quantization Joint Learning A Constrained
CARS Continuous Evolution for Efficient Neural Architecture Search
Cost Volume Pyramid Based Depth Inference for Multi-View Stereo
CPR-GCN Conditional Partial-Residual Graph Convolutional Network in Automated Anatomical Labeling
D3VO Deep Depth Deep Pose and Deep Uncertainty for Monocular
Distilling Knowledge From Graph Convolutional Networks
DPGN Distribution Propagation Graph Network for Few-Shot Learning
Extreme Relative Pose Network Under Hybrid Representations
FaceScape A Large-Scale High Quality 3D Face Dataset and Detailed
FDA Fourier Domain Adaptation for Semantic Segmentation
From Fidelity to Perceptual Quality A Semi-Supervised Approach for Low-Light
Gated Channel Transformation for Visual Recognition
Graph-Structured Referring Expression Reasoning in the Wild
Hierarchical Feature Embedding for Attribute Recognition
In Perfect Shape Certifiably Optimal 3D Shape Reconstruction From 2D
IntrA 3D Intracranial Aneurysm Dataset for Deep Learning
Learning for Video Compression With Hierarchical Quality and Recurrent Enhancement
Learning Texture Transformer Network for Image Super-Resolution
Learning to Cluster Faces via Confidence and Connectivity Estimation
Learning to Generate 3D Training Data Through Hybrid Gradient
Learning to Manipulate Individual Objects in an Image
Learning Unseen Concepts via Hierarchical Decomposition and Composition
One-Shot Domain Adaptation for Face Generation
PFCNN Convolutional Neural Networks on 3D Surfaces Using Parallel Frames
Phase Consistent Ecological Domain Adaptation
Predicting Goal-Directed Human Attention Using Inverse Reinforcement Learning
Resolution Adaptive Networks for Efficient Inference
Reverse Perspective Network for Perspective-Aware Object Counting
ROAM Recurrently Optimizing Tracking Model
Rotation Equivariant Graph Convolutional Network for Spherical Image Classification
Self-Learning Video Rain Streak Removal When Cyclic Consistency Meets Temporal
Spatial-Temporal Graph Convolutional Network for Video-Based Person Re-Identification
Superpixel Segmentation With Fully Convolutional Networks
SurfelGAN Synthesizing Realistic Sensor Data for Autonomous Driving
SwapText Image Based Texts Transfer in Scenes
Telling Left From Right Learning Spatial Correspondence of Sight and
Towards Photo-Realistic Virtual Try-On by Adaptively Generating-Preserving Image Content
TransMoMo Invariance-Driven Unsupervised Video Motion Retargeting
Upgrading Optical Flow to 3D Scene Flow Through Optical Expansion
WaveletStereo Learning Wavelet Coefficients of Disparity Map in Stereo Matching
BlendedMVS A Large-Scale Dataset for Generalized Multi-View Stereo Networks
Front2Back Single View 3D Shape Reconstruction via Front to Back
Quasi-Newton Solver for Robust Non-Rigid Registration
Video Playback Rate Perception for Self-Supervised Spatio-Temporal Representation Learning
Syn2Real Transfer Learning for Image Deraining Using Gaussian Processes
Orderless Recurrent Models for Multi-Label Classification
Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN
Distilling Cross-Task Knowledge via Relationship Matching
Few-Shot Learning via Embedding Adaptation With Set-to-Set Functions
HVNet Hybrid Voxel Network for LiDAR Based 3D Object Detection
Light-weight Calibrator A Separable Component for Unsupervised Domain Adaptation
Probabilistic Structural Latent Representation for Unsupervised Embedding
RPM-Net Robust Point Matching Using Learned Features
Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting
Unpaired Portrait Drawing Generation via Asymmetric Cycle Mapping
Neural Cages for Detail-Preserving 3D Deformations
A Unified Object Motion and Affinity Model for Online Multi-Object
Accurate Estimation of Body Height From a Single Depth Image
Dreaming to Distill Data-Free Knowledge Transfer via DeepInversion
LiDAR-Based Online 3D Video Object Detection With Graph-Based Message Passing
From Patches to Pictures PaQ-2-PiQ Mapping the Perceptual Space of
GIFnets Differentiable GIF Encoding Framework
Rethinking Data Augmentation for Image Super-resolution A Comprehensive Analysis and
GAMIN Generative Adversarial Multiple Imputation Network for Highly Missing Data
Novel View Synthesis of Dynamic Scenes With Globally Coherent Depths
GreedyNAS Towards Fast One-Shot NAS With Greedy Supernet
KeypointNet A Large-Scale 3D Keypoint Dataset Aggregated From Numerous Human
L2-GCN Layer-Wise and Learned Efficient Training of Graph Convolutional Networks
Non-Line-of-Sight Surface Reconstruction Using the Directional Light-Cone Transform
OrigamiNet Weakly-Supervised Segmentation-Free One-Step Full Page Text Recognition by learning
BDD100K A Diverse Driving Dataset for Heterogeneous Multitask Learning
C2FNAS Coarse-to-Fine Neural Architecture Search for 3D Medical Image Segmentation
COCAS A Large-Scale Clothes Changing Person Dataset for Re-Identification
Context Prior for Scene Segmentation
Deformable Siamese Attention Networks for Visual Object Tracking
Determinant Regularization for Gradient-Efficient Graph Matching
Episode-Based Prototype Generating Network for Zero-Shot Learning
Fast-MVSNet Sparse-to-Dense Multi-View Stereo With Learned Propagation and Gauss-Newton Refinement
FOAL Fast Online Adaptive Learning for Cardiac Motion Estimation
HUMBI A Large Multiview Dataset of Human Body Expressions
Learning Video Stabilization Using Optical Flow
Searching Central Difference Convolutional Networks for Face Anti-Spoofing
Towards Accurate Scene Text Recognition With Semantic Reasoning Networks
TransMatch A Transfer-Learning Scheme for Semi-Supervised Few-Shot Learning
Unsupervised Representation Learning for Gaze Estimation
Weakly Supervised Discriminative Feature Learning With State Information for Person
Central Similarity Quantization for Efficient Image and Video Retrieval
Efficient Dynamic Scene Deblurring Using Spatially Variant Deconvolution Network With
Ensemble Generative Cleaning With Feedback Loops for Defending Adversarial Attacks
Plug-and-Play Algorithms for Large-Scale Snapshot Compressive Imaging
Revisiting Knowledge Distillation via Label Smoothing Regularization
Supervised Raw Video Denoising With a Benchmark Dataset on Dynamic
Regularizing Class-Wise Predictions via Self-Knowledge Distillation
Old Is Gold Redefining the Adversarially Learned One-Class Classifier Training
Autolabeling 3D Objects With Differentiable Rendering of SDF Shape Priors
CycleISP Real Image Restoration via Improved Data Synthesis
Robust Learning Through Cross-Task Consistency
TomoFluid Reconstructing Dynamic Fluid From Sparse View Videos
Weakly Supervised Visual Semantic Parsing
3D Human Mesh Regression With Dense Correspondence
Bundle Pooling for Polygonal Architecture Segmentation Problem
Dense Regression Network for Video Grounding
Gum-Net Unsupervised Geometric Matching for Fast and Accurate 3D Subtomogram
Hierarchical Clustering With Hard-Batch Triplet Loss for Person Re-Identification
Visual Reaction Learning to Play Catch With Your Drone
AD-Cluster Augmented Discriminative Clustering for Domain Adaptive Person Re-Identification
Deep Structure-Revealed Network for Texture Recognition
Online Deep Clustering for Unsupervised Representation Learning
Self-Supervised Scene De-Occlusion
4D Association Graph for Realtime Multi-Person Motion Capture Using Multiple
A Transductive Approach for Video Object Segmentation
Adaptive Graph Convolutional Network With Attention Graph Clustering for Co-Saliency
Auxiliary Training Towards Accurate and Robust Models
Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive
Context Aware Graph Convolution for Skeleton-Based Action Recognition
Context-Aware and Scale-Insensitive Temporal Repetition Counting
Context-Aware Attention Network for Image-Text Retrieval
Conv-MPN Convolutional Message Passing Neural Network for Structured Outdoor Architecture
Copy and Paste GAN Face Hallucination From Shaded Thumbnails
Correlating Edge Pose With Parsing
Cross-Domain Correspondence Learning for Exemplar-Based Image Translation
DAVD-Net Deep Audio-Aided Video Decompression of Talking Heads
Deblurring by Realistic Blurring
Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection
Deep Unfolding Network for Image Super-Resolution
DeepEMD Few-Shot Image Classification With Differentiable Earth Movers Distance and
Depth Sensing Beyond LiDAR Range
Distilling Effective Supervision From Severe Label Noise
Distribution-Aware Coordinate Representation for Human Pose Estimation
Dynamic Graph Message Passing Networks
Exemplar Normalization for Learning Deep Representation
Fixed-Point Back-Propagation Training
FReeNet Multi-Identity Face Reenactment
Fusing Wearable IMUs With Multi-View Images for Human Pose Estimation
Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation
Generating 3D People in Scenes Without People
Global-Local GCN Large-Scale Label Noise Cleansing for Face Recognition
Interactive Object Segmentation With Inside-Outside Guidance
Mask Encoding for Single Shot Instance Segmentation
Memory-Efficient Hierarchical Neural Architecture Search for Image Denoising
METAL Minimum Effort Temporal Activity Localization in Untrimmed Videos
Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-Based Person Re-Identification
Nested Scale-Editing for Conditional Image Synthesis
Object Relational Graph With Teacher-Recommended Learning for Video Captioning
Object-Occluded Human Shape and Pose Estimation From a Single Color
Online Depth Learning Against Forgetting in Monocular Videos
Overcoming Multi-Model Forgetting in One-Shot NAS With Diversity Maximization
Part-Aware Context Network for Human Parsing
PolarNet An Improved Grid Representation for Online LiDAR Point Clouds
Putting Visual Object Recognition in Context
Quaternion Product Units for Deep Learning on 3D Rotation Groups
Relation-Aware Global Attention for Person Re-Identification
Rethinking the Route Towards Weakly Supervised Object Localization
Select Supplement and Focus for RGB-D Saliency Detection
Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition
State-Relabeling Adversarial Active Learning
STINet Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction
Texture and Shape Biased Two-Stream Networks for Clothing Classification and
The Secret Revealer Generative Model-Inversion Attacks Against Deep Neural Networks
Transferring and Regularizing Prediction for Semantic Segmentation
UC-Net Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders
Understanding Adversarial Examples From the Mutual Influence of Images and
Unsupervised Adaptation Learning for Hyperspectral Imagery Super-Resolution
WCP Worst-Case Perturbations for Semi-Supervised Deep Learning
Weakly-Supervised Salient Object Detection via Scribble Annotations
Where Does It Exist Spatio-Temporal Video Grounding for Multi-Form Sentences
ZSTAD Zero-Shot Temporal Activity Detection
A Certifiably Globally Optimal Solution to Generalized Essential Matrix Estimation
Bayesian Adversarial Human Motion Synthesis
Clean-Label Backdoor Attacks on Video Recognition Models
Domain Decluttering Simplifying Images to Mitigate Synthetic-Real Domain Shift and
Exploring Self-Attention for Image Recognition
Knowledge As Priors Cross-Modal Knowledge Generalization for Datasets Without Superior
Learning Deep Network for Detecting 3D Object Keypoints and 6D
Maintaining Discrimination and Fairness in Class Incremental Learning
MaskFlownet Asymmetric Feature Matching With Learnable Occlusion Mask
On Isometry Robustness of Deep 3D Point Cloud Models Under
Painting Many Pasts Synthesizing Time Lapse Videos of Paintings
Predicting Lymph Node Metastasis Using Histopathological Images Based on Multiple
RDCFace Radial Distortion Correction for Face Recognition
SESS Self-Ensembling Semi-Supervised 3D Object Detection
Towards Better Generalization Joint Depth-Pose Learning Without PoseNet
Towards Large Yet Imperceptible Adversarial Image Perturbations With Perceptual Color
UCTGAN Diverse Image Inpainting Based on Unsupervised Cross-Space Translation
Joint Semantic Segmentation and Boundary Detection Using Iterative Pyramid Contexts
Cross-domain Object Detection through Coarse-to-Fine Feature Adaptation
Deep Metric Learning via Adaptive Learnable Assessment
Distribution-Induced Bidirectional Generative Adversarial Network for Graph Representation Learning
Efficient Adversarial Training With Transferable Adversarial Examples
Foreground-Aware Relation Network for Geospatial Object Segmentation in High Spatial
Image Demoireing with Learnable Bandpass Filters
Learning to Shadow Hand-Drawn Sketches
Optical Flow in the Dark
Rethinking Performance Estimation in Neural Architecture Search
Syntax-Aware Action Targeting for Video Captioning
Webly Supervised Knowledge Embedding Model for Visual Reasoning
What Does Plate Glass Reveal About Camera Calibration
Minimizing Discrete Total Curvature for Image Processing
Regularizing CNN Transfer Learning With Randomized Regression
Robust Partial Matching for Person Search in the Wild
Squeeze-and-Attention Networks for Semantic Segmentation
BBN Bilateral-Branch Network With Cumulative Learning for Long-Tailed Visual Recognition
Cascaded Human-Object Interaction Recognition
DaST Data-Free Substitute Training for Adversarial Attacks
Deepstrip High-Resolution Boundary Refinement
DuDoRNet Learning a Dual-Domain Recurrent Network for Fast MRI Reconstruction
EcoNAS Finding Proxies for Economical Neural Architecture Search
End-to-End Adversarial-Attention Network for Multi-Modal Clustering
Geometry and Learning Co-Supported Normal Estimation for Unstructured Point Cloud
Interactive Two-Stream Decoder for Accurate and Fast Saliency Detection
Joint 3D Instance Segmentation and Object Detection for Autonomous Driving
KFNet Learning Temporal Camera Relocalization Using Kalman Filtering
Learning Oracle Attention for High-Fidelity Face Completion
Learning Saliency Propagation for Semi-Supervised Instance Segmentation
Learning to Select Base Classes for Few-Shot Classification
LG-GAN Label Guided Adversarial Network for Flexible Targeted Attack of
Look-Into-Object Self-Supervised Structure Modeling for Object Recognition
Monocular Real-Time Hand Shape and Motion Capture Using Multi-Modal Data
More Grounded Image Captioning by Distilling Image-Text Matching Model
Multi-Mutual Consistency Induced Transfer Subspace Learning for Human Motion Segmentation
Online Joint Multi-Metric Adaptation From Frequent Sharing-Subset Mining for Person
Pattern-Structure Diffusion for Multi-Task Learning
Rotate-and-Render Unsupervised Photorealistic Face Rotation From Single-View Images
Spatiotemporal Fusion in 3D CNNs A Probabilistic View
ActBERT Learning Global-Local Video-Text Representations
AdaCoSeg Adaptive Shape Co-Segmentation With Group Consistency Loss
CookGAN Causality Based Text-to-Image Synthesis
Dont Even Look Once Synthesizing Features for Zero-Shot Detection
Inflated Episodic Memory With Region Self-Attention for Long-Tailed Visual Recognition
MetaIQA Deep Meta-Learning for No-Reference Image Quality Assessment
Private-kNN Practical Differential Privacy for Computer Vision
ReDA Reinforced Differentiable Attribute for 3D Face Reconstruction
Retina-Like Visual Image Reconstruction via Spiking Neural Model
S3VAE Self-Supervised Sequential VAE for Representation Disentanglement and Data Generation
SEAN Image Synthesis With Semantic Region-Adaptive Normalization
Semantically Multi-Modal Image Synthesis
The Edge of Depth Explicit Constraints Between Segmentation and Depth
Towards Unified INT8 Training for Convolutional Neural Network
Vision-Dialog Navigation by Exploring Cross-Modal Memory
Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks
Training Quantized Neural Networks With a Full-Precision Auxiliary Module
Unsupervised Learning From Video With Deep Neural Embeddings
Cogradient Descent for Bilinear Optimization
Deep Residual Flow for Out of Distribution Detection
Sequential Motif Profiles and Topological Plots for Offline Signature Verification
Towards Robust Image Classification Using Sequential Attention Models
Deep Adversarial Decomposition A Unified Framework for Separating Superimposed Images
