CVPR2023

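CVPR 2023 Accepted Papers

CVPR 2023 statistics:

Submitted: 9155 papers
Accepted: 2360 papers (25.8% acceptance rate)
Highlights: 235 papers (10% of accepted papers, 2.6% of submitted papers)
Award candidates: 12 papers (0.51% of accepted papers, 0.13% of submitted papers)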
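
As a quick sanity check, the percentages in the statistics above can be recomputed from the four raw counts. The sketch below is only illustrative: the variable names and the `pct` helper are hypothetical, and the rounding precision (one decimal place for the larger ratios, two for the award-candidate ratios) is an assumption chosen to match how the figures are quoted.

```python
# Raw counts copied from the statistics above.
submitted = 9155
accepted = 2360
highlights = 235
award_candidates = 12


def pct(part, whole, digits=1):
    """Return part/whole as a percentage rounded to `digits` decimal places."""
    return round(100 * part / whole, digits)


# Hypothetical names; each value is a share expressed in percent.
acceptance_rate = pct(accepted, submitted)
highlight_of_accepted = pct(highlights, accepted)
highlight_of_submitted = pct(highlights, submitted)
award_of_accepted = pct(award_candidates, accepted, digits=2)
award_of_submitted = pct(award_candidates, submitted, digits=2)

print(acceptance_rate, highlight_of_accepted, highlight_of_submitted)
print(award_of_accepted, award_of_submitted)
```

Under these rounding assumptions the recomputed values agree with the quoted ones; the "10%" highlight share is 9.96% before rounding.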


List of accepted papers (pending plagiarism and dual-submission checks):

Generating Human Motion from Textual Descriptions with High Quality Discrete Representation
Jianrong Zhang · Yangsong Zhang · Xiaodong Cun · Yong Zhang · Hongwei Zhao · Hongtao Lu · Xi SHEN · Ying Shan
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Wenxuan Zhang · Xiaodong Cun · Xuan Wang · Yong Zhang · Xi SHEN · Yu Guo · Ying Shan · Fei Wang
Explicit Visual Prompting for Low-Level Structure Segmentations
Weihuang Liu · Xi SHEN · Chi-Man Pun · Xiaodong Cun
Privacy-preserving Adversarial Facial Features
Zhibo Wang · He Wang · Shuaifan Jin · Wenwen Zhang · Jiahui Hu · Yan Wang · Peng Sun · Wei Yuan · Kaixin Liu · Kui Ren
NeRF-RPN: A general framework for object detection in NeRFs
Benran Hu · Junkai Huang · Yichen Liu · Yu-Wing Tai · Chi-Keung Tang
Category Query Learning for Human-Object Interaction Classification
Chi Xie · Fangao Zeng · Yue Hu · Shuang Liang · Yichen Wei
A Unified Pyramid Recurrent Network for Video Frame Interpolation
Xin Jin · LONG WU · Jie Chen · Chen Youxin · Jay Koo · Cheul-hee Hahm
SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field
Chong Bao · Yinda Zhang · Bangbang Yang · Tianxing Fan · Zesong Yang · Hujun Bao · Guofeng Zhang · Zhaopeng Cui
PATS: Patch Area Transportation with Subdivision for Local Feature Matching
Junjie Ni · Yijin Li · Zhaoyang Huang · Hongsheng Li · Zhaopeng Cui · Hujun Bao · Guofeng Zhang
DualVector: Unsupervised Vector Font Synthesis with Dual-Part Representation
Ying-Tian Liu · Zhifei Zhang · Yuan-Chen Guo · Matthew Fisher · Zhaowen Wang · Song-Hai Zhang
Towards Robust Tampered Text Detection in Document Image: New DataSet and New Solution
Chenfan Qu · Chongyu · Yuliang Liu · Xinhong Chen · Dezhi Peng · Fengjun Guo · Lianwen Jin
Panoswin: A Pano-Style Swin Transformer for Panorama Understanding
Zhixin Ling · Zhen Xing · Xiangdong Zhou · Man Cao · Guichun Zhou
SVFormer: Semi-supervised Video Transformer for Action Recognition
Zhen Xing · Qi Dai · Han Hu · Jingjing Chen · Zuxuan Wu · Yu-Gang Jiang
Multi-Object Manipulation via Object-Centric Neural Scattering Functions
Stephen Tian · Yancheng Cai · Hong-Xing Yu · Sergey Zakharov · Katherine Liu · Adrien Gaidon · Yunzhu Li · Jiajun Wu
RealImpact: A Dataset of Impact Sound Fields for Real Objects
Samuel Clarke · Ruohan Gao · Mason L Wang · Mark Rau · Julia Xu · Jui-Hsien Wang · Doug James · Jiajun Wu
3D Neural Field Generation using Triplane Diffusion
Jesse Shue · Eric Chan · Ryan Po · Zachary Ankner · Jiajun Wu · Gordon Wetzstein
Putting People in Their Place: Affordance-Aware Human Insertion into Scenes
Sumith Kulal · Tim Brooks · Alex Aiken · Jiajun Wu · Jimei Yang · Jingwan Lu · Alexei A. Efros · Krishna Kumar Singh
Towards Effective Visual Representations for Partial-Label Learning
Shiyu Xia · Jiaqi Lyu · Ning Xu · Gang Niu · Xin Geng
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation
Zhen Li · Zuo-Liang Zhu · Ling-Hao Han · Qibin Hou · Chunle Guo · Ming-Ming Cheng
DNF: Decouple and Feedback Network for Seeing in the Dark
Xin Jin · Ling-Hao Han · Zhen Li · Chunle Guo · Zhi Chai · Chongyi Li
Spectral Enhanced Rectangle Transformer for Hyperspectral Image Denoising
Miaoyu Li · Ji Liu · Ying Fu · Yulun Zhang · Dejing Dou
Dynamic Aggregated Network for Gait Recognition
Kang Ma · Ying Fu · Dezhi Zheng · Chunshui Cao · Xuecai Hu · Yongzhen Huang
LG-BPN: Local and Global Blind-Patch Network for Self-Supervised Real-World Denoising
ZiChun Wang · Ying Fu · Ji Liu · Yulun Zhang
Real-Time Neural Light Field on Mobile Devices
Junli Cao · Huan Wang · Pavlo Chemerys · Vladislav Shakhrai · Ju Hu · Yun Fu · Denys Makoviichuk · Sergey Tulyakov · Jian Ren
ScaleDet: A Scalable Multi-Dataset Object Detector
Yanbei Chen · Manchen Wang · Abhay Mittal · Zhenlin Xu · Paolo Favaro · Joseph Tighe · Davide Modolo
All in One: Exploring Unified Video-Language Pre-training
Jinpeng Wang · Yixiao Ge · Rui Yan · Yuying Ge · Kevin Qinghong Lin · Satoshi Tsutsui · Xudong Lin · Guanyu Cai · Jianping WU · Ying Shan · Xiaohu Qie · Mike Zheng Shou
Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
Ziyun Zeng · Yuying Ge · Xihui Liu · Bin Chen · Ping Luo · Shu-Tao Xia · Yixiao Ge
KD-GAN: Data Limited Image Generation via Knowledge Distillation
Kaiwen Cui · Yingchen Yu · Fangneng Zhan · Shengcai Liao · Shijian Lu · Eric Xing
Mapping Degeneration Meets Label Evolution: Learning Infrared Small Target Detection with Single Point Supervision
Xinyi Ying · Li Liu · Yingqian Wang · Ruojing Li · Nuo Chen · Zaiping Lin · Weidong Sheng · Shilin Zhou
Logical Consistency and Greater Descriptive Power for Facial Hair Attribute Learning
Haiyu Wu · Grace Bezold · Aman Bhatta · Kevin Bowyer
Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
Gyeongman Kim · Hajin Shim · Hyunsu Kim · Yunjey Choi · Junho Kim · Eunho Yang
3D Video Object Detection with Learnable Object-Centric Global Optimization
Jiawei He · Yuntao Chen · Naiyan Wang · Zhaoxiang Zhang
BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision
Chenyu Yang · Yuntao Chen · Hao Tian · Chenxin Tao · Xizhou Zhu · Zhaoxiang Zhang · Gao Huang · Hongyang Li · Yu Qiao · Lewei Lu · Jie Zhou · Jifeng Dai
MarS3D: A Plug-and-Play Motion-Aware Model for Semantic Segmentation on Multi-Scan 3D Point Clouds
Jiahui Liu · Chirui CHANG · Jianhui Liu · Xiaoyang Wu · Lan Ma · XIAOJUAN QI
Understanding Imbalanced Semantic Segmentation Through Neural Collapse
Zhisheng Zhong · Jiequan Cui · Yibo Yang · Xiaoyang Wu · XIAOJUAN QI · Xiangyu Zhang · Jiaya Jia
Hierarchical Dense Correlation Distillation for Few-Shot Segmentation
Bohao PENG · Zhuotao Tian · Xiaoyang Wu · Chengyao Wang · Shu Liu · Jingyong Su · Jiaya Jia
Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning
Xiaoyang Wu · Xin Wen · Xihui Liu · Hengshuang Zhao
Self-Correctable and Adaptable Inference for Generalizable Human Pose Estimation
Zhehan Kan · Shuoshuo Chen · Ce Zhang · Yushun Tang · Zhihai He
Neuro-Modulated Hebbian Learning for Fully Test-Time Adaptation
Yushun Tang · Ce Zhang · Heng Xu · Shuoshuo Chen · Jie Cheng · Luziwei Leng · Qinghai Guo · Zhihai He
Noisy Correspondence Learning with Meta Similarity Correction
Haochen Han · Kaiyao Miao · Qinghua Zheng · Minnan Luo
Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency
Xiaogeng Liu · Minghui Li · Haoyu Wang · Shengshan Hu · Dengpan Ye · Libing Wu · Chaowei Xiao
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
Jiang Liu · Hui Ding · Zhaowei Cai · Yuting Zhang · Ravi Satzoda · Vijay Mahadevan · R. Manmatha
Glocal Energy-based Learning for Few-Shot Open-Set Recognition
Haoyu Wang · Guansong Pang · Peng Wang · Lei Zhang · Wei Wei · Yanning Zhang
PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection
Linfeng Zhang · Runpei Dong · Hung-Shuo Tai · Kaisheng Ma
LipFormer: High-fidelity and Generalizable Talking Face Generation with A Pre-learned Facial Codebook
Jiayu Wang · Kang Zhao · Shiwei Zhang · Yingya Zhang · Yujun Shen · Deli Zhao · Jingren Zhou
High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning
Chao Xu · Junwei Zhu · Jiangning Zhang · Yue Han · Wenqing Chu · Ying Tai · Chengjie Wang · Zhifeng Xie · Yong Liu
EC^2: Emergent Communication for Embodied Control
Yao Mu · Shunyu Yao · Mingyu Ding · Ping Luo · Chuang Gan
Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss
Anas Mahmoud · Jordan Sir Kwang Hu · Tianshu Kuai · Ali Harakeh · Liam Paull · Steven Waslander
Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection
Vibashan Vishnukumar Sharmini · Poojan Oza · Vishal Patel
Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations
Vibashan Vishnukumar Sharmini · Ning Yu · Chen Xing · Can Qin · Mingfei Gao · Juan Carlos Niebles · Vishal Patel · Ran Xu
STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition
Xiaoyu Zhu · Po-Yao Huang · Junwei Liang · Celso de Melo · Alexander Hauptmann
DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks
Qiangqiang Wu · Tianyu Yang · Ziquan Liu · Baoyuan Wu · Ying Shan · Antoni Chan
TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization
Ziquan Liu · Yi Xu · Xiangyang Ji · Antoni Chan
Optimal Transport Minimization: Crowd Localization on Density Maps for Semi-Supervised Counting
Wei Lin · Antoni Chan
Music-Driven Group Choreography
Nhat Le · Trong Thang Pham · Tuong Do · Erman Tjiputra · Quang Tran · Anh Nguyen
Where is my Wallet? Modeling Object Proposal Sets for Egocentric Visual Query Localization
Mengmeng Xu · Yanghao Li · Cheng-Yang Fu · Bernard Ghanem · Tao Xiang · Juan-Manuel Perez-Rua
Rotation-Invariant Transformer for Point Cloud Matching
Hao Yu · Zheng Qin · Ji Hou · Mahdi Saleh · Dongsheng Li · Benjamin Busam · Slobodan Ilic
Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors
Ji Hou · Xiaoliang Dai · Zijian He · Angela Dai · Matthias Niessner
Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data
Yuhao Chen · Xin Tan · Borui Zhao · ZhaoWei CHEN · Renjie Song · Jiajun Liang · Xuequan Lu
Implicit Identity Leakage: The Stumbling Block to Improving Deepfake Detection Generalization
Shichao Dong · Jin Wang · Renhe Ji · Jiajun Liang · Haoqiang Fan · Zheng Ge
EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision
Jiahui Lei · Congyue Deng · Karl Schmeckpeper · Leonidas Guibas · Kostas Daniilidis
SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation
Huimin Huang · Shiao Xie · Lanfen Lin · Tong Ruofeng · Yen-wei Chen · Yuexiang Li · Hong Wang · Yawen Huang · Yefeng Zheng
CNVid-3.5M: Build, Filter, and Pre-train the Large-scale Public Chinese Video-text Dataset
Tian Gan · Qing Wang · Xingning Dong · Xiangyuan Ren · Liqiang Nie · Qingpei Guo
Disentangling Writer and Character Styles for Handwriting Generation
Gang Dai · Yifan Zhang · Qingfeng Wang · Qing Du · Zhuliang Yu · Zhuoman Liu · Shuangping Huang
A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image
Changlong Jiang · Yang Xiao · Cunlin Wu · Mingyang Zhang · Jinghong Zheng · Zhiguo Cao · Joey Zhou
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
Hao Li · Jinguo Zhu · Xiaohu Jiang · Xizhou Zhu · Hongsheng Li · Chun Yuan · Xiaohua Wang · Yu Qiao · Xiaogang Wang · Wenhai Wang · Jifeng Dai
ShapeTalk: A Language Dataset and Framework for 3D Shape Edits and Deformations
Panos Achlioptas · Ian Huang · Minhyuk Sung · Sergey Tulyakov · Leonidas Guibas
Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR
Feng Li · Ailing Zeng · Shilong Liu · Hao Zhang · Hongyang Li · Lionel Ni · Lei Zhang
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
Feng Li · Hao Zhang · Huaizhe Xu · Shilong Liu · Lei Zhang · Lionel Ni · Heung-Yeung Shum
MP-Former: Mask-Piloted Transformer for Image Segmentation
Hao Zhang · Feng Li · Huaizhe Xu · Shijia Huang · Shilong Liu · Lionel Ni · Lei Zhang
Enlarging Instance-specific and Class-specific Information for Open-set Action Recognition
Jun Cen · Shiwei Zhang · Xiang Wang · Yixuan Pei · Zhiwu Qing · Yingya Zhang · Qifeng Chen
MoLo: Motion-augmented Long-short Contrastive Learning for Few-shot Action Recognition
Xiang Wang · Shiwei Zhang · Zhiwu Qing · Changxin Gao · Yingya Zhang · Deli Zhao · Nong Sang
PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning
Huiwei Lin · Baoquan Zhang · Shanshan Feng · Xutao Li · Yunming Ye
Building Rearticulable Models for Arbitrary 3D Objects from 4D Point Clouds
Shaowei Liu · Saurabh Gupta · Shenlong Wang
Slide-Transformer: Hierarchical Vision Transformer with Local Self-Attention
Xuran Pan · Tianzhu Ye · Zhuofan Xia · Shiji Song · Gao Huang
Compressing Volumetric Radiance Fields to 1 MB
Lingzhi Li · Zhen Shen · Zhongshu Wang · Li Shen · Liefeng Bo
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory
Ziniu Hu · Ahmet Iscen · Chen Sun · Zirui Wang · Kai-Wei Chang · Yizhou Sun · Cordelia Schmid · David Ross · Alireza Fathi
Improving Image Recognition by Retrieving from Web-Scale Image-Text Data
Ahmet Iscen · Alireza Fathi · Cordelia Schmid
Learning to Name Classes for Vision and Language Models
Sarah Parisot · Yongxin Yang · Steven McDonagh
SteerNeRF: Accelerating NeRF Rendering via Smooth Viewpoint Trajectory
Sicheng Li · Hao Li · Yue Wang · Yiyi Liao · Lu Yu
Semi-Supervised Video Inpainting with Cycle Consistency Constraints
Zhiliang Wu · Han Xuan · Changchang Sun · Weili Guan · Kang Zhang · Yan Yan
Deep Stereo Video Inpainting
Zhiliang Wu · Changchang Sun · Han Xuan · Yan Yan
VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval
Siteng Huang · Biao Gong · Yulin Pan · Jianwen Jiang · Yiliang Lv · Yuyuan Li · Donglin Wang
NeRF-Supervised Deep Stereo
Fabio Tosi · Alessio Tonioni · Daniele Gregorio · Matteo Poggi
Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding
Zihang Lin · Chaolei Tan · Jian-Fang Hu · Zhi Jin · Tiancai Ye · Wei-Shi Zheng
Hierarchical Semantic Correspondence Networks for Video Paragraph Grounding
Chaolei Tan · Zihang Lin · Jian-Fang Hu · Wei-Shi Zheng · Jianhuang Lai
Combining Implicit-Explicit View Correlation for Light Field Semantic Segmentation
Ruixuan Cong · Da Yang · Rongshan Chen · Sizhe Wang · Zhenglong Cui · Hao Sheng
Improving Robustness of Vision Transformers by Reducing Sensitivity to Patch Corruptions
Yong Guo · David Stutz · Bernt Schiele
DF-Platter: Multi-Face Heterogeneous Deepfake Dataset
Kartik Narayan · Harsh Agarwal · Kartik Thakral · Surbhi Mittal · Mayank Vatsa · Richa Singh
Metadata-Based RAW Reconstruction via Implicit Neural Functions
Leyi Li · Huijie Qiao · Qi Ye · Qinmin Yang
I^2-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs
Jingsen Zhu · Yuchi Huo · Qi Ye · Fujun Luan · Jifan Li · Dianbing Xi · Lisha Wang · Rui Tang · Wei Hua · Hujun Bao · Rui Wang
Polarized Color Image Denoising
Zhuoxiao Li · Haiyang Jiang · Mingdeng Cao · Yinqiang Zheng
NeFII: Inverse Rendering for Reflectance Decomposition with Near-Field Indirect Illumination
Haoqian Wu · Zhipeng Hu · Lincheng Li · Yongqiang Zhang · Changjie Fan · Xin Yu
Balanced Energy Regularization Loss for Out-of-distribution Detection
Hyunjun Choi · Hawook Jeong · Jin Choi
DeCo: Decomposition and Reconstruction for Compositional Temporal Grounding via Coarse-to-Fine Contrastive Ranking
Lijin Yang · Quan Kong · Hsuan-Kung Yang · Wadim Kehl · Yoichi Sato · Norimasa Kobori
CREPE: Can Vision-Language Foundation Models Reason Compositionally?
Zixian Ma · Jerry Hong · Mustafa Omer Gul · Mona Gandhi · Irena Gao · Ranjay Krishna
Painting 3D Nature in 2D: View Synthesis of Natural Scenes from a Single Semantic Mask
Shangzhan Zhang · Sida Peng · Tianrun Chen · Linzhan Mou · Haotong Lin · Kaicheng Yu · Yiyi Liao · Xiaowei Zhou
Learning 3D-aware Image Synthesis with Unknown Pose Distribution
Zifan Shi · Yujun Shen · Yinghao Xu · Sida Peng · Yiyi Liao · Sheng Guo · Qifeng Chen · Dit-Yan Yeung
StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator
Jiazhi Guan · Zhanwang Zhang · Hang Zhou · Tianshu Hu · Kaisiyuan Wang · Dongliang He · Haocheng Feng · Jingtuo Liu · Errui Ding · Ziwei Liu · Jingdong Wang
A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others
Zhiheng Li · Ivan Evtimov · Albert Gordo · Caner Hazirbas · Tal Hassner · Cristian Canton · Chenliang Xu · Mark Ibrahim
Cooperation or Competition: Avoiding Player Domination for Multi-target Robustness by Adaptive Budgets
Yimu Wang · Dinghuai Zhang · Yihan Wu · Heng Huang · Hongyang Zhang
Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues
Stefanie Walz · Mario Bijelic · Andrea Ramazzina · Amanpreet Walia · Fahim Mannan · Felix Heide
SliceMatch: Geometry-guided Aggregation for Cross-View Pose Estimation
Zimin Xia · Holger Caesar · Julian Kooij · Ted Lentsch
Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations
Lei Hsiung · Yun-Yun Tsai · Pin-Yu Chen · Tsung-Yi Ho
StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer
Sasikarn Khwanmuang · Pakkapon Phongthawee · Patsorn Sangkloy · Supasorn Suwajanakorn
Learning Geometric-aware Properties in 2D Representation Using Lightweight CAD Models, or Zero Real 3D Pairs
Pattaramanee Arsomngern · Sarana Nutanong · Supasorn Suwajanakorn
Visibility Constrained Wide-band Illumination Spectrum Design for Seeing-in-the-Dark
Muyao Niu · Zhuoxiao Li · Zhihang Zhong · Yinqiang Zheng
ToThePoint: Efficient Contrastive Learning of 3D Point Clouds via Recycling
Xinglin Li · Jiajing Chen · Jinhui Ouyang · Hanhui Deng · Senem Velipasalar · Di Wu
AUNet: Learning Relations Between Action Units for Face Forgery Detection
Weiming Bai · Yufan Liu · Zhipeng Zhang · Bing Li · Weiming Hu
Physical-World Optical Adversarial Attacks on 3D Face Recognition
Yanjie Li · Yiquan Li · Xuelong Dai · Songtao Guo · Bin Xiao
Robust Single Image Reflection Removal Against Adversarial Attacks
Zhenbo Song · Zhenyuan Zhang · Kaihao Zhang · Wenhan Luo · Zhaoxin Fan · Wenqi Ren · Jianfeng Lu
The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training
Junhao Dong · Seyed-Mohsen Moosavi-Dezfooli · Jianhuang Lai · Xiaohua Xie
Boosting Accuracy and Robustness of Student Models via Adaptive Adversarial Distillation
Bo Huang · Mingyang Chen · Yi Wang · JUNDA LU · Minhao Cheng · Wei Wang
Introducing Competition to Boost the Transferability of Targeted Adversarial Examples through Clean Feature Mixup
Junyoung Byun · Myung-Joon Kwon · Seungju Cho · Yoonji Kim · Changick Kim
Angelic Patches for Improving Third-Party Object Detector Performance
Wenwen Si · Shuo Li · Sangdon Park · Insup Lee · Osbert Bastani
Sibling-Attack: Rethinking Transferable Adversarial Attacks against Face Recognition
Zexin Li · Bangjie Yin · Taiping Yao · Junfeng Guo · Shouhong Ding · Simin Chen · Cong Liu
A Practical Upper Bound for the Worst-Case Attribution Deviations
Fan Wang · Adams Kong
You Are Catching My Attention: Are Vision Transformers Bad Learners under Backdoor Attacks?
Zenghui Yuan · Pan Zhou · Kai Zou · Yu Cheng
Architectural Backdoors in Neural Networks
Mikel Bober-Irizar · Ilia Shumailov · Yiren Zhao · Robert Mullins · Nicolas Papernot
The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection
Simin Chen · Hanlin Chen · Mirazul Haque · Cong Liu · Wei Yang
StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning
Yuqian Fu · YU XIE · Yanwei Fu · Yu-Gang Jiang
Rethinking Domain Generalization for Face Anti-spoofing: Separability and Alignment
Yiyou Sun · Yaojie Liu · Xiaoming Liu · Yixuan Li · Vincent Chu
Make Landscape Flatter in Differentially Private Federated Learning
Yifan Shi · Yingqi Liu · Kang Wei · Li Shen · Xueqian Wang · Dacheng Tao
Confidence-aware Personalized Federated Learning via Variational Expectation Maximization
Junyi Zhu · Xingchen Ma · Matthew Blaschko
ScaleFL: Resource-Adaptive Federated Learning with Heterogeneous Clients
Fatih Ilhan · Gong Su · Ling Liu
MetaMix: Towards Corruption-Robust Continual Learning with Temporally Self-Adaptive Data Transformation
Zhenyi Wang · Li Shen · Donglin Zhan · Qiuling Suo · Yanjun Zhu · Tiehang Duan · Mingchen Gao
Revisiting Reverse Distillation for Anomaly Detection
Tran Dinh Tien · Anh Tuan Nguyen · Nguyen Tran · Huy Ta · Soan Duong · Chanh Nguyen · Steven Truong
Generating Anomalies for Video Anomaly Detection with Prompt-based Feature Mapping
Zuhao Liu · Xiao-Ming Wu · Dian Zheng · Kun-Yu Lin · Wei-Shi Zheng
Explicit Boundary Guided Semi-Push-Pull Contrastive Learning for Supervised Anomaly Detection
Xincheng Yao · Ruoqi Li · Jing Zhang · Jun Sun · Chongyang Zhang
Towards Universal Fake Image Detectors that Generalize Across Generative Models
Utkarsh Ojha · Yuheng Li · Yong Jae Lee
Edges to Shapes to Concepts: Adversarial Augmentation for Robust Vision
Aditay Tripathi · Rishubh Singh · Anirban Chakraborty · Pradeep Shenoy
Sequential training of GANs against GAN-classifiers reveals correlated “knowledge gaps” present among independently trained GAN instances
Arkanath Pathak · Nicholas Dufour
Masked Auto-Encoders Meet Generative Adversarial Networks and Beyond
Zhengcong Fei · Mingyuan Fan · Li Zhu · Junshi Huang · Xiaoming Wei · Xiaolin Wei
Vector Quantization with Self-attention for Quality-independent Representation Learning
Zhou Yang · Weisheng Dong · Xin Li · Mengluan Huang · Yulin Sun · Guangming Shi
PD-Quant: Post-Training Quantization Based on Prediction Difference Metric
Jiawei Liu · Lin Niu · Zhihang Yuan · Dawei Yang · Xinggang Wang · Wenyu Liu
Hard Sample Matters a Lot in Zero-Shot Quantization
Huantong Li · Xiangmiao Wu · Fanbing Lv · Daihai Liao · Thomas Li · Yonggang Zhang · Bo Han · Mingkui Tan
Fair Scratch Tickets: Finding Fair Sparse Networks without Weight Training
Pengwei Tang · Wei Yao · Zhicong Li · Yong Liu
Understanding Deep Generative Models with Generalized Empirical Likelihoods
Suman Ravuri · Mélanie Rey · Shakir Mohamed · Marc Deisenroth
Deep Deterministic Uncertainty: A New Simple Baseline
Jishnu Mukhoti · Andreas Kirsch · Joost van Amersfoort · Philip Torr · Yarin Gal
Compacting Binary Neural Networks by Sparse Kernel Selection
Yikai Wang · Wenbing Huang · Yinpeng Dong · Fuchun Sun · Anbang Yao
Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures
Eugenia Iofinova · Alexandra Peste · Dan Alistarh
X-Pruner: eXplainable Pruning for Vision Transformers
Lu Yu · Wei Xiang
Deep Graph Reprogramming
Yongcheng Jing · Chongbin Yuan · Li Ju · Yiding Yang · Xinchao Wang · Dacheng Tao
FlowGrad: Controlling the Output of Generative ODEs with Gradients
Xingchao Liu · Lemeng Wu · Shujian Zhang · Chengyue Gong · Wei Ping · Qiang Liu
Exploring Data Geometry for Continual Learning
Zhi Gao · Chen Xu · Feng Li · Yunde Jia · Mehrtash Harandi · Yuwei Wu
Improving Generalization with Domain Convex Game
Fangrui Lv · Jian Liang · Shuang Li · Jinming · Di Liu
SLACK: Stable Learning of Augmentations with Cold-Start and KL Regularization
Juliette Marrie · Michael Arbel · Diane Larlus · Julien Mairal
Critical Learning Periods for Multisensory Integration in Deep Networks
Michael Kleinman · Alessandro Achille · Stefano Soatto
Preserving Linear Separability in Continual Learning by Backward Feature Projection
Qiao Gu · Dongsub Shim · Florian Shkurti
Multi-level Logit Distillation
Ying Jin · Jiaqi Wang · Dahua Lin
Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint
Shikang Yu · Jiachen Chen · Hu Han · Shuqiang Jiang
Masked Autoencoders Enable Efficient Knowledge Distillers
Yutong Bai · Zeyu Wang · Junfei Xiao · Chen Wei · Huiyu Wang · Alan Yuille · Yuyin Zhou · Cihang Xie
DKT: Diverse Knowledge Transfer Transformer for Class Incremental Learning
Xinyuan Gao · Yuhang He · SongLin Dong · Jie Cheng · Xing Wei · Yihong Gong
BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning
Changdae Oh · Hyeji Hwang · Hee-young Lee · YongTaek Lim · Geunyoung Jung · Jiyoung Jung · Hosik Choi · Kyungwoo Song
PIVOT: Prompting for Video Continual Learning
Andres Villa · Juan Leon Alcazar · Motasem Alfarra · Kumail Alhamoud · Julio Hurtado · Fabian Caba · Alvaro Soto · Bernard Ghanem
MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering
Jingjing Jiang · Nanning Zheng
NIFF: Alleviating Forgetting in Generalized Few-Shot Object Detection via Neural Instance Feature Forging
Karim Guirguis · Johannes Meier · George Eskandar · Matthias Kayser · Bin Yang · Jürgen Beyerer
Learning with Fantasy: Semantic-Aware Virtual Contrastive Constraint for Few-Shot Class-Incremental Learning
Zeyin Song · Yifan Zhao · Yujun Shi · Peixi Peng · Li Yuan · Yonghong Tian
Improved Test-Time Adaptation for Domain Generalization
Liang Chen · Yong Zhang · Yibing Song · Ying Shan · Lingqiao Liu
TIPI: Test Time Adaptation with Transformation Invariance
Anh Tuan Nguyen · Thanh Nguyen-Tang · Ser-Nam Lim · Philip Torr
ActMAD: Activation Matching to Align Distributions for Test-Time-Training
Muhammad Mirza Mirza · Pol Jane Soneira · Wei Lin · Mateusz Kozinski · Horst Possegger · Horst Bischof
Modality-Agnostic Debiasing for Single Domain Generalization
Sanqing Qu · Yingwei Pan · Guang Chen · Ting Yao · Changjun Jiang · Tao Mei
ALOFT: A Lightweight MLP-like Architecture with Dynamic Low-frequency Transform for Domain Generalization
Jintao Guo · Na Wang · Lei Qi · Yinghuan Shi
C-SFDA: A Curriculum Learning Aided Self-Training Framework for Efficient Source Free Domain Adaptation
Nazmul Karim · Niluthpol Chowdhury Mithun · Abhinav Rajvanshi · Han-pang Chiu · Supun Samarasekera · Nazanin Rahnavard
Adjustment and Alignment for Unbiased Open Set Domain Adaptation
Wuyang Li · Jie Liu · Bo Han · Yixuan Yuan
Semi-Supervised Domain Adaptation with Source Label Adaptation
Yu-Chu Yu · Hsuan-Tien Lin
Dynamically Instance-Guided Adaptation: A Backward-free Approach for Test-Time Domain Adaptive Semantic Segmentation
Wei Wang · Zhun Zhong · Weijie Wang · Xi Chen · Charles Ling · Boyu Wang · Nicu Sebe
FCC: Feature Clusters Compression for Long-Tailed Visual Recognition
Jian Li · Ziyao Meng · Daqian Shi · Rui Song · Xiaolei Diao · Jingwen Wang · Hao Xu
DISC: Learning from Noisy Labels via Dynamic Instance-Specific Selection and Correction
Yifan Li · Hu Han · Shiguang Shan · Xilin CHEN
Superclass Learning with Representation Enhancement
Zeyu Gan · Suyun Zhao · Jinlong Kang · Liyuan Shang · Hong Chen · Cuiping Li
Improving Selective Visual Question Answering by Learning from Your Peers
Corentin Dancette · Spencer Whitehead · Rishabh Maheshwary · Shanmukha Ramakrishna Vedantam · Stefan Scherer · Xinlei Chen · Matthieu CORD · Marcus Rohrbach
Difficulty-based Sampling for Debiased Contrastive Representation Learning
Taeuk Jang · Xiaoqian Wang
Token Boosting for Robust Self-Supervised Visual Transformer Pre-training
Tianjiao Li · Lin Geng Foo · Ping Hu · Xindi Shang · Hossein Rahmani · Zehuan Yuan · Jun Liu
HyperMatch: Noise-Tolerant Semi-Supervised Learning via Relaxed Contrastive Constraint
Beitong Zhou · Jing Lu · Kerui Liu · Yunlu Xu · Zhanzhan Cheng · Yi Niu
Open-Set Likelihood Maximization for Few-Shot Learning
Malik Boudiaf · Etienne Bennequin · Myriam Tami · Antoine Toubhans · Pablo Piantanida · CELINE HUDELOT · Ismail Ayed
Transductive Few-Shot Learning with Prototypes Label-Propagation by Iterative Graph Refinement
Hao Zhu · Piotr Koniusz
Deep Fair Clustering via Maximizing and Minimizing Mutual Information: Theory, Algorithm and Metric
Pengxin Zeng · Yunfan Li · Peng Hu · Dezhong Peng · Jiancheng Lv · Xi Peng
On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering
Daniel J. Trosten · Sigurd Løkse · Robert Jenssen · Michael Kampffmeyer
Sample-level Multi-view Graph Clustering
Yuze Tan · Yixi Liu · Shudong Huang · Wentao Feng · Jiancheng Lv
Discriminating Known from Unknown Objects via Structure-Enhanced Recurrent Variational AutoEncoder
Aming WU · Cheng Deng
GEN: Pushing the Limits of Softmax-Based Out-of-Distribution Detection
Xixi Liu · Yaroslava Lochman · Christopher Zach
RankMix: Data Augmentation for Weakly Supervised Learning of Classifying Whole Slide Images with Diverse Sizes and Imbalanced Categories
Yuan-Chih Chen · Chun-Shien Lu
Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data
Paul Hager · Martin J. Menten · Daniel Rueckert
DeGPR: Deep Guided Posterior Regularisation For Multi-Class Cell Detection And Counting
Aayush Tyagi · Chirag Mohapatra · Prasenjit Das · Govind Makharia · Lalita Mehra · Prathosh AP · Mausam
OCELOT: Overlapped Cell on Tissue Dataset for Histopathology
Jeongun Ryu · Aaron Valero Puche · JaeWoong Shin · Seonwook Park · Biagio Brattoli · Jinhee Lee · Wonkyung Jung · Soo Ick Cho · Kyunghyun Paeng · Chan-Young Ock · Donggeun Yoo · Sérgio Pereira
SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection
Tiange Xiang · Yixiao Zhang · Yongyi Lu · Alan Yuille · Chaoyi Zhang · Weidong Cai · Zongwei Zhou
Devil is in the Queries: Advancing Mask Transformers for Real-world Medical Image Segmentation and Out-of-Distribution Localization
Mingze Yuan · Yingda Xia · Hexin Dong · Zifan Chen · Jiawen Yao · Mingyan Qiu · Ke Yan · Xiaoli Yin · Yu Shi · Xin Chen · Zaiyi Liu · Bin Dong · Jingren Zhou · Le Lu · Ling Zhang · Li Zhang
MagicNet: Semi-Supervised Multi-Organ Segmentation via Magic-Cube Partition and Recovery
Duowen Chen · Yunhao Bai · Wei Shen · Qingli Li · Lequan Yu · Yan Wang
(ML)^2P-Encoder: On Exploration of Channel-class Correlation for Multi-label Zero-shot Learning
Ziming Liu · Song Guo · Xiaocheng Lu · Jingcai Guo · Jiewei Zhang · Yue Zeng · Fushuo Huo
Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning
Yu Wang · Pengchong Qiao · Chang Liu · Guoli Song · Xiawu Zheng · Jie Chen
Contrastive Mean Teacher for Domain Adaptive Object Detectors
Shengcao Cao · Dhiraj Joshi · Liangyan Gui · Yu-Xiong Wang
Harmonious Teacher for Cross-domain Object Detection
Jinhong Deng · Dongli Xu · Wen Li · Lixin Duan
Hierarchical Supervision and Shuffle Data Augmentation for 3D Semi-Supervised Object Detection
Chuandong Liu · CHENQIANG GAO · Fangcen Liu · Pengcheng Li · Deyu Meng · Xinbo Gao
Semi-DETR: Semi-Supervised Object Detection with Detection Transformers
Jiacheng Zhang · Xiangru Lin · Wei Zhang · Kuo Wang · Xiao Tan · Junyu Han · Errui Ding · Jingdong Wang · Guanbin Li
Continual Detection Transformer for Incremental Object Detection
Yaoyao Liu · Bernt Schiele · Andrea Vedaldi · Christian Rupprecht
DA-DETR: Domain Adaptive Detection Transformer with Information Fusion
Jingyi Zhang · Jiaxing Huang · Zhipeng Luo · Gongjie Zhang · Xiaoqin Zhang · Shijian Lu
CIGAR: Cross-Modality Graph Reasoning for Domain Adaptive Object Detection
Yabo Liu · Jinghua Wang · Chao Huang · Yaowei Wang · Yong Xu
Box-Level Active Detection
Mengyao Lyu · Jundong Zhou · Hui Chen · Yi-Jie Huang · Dongdong Yu · Yaqian Li · Yandong Guo · Yuchen Guo · Liuyu Xiang · Guiguang Ding
Enhanced Training of Query-Based Object Detection via Selective Query Recollection
Fangyi Chen · Han Zhang · Kai Hu · Yu-Kai Huang · Chenchen Zhu · Marios Savvides
Vision Transformers are Good Mask Auto-Labelers
Shiyi Lan · Xitong Yang · Zhiding Yu · Zuxuan Wu · Jose Alvarez · Anima Anandkumar
Weakly Supervised Posture Mining for Fine-grained Classification
Zhenchao Tang · Hualin Yang · Calvin Yu-Chian Chen
IDGI: A Framework to Eliminate Explanation Noise from Integrated Gradients
Ruo Yang · Binghui Wang · Mustafa Bilgic
Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm
Yichen Xie · Han Lu · Junchi Yan · Xiaokang Yang · Masayoshi Tomizuka · Wei Zhan
Instance-specific and Model-adaptive Supervision for Semi-supervised Semantic Segmentation
Zhen Zhao · Sifan Long · Jimin Pi · Jingdong Wang · Luping Zhou
Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation
Yan Jin · Mengke LI · Yang Lu · Yiu-ming Cheung · Hanzi Wang
Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation
Chaohui Yu · Qiang Zhou · Jingliang Li · Jianlong Yuan · Zhibin Wang · Fan Wang
Out-of-Candidate Rectification for Weakly Supervised Semantic Segmentation
Zesen Cheng · Pengchong Qiao · Kehan ​​Li · Siheng Li · Pengxu Wei · Xiangyang Ji · Li Yuan · Chang Liu · Jie Chen FastInst
: A Simple Query-Based Model for Real-Time Instance Segmentation
Junjie He · Pengyu Li · Yifeng Geng · Xuansong Xie
On Calibrating Semantic Segmentation Models: Analyses and An Algorithm
Dongdong Wang · Boqing Gong · Liqiang Wang
Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers
Chenyang Lu · Daan de Geus · Gijs Dubbelman
Ultra-High Resolution Segmentation with Ultra-Rich Context: A Novel Benchmark
Deyi Ji · Feng Zhao · Hongtao Lu · Mingyuan Tao · Jieping Ye
Few-shot Semantic Image Synthesis with Class Affinity Transfer
Marlene Careil · Jakob Verbeek · Stéphane Lathuilière
Network-free, unsupervised semantic segmentation with synthetic images
Qianli Feng · Raghudeep Gadde · Wentong Liao · Eduard Ramon · Aleix Martinez
MISC210K: A Large-Scale Dataset for Multi-Instance Semantic Correspondence
Yixuan Sun · Yiwen Huang · HaJing Guo · Yuzhou Zhao · Runmin Wu · Yizhou Yu · Weifeng Ge · Wenqiang Zhang
GRES: Generalized Referring Expression Segmentation
Chang Liu · Henghui Ding · Xudong Jiang
Semantic Prompt for Few-Shot Image Recognition
Wentao Chen · Chenyang Si · Zhang Zhang · Liang Wang · Zilei Wang · Tieniu Tan
Contrastive Grouping with Transformer for Referring Image Segmentation
Jiajin Tang · Ge Zheng · Cheng Shi · Sibei YANG
Decomposed Soft Prompt Guided Fusion Enhancing for Compositional Zero-Shot Learning
Xiaocheng Lu · Song Guo · Ziming Liu · Jingcai Guo
GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning
Zhenyu Xie · Zaiyu Huang · Xin Dong · Fuwei Zhao · Haoye Dong · Xijin Zhang · Feida Zhu · Xiaodan Liang
OvarNet: Towards Open-vocabulary Object Attribute Recognition
Keyan Chen · Xiaolong Jiang · Yao Hu · Xu Tang · Yan Gao · Jianqi Chen · Weidi Xie
HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
Shan Ning · Longtian Qiu · Yongfei Liu · Xuming He
DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment
Lewei Yao · Jianhua Han · Xiaodan Liang · Dan Xu · Wei Zhang · Zhenguo Li · Hang Xu
Data-efficient Large Scale Place Recognition with Graded Similarity Supervision
Maria Leyva-Vallina · Nicola Strisciuglio · Nicolai Petkov
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng · Hao Zhang · Zhengjue Wang · Ruiying Lu · Dongsheng Wang · Bo Chen
Deep Hashing with Minimal-Distance-Separated Hash Centers
Liangdao Wang · Yan Pan · Cong Liu · Hanjiang Lai · Jian Yin · Ye Liu
Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment
Runqi Wang · Hao ZHENG · Xiaoyue Duan · Jianzhuang Liu · Yuning Lu · Tian Wang · Songcen Xu · Baochang Zhang
Masked Autoencoding Does Not Help Natural Language Supervision at Scale
Floris Weers · Vaishaal Shankar · Angelos Katharopoulos · Yinfei Yang · Tom Gunter
Improving Cross-Modal Retrieval with Set of Diverse Embeddings
Dongwon Kim · Namyup Kim · Suha Kwak
Revisiting Self-Similarity: Structural Embedding for Image Retrieval
Seongwon Lee · Suhyeon Lee · Hongje Seong · Euntai Kim
LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data
Jihye Park · Sunwoo Kim · Soohyun Kim · Seokju Cho · Jaejun Yoo · Youngjung Uh · Seungryong Kim
Scaling Language-Image Pre-training via Masking
Yanghao Li · Haoqi Fan · Ronghang Hu · Christoph Feichtenhofer · Kaiming He
Variational Distribution Learning for Unsupervised Text-to-Image Generation
MINSOO KANG · Doyup Lee · Jiseob Kim · Saehoon Kim · Bohyung Han
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo · Yehao Li · Yingwei Pan · Ting Yao · Jianlin Feng · Hongyang Chao · Tao Mei
Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style
Fengyin Lin · Mingkang Li · Da Li · Timothy Hospedales · Yi-Zhe Song · Yonggang Qi
MAGVLT: Masked Generative Vision-and-Language Transformer
Sungwoong Kim · Daejin Jo · Donghoon Lee · Jongmin Kim
SketchXAI: A First Look at Explainability for Human Sketches
Zhiyu Qu · Yulia Gryaditskaya · Ke Li · Kaiyue Pang · Tao Xiang · Yi-Zhe Song
Learning Geometry-aware Representations by Sketching
Hyundo Lee · Inwoo Hwang · Hyunsung Go · Won-Seok Choi · Kibeom Kim · Byoung-Tak Zhang
Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
Dezhao Luo · Jiabo Huang · Shaogang Gong · Hailin Jin · Yang Liu
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Syed Talal Wasim · Muhammad Muzammal Naseer · Salman Khan · Fahad Khan · Mubarak Shah
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection
WonJun Moon · Sangeek Hyun · SangUk Park · Dongchan Park · Jae-Pil Heo
Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning
Wei Ji · Renjie Liang · Zhedong Zheng · Wenqiao Zhang · Shengyu Zhang · Juncheng Li · Mengze Li · Tat-Seng Chua
Improving Weakly Supervised Temporal Action Localization by Bridging Train-Test Gap in Pseudo Labels
Jingqiu Zhou · Linjiang Huang · Liang Wang · Si Liu · Hongsheng Li
PivoTAL: Prior-Driven Supervision for Weakly-Supervised Temporal Action Localization
Mamshad Nayeem Rizve · Gaurav Mittal · Ye Yu · Matthew Hall · Sandra Sajeev · Mubarak Shah · Mei Chen
Open Set Action Recognition via Multi-Label Evidential Learning
Chen Zhao · Dawei Du · Anthony Hoogs · Christopher Funk
Object Discovery from Motion-Guided Tokens
Zhipeng Bao · Pavel Tokmakov · Yu-Xiong Wang · Adrien Gaidon · Martial Hebert
Unified Keypoint-based Action Recognition Framework via Structured Keypoint Pooling
Ryo Hachiuma · Fumiaki Sato · Taiki Sekii
Video Test-Time Adaptation for Action Recognition
Wei Lin · Muhammad Jehanzeb Mirza · Mateusz Kozinski · Horst Possegger · Hilde Kuehne · Horst Bischof
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline
Tiantian Geng · Teng WANG · Jinming Duan · Runmin Cong · Feng Zheng
A Light Weight Model for Active Speaker Detection
Junhua Liao · Haihan Duan · Kanghui Feng · WanBing Zhao · Yanbing Yang · Liangyin Chen
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Paul Hongsuck Seo · Arsha Nagrani · Cordelia Schmid
Egocentric Audio-Visual Object Localization
Chao Huang · Yapeng Tian · Anurag Kumar · Chenliang Xu
An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling
Tsu-Jui Fu · Linjie Li · Zhe Gan · Kevin Lin · William Yang Wang · Lijuan Wang · Zicheng Liu
Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers
Jaehoon Yoo · Semin Kim · Doyup Lee · Chiheon Kim · Seunghoon Hong
Unifying Short and Long-Term Tracking with Graph Hierarchies
Orcun Cetintas · Guillem Braso · Laura Leal-Taixé
Hierarchical Neural Memory Network for Low Latency Event Processing
Ryuhei Hamaguchi · Yasutaka Furukawa · Masaki Onishi · Ken Sakurada
Mask-Free Video Instance Segmentation
Lei Ke · Martin Danelljan · Henghui Ding · Yu-Wing Tai · Chi-Keung Tang · Fisher Yu
Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection
Shengyang Sun · Xiaojin Gong
Breaking the “Object” in Video Object Segmentation
Pavel Tokmakov · Jie Li · Adrien Gaidon
VideoTrack: Learning to Track Objects via Video Transformer
Fei Xie · Lei Chu · Jiahao Li · Yan Lu · Chao Ma
Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models
Paul Micaelli · Arash Vahdat · Hongxu Yin · Jan Kautz · Pavlo Molchanov
Unbiased Scene Graph Generation in Videos
Sayak Nag · Kyle Min · Subarna Tripathi · Amit Roy-Chowdhury
Graph Representation for Order-aware Visual Transformation
Yue Qiu · Yanjun Sun · Fumiya Matsuzawa · Kenji Iwata · Hirokatsu Kataoka
Prototype-based Embedding Network for Scene Graph Generation
Chaofan Zheng · Xinyu Lyu · Lianli Gao · Bo Dai · Jingkuan Song
Efficient Mask Correction for Click-Based Interactive Image Segmentation
Fei Du · Jianlong Yuan · Zhibin Wang · Fan Wang
G-MSM: Unsupervised Multi-Shape Matching with Graph-based Affinity Priors
Marvin Eisenberger · Aysim Toker · Laura Leal-Taixé · Daniel Cremers
Shape-Erased Feature Learning for Visible-Infrared Person Re-Identification
Jiawei Feng · Ancong Wu · Wei-Shi Zheng
Mixed Autoencoder for Self-supervised Visual Representation Learning
Kai Chen · Zhili LIU · Lanqing HONG · Hang Xu · Zhenguo Li · Dit-Yan Yeung
Stare at What You See: Masked Image Modeling without Reconstruction
Hongwei Xue · Peng Gao · Hongyang Li · Yu Qiao · Hao Sun · Houqiang Li · Jiebo Luo
ResFormer: Scaling ViTs with Multi-Resolution Training
Rui Tian · Zuxuan Wu · Qi Dai · Han Hu · Yu Qiao · Yu-Gang Jiang
Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding
Zijiao Chen · Jiaxin Qing · Tiange Xiang · Wan Lin Yue · Juan Helen Zhou
DropKey for Vision Transformer
Bonan Li · Yinhan Hu · Xuecheng Nie · Congying Han · Xiangjian Jiang · Tiande Guo · Luoqi Liu
Vision Transformer with Super Token Sampling
Huaibo Huang · Xiaoqiang Zhou · Jie Cao · Ran He · Tieniu Tan
Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
Cong Wei · Brendan Duke · Ruowei Jiang · Parham Aarabi · Graham Taylor · Florian Shkurti
All are Worth Words: A ViT Backbone for Diffusion Models
Fan Bao · Shen Nie · Kaiwen Xue · Yue Cao · Chongxuan Li · Hang Su · Jun Zhu
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
Chong Yu · Tao Chen · Zhongxue Gan · Jiayuan Fan
DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training
Yihao Chen · Xianbiao Qi · Jianan Wang · Lei Zhang
Structured Sparsity Learning for Efficient Video Super-Resolution
Bin Xia · Jingwen He · Yulun Zhang · Yitong Wang · Yapeng Tian · Wenming Yang · Luc Van Gool
Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos
Yubin Hu · Yuze He · Yanghao Li · Jisheng Li · Yuxing Han · Jiangtao Wen · Yong-jin Liu
Neural Video Compression with Diverse Contexts
Jiahao Li · Bin Li · Yan Lu
Large-capacity and Flexible Video Steganography via Invertible Neural Network
Chong Mou · Youmin Xu · Jiechong Song · Chen Zhao · Bernard Ghanem · Jian Zhang
Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization
Mengqi Huang · Zhendong Mao · Zhuowei Chen · Yongdong Zhang
Binary Latent Diffusion
Ze Wang · Jiang Wang · Zicheng Liu · Qiang Qiu
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
Andreas Blattmann · Robin Rombach · Huan Ling · Tim Dockhorn · Seung Wook Kim · Sanja Fidler · Karsten Kreis
Diffusion Probabilistic Model Made Slim
Xingyi Yang · Daquan Zhou · Jiashi Feng · Xinchao Wang
Solving 3D Inverse Problems from Pre-trained 2D Diffusion Models
Hyungjin Chung · Dohoon Ryu · Michael McCann · Marc Klasky · Jong Ye
EDICT: Exact Diffusion Inversion via Coupled Transformations
Bram Wallace · Akash Gokul · Nikhil Naik
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
Patrick Schramowski · Manuel Brack · Björn Deiseroth · Kristian Kersting
GLIGEN: Open-Set Grounded Text-to-Image Generation
Yuheng Li · Haotian Liu · Qingyang Wu · Fangzhou Mu · Jianwei Yang · Jianfeng Gao · Chunyuan Li · Yong Jae Lee
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz · Yuanzhen Li · Varun Jampani · Yael Pritch · Michael Rubinstein · Kfir Aberman
LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation
Guangcong Zheng · Xianpan Zhou · Xuewei Li · Zhongang Qi · Ying Shan · Xi Li
Affordance Diffusion: Synthesizing Hand-Object Interactions
Yufei Ye · Xueting Li · Abhinav Gupta · Shalini De Mello · Stan Birchfield · Jiaming Song · Shubham Tulsiani · Sifei Liu
SceneComposer: Any-Level Semantic Image Synthesis
Yu Zeng · Zhe Lin · Jianming Zhang · Qing Liu · John Collomosse · Jason Kuen · Vishal Patel
Handwritten Text Generation from Visual Archetypes
Vittorio Pippi · Silvia Cascianelli · Rita Cucchiara
Referring Image Matting
Jizhizi Li · Jing Zhang · Dacheng Tao
Neural Transformation Fields for Arbitrary-Styled Font Generation
Bin Fu · Junjun He · Jianjun Wang · Yu Qiao
SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model
Shaoan Xie · Zhifei Zhang · Zhe Lin · Tobias Hinz · Kun Zhang
Masked and Adaptive Transformer for Exemplar Based Image Translation
Chang Jiang · Fei Gao · Biao Ma · Lin Yuhao · Nannan Wang · Gang Xu
Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis
Thuan Nguyen · Thanh Le · Anh Tran
RWSC-Fusion: Region-Wise Style-Controlled Fusion Network for the Prohibited X-ray Security Image Synthesis
Luwen Duan · Min Wu · Lijian Mao · Jun Yin · Xiong Jianping · Xi Li
Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method
Ran Yi · Haoyuan Tian · Zhihao Gu · Yu-Kun Lai · Paul Rosin
Omni Aggregation Networks for Lightweight Image Super-Resolution
Hang Wang · Xuanhong Chen · Bingbing Ni · Yutian Liu · Jinfan Liu
Activating More Pixels in Image Super-Resolution Transformer
Xiangyu Chen · Xintao Wang · Jiantao Zhou · Yu Qiao · Chao Dong
Spatial-Frequency Mutual Learning for Face Super-Resolution
Chenyang Wang · Junjun Jiang · Zhiwei Zhong · Xianming Liu
Kernel Aware Resampler
Michael Bernasconi · Abdelaziz Djelouah · Farnood Salehi · Markus Gross · Christopher Schroers
RGB no more: Minimally-decoded JPEG Vision Transformers
Jeongsoo Park · Justin Johnson
Multi-Realism Image Compression with a Conditional Generator
Eirikur Agustsson · David Minnen · George Toderici · Fabian Mentzer
Learning to Exploit the Sequence-Specific Prior Knowledge for Image Processing Pipelines Optimization
Haina Qin · Longfei Han · Weihua Xiong · Juan Wang · Wentao Ma · Bing Li · Weiming Hu
Quality-aware Pre-trained Models for Blind Image Quality Assessment
Kai Zhao · Kun Yuan · Ming Sun · Mading Li · Xing Wen
Robust Unsupervised StyleGAN Image Restoration
Yohan Poirier-Ginter · Jean-Francois Lalonde
RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors
Rui-Qi Wu · Zheng-Peng Duan · Chunle Guo · Zhi Chai · Chongyi Li
Toward Stable, Interpretable, and Lightweight Hyperspectral Super-resolution
Wenjin Guo · Weiying Xie · Kai Jiang · Yunsong Li · Jie Lei · Leyuan Fang
Residual Degradation Learning Unfolding Framework with Mixing Priors across Spectral and Spatial for Compressive Spectral Imaging
Yubo Dong · Dahua Gao · Tian Qiu · Yuyan Li · Minxi Yang · Guangming Shi
Learning a Simple Low-light Image Enhancer from Paired Low-light Instances
Zhenqi Fu · Yan Yang · Xiaotong Tu · Yue Huang · Xinghao Ding · Kai-Kuang Ma
Learning a Deep Color Difference Metric for Photographic Images
Haoyu Chen · Zhihua Wang · Yang Yang · Qilin Sun · Kede Ma
Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models
Cheng Guo · Leidong Fan · Ziyu Xue · Xiuhua Jiang
BiasBed - Rigorous Texture Bias Evaluation
Nikolai Kalischek · Rodrigo Daudt · Torben Peters · Reinhard Furrer · Jan D. Wegner · Konrad Schindler
A Unified HDR Imaging Method with Pixel and Patch Level
Qingsen Yan · Weiye Chen · Song Zhang · Yu Zhu · Jinqiu Sun · Yanning Zhang
Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement
Nancy Mehta · Akshay Dudhane · Subrahmanyam Murala · Syed Waqas Zamir · Salman Khan · Fahad Khan
Deep Discriminative Spatial and Temporal Network for Efficient Video Deblurring
Jinshan Pan · Boming Xu · Jiangxin Dong · Jianjun Ge · Jinhui Tang
1000 FPS HDR Video with a Spike-RGB Hybrid Camera
Yakun Chang · Chu Zhou · Yuchen Hong · Hu Liwen · Chao Xu · Tiejun Huang · Boxin Shi
Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation
Kun Zhou · Wenbo Li · Xiaoguang Han · Jiangbo Lu
Range-nullspace Video Frame Interpolation with Focalized Motion Estimation
Zhiyang Yu · Yu Zhang · Dongqing Zou · Xijun Chen · Jimmy Ren · Shunqing Ren
Deep Polarization Reconstruction with PDAVIS Events
Haiyang Mei · Zuowen Wang · Xin Yang · Xiaopeng Wei · Tobi Delbruck
Unsupervised space-time network for temporally-consistent segmentation of multiple motions
Etienne Meunier · Patrick Bouthemy
NeMo: Learning 3D Neural Motion Fields from Multiple Video Instances of the Same Action
Kuan-Chieh Wang · Zhenzhen Weng · Maria Xenochristou · Joao Araujo · Jeffrey Gu · Karen Liu · Serena Yeung
TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification
Haocong Rao · Chunyan Miao
FLAG3D: A 3D Fitness Activity Dataset with Language Instruction
Yansong Tang · Jinpeng Liu · Aoyang Liu · Bin Yang · Wenxun Dai · Yongming Rao · Jiwen Lu · Jie Zhou · Xiu Li
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
Bowen Zhang · Chenyang Qi · Pan Zhang · Bo Zhang · HsiangTao Wu · Dong Chen · Qifeng Chen · Yong Wang · Fang Wen
Feature Representation Learning with Adaptive Displacement Generation and Transformer Fusion for Micro-Expression Recognition
Zhijun Zhai · Jianhui Zhao · Chengjiang Long · Wenju Xu · He Shuangjiang · Huijuan Zhao
Clothing-Change Feature Augmentation for Person Re-Identification
Ke Han · Shaogang Gong · Yan Huang · Liang Wang · Tieniu Tan
MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors
Yuang Zhang · Tiancai Wang · Xiangyu Zhang
Camouflaged Object Detection with Feature Decomposition and Edge Reconstruction
Chunming He · Kai Li · Yachao Zhang · Longxiang Tang · Yulun Zhang · Zhenhua Guo · Xiu Li
Source-free Adaptive Gaze Estimation with Uncertainty Reduction
Xin Cai · Jiabei Zeng · Shiguang Shan · Xilin CHEN
PyPose: A Library for Robot Learning with Physics-based Optimization
Chen Wang · Dasong Gao · Kuan Xu · Junyi Geng · Yaoyu Hu · Yuheng Qiu · Bowen Li · Fan Yang · Brady Moon · Abhinav Pandey · Aryan FNU · Jiahe Xu · Tianhao Wu · Haonan He · Daning Huang · Zhongqiang Ren · Shibo Zhao · Taimeng Fu · Pranay Reddy Anthireddy · Xiao Lin · Wenshan Wang · Jingnan Shi · Rajat Talak · Kun Cao · Yi Du · Han Wang · Huai Yu · Shanzhao Wang · Siyu Chen · Ananth Kashyap · Rohan Bandaru · Karthik Dantu · Jiajun Wu · Lihua Xie · Luca Carlone · Marco Hutter · Sebastian Scherer
Stimulus Verification is a Universal and Effective Sampler in Multi-modal Human Trajectory Prediction
Jianhua Sun · Yuxuan Li · Liang Chai · Cewu Lu
StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments
Sean Kulinski · Nicholas Waytowich · James Hare · David I. Inouye
ProphNet: Efficient Agent-Centric Motion Forecasting with Anchor-Informed Proposals
Xishun Wang · Tong Su · Fang Da · Xiaodong Yang
Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving
Xiaosong Jia · Penghao Wu · Li Chen · Jiangwei Xie · Conghui He · Junchi Yan · Hongyang Li
HumanBench: Towards General Human-Centric Perception with Projector Assisted Pretraining
Yuanzheng Ci · LEI BAI · Feng Zhu · Haiyang Yang · Li Yi · Rui Zhao · Wanli Ouyang
BEV-Guided Multi-Modality Fusion for Driving Perception
Yunze Man · Liangyan Gui · Yu-Xiong Wang
Robust and Scalable Gaussian Process Regression and Its Applications
Yifan Lu · Jiayi Ma · Leyuan Fang · Xin Tian · Junjun Jiang
Tangentially Elongated Gaussian Belief Propagation for Event-based Incremental Optical Flow Estimation
Jun Nagata · Yusuke Sekikawa
Adaptive Annealing for Robust Geometric Estimation
Sidhartha Chitturi · Lalit Manam · Venu Madhav Govindu
Iterative Geometry Encoding Volume for Stereo Matching
Xu Gangwei · Xianqi Wang · Xiaohuan Ding · Xin Yang
PMatch: Paired Masked Image Modeling for Dense Geometric Matching
Shengjie Zhu · Xiaoming Liu
Adaptive Spot-Guided Transformer for Consistent Local Feature Matching
Jiahuan Yu · Jiahao Chang · Jianfeng He · Tianzhu Zhang · Jiyang Yu · Feng Wu
Learning Rotation-Equivariant Features for Visual Correspondence
Jongmin Lee · Byungjin Kim · Seungwook Kim · Minsu Cho
UTM: A Unified Multiple Object Tracking Model with Identity-Aware Feature Enhancement
SISI You · Hantao Yao · Bing-Kun Bao · Changsheng Xu
Conjugate Product Graphs for Globally Optimal 2D-3D Shape Matching
Paul Roetzer · Zorah Lähner · Florian Bernard
LP-DIF: Learning Local Pattern-Specific Deep Implicit Function for 3D Objects and Scenes
Meng Wang · Yushen Liu · Yue Gao · Kanle Shi · Yi Fang · Zhizhong Han
HGNet: Learning Hierarchical Geometry from Points, Edges, and Surfaces
Ting Yao · Yehao Li · Yingwei Pan · Tao Mei
Neural Intrinsic Embedding for Non-rigid Point Cloud Matching
Puhua Jiang · Mingze Sun · Ruqi Huang
PointClustering: Unsupervised Point Cloud Pre-training using Transformation Invariance in Clustering
Fuchen Long · Ting Yao · Zhaofan Qiu · Lusong Li · Tao Mei
Self-positioning Point-based Transformer for Point Cloud Understanding
Jinyoung Park · Sanghyeok Lee · Sihyeon Kim · Yunyang Xiong · Hyunwoo Kim
PointConvFormer: Revenge of the Point-Based Convolution
Wenxuan Wu · Li Fuxin · Qi Shan
Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders
Renrui Zhang · Liuhui Wang · Yu Qiao · Peng Gao · Hongsheng Li
Geometry and Uncertainty-Aware 3D Point Cloud Class-Incremental Semantic Segmentation
Yuwei Yang · Munawar Hayat · Zhao Jin · Chao Ren · Yinjie Lei
Learning Weather-General and Weather-Specific Features for Image Restoration Under Multiple Adverse Weather Conditions
Yurui Zhu · Tianyu Wang · Xueyang Fu · Xuanyu Yang · Xin Guo · Jifeng Dai · Yu Qiao · Xiaowei Hu
PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models
Minghua Liu · Yinhao Zhu · Hong Cai · Shizhong Han · Zhan Ling · Fatih Porikli · Hao Su
Semi-Weakly Supervised Object Kinematic Motion Prediction
Gengxin Liu · Qian Sun · Haibin Huang · Chongyang Ma · Yulan Guo · Li Yi · Hui Huang · Ruizhen Hu
Implicit Surface Contrastive Clustering for LiDAR Point Clouds
Zaiwei Zhang · Min Bai · Li Erran Li
LaserMix for Semi-Supervised LiDAR Semantic Segmentation
Lingdong Kong · Jiawei Ren · Liang Pan · Ziwei Liu
MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving
Jiale Li · Hang Dai · Hao Han · Yong Ding
GraVoS: Voxel Selection for 3D Point-Cloud Detection
Oren Shrout · Yizhak Ben-Shabat · Ayellet Tal
VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking
Yukang Chen · Jianhui Liu · Xiangyu Zhang · XIAOJUAN QI · Jiaya Jia
Virtual Sparse Convolution for Multimodal 3D Object Detection
Hai Wu · Chenglu Wen · Shaoshuai Shi · Xin Li · Cheng Wang
MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection
Yang Jiao · ZEQUN JIE · Shaoxiang Chen · Jingjing Chen · Lin Ma · Yu-Gang Jiang
OrienterNet: Visual Localization in 2D Public Maps with Neural Matching
Paul-Edouard Sarlin · Daniel DeTone · Tsun-Yi Yang · Armen Avetisyan · Julian Straub · Tomasz Malisiewicz · Samuel Rota Bulò · Richard Newcombe · Peter Kontschieder · Vasileios Balntas
Uncertainty-aware Vision-based Metric Cross-view Geolocalization
Florian Fervers · Sebastian Bullinger · Christoph Bodensteiner · Michael Arens · Rainer Stiefelhagen
BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection
Lei Yang · Kaicheng Yu · Tao Tang · Jun Li · Kun Yuan · Li Wang · Xinyu Zhang · Peng Chen
Understanding the Robustness of 3D Object Detection with Bird's-Eye-View Representations in Autonomous Driving
Zijian Zhu · Yichi Zhang · Hai Chen · Yinpeng Dong · Shu Zhao · Wenbo Ding · Jiachen Zhong · Shibao Zheng
Object Detection with Self-Supervised Scene Adaptation
ZEKUN ZHANG · Minh Hoai
AeDet: Azimuth-invariant Multi-view 3D Object Detection
Chengjian Feng · ZEQUN JIE · Yujie Zhong · Xiangxiang Chu · Lin Ma
CAPE: Camera View Position Embedding for Multi-View 3D Object Detection
Kaixin Xiong · Shi Gong · Xiaoqing Ye · Xiao Tan · Ji Wan · Errui Ding · Jingdong Wang · Xiang Bai
VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud
Ziqin Wang · Bowen Cheng · Lichen Zhao · Dong Xu · Yang Tang · Lyu Sheng
Modality-invariant Visual Odometry for Embodied Vision
Marius Memmel · Roman Bachmann · Amir Zamir
Learning to Fuse Monocular and Multi-view Cues for Multi-frame Depth Estimation in Dynamic Scenes
Rui Li · Dong Gong · Wei Yin · Hao Chen · Yu Zhu · Kaixuan Wang · Xiaozhi Chen · Jinqiu Sun · Yanning Zhang
OmniVidar: Omnidirectional Depth Estimation from Multi-Fisheye Images
Sheng Xie · Daochuan Wang · Yun-Hui Liu
DINN360: Deformable Invertible Neural Networks for Latitude-aware 360° Image Rescaling
Yichen Guo · Mai Xu · Lai Jiang · Ning Li · Leon Sigal · Yunjin Chen
GeoMVSNet: Learning Multi-View Stereo with Geometry Perception
Zhe Zhang · Rui Peng · Yuxi Hu · Ronggang Wang
A Practical Stereo Depth System for Smart Glasses
Jialiang Wang · Daniel Scharstein · Akash Bapat · Kevin Blackburn-Matzen · Matthew Yu · Jonathan Lehman · Suhib Alsisan · Yanghan Wang · Sam Tsai · Jan-Michael Frahm · Zijian He · Peter Vajda · Michael Cohen · Matt Uyttendaele
DC2: Dual-Camera Defocus Control by Learning to Refocus
Hadi AlZayer · Abdullah Abuolaim · Leung Chun Chan · Yang Yang · Ying Lou · Jia-Bin Huang · Abhishek Kar
iDisc: Internal Discretization for Monocular Depth Estimation
Luigi Piccinelli · Christos Sakaridis · Fisher Yu
SfM-TTR: Using Structure from Motion for Test-Time Refinement of Single-View Depth Networks
Sergio Izquierdo · Javier Civera
Inverting the Imaging Process by Learning an Implicit Camera Model
Xin Huang · Qi Zhang · Ying Feng · Hongdong Li · Qing Wang
Learning to Measure the Point Cloud Reconstruction Loss in a Representation Space
Tianxin Huang · Zhonggan Ding · Jiangning Zhang · Ying Tai · Zhenyu Zhang · Mingang Chen · Chengjie Wang · Yong Liu
Better “CMOS” Produces Clearer Images: Learning Space-Variant Blur Estimation for Blind Image Super-Resolution
Xuhai Chen · Jiangning Zhang · Chao Xu · Yabiao Wang · Chengjie Wang · Yong Liu
Delivering Arbitrary-Modal Semantic Segmentation
Jiaming Zhang · Ruiping Liu · Hao Shi · Kailun Yang · Simon Reiß · Haodong Fu · Kunyu Peng · Kaiwei Wang · Rainer Stiefelhagen
Efficient Hierarchical Entropy Model for Learned Point Cloud Compression
Rui Song · Chunyang Fu · Shan Liu · Ge Li
Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring
Ruyang Liu · Jingjia Huang · Ge Li · Jiashi Feng · Xinglong Wu · Thomas Li
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Feng Liang · Bichen Wu · Xiaoliang Dai · Kunpeng Li · Yinan Zhao · Hang Zhang · Peizhao Zhang · Peter Vajda · Diana Marculescu
Imagic: Text-Based Real Image Editing with Diffusion Models
Bahjat Kawar · Shiran Zada · Oran Lang · Omer Tov · Huiwen Chang · Tali Dekel · Inbar Mosseri · Michal Irani
Neumann Network with Recursive Kernels for Single Image Defocus Deblurring
Yuhui Quan · Zicong Wu · Hui Ji
Transfer4D: A framework for frugal motion capture and deformation transfer
Shubh Maheshwari · Rahul Narain · Ramya Hebbalaguppe
Iterative Proposal Refinement for Weakly-Supervised Video Grounding
Meng Cao · Fangyun Wei · Can Xu · Xiubo Geng · Long Chen · Can Zhang · Yuexian Zou · Tao Shen · Daxin Jiang
X3KD: Knowledge Distillation Across Modalities, Tasks and Stages for Multi-Camera 3D Object Detection
Marvin Klingner · Shubhankar Borse · Varun Ravi Kumar · Behnaz Rezaei · Venkatraman Narayanan · Senthil Yogamani · Fatih Porikli
AnyFlow: Arbitrary Scale Optical Flow with Implicit Neural Representation
Hyunyoung Jung · Zhuo Hui · Lei Luo · Haitao Yang · Feng Liu · Sungjoo Yoo · Rakesh Ranjan · Denis Demandolx
IterativePFN: True Iterative Point Cloud Filtering
Dasith de Silva Edirimuni · Xuequan Lu · Zhiwen Shao · Gang Li · Antonio Robles-Kelly · Ying He
Fake it till you make it: Learning transferable representations from synthetic ImageNet clones
Mert Bulent Sariyildiz · Karteek Alahari · Diane Larlus · Yannis Kalantidis
Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness
Zhijie Shen · Zishuo Zheng · Chunyu Lin · Lang Nie · Kang Liao · Shuai Zheng · Yao Zhao
Exploring Incompatible Knowledge Transfer in Few-shot Image Generation
Yunqing Zhao · Chao Du · Milad Abdollahzadeh · Tianyu Pang · Min Lin · Shuicheng YAN · Ngai-man Cheung
OmniObject3D: Large Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation
Tong Wu · Jiarui Zhang · Xiao Fu · Yuxin WANG · Jiawei Ren · Liang Pan · Wenyan Wu · Lei Yang · Jiaqi Wang · Chen Qian · Dahua Lin · Ziwei Liu
CelebV-Text: A Large-Scale Facial Text-Video Dataset
Jianhui Yu · Hao Zhu · Liming Jiang · CHEN CHANGE LOY · Weidong Cai · Wenyan Wu
TensoIR: Tensorial Inverse Rendering
Haian Jin · Isabella Liu · Peijia Xu · Xiaoshuai Zhang · Songfang Han · Sai Bi · Xiaowei Zhou · Zexiang Xu · Hao Su
Simultaneously Short- and Long-Term Temporal Modeling for Semi-Supervised Video Semantic Segmentation
Jiangwei Lao · Weixiang Hong · Xin Guo · Yingying Zhang · Wang Jian · Jingdong Chen · Wei Chu
Integral Neural Networks
Kirill Solodskikh · Azim Kurbanov · Ruslan Aydarkhanov · Irina Zhelavskaya · Yury Parfenov · Dehua Song · Stamatios Lefkimmiatis
FEND: A Future Enhanced Distribution-Aware Contrastive Learning Framework For Long-tail Trajectory Prediction
Yuning Wang · Pu Zhang · LEI BAI · Jianru Xue
NeuralEditor: Editing Neural Radiance Fields via Manipulating Point Clouds
Junkun Chen · Jipeng Lyu · Yu-Xiong Wang
3D Line Mapping Revisited
Shaohui Liu · Yifan Yu · Rémi Pautrat · Marc Pollefeys · Viktor Larsson
Single View Scene Scale Estimation using Scale Field
Byeong-Uk Lee · Jianming Zhang · Yannick Hold-Geoffroy · In So Kweon
PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes
Ruoyu Wang · Zehao Yu · Shenghua Gao
Self-supervised Super-plane for Neural 3D Reconstruction
Botao Ye · Sifei Liu · Xueting Li · Ming-Hsuan Yang
NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization
Zhixiang Min · Bingbing Zhuang · Samuel Schulter · Buyu Liu · Enrique Dunn · Manmohan Chandraker
Multi-sensor large-scale dataset for multi-view 3D reconstruction
Oleg Voynov · Gleb Bobrovskikh · Pavel Karpyshev · Saveliy Galochkin · Andrei-Timotei Ardelean · Arseniy Bozhenko · Ekaterina Karmanova · Pavel Kopanev · Yaroslav Labutin-Rymsho · Ruslan Rakhimov · Aleksandr Safin · Valerii Serpiva · Alexey Artemov · Evgeny Burnaev · Dzmitry Tsetserukou · Denis Zorin
AutoRecon: Automated 3D Object Discovery and Reconstruction
Yuang Wang · Xingyi He · Sida Peng · Haotong Lin · Hujun Bao · Xiaowei Zhou
A Large-Scale Homography Benchmark
Daniel Barath · Dmytro Mishkin · Michal Polic · Wolfgang Förstner · Jiri Matas
SparsePose: Sparse-View Camera Pose Regression and Refinement
Samarth Sinha · Jason Zhang · Andrea Tagliasacchi · Igor Gilitschenski · David Lindell
Few-shot Geometry-Aware Keypoint Localization
Xingzhe He · Gaurav Bharaj · David Ferman · Helge Rhodin · Pablo Garrido
Self-Supervised Representation Learning for CAD
Benjamin Jones · Michael Hu · Milin Kodnongbua · Vladimir Kim · Adriana Schulz
IMP: Iterative Matching and Pose Estimation with Adaptive Pooling
Fei XUE · Ignas Budvytis · Roberto Cipolla
SMOC-Net: Leveraging Camera Pose for Self-Supervised Monocular Object Pose Estimation
Tao Tan · Qiulei Dong
Markerless Camera-to-Robot Pose Estimation via Self-supervised Sim-to-Real Transfer
Jingpei Lu · Florian Richter · Michael Yip
TTA-COPE: Test-Time Adaptation for Category-Level Object Pose Estimation
Taeyeop Lee · Jonathan Tremblay · Valts Blukis · Bowen Wen · Byeong-Uk Lee · Inkyu Shin · Stan Birchfield · In So Kweon · Kuk-Jin YOON
3D-POP - An automated annotation approach to facilitate markerless 2D-3D tracking of freely moving birds with marker-based motion capture
Hemal Naik · Hoi Hang Chan · Junran Yang · Mathilde Delacoux · Iain Couzin · Fumihiro Kano · Máté Nagy
Delving into Discrete Normalizing Flows on SO(3) Manifold for Probabilistic Rotation Modeling
Yulin Liu · Haoran Liu · Yingda Yin · Yang Wang · Baoquan Chen · He Wang
PSVT: End-to-End Multi-person 3D Pose and Shape Estimation with Progressive Video Transformers
Zhongwei Qiu · Yang Qiansheng · Jian Wang · Haocheng Feng · Junyu Han · Errui Ding · Chang Xu · Dongmei Fu · Jingdong Wang
Hierarchical Temporal Transformer for 3D Hand Pose Estimation and Action Recognition from Egocentric RGB Videos
Yilin Wen · Hao Pan · Lei Yang · Jia Pan · Taku Komura · Wenping Wang
GarmentTracking: Category-Level Garment Pose Tracking
Han Xue · Wenqiang Xu · Jieyi Zhang · Tutian Tang · Yutong Li · Wenxin Du · Ruolin Ye · Cewu Lu
Towards Transferable Targeted Adversarial Examples
Zhibo Wang · Hongshan Yang · Yunhe Feng · Peng Sun · Hengchang Guo · Zhifei Zhang · Kui Ren
Proximal Splitting Adversarial Attack for Semantic Segmentation
Jérôme Rony · Jean-Christophe Pesquet · Ismail Ayed
T-SEA: Transfer-based Self-Ensemble Attack on Object Detection
Hao Huang · Ziyan Chen · Huanran Chen · Yongtao Wang · Kevin Zhang
Reinforcement Learning-Based Black-Box Model Inversion Attacks
Gyojin Han · Jaehyun Choi · Haeil Lee · Junmo Kim
Progressive Backdoor Erasing via connecting Backdoor and Adversarial Attacks
Bingxu Mu · Zhenxing Niu · Le Wang · xue wang · Qiguang Miao · Rong Jin · Gang Hua
MEDIC: Remove Model Backdoors via Importance Driven Cloning
Qiuling Xu · Guanhong Tao · Jean Honorio · Yingqi Liu · Shengwei An · Guangyu Shen · Siyuan Cheng · Xiangyu Zhang
Model Barrier: A Compact Un-Transferable Isolation Domain for Model Intellectual Property Protection
Lianyu Wang · Meng Wang · Daoqiang Zhang · Huazhu Fu
Adversarially Masking Synthetic to Mimic Real: Adaptive Noise Injection for Point Cloud Segmentation Adaptation
Guangrui Li · Guoliang Kang · Xiaohan Wang · Yunchao Wei · Yi Yang
Instance-Aware Domain Generalization for Face Anti-Spoofing
Qianyu Zhou · Ke-Yue Zhang · Taiping Yao · Xuequan Lu · Ran Yi · Shouhong Ding · Lizhuang Ma
Bias-Eliminating Augmentation Learning for Debiased Federated Learning
Yuan-Yi Xu · Ci-Siang Lin · Yu-Chiang Frank Wang
Adaptive Channel Sparsity for Federated Learning under System Heterogeneity
Dongping Liao · Xitong Gao · Yiren Zhao · Cheng-zhong Xu
Reliable and Interpretable Personalized Federated Learning
Zixuan Qin · Liu Yang · Qilong Wang · Yahong Han · Qinghua Hu
DaFKD: Domain-aware Federated Knowledge Distillation
Haozhao Wang · Yichen Li · Wenchao Xu · Ruixuan Li · Yufeng Zhan · Zhigang Zeng
SimpleNet: A Simple Network for Image Anomaly Detection and Localization
Zhikang Liu · Yiming Zhou · Yuansheng Xu · Zilei Wang
A New Comprehensive Benchmark for Semi-supervised Video Anomaly Detection and Anticipation
Congqi Cao · Yue Lu · PENG WANG · Yanning Zhang
Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers
Bin Ren · Yahui Liu · Yue Song · Wei Bi · Rita Cucchiara · Nicu Sebe · Wei Wang
ImageNet-E: Benchmarking Neural Network Robustness against Attribute Editing
Xiaodan Li · YUEFENG CHEN · Yao Zhu · Shuhui Wang · Rong Zhang · Hui Xue
Private Image Generation with Dual-Purpose Auxiliary Classifier
Chen Chen · Daochang Liu · Siqi Ma · Surya Nepal · Chang Xu
Discriminator-Cooperated Feature Map Distillation for GAN Compression
Tie Hu · Mingbao Lin · Lizhou You · Fei Chao · Rongrong Ji
TeSLA: Test-Time Self-Learning With Automatic Adversarial Augmentation
DEVAVRAT TOMAR · Guillaume Vray · Behzad Bozorgtabar · Jean-Philippe Thiran
Practical Network Acceleration with Tiny Sets
Guo-Hua Wang · Jianxin Wu
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
Yijiang Liu · Huanrui Yang · ZHEN DONG · Kurt Keutzer · Li Du · Shanghang Zhang
Bias Mimicking: A Simple Sampling Approach for Bias Mitigation
Maan Qraitem · Kate Saenko · Bryan Plummer
Masked Images Are Counterfactual Samples for Robust Fine-tuning
Yao Xiao · Ziyi Tang · Pengxu Wei · Cong Liu · Liang Lin
Samples with Low Loss Curvature Improve Data Efficiency
Isha Garg · Kaushik Roy
Defining and Quantifying the Emergence of Sparse Concepts in DNNs
Jie Ren · Mingjie Li · Qirui Chen · Huiqi Deng · Quanshi Zhang
Network Expansion For Practical Training Acceleration
Ning Ding · Yehui Tang · Kai Han · Chao Xu · Yunhe Wang
AstroNet: When Astrocyte Meets Artificial Neural Network
Mengqiao Han · Liyuan Pan · Xiabi Liu
Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
Xingxuan Zhang · Renzhe Xu · Han Yu · Hao Zou · Peng Cui
Re-basin via implicit Sinkhorn differentiation
Fidel A Guerrero Pena · Heitor Medeiros · Thomas Dubail · Masih Aminbeidokhti · Eric Granger · Marco Pedersoli
Tunable Convolutions with Parametric Multi-Loss Optimization
Matteo Maggioni · Thomas Tanay · Francesca Babiloni · Steven McDonagh · Ales Leonardis
Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning
Qiang He · Huangyuan Su · Jieyu Zhang · Xinwen Hou
Simulated Annealing in Early Layers Leads to Better Generalization
Amirmohammad Sarfi · Zahra Karimpour · Muawiz Chaudhary · Nasir Khalid · Mirco Ravanelli · Sudhir Mudur · Eugene Belilovsky
On the Stability-Plasticity Dilemma of Class-Incremental Learning
Dongwan Kim · Bohyung Han
Decoupling Learning and Remembering: a Bilevel Memory Framework with Knowledge Projection for Task-Incremental Learning
Wenju Sun · Qingyong Li · Jing Zhang · Wen Wang · Yangliao Geng
Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
Tianli Zhang · Mengqi Xue · Jiangtao Zhang · Haofei Zhang · Yu Wang · Lechao Cheng · Jie Song · Mingli Song
Regularizing Second-Order Influences for Continual Learning
Zhicheng Sun · Yadong MU · Gang Hua
Rethinking Feature-based Knowledge Distillation for Face Recognition
Jingzhi Li · Zidong Guo · Hui Li · Seungju Han · Ji-won Baek · Min Yang · Ran Yang · Sungjoo Suh
ERM-KTP: Knowledge-level Machine Unlearning via Knowledge Transfer
Shen Lin · Xiaoyu Zhang · Chenyang Chen · Xiaofeng Chen · Willy Susilo
Partial Network Cloning
Jingwen Ye · Songhua Liu · Xinchao Wang
Rebalancing Batch Normalization for Exemplar-based Class-Incremental Learning
Sungmin Cha · Sungjun Cho · Dasol Hwang · Sunwon Hong · Moontae Lee · Taesup Moon
1% VS 100%: Parameter-Efficient Low Rank Adapter for Dense Predictions
Dongshuo Yin · Yiran Yang · Zhechao Wang · Hongfeng Yu · kaiwen wei · Xian Sun
MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models
Dohwan Ko · Joonmyung Choi · Hyeong Kyu Choi · Kyoung-Woon On · Byungseok Roh · Hyunwoo Kim
MDL-NAS: A Joint Multi-domain Learning framework for Vision Transformer
Shiguang Wang · TAO XIE · Jian Cheng · Xingcheng ZHANG · Haijun Liu
Independent Component Alignment for Multi-Task Learning
Dmitry Senushkin · Nikolay Patakin · Arsenii Kuznetsov · Anton Konushin
Revisiting Prototypical Network for Cross Domain Few-Shot Learning
Fei Zhou · Peng Wang · Lei Zhang · Wei Wei · Yanning Zhang
Feature Alignment and Uniformity for Test Time Adaptation
Shuai Wang · Daoan Zhang · Zipei YAN · Jianguo Zhang · Rui Li
MMANet: Margin-aware Distillation and Modality-aware Regularization for Incomplete Multimodal Learning
shicai wei · Chunbo Luo · Yang Luo
PMR: Prototypical Modal Rebalance for Multimodal Learning
Yunfeng FAN · Wenchao Xu · Haozhao Wang · Junxiao Wang · Song Guo
Upcycling Models under Domain and Category Shift
Sanqing Qu · Tianpei Zou · Florian Röhrbein · Cewu Lu · Guang Chen · Dacheng Tao · changjun jiang
MHPL: Minimum Happy Points Learning for Active Source Free Domain Adaptation
Fan Wang · Zhongyi Han · Zhiyan Zhang · Rundong He · Yilong Yin
COT: Unsupervised Domain Adaptation with Clustering and Optimal Transport
Yang Liu · Zhipeng Zhou · Baigui Sun
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding
Thanh-Dat Truong · Ngan Le · Bhiksha Raj · Jackson Cothren · Khoa Luu
Transfer Knowledge from Head to Tail: Uncertainty Calibration under Long-tailed Distribution
Jiahao Chen · Bing Su
Balanced Product of Calibrated Experts for Long-Tailed Recognition
Emanuel Sanchez Aimar · Arvi Jonnarth · Michael Felsberg · Marco Kuhlmann
Why is the winner the best?
Matthias Eisenmann · Annika Reinke · Vivienn Weru · Minu Tizabi · Fabian Isensee · Tim Adler · Sharib Ali · Vincent Andrearczyk · Marc Aubreville · Ujjwal Baid · Spyridon Bakas · Niranjan Balu · Sophia Bano · Jorge Bernal · Sebastian Bodenstedt · Alessandro Casella · Veronika Cheplygina · Marie Daum · Marleen de Bruijne · Adrien Depeursinge · Reuben Dorent · Jan Egger · David Ellis · Sandy Engelhardt · Melanie Ganz · Noha Ghatwary · Gabriel Girard · Patrick Godau · Anubha Gupta · Lasse Hansen · Kanako Harada · Mattias Heinrich · Nicholas Heller · Alessa Hering · Arnaud Huaulmé · Pierre Jannin · Ali Emre Kavur · Oldřich Kodym · Michal Kozubek · Jianning Li · Hongwei Li · Jun Ma · Carlos Isla · bjoern menze · Alison Noble · Valentin Oreiller · Nicolas Padoy · Sarthak Pati · Kelly Payette · Tim Rädsch · Jonathan Rafael-Patino · Vivek Bawa · Stefanie Speidel · Carole Sudre · Kimberlin van Wijnen · Martin Wagner · Donglai Wei · Amine Yamlahi · Moi Hoon Yap · Chun Yuan · Maximilian Zenk · Aneeq Zia · David Zimmerer · Dogu Baran Aydogan · Binod Bhattarai · Louise Bloch · Raphael Brüngel · Jihoon Cho · Chanyeol Choi · DOU QI · Ivan Ezhov · Christoph M. Friedrich · Clifton Fuller · Rebati Gaire · Adrian Galdran · Álvaro García Faura · Maria Grammatikopoulou · SeulGi Hong · Mostafa Jahanifar · Ikbeom Jang · Abdolrahim Kadkhodamohammadi · Inha Kang · Florian Kofler · Satoshi Kondo · Hugo Kuijf · Mingxing Li · Huan Luu · Tomaž Martinčič · Pedro Morais · Mohamed Naser · Bruno Oliveira · David Owen · Subeen Pang · Jinah Park · Sung-Hong Park · Szymon Plotka · Elodie Puybareau · Nasir Rajpoot · Kanghyun Ryu · Numan Saeed · Adam Shephard · Pengcheng Shi · Dejan Štepec · Ronast Subedi · Guillaume Tochon · Helena Torres · Helene Urien · João Vilaça · Kareem Wahid · haojie wang · jiacheng wang · Liansheng Wang · Xiyue Wang · Benedikt Wiestler · Marek Wodzinski · Fangfang Xia · Juanying Xie · Zhiwei Xiong · Sen Yang · Yanwu Yang · Zixuan Zhao · Klaus Maier-Hein · Paul Jaeger · Annette Kopp-Schneider · Lena Maier-hein
SuperDisco: Super-Class Discovery Improves Visual Recognition for the Long-Tail
Yingjun Du · Jiayi Shen · Xiantong Zhen · Cees Snoek
Learning from Noisy Labels with Decoupled Meta Label Purifier
Yuanpeng Tu · Boshen Zhang · Yuxi Li · Liang Liu · Jian Li · Yabiao Wang · Chengjie Wang · Cai Zhao
Class Prototypes based Contrastive Learning for Classifying Multi-Label and Fine-Grained Educational Videos
Rohit Gupta · Anirban Roy · Sujeong Kim · Claire Christensen · Todd Grindal · Sarah Gerard · Madeline Cincebeaux · Ajay Divakaran · Mubarak Shah
MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset
Chen Feng · Ioannis Patras
HIER: Metric Learning Beyond Class Labels via Hierarchical Regularization
Sungyeon Kim · Boseung Jeong · Suha Kwak
Bi-directional Distribution Alignment for Transductive Zero Shot Learning
Zhicai Wang · YANBIN HAO · Tingting Mu · Ouxiang Li · Shuo Wang · Xiangnan He
BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency
Shuo Yang · xu Pan · Kai Wang · Yang You · Hongxun Yao · Tongliang Liu · Min Xu
Exploring and Exploiting Uncertainty for Incomplete Multi-View Classification
Mengyao Xie · Zongbo Han · Changqing Zhang · Yichen Bai · Qinghua Hu
GCFAgg: Global and Cross-view Feature Aggregation for Multi-view Clustering
Weiqing Yan · Yuanyang Zhang · Chenlei Lv · Chang Tang · Guanghui Yue · Liang Liao · Weisi Lin
LINe: Out-of-Distribution Detection by Leveraging Important Neurons
Yong Hyun Ahn · Gyeong-Moon Park · Seong Tae Kim
Visual prompt tuning for generative transfer learning
Kihyuk Sohn · Huiwen Chang · Jose Lezama · Luisa Polania Cabrera · Han Zhang · Yuan Hao · Irfan Essa · Lu Jiang
Interventional Bag Multi-Instance Learning On Whole-Slide Pathological Images
Tiancheng Lin · Yu Zhimiao · Hongyu Hu · Yi Xu · Chang-Wen Chen
Image Quality-aware Diagnosis via Meta-knowledge Co-embedding
Haoxuan Che · Siyu Chen · Hao Chen
KiUT: Knowledge-injected U-Transformer for Radiology Report Generation
Zhongzhen Huang · Xiaofan Zhang · Shaoting Zhang
Hierarchical discriminative learning improves visual representations of biomedical microscopy
Cheng Jiang · Xinhai Hou · Akhil Kondepudi · Asadur Chowdury · Christian Freudiger · Daniel Orringer · Honglak Lee · Todd Hollon
Pseudo-label Guided Contrastive Learning for Semi-supervised Medical Image Segmentation
Hritam Basak · Zhaozheng Yin
FFF: Fragment-Guided Flexible Fitting for Building Complete Protein Structures
Weijie Chen · Xinyan Wang · Yuhang Wang
Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images
Ming Y. Lu · Bowen Chen · Andrew Zhang · Drew Williamson · Richard Chen · Tong Ding · Long Le · Yung-Sung Chuang · Faisal Mahmood
ProD: Prompting-to-disentangle Domain Knowledge for Cross-domain Few-shot Image Classification
Tianyi Ma · Yifan Sun · Zongxin Yang · Yi Yang
Open-Set Representation Learning through Combinatorial Embedding
Geeho Kim · Junoh Kang · Bohyung Han
Multiclass Confidence and Localization Calibration for Object Detection
Bimsara Pathiraja · Malitha Gunawardhana · Muhammad Haris Khan
Distilling Scale-Aware Knowledge in Small Object Detector
Yichen Zhu · Qiqi Zhou · Ning Liu · Zhiyuan Xu · Zhicai Ou · mou xiaofeng · Jian Tang
Generating Features with Increased Crop-related Diversity for Few-Shot Object Detection
Jingyi Xu · Hieu Le · Dimitris Samaras
DETRs with Hybrid Matching
Ding Jia · Yuhui Yuan · Haodi He · Xiaopei Wu · Haojun Yu · Weihong Lin · Lei Sun · Chao Zhang · Han Hu
Adaptive Sparse Pairwise Loss for Object Re-Identification
Xiao Zhou · Yujie Zhong · Zhen Cheng · Fan Liang · Lin Ma
CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection
Shuailei Ma · Yuefeng Wang · Ying Wei · Jiaqi Fan · Thomas Li · Hongli Liu · fanbing Lv
Weak-shot Object Detection through Mutual Knowledge Transfer
Xuanyi Du · Weitao Wan · Chong Sun · Chen Li
Modeling the Distributional Uncertainty for Salient Object Detection Models
Jing Zhang · Mochu Xiang · Yuchao Dai · Xinyu Tian
Supervised Masked Knowledge Distillation for Few-Shot Transformers
Han Lin · Guangxing Han · Jiawei Ma · Shiyuan Huang · Xudong Lin · Shih-Fu Chang
Co-Salient Object Detection with Uncertainty-aware Group Exchange-Masking
Yang Wu · Huihui Song · Bo Liu · Kaihua Zhang · Dong Liu
Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation
Dahyun Kang · Piotr Koniusz · Minsu Cho · Naila Murray
DualRel: Semi-Supervised Mitochondria Segmentation from A Prototype Perspective
Huayu Mai · Rui Sun · Tianzhu Zhang · Zhiwei Xiong · Feng Wu
WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation
Jongheon Jeong · Yang Zou · Taewan Kim · DongQing Zhang · Avinash Ravichandran · Onkar Dabeer
Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization
Lian XU · WANLI OUYANG · Mohammed Bennamoun · Farid Boussaid · Dan XU
Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation
Zicheng Wang · Zhen Zhao · Xiaoxia Xing · Dong Xu · Xiangyu Kong · Luping Zhou
Boundary-enhanced Co-training for Weakly Supervised Semantic Segmentation
Shenghai Rong · Bohai Tu · Zilei Wang · Junjie Li
Balancing Logit Variation for Long-tailed Semantic Segmentation
Yuchao Wang · Jingjing Fei · Haochen Wang · Wei Li · Tianpeng Bao · Liwei Wu · Rui Zhao · Yujun Shen
Leveraging Hidden Positives for Unsupervised Semantic Segmentation
Hyun Seok Seong · WonJun Moon · Su Been Lee · Jae-Pil Heo
PIDNet: A Real-time Semantic Segmentation Network Inspired by PID Controllers
Jiacong Xu · Zixiang Xiong · Shankar P Bhattacharyya
AttentionShift: Iteratively Estimated Part-based Attention Map for Pointly Supervised Instance Segmentation
Mingxiang Liao · Zonghao Guo · Yuze Wang · Peng Yuan · bailan feng · Fang Wan
Principles of Forgetting in Domain-Incremental Semantic Segmentation in Adverse Weather Conditions
Tobias Kalb · Jürgen Beyerer
Semantic-Promoted Debiasing and Background Disambiguation for Zero-Shot Instance Segmentation
SHUTING HE · Henghui Ding · Wei Jiang
Interactive Segmentation as Gaussian Process Classification
Minghao Zhou · Hong Wang · Qian Zhao · Yuexiang Li · Yawen Huang · Deyu Meng · Yefeng Zheng
Meta Compositional Referring Expression Segmentation
Li Xu · Mark Huang · Xindi Shang · Zehuan Yuan · Ying Sun · Jun Liu
DejaVu: Conditional Regenerative Learning to Enhance Dense Prediction
Shubhankar Borse · Debasmit Das · Hyojin Park · Hong Cai · Risheek Garrepalli · Fatih Porikli
Zero-shot Referring Image Segmentation with Global-Local Context Features
seonghoon yu · Paul Hongsuck Seo · Jeany Son
FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation
Jie Qin · Jie Wu · Pengxiang Yan · Ming Li · Yuxi Ren · Xuefeng Xiao · Yitong Wang · Rui Wang · Shilei Wen · Xin Pan · Xingang Wang
Semantic Human Parsing via Scalable Semantic Transfer over Multiple Label Domains
Jie Yang · Chaoqun Wang · Zhen Li · Junle Wang · Ruimao Zhang
Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning
Jishnu Mukhoti · Tsung-Yu Lin · Omid Poursaeed · Rui Wang · Ashish Shah · Philip Torr · Ser-Nam Lim
Neural Congealing: Aligning Images to a Joint Semantic Atlas
Dolev Ofri-Amar · Michal Geyer · Yoni Kasten · Tali Dekel
Open-Category Human-Object Interaction Pre-training via Language Modeling Framework
Sipeng Zheng · Boshen Xu · Qin Jin
Open-set Fine-grained Retrieval via Prompting Vision-Language Evaluator
Shijie Wang · Jianlong Chang · Haojie Li · Zhihui Wang · Wanli Ouyang · Qi Tian
R2Former: Unified Retrieval and Reranking Transformer for Place Recognition
Sijie Zhu · Linjie Yang · Chen Chen · Mubarak Shah · Xiaohui Shen · Heng Wang
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang · Wen Wang · Binhui Xie · Quan Sun · Ledell Wu · Xinggang Wang · Tiejun Huang · Xinlong Wang · Yue Cao
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
Maoyuan Ye · Jing Zhang · Shanshan Zhao · Juhua Liu · Tongliang Liu · Bo Du · Dacheng Tao
Finetune like you pretrain: Improved finetuning of zero-shot vision models
Sachin Goyal · Ananya Kumar · Sankalp Garg · J Kolter · Aditi Raghunathan
Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models
Zhiqiu Lin · Samuel Yu · Zhiyi Kuang · Deepak Pathak · Deva Ramanan
DATE: Domain Adaptive Product Seeker for E-commerce
Haoyuan Li · Hao Jiang · Tao Jin · Mengyan Li · Yan Chen · Zhijie Lin · Yang Zhao · Zhou Zhao
Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval
Kuniaki Saito · Kihyuk Sohn · Xiang Zhang · Chun-Liang Li · Chen-Yu Lee · Kate Saenko · Tomas Pfister
Text-guided Unsupervised Latent Transformations for Multi-attribute Image Manipulation
Xiwen Wei · Zhen Xu · Cheng Liu · Si Wu · Zhiwen Yu · Hau-San Wong
Fine-grained Image-text Matching by Cross-modal Hard Aligning Network
pan zhengxin · Fangyu Wu · Bailing Zhang
RA-CLIP: Retrieval Augmented Contrastive Language-Image Pre-training
Chen-Wei Xie · Siyang Sun · Xiong Xiong · Yun Zheng · Deli Zhao · Jingren Zhou
Unifying Vision, Language, Layout and Tasks for Universal Document Processing
Zineng Tang · Ziyi Yang · Guoxin Wang · Yuwei Fang · Yang Liu · Chenguang Zhu · Michael Zeng · Cha Zhang · Mohit Bansal
MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID
Jianyang Gu · Kai Wang · Hao Luo · Chen Chen · Wei Jiang · Yuqiang Fang · Shanghang Zhang · Yang You · Jian ZHAO
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
Yanmin Wu · Xinhua Cheng · Renrui Zhang · Zesen Cheng · Jian Zhang
L-CoIns: Language-based Colorization with Instance Awareness
Zheng Chang · Shuchen Weng · Peixuan Zhang · Yu Li · Si Li · Boxin Shi
Learning Visual Representations via Language-Guided Sampling
Mohamed Samir Mahmoud Hussein Elbanani · Karan Desai · Justin Johnson
Shepherding Slots to Objects: Towards Stable and Robust Object-Centric Learning
Jinwoo Kim · Janghyuk Choi · Ho-Jin Choi · Seon Joo Kim
Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification
Yue Yang · Artemis Panagopoulou · Shenghao Zhou · Daniel Jin · Chris Callison-Burch · Mark Yatskar
Image as a Foreign Language: BEiT Pretraining for Vision and Vision-Language Tasks
Wenhui Wang · Hangbo Bao · Li Dong · Johan Bjorck · Zhiliang Peng · Qiang Liu · Kriti Aggarwal · Owais Khan Mohammed · Saksham Singhal · Subhojit Som · Furu Wei
Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations
Ziyan Yang · Kushal Kafle · Franck Dernoncourt · Vicente Ordonez
Leveraging per Image-Token Consistency for Vision-Language Pre-training
Yunhao GOU · Tom Ko · Hansi Yang · James Kwok · Yu Zhang · Mingxuan Wang
RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension
Jiamu Sun · Gen Luo · Yiyi Zhou · Xiaoshuai Sun · GUANNAN JIANG · Zhiyu Wang · Rongrong Ji
Understanding and Improving Visual Prompting: A Label-Mapping Perspective
Aochuan Chen · Yuguang Yao · Pin-Yu Chen · Yihua Zhang · Sijia Liu
Meta-Personalizing Vision-Language Models to Find Named Instances in Video
Chun-Hsiao Yeh · Bryan Russell · Josef Sivic · Fabian Caba · Simon Jenni
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak · Hanoona Bangalath · Muhammad Maaz · Salman Khan · Fahad Khan
VQACL: A Novel Visual Question Answering Continual Learning Setting
Xi Zhang · Feifei Zhang · Changsheng Xu
Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language
Chuanhao Li · Zhen Li · Chenchen Jing · Yunde Jia · Yuwei Wu
Unicode Analogies: An Anti-Objectivist Visual Reasoning Challenge
Steven Spratley · Krista A. Ehinger · Tim Miller
Token Turing Machines
Michael Ryoo · Keerthana Gopalakrishnan · Kumara Kahatapitiya · Ted Xiao · Kanishka Rao · Austin Stone · Yao Lu · Julian Ibarz · Anurag Arnab
Policy Adaptation from Foundation Model Feedback
Yuying Ge · Annabella Macaluso · Li Erran Li · Ping Luo · Xiaolong Wang
LANA: A Language-Capable Navigator for Instruction Following and Generation
Xiaohan Wang · Wenguan Wang · Jiayi shao · Yi Yang
LEGO-Net: Learning Regular Rearrangements of Objects in Rooms
Qiuhong Anna Wei · Sijie Ding · Jeong Joon Park · Rahul Sajnani · Adrien Poulenard · Srinath Sridhar · Leonidas Guibas
Discovering the Real Association: Multimodal Causal Reasoning in Video Question Answering
Chuanqi zang · Hanqing Wang · mingtao Pei · Wei Liang
CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning
Yiting Cheng · Fangyun Wei · Jianmin Bao · Dong Chen · wenqiang zhang
Context De-confounded Emotion Recognition
dingkang yang · zhaoyu chen · Yuzheng Wang · Shunli Wang · Mingcheng Li · Liu Siao · Xiao Zhao · Shuai Huang · Zhiyan Dong · Peng Zhai · Lihua Zhang
Learning Emotion Representations from Verbal and Nonverbal Communication
Sitao Zhang · Yimu Pan · James Wang
CLIPPING: Distilling CLIP-Based Models with a Student Base for Video-Language Retrieval
Renjing Pei · Jianzhuang Liu · Weimian Li · Bin Shao · Songcen Xu · Peng Dai · Juwei Lu · Youliang Yan
Dual Alignment Unsupervised Domain Adaptation for Video-Text Retrieval
Xiaoshuai Hao · Wanqian Zhang · Dayan Wu · Fei Zhu · Bo Li
StepFormer: Self-supervised Step Discovery and Localization in Instructional Videos
Nikita Dvornik · Isma Hadji · Ran Zhang · Konstantinos Derpanis · Rick Wildes · Allan Jepson
Text with Knowledge Graph Augmented Transformer for Video Captioning
Xin Gu · Guang Chen · Yufei Wang · Libo Zhang · Tiejian Luo · Longyin Wen
RILS: Masked Visual Reconstruction in Language Semantic Space
Shusheng Yang · Yixiao Ge · Kun Yi · Dian Li · Ying Shan · Xiaohu Qie · Xinggang Wang
DegAE: A New Pretraining Paradigm for Low-level Vision
Yihao Liu · Jingwen He · Jinjin Gu · Xiangtao Kong · Yu Qiao · Chao Dong
Teacher-Generated Spatial-Attention Labels Boost Robustness and Accuracy of Contrastive Models
Yushi Yao · Chang Ye · Junfeng He · Gamaleldin Elsayed
CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose
Xu Zhang · Wen Wang · Zhe Chen · Yufei Xu · Jing Zhang · Dacheng Tao
MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model
Yatai Ji · Junjie Wang · Yuan Gong · Lin Zhang · Yanru Zhu · Hongfa Wang · Jiaxing Zhang · Tetsuya Sakai · Yujiu Yang
Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models
Qu Tang · Xiangyu Zhu · Zhen Lei · Zhaoxiang Zhang
Position-guided Text Prompt for Vision-Language Pre-training
Jinpeng Wang · Pan Zhou · Mike Zheng Shou · Shuicheng YAN
LASP: Text-to-Text Optimization for Language-Aware Soft Prompting of Vision & Language Models
Adrian Bulat · Georgios Tzimiropoulos
Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training
Junfan Lin · Jianlong Chang · Lingbo Liu · Guanbin Li · Liang Lin · Qi Tian · Chang-Wen Chen
GeoVLN: Learning Geometry-Enhanced Visual Representation with Slot Attention for Vision-and-Language Navigation
Jingyang Huo · Qiang Sun · Boyan Jiang · Haitao Lin · Yanwei Fu
MetaCLUE: Towards Comprehensive Visual Metaphors Research
Arjun Akula · Brendan Driscoll · Pradyumna Narayana · Soravit Changpinyo · Zhiwei Jia · Suyash Damle · Garima Pruthi · S Basu · Leonidas Guibas · William Freeman · Yuanzhen Li · Varun Jampani
ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos
Zhou Yu · Lixiang Zheng · Zhou Zhao · Fei Wu · Jianping Fan · Kui Ren · Jun Yu
Where We Are and What We’re Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes
Brandon Clark · Alec Kerrigan · Parth Parag Kulkarni · Vicente Vivanco Cepeda · Mubarak Shah
CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation
Samir Yitzhak Gadre · Mitchell Wortsman · Gabriel Ilharco · Ludwig Schmidt · Shuran Song
Accelerating Vision-Language Pretraining with Free Language Modeling
Teng WANG · Yixiao Ge · Feng Zheng · Ran Cheng · Ying Shan · Xiaohu Qie · Ping Luo
Joint Visual Grounding and Tracking with Natural Language Specification
Li Zhou · Zikun Zhou · Kaige Mao · Zhenyu He
CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition with Variational Alignment
Jiangbin Zheng · Yile Wang · Cheng Tan · Siyuan Li · Ge Wang · Jun Xia · Yidong Chen · Stan Li
LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling
Linjie Li · Zhe Gan · Kevin Lin · Chung-Ching Lin · Zicheng Liu · Ce Liu · Lijuan Wang
Learning Action Changes by Measuring Verb-Adverb Textual Relationships
Davide Moltisanti · Frank Keller · Hakan Bilen · Laura Sevilla-Lara
WINNER: Weakly-supervised hIerarchical decompositioN and aligNment for spatio-tEmporal video gRounding
Mengze Li · Han Wang · Wenqiao Zhang · Jiaxu Miao · Zhou Zhao · Shengyu Zhang · Wei Ji · Fei Wu
HierVL: Learning Hierarchical Video-Language Embeddings
Kumar Ashutosh · Rohit Girdhar · Lorenzo Torresani · Kristen Grauman
Hierarchical Video-Moment Retrieval and Step-Captioning
Abhay Zala · Jaemin Cho · Satwik Kottur · Xilun Chen · Barlas Oguz · Yashar Mehdad · Mohit Bansal
AutoAD: Movie Description in Context
Tengda Han · Max Bain · Arsha Nagrani · Gul Varol · Weidi Xie · Andrew Zisserman
SViTT: Temporal Learning of Sparse Video-Text Transformers
Yi Li · Kyle Min · Subarna Tripathi · Nuno Vasconcelos
Weakly Supervised Temporal Sentence Grounding with Uncertainty-Guided Self-training
Yifei Huang · Lijin Yang · Yoichi Sato
Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies
Bei Gan · Xiujun Shu · Ruizhi Qiao · Haoqian Wu · Keyu Chen · Hanjun Li · Bo Ren
Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network
Zhicheng Zhang · Lijuan Wang · Jufeng Yang
Two-Stream Networks for Weakly-Supervised Temporal Action Localization with Semantic-Aware Mechanisms
Yu Wang · Yadong Li · Hongbin Wang
Hybrid Active Learning via Deep Clustering for Video Action Detection
Aayush Jung B Rana · Yogesh Rawat
TriDet: Temporal Action Detection with Relative Boundary Modeling
Dingfeng Shi · Yujie Zhong · Qiong Cao · Lin Ma · Jia Li · Dacheng Tao
HaLP: Hallucinating Latent Positives for Skeleton-based Self-Supervised Learning of Actions
Anshul Shah · Aniket Roy · Ketul Shah · Shlok Mishra · David Jacobs · Anoop Cherian · Rama Chellappa
Post-Processing Temporal Action Detection
Sauradip Nag · Xiatian Zhu · Yi-Zhe Song · Tao Xiang
Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception
Junyu Gao · Mengyuan Chen · Changsheng Xu
SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision
Xubo Liu · Egor Lakomkin · Konstantinos Vougioukas · Pingchuan Ma · Honglie Chen · Ruiming Xie · Morrie Doulaty · Niko Moritz · Jachym Kolar · Stavros Petridis · Maja Pantic · Christian Fuegen
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Regeneration
Wei-Ning Hsu · Tal Remez · Bowen Shi · Jacob Donley · Yossi Adi
Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring
Joanna Hong · Minsu Kim · Jeongsoo Choi · Yong Man Ro
Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning
Cheng Tan · Zhangyang Gao · Lirong Wu · Yongjie Xu · Jun Xia · Siyuan Li · Stan Li
Latency Matters: Real-Time Action Forecasting Transformer
Harshayu Girase · Nakul Agarwal · Chiho Choi · Karttikeya Mangalam
Efficient Movie Scene Detection using State-Space Transformers
Md Mohaiminul Islam · Mahmudul Hasan · Kishan Shamsundar Athrey · Tony Braskich · Gediminas Bertasius
TarViS: A Unified Approach for Target-based Video Segmentation
Ali Athar · Alexander Hermans · Jonathon Luiten · Deva Ramanan · Bastian Leibe
HOOD: Hierarchical Graphs for Generalized Modelling of Clothing Dynamics
Artur Grigorev · Bernhard Thomaszewski · Michael Black · Otmar Hilliges
Structured 3D Features for Reconstructing Controllable Avatars
Enric Corona · Mihai Zanfir · Thiemo Alldieck · Eduard Bazavan · Andrei Zanfir · Cristian Sminchisescu
MonoHuman: Animatable Human Neural Field from Monocular Video
Zhengming Yu · Wei Cheng · Xian Liu · Wenyan Wu · Kwan-Yee Lin
JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields
Xi WANG · Robin Courant · Jinglei Shi · Eric Marchand · Marc Christie
InstantAvatar: Learning Avatars from Monocular Video in 60 Seconds
Tianjian Jiang · Xu Chen · Jie Song · Otmar Hilliges
X-Avatar: Expressive Human Avatars
Kaiyue Shen · Chen Guo · Manuel Kaufmann · Juan Zarate · Julien Valentin · Jie Song · Otmar Hilliges
OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering
Zhiyuan Ma · Xiangyu Zhu · Guo-Jun Qi · Zhen Lei · Lei Zhang
Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos
Ziqian Bai · Feitong Tan · Zeng Huang · Kripasindhu Sarkar · Danhang Tang · Di Qiu · Abhimitra Meka · Ruofei Du · Mingsong Dou · Sergio Orts-Escolano · Rohit Pandey · Ping Tan · Thabo Beeler · Sean Fanello · Yinda Zhang
AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction
Aggelina Chatziagapi · Dimitris Samaras
NeuFace: Realistic 3D Neural Face Rendering from Multi-view Images
Mingwu Zheng · Haiyu Zhang · Hongyu Yang · Di Huang
Continuous Landmark Detection with 3D Queries
Prashanth Chandran · Gaspard Zoss · Paulo Gotardo · Derek Bradley
GlassesGAN: Eyewear Personalization using Synthetic Appearance Discovery and Targeted Subspace Modeling
Richard Plesh · Peter Peer · Vitomir Struc
High-Res Facial Appearance Capture from Polarized Smartphone Images
Dejan Azinovic · Olivier Maury · Christophe Hery · Matthias Niessner · Justus Thies
Interactive Cartoonization with Controllable Perceptual Factors
Namhyuk Ahn · Patrick Kwon · Jihye Back · Kibeom Hong · Mark Kim
SECAD-Net: Self-Supervised CAD Reconstruction by Learning Sketch-Extrude Operations
Pu Li · Jianwei Guo · Xiaopeng Zhang · Dong-ming Yan
TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision
Jiacheng Wei · Hao Wang · Jiashi Feng · Guosheng Lin · Kim-Hui Yap
High Fidelity 3D Hand Shape Reconstruction via Scalable Graph Frequency Decomposition
Tianyu Luan · Yuanhao Zhai · Jingjing Meng · Zhong Li · Zhang Chen · Yi Xu · Junsong Yuan
Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process
Yuhan Li · Yishun Dou · Xuanhong Chen · Bingbing Ni · Yilin Sun · Yutian Liu · Fuzhen Wang
Consistent View Synthesis with Pose-Guided Diffusion Models
Hung-Yu Tseng · Qinbo Li · Changil Kim · Suhib Alsisan · Jia-Bin Huang · Johannes Kopf
Patch-based 3D Natural Scene Generation from a Single Example
Weiyu Li · Xuelin Chen · Jue Wang · Baoquan Chen
Diffusion-based Generation, Optimization, and Planning in 3D Scenes
Siyuan Huang · Zan Wang · Puhao Li · Baoxiong Jia · Tengyu Liu · Yixin Zhu · Wei Liang · Song-Chun Zhu
DA Wand: Distortion-Aware Selection using Neural Mesh Parameterization
Richard Liu · Noam Aigerman · Vladimir Kim · Rana Hanocka
Neural Vector Fields: Implicit Representation by Explicit Learning
Xianghui Yang · Guosheng Lin · Zhenghao Chen · Luping Zhou
Octree Guided Unoriented Surface Reconstruction
Chamin Hewa Koneputugodage · Yizhak Ben-Shabat · Stephen Gould
Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction
Mingfang Zhang · Jinglu Wang · Xiao Li · Yifei Huang · Yoichi Sato · Yan Lu
Multi-View Reconstruction using Signed Ray Distance Functions (SRDF)
Pierre Zins · Yuanlu Xu · Edmond Boyer · Stefanie Wuhrer · Tony Tung
VolRecon: Volume Rendering of Signed Ray Distance Functions for Generalizable Multi-View Reconstruction
Yufan Ren · Fangjinhua Wang · Tong Zhang · Marc Pollefeys · Sabine Süsstrunk
TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using Differentiable Rendering
Jaehoon Choi · Dongki Jung · Taejae Lee · SangWook Kim · YoungDong Jung · Dinesh Manocha · Donghwan Lee
RelightableHands: Efficient Neural Relighting of Articulated Hand Models
Shun Iwase · Shunsuke Saito · Tomas Simon · Stephen Lombardi · Timur Bagautdinov · Rohan Joshi · Fabian Prada · Takaaki Shiratori · Yaser Sheikh · Jason Saragih
Computational Flash Photography through Intrinsics
Sepideh Sarajian Maralan · Chris Careaga · Yagiz Aksoy
PixHt-Lab: Pixel Height Based Light Effect Generation for Image Compositing
Yichen Sheng · Jianming Zhang · Julien Philip · Yannick Hold-Geoffroy · Xin Sun · HE Zhang · Lu Ling · Bedrich Benes
Tensor4D: Efficient Neural 4D Decomposition for High-fidelity Dynamic Reconstruction and Rendering
Ruizhi Shao · Zerong Zheng · Hanzhang Tu · Boning Liu · Hongwen Zhang · Yebin Liu
UV Volumes for Real-time Rendering of Editable Free-view Human Performance
Yue Chen · Xuan Wang · Xingyu Chen · Qi Zhang · Xiaoyu Li · Yu Guo · Jue Wang · Fei Wang
HyperReel: High-Fidelity 6-DoF Video with Ray-Conditioned Sampling
Benjamin Attal · Jia-Bin Huang · Christian Richardt · Johannes Kopf · Michael Zollhöfer · Matthew O’Toole · Changil Kim
Complementary Intrinsics from Neural Radiance Fields and CNNs for Outdoor Scene Relighting
Siqi Yang · Xuanning Cui · Yongjie Zhu · Jiajun Tang · Si Li · Zhaofei Yu · Boxin Shi
Balanced Spherical Grid for Egocentric View Synthesis
Changwoon Choi · Sang Min Kim · Young Min Kim
pCON: Polarimetric Coordinate Networks for Neural Scene Representations
Henry Peters · Yunhao Ba · Achuta Kadambi
MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures
Zhiqin Chen · Thomas Funkhouser · Peter Hedman · Andrea Tagliasacchi
ABLE-NeRF: Attention-Based Rendering with Learnable Embeddings for Neural Radiance Field
Zhe Jun Tang · Tat-Jen Cham · Haiyu Zhao
NeRFVS: Neural Radiance Fields for Free View Synthesis via Geometry Scaffolds
chen yang · Peihao Li · Zanwei Zhou · Shanxin Yuan · Bingbing Liu · Xiaokang Yang · Weichao Qiu · Wei Shen
Progressively Optimized Local Radiance Fields for Robust View Synthesis
Andreas Meuleman · Yu-Lun Liu · Chen Gao · Jia-Bin Huang · Changil Kim · Min H. Kim · Johannes Kopf
Removing Objects From Neural Radiance Fields
Silvan Weder · Guillermo Garcia-Hernando · Aron Monszpart · Marc Pollefeys · Gabriel Brostow · Michael Firman · Sara Vicente
SCADE: Space Carving with Ambiguity-aware Depth Estimates
Mikaela Uy · Ricardo Martin Brualla · Leonidas Guibas · Ke Li
ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning
Hao Yang · Lanqing HONG · Aoxue Li · Tianyang Hu · Zhenguo Li · Gim Lee · Liwei Wang
JacobiNeRF: NeRF Shaping with Mutual Information Gradients
Xiaomeng Xu · Yanchao Yang · Kaichun Mo · Boxiao Pan · Li Yi · Leonidas Guibas
Fresnel Microfacet BRDF: Unification of Polari-Radiometric Surface-Body Reflection
Tomoki Ichikawa · Yoshiki Fukao · Shohei Nobuhara · Ko Nishino
DartBlur: Privacy Preservation with Detection Artifacts Suppression
Baowei Jiang · Bing Bai · Haozhe Lin · Yu Wang · Yuchen Guo · LU FANG
Evading Forensic Classifiers with Attribute-Conditioned Adversarial Faces
Fahad Shamshad · Koushik Srivatsan · Karthik Nandakumar
RIATIG: Reliable and Imperceptible Adversarial Text-to-Image Generation with Natural Prompts
Han Liu · Yuhao Wu · Shixuan Zhai · Bo Yuan · Ning Zhang
Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization
Zifan Wang · Nan Ding · Tomer Levinboim · Xi Chen · Radu Soricut
Randomized Adversarial Training via Taylor Expansion
Gaojie Jin · Xinping Yi · Dengyu Wu · Ronghui Mu · Xiaowei Huang
Adversarial Counterfactual Visual Explanations
Guillaume Jeanneret · Loic Simon · Frederic Jurie
Transferable Adversarial Attacks on Vision Transformers with Token Gradient Regularization
Jianping Zhang · Yizhan Huang · Weibin Wu · Michael Lyu
Dynamic Generative Targeted Attacks with Pattern Injection
Weiwei Feng · Nanqing Xu · Tianzhu Zhang · Yongdong Zhang
Turning Strengths into Weaknesses: A Certified Robustness Inspired Attack Framework against Graph Neural Networks
Binghui Wang · Meng Pang · Yun Dong
Re-thinking Model Inversion Attacks Against Deep Neural Networks
Ngoc-Bao Nguyen · Keshigeyan Chandrasegaran · Milad Abdollahzadeh · Ngai-man Cheung
Can’t Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders
Zeyang Sha · Xinlei He · Ning Yu · Michael Backes · Yang Zhang
Detecting Backdoors in Pre-trained Encoders
Shiwei Feng · Guanhong Tao · Siyuan Cheng · Guangyu Shen · Xiangzhe Xu · Yingqi Liu · Kaiyuan Zhang · Shiqing Ma · Xiangyu Zhang
STDLens: Model Hijacking-resilient Federated Learning for Object Detection
Ka-Ho Chow · Ling Liu · Wenqi Wei · Fatih Ilhan · Yanzhao Wu
Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations
Hagay Michaeli · Tomer Michaeli · Daniel Soudry
FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning
Yuanhao Xiong · Ruochen Wang · Minhao Cheng · Felix Yu · Cho-Jui Hsieh
Rethinking Federated Learning with Domain Shift: A Prototype View
Wenke Huang · Mang Ye · Zekun Shi · He Li · Bo Du
Fair Federated Medical Image Segmentation via Client Contribution Estimation
Meirui Jiang · Holger Roth · Wenqi Li · Dong Yang · Can Zhao · Vishwesh Nath · Daguang Xu · DOU QI · Ziyue Xu
Class Balanced Adaptive Pseudo Labeling for Federated Semi-Supervised Learning
Ming Li · Qingli Li · Yan Wang
Prototypical Residual Networks for Anomaly Detection and Localization
Hui Zhang · Zuxuan Wu · Zheng Wang · Zhineng Chen · Yu-Gang Jiang
Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection
Chen Zhang · Guorong Li · Yuankai Qi · Shuhui Wang · Laiyun Qing · Qingming Huang · Ming-Hsuan Yang
A New Dataset Based on Images Taken by Blind People for Testing the Robustness of Image Classification Models Trained for ImageNet Categories
Reza Akbarian Bafghi · Danna Gurari
Boosting Verified Training for Robust Image Classifications via Abstraction
Zhaodi Zhang · Zhiyi Xue · Yang Chen · Si Liu · Yueling Zhang · Jing Liu · Min Zhang
Soft Augmentation for Image Classification
Yang Liu · Shen Yan · Laura Leal-Taixé · James Hays · Deva Ramanan
Re-GAN: Data-Efficient GANs Training via Architectural Reconfiguration
Divya Saxena · Jiannong Cao · Jiahao XU · Tarun Kulshrestha
AdaptiveMix: Improving GAN Training via Feature Space Shrinkage
Haozhe Liu · Wentian Zhang · Bing Li · Haoqian Wu · Nanjun He · Yawen Huang · Yuexiang Li · Bernard Ghanem · Yefeng Zheng
Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck
Jongheon Jeong · Sihyun Yu · Hankook Lee · Jinwoo Shin
Bit-shrinking: Limiting Instantaneous Sharpness for Improving Post-training Quantization
Lin Chen · Bo Peng · Zheyang Li · Wenming Tan · Ye Ren · Jun Xiao · Shiliang Pu
Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization
Zhuo Huang · Miaoxi Zhu · Xiaobo Xia · Li Shen · Jun Yu · Chen Gong · Bo Han · Bo Du · Tongliang Liu
OT-Filter: An Optimal Transport Filter for Learning with Noisy Labels
Chuanwen Feng · Yilong Ren · Xike Xie
Don’t Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis
Thomas FEL · Melanie Ducoffe · David Vigouroux · Remi Cadene · Mikaël Capelle · Claire NICODEME · Thomas Serre
Shortcomings of Top-Down Randomization-Based Sanity Checks for Evaluations of Deep Neural Network Explanations
Alexander Binder · Leander Weber · Sebastian Lapuschkin · Grégoire Montavon · Klaus Muller · Wojciech Samek
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
Sanghyun Woo · Shoubhik Debnath · Ronghang Hu · Xinlei Chen · Zhuang Liu · In So Kweon · Saining Xie
Regularization of polynomial networks for image recognition
Grigorios Chrysos · Bohan Wang · Jiankang Deng · Volkan Cevher
Stitchable Neural Networks
Zizheng Pan · Jianfei Cai · Bohan Zhuang
DepGraph: Towards Any Structural Pruning
Gongfan Fang · Xinyin Ma · Mingli Song · Michael Bi Mi · Xinchao Wang
Meta-Learning with a Geometry-Adaptive Preconditioner
Suhyun Kang · Duhun Hwang · Moonjung Eo · Taesup Kim · Wonjong Rhee
Class Adaptive Network Calibration
Bingyuan Liu · Jérôme Rony · Adrian Galdran · Jose Dolz · Ismail Ayed
Differentiable Architecture Search with Random Features
zhang xuanyang · Yonggang Li · Xiangyu Zhang · Yongtao Wang · Jian Sun
DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks
Samyak Jain · Sravanti Addepalli · Pawan Sahu · Priyam Dey · Venkatesh Babu Radhakrishnan
NICO++: Towards better benchmarks for Out-of-Distribution Generalization
Xingxuan Zhang · Yue He · Renzhe Xu · Han Yu · Zheyan Shen · Peng Cui
Bilateral Memory Consolidation for Continual Learning
Xing Nie · Shixiong Xu · Xiyan Liu · Gaofeng Meng · Chunlei Huo · Shiming Xiang
CafeBoost: Causal Feature Boost to Eliminate Task-Induced Bias for Class Incremental Learning
Benliu Qiu · Hongliang Li · Haitao Wen · Heqian Qiu · Lanxiao Wang · Fanman Meng · Qingbo Wu · Lili Pan
Towards a Smaller Student: Capacity Dynamic Distillation for Efficient Image Retrieval
Yi Xie · Huaidong Zhang · Xuemiao Xu · Jianqing Zhu · Shengfeng He
Generic-to-Specific Distillation of Masked Autoencoders
Wei Huang · Zhiliang Peng · Li Dong · Furu Wei · Jianbin Jiao · Qixiang Ye
Heterogeneous Continual Learning
Divyam Madaan · Hongxu Yin · Wonmin Byeon · Jan Kautz · Pavlo Molchanov
Manipulating Transfer Learning for Property Inference
Yulong Tian · Fnu Suya · Anshuman Suri · Fengyuan Xu · David Evans
Adapting Shortcut with Normalizing Flow: An Efficient Tuning Framework for Visual Recognition
Yaoming Wang · Bowen Shi · XIAOPENG ZHANG · Jin Li · Yuchen Liu · Wenrui Dai · Chenglin Li · Hongkai Xiong · Qi Tian
A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation
Hui Tang · Kui Jia
Switchable Representation Learning Framework with Self-compatibility
shengsen wu · Yan Bai · Yihang Lou · Xiongkun Linghu · Jianzhong He · LINGYU DUAN
Domain Expansion of Image Generators
Yotam Nitzan · MICHAEL GHARBI · Richard Zhang · Taesung Park · Jun-Yan Zhu · Daniel Cohen-Or · Eli Shechtman
Robust Test-Time Adaptation in Dynamic Scenarios
Longhui Yuan · Binhui Xie · Shuang Li
Train/Test-Time Adaptation with Retrieval
Luca Zancato · Alessandro Achille · Tian Yu Liu · Matthew Trager · Pramuditha Perera · Stefano Soatto
Bi-level Meta-learning for Few-shot Domain Generalization
Xiaorong Qin · Xinhang Song · Shuqiang Jiang
Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information
Weijie Su · Xizhou Zhu · Chenxin Tao · Lewei Lu · Bin Li · Gao Huang · Yu Qiao · Xiaogang Wang · Jie Zhou · Jifeng Dai
Multi-modal Learning with Missing Modality via Shared-Specific Feature Modeling
Hu Wang · Yuanhong Chen · Congbo Ma · Jodie Avery · M. Louise Hull · Gustavo Carneiro
DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation
Fengyi Shen · Akhil Gurram · Ziyuan Liu · He Wang · Alois Knoll
Progressive Open Space Expansion for Open Set Model Attribution
Tianyun Yang · Danding Wang · Fan Tang · Xinying Zhao · Juan Cao · Sheng Tang
DLBD: A Self-Supervised Direct-Learned Binary Descriptor
Bin Xiao · Yang Hu · Bo Liu · Xiuli Bi · Weisheng Li · Xinbo Gao
DAA: A Delta Age AdaIN operation for age estimation via binary code transformer
Ping Chen · Xingpeng Zhang · Ye Li · Ju Tao · Bin Xiao · Bing Wang · zongjie jiang
Curvature-Balanced Feature Manifold Learning for Long-Tailed Classification
Yanbiao Ma · Licheng Jiao · Fang Liu · Shuyuan Yang · Xu Liu · Lingling Li
Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions
Fei Du · peng yang · Qi Jia · Fengtao Nan · xiaoting chen · Yun Yang
No One Left Behind: Improving the Worst Categories in Long-Tailed Learning
Yingxiao Du · Jianxin Wu
Learning Imbalanced Data with Vision Transformers
Zhengzhuo Xu · Ruikang Liu · Shuo Yang · Zenghao Chai · Chun Yuan
Ranking Regularization for Critical Rare Classes: Minimizing False Positives at a High True Positive Rate
Kiarash Mohammadi · He Zhao · Mengyao Zhai · Frederick Tung
MarginMatch: Using Training Dynamics of Unlabeled Data for Semi-Supervised Learning
Tiberiu Sosea · Cornelia Caragea
CHMATCH: Contrastive Hierarchical Matching and Robust Adaptive Threshold Boosted Semi-Supervised Learning
Jianlong Wu · Haozhe Yang · Tian Gan · Ning Ding · Feijun Jiang · Liqiang Nie
Boosting Transductive Few-Shot Fine-tuning with Margin-based Uncertainty Weighting and Probability Regularization
Ran Tao · Hao Chen · Marios Savvides
Three Guidelines You Should Know for Universally Slimmable Self-Supervised Learning
Yun-Hao Cao · Peiqin Sun · Shuchang Zhou
Towards Bridging the Performance Gaps of Joint Energy-based Models
Xiulong Yang · Qing Su · Shihao Ji
Siamese DETR
Zeren Chen · Gengshi Huang · Wei Li · Jianing Teng · Kun Wang · Jing Shao · CHEN CHANGE LOY · Lyu Sheng
Highly Confident Local Structure Based Consensus Graph Learning for Incomplete Multi-view Clustering
Jie Wen · Chengliang Liu · Gehui Xu · Zhihao Wu · Chao Huang · Lunke Fei · Yong Xu
Block Selection Method for Using Feature Norm in Out-of-Distribution Detection
Yeonguk Yu · Sungho Shin · Seongju Lee · Changhyun Jun · Kyoobin Lee
Causally-Aware Intraoperative Imputation for Overall Survival Time Prediction
Xiang Li · Xuelin Qian · Litian Liang · Lingjie Kong · Qiaole Dong · Chen Jiejun · Dingxia Liu · Xiuzhong Yao · Yanwei Fu
PEFAT: Boosting Semi-supervised Medical Image Classification via Pseudo-loss Estimation and Feature Adversarial Training
Zeng Qingjie · Yutong Xie · Lu Zilin · Yong Xia
Histopathology Whole Slide Image Analysis with Heterogeneous Graph Representation Learning
Tsai Chan Chan · Fernando Julio Cendra · Lan Ma · Guosheng Yin · Lequan Yu
MCF: Mutual Correction Framework for Semi-Supervised Medical Image Segmentation
Yongchao Wang · Bin Xiao · Xiuli Bi · Weisheng Li · Xinbo Gao
DoNet: Deep De-overlapping Network for Cytology Instance Segmentation
Hao JIANG · Rushan Zhang · Yanning Zhou · Yumeng Wang · Hao Chen
Weakly supervised segmentation with point annotations for histopathology images via contrast-based variational model
hongrun zhang · Liam Burrows · Yanda Meng · Declan Sculthorpe · ABHIK MUKHERJEE · Sarah Coupland · Ke Chen · Yalin Zheng
Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
Mido Assran · Quentin Duval · Pascal Vincent · Ishan Misra · Piotr Bojanowski · Michael Rabbat · Yann LeCun · Nicolas Ballas
Boosting Detection in Crowd Analysis via Underutilized Output Features
Shaokai Wu · Fengyu Yang
Bi3D: Bi-domain Active Learning for Cross-domain 3D Object Detection
Jiakang Yuan · Bo Zhang · Xiangchao Yan · Tao Chen · Botian Shi · Yikang LI · Yu Qiao
Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection
Chang Liu · Weiming Zhang · Xiangru Lin · Wei Zhang · Xiao Tan · Junyu Han · Xiaomao Li · Errui Ding · Jingdong Wang
Large-scale Training Data Search for Object Re-identification
Yue Yao · Tom Gedeon · Liang Zheng
SOOD: Towards Semi-Supervised Oriented Object Detection
Wei Hua · Dingkang Liang · jingyu li · Xiaolong Liu · Zhikang Zou · Xiaoqing Ye · Xiang Bai
Zero-Shot Object Counting
Jingyi Xu · Hieu Le · Vu Nguyen · Viresh Ranjan · Dimitris Samaras
SAP-DETR: Bridging the Gap between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency
Yang Liu · Yao Zhang · Yixin Wang · Yang Zhang · Jiang Tian · zhongchao shi · Jianping Fan · Zhiqiang He
Knowledge Combination to Learn Rotated Detection Without Rotated Annotation
Tianyu Zhu · Bryce Ferenczi · Pulak Purkait · Tom Drummond · Hamid Rezatofighi · Anton Hengel
The Treasure Beneath Multiple Annotations: An Uncertainty-aware Edge Detector
Caixia Zhou · Yaping Huang · Mengyang Pu · Qingji Guan · Li Huang · Haibin Ling
Decoupled Semantic Prototypes enable learning from arbitrary annotation types for semi-weakly segmentation in expert-driven domains
Simon Reiß · Constantin Seibold · Alexander Freytag · Erik Rodner · Rainer Stiefelhagen
Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt
HAO LI · Dingwen Zhang · Nian Liu · Lechao Cheng · Yalun Dai · Chao Zhang · Xinggang Wang · Junwei Han
STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection
Zhenglin Zhou · Huaxia Li · Hong Liu · Nanyang Wang · Gang Yu · Rongrong Ji
Fuzzy Positive Learning for Semi-supervised Semantic Segmentation
Pengchong Qiao · Zhidan Wei · Yu Wang · Zhennan Wang · Guoli Song · FAN XU · XIANGYANG JIANG · LIANG LIU · JIE CHEN
Sparsely Annotated Semantic Segmentation with Adaptive Gaussian Mixtures
Linshan Wu · Zhun Zhong · Leyuan Fang · Xingxin He · Qiang Liu · Jiayi Ma · Hao Chen
Spatial-Temporal Concept based Explanation of 3D ConvNets
Ying Ji · Yu Wang · Jien Kato
Weakly-Supervised Domain Adaptive Semantic Segmentation with Prototypical Contrastive Learning
Anurag Das · Yongqin Xian · Dengxin Dai · Bernt Schiele
Exemplar-FreeSOLO: Enhancing Unsupervised Instance Segmentation with Exemplars
TAOSEEF ISHTIAK · Qing En · Yuhong Guo
Decoupling Human and Camera Motion from Videos in the Wild
Vickie Ye · Georgios Pavlakos · Jitendra Malik · Angjoo Kanazawa
CIRCLE: Capture In Rich Contextual Environments
Joao Araujo · Jiaman Li · Karthik Vetrivel · Rishi Agarwal · Deepak Gopinath · Jiajun Wu · Alexander Clegg · Karen Liu
CARTO: Category and Joint Agnostic Reconstruction of ARTiculated Objects
Nick Heppert · Muhammad Zubair Irshad · Sergey Zakharov · Katherine Liu · Rareș Ambruș · Jeannette Bohg · Abhinav Valada · Thomas Kollar
DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects
Chen Bao · Helin Xu · Yuzhe Qin · Xiaolong Wang
FLEX: Full-Body Grasping Without Full-Body Grasps
Purva Tendulkar · Didac Suris Coll-Vinent · Carl Vondrick
Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand Shapes
Jihyun Lee · Minhyuk Sung · Honggyu Choi · Tae-Kyun Kim
One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer
Jing Lin · Ailing Zeng · Haoqian Wang · Lei Zhang · Yu Li
Implicit 3D Human Mesh Recovery using Consistency with Pose and Shape from Unseen-view
Hanbyel Cho · Yooshin Cho · Jaesung Ahn · Junmo Kim
Flow supervision for Deformable NeRF
Chaoyang Wang · Lachlan MacDonald · Laszlo Jeni · Simon Lucey
FlexNeRF: Photorealistic Free-viewpoint Rendering of Moving Humans from Sparse Views
Vinoj Yasanga Jayasundara Magalle Hewa · Amit Agrawal · Nicolas Heron · Abhinav Shrivastava · Larry Davis
POEM: Reconstructing Hand in a Point Embedded Multi-view Stereo
Lixin Yang · Jian Xu · Licheng Zhong · Xinyu Zhan · Zhicheng Wang · Kejian Wu · Cewu Lu
Clothed Human Performance Capture with a Double-layer Neural Radiance Fields
Kangkan Wang · Guofeng Zhang · Suxu Cong · Jian Yang
VGFlow: Visibility guided Flow Network for Human Reposing
Rishabh Jain · Krishna Kumar Singh · Mayur Hemani · Jingwan Lu · Mausoom Sarkar · Duygu Ceylan · Balaji Krishnamurthy
HandNeRF: Neural Radiance Fields for Animatable Interacting Hands
Zhiyang Guo · Wengang Zhou · Min Wang · Li Li · Houqiang Li
PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters
Shuhong Chen · Kevin Zhang · Yichun Shi · Heng Wang · Yiheng Zhu · Guoxian Song · Sizhe An · Janus Kristjansson · Xiao Yang · Matthias Zwicker
PointAvatar: Deformable Point-based Head Avatars from Videos
Yufeng Zheng · Wang Yifan · Gordon Wetzstein · Michael Black · Otmar Hilliges
Ham2Pose: Animating Sign Language Notation into Pose Sequences
Rotem Shalev Arkushin · Amit Moryossef · Ohad Fried
Auto-CARD: Efficient and Robust Codec Avatar Driving for Real-time Mobile Telepresence
Yonggan Fu · Yuecheng Li · Chenghui Li · Jason Saragih · Peizhao Zhang · Xiaoliang Dai · Yingyan Lin
Learning Locally Editable Virtual Humans
Hsuan-I Ho · Lixin Xue · Jie Song · Otmar Hilliges
Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation
Rui Zhao · Wei Li · Zhipeng Hu · Lincheng Li · Zhengxia Zou · Zhenwei Shi · Changjie Fan
Learning Neural Parametric Head Models
Simon Giebenhain · Tobias Kirschstein · Markos Georgopoulos · Martin Rünz · Lourdes Agapito · Matthias Niessner
Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars
Jingxiang Sun · Xuan Wang · Lizhen Wang · Xiaoyu Li · Yong Zhang · Hongwen Zhang · Yebin Liu
Graphics Capsule: Learning Hierarchical 3D Face Representations from 2D Images
Chang Yu · Xiangyu Zhu · Xiaomei Zhang · Zhaoxiang Zhang · Zhen Lei
Parameter Efficient Local Implicit Image Function Network for Face Segmentation
Mausoom Sarkar · Nikitha S R · Mayur Hemani · Rishabh Jain · Balaji Krishnamurthy
StyleGene: Crossover and Mutation of Region-level Facial Genes for Kinship Face Synthesis
Hao Li · Xianxu Hou · Zepeng Huang · Linlin Shen
PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°
Sizhe An · Hongyi Xu · Yichun Shi · Guoxian Song · Umit Ogras · Linjie Luo
Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion
Yushi LAN · Xuyi Meng · Shuai Yang · CHEN CHANGE LOY · Bo Dai
3D Highlighter: Localizing Regions on 3D Shapes via Text Descriptions
Dale Decatur · Itai Lang · Rana Hanocka
Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models
Jiale Xu · Xintao Wang · Weihao Cheng · Yan-Pei Cao · Ying Shan · Xiaohu Qie · Shenghua Gao
Efficient View Synthesis and 3D-based Multi-Frame Denoising with Multiplane Feature Representations
Thomas Tanay · Ales Leonardis · Matteo Maggioni
Diffusion-Based Signed Distance Fields for 3D Shape Generation
Jaehyeok Shim · Changwoo Kang · Kyungdon Joo
Persistent Nature: A Generative Model of Unbounded 3D Worlds
Lucy Chai · Richard Tucker · Zhengqi Li · Phillip Isola · Noah Snavely
OReX: Object Reconstruction from Planar Cross-sections Using Neural Fields
Haim Sawdayee · Amir Vaxman · Amit Bermano
Sphere-Guided Training of Neural Implicit Surfaces
Andreea Dogaru · Andrei-Timotei Ardelean · Savva Ignatyev · Egor Zakharov · Evgeny Burnaev
NeuralUDF: Learning Unsigned Distance Fields for Multi-view Reconstruction of Surfaces with Arbitrary Topologies
Xiaoxiao Long · Cheng Lin · Lingjie Liu · Yuan Liu · Peng Wang · Christian Theobalt · Taku Komura · Wenping Wang
Looking Through the Glass: Neural Surface Reconstruction Against High Specular Reflections
Jiaxiong Qiu · Peng-Tao Jiang · Yifan Zhu · Ze-Xin Yin · Ming-Ming Cheng · Bo Ren
Teleidoscopic Imaging System for Microscale 3D Shape Reconstruction
Ryo Kawahara · Meng-Yu Kuo · Shohei Nobuhara
The Differentiable Lens: Compound Lens Search over Glass Surfaces and Materials for Object Detection
Geoffroi Côté · Fahim Mannan · Simon Thibault · Jean-Francois Lalonde · Felix Heide
SunStage: Portrait Reconstruction and Relighting using the Sun as a Light Stage
Yifan Wang · Aleksander Holynski · Xiuming Zhang · Cecilia Zhang
Nighttime smartphone reflective flare removal using optical center symmetry prior
Yuekun Dai · Yihang Luo · Shangchen Zhou · Chongyi Li · CHEN CHANGE LOY
ORCA: Glossy Objects as Radiance Field Cameras
Kushagra Tiwary · Akshat Dave · Nikhil Behari · Tzofi Klinghoffer · Ashok Veeraraghavan · Ramesh Raskar
ReLight My NeRF: A Dataset for Novel View Synthesis and Relighting of Real World Objects
Marco Toschi · Riccardo De Matteo · Riccardo Spezialetti · Daniele Gregorio · Luigi Di Stefano · Samuele Salti
Neural Scene Chronology
Haotong Lin · Qianqian Wang · Ruojin Cai · Sida Peng · Hadar Averbuch-Elor · Xiaowei Zhou · Noah Snavely
DyNCA: Real-time Dynamic Texture Synthesis Using Neural Cellular Automata
Ehsan Pajouheshgar · Yitao Xu · Tong Zhang · Sabine Süsstrunk
TriVol: Point Cloud Rendering via Triple Volumes
Tao Hu · Xiaogang Xu · Ruihang Chu · Jiaya Jia
Occlusion-Free Scene Recovery via Neural Radiance Fields
Chengxuan Zhu · Renjie Wan · Yunkai Tang · Boxin Shi
Transforming Radiance Field with Lipschitz Network for Photorealistic 3D Scene Stylization
Zicheng Zhang · Yinglu Liu · Congying Han · Yingwei Pan · Tiande Guo · Ting Yao
PaletteNeRF: Palette-based Appearance Editing of Neural Radiance Fields
Zhengfei Kuang · Fujun Luan · Sai Bi · Zhixin Shu · Gordon Wetzstein · Kalyan Sunkavalli
Masked Wavelet Representation for Compact Neural Radiance Fields
Daniel Rho · Byeonghyeon Lee · Seungtae Nam · Joo Chan Lee · Jong Hwan Ko · Eunbyung Park
SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields
Ashkan Mirzaei · Tristan Aumentado-Armstrong · Konstantinos Derpanis · Jonathan Kelly · Marcus Brubaker · Igor Gilitschenski · Alex Levinshtein
MixNeRF: Modeling a Ray with Mixture Density for Novel View Synthesis from Sparse Inputs
Seunghyeon Seo · Donghoon Han · Yeonjin Chang · Nojun Kwak
GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images
Jianchuan Chen · Wentao Yi · Liqian Ma · Xu Jia · Huchuan Lu
NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors
Congyue Deng · Chiyu Jiang · Charles R. Qi · Xinchen Yan · Yin Zhou · Leonidas Guibas · Dragomir Anguelov
RobustNeRF: Ignoring Distractors with Robust Losses
Sara Sabour · Suhani Vora · Daniel Duckworth · Ivan Krasin · David Fleet · Andrea Tagliasacchi
High-fidelity Event-Radiance Recovery via Transient Event Frequency
Jin Han · Yuta Asano · Boxin Shi · Yinqiang Zheng · Zhihang Zhong
TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization
Fabrizio Guillaro · Davide Cozzolino · Avneesh Sud · Nicholas Dufour · Luisa Verdoliva
CLIP2Protect: Protecting Facial Privacy using Text-Guided Makeup via Adversarial Latent Search
Fahad Shamshad · Muhammad Muzammal Naseer · Karthik Nandakumar
Discrete Point-wise Attack Is Not Enough: Generalized Manifold Adversarial Attack for Face Recognition
Qian Li · Yuxiao Hu · Ye Liu · Dongxiao Zhang · Xin Jin · Yuntian Chen
Generalist: Decoupling Natural and Robust Generalization
Hongjun Wang · Yisen Wang
AGAIN: Adversarial Training with Attribution Span Enlargement and Hybrid Feature Fusion
Shenglin Yin · kelu Yao · Sheng Shi · Yangzhou Du · Zhen Xiao
HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation
Jian Ding · Nan Xue · Gui-Song Xia · Bernt Schiele · Dengxin Dai
Pruning Parameterization with Bi-level Optimization for Efficient Semantic Segmentation on the Edge
Changdi Yang · Pu Zhao · Yanyu Li · Wei Niu · Jiexiong Guan · Hao Tang · Minghai Qin · Bin Ren · Xue Lin · Yanzhi Wang
Towards Open-World Segmentation of Parts
Tai-Yu Pan · Qing Liu · Wei-Lun Chao · Brian Price
SegLoc: Learning Segmentation-based Representations for Privacy-Preserving Visual Localization
Maxime Pietrantoni · Martin Humenberger · Torsten Sattler · Gabriela Csurka
GeoNet: Benchmarking Unsupervised Adaptation across Geographies
Tarun Kalluri · Wangdong Xu · Manmohan Chandraker
Modeling Entities as Semantic Points for Visual Information Extraction in the Wild
Zhibo Yang · Rujiao Long · Pengfei Wang · Sibo Song · Humen Zhong · Wenqing Cheng · Xiang Bai · Cong Yao
DPF: Learning Dense Prediction Fields with Weak Supervision
Xiaoxue Chen · Yuhang Zheng · Yupeng Zheng · Qiang Zhou · Hao Zhao · Guyue Zhou · Ya-Qin Zhang
Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning
Man Liu · Feng Li · Chunjie Zhang · Yunchao Wei · Huihui Bai · Yao Zhao
Universal Instance Perception as Object Discovery and Retrieval
Bin Yan · Yi Jiang · Jiannan Wu · Dong Wang · Ping Luo · Zehuan Yuan · Huchuan Lu
Learning Attention as Disentangler for Compositional Zero-shot Learning
Shaozhe Hao · Kai Han · Kwan-Yee K. Wong
CLIP is Also an Efficient Segmenter: A Text-Driven Approach for Weakly Supervised Semantic Segmentation
Yuqi Lin · Minghao Chen · Wenxiao Wang · Boxi Wu · Ke Li · Binbin Lin · Haifeng Liu · Xiaofei He
Self-supervised Implicit Glyph Attention for Text Recognition
Tongkun Guan · Chaochen Gu · Jingzheng Tu · Xue Yang · Qi Feng · yudi zhao · Wei Shen
Visual Recognition by Request
Chufeng Tang · Lingxi Xie · XIAOPENG ZHANG · Xiaolin Hu · Qi Tian
Aligning Bag of Regions for Open-Vocabulary Object Detection
Size Wu · Wenwei Zhang · Sheng Jin · Wentao Liu · CHEN CHANGE LOY
CLIP^2: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data
Yihan Zeng · Chenhan Jiang · Jiageng Mao · Jianhua Han · Chaoqiang Ye · Qingqiu Huang · Dit-Yan Yeung · Zhen Yang · Xiaodan Liang · Hang Xu
CapDet: Unifying Dense Captioning and Open-World Detection Pretraining
Yanxin Long · Youpeng Wen · Jianhua Han · Hang Xu · Pengzhen Ren · Wei Zhang · Shen Zhao · Xiaodan Liang
Towards Unified Scene Text Spotting based on Sequence Generation
Taeho Kil · Seonghyeon Kim · Sukmin Seo · Yoonsik Kim · Daehee Kim
Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
Renrui Zhang · Xiangfei Hu · Bohao Li · Siyuan Huang · Hanqiu Deng · Yu Qiao · Peng Gao · Hongsheng Li
Boundary-aware Backward-Compatible Representation via Adversarial Learning in Image Retrieval
Tan Pan · Furong Xu · Xudong Yang · Sifeng He · Chen Jiang · Qingpei Guo · Feng Qian · Xiaobo Zhang · Yuan Cheng · Lei Yang · Wei Chu
Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!
Zaid Khan · Vijay Kumar B G · Samuel Schulter · Xiang Yu · Yun Fu · Manmohan Chandraker
ConStruct-VL: Data-Free Continual Structured VL Concepts Learning
James Smith · Paola Cascante-Bonilla · Assaf Arbelle · Donghyun Kim · Rameswar Panda · David Cox · Diyi Yang · Zsolt Kira · Rogerio Feris · Leonid Karlinsky
À-la-carte Prompt Tuning (APT): Combining Distinct Data Via Composable Prompting
Benjamin Bowman · Alessandro Achille · Luca Zancato · Matthew Trager · Pramuditha Perera · Giovanni Paolini · Stefano Soatto
Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering
Zhenwei Shao · Zhou Yu · Meng Wang · Jun Yu
Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning
Zhuowan Li · Xingrui Wang · Elias Stengel-Eskin · Adam Kortylewski · Wufei Ma · Benjamin Van Durme · Alan Yuille
Visual Programming: Compositional visual reasoning without training
Tanmay Gupta · Aniruddha Kembhavi
Multimodal Prompting with Missing Modalities for Visual Recognition
Yi-Lun Lee · Yi-Hsuan Tsai · Wei-Chen Chiu · Chen-Yu Lee
EXCALIBUR: Encouraging and Evaluating Embodied Exploration
Hao Zhu · Raghav Kapoor · So Yeon Min · Winson Han · Jiatai Li · Kaiwen Geng · Graham Neubig · Yonatan Bisk · Aniruddha Kembhavi · Luca Weihs
Iterative Vision-and-Language Navigation
Jacob Krantz · Shurjo Banerjee · Wang Zhu · Jason Corso · Peter Anderson · Stefan Lee · Jesse Thomason
Adaptive Zone-aware Hierarchical Planner for Vision-Language Navigation
Chen Gao · Xingyu Peng · Mi Yan · He Wang · Lirong Yang · Haibing Ren · Hongsheng Li · Si Liu
SkyEye: Self-Supervised Bird’s-Eye-View Semantic Mapping Using Monocular Frontal View Images
Nikhil Gosala · Kürsat Petek · Paulo Drews-Jr · Wolfram Burgard · Abhinav Valada
Natural Language-Assisted Sign Language Recognition
Ronglai Zuo · Fangyun Wei · Brian Mak
Learning to Predict Situation Hyper-Graphs for Video Question Answering
Aisha Urooj · Hilde Kuehne · Bo Wu · Kim Chheu · Walid Bousselham · Chuang Gan · Niels Lobo · Mubarak Shah
Align and Attend: Multimodal Summarization with Dual Contrastive Losses
Bo He · Jun Wang · Jielin Qiu · Trung Bui · Abhinav Shrivastava · Zhaowen Wang
Clover: Towards A Unified Video-Language Alignment and Fusion Model
Jingjia Huang · Yinan Li · Jiashi Feng · Xinglong Wu · Xiaoshuai Sun · Rongrong Ji
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval
Xudong Lin · Simran Tiwari · Shiyuan Huang · Manling Li · Mike Zheng Shou · Heng Ji · Shih-Fu Chang
PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
Hanlin Wang · Yilu Wu · Sheng Guo · Limin Wang
Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations
Yiwu Zhong · Licheng Yu · Yang Bai · Shangwen Li · Xueting Yan · Yin Li
Text-Visual Prompting for Efficient 2D Temporal Video Grounding
Yimeng Zhang · Xin Chen · Jinghan Jia · Sijia Liu · Ke Ding
Language-Guided Music Recommendation for Video via Prompt Analogies
Daniel McKee · Justin Salamon · Josef Sivic · Bryan Russell
MIST: Multi-Modal Iterative Spatial-Temporal Transformer for Long-Form Video Question Answering
Difei Gao · Luowei Zhou · Lei Ji · Linchao Zhu · Yi Yang · Mike Zheng Shou
Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization
Zheng · Jinxiang Liu · Peisen Zhao · Ya Zhang · Jianlong Chang · Qi Tian · Yanfeng Wang
Cascade Evidential Learning for Open-world Weakly-supervised Temporal Action Localization
Mengyuan Chen · Junyu Gao · Changsheng Xu
STMixer: A One-Stage Sparse Action Detector
Tao Wu · Mengqi Cao · Ziteng Gao · Gangshan Wu · Limin Wang
The Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction
Alexandros Stergiou · Dima Damen
A Large-scale Robustness Analysis of Video Action Recognition Models
Madeline Chantry · Naman Biyani · Prudvi Kamtam · Shruti Vyas · Hamid Palangi · Vibhav Vineet · Yogesh Rawat
Learning to Dub Movies via Hierarchical Prosody Models
Gaoxiang Cong · Liang Li · Yuankai Qi · Zheng-Jun Zha · Qi Wu · Wenyu Wang · Bin Jiang · Ming-Hsuan Yang · Qingming Huang
iQuery: Instruments as Queries for Audio-Visual Sound Separation
Jiaben Chen · Renrui Zhang · Dongze Lian · Jiaqi Yang · Ziyao Zeng · Jianbo Shi
Egocentric Auditory Attention Localization in Conversations
Fiona Ryan · Hao Jiang · Abhinav Shukla · James Rehg · Vamsi Krishna Ithapu
Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert
Jiadong Wang · Xinyuan Qian · Malu Zhang · Robby Tan · Haizhou Li
Source-Free Video Domain Adaptation with Spatial-Temporal-Historical Consistency Learning
Kai Li · Deep A Patel · Erik Kruus · Martin Min
Referring Multi-Object Tracking
Dongming Wu · Wencheng Han · Tiancai Wang · Xingping Dong · Xiangyu Zhang · Jianbing Shen
A Generalized Framework for Video Instance Segmentation
Miran Heo · Sukjun Hwang · Jeongseok Hyun · Hanjung Kim · Seoung Wug Oh · Joon-Young Lee · Seon Joo Kim
LSTFE-Net: Long Short-Term Feature Enhancement Network for Video Small Object Detection
Jinsheng Xiao · Yuanxu Wu · Yunhua Chen · Shurui Wang · Zhongyuan Wang · Jiayi Ma
Streaming Video Model
Yucheng Zhao · Chong Luo · Chuanxin Tang · Dongdong Chen · Noel Codella · Zheng-Jun Zha
Video Event Restoration Based on Keyframes for Video Anomaly Detection
Zhiwei Yang · Jing Liu · Zhaoyang Wu · Peng Wu · Xiaotao Liu
Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping
Long Lian · Zhirong Wu · Stella Yu
SeqTrack: Sequence to Sequence Learning for Visual Object Tracking
Xin Chen · Houwen Peng · Dong Wang · Huchuan Lu · Han Hu
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Limin Wang · Bingkun Huang · Zhiyu Zhao · Zhan Tong · Yinan He · Yi Wang · Yali Wang · Yu Qiao
Iterative Next Boundary Detection for Instance Segmentation of Tree Rings in Microscopy Images of Shrub Cross Sections
Alexander Gillert · Giulia Resente · Alba Anadon-Rosell · Martin Wilmking · Uwe Freiherr von Lukas
Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention
Mingyu Ding · Yikang Shen · Lijie Fan · Zhenfang Chen · Zitian Chen · Ping Luo · Joshua Tenenbaum · Chuang Gan
SimpSON: Simplifying Photo Cleanup with Single-Click Distracting Object Segmentation Network
Chuong Huynh · Yuqian Zhou · Zhe Lin · Connelly Barnes · Eli Shechtman · Sohrab Amirghodsi · Abhinav Shrivastava
AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders
Wele Bandara Bandara · Naman Patel · Ali Gholami · Mehdi Nikkhah · Motilal Agrawal · Vishal Patel
FlexiViT: One Model for All Patch Sizes
Lucas Beyer · Pavel Izmailov · Alexander Kolesnikov · Mathilde Caron · Simon Kornblith · Xiaohua Zhai · Matthias Minderer · Michael Tschannen · Ibrahim Alabdulmohsin · Filip Pavetic
Improving Visual Representation Learning through Perceptual Understanding
Samyakh Tukra · Fred Hoffman · Ken Chatfield
Revealing the Dark Secrets of Masked Image Modeling
Zhenda Xie · Zigang Geng · Jingcheng Hu · Zheng Zhang · Han Hu · Yue Cao
Non-Contrastive Unsupervised Learning of Physiological Signals from Video
Jeremy Speth · Nathan Vance · Patrick Flynn · Adam Czajka
High-resolution image reconstruction with latent diffusion models from human brain activity
Yu Takagi · Shinji Nishimoto
RIFormer: Keep Your Vision Backbone Effective But Removing Token Mixer
Jiahao Wang · Songyang Zhang · Yong Liu · Taiqiang Wu · Yujiu Yang · Xihui Liu · Kai Chen · Ping Luo · Dahua Lin
Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference
Haoran You · Yunyang Xiong · Xiaoliang Dai · Peizhao Zhang · Bichen Wu · Haoqi Fan · Peter Vajda · Yingyan Lin
EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention
Xinyu Liu · Houwen Peng · Ningxin Zheng · Yuqing Yang · Han Hu · Yixuan Yuan
InternImage: Exploring Large-Scale Vision Fundamental Models with Deformable Convolutions
Wenhai Wang · Jifeng Dai · Zhe Chen · Zhenhang Huang · Zhiqi Li · Xizhou Zhu · Xiaowei Hu · Tong Lu · Lewei Lu · Hongsheng Li · Xiaogang Wang · Yu Qiao
Memory-friendly Scalable Super-resolution via Rewinding Lottery Ticket Hypothesis
林 锦 · Xiaotong Luo · ming Hong · Yanyun Qu · Yuan Xie · Zongze Wu
Learned Image Compression with Mixed Transformer-CNN Architectures
Jinming Liu · Heming Sun · Jiro Katto
NIRVANA: Neural Implicit Representations of Videos with Adaptive Networks and Autoregressive Patch-wise Modeling
Shishira Maiya · Sharath Girish · Max Ehrlich · Hanyu Wang · Kwot Sin Lee · Patrick Poirson · Pengxiang Wu · Chen Wang · Abhinav Shrivastava
Complexity-guided Slimmable Decoder for Efficient Deep Video Compression
Zhihao Hu · Dong Xu
Context-Based Trit-Plane Coding for Progressive Image Compression
Seungmin Jeon · KWANG PYO CHOI · YOUNGO PARK · Chang-Su Kim
End-to-end Video Matting with Trimap Propagation
Wei-Lun Huang · Ming-Sui Lee
Rethinking Image Super Resolution from Long-Tailed Distribution Learning Perspective
Yuanbiao Gou · Peng Hu · Jiancheng Lv · Hongyuan Zhu · Xi Peng
Shape-aware Text-driven Layered Video Editing
Yao-Chih Lee · Ji-Ze Jang · Yi-Ting Chen · Elizabeth Qiu · Jia-Bin Huang
Dimensionality-Varying Diffusion Process
Han Zhang · Ruili Feng · Zhantao Yang · Lianghua Huang · Yu Liu · Yifei Zhang · Yujun Shen · Deli Zhao · Jingren Zhou · Fan Cheng
On Distillation of Guided Diffusion Models
Chenlin Meng · Robin Rombach · Ruiqi Gao · Diederik Kingma · Stefano Ermon · Jonathan Ho · Tim Salimans
Towards Flexible Multi-modal Document Models
Naoto Inoue · Kotaro Kikuchi · Edgar Simo-Serra · Mayu Otani · Kota Yamaguchi
Toward verifiable and reproducible human evaluation for text-to-image generation
Mayu Otani · Riku Togashi · Yu Sawai · Ryosuke Ishigami · Yuta Nakashima · Esa Rahtu · Janne Heikkila · Shin'ichi Satoh
Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models to Learn Any Unseen Style
Haoming Lu · Hazarapet Tunanyan · Kai Wang · Shant Navasardyan · Zhangyang Wang · Humphrey Shi
Freestyle Layout-to-Image Synthesis
Han Xue · Zhiwu Huang · Qianru Sun · Li Song · Wenjun Zhang
ReCo: Region-Controlled Text-to-Image Generation
Zhengyuan Yang · Jianfeng Wang · Zhe Gan · Linjie Li · Kevin Lin · Chenfei Wu · Nan Duan · Zicheng Liu · Ce Liu · Michael Zeng · Lijuan Wang
Conditional Text Image Generation with Diffusion Models
Yuanzhi Zhu · Zhaohai Li · Tianwei Wang · Mengchao He · Cong Yao
Fix the Noise: Disentangling Source Feature for Controllable Domain Translation
Dongyeun Lee · Jae Young Lee · Doyeon Kim · Jaehyun Choi · Jaejun Yoo · Junmo Kim
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
Ming Tao · Bing-Kun BAO · Hao Tang · Changsheng Xu
DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model
Gwanghyun Kim · Se Young Chun
NÜWA-LIP: Language-guided Image Inpainting with Defect-free VQGAN
Minheng Ni · Xiaoming Li · Wangmeng Zuo
Neural Preset for Color Style Transfer
Zhanghan Ke · Yuhao LIU · Lei Zhu · Nanxuan Zhao · Rynson Lau
Restoration of Hand-Drawn Architectural Drawings using Latent Space Mapping with Degradation Generator
Nakkwan Choi · Seungjae Lee · Yongsik Lee · Seungjoon Yang
Neural Fourier Filter Bank
Zhijie Wu · Yuhe Jin · Kwang Moo Yi
PyramidFlow: High-Resolution Defect Contrastive Localization using Pyramid Normalizing Flow
Jiarui Lei · Xiaobo Hu · Yue Wang · Dong Liu
PHA: Patch-wise High-frequency Augmentation for Transformer-based Person Re-identification
Guiwei Zhang · Yongfei Zhang · Tianyu Zhang · Bo Li · Shiliang Pu
Comprehensive and Delicate: An Efficient Transformer for Image Restoration
Haiyu Zhao · Yuanbiao Gou · Boyun Li · Dezhong Peng · Jiancheng Lv · Xi Peng
Ultrahigh Resolution Image/Video Matting with Spatio-Temporal Sparsity
Yanan SUN · Chi-Keung Tang · Yu-Wing Tai
Equivalent Transformation and Dual Stream Network Construction for Mobile Image Super-Resolution
Jiahao Chao · Zhou Zhou · Hongfan Gao · Jiali Gong · Zhengfeng Yang · Zhenbing Zeng · Lydia Dehbi
Real-time 6K Image Rescaling with Rate-distortion Optimization
Chenyang Qi · XIN YANG · Ka Leong Cheng · Ying-Cong Chen · Qifeng Chen
Human Guided Ground-truth Generation for Realistic Image Super-resolution
Du Chen · Jie Liang · Xindong Zhang · Ming Liu · Hui Zeng · Lei Zhang
Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
Weixia Zhang · Guangtao Zhai · Ying Wei · Xiaokang Yang · Kede Ma
Visual Recognition-Driven Image Restoration for Multiple Degradation with Intrinsic Semantics Recovery
Zizheng Yang · Jie Huang · Jiahao Chang · man zhou · Hu Yu · Jinghao Zhang · Feng Zhao
ShadowDiffusion: When Degradation Prior Meets Diffusion Model for Shadow Removal
Lanqing Guo · Chong Wang · Wenhan Yang · Siyu Huang · Yufei Wang · Hanspeter Pfister · Bihan Wen
Probability-based Global Cross-modal Upsampling for Pan-sharpening
Zeyu Zhu · Xiangyong Cao · man zhou · Junhao Huang · Deyu Meng
Real-time Controllable Denoising for Image and Video
Zhaoyang Zhang · Yitong Jiang · Wenqi Shao · Xiaogang Wang · Ping Luo · Kaimo Lin · Jinwei Gu
Zero-Shot Noise2Noise: Efficient Image Denoising without any Data
Youssef Mansour · Reinhard Heckel
Rawgment: Noise-Accounted RAW Augmentation Enables Recognition in a Wide Variety of Environments
Masakazu Yoshimura · Junji Otsuka · Atsushi Irie · Takeshi Ohashi
Structure Aggregation for Cross-Spectral Stereo Image Guided Denoising
Zehua Sheng · Zhu Yu · Xiongwei Liu · Siyuan Cao · Yuqi Liu · Hui-liang Shen · Huaqi Zhang
Self-supervised Blind Motion Deblurring with Deep Expectation Maximization
Ji Li · Weixi Wang · YUESONG NAN · Hui Ji
Joint HDR Denoising and Fusion: A Real-World Mobile HDR Image Dataset
Shuaizheng Liu · Xindong Zhang · Lingchen Sun · Zhetong Liang · Hui Zeng · Lei Zhang
MetaFusion: Infrared and Visible Image Fusion via Meta-Feature Embedding from Object Detection
Wenda Zhao · Shigeng Xie · Fan Zhao · You He · Huchuan Lu
FeatER: An Efficient Network for Human Reconstruction via Feature Map-Based TransformER
Ce Zheng · Matias Mendieta · Taojiannan Yang · Guo-Jun Qi · Chen Chen
Joint Video Multi-Frame Interpolation and Deblurring under Unknown Exposure Time
Wei Shang · Dongwei Ren · yi yang · Hongzhi Zhang · Kede Ma · Wangmeng Zuo
Learning Event Guided High Dynamic Range Video Reconstruction
Yixin Yang · Jin Han · Jinxiu Liang · Zhihang Zhong · Boxin Shi
Multi Domain Learning for Motion Magnification
JASDEEP SINGH · Subrahmanyam Murala · G Sankara Kosuru
EvShutter: Transforming Events for Unconstrained Rolling Shutter Correction
Julius Erbach · Stepan Tulyakov · Patricia Vitoria · Alfredo Bochicchio · YUANYOU LI
Continuous Intermediate Token Learning with Implicit Motion Manifold for Keyframe Based Motion Interpolation
Clinton Mo · Kun Hu · Chengjiang Long · Zhiyong Wang
Recurrent Vision Transformers for Object Detection with Event Cameras
Mathias Gehrig · Davide Scaramuzza
MoDi: Unconditional Motion Synthesis from Diverse Data
Sigal Raab · Inbal Leibovitch · Peizhuo Li · Kfir Aberman · Olga Sorkine-Hornung · Daniel Cohen-Or
Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry
Jiaxu Zhang · Junwu Weng · Di Kang · Fang Zhao · Shaoli Huang · Xuefei Zhe · Linchao Bao · Ying Shan · Jue Wang · Zhigang Tu
Real-time Multi-person Eyeblink Detection in the Wild for Untrimmed Video
Wenzheng Zeng · Yang Xiao · Sicheng Wei · Jinfang Gan · Xintao Zhang · Zhiguo Cao · Zhiwen Fang · Joey Zhou
SelfME: Self-Supervised Motion Learning for Micro-Expression Recognition
Xinqi Fan · Xueli CHEN · Mingjie Jiang · Ali Shahid · Hong Yan
An In-depth Exploration of Person Re-identification and Gait Recognition in Cloth-Changing Conditions
Weijia Li · Saihui Hou · Chunjie Zhang · Chunshui Cao · Xu Liu · Yongzhen Huang · Yao Zhao
Simple Cues Lead to a Strong Multi-Object Tracker
Jenny Seidenschwarz · Guillem Braso · Víctor Castro Serrano · Ismail Elezi · Laura Leal-Taixé
Tracking through Containers and Occluders in the Wild
Basile Van Hoorick · Pavel Tokmakov · Simon Stent · Jie Li · Carl Vondrick
Indiscernible Object Counting in Underwater Scenes
Guolei Sun · Zhaochong An · Yun Liu · Ce Liu · Christos Sakaridis · Deng-Ping Fan · Luc Van Gool
Affordances from Human Videos as a Versatile Representation for Robotics
Shikhar Bahl · Russell Mendonca · Lili Chen · Unnat Jain · Deepak Pathak
Galactic: Scaling End-to-End Reinforcement Learning for Rearrangement at 100k Steps-Per-Second
Vincent-Pierre Berges · Andrew Szot · Devendra Singh Chaplot · Aaron Gokaslan · Roozbeh Mottaghi · Dhruv Batra · Eric Undersander
Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion
Davis Rempe · Zhengyi Luo · Xue Bin Peng · Ye Yuan · Kris Kitani · Karsten Kreis · Sanja Fidler · Or Litany
FJMP: Factorized Joint Multi-Agent Motion Prediction over Learned Directed Acyclic Interaction Graphs
Luke Rowe · Martin Ethier · Eli-Henry Dykhne · Krzysztof Czarnecki
Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction
Shaofei Cai · Zihao Wang · Xiaojian Ma · Anji Liu · Yitao Liang
ReasonNet: End-to-End Driving with Temporal and Global Reasoning
Hao Shao · Letian Wang · Ruobing Chen · Steven Waslander · Hongsheng Li · Yu Liu
V2V4Real: A large-scale real-world dataset for Vehicle-to-Vehicle Cooperative Perception
Runsheng Xu · Xin Xia · JINLONG LI · Hanzhao Li · Shuo Zhang · Zhengzhong Tu · Zonglin Meng · Hao Xiang · Xiaoyu Dong · Rui Song · Hongkai Yu · Bolei Zhou · Jiaqi Ma
Bayesian posterior approximation with stochastic ensembles
Oleksandr Balabanov · Bernhard Mehlig · Hampus Linander
DistractFlow: Improving Optical Flow Estimation via Realistic Distractions and Pseudo-Labeling
Jisoo Jeong · Hong Cai · Risheek Garrepalli · Fatih Porikli
Sliced optimal partial transport
Yikun Bai · Bernhard Schmitzer · Matthew Thorpe · Soheil Kolouri
Unsupervised Deep Asymmetric Stereo Matching with Spatially-Adaptive Self-Similarity
Taeyong Song · Sunok Kim · Kwanghoon Sohn
Similarity Metric Learning For RGB-Infrared Group Re-Identification
Jianghao Xiong · Jianhuang Lai
Generalizable Local Feature Pre-training for Deformable Shape Analysis
SOUHAIB ATTAIKI · Lei Li · Maks Ovsjanikov
Quantum Multi-Model Fitting
Matteo Farina · Luca Magri · Willi Menapace · Elisa Ricci · Vladislav Golyanik · Federica Arrigoni
Bridging Search Region Interaction with Template for RGB-T Tracking
Tianrui Hui · Zizheng Xun · Fengguang Peng · Junshi Huang · Xiaoming Wei · Xiaolin Wei · Jiao Dai · Jizhong Han · Si Liu
Local Connectivity-Based Density Estimation for Face Clustering
Junho Shin · Hyo-Jun Lee · Hyunseop Kim · Jong-Hyeon Baek · Daehyun Kim · Yeong Jun Koh
Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration
Guofeng Mei · Hao Tang · Xiaoshui Huang · Weijie Wang · Juan Liu · Jian Zhang · Luc Van Gool · Qiang Wu
NerVE: Neural Volumetric Edges for Parametric Curve Extraction from Point Cloud
Xiangyu Zhu · Dong Du · Weikai Chen · Zhiyou Zhao · Yinyu Nie · Xiaoguang Han
SHS-Net: Learning Signed Hyper Surfaces for Oriented Normal Estimation of Point Clouds
Qing Li · Huifang Feng · Kanle Shi · Yue Gao · Yi Fang · Yushen Liu · Zhizhong Han
AnchorFormer: Point Cloud Completion from Discriminative Nodes
ZHIKAI CHEN · Fuchen Long · Zhaofan Qiu · Ting Yao · Wengang Zhou · Jiebo Luo · Tao Mei
GeoMAE: Masked Geometric Target Prediction for Self-supervised Point Cloud Pre-Training
Xiaoyu Tian · Haoxi Ran · Yue Wang · Hang Zhao
Symmetric Shape-Preserving Autoencoder for Unsupervised Real Scene Point Cloud Completion
Changfeng Ma · Yinuo Chen · Pengxiao Guo · Jie Guo · Chongjun Wang · Yanwen Guo
ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution
Tuan Ngo · Binh-Son Hua · Khoi Nguyen
itKD: Interchange Transfer-based Knowledge Distillation for 3D Object Detection
Hyeon Cho · Junyong Choi · Geonwoo Baek · Wonjun Hwang
DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets
Haiyang Wang · Chen Shi · Shaoshuai Shi · Meng Lei · Sen Wang · Di He · Bernt Schiele · Liwei Wang
WeatherStream: Light Transport Automation of Single Image Deweathering
Howard Zhang · Yunhao Ba · Ethan Yang · Varan Mehra · Blake Gella · Akira Suzuki · Arnold Pfahnl · Chethan Chinder Chandrappa · Alex Wong · Achuta Kadambi
LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs
Yukang Chen · Jianhui Liu · Xiangyu Zhang · XIAOJUAN QI · Jiaya Jia
PVT-SSD: Single-Stage 3D Object Detector with Point-Voxel Transformer
Honghui Yang · Wenxiao Wang · Minghao Chen · Binbin Lin · Tong He · Hua Chen · Xiaofei He · Wanli Ouyang
Unsupervised Intrinsic Image Decomposition with LiDAR Intensity
Shogo Sato · Yasuhiro Yao · Taiga Yoshida · Takuhiro Kaneko · Shingo Ando · Jun Shimamura
ALSO: Automotive Lidar Self-supervision by Occupancy estimation
Alexandre Boulch · Corentin Sautier · Björn Michele · Gilles Puy · Renaud Marlet
MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training
Runsen Xu · Tai Wang · Wenwei Zhang · Runjian Chen · Jinkun Cao · Jiangmiao Pang · Dahua Lin
Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images
bowei du · Yecheng Huang · JX Chen · Di Huang
Center Focusing Network for Real-Time LiDAR Panoptic Segmentation
Xiaoyan Li · Gang Zhang · Boyue Wang · Yongli Hu · Baocai Yin
Learning and Aggregating Lane Graphs for Urban Automated Driving
Martin Büchner · Jannik Zürn · Ion-George Todoran · Abhinav Valada · Wolfram Burgard
LiDAR-in-the-loop Hyperparameter Optimization
Félix Antoine Goudreault · Dominik Scheuble · Mario Bijelic · Nicolas Robidoux · Felix Heide
Bi-directional LiDAR-Radar Fusion for 3D Dynamic Object Detection
颖杰 王 · Jiajun Deng · Yao Li · Jinshui Hu · Cong Liu · Yu Zhang · Jianmin Ji · Wanli Ouyang · Yanyong Zhang
Toward RAW Object Detection: A New Benchmark and A New Model
Ruikang Xu · Chang Chen · Jingyang Peng · Cheng Li · Yibin Huang · Fenglong Song · Youliang Yan · Zhiwei Xiong
Resource-Efficient RGBD Aerial Tracking
Jinyu Yang · Shang Gao · Zhe Li · Feng Zheng · Ales Leonardis
Learned Two-Plane Perspective Prior based Image Resampling for Efficient Object Detection
Anurag Ghosh · Dinesh Reddy Narapureddy · Christoph Mertz · Srinivasa Narasimhan
Phase-Shifting Coder: Predicting Accurate Orientation in Oriented Object Detection
Yi Yu · Feipeng Da
PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers
Ryan Grainger · Thomas Paniagua · Xi Song · Naresh Cuntoor · MUN WAI LEE · Tianfu Wu
Global Vision Transformer Pruning with Hessian-Aware Saliency
Huanrui Yang · Hongxu Yin · Maying Shen · Pavlo Molchanov · Hai Li · Jan Kautz
Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
Ning Zhang · Francesco Nex · George Vosselman · Norman Kerle
CompletionFormer: Depth Completion with Convolutions and Vision Transformers
Youmin Zhang · Xianda Guo · Matteo Poggi · Zheng Zhu · Guan Huang · Stefano Mattoccia
TINC: Tree-structured Implicit Neural Compression
Runzhao Yang
WIRE: Wavelet Implicit Neural Representations
Vishwanath Saragadam · Daniel LeJeune · Jasper Tan · Guha Balakrishnan · Ashok Veeraraghavan · Richard Baraniuk
Video Compression with Entropy-Constrained Neural Representations
Carlos Gomes · Roberto Azevedo · Christopher Schroers
MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding
Bowen Liu · Yu Chen · Rakesh Chowdary Machineni · Shiyu Liu · Hun-Seok Kim
EfficientSCI: Densely Connected Network with Space-time Factorization for Large-scale Video Snapshot Compressive Imaging
lishun wang · Miao Cao · Xin Yuan
Regularized Vector Quantization for Tokenized Image Synthesis
Jiahui Zhang · Fangneng Zhan · Christian Theobalt · Shijian Lu
Video Probabilistic Diffusion Models in Projected Latent Space
Sihyun Yu · Kihyuk Sohn · Subin Kim · Jinwoo Shin
Conditional Image-to-Video Generation with Latent Flow Diffusion Models
Haomiao Ni · Changhao Shi · Kai Li · Sharon Huang · Martin Min
Class-Balancing Diffusion Models
Yiming QIN · Huangjie Zheng · Jiangchao Yao · Mingyuan Zhou · Ya Zhang
HOLODIFFUSION: Training a 3D Diffusion Model using 2D Images
Animesh Karnewar · Andrea Vedaldi · David Novotny · Niloy Mitra
Self-Guided Diffusion Models
Tao Hu · David Zhang · Yuki Asano · Gertjan Burghouts · Cees Snoek
LayoutFormer++: Conditional Graphic Layout Generation via Constraint Serialization and Decoding Space Restriction
Zhaoyun Jiang · Jiaqi Guo · Shizhao Sun · Huayu Deng · Zhongkai Wu · Vuksan Mijovic · Zijiang Yang · Jian-Guang Lou · Dongmei Zhang
InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks · Aleksander Holynski · Alexei A. Efros
Paint by Example
SpaText: Spatio-Textual Representation for Controllable Image Generation
Omri Avrahami · Thomas Hayes · Oran Gafni · Sonal Gupta · Yaniv Taigman · Devi Parikh · Dani Lischinski · Ohad Fried · Xi Yin
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
Su Wang · Chitwan Saharia · Ceslee Montgomery · Jordi Pont-Tuset · Shai Noy · Stefano Pellegrini · Yasumasa Onoe · Sarah Laszlo · David Fleet · Radu Soricut · Jason Baldridge · Mohammad Norouzi · Peter Anderson · William Chan
LayoutDM: Transformer-based Diffusion Model for Layout Generation
Shang Chai · Liansheng Zhuang · Fengying Yan
CLIP-Sculptor: Zero-Shot Generation of High-Fidelity and Diverse Shapes from Natural Language
Aditya Sanghi · Rao Fu · Vivian Liu · Karl Willis · Hooman Shayani · Amir Khasahmadi · Srinath Sridhar · Daniel Ritchie
Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer
Hao Tang · Songhua Liu · Tianwei Lin · Shaoli Huang · Fu Li · Dongliang He · Xinchao Wang
DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality
Yuqing Wang · Yizhi Wang · Longhui Yu · Yuesheng Zhu · Zhouhui Lian
ObjectStitch: Object Compositing with Diffusion Model
Yizhi Song · Zhifei Zhang · Zhe Lin · Scott Cohen · Brian Price · Jianming Zhang · Soo Ye Kim · Daniel Aliaga
CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer
Linfeng Wen · Chengying Gao · Changqing Zou
LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization
Sheng Liu · Cong Phuoc Huynh · Cong Chen · Maxim Arap · Raffay Hamid
Efficient and Explicit Modelling of Image Hierarchies for Image Restoration
Yawei Li · Yuchen Fan · Xiaoyu Xiang · Denis Demandolx · Rakesh Ranjan · Radu Timofte · Luc Van Gool
GamutMLP: A Lightweight MLP for Color Loss Recovery
Hoang Le · Brian Price · Scott Cohen · Michael Brown
Cascaded Local Implicit Transformer for Arbitrary-Scale Super-Resolution
Hao-Wei Chen · Yu-Syuan Xu · Min-Fong Hong · Yi-Min Tsai · Hsien-Kai Kuo · Chun-Yi Lee
Super-Resolution Neural Operator
Min Wei · Xuesong Zhang
Guided Depth Super-Resolution by Deep Anisotropic Diffusion
Nando Metzger · Rodrigo Daudt · Konrad Schindler
AutoFocusFormer: Image Segmentation off the Grid
Ziwen Chen · Kaushik Patnaik · Shuangfei Zhai · Alvin Wan · Zhile Ren · Alexander Schwing · R Colburn · Li Fuxin
AccelIR: Task-aware Image Compression for Accelerating Neural Restoration
Juncheol Ye · Hyunho Yeo · Jinwoo Park · Dongsu Han
Raw Image Reconstruction with Learned Compact Metadata
Yufei Wang · Yi Yu · Wenhan Yang · Lanqing Guo · Lap-Pui Chau · Alex Kot · Bihan Wen
Context-aware Pretraining for Efficient Blind Image Decomposition
Chao Wang · Zhedong Zheng · Ruijie Quan · Yifan Sun · Yi Yang
Deep Random Projector: Accelerated Deep Image Prior
Taihui Li · Hengkang Wang · Zhong Zhuang · Ju Sun
Spectral Bayesian Uncertainty for Image Super-resolution
Tao Liu · Jun Cheng · Shan Tan
Contrastive Semi-supervised Learning for Underwater Image Restoration via Reliable Bank
Shirui Huang · Keyan Wang · Huan Liu · Jun Chen · Yunsong Li
You Do Not Need Additional Priors or Regularizers in Retinex-based Low-light Image Enhancement
Huiyuan Fu · Wenkai Zheng · Xiangyu Meng · Xin Wang · Chuanming Wang · Huadong Ma
Decoupling-and-Aggregating for Image Exposure Correction
Yang Wang · Long Peng · Liang Li · Yang Cao · Zheng-Jun Zha
Self-supervised Non-uniform Kernel Estimation with Flow-based Motion Prior for Blind Image Deblurring
Zhenxuan Fang · Fangfang Wu · Weisheng Dong · Xin Li · Jinjian Wu · Guangming Shi
Neural Texture Synthesis with Guided Correspondence
Yang Zhou · Kaijian Chen · rongjun xiao · Hui Huang
GradICON: Approximate Diffeomorphisms via Gradient Inverse Consistency
Lin Tian · Thomas Greer · François-Xavier Vialard · Roland Kwitt · Raul San Jose Estepar · Richard Rushmore · Nikolaos Makris · Sylvain Bouix · Marc Niethammer
TransFlow: Transformer as Flow Learner
Yawen Lu · Qifan Wang · Siqi Ma · Tong Geng · Yingjie Victor Chen · Huaijin Chen · Dongfang Liu
Video Dehazing via a Multi-Range Temporal Alignment Network with Physical Prior
Jiaqi Xu · Xiaowei Hu · Lei Zhu · DOU QI · Jifeng Dai · Yu Qiao · Pheng-Ann Heng
Event-Based Frame Interpolation with Ad-hoc Deblurring
Lei Sun · Christos Sakaridis · Jingyun Liang · Peng Sun · Jiezhang Cao · Kai Zhang · Qi Jiang · Kaiwei Wang · Luc Van Gool
Event-based Video Frame Interpolation with Cross-Modal Asymmetric Bidirectional Motion Fields
Taewoo Kim · Yujeong Chae · Hyun-Kurl Jang · Kuk-Jin YOON
"Seeing" Electric Network Frequency from Events
Lexuan Xu · Guang Hua · Haijian Zhang · Lei Yu · Ning Qiao
Executing your Commands via Motion Diffusion in Latent Space
Xin Chen · Biao Jiang · Wen Liu · Zilong Huang · BIN FU · Tao Chen · Gang Yu
Event-guided Person Re-Identification via Sparse-Dense Complementary Learning
Chengzhi Cao · Xueyang Fu · Hongjian Liu · Yukun Huang · Kunyu Wang · Jiebo Luo · Zheng-Jun Zha
Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis
Duomin Wang · Yu Deng · Zixin Yin · Heung-Yeung Shum · Baoyuan Wang
One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field
Weichuang Li · Longhao Zhang · Dong Wang · Bin Zhao · Zhigang Wang · Mulin Chen · Bang Zhang · Zhongjian Wang · Liefeng Bo · Xuelong Li
Rethinking the Learning Paradigm for Dynamic Facial Expression Recognition
Hanyang Wang · Bo Li · Shuang Wu · Siyuan Shen · Feng Liu · Shouhong Ding · Aimin Zhou
Multi-modal Gait Recognition via Effective Spatial-Temporal Feature Fusion
Yufeng Cui · Yimei Kang
MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking
Zheng Qin · Sanping Zhou · Le Wang · Jinghai Duan · Gang Hua · Wei Tang
Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking
Ziqi Pang · Jie Li · Pavel Tokmakov · Dian Chen · Sergey Zagoruyko · Yu-Xiong Wang
Camouflaged Instance Segmentation via Explicit De-camouflaging
Naisong Luo · Yuwen Pan · Rui Sun · Tianzhu Zhang · Zhiwei Xiong · Feng Wu
NeRF in the Palm of Your Hand: Corrective Robot Augmentation via Novel-View Synthesis
Allan Zhou · Moo J Kim · Lirui Wang · Pete Florence · Chelsea Finn
PIRLNav: Pretraining with Imitation and RL Finetuning for ObjectNav
Ram Ramrakhya · Dhruv Batra · Erik Wijmans · Abhishek Das
AdamsFormer for Spatial Action Localization in the Future
Hyung-gun Chi · Kwonjoon Lee · Nakul Agarwal · Yi Xu · Karthik Ramani · Chiho Choi
Unsupervised Sampling Promoting for Stochastic Human Trajectory Prediction
Guangyi Chen · Zhenhao Chen · Shunxing Fan · Kun Zhang
Query-Centric Trajectory Prediction
Zikang Zhou · Jianping Wang · Yung-Hui Li · Yu-Kai Huang
Planning-oriented Autonomous Driving
yihan hu · Jiazhi Yang · Li Chen · Keyu Li · Chonghao Sima · Xizhou Zhu · Siqi Chai · Senyao Du · Tianwei Lin · Wenhai Wang · Lewei Lu · Xiaosong Jia · Qiang Liu · Jifeng Dai · Yu Qiao · Hongyang Li
UniHCP: A Unified Model for Human-Centric Perceptions
Yuanzheng Ci · Yizhou Wang · Meilin Chen · SHIXIANG TANG · LEI BAI · Feng Zhu · Rui Zhao · Fengwei Yu · Donglian Qi · Wanli Ouyang
You Only Segment Once: Towards Real-Time Panoptic Segmentation
Jie Hu · Linyan Huang · Tianhe Ren · shengchuan zhang · Rongrong Ji · Liujuan Cao
On the Convergence of IRLS and Its Variants in Outlier-Robust Estimation
Liangzu Peng · Christian Kümmerle · Rene Vidal
Learning Adaptive Dense Event Stereo from the Image Domain
Hoonhee Cho · Jegyeong Cho · Kuk-Jin YOON
Correspondence Transformers with Asymmetric Feature Learning and Matching Flow Super-Resolution
Yixuan Sun · Dongyang Zhao · Zhangyue Yin · Yiwen Huang · Tao Gui · Wenqiang Zhang · Weifeng Ge
DKM: Dense Kernelized Feature Matching for Geometry Estimation
Johan Edstedt · Ioannis Athanasiadis · Mårten Wadenbäck · Michael Felsberg
3D Registration with Maximal Cliques
Xiyu Zhang · Jiaqi Yang · Shikun Zhang · Yanning Zhang
Self-Supervised Learning for Multimodal Non-Rigid 3D Shape Matching
Dongliang Cao · Florian Bernard
Towards Better Gradient Consistency for Neural Signed Distance Functions via Level Set Alignment
Baorui Ma · Junsheng Zhou · Yushen Liu · Zhizhong Han
Unsupervised Inference of Signed Distance Functions from Single Sparse Point Clouds without Learning Priors
Chao Chen · Yushen Liu · Zhizhong Han
PEAL: Prior-embedded Explicit Attention Learning for low-overlap Point Cloud Registration
Junle Yu · Luwei Ren · Yu Zhang · Wenhui Zhou · Lili Lin · Guojun Dai
PointListNet: Deep Learning on 3D Point Lists
Hehe Fan · Linchao Zhu · Yi Yang · Mohan Kankanhalli
Meta Architecture for Point Cloud Analysis
Haojia Lin · Xiawu Zheng · lijiang Li · Fei Chao · Shanshan Wang · Yan Wang · Yonghong Tian · Rongrong Ji
Learnable Skeleton-Aware 3D Point Cloud Sampling
Cheng Wen · Baosheng Yu · Dacheng Tao
Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning
Zhuoyang Zhang · Yuhao Dong · Yunze Liu · Li Yi
ViewNet: A Novel Projection-Based Backbone with View Pooling for Few-shot Point Cloud Classification
Jiajing Chen · Minmin Yang · Senem Velipasalar
SCPNet: Semantic Scene Completion on Point Cloud
Zhaoyang Xia · Youquan Liu · Xin Li · Xinge ZHU · Yuexin Ma · Yikang LI · Yuenan Hou · Yu Qiao
SCoDA: Domain Adaptive Shape Completion for Real Scans
Yushuang Wu · Zizheng Yan · Ce Chen · Lai Wei · Xiao Li · Guanbin Li · Yihao Li · Shuguang Cui · Xiaoguang Han
GrowSP: Unsupervised Semantic Segmentation of 3D Point Clouds
zihui zhang · Bo Yang · Bing WANG · Bo Li
MethaneMapper: Spectral Absorption aware Hyperspectral Transformer for Methane Detection
Satish Kumar · Ivan Arevalo · A S M Iftekhar · B.S. Manjunath
Weakly Supervised Class-agnostic Motion Prediction for Autonomous Driving
Ruibo Li · Hanyu Shi · Ziang Fu · Zhe Wang · Guosheng Lin
Single Domain Generalization for LiDAR Semantic Segmentation
Hyeonseong Kim · Yoonsu Kang · Changgyoon Oh · Kuk-Jin YOON
PeakConv: Learning Peak Receptive Field for Radar Semantic Segmentation
Liwen Zhang · Xinyan Zhang · Youcheng Zhang · Yufei Guo · Yuanpei Chen · Xuhui Huang · Zhe Ma
PillarNeXt: Rethinking Network Designs for 3D Object Detection in LiDAR Point Clouds
Jinyu Li · Chenxu Luo · Xiaodong Yang
Density-Insensitive Unsupervised Domain Adaption on 3D Object Detection
Qianjiang Hu · Daizong Liu · Wei Hu
Spherical Transformer for LiDAR-based 3D Recognition
Xin Lai · Yukang Chen · Fanbin Lu · Jianhui Liu · Jiaya Jia
Neural Map Prior for Autonomous Driving
Xuan Xiong · Yicheng Liu · Tianyuan Yuan · Yue Wang · Yilun Wang · Hang Zhao
LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion
Xin Li · Tao MA · Yuenan Hou · Botian Shi · Yuchen Yang · Youquan Liu · Xingjiao Wu · Qin Chen · Yikang LI · Yu Qiao · Liang He
Pix2map: Cross-modal Retrieval for Inferring Street Maps From Images
Xindi Wu · Kwun Fung Lau · Francesco Ferroni · Aljosa Osep · Deva Ramanan
Azimuth Super-Resolution for FMCW Radar in Autonomous Driving
Yu-Jhe Li · Shawn Hunt · Jinhyung Park · Matthew O'Toole · Kris Kitani
MonoATT: Online Monocular 3D Object Detection with Adaptive Token Transformer
Yunsong Zhou · Hongzi Zhu · Quan Liu · Shan Chang · Minyi Guo
Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency
Runzhou Tao · Wencheng Han · Zhongying Qiu · Cheng-zhong Xu · Jianbing Shen
Semi-Supervised Stereo-based 3D Object Detection via Cross-View Consensus
Wenhao Wu · Hau-San Wong · Si Wu
BEV-SAN: Accurate BEV 3D Object Detection via Slice Attention Networks
Xiaowei Chi · Jiaming Liu · Ming Lu · Rongyu Zhang · Zhaoqing Wang · Yandong Guo · Shanghang Zhang
Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection
Shaofei Huang · Zhenwei Shen · Zehao Huang · Zi-han Ding · Jiao Dai · Jizhong Han · Naiyan Wang · Si Liu
Learning Transformations To Reduce the Geometric Shift in Object Detection
Vidit Vidit · Martin Engilberge · Mathieu Salzmann
Look, Radiate, and Learn: Self-Supervised Localisation via Radio-Visual Correspondence
Mo Alloulah · Maximilian Arnold
Non-line-of-sight Imaging with Signal Superresolution Network
Jianyu Wang · Xintong Liu · Leping Xiao · Zuoqiang Shi · Lingyun Qiu · Xing Fu
ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields
Seyed Mohammad Mahdi Johari · Camilla Carta · François Fleuret
OmniCity: Omnipotent City Understanding with Multi-level and Multi-view Images
Weijia Li · Yawen Lai · Linning Xu · Yuanbo Xiangli · Yu Jinhua · Conghui He · Gui-Song Xia · Dahua Lin
Semantic Ray: Learning a Generalizable Semantic Field with Cross-Reprojection Attention
Fangfu Liu · Chubin Zhang · Yu Zheng · Yueqi Duan
Multi-View Stereo Representation Revisit: Region-Aware MVSNet
Yisu Zhang · Jianke Zhu · Lixiang Lin
All-in-focus Imaging from Event Focal Stack
Hanyue Lou · Minggui Teng · Yixin Yang · Boxin Shi
Wide-angle Rectification via Content-aware Conformal Mapping
Qi Zhang · Hongdong Li · Qing Wang
Single Image Depth Prediction Made Better: A Multivariate Gaussian Take
Ce Liu · Suryansh Kumar · Shuhang Gu · Radu Timofte · Luc Van Gool
DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients
Rémi Pautrat · Daniel Barath · Viktor Larsson · Martin Oswald · Marc Pollefeys
VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos
Huiyu Gao · Wei Mao · Miaomiao Liu
Perspective Fields for Single Image Camera Calibration
Linyi Jin · Jianming Zhang · Yannick Hold-Geoffroy · Oliver Wang · Kevin Blackburn-Matzen · Matthew Sticha · David Fouhey
RUST: Latent Neural Scene Representations from Unposed Imagery
Mehdi S. M. Sajjadi · Aravindh Mahendran · Thomas Kipf · Etienne Pot · Daniel Duckworth · Mario Lucic · Klaus Greff
Learning Accurate 3D Shape Based on Stereo Polarimetric Imaging
Tianyu Huang · Haoang Li · Kejing He · Congying SUI · Bin Li · Yun-Hui Liu
The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects
Ruohan Gao · Yiming Dou · Hao Li · Tanmay Agarwal · Jeannette Bohg · Yunzhu Li · Li Fei-Fei · Jiajun Wu
Paired-Point Lifting for Enhanced Privacy-Preserving Visual Localization
Chunghwan Lee · Jaihoon Kim · Chanhyuk Yun · Je Hyeong Hong
Learning to Predict Scene-Level Implicit 3D from Posed RGBD Data
Nilesh Kulkarni · Linyi Jin · Justin Johnson · David Fouhey
Long-term Visual Localization with Mobile Sensors
Shen Yan · Yu Liu · Long Wang · Zehong Shen · Zhen Peng · Haomin Liu · Maojun Zhang · Guofeng Zhang · Xiaowei Zhou
Learning the Distribution of Errors in Stereo Matching for Joint Disparity and Uncertainty Estimation
Liyan Chen · Weihan Wang · Philippos Mordohai
Revisiting Rotation Averaging: Uncertainties and Robust Losses
Ganlin Zhang · Viktor Larsson · Daniel Barath
Level-S²fM: Structure from Motion on Neural Level Set of Implicit Surfaces
Yuxi Xiao · Nan Xue · Tianfu Wu · Gui-Song Xia
Linking Garment with Person via Semantically Associated Landmarks for Virtual Try-On
Keyu Yan · Tingwei Gao · Hui Zhang · Chengjun Xie
Cross-domain 3D Hand Pose Estimation with Dual Modalities
Qiuxia Lin · Linlin Yang · Angela Yao
ScarceNet: Animal Pose Estimation with Scarce Annotations
Chen Li · Gim Lee
HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation
Linfang Zheng · Chen Wang · Yinghan Sun · Esha Dasgupta · Hua Chen · Ales Leonardis · Wei Zhang · Hyung Jin Chang
ViPLO: Vision Transformer based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection
Jeeseung Park · Jin-Woo Park · Jong-Seok Lee
Ego-Body Pose Estimation via Ego-Head Pose Estimation
Jiaman Li · Karen Liu · Jiajun Wu
Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video
Runyang Feng · Yixing Gao · Xueqing Ma · Tze Ho Elden Tse · Hyung Jin Chang
Trajectory-Aware Body Interaction Transformer for Multi-Person Pose Forecasting
Xiaogang Peng · Siyuan Mao · Zizhao Wu
What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging
Zitian Tang · Wenjie Ye · Wei-Chiu Ma · Hang Zhao
Detecting Human-Object Contact in Images
Yixin Chen · Sai Kumar Dwivedi · Michael Black · Dimitrios Tzionas
In-Hand 3D Object Scanning from an RGB Sequence
Shreyas Hampali · Tomas Hodan · LUAN TRAN · Lingni Ma · Cem Keskin · Vincent Lepetit
Autonomous Manipulation Learning for Similar Deformable Objects via Only One Demonstration
Yu Ren · Ronghan Chen · Yang Cong
What You Can Reconstruct from a Shadow
Ruoshi Liu · Sachit Menon · Chengzhi Mao · Dennis Park · Simon Stent · Carl Vondrick
H2ONet: Hand-Occlusion-and-Orientation-aware Network for Real-time 3D Hand Mesh Reconstruction
Hao Xu · Tianyu Wang · Xiao Tang · Chi-Wing Fu
Learning Human Mesh Recovery in 3D Scenes
Zehong Shen · Zhi Cen · Sida Peng · Qing Shuai · Hujun Bao · Xiaowei Zhou
Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild
Gyeongsik Moon
Hi4D: 4D Instance Segmentation of Close Human Interaction
Yifei Yin · Chen Guo · Manuel Kaufmann · Juan Zarate · Jie Song · Otmar Hilliges
Deformable Mesh Transformer for 3D Human Mesh Recovery
Yusuke Yoshiyasu
Reconstructing Animatable 3D Categories from Videos
Gengshan Yang · Chaoyang Wang · Dinesh Reddy Narapureddy · Deva Ramanan
Learning Semantic-Aware Disentangled Representation for 3D Human Body Editing
Xiaokun Sun · Qiao Feng · Xiongzheng Li · Jinsong Zhang · Yu-Kun Lai · Jingyu Yang · Kun Li
Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling
Zhanhao Hu · Wenda Chu · Xiaopei Zhu · Hui Zhang · Bo Zhang · Xiaolin Hu
Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View
Shuo Wang · Xinhai Zhao · Haiming Xu · Zehui Chen · Dameng Yu · Jiahao Chang · Zhen Yang · Feng Zhao
Listening Human Behavior: 3D Human Pose Estimation with Acoustic Signals
Yuto Shibata · Yutaka Kawashima · Mariko Isogawa · Go Irie · Akisato Kimura · Yoshimitsu Aoki
NLOST: Non-Line-of-Sight Imaging with Transformer
Yue Li · Jiayong Peng · Juntian Ye · Yueyi Zhang · Feihu Xu · Zhiwei Xiong
Few-shot Non-line-of-sight Imaging with Signal-surface Collaborative Regularization
Xintong Liu · Jianyu Wang · Leping Xiao · Xing Fu · Lingyun Qiu · Zuoqiang Shi
Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM
Hengyi Wang · Jingwen Wang · Lourdes Agapito
OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer
Fanghua Yu · Xintao Wang · Mingdeng Cao · Gen Li · Ying Shan · Chao Dong
HRDFuse: Monocular 360° Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions
Hao Ai · Zidong Cao · Yan-Pei Cao · Ying Shan · Lin Wang
K3DN: Disparity-aware Kernel Estimation for Dual-Pixel Defocus Deblurring
Yan Yang · Liyuan Pan · Liu Liu · Miaomiao Liu
Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography
Ilya Chugunov · Yuxuan Zhang · Felix Heide
DynamicStereo: Consistent Dynamic Depth from Stereo Videos
Nikita Karaev · Ignacio Rocco · Benjamin Graham · Natalia Neverova · Andrea Vedaldi · Christian Rupprecht
End-to-End Vectorized HD-map Construction with Piecewise Bezier Curve
Limeng Qiao · Wenjie Ding · Xi Qiu · Chi Zhang
Enhanced Stable View Synthesis
Nishant Jain · Suryansh Kumar · Luc Van Gool
Scalable, Detailed and Mask-Free Universal Photometric Stereo
Satoshi Ikehata
PRISE: Demystifying Deep Lucas-Kanade with Strongly Star-Convex Constraints for Multimodel Image Alignment
Yiqing Zhang · Xinming Huang · Ziming Zhang
Visual Localization using Imperfect 3D Models from the Internet
Vojtech Panek · Zuzana Kukelova · Torsten Sattler
HelixSurf: A Robust and Efficient Neural Implicit Surface Learning of Indoor Scenes with Iterative Intertwined Regularization
Zhihao Liang · Zhangjin Huang · Changxing Ding · Kui Jia
Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild
Garrick Brazil · Abhinav Kumar · Julian Straub · Nikhila Ravi · Justin Johnson · Georgia Gkioxari
Objaverse: A Universe of Annotated 3D Objects
Matt Deitke · Dustin Schwenk · Jordi Salvador Marcos · Luca Weihs · Oscar Michel · Eli VanderBilt · Ludwig Schmidt · Kiana Ehsani · Aniruddha Kembhavi · Ali Farhadi
Privacy-Preserving Representations are not Enough: Recovering Scene Content from Camera Poses
Kunal Chelani · Torsten Sattler · Fredrik Kahl · Zuzana Kukelova
Learning a Depth Covariance Function
Eric Dexheimer · Andrew Davison
Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning
Ajinkya Tejankar · Maziar Sanjabi · Qifan Wang · Sinong Wang · Hamed Firooz · Hamed Pirsiavash · Liang Tan
Backdoor Defense via Deconfounded Representation Learning
Zaixi Zhang · Qi Liu · Zhicai Wang · Zepu Lu · Qingyong Hu
Backdoor Cleansing with Unlabeled Data
Lu Pang · Tao Sun · Haibin Ling · Chao Chen
Breaching FedMD: Image Recovery via Paired-Logits Inversion Attack
Hideaki Takahashi · Jingjing Liu · Yang Liu
Elastic Aggregation for Federated Optimization
Chen Dengsheng · Jie Hu · Vince Tan · Xiaoming Wei · Enhua Wu
DynaFed: Tackling Client Data Heterogeneity with Global Dynamics
Renjie PI · WEIZHONG ZHANG · Yueqi Xie · Jiahui Gao · Xiaoyu Wang · Sunghun Kim · Qifeng Chen
How to Prevent the Poor Performance Clients for Personalized Federated Learning?
Zhe Qu · Xingyu Li · Xiao Han · Rui Duan · Chengchao Shen · Lixing Chen
Cloud-Device Collaborative Adaptation to Continual Changing Environments in the Real-world
Yulu Gan · Mingjie Pan · Rongyu Zhang · Zijian Ling · Lingran Zhao · Jiaming Liu · Shanghang Zhang
Diversity-Measurable Anomaly Detection
Wenrui Liu · Hong Chang · Bingpeng Ma · Shiguang Shan · Xilin CHEN
Look Around for Anomalies: Weakly-supervised Anomaly Detection via Context-Motion Relational Learning
MyeongAh Cho · Minjung Kim · Sangwon Hwang · Chaewon Park · Kyungjae Lee · Sangyoun Lee
Semi-supervised Hand Appearance Recovery via Structure Disentanglement and Dual Adversarial Discrimination
Zimeng Zhao · Binghui Zuo · Zhiyu Long · Yangang Wang
Adversarial Normalization: I Can visualize Everything (ICE)
Hoyoung Choi · Seungwan Jin · Kyungsik Han
Learning on Gradients: Generalized Artifacts Representation for GAN-Generated Images Detection
Chuangchuang Tan · Yao Zhao · Shikui Wei · Guanghua Gu · Yunchao Wei
GLeaD: Improving GANs with A Generator-Leading Task
Qingyan Bai · Ceyuan Yang · Yinghao Xu · Xihui Liu · Yujiu Yang · Yujun Shen
Data-Free Sketch-Based Image Retrieval
Abhra Chaudhuri · Ayan Kumar Bhunia · Yi-Zhe Song · Anjan Dutta
OpenMix: Exploring Outlier Samples for Misclassification Detection
Fei Zhu · Zhen Cheng · Xu-yao Zhang · Cheng-lin Liu
Genie: Show Me the Data for Quantization
Yongkweon Jeon · Chungman Lee · Ho-young Kim
How to Prevent the Continuous Damage of Noises to Model Training?
Xiaotian Yu · Yang Jiang · Tianqi Shi · Zunlei Feng · Yuexuan Wang · Mingli Song · Li Sun
Gradient-based Uncertainty Attribution for Explainable Bayesian Deep Learning
Hanjing Wang · Dhiraj Joshi · Shiqiang Wang · Qiang Ji
FIANCEE: Faster Inference of Adversarial Networks via Conditional Early Exits
Polina Karpikova · Ekaterina Radionova · Anastasia Yaschenko · Andrei Spiridonov · Leonid Kostyushko · Riccardo Fabbricatore · Aleksei Ivakhnenko
Run, Don’t Walk: Chasing Higher FLOPS for Faster Neural Networks
Jierun Chen · Shiu-hong Kao · Hao He · Weipeng Zhuo · Song Wen · Chul-Ho Lee · S.-H. Chan
FFCV: Accelerating Training by Removing Data Bottlenecks
Guillaume Leclerc · Andrew Ilyas · Logan Engstrom · Sung Min Park · Hadi Salman · Aleksander Madry
Disentangled Representation Learning for Unsupervised Neural Quantization
Haechan Noh · Sangeek Hyun · Woojin Jeong · Hanshin Lim · Jae-Pil Heo
HOTNAS: Hierarchical Optimal Transport for Neural Architecture Search
Jiechao Yang · Yong Liu · Hongteng Xu
Solving relaxations of MAP-MRF problems: Combinatorial in-face Frank-Wolfe directions
Vladimir Kolmogorov
Transformer-Based Learned Optimization
Erik Gärtner · Luke Metz · Misha Andriluka · C. Freeman · Cristian Sminchisescu
Multi-Agent Automated Machine Learning
Zhaozhi Wang · Kefan Su · Jian Zhang · Huizhu Jia · Qixiang Ye · Xiaodong Xie · Zongqing Lu
Accelerating Dataset Distillation via Model Augmentation
Lei Zhang · Jie Zhang · Bowen Lei · Subhabrata Mukherjee · Xiang Pan · Bo Zhao · Caiwen Ding · Yao Li · Dongkuan Xu
PA&DA: Jointly Sampling Path and Data for Consistent NAS
Shun Lu · Yu Hu · Longxing Yang · Zihao Sun · Jilin Mei · Jianchao Tan · Chengru Song
Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning
Sanghwan Kim · Lorenzo Noci · Antonio Orvieto · Thomas Hofmann
EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization
Junha Song · Jungsoo Lee · In So Kweon · Sungha Choi
CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning
James Smith · Leonid Karlinsky · Vyshnavi Gutta · Paola Cascante-Bonilla · Donghyun Kim · Assaf Arbelle · Rameswar Panda · Rogerio Feris · Zsolt Kira
DisWOT: Student Architecture Search for Distillation WithOut Training
Peijie Dong · Lujun Li · Zimian Wei
Real-Time Evaluation in Online Continual Learning: A New Hope
Yasir Ghunaim · Adel Bibi · Kumail Alhamoud · Motasem Alfarra · Hasan Hammoud Hammoud · Ameya Prabhu · Philip Torr · Bernard Ghanem
Dealing with Cross-Task Class Discrimination in Online Continual Learning
Yiduo Guo · Bing Liu · Dongyan Zhao
Class Attention Transfer Based Knowledge Distillation
Ziyao Guo · Haonan Yan · HUI LI · Xiaodong Lin
Dense Network Expansion for Class Incremental Learning
Zhiyuan Hu · Yunsheng Li · Jiancheng Lyu · Dashan Gao · Nuno Vasconcelos
Multi-Mode Online Knowledge Distillation for Self-Supervised Visual Representation Learning
Kaiyou Song · Jin Xie · Shan Zhang · Zimeng Luo
Few-Shot Class-Incremental Learning via Class-Aware Bilateral Distillation
Linglan Zhao · Jing Lu · Yunlu Xu · Zhanzhan Cheng · Dashan Guo · Yi Niu · Xiangzhong Fang
Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners
Zitian Chen · Yikang Shen · Mingyu Ding · Zhenfang Chen · Hengshuang Zhao · Erik Learned-Miller · Chuang Gan
Train-Once-for-All Personalization
Hong-You Chen · YANDONG LI · Yin Cui · Mingda Zhang · Wei-Lun Chao · Li Zhang
Generalizable Implicit Neural Representations with Instance Pattern Composers
Chiheon Kim · Doyup Lee · Saehoon Kim · Minsu Cho · Wook-Shin Han
Deep Frequency Filtering for Domain Generalization
Shiqi Lin · Zhizheng Zhang · Zhipeng Huang · Yan Lu · Cuiling Lan · Peng Chu · Quanzeng You · Jiang Wang · Zicheng Liu · Viraj Navkal · Amey Parulkar · Zhibo Chen
Back to the Source: Diffusion-Driven Adaptation to Test-Time Corruption
Jin Gao · Jialing Zhang · Xihui Liu · Trevor Darrell · Evan Shelhamer · Dequan Wang
Decompose, Adjust, Compose: Effective Normalization by Playing with Frequency for Domain Generalization
Sangrok Lee · Jongseong Bae · Ha Kim Kim
Enhanced Multimodal Representation Learning with Cross-modal KD
Mengxi Chen · Linyu XING · Yu Wang · Ya Zhang
Equiangular Basis Vectors
Yang Shen · Xu-Hao Sun · Xiu-Shen Wei
DARE-GRAM: Unsupervised Domain Adaptation Regression by Aligning Inverse Gram Matrices
Ismail Nejjar · Qin Wang · Olga Fink
Towards Better Stability and Adaptability: Improve Online Self-Training for Model Adaptation in Semantic Segmentation
Dong Zhao · Shuang Wang · Qi Zang · Dou Quan · XIUTIAO YE · Licheng Jiao
MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation
Lukas Hoyer · Dengxin Dai · Haoran Wang · Luc Van Gool
Neural Dependencies Emerging from Learning Massive Categories
Ruili Feng · Kecheng Zheng · Kai Zhu · Yujun Shen · Jian Zhao · Yukun Huang · Deli Zhao · Jingren Zhou · Michael Jordan · Zheng-Jun Zha
Co-training 2^L submodels for image recognition
Hugo Touvron · Matthieu CORD · Maxime Oquab · Piotr Bojanowski · Jakob Verbeek · Herve Jegou
On-the-fly Category Discovery
Ruoyi Du · Dongliang Chang · Kongming Liang · Timothy Hospedales · Yi-Zhe Song · Zhanyu Ma
Generative Bias for Robust Visual Question Answering
Jae Won Cho · Dong-Jin Kim · Hyeonggon Ryu · In So Kweon
RMLVQA: A Margin Loss Approach For Visual Question Answering with Language Biases
Abhipsa Basu · Sravanti Addepalli · Venkatesh Babu Radhakrishnan
Twin Contrastive Learning with Noisy Labels
Zhizhong Huang · Junping Zhang · Hongming Shan
Fine-Grained Classification with Noisy Labels
Qi Wei · Lei Feng · Haoliang Sun · Ren Wang · Chenhui Guo · Yilong Yin
ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-supervised Learning
Islam Nassar · Munawar Hayat · Ehsan Abbasnejad · Hamid Rezatofighi · Gholamreza Haffari
Zero-shot Model Diagnosis
Jinqi Luo · Zhaoning Wang · Chen Henry Wu · Dong Huang · Fernando de la Torre
Mind the Label Shift of Augmentation-based Graph OOD Generalization
Junchi Yu · Jian Liang · Ran He
RONO: Robust Discriminative Learning with Noisy Labels for 2D-3D Cross-Modal Retrieval
Yanglin Feng · Hongyuan Zhu · Dezhong Peng · Xi Peng · Peng Hu
Deep Incomplete Multi-view Clustering with Cross-view Partial Sample and Prototype Alignment
Jiaqi Jin · Siwei Wang · Zhibin Dong · Xinwang Liu · En Zhu
MetaViewer: Towards A Unified Multi-View Representation
Ren Wang · Haoliang Sun · Yuling Ma · Xiaoming Xi · Yilong Yin
Rethinking Out-of-Distribution Detection: Masked Image Modeling is All You Need
Jingyao Li · Pengguang Chen · Zexin He · Shaozuo Yu · Shu Liu · Jiaya Jia
Towards Trustable Skin Cancer Diagnosis via Rewriting Model’s Decision
Siyuan Yan · Zhen Yu · Xuelin Zhang · Dwarikanath Mahapatra · Shekhar Chandra · Monika Janda · H. Peter Soyer · Zongyuan Ge
METransformer: Radiology Report Generation by Transformer with Multiple Learnable Expert Tokens
Zhanyu Wang · Lingqiao Liu · Lei Wang · Luping Zhou
Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images
Ramin Nakhli · Puria Azadi Moghadam · Haoyang Mi · Hossein Farahani · Alexander Baras · Blake Gilks · Ali Bashashati
Ambiguous Medical Image Segmentation using Diffusion Models
AIMON RAHMAN · Jeya Maria Jose Valanarasu · Ilker Hacihaliloglu · Vishal Patel
Directional Connectivity-based Segmentation of Medical Images
Ziyun Yang · Sina Farsiu
Bidirectional Copy-Paste for Semi-Supervised Medical Image Segmentation
Yunhao Bai · Duowen Chen · Qingli Li · Wei Shen · Yan Wang
AutoLabel: CLIP-based framework for Open-set Video Domain Adaptation
Giacomo Zara · Subhankar Roy · Paolo Rota · Elisa Ricci
Zero-shot Generative Model Adaptation via Image-specific Prompt Learning
Jiayi Guo · Chaofei Wang · You Wu · Eric Zhang · Kai Wang · Xingqian Xu · Shiji Song · Humphrey Shi · Gao Huang
2PCNet: Two-Phase Consistency Training for Day-to-Night Unsupervised Domain Adaptive Object Detection
Mikhail Kennerley · Jian-Gang Wang · Bharadwaj Veeravalli · Robby Tan
Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection
Muhammad Akhtar Munir · Muhammad Khan Khan · Salman Khan · Fahad Khan
Learning Transformation-Predictive Representations for Detection and Description of Local Features
Zihao Wang · Chunxu Wu · Yifei Yang · Zhen Li
Annealing-based Label-Transfer Learning for Open World Object Detection
Yuqing Ma · Hainan Li · Zhange Zhang · Jinyang Guo · Shanghang Zhang · Ruihao Gong · Xianglong Liu
PROB: Probabilistic Objectness for Open World Object Detection
Orr Zohar · Kuan-Chieh Wang · Serena Yeung
Detecting Everything in the Open World: Towards Universal Object Detection
Zhenyu Wang · Ya-Li Li · Xi Chen · Ser-Nam Lim · Antonio Torralba · Hengshuang Zhao · Shengjin Wang
DETR with Additional Global Aggregation for Cross-domain Weakly Supervised Object Detection
Zongheng Tang · Yifan Sun · Si Liu · Yi Yang
Self-supervised AutoFlow
Hsin-Ping Huang · Charles Herrmann · Junhwa Hur · Erika Lu · Kyle Sargent · Austin Stone · Ming-Hsuan Yang · Deqing Sun
Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding
Lingchen Meng · Xiyang Dai · Yinpeng Chen · Pengchuan Zhang · Dongdong Chen · Mengchen Liu · Jianfeng Wang · Zuxuan Wu · Lu Yuan · Yu-Gang Jiang
Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems
Yangyang Shu · Anton Hengel · Lingqiao Liu
Full or weak annotations? An adaptive strategy for budget-constrained annotation campaigns
Javier Gamazo Tejero · Martin Zinkernagel · Sebastian Wolf · Raphael Sznitman · Pablo Márquez Neila
Class-Incremental Exemplar Compression for Class-Incremental Learning
Zilin Luo · Yaoyao Liu · Bernt Schiele · Qianru Sun
The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation
Beomyoung Kim · Joonhyun Jeong · Dongyoon Han · Sung Ju Hwang
Augmentation Matters: A Simple-yet-Effective Approach to Semi-supervised Semantic Segmentation
Zhen Zhao · Lihe Yang · Sifan Long · Jimin Pi · Luping Zhou · Jingdong Wang
Weakly Supervised Semantic Segmentation via Adversarial Learning of Classifier and Reconstructor
Hyeokjun Kweon · Sung-Hoon Yoon · Kuk-Jin YOON
Learning Orthogonal Prototypes for Generalized Few-shot Semantic Segmentation
Sun-Ao Liu · Yiheng Zhang · Zhaofan Qiu · Hongtao Xie · Yongdong Zhang · Ting Yao
Beyond mAP: Towards better evaluation of instance segmentation
Rohit Kumar Jena · Lukas Zhornyak · Nehal Doiphode · Pratik Chaudhari · Vivek Buch · James Gee · Jianbo Shi
Dynamic Focus-aware Positional Queries for Semantic Segmentation
Haoyu He · Jianfei Cai · Zizheng Pan · Jing Liu · Jing Zhang · Dacheng Tao · Bohan Zhuang
Focus On Details: Online Multi-object Tracking with Diverse Fine-grained Representation
Hao Ren · Shoudong Han · Huilin Ding · Ziwen Zhang · Hongwei Wang · Faquan Wang
DynaMask: Dynamic Mask Selection for Instance Segmentation
Ruihuang Li · Chenhang HE · Shuai Li · Yabin Zhang · Lei Zhang
A Strong Baseline for Generalized Few-Shot Semantic Segmentation
Seyed Mohammadsina Hajimiri · Malik Boudiaf · Ismail Ayed · Jose Dolz
Compositor: Bottom-up Clustering and Compositing for Robust Part and Object Segmentation
Ju He · Jieneng Chen · Ming-Xian Lin · Qihang Yu · Alan Yuille
Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis
Yuxiang Wei · Zhilong Ji · Xiaohe Wu · Jinfeng Bai · Lei Zhang · Wangmeng Zuo
Primitive Generation and Semantic-related Alignment for Universal Zero-Shot Segmentation
SHUTING HE · Henghui Ding · Wei Jiang
UniDAformer: Unified Domain Adaptive Panoptic Segmentation Transformer via Hierarchical Mask Calibration
Jingyi Zhang · Jiaxing Huang · Xiaoqin Zhang · Shijian Lu
StructVPR: Distill Structural Knowledge with Weighting Samples for Visual Place Recognition
Yanqing Shen · Sanping Zhou · Jingwen Fu · Ruotong Wang · Shitao Chen · Nanning Zheng
CLIP-S⁴: Language-Guided Self-Supervised Semantic Segmentation
Wenbin He · Suphanut Jamonnak · Liang Gou · Liu Ren
Learning Conditional Attributes for Compositional Zero-Shot Learning
Qingsheng Wang · Lingqiao Liu · Chenchen Jing · Hao Chen · Guoqiang Liang · Peng Wang · Chunhua Shen
Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection
Luting Wang · Yi Liu · Penghui Du · Zihan Ding · Yue Liao · Qiaosong Qi · Biaolong Chen · Si Liu
ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation
Ziqin Zhou · Yinjie Lei · Bowen Zhang · Lingqiao Liu · Yifan Liu
Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs
Junbum Cha · Jonghwan Mun · Byungseok Roh
Mobile User Interface Element Detection Via Adaptively Prompt Tuning
Weiqiang Wang · Zhuoer Xu · Haoxing Chen · Jun Lan · Changhua Meng · Weiqiang Wang
Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers
Dahun Kim · Anelia Angelova · Weicheng Kuo
Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling
Yongshuai Huang · Ning Lu · Dapeng Chen · Yibo Li · Zecheng Xie · Shenggao Zhu · Liangcai Gao · Wei Peng
End-to-End 3D Dense Captioning with Vote2Cap-DETR
Sijin Chen · Hongyuan Zhu · Xin Chen · Yinjie Lei · Gang Yu · Tao Chen
Visual DNA: Representing and Comparing Images using Distributions of Neuron Activations
Benjamin Ramtoula · Matthew Gadd · Paul Newman · Daniele De Martini
Hint-Aug: Drawing Hints from Foundation Vision Transformers towards Boosted Few-shot Parameter-Efficient Tuning
Zhongzhi Yu · Shang Wu · Shunyao Zhang · Yonggan Fu · Yingyan Lin
Improving Zero-shot Generalization and Robustness of Multi-modal Models
Yunhao Ge · Jie Ren · Andrew Gallagher · Yuxiao Wang · Ming-Hsuan Yang · Hartwig Adam · Laurent Itti · Balaji Lakshminarayanan · Jiaping Zhao
Asymmetric Feature Fusion for Image Retrieval
Hui Wu · Min Wang · Wengang Zhou · Zhenbo Lu · Houqiang Li
Cross-Image-Attention for Conditional Embeddings in Deep Metric Learning
Dmytro Kotovenko · Pingchuan Ma · Timo Milbich · Björn Ommer
Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-commerce
Yang Jin · Yongzhi Li · Zehuan Yuan · Yadong MU
Learning Attribute and Class Specific Representation Duet for Fine-grained Fashion Analysis
Yang Jiao · Yan Gao · Jingjing Meng · Jin Shang · Yi Sun
HAAV: Hierarchical Aggregation of Augmented Views for Image Captioning
Chia-Wen Kuo · Zsolt Kira
Non-Contrastive Learning Meets Language-Image Pre-Training
Jinghao Zhou · Li Dong · Zhe Gan · Lijuan Wang · Furu Wei
ViLEM: Visual-Language Error Modeling for Image-Text Retrieval
Yuxin Chen · Zongyang Ma · Ziqi Zhang · Zhongang Qi · Chunfeng Yuan · Ying Shan · Bing Li · Weiming Hu · Xiaohu Qie · Jianping WU
CLIPPO: Image-and-Language Understanding from Pixels Only
Michael Tschannen · Basil Mustafa · Neil Houlsby
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong · Jianmin Bao · Yinglin Zheng · Ting Zhang · Dongdong Chen · Hao Yang · Ming Zeng · Weiming Zhang · Lu Yuan · Dong Chen · Fang Wen · Nenghai Yu
Context-aware Alignment and Mutual Masking for 3D-Language Pre-training
Zhao Jin · Munawar Hayat · Yuwei Yang · Yulan Guo · Yinjie Lei
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
Pinaki Nath Chowdhury · Ayan Kumar Bhunia · Aneeshan Sain · Subhadeep Koley · Tao Xiang · Yi-Zhe Song
Learning Bottleneck Concepts in Image Classification
Bowen Wang · Liangzhi Li · Yuta Nakashima · Hajime Nagahara
GIVL: Improving Geographical Inclusivity of Vision-and-Language Models with Pre-Training Methods
Da Yin · Feng Gao · Govind Thattai · Michael Johnston · Kai-Wei Chang
Grounding Counterfactual Explanation of Image Classifiers to Textual Concept Space
Siwon Kim · Jinoh Oh · SUNGJIN LEE · Seunghak Yu · Jaeyoung Do · Tara Taghavi
Overlooked factors in concept-based explanations: Dataset choice, concept learnability, and human capability
Vikram V. Ramaswamy · Sunnie S. Y. Kim · Ruth Fong · Olga Russakovsky
LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding
Gen Li · Varun Jampani · Deqing Sun · Laura Sevilla-Lara
Task Residual for Tuning Vision-Language Models
Tao Yu · Zhihe Lu · Xin Jin · Zhibo Chen · Xinchao Wang
Hierarchical Prompt Learning for Multi-Task Learning
Yajing Liu · Yuning Lu · Hao Liu · Yaozu An · Zhuoran Xu · Yao Zhuokun · Zhang Baofeng · Zhiwei Xiong · Chenguang Gui
Diversity-Aware Meta Visual Prompting
Qidong Huang · Xiaoyi Dong · Dongdong Chen · Weiming Zhang · Feifei Wang · Gang Hua · Nenghai Yu
From Images to Textual Prompts: Zero-shot Visual Question Answering with Frozen Large Language Models
Jiaxian Guo · Junnan Li · Dongxu Li · Anthony Tiong · Boyang Li · Dacheng Tao · Steven Hoi
Language Adaptive Weight Generation for Multi-task Visual Grounding
Wei Su · Peihan Miao · Huanzhang Dou · Gaoang Wang · Liang Qiao · Zheyang Li · Xi Li
Fusing Pre-trained Language Models with Multimodal Prompts through Reinforcement Learning
Youngjae Yu · Jiwan Chung · Heeseung Yun · Jack Hessel · Jae Sung Park · Ximing Lu · Rowan Zellers · Prithviraj Ammanabrolu · Ronan Le Bras · Gunhee Kim · Yejin Choi
Are Deep Neural Networks SMARTer than Second Graders?
Anoop Cherian · Kuan-Chuan Peng · Suhas Lohit · Kevin Smith · Joshua Tenenbaum
A-CAP: Anticipation Captioning with Commonsense Knowledge
MINH DUC VO · An Luong · Akihiro Sugimoto · Hideki Nakayama
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning
Aishwarya Kamath · Peter Anderson · Su Wang · Jing Yu Koh · Alexander Ku · Austin Waters · Yinfei Yang · Jason Baldridge · Zarana Parekh
Improving Vision-and-Language Navigation by Generating Future-View Image Semantics
Jialu Li · Mohit Bansal
Layout-based Causal Inference for Object Navigation
Sixian Zhang · Xinhang Song · Weijie Li · Yubing Bai · Xinyao Yu · Shuqiang Jiang
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
Shengkun Tang · Yaqing Wang · Zhenglun Kong · Tianchi Zhang · Yao Li · Caiwen Ding · Yanzhi Wang · Yi Liang · Dongkuan Xu
Distilling Cross-Temporal Contexts for Continuous Sign Language Recognition
Leming Guo · Wanli Xue · Qing Guo · Bo Liu · Kaihua Zhang · Tiantian Yuan · Shengyong Chen
Multivariate, Multi-frequency and Multimodal: Rethinking Graph Neural Networks for Emotion Recognition in Conversation
Feiyu Chen · Jie Shao · Shuyuan Zhu · Heng Tao Shen
Modular Memorability: Tiered Representations for Video Memorability Prediction
Théo Dumont · Juan Hevia · Camilo Fosco
VindLU: A Recipe for Effective Video-and-Language Pretraining
Feng Cheng · Xizi Wang · Jie Lei · David Crandall · Mohit Bansal · Gediminas Bertasius
Procedure-Aware Pretraining for Instructional Video Understanding
Honglu Zhou · Roberto Martín-Martín · Mubbasir Kapadia · Silvio Savarese · Juan Carlos Niebles
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
Antoine Yang · Arsha Nagrani · Paul Hongsuck Seo · Antoine Miech · Jordi Pont-Tuset · Ivan Laptev · Josef Sivic · Cordelia Schmid
Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Wenhao Wu · Haipeng Luo · Bo Fang · Jingdong Wang · Wanli Ouyang
Leveraging Temporal Context in Low Representational Power Regimes
Camilo Fosco · SouYoung Jin · Emilie Josephs · Aude Oliva
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation
Tsu-Jui Fu · Licheng Yu · Ning Zhang · Cheng-Yang Fu · Jong-Chyi Su · William Yang Wang · Sean Bell
NewsNet: A Novel Dataset for Hierarchical Temporal Segmentation
Haoqian Wu · Keyu Chen · Haozhe Liu · Mingchen Zhuge · Bing Li · Ruizhi Qiao · Xiujun Shu · Bei Gan · Liangsheng Xu · Bo Ren · Mengmeng Xu · Wentian Zhang · Raghavendra Ramachandra · Chia-Wen Lin · Bernard Ghanem
Perception and Semantic Aware Regularization for Sequential Confidence Calibration
Zhenghua Peng · Yu Luo · Tianshui Chen · Keke Xu · Shuangping Huang
Boosting Weakly-Supervised Temporal Action Localization with Text Information
Guozhang Li · De Cheng · Xinpeng Ding · Nannan Wang · Xiaoyu Wang · Xinbo Gao
Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization
Chen Zhao · Shuming Liu · Karttikeya Mangalam · Bernard Ghanem
Search-Map-Search: A Frame Selection Paradigm for Action Recognition
Mingjun Zhao · Yakun Yu · Xiaoli Wang · Lei Yang · Di Niu
Therbligs In Action: Video Understanding through Motion Primitives
Eadom Dessalene · Michael Maynord · Cornelia Fermuller · Yiannis Aloimonos
Learning Discriminative Representations for Skeleton Based Action Recognition
Huanyu Zhou · Qingjie Liu · Yunhong Wang
MOSO: Decomposing MOtion, Scene and Object for Video Prediction
Mingzhen Sun · Weining Wang · Xinxin Zhu · Jing Liu
EVAL: Explainable Video Anomaly Localization
Ashish Singh · Michael Jones · Erik Learned-Miller
Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation
Liulei Li · Wenguan Wang · Tianfei Zhou · Jianwu Li · Yi Yang
Representation Learning for Visual Object Tracking by Masked Appearance Transfer
Haojie Zhao · Dong Wang · Huchuan Lu
Generalized Relation Modeling for Transformer Tracking
Shenyuan Gao · Chunluan Zhou · Jun Zhang
Panoptic Video Scene Graph Generation
Jingkang Yang · Wenxuan Peng · Xiangtai Li · ZUJIN GUO · Liangyu Chen · Bo Li · Zheng Ma · Wayne Zhang · Kaiyang Zhou · CHEN CHANGE LOY · Ziwei Liu
Devil's on the Edges: Selective Quad Attention for Scene Graph Generation
Deunsol Jung · Sanghyun Kim · Won Hwa Kim · Minsu Cho
Focused and Collaborative Feedback Integration for Interactive Image Segmentation
Qiaoqiao Wei · Hui Zhang · Jun-Hai Yong
Knowledge Distillation for 6D Pose Estimation by Aligning Distributions of Local Predictions
Shuxuan Guo · Yinlin Hu · Jose Alvarez · Mathieu Salzmann
PartMix: Regularization Strategy to Learn Part Discovery for Visible-Infrared Person Re-identification
Minsu Kim · Seungryong Kim · Jungin Park · Seongheon Park · Kwanghoon Sohn
Integrally Pre-Trained Transformer Pyramid Networks
Yunjie Tian · Lingxi Xie · Zhaozhi Wang · Longhui Wei · XIAOPENG ZHANG · Jianbin Jiao · Yaowei Wang · Qi Tian · Qixiang Ye
Explaining Image Classifiers with Multiscale Directional Image Representation
Stefan Kolek · Robert Windesheim · Hector Andrade Loarca · Gitta Kutyniok · Ron Levie
Neuron Structure Modeling for Generalized Remote Physiological Measurement
Hao LU · Zitong Yu · Xuesong Niu · Ying-Cong Chen
Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves
Sora Takashima · Ryo Hayamizu · Nakamasa Inoue · Hirokatsu Kataoka · Rio Yokota
Model-Agnostic Gender Debiased Image Captioning
Yusuke Hirota · Yuta Nakashima · Noa Garcia
ImageBind: One Embedding Space To Bind Them All
Rohit Girdhar · Alaaeldin El-Nouby · Zhuang Liu · Mannat Singh · Kalyan Vasudev Alwala · Armand Joulin · Ishan Misra
I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification
Muhammad Naeem Naeem · Gul Zain Khan · Yongqin Xian · Muhammad Zeshan Afzal · Didier Stricker · Luc Van Gool · Federico Tombari
Learning Semantic Relationship among Instances for Image-Text Matching
Zheren Fu · Zhendong Mao · Yan Song · Yongdong Zhang
Learning Customized Visual Models with Retrieval-Augmented Knowledge
Haotian Liu · Kilho Son · Jianwei Yang · Ce Liu · Jianfeng Gao · Yong Jae Lee · Chunyuan Li
M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis
Hiuyi Cheng · Peirong Zhang · Sihang Wu · Jiaxin Zhang · Qiyuan · Zecheng Xie · Jing Li · Kai Ding · Lianwen Jin
Towards Modality-Agnostic Person Re-identification with Descriptive Query
Cuiqun Chen · Mang Ye · Ding Jiang
Generalized Decoding for Pixel, Image, and Language
Xueyan Zou · Zi-Yi Dou · Jianwei Yang · Zhe Gan · Linjie Li · Chunyuan Li · Xiyang Dai · Harkirat Behl · Jianfeng Wang · Lu Yuan · Nanyun Peng · Lijuan Wang · Yong Jae Lee · Jianfeng Gao
Correlational Image Modeling for Self-Supervised Visual Pre-Training
Wei Li · Jiahao Xie · CHEN CHANGE LOY
Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token embeddings to Finite Discrete Tokens
Yuxiao Chen · Jianbo Yuan · Yu Tian · Shijie Geng · Xinyu Li · Ding Zhou · Dimitris Metaxas · Hongxia Yang
What Can Human Sketches Do for Object Detection?
Pinaki Nath Chowdhury · Ayan Kumar Bhunia · Aneeshan Sain · Subhadeep Koley · Tao Xiang · Yi-Zhe Song
Local-guided Global: Paired Similarity Representation for Visual Reinforcement Learning
Hyesong Choi · Hunsang Lee · Wonil Song · Sangryul Jeon · Kwanghoon Sohn · Dongbo Min
OCTET: Object-Aware Counterfactual Explanations
Mehdi Zemni · Mickael Chen · Eloi Zablocki · Hedi Ben younes · Patrick Perez · Matthieu CORD
Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks
Weihua Chen · Xianzhe Xu · Jian Jia · Hao Luo · Yaohua Wang · Fan Wang · Rong Jin · Xiuyu Sun
Advancing Visual Grounding with Scene Knowledge: Benchmark and Method
Zhihong Chen · Ruifei Zhang · Yibing Song · Xiang Wan · Guanbin Li
FashionSAP: Symbols and Attributes Prompt for Fine-grained Fashion Vision-Language Pre-training
Yunpeng Han · Lisai Zhang · Qingcai Chen · chen zhijian · Zhonghua Li · Jianxin Yang · Zhao Cao
Learning to Exploit Temporal Structure for Biomedical Vision–Language Processing
Shruthi Bannur · Stephanie Hyland · Qianchu Liu · Fernando Pérez-García · Maximilian Ilse · Daniel Castro · Benedikt Boecking · Harshita Sharma · Kenza Bouzid · Anja Thieme · Anton Schwaighofer · Maria Teodora Wetscherek · Matthew Lungren · Aditya Nori · Javier Alvarez Valle · Ozan Oktay
Neural Koopman Pooling: Control-Inspired Temporal Dynamics Encoding for Skeleton-Based Action Recognition
Xinghan Wang · Xin Xu · Yadong MU
Fine-grained Audible Video Description
Xuyang Shen · Dong Li · Jinxing Zhou · Zhen Qin · Bowen He · Xiaodong Han · Aixuan Li · Yuchao Dai · Lingpeng Kong · Meng Wang · Yu Qiao · Yiran Zhong
Language-Guided Audio-Visual Source Separation via Trimodal Consistency
Reuben Tan · Arijit Ray · Andrea Burns · Bryan Plummer · Justin Salamon · Oriol Nieto · Bryan Russell · Kate Saenko
Audio-Visual Grouping Network for Sound Localization from Mixtures
Shentong Mo · Yapeng Tian
Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations
Sagnik Majumder · Hao Jiang · Pierre Moulon · Ethan Henderson · Paul Calamia · Kristen Grauman · Vamsi Krishna Ithapu
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
Lingting Zhu · Xian Liu · Xuanyu Liu · Rui Qian · Ziwei Liu · Lequan Yu
Spatio-Temporal Pixel-Level Contrastive Learning-based Source-Free Domain Adaptation for Video Semantic Segmentation
Shao-Yuan Lo · Poojan Oza · Sumanth Chennupati · Patricio Galindo · Vishal Patel
MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos
Minghan Li · Shuai Li · Wangmeng Xiang · Lei Zhang
System-status-aware Adaptive Network for Online Streaming Video Understanding
Lin Geng Foo · GONG JIA · Zhipeng Fan · Jun Liu
Frame Flexible Network
Yitian Zhang · Yue Bai · Chang Liu · Huan Wang · Sheng Li · Yun Fu
Self-Supervised Video Forensics by Audio-Visual Anomaly Detection
Chao Feng · Ziyang Chen · Andrew Owens
MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation
ROY MILES · Mehmet Kerim Yucel · Bruno Manganelli · Albert Saa-Garriga
Improving Robustness of Semantic Segmentation to Motion-Blur using Class-Centric Augmentation
Aakanksha Aakanksha · Rajagopalan Ambasamduram
MAGVIT: Masked Generative Video Transformer
Lijun Yu · Yong Cheng · Kihyuk Sohn · Jose Lezama · Han Zhang · Huiwen Chang · Alexander Hauptmann · Ming-Hsuan Yang · Yuan Hao · Irfan Essa · Lu Jiang
SCOTCH and SODA: A Transformer Video Shadow Detection Framework
Lihao Liu · Jean Prost · Lei Zhu · Nicolas Papadakis · Pietro Lio · Carola-Bibiane Schönlieb · Angelica Aviles-Rivero
Blind Video Deflickering by Neural Filtering with a Flawed Atlas
Chenyang Lei · Xuanchi Ren · Zhaoxiang Zhang · Qifeng Chen
Probabilistic Debiasing of Scene Graphs
Bashirul Biswas Biswas · Qiang Ji
ViTs for SITS: Vision Transformers for Satellite Image Time Series
Michail Tarasiou · Erik Chavez · Stefanos Zafeiriou
OmniMAE: Single Model Masked Pretraining on Images and Videos
Rohit Girdhar · Alaaeldin El-Nouby · Mannat Singh · Kalyan Vasudev Alwala · Armand Joulin · Ishan Misra
BASiS: Batch Aligned Spectral Embedding Space
Or Streicher · Ido Cohen · Guy Gilboa
Evolved Part Masking for Self-Supervised Learning
Zhanzhou FENG · Shiliang Shiliang
Hard Patches Mining for Masked Image Modeling
Haochen Wang · Kaiyou Song · Junsong Fan · Yuxi Wang · Jin Xie · Zhaoxiang Zhang
Pose-disentangled Contrastive Learning for Self-supervised Facial Representation
Yuanyuan Liu · Wenbin Wang · Yibing Zhan · Shaoze Feng · Kejun Liu · Zhe Chen
OpenGait: Revisiting Gait Recognition Towards Better Practicality
Chao Fan · Junhao Liang · Chuanfu Shen · Saihui Hou · Yongzhen Huang · Shiqi Yu
Autoregressive Visual Tracking
Xing Wei · Yifan Bai · Yongchao Zheng · Dahu Shi · Yihong Gong
Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking
Jinkun Cao · Jiangmiao Pang · Xinshuo Weng · Rawal Khirodkar · Kris Kitani
GazeNeRF: 3D-Aware Gaze Redirection with Neural Radiance Fields
Alessandro Ruzzi · Xiangwei Shi · Xi Wang · Gengyan Li · Shalini De Mello · Hyung Jin Chang · Xucong Zhang · Otmar Hilliges
Phone2Proc: Bringing Robust Robots Into Our Chaotic World
Matt Deitke · Rose Hendrix · Ali Farhadi · Kiana Ehsani · Aniruddha Kembhavi
Learning Human-to-Robot Handovers from Point Clouds
Sammy Christen · Wei Yang · Claudia Pérez-D’Arpino · Otmar Hilliges · Dieter Fox · Yu-Wei Chao
MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion
Chiyu Jiang · Andre Cornman · Cheolho Park · Benjamin Sapp · Yin Zhou · Dragomir Anguelov
Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction
Yi Xu · Armin Bazarjani · Hyung-gun Chi · Chiho Choi · Yun Fu
MixSim: A Hierarchical Framework for Mixed Reality Traffic Simulation
Simon Suo · Kelvin Wong · Justin Xu · James Tu · Alexander Cui · Sergio Casas · Raquel Urtasun
Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving
Xiwen Liang · Minzhe Niu · Jianhua Han · Hang Xu · Chunjing Xu · Xiaodan Liang
Are We Ready for Vision-Centric Driving Streaming Perception? The ASAP Benchmark
Xiaofeng Wang · Zheng Zhu · Yunpeng Zhang · Guan Huang · Yun Ye · Wenbo Xu · Ziwei Chen · Xingang Wang
BAEFormer: Bi-directional and Early Interaction Transformers for Bird's Eye View Semantic Segmentation
Cong Pan · Yonghao He · Junran Peng · Qian Zhang · Wei Sui · Zhaoxiang Zhang
PVO: Panoptic Visual Odometry
Weicai Ye · Xinyue Lan · SHUO CHEN · Yuhang Ming · Xingyuan Yu · Hujun Bao · Zhaopeng Cui · Guofeng Zhang
Unsupervised Cumulative Domain Adaptation for Foggy Scene Optical Flow
Zhou Hanyu · Yi Chang · YAN WENDING · Luxin Yan
Domain Generalized Stereo Matching via Hierarchical Visual Transformation
Tianyu Chang · Xun Yang · Tianzhu Zhang · Meng Wang
Unsupervised Visible-Infrared Person Re-Identification via Progressive Graph Matching and Alternate Learning
Wu Zesen · Mang Ye
Geometric Visual Similarity Learning in 3D Medical Image Self-Supervised Pre-training
Yuting He · Guanyu Yang · Rongjun Ge · Yang Chen · Jean-louis Coatrieux · Boyu Wang · Shuo Li
Progressive Neighbor Consistency Mining for Correspondence Pruning
Xin Liu · Jufeng Yang
Visual Prompt Multi-Modal Tracking
Jiawen Zhu · Simiao Lai · Xin Chen · Dong Wang · Huchuan Lu
Robust Multiview Point Cloud Registration with Reliable Pose Graph Initialization and History Reweighting
Haiping Wang · Yuan Liu · Zhen Dong · Yulan Guo · Yushen Liu · Wenping Wang · Bisheng Yang
PointCert: Point Cloud Classification with Deterministic Certified Robustness Guarantees
Jinghuai Zhang · Jinyuan Jia · Hongbin Liu · Neil Gong
Rethinking the Approximation Error in 3D Surface Fitting for Point Cloud Normal Estimation
Hang Du · Xuejun Yan · Jingjing Wang · Di Xie · Shiliang Pu
FAC: 3D Representation Learning via Foreground Aware Feature Contrast
Kangcheng Liu · Aoran Xiao · Xiaoqin Zhang · Shijian Lu · Ling Shao
ProxyFormer: Proxy Alignment Assisted Point Cloud Completion with Missing Part Sensitive Transformer
Shanshan Li · Pan Gao · Xiaoyang Tan · Mingqiang Wei
PointVector: A Vector Representation In Point Cloud Analysis
Xin Deng · wenyu Zhang · Qing Ding · Xinming Zhang
Fast Point Cloud Generation with Straight Flows
Lemeng Wu · Dilin Wang · Chengyue Gong · Xingchao Liu · Yunyang Xiong · Rakesh Ranjan · Raghuraman Krishnamoorthi · Vikas Chandra · qiang liu
ACL-SPC: Adaptive Closed-Loop system for Self-Supervised Point Cloud Completion
Sangmin Hong · Mohsen Yavartanoo · Reyhaneh Neshatavar Haghighi Shiraz · Kyoung Mu Lee
Open-set Semantic Segmentation for Point Clouds via Adversarial Prototype Framework
Jianan Li · Qiulei Dong
GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds
Honghui Yang · Tong He · Jiaheng Liu · Hua Chen · Boxi Wu · Binbin Lin · Xiaofei He · Wanli Ouyang
Novel Class Discovery for 3D Point Cloud Semantic Segmentation
Luigi Riz · Cristiano Saltori · Elisa Ricci · Fabio Poiesi
3D Semantic Segmentation in the Wild: Learning Generalized Models for Adverse-Condition Point Clouds
Aoran Xiao · Jiaxing Huang · Weihao Xuan · Ruijie Ren · Kangcheng Liu · Dayan Guan · Abdulmotaleb El Saddik · Shijian Lu · Eric Xing
Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation
Li Li · Hubert PH Shum · Toby Breckon
Instant Domain Augmentation for LiDAR Semantic Segmentation
Kwonyoung Ryu · Soonmin Hwang · Jaesik Park
Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
Fangqiang Ding · Andras Palffy · Dariu Gavrila · Xiaoxuan Lu
MoDAR: Using Motion Forecasting for 3D Object Detection in Point Cloud Sequences
Yingwei Li · Charles R. Qi · Yin Zhou · Chenxi Liu · Dragomir Anguelov
Towards Unsupervised Object Detection from LiDAR Point Clouds
Lunjun Zhang · Anqi Joyce Yang · Yuwen Xiong · Sergio Casas · Bin Yang · Mengye Ren · Raquel Urtasun
DeepMapping2: Self-supervised Large-scale LiDAR Map Optimization
Chao Chen · Xinhao Liu · Yiming Li · Li Ding · Chen Feng
ConQueR: Query Contrast Voxel-DETR for 3D Object Detection
Benjin ZHU · Zhe Wang · Shaoshuai Shi · Hang Xu · Lanqing HONG · Hongsheng Li
SGLoc: Scene Geometry Encoding for Outdoor LiDAR Localization
Wen Li · Shangshu Yu · Cheng Wang · Guosheng Hu · Siqi Shen · Chenglu Wen
Depth Estimation from Camera Image and mmWave Radar Point Cloud
Akash Deep Singh · Yunhao Ba · Ankur Sarker · Howard Zhang · Achuta Kadambi · Stefano Soatto · Mani Srivastava · Alex Wong
Towards Building Self-Aware Object Detectors via Reliable Uncertainty Quantification and Calibration
Kemal Oksuz · Tom Joy · Puneet Dokania
Uni3D: A Unified Baseline for Multi-dataset 3D Object Detection
Bo Zhang · Jiakang Yuan · Botian Shi · Tao Chen · Yikang LI · Yu Qiao
Collaboration Helps Camera Overtake LiDAR in 3D Detection
Yue Hu · Yifan Lu · Runsheng Xu · Weidi Xie · Siheng Chen · Yanfeng Wang
BEV@DC: Bird's-Eye View Assisted Training for Depth Completion
Wending Zhou · Xu Yan · Yinghong Liao · Yuankai Lin · Jin Huang · Gangming Zhao · Shuguang Cui · Zhen Li
Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction
Yuanhui Huang · Wenzhao Zheng · Yunpeng Zhang · Jie Zhou · Jiwen Lu
Viewpoint Equivariance for Multi-View 3D Object Detection
Dian Chen · Jie Li · Vitor Guizilini · Rareș Ambruș · Adrien Gaidon
3D Concept Learning and Reasoning from Multi-View Images
Yining Hong · Chunru Lin · Yilun Du · Zhenfang Chen · Joshua Tenenbaum · Chuang Gan
Role of Transients in Two-Bounce Non-Line-of-Sight Imaging
Siddharth Somasundaram · Akshat Dave · Connor Henley · Ashok Veeraraghavan · Ramesh Raskar
3D Spatial Multimodal Knowledge Accumulation for Scene Graph Prediction in Point Cloud
Mingtao Feng · Haoran Hou · LIANG ZHANG · Zijie Wu · Yulan · Ajmal Mian
Revisiting the Stack-Based Inverse Tone Mapping
Ning ZHANG · YUYAO Ye · Yang Zhao · Ronggang
Mvimgnet: A Large-scale Dataset of Multi-view Images
Xianggang Yu · Mutian Xu · Yidan Zhang · Haolin Liu · Chongjie Ye · Yushuang Wu · Zizheng Yan · Chenming Zhu · Zhangyang Xiong · Tianyou Liang · Guanying Chen · Shuguang Cui · Xiaoguang Han
Fully Self-Supervised Depth Estimation from Defocus Clue
Haozhe Si · Bin Zhao · Dong Wang · Yunpeng Gao · Mulin Chen · Zhigang Wang · Xuelong Li
Zero-Shot Dual-Lens Super-Resolution
Ruikang Xu · Mingde Yao · Zhiwei Xiong
Temporally Consistent Online Depth Estimation Using Point-Based Fusion
Numair Khan · Eric Penner · Douglas Lanman · Lei Xiao
Learning to Detect Mirrors from Videos via Dual Correspondences
Jiaying Lin · Xin Tan · Rynson Lau
Renderable Neural Radiance Map for Visual Navigation
obin kwon · Jeongho Park · Songhwai Oh
VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion
Yiming Li · Zhiding Yu · Chris Choy · Chaowei Xiao · Jose Alvarez · Sanja Fidler · Chen Feng · Anima Anandkumar
Behind the Scenes: Density Fields for Single View Reconstruction
Felix Wimbauer · Nan Yang · Christian Rupprecht · Daniel Cremers
Multiview Compressive Coding for 3D Reconstruction
Chao-Yuan Wu · Justin Johnson · Jitendra Malik · Christoph Feichtenhofer · Georgia Gkioxari
Virtual Occlusions Through Implicit Depth
Jamie Watson · Mohamed Sayed · Zawar Imam Qureshi · Gabriel Brostow · Sara Vicente · Oisin Aodha · Michael Firman
Panoptic Lifting for 3D Scene Understanding with Neural Fields
Yawar Siddiqui · Lorenzo Porzi · Samuel Rota Bulò · Norman Müller · Matthias Niessner · Angela Dai · Peter Kontschieder
Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans
Alexey Bokhovkin · Angela Dai
BAAM: Monocular 3D pose and shape reconstruction with bi-contextual attention module and attention-guided modeling
Hyo-Jun Lee · Hanul Kim · Su-Min Choi · Seong-Gyun Jeong · Yeong Jun Koh
BKinD-3D: Self-Supervised 3D Keypoint Discovery from Multi-View Videos
Jennifer J. Sun · Lili Karashchuk · Amil Dravid · Serim Ryou · Sonia Fereidooni · John Tuthill · Aggelos Katsaggelos · Bingni Brunton · Georgia Gkioxari · Ann Kennedy · Yisong Yue · Pietro Perona
Four-view geometry with unknown radial distortion
Petr Hrubý · Viktor Korotynskiy · Timothy Duff · Luke Oeding · Marc Pollefeys · Tomas Pajdla · Viktor Larsson
Two-view Geometry Scoring Without Correspondences
Axel Barroso-Laguna · Eric Brachmann · Victor Prisacariu · Gabriel Brostow · Daniyar Turmukhambetov
Neural Voting Field for Camera-Space 3D Hand Pose Estimation
Lin Huang · Chung-Ching Lin · Kevin Lin · Lin Liang · Lijuan Wang · Junsong Yuan · Zicheng Liu
expOSE: Accurate Initialization-Free Projective Factorization using Exponential Regularization
José Iglesias Iglesias · Amanda Nilsson · Carl Olsson
Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty Propagation
Heng Yang · Marco Pavone
Crowd3D: Towards Hundreds of People Reconstruction from a Single Image
Hao Wen · Jing Huang · Huili Cui · Haozhe Lin · Yu-Kun Lai · LU FANG · Kun Li
Rigidity-Aware Detection for 6D Object Pose Estimation
Hai Yang · Rui Song · Jiaojiao Li · Mathieu Salzmann · Yinlin Hu
Robot Structure Prior Guided Temporal Attention for Camera-to-Robot Pose Estimation from Image Sequence
Yang Tian · Jiyao Zhang · Zekai Yin · Hao Dong
GFIE: A Dataset and Baseline for Gaze-Following from 2D to 3D in Indoor Environments
Zhengxi Hu · Yuxue Yang · Xiaolin Zhai · Dingye Yang · Bohan Zhou · Jingtai Liu
TokenHPE: Learning Orientation Tokens for Efficient Head Pose Estimation via Transformers
Cheng Zhang · Hai Liu · Yongjian Deng · Bochen Xie · Youfu Li
Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation
Xiaolong Shen · Zongxin Yang · Xiaohan Wang · Jianxin Ma · Chang Zhou · Yi Yang
PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation
Qitao Zhao · Ce Zheng · Mengyuan Liu · Pichao WANG · Chen Chen
BITE: Beyond Priors for Improved Three-D Dog Pose Estimation
Nadine Rueegg · Shashank Tripathi · Konrad Schindler · Michael Black · Silvia Zuffi
TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments
Yu Sun · Qian Bao · Wu Liu · Tao Mei · Michael Black
NeuralDome: A Neural Modeling Pipeline on Multi-View Human-Object Interactions
Juze Zhang · Haimin Luo · Hongdi Yang · Xinru Xu · Qianyang Wu · Ye Shi · Jingyi Yu · Lan Xu · Jingya Wang
Target-referenced Reactive Grasping for Dynamic Objects
Jirong Liu · Ruo Zhang · Hao-Shu Fang · Minghao Gou · Hongjie Fang · Chenxi Wang · Sheng Xu · Hengxu Yan · Cewu Lu
Command-driven Articulated Object Understanding and Manipulation
Ruihang Chu · Zhengzhe Liu · Xiaoqing Ye · Xiao Tan · XIAOJUAN QI · Chi-Wing Fu · Jiaya Jia
Visual-Tactile Sensing for In-Hand Object Reconstruction
Wenqiang Xu · Zhenjun Yu · Han Xue · Ruolin Ye · Siqiong Yao · Cewu Lu
MagicPony: Learning Articulated 3D Animals in the Wild
Shangzhe Wu · Ruining Li · Tomas Jakab · Christian Rupprecht · Andrea Vedaldi
Learning Analytical Posterior Probability for Human Mesh Recovery
Qi Fang · Kang Chen · Yinghui Fan · Qing Shuai · Jiefeng Li · Weidong Zhang
Marching-Primitives: Shape Abstraction from Signed Distance Function
Weixiao Liu · Yuwei Wu · Sipu Ruan · Gregory Chirikjian
Learning Neural Volumetric Representations of Dynamic Humans in Minutes
Chen Geng · Sida Peng · Zhen Xu · Hujun Bao · Xiaowei Zhou
Complete 3D Human Reconstruction from a Single Incomplete Image
Junying Wang · Jae Shin Yoon · Tuanfeng Wang · Krishna Kumar Singh · Ulrich Neumann
DIFu: Depth-guided Implicit Function for Clothed Human Reconstruction
Dae-Young Song · HeeKyung Lee · Jeongil Seo · Donghyeon Cho
BEDLAM: A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion
Michael Black · Priyanka Patel · Joachim Tesch · Jinlong Yang
Invertible Neural Skinning
Yash Kant · Aliaksandr Siarohin · Riza Alp Guler · Menglei Chai · Jian Ren · Sergey Tulyakov · Igor Gilitschenski
Zero-shot Pose Transfer for Unrigged Stylized 3D Characters
Jiashun Wang · Xueting Li · Sifei Liu · Shalini De Mello · Orazio Gallo · Xiaolong Wang · Jan Kautz
Biomechanics-guided Facial Action Unit Detection through Force Modeling
Zijun Cui · Chenyi Kuang · Tian Gao · Kartik Talamadupula · Qiang Ji
Hand Avatar: Free-Pose Hand Animation and Rendering from Monocular Video
Xingyu Chen · Baoyuan Wang · Heung-Yeung Shum
High-fidelity Clothed Avatar Reconstruction from a Single Image
Tingting Liao · Xiaomei Zhang · Yuliang Xiu · Hongwei Yi · Xudong Liu · Guo-Jun Qi · Yong Zhang · Xuan Wang · Xiangyu Zhu · Zhen Lei
NeuWigs: A Neural Dynamic model for Volumetric Hair Capture and Animation
Ziyan Wang · Giljoo Nam · Tuur Stuyck · Stephen Lombardi · Chen Cao · Jason Saragih · Michael Zollhöfer · Jessica Hodgins · Christoph Lassner
FitMe: Deep Photorealistic 3D Morphable Model Avatars
Alexandros Lattas · Stylianos Moschoglou · Stylianos Ploumpis · Baris Gecer · Jiankang Deng · Stefanos Zafeiriou
FaceLit: Neural 3D Relightable Faces
Anurag Ranjan · Kwang Moo Yi · Jen-Hao Chang · Oncel Tuzel
Learning a Morphable Face Reflectance Model from Low-cost Data
Yuxuan Han · Zhibo Wang · Feng Xu
Fine-Grained Face Swapping via Regional GAN Inversion
Zhian Liu · Maomao Li · Yong Zhang · Cairong Wang · Qi Zhang · Jue Wang · Yongwei Nie
DiffSwap: High-Fidelity and Controllable Face Swapping via 3D-Aware Masked Diffusion
Wenliang Zhao · Yongming Rao · Weikang Shi · Zuyan Liu · Jie Zhou · Jiwen Lu
Unsupervised 3D Shape Reconstruction by Part Retrieval and Assembly
Xianghao Xu · Paul Guerrero · Matthew Fisher · Siddhartha Chaudhuri · Daniel Ritchie
PREIM3D: 3D Consistent Precise Image Attribute Editing from a Single Image
Jianhui Li · Jianmin Li · Haoji Zhang · Shilong Liu · Zhengyi Wang · Zihao Xiao · Kaiwen Zheng · Jun Zhu
NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation
Yu Yin · Kamran Ghasedi · HsiangTao Wu · Jiaolong Yang · Xin Tong · Yun Fu
Quantitative Manipulation of Custom Attributes on 3D-Aware Image Synthesis
Hoseok Do · EunKyung Yoo · Taehyeong Kim · Chul Lee · Jin Choi
SinGRAF: Learning a 3D Generative Radiance Field for a Single Scene
Minjung Son · Jeong Joon Park · Leonidas Guibas · Gordon Wetzstein
NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models
Seung Wook Kim · Bradley Brown · Kangxue Yin · Karsten Kreis · Katja Schwarz · Daiqing Li · Robin Rombach · Antonio Torralba · Sanja Fidler
NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images
Yunfan Ye · Renjiao Yi · Zhirui Gao · Chenyang Zhu · Zhiping Cai · Kai Xu
NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction
Bowen Cai · Jinchi Huang · Rongfei Jia · chengfei lv · Huan Fu
PermutoSDF: Fast Multi-View Reconstruction with Implicit Surfaces using Permutohedral Lattices
Radu Alexandru Rosu · Sven Behnke
Neuralangelo: High-Fidelity Neural Surface Reconstruction
Zhaoshuo Li · Thomas Müller · Alex Evans · Russ Taylor · Mathias Unberath · Ming-Yu Liu · Chen-Hsuan Lin
RealFusion: 360° Reconstruction of Any Object from a Single Image
Luke Melas-Kyriazi · Iro Laina · Christian Rupprecht · Andrea Vedaldi
Neural Lens Modeling
Wenqi Xian · Aljaz Bozic · Noah Snavely · Christoph Lassner
RGBD2: Generative Scene Synthesis via Incremental View Inpainting using RGBD Diffusion Models
Jiabao Lei · Jiapeng Tang · Kui Jia
Controllable Light Diffusion for Portraits
David Futschik · Kelvin Ritland · James Vecore · Sean Fanello · Sergio Orts-Escolano · Brian Curless · Daniel Sýkora · Rohit Pandey
Weakly-supervised Single-view Image Relighting
Renjiao Yi · Chenyang Zhu · Kai Xu
MAIR: Multi-view Attention Inverse Rendering with 3D Spatially-Varying Lighting Estimation
JunYong Choi · SeokYeong Lee · Haesol Park · Seung-Won Jung · Ig-Jae Kim · Junghyun Cho
DANI-Net: Uncalibrated Photometric Stereo by Differentiable Shadow Handling, Anisotropic Reflectance Modeling, and Neural Inverse Rendering
Zongrui Li · Qian Zheng · Boxin Shi · Gang Pan · Xudong Jiang
Neural Fields meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes
Zian Wang · Tianchang Shen · Jun Gao · SHENGYU HUANG · Jacob Munkberg · Jon Hasselgren · Zan Gojcic · Wenzheng Chen · Sanja Fidler
Pointersect: Neural Rendering with Cloud-Ray Intersection
Jen-Hao Chang · Wei-Yu Chen · Anurag Ranjan · Kwang Moo Yi · Oncel Tuzel
Point2Pix: Photo-Realistic Point Cloud Rendering via Neural Radiance Fields
Tao Hu · Xiaogang Xu · Shu Liu · Jiaya Jia
StyleRF: Zero-shot 3D Style Transfer of Neural Radiance Fields
Kunhao Liu · Fangneng Zhan · Yiwen Chen · Jiahui Zhang · Yingchen Yu · Abdulmotaleb El Saddik · Shijian Lu · Eric Xing
EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points
Chengwei Zheng · Wenbin Lin · Feng Xu
Learning Neural Duplex Radiance Fields for Real-Time View Synthesis
Ziyu Wan · Christian Richardt · Aljaz Bozic · Chao Li · Vijay Rengarajan · Seonghyeon Nam · Xiaoyu Xiang · Tuotuo Li · Bo Zhu · Rakesh Ranjan · Jing Liao
Grid-guided Neural Radiance Fields for Large Urban Scenes
Linning Xu · Yuanbo Xiangli · Sida Peng · Xingang Pan · Nanxuan Zhao · Christian Theobalt · Bo Dai · Dahua Lin
NeRF-DS: Neural Radiance Fields for Dynamic Specular Objects
Zhiwen Yan · Chen Li · Gim Lee
Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision
Xiaoshuai Zhang · Abhijit Kundu · Thomas Funkhouser · Leonidas Guibas · Hao Su · Kyle Genova
Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields
Yue Chen · Xingyu Chen · Xuan Wang · Qi Zhang · Yu Guo · Ying Shan · Fei Wang
FreeNeRF: Improving Few-shot Neural Rendering with Free Frequency Regularization
Jiawei Yang · Marco Pavone · Yue Wang
RefSR-NeRF: Towards High Fidelity and Super Resolution View Synthesis
Xudong Huang · Wei Li · Jie Hu · Hanting Chen · Yunhe Wang
Swept-Angle Synthetic Wavelength Interferometry
Alankar Kotwal · Anat Levin · Ioannis Gkioulekas
Edge-aware Regional Message Passing Controller for Image Forgery Localization
Dong Li · Jiaying Zhu · Menglu Wang · Jiawei Liu · Xueyang Fu · Zheng-Jun Zha
Revisiting Residual Networks for Adversarial Robustness
Shihua Huang · Zhichao Lu · Kalyanmoy Deb · Vishnu Naresh Boddeti
CFA: Class-wise Calibrated Fair Adversarial Training
Zeming Wei · Yifei Wang · Yiwen Guo · Yisen Wang
Feature Separation and Recalibration for Adversarial Robustness
Woo Jae Kim · Yoonki Cho · Junsik Jung · Sung-eui Yoon
Improving the Transferability of Adversarial Samples by Path-Augmented Method
Jianping Zhang · Jen-tse Huang · Wenxuan Wang · Yichen LI · Weibin Wu · Xiaosen Wang · Yuxin Su · Michael Lyu
StyLess: Boosting the Transferability of Adversarial Examples
Kaisheng Liang · Bin Xiao
Minimizing Maximum Model Discrepancy for Transferable Black-box Targeted Attacks
Anqi Zhao · Tong Chu · Yahao Liu · Wen Li · Jingjing Li · Lixin Duan
Adversarially Robust Neural Architecture Search for Graph Neural Networks
Beini Xie · Heng Chang · Ziwei Zhang · Xin Wang · Daixin Wang · Zhiqiang Zhang · Rex Ying · Wenwu Zhu
Color Backdoor: A Robust Poisoning Attack in Color Space
Wenbo Jiang · Hongwei Li · Guowen Xu · Tianwei Zhang
Effective Ambiguity Attack Against Passport-based DNN Intellectual Property Protection Schemes through Fully Connected Layer Substitution
Yiming Chen · Jinyu Tian · Xiangyu Chen · Jiantao Zhou
Single Image Backdoor Inversion via Robust Smoothed Classifiers
Mingjie Sun · J Kolter
Multi-view Adversarial Discriminator: Mine the Non-causal Factors for Object Detection in Unseen Domains
Mingjun Xu · Lingyun Qin · Weijie Chen · Shiliang Pu · Lei Zhang
RiDDLE: Reversible and Diversified De-identification with Latent Encryptor
Dongze Li · Wei Wang · Kang Zhao · Jing Dong · Tieniu Tan
CaPriDe Learning: Confidential and Private Decentralized Learning based on Encryption-friendly Distillation Loss
Nurbek Tastan · Karthik Nandakumar
Federated Learning with Data-Agnostic Distribution Fusion
Jian-hui Duan · Wenzhong Li · Derun Zou · Ruichen Li · Sanglu Lu
Learning Federated Visual Prompt in Null Space for MRI Reconstruction
Chun-Mei Feng · Bangjun Li · Xinxing Xu · Yong Liu · Huazhu Fu · Wangmeng Zuo
Decentralized Learning with Multi-Headed Distillation
Andrey Zhmoginov · Mark Sandler · Nolan Miller · Gus Kristiansen · Max Vladymyrov
Efficient Second-Order Plane Adjustment
Lipu Zhou
Learning Correspondence Uncertainty via Differentiable Nonlinear Least Squares
Dominik Muhle · Lukas Koestler · Krishna Murthy Jatavallabhula · Daniel Cremers
Learning Articulated Shape with Keypoint Pseudo-labels from Web Images
Anastasis Stathopoulos · Georgios Pavlakos · Ligong Han · Dimitris Metaxas
ObjectMatch: Robust Registration using Canonical Object Correspondences
Can Gümeli · Angela Dai · Matthias Niessner
Pose Synchronization under Multiple Pair-wise Relative Poses
Yifan Sun · Qixing Huang
MammalNet: A Large-scale Video Benchmark for Mammal Recognition and Behavior Understanding
Jun Chen · Ming Hu · Darren Coker · Michael L. Berumen · Blair Costelloe · Sara Beery · Anna Rohrbach · Mohamed Elhoseiny
DiffPose: Toward More Reliable 3D Pose Estimation
GONG JIA · Lin Geng Foo · Zhipeng Fan · Qiuhong Ke · Hossein Rahmani · Jun Liu
Scene-aware Egocentric 3D Human Pose Estimation
Jian Wang · Diogo Luvizon · Weipeng Xu · Lingjie Liu · Kripasindhu Sarkar · Christian Theobalt
Unified Pose Sequence Modeling
Lin Geng Foo · Tianjiao Li · Hossein Rahmani · Qiuhong Ke · Jun Liu
A Characteristic Function-based Method for Bottom-up Human Pose Estimation
Haoxuan Qu · Yujun Cai · Lin Geng Foo · Ajay Kumar · Jun Liu
AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation
Takehiko Ohkawa · Kun He · Fadime Sener · Tomas Hodan · LUAN TRAN · Cem Keskin
Harmonious Feature Learning for Interactive Hand-Object Pose Estimation
Zhifeng Lin · Changxing Ding · Huan Yao · Zengsheng Kuang · Shaoli Huang
CIMI4D: A Large Multimodal Climbing Motion Dataset under Human-scene Interactions
Ming Yan · Xin Wang · Yudi Dai · Siqi Shen · Chenglu Wen · Lan Xu · Yuexin Ma · Cheng Wang
MIME: Human-Aware 3D Scene Generation
Hongwei Yi · Chun-Hao Huang · Shashank Tripathi · Lea Hering · Justus Thies · Michael Black
ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction
Zhengdi Yu · Shaoli Huang · Chen Fang · Toby Breckon · Jue Wang
ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation
Zicong Fan · Omid Taheri · Dimitrios Tzionas · Muhammed Kocabas · Manuel Kaufmann · Michael Black · Otmar Hilliges
NIKI: Neural Inverse Kinematics with Invertible Neural Networks for 3D Human Pose and Shape Estimation
Jiefeng Li · Siyuan Bian · Qi Liu · Jiasheng Tang · Fan Wang · Cewu Lu
PC²: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction
Luke Melas-Kyriazi · Christian Rupprecht · Andrea Vedaldi
ShapeClipper: Scalable 3D Shape Learning from Single-View Images via Geometric and CLIP-based Consistency
Zixuan Huang · Varun Jampani · Ngoc Anh Thai · Yuanzhen Li · Stefan Stojanov · James Rehg
Human Body Shape Completion with Implicit Shape and Flow Learning
Boyao Zhou · Di Meng · Jean-Sébastien Franco · Edmond Boyer
gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction
Zerui Chen · Shizhe Chen · Cordelia Schmid · Ivan Laptev
Sampling is Matter: Point-guided 3D Human Mesh Reconstruction
Jeong Hwan Kim · Mi-Gyeong Gwon · Hyunwoo Park · Hyukmin Kwon · Gi-Mun Um · Wonjun Kim
High-fidelity 3D Human Digitization from Single 2K Resolution Images
Sang-Hun Han · Min-Gyu Park · Ju Yoon · Ju-Mi Kang · YOUNG-JAE PARK · Hae-Gon Jeon
Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition
Chen Guo · Tianjian Jiang · Xu Chen · Jie Song · Otmar Hilliges
CLOTH4D: A Dataset for Clothed Human Reconstruction
XINGXING ZOU · Xintong Han · Waikeung Wong
RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-consistent Dataset
Zhongjin Luo · Shengcai Cai · Jinguo Dong · Ruibo Ming · Liangdong Qiu · Xiaohang Zhan · Xiaoguang Han
OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis
Hongyi Xu · Guoxian Song · Zihang Jiang · Jianfeng Zhang · Yichun Shi · Jing Liu · Wanchun Ma · Jiashi Feng · Linjie Luo
HARP: Personalized Hand Reconstruction from Monocular RGB Videos
Korrawe Karunratanakul · Sergey Prokudin · Otmar Hilliges · Siyu Tang
Reconstructing Signing Avatars From Video Using Linguistic Priors
Maria-Paola Forte · Peter Kulits · Chun-Hao Huang · Vasileios Choutas · Dimitrios Tzionas · Katherine J. Kuchenbecker · Michael Black
CODETALKER: Speech-Driven 3D Facial Animation with Discrete Motion Prior
Jinbo Xing · Menghan XIA · Yuechen Zhang · Xiaodong Cun · Jue Wang · Tien-Tsin Wong
Megane: Morphable Eyeglass and Avatar Network
Junxuan Li · Shunsuke Saito · Tomas Simon · Stephen Lombardi · Hongdong Li · Jason Saragih
Parametric Implicit Face Representation for Audio-Driven Facial Reenactment
Ricong Huang · Peiwen Lai · Yipeng Qin · Guanbin Li
3D-aware Facial Landmark Detection via Multi-view Consistent Training on Synthetic Data
Libing Zeng · Lele Chen · Wentao Bao · Zhong Li · Yi Xu · Junsong Yuan · Nima Kalantari
DiffusionRig: Learning Personalized Priors for Facial Appearance Editing
Zheng Ding · Cecilia Zhang · Zhihao Xia · Lars Jebe · Zhuowen Tu · Xiuming Zhang
HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling
Yujian Zheng · Zi-Rong Jin · Moran Li · Haibin Huang · Chongyang Ma · Shuguang Cui · Xiaoguang Han
DCFace: Synthetic Face Generation with Dual Condition Diffusion Model
Minchul Kim · Feng Liu · Anil Jain · Xiaoming Liu
3D-Aware Face Swapping
Yixuan Li · Chao Ma · Yichao Yan · Wenhan Zhu · Xiaokang Yang
CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing
Ambareesh Revanur · Debraj Basu · Shradha Agrawal · Dhwanit Agarwal · Deepak Pai
Local 3D Editing via 3D Distillation of CLIP Knowledge
Junha Hyung · Sungwon Hwang · Daejin Kim · Hyunji Lee · Jaegul Choo
Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
Gal Metzer · Elad Richardson · Or Patashnik · Raja Giryes · Daniel Cohen-Or
3D-aware multi-class image-to-image translation with NeRFs
Senmao Li · Joost van de Weijer · Yaxing Wang · Fahad Khan · Meiqin Liu · jian Yang
Diffusion-SDF: Text-to-Shape via Voxelized Diffusion
Muheng Li · Yueqi Duan · Jie Zhou · Jiwen Lu
Infinite Photorealistic Worlds using Procedural Generation
Alexander Raistrick · Lahav Lipson · Zeyu Ma · Lingjie Mei · Mingzhe Wang · Yiming Zuo · Karhan Kayan · Hongyu Wen · Beining Han · Yihan Wang · Alejandro Newell · Hei Law · Ankit Goyal · Kaiyu Yang · Jia Deng
Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation
Haochen Wang · Xiaodan Du · Jiahao Li · Raymond A. Yeh · Greg Shakhnarovich
RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation
Titas Anciukevicius · Zexiang Xu · Matthew Fisher · Paul Henderson · Hakan Bilen · Niloy Mitra · Paul Guerrero
PET-NeuS: Positional Encoding Tri-planes for Neural Surfaces
Yiqun Wang · Ivan Skorokhodov · Peter Wonka
SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction
Zhizhuo Zhou · Shubham Tulsiani
Dionysus: Recovering Scene Structures by Dividing into Semantic Pieces
Likang Wang · Lei Chen
3D shape reconstruction of semi-transparent worms
Thomas Ilett · Omer Yuval · Thomas Ranner · Netta Cohen · David Hogg
Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container
Jinguang Tong · Sundaram Muthu · Fahira Afzal Maken · Chuong Nguyen · Hongdong Li
HumanGen: Generating Human Radiance Fields with Explicit Priors
Suyi Jiang · Haoran Jiang · Ziyu Wang · Haimin Luo · Wenzheng Chen · Lan Xu
Humans as Light Bulbs: 3D Human Reconstruction from Thermal Reflection
Ruoshi Liu · Carl Vondrick
Accidental Light Probes
Hong-Xing Yu · Samir Agarwala · Charles Herrmann · Richard Szeliski · Noah Snavely · Jiajun Wu · Deqing Sun
Inverse Rendering of Translucent Objects using Physical and Neural Renderers
Chenhao Li · Trung Ngo · Hajime Nagahara
Multi-view Inverse Rendering for Large-scale Real-world Indoor Scenes
Zhen Li · Lingli Wang · Mofang Cheng · Cihui Pan · Jiaqi Yang
K-Planes: Explicit Radiance Fields in Space, Time, and Appearance
Sara Fridovich-Keil · Giacomo Meanti · Frederik Warburg · Benjamin Recht · Angjoo Kanazawa
Efficient Map Sparsification Based on 2D and 3D Discretized Grids
Xiaoyu Zhang · Yun-Hui Liu
Modernizing Old Photos Using Multiple References via Photorealistic Style Transfer
Agus Gunawan · Soo Ye Kim · Hyeonjun Sim · Jae-Ho Lee · Munchurl Kim
DINER: Depth-aware Image-based NEural Radiance fields
Malte Prinzler · Otmar Hilliges · Justus Thies
Cross-Guided Optimization of Radiance Fields with Multi-View Image Super-Resolution for High-Resolution Novel View Synthesis
Youngho Yoon · Kuk-Jin YOON
NeRFLight: Fast and Light Neural Radiance Fields using a Shared Feature Grid
Fernando Rivas-Manzaneque · Jorge Sierra-Acosta · Adrian Penate-Sanchez · Francesc Moreno-Noguer · Angela Ribeiro
Multi-Space Neural Radiance Fields
Ze-Xin Yin · Jiaxiong Qiu · Ming-Ming Cheng · Bo Ren
DyLiN: Making Light Field Networks Dynamic
Heng Yu · Joel Julin · Zoltan Milacski · Koichiro Niinuma · Laszlo Jeni
DP-NeRF: Deblurred Neural Radiance Field with Physical Scene Priors
Do-Gyoon Lee · Minhyeok Lee · Chajin Shin · Sangyoun Lee
SUDS: Scalable Urban Dynamic Scenes
Haithem Turki · Jason Zhang · Francesco Ferroni · Deva Ramanan
NeRFLix: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer
Kun Zhou · Wenbo Li · Yi Wang · Tao Hu · Nianjuan Jiang · Xiaoguang Han · Jiangbo Lu
Polarimetric iToF: Measuring High-Fidelity Depth through Scattering Media
Daniel Jeon · Andreas Meuleman · Seung-Hwan Baek · Min H. Kim
MaLP: Manipulation Localization Using a Proactive Scheme
Vishal Asnani · Xi Yin · Tal Hassner · Xiaoming Liu
Physically Adversarial Infrared Patches with Learnable Shapes and Locations
Xingxing Wei · Jie Yu · Yao Huang
Towards Benchmarking and Assessing Visual Naturalness of Physical World Adversarial Attacks
Simin Li · Shuning Zhang · Gujun Chen · dong wang · Pu Feng · Jiakai Wang · Aishan Liu · Xin Yi · Xianglong Liu
Seasoning Model Soups for Robustness to Adversarial and Natural Distribution Shifts
Francesco Croce · Sylvestre-Alvise Rebuffi · Evan Shelhamer · Sven Gowal
Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression
Junho Kim · Byung-Kwan Lee · Yong Man Ro
Black-Box Sparse Adversarial Attack via Multi-Objective Optimisation
Phoenix Williams · Ke Li
Enhancing the Self-Universality for Transferable Targeted Attacks
Zhipeng Wei · Jingjing Chen · Zuxuan Wu · Yu-Gang Jiang
Evading DeepFake Detectors via Adversarial Statistical Consistency
Hou Yang · Qing Guo · Yihao Huang · Xiaofei Xie · Lei Ma · Jianjun Zhao
CAP: Robust Point Cloud Classification via Semantic and Structural Modeling
Daizong Ding · Erling Jiang · Yuanmin Huang · Mi Zhang · Wenxuan Li · Min Yang
Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger
Yi Yu · Yufei Wang · Wenhan Yang · Shijian Lu · Yap-peng Tan · Alex Kot
FedSeg: Class-Heterogeneous Federated Learning for Semantic Segmentation
Jiaxu Miao · Zongxin Yang · Leilei Fan · Yi Yang
Multimodal Industrial Anomaly Detection via Hybrid Fusion
Yue Wang · Jinlong Peng · Jiangning Zhang · Ran Yi · Yabiao Wang · Chengjie Wang
Unbiased Multiple Instance Learning for Weakly Supervised Video Anomaly Detection
HUI LYU · Zhongqi Yue · Qianru Sun · Bin Luo · Zhen Cui · Hanwang Zhang
Attribute-preserving Face Dataset Anonymization via Latent Code Optimization
Simone Barattin · Christos Tzelepis · Ioannis Patras · Nicu Sebe
HandsOff: Labeled Dataset Generation with No Additional Human Annotations
Austin Xu · Mariya Vasileva · Achal Dave · Arjun Seshadri
Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models
Matthew Olson · Shusen Liu · Rushil Anirudh · Jayaraman J. Thiagarajan · Peer-timo Bremer · Weng-Keen Wong
Learning to Generate Image Embeddings with User-level Differential Privacy
Zheng Xu · Maxwell Collins · Yuxiao Wang · Liviu Panait · Sewoong Oh · Sean Augenstein · Ting Liu · Florian Schroff · Hugh McMahan
Adaptive Data-Free Quantization
Biao Qian · Yang Wang · Richang Hong · Meng Wang
Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
Yuexiao Ma · Huixia Li · Xiawu Zheng · Xuefeng Xiao · Rui Wang · Shilei Wen · Xin Pan · Fei Chao · Rongrong Ji
One-Shot Model for Mixed-Precision Quantization
Ivan Koryakovskiy · Alexandra Yakovleva · Valentin Buchnev · Temur Isaev · Gleb Odinokikh
Training debiased subnetworks with contrastive weight pruning
Geon Yeong Park · Sangmin Lee · Sang Wan Lee · Jong Ye
Understanding Masked Autoencoders via Hierarchical Latent Variable Models
Lingjing Kong · Martin Q. Ma · Guangyi Chen · Eric Xing · Yuejie Chi · Louis-Philippe Morency · Kun Zhang
MobileOne: An Improved One Millisecond Mobile Backbone
Pavan Kumar Anasosalu Vasu · James Gabriel · Jeff Zhu · Oncel Tuzel · Anurag Ranjan
Rate Gradient Approximation Attack Threats Deep Spiking Neural Networks
Tong Bu · Jianhao Ding · Zecheng Hao · Zhaofei Yu
Constructing Deep Spiking Neural Networks from Artificial Neural Networks with Knowledge Distillation
Qi Xu · Yaxin Li · Jiangrong Shen · Jian Liu · Huajin Tang · Gang Pan
From Node Interaction to Hop Interaction: New Effective and Scalable Graph Learning Paradigm
Jie Chen · Zilong Li · Zhu Yin · Junping Zhang · Jian Pu
A General Regret Bound of Preconditioned Gradient Method for DNN Training
Hongwei Yong · Ying Sun · Lei Zhang
Improved Distribution Matching for Dataset Condensation
Ganlong Zhao · Guanbin Li · Yipeng Qin · Yizhou Yu
Imitation Learning as State Matching via Differentiable Physics
Siwei Chen · Xiao Ma · Zhongwen Xu
Trainable Projected Gradient Method for Robust Fine-tuning
Junjiao Tian · Xiaoliang Dai · Chih-Yao Ma · Zecheng He · Yen-Cheng Liu · Zsolt Kira
Improving Generalization of Meta Learning with Inverted Regularization at Inner-level
Lianzhe Wang · Shiji Zhou · Shanghang Zhang · Xu Chu · Heng Chang · Wenwu Zhu
SIM: Semantic-aware Instance Mask Generation for Box-Supervised Instance Segmentation
Ruihuang Li · Chenhang HE · Yabin Zhang · Shuai Li · Liyi Chen · Lei Zhang
Rethinking the Correlation in Few-Shot Segmentation: A Buoys View
Yuan Wang · Rui Sun · Tianzhu Zhang
Reliability in Semantic Segmentation: Are We on the Right Track?
Pau de Jorge Aranda · Riccardo Volpi · Philip Torr · Grégory Rogez
ACSeg: Adaptive Conceptualization for Unsupervised Semantic Segmentation
Kehan Li · Zhennan Wang · Zesen Cheng · Runyi Yu · Yian Zhao · Guoli Song · Chang Liu · Li Yuan · Jie Chen
PartDistillation: Learning Parts from Instance Segmentation
Jang Hyun Cho · Philipp Kraehenbuehl · Vignesh Ramanathan
PACO: Parts and Attributes of Common Objects
Vignesh Ramanathan · Anmol Kalia · Vladan Petrovic · Yi Wen · Baixue Zheng · Baishan Guo · Rui Wang · Aaron Marquez · Rama Kovvuri · Abhishek Kadian · Amir Mousavi · Yiwen Song · Abhimanyu Dubey · Dhruv Mahajan
MIANet: Aggregating Unbiased Instance and General Information for Few-Shot Semantic Segmentation
Yong Yang · Qiong Chen · Yuan Feng · Tianlin Huang
Generative Semantic Segmentation
Jiaqi Chen · Jiachen Lu · Xiatian Zhu · Li Zhang
GeoLayoutLM: Geometric Pre-training for Visual Information Extraction
Chuwei Luo · Changxu Cheng · Qi Zheng · Cong Yao
GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts
Haoran Geng · Helin Xu · Chengyang Zhao · Chao Xu · Li Yi · Siyuan Huang · He Wang
A Simple Framework for Text-Supervised Semantic Segmentation
Muyang Yi · Quan Cui · Hao Wu · Cheng Yang · Osamu Yoshie · Hongtao Lu
Learning to Detect and Segment for Open Vocabulary Object Detection
tao wang
Open-vocabulary Attribute Detection
Maria Bravo · Sudhanshu Mittal · Simon Ging · Thomas Brox
CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching
Xiaoshi Wu · Feng Zhu · Rui Zhao · Hongsheng Li
CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP
Runnan Chen · Youquan Liu · Lingdong Kong · Xinge ZHU · Yuexin Ma · Yikang LI · Yuenan Hou · Yu Qiao · Wenping Wang
PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
Runyu Ding · Jihan Yang · Chuhui Xue · Wenqing Zhang · Song Bai · XIAOJUAN QI
CrOC: Cross-View Online Clustering for Dense Visual Representation Learning
Thomas Stegmüller · Tim Lebailly · Behzad Bozorgtabar · Tinne Tuytelaars · Jean-Philippe Thiran
ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images
Xiangjie Sui · Yuming Fang · Hanwei Zhu · Shiqi Wang · Zhou Wang
Turning a CLIP Model into a Scene Text Detector
wenwen yu · yuliang liu · wei hua · Deqiang Jiang · Bo Ren · Xiang Bai
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
Filip Radenovic · Abhimanyu Dubey · Abhishek Kadian · Todor Mihaylov · Simon Vandenhende · Yash Patel · Yi Wen · Vignesh Ramanathan · Dhruv Mahajan
Uncurated Image-Text Datasets: Shedding Light on Demographic Bias
Noa Garcia · Yusuke Hirota · YANKUN WU · Yuta Nakashima
EXIF as Language: Learning Cross-Modal Associations Between Images and Camera Metadata
Chenhao Zheng · Ayush Shrivastava · Andrew Owens
Cross-Domain Image Captioning with Discriminative Finetuning
Roberto Dessi · Michele Bevilacqua · Eleonora Gualdoni · Nathanaël Rakotonirina · Francesca Franzon · Marco Baroni
Similarity Maps for Self-Training Weakly-Supervised Phrase Grounding
Tal Shaharabany · Lior Wolf
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Sara Sarto · Manuele Barraco · Marcella Cornia · Lorenzo Baraldi · Rita Cucchiara
Detecting and Grounding Multi-Modal Media Manipulation
Rui Shao · Tianxing Wu · Ziwei Liu
DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation
Yueming Lyu · Tianwei Lin · Fu Li · Dongliang He · Jing Dong · Tieniu Tan
Parts2Words: Learning Joint Embedding of Point Clouds and Texts by Bidirectional Matching between Parts and Words
Chuan Tang · Xi Yang · Bojian Wu · Zhizhong Han · Yi Chang
Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR
Aneeshan Sain · Ayan Kumar Bhunia · Subhadeep Koley · Pinaki Nath Chowdhury · Soumitri Chattopadhyay · Tao Xiang · Yi-Zhe Song
GeneCIS: A Benchmark for General Conditional Image Similarity
[...] Sketches
Subhadeep Koley · Ayan Kumar Bhunia · Aneeshan Sain · Pinaki Nath Chowdhury · Tao Xiang · Yi-Zhe Song
Hyperbolic Contrastive Learning for Visual Representations beyond Objects
[...]: A Generalist Painter for In-Context Visual Learning
Xinlong Wang · Wen Wang · Yue Cao · Chunhua Shen · Tiejun Huang
DeAR: Debiasing Vision-Language Models with Additive Residuals
Ashish Seth · Mayur Hemani · Chirag Agarwal
Leverage Interactive Affinity for Affordance Learning
Hongchen Luo · Wei Zhai · Jing Zhang · Yang Cao · Dacheng Tao
Affordance Grounding from Demonstration Video to Target Image
Joya Chen · Difei Gao · Kevin Qinghong Lin · Mike Zheng Shou
Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
Yatai Ji · Rong-Cheng Tu · jie jiang · Weijie Kong · Chengfei Cai · Wenzhe Zhao · WANG HongFa · Yujiu Yang · Wei Liu
Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding
Morris Alper · Michael Fiman · Hadar Averbuch-Elor
Probabilistic Prompt Learning for Dense Prediction
Hyeongjun Kwon · Taeyong Song · Somi Jeong · Jin Kim · Jinhyun Jang · Kwanghoon Sohn
Visual-Language Prompt Tuning with Knowledge-guided Context Optimization
Hantao Yao · Rui Zhang · Changsheng Xu
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
Gi-Cheon Kang · Sungdong Kim · Jinhwa Kim · Donghyun Kwak · Byoung-Tak Zhang
Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasoning
Shi Chen · Qi Zhao
Logical Implications for Visual Question Answering Consistency
Sergio Tascon Morales · Pablo Márquez Neila · Raphael Sznitman
Abstract Visual Reasoning: An Algebraic Approach for Solving Raven’s Progressive Matrices
Jingyi Xu · Tushar Vaidya · Yufei Wu · Saket Chandra · Zhangsheng Lai · Kai Fong Ernest Chong
NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory
Santhosh Kumar Ramakrishnan · Ziad Al-Halah · Kristen Grauman
Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding
Minyoung Hwang · Jaeyeon Jeong · Minsoo Kim · Yoonseon Oh · Songhwai Oh
3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification
Jiazhao Zhang · Liu Dai · Fanpeng Meng · Qingnan Fan · Xuelin Chen · Kai Xu · He Wang
VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision
Mengyin Liu · jie jiang · Chao Zhu · Xu-Cheng Yin
An Actor-Centric Causality Graph for Asynchronous Temporal Inference in Group Activity
Zhao Xie · Tian Gao · Kewei Wu · Jiao Chang
Affection: Learning Affective Explanations for Real-World Visual Data
Panos Achlioptas · Maks Ovsjanikov · Leonidas Guibas · Sergey Tulyakov
Decoupled Multimodal Distilling for Emotion Recognition
Yong Li · Yuanzhi Wang · Zhen Cui
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
Wenhao Wu · Xiaohan Wang · Haipeng Luo · Jingdong Wang · Yi Yang · Wanli Ouyang
Learning Video Representations from Large Language Models
Yue Zhao · Ishan Misra · Philipp Kraehenbuehl · Rohit Girdhar
ProTéGé: Untrimmed Pretraining for Video Temporal Grounding by Video Temporal Grounding
Lan Wang · Gaurav Mittal · Sandra Sajeev · Ye Yu · Matthew Hall · Vishnu Naresh Boddeti · Mei Chen
Fine-tuned CLIP Models are Efficient Video Learners
Hanoona Bangalath · Muhammad Uzair Khattak · Muhammad Maaz · Salman Khan · Fahad Khan
Movies2Scenes: Using Movie Metadata to Learn Scene Representation
Shixing Chen · Chun-Hao Liu · Xiang Hao · Xiaohan Nie · Maxim Arap · Raffay Hamid
Soft-Landing Strategy for Alleviating the Task Discrepancy Problem in Temporal Action Localization Tasks
Hyolim Kang · Hanjung Kim · Joungbin An · Minsu Cho · Seon Joo Kim
Reducing the Label Bias for Timestamp Supervised Temporal Action Segmentation
Kaiyuan Liu · Yunheng Li · Shenglan Liu · Tan · Zihang Shao
Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition
Yuyang Wanyan · Xiaoshan Yang · Chaofan Chen · Changsheng Xu
MMG-Ego4D: Multimodal Generalization in Egocentric Action Recognition
Xinyu Gong · Sreyas Mohan · Naina Dhingra · Jean-Charles Bazin · YILEI LI · Zhangyang Wang · Rakesh Ranjan
Prompt-Guided Zero-Shot Anomaly Action Recognition using Pretrained Deep Skeleton Features
Fumiaki Sato · Ryo Hachiuma · Taiki Sekii
TempSAL - Uncovering Temporal Information for Deep Saliency Prediction
Bahar Aydemir · Ludo Hoffstetter · Tong Zhang · Mathieu Salzmann · Sabine Süsstrunk
Decompose More and Aggregate Better: Two Closer Looks at Frequency Representation Learning for Human Motion Prediction
Xuehao Gao · Shaoyi Du · Yang Wu · Yang Yang
CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective
Junwen Xiong · Ganglai Wang · Peng Zhang · Wei Huang · Yufei Zha · Guangtao Zhai
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment
Sungbin Kim · Arda Senocak · Hyunwoo Ha · Andrew Owens · Tae-Hyun Oh
Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning
Weixuan Sun · Jiayi Zhang · Jianyuan Wang · Zheyuan Liu · Yiran Zhong · Tianpeng Feng · Yandong Guo · Yanhao Zhang · Nick Barnes
Novel-view Acoustic Synthesis
Changan Chen · Alexander Richard · Roman Shapovalov · Vamsi Krishna Ithapu · Natalia Neverova · Kristen Grauman · Andrea Vedaldi
Relational Space-Time Query in Long-Form Videos
Xitong Yang · FU-JEN CHU · Raghav Goyal · Matt Feiszli · Lorenzo Torresani · Du Tran
Selec

Origin blog.csdn.net/weixin_62501745/article/details/130088602