1. [Medical Image Segmentation] Clean Label Disentangling for Medical Image Segmentation with Noisy Labels
2. [Super-Resolution] SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution
3. [Domain Adaptation] Progressive Target-Styled Feature Augmentation for Unsupervised Domain Adaptation on Point Clouds
4. [Multimodal] HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
5. [Multimodal] LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
6. [Multimodal] Efficient In-Context Learning in Vision-Language Models for Egocentric Videos
- Code (to be released): https://github.com/yukw777/EILEV
7. [Multimodal] Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer
8. [Multimodal] MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
9. [Multimodal] SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models
- Project page: SparseCtrl
- Code to be released
10. [Multimodal] LLaFS: When Large-Language Models Meet Few-Shot Segmentation
11. [Multimodal] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
12. [Multimodal] RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D
- Project page: RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D
- Code to be released
13. [Multimodal] A Unified Approach for Text- and Image-guided 4D Scene Generation
14. [Multimodal] GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation
- Code (to be released): https://github.com/gpt4video/GPT4Video
15. [Multimodal] LLMGA: Multimodal Large Language Model based Generation Assistant
16. [Multimodal] ChartLlama: A Multimodal LLM for Chart Understanding and Generation
- Project page: ChartLlama: A Multimodal LLM for Chart Understanding and Generation
- Code (to be released): https://github.com/tingxueronghua/ChartLlama-code
17. [Multimodal] AvatarGPT: All-in-One Framework for Motion Understanding, Planning, Generation and Beyond
- Project page: AvatarGPT
- Code (to be released): https://github.com/zixiangzhou916/AvatarGPT
18. [Multimodal] TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering
19. [Multimodal] Removing NSFW Concepts from Vision-and-Language Models for Text-to-Image Retrieval and Generation
20. [Multimodal] SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance
21. [Autonomous Driving: BEV] Towards Full-scene Domain Generalization in Multi-agent Collaborative Bird's Eye View Segmentation for Connected and Autonomous Driving
22. [Diffusion] Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models
- Project page: Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models
- Code (to be released): https://github.com/Yzmblog/SurfD
23. [Diffusion] Adversarial Diffusion Distillation
24. [Diffusion] Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using Diffusion Models
25. [Diffusion] Fine-grained Appearance Transfer with Diffusion Models
26. [Diffusion] DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models
27. [Network Pruning] Filter-Pruning of Lightweight Face Detectors Using a Geometric Median Criterion
28. [Image Editing] Text-Driven Image Editing via Learnable Regions
29. [Human Motion Generation] A Unified Framework for Multimodal, Multi-Part Human Motion Synthesis
- Project page: UDE-2
- Code (to be released): https://github.com/zixiangzhou916/UDE-2
30. [NeRF] UC-NeRF: Neural Radiance Field for Under-Calibrated multi-view cameras in autonomous driving
- Project page: UC-NeRF: Neural Radiance Field for Under-Calibrated multi-view cameras in autonomous driving
- Code (to be released): https://github.com/kcheng1021/UC-NeRF
31. [Image Synthesis] DemoFusion: Democratising High-Resolution Image Generation With No $$$
- Project page: DemoFusion
- Code (to be released): GitHub, PRIS-CV/DemoFusion: Let us democratise high-resolution generation! (arXiv 2023)
32. [Human Reconstruction] Deceptive-Human: Prompt-to-NeRF 3D Human Generation with 3D-Consistent Synthetic Images
33. [Network Quantization] (NeurIPS 2023) Effective Quantization for Diffusion Models on CPUs
34. [Continual Learning] Class-Adaptive Sampling Policy for Efficient Continual Learning