CV Computer Vision Daily Open-Source Code: Papers with Code Roundup, 2023.11.30

1. [Medical Image Segmentation] Clean Label Disentangling for Medical Image Segmentation with Noisy Labels

Paper: https://arxiv.org/pdf/2311.16580
Code: https://github.com/xiaoyao3302/2BDenoise

2. [Super-Resolution] SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution

Paper: https://arxiv.org/pdf/2311.16518
Code (coming soon): https://github.com/cswry/SeeSR

3. [Domain Adaptation] Progressive Target-Styled Feature Augmentation for Unsupervised Domain Adaptation on Point Clouds

Paper: https://arxiv.org/pdf/2311.16474
Code: https://github.com/xiaoyao3302/PTSFA

4. [Multimodal] HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting

Paper: https://arxiv.org/pdf/2311.17061
Project page: HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
Code: https://github.com/alvinliu0/HumanGaussian

5. [Multimodal] LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models

Paper: https://arxiv.org/pdf/2311.17043
Code: https://github.com/dvlab-research/LLaMA-VID

6. [Multimodal] Efficient In-Context Learning in Vision-Language Models for Egocentric Videos

Paper: https://arxiv.org/pdf/2311.17041
Code (coming soon): https://github.com/yukw777/EILEV

7. [Multimodal] Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer

Paper: https://arxiv.org/pdf/2311.17009
Project page: Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer
Code: coming soon

8. [Multimodal] MVBench: A Comprehensive Multi-modal Video Understanding Benchmark

Paper: https://arxiv.org/pdf/2311.17005
Code: https://github.com/OpenGVLab/Ask-Anything

9. [Multimodal] SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models

Paper: https://arxiv.org/pdf/2311.16933
Project page: SparseCtrl
Code: coming soon

10. [Multimodal] LLaFS: When Large-Language Models Meet Few-Shot Segmentation

Paper: https://arxiv.org/pdf/2311.16926
Code (coming soon): https://github.com/lanyunzhu99/LLaFS

11. [Multimodal] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding

Paper: https://arxiv.org/pdf/2311.16922
Code: https://github.com/DAMO-NLP-SG/VCD

12. [Multimodal] RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D

Paper: https://arxiv.org/pdf/2311.16918
Project page: RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D
Code: coming soon

13. [Multimodal] A Unified Approach for Text- and Image-guided 4D Scene Generation

Paper: https://arxiv.org/pdf/2311.16854
Project page: Dream-in-4D: A Unified Approach for Text- and Image-guided 4D Scene Generation
Code: coming soon

14. [Multimodal] GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation

Paper: https://arxiv.org/pdf/2311.16511
Project page: https://gpt4video.github.io/
Code (coming soon): https://github.com/gpt4video/GPT4Video

15. [Multimodal] LLMGA: Multimodal Large Language Model based Generation Assistant

Paper: https://arxiv.org/pdf/2311.16500
Project page: LLMGA: Multimodal Large Language Model based Generation Assistant
Code: https://github.com/Zj-BinXia/LLMGA

16. [Multimodal] ChartLlama: A Multimodal LLM for Chart Understanding and Generation

Paper: https://arxiv.org/pdf/2311.16483
Project page: ChartLlama: A Multimodal LLM for Chart Understanding and Generation
Code (coming soon): https://github.com/tingxueronghua/ChartLlama-code

17. [Multimodal] AvatarGPT: All-in-One Framework for Motion Understanding, Planning, Generation and Beyond

Paper: https://arxiv.org/pdf/2311.16468
Project page: AvatarGPT
Code (coming soon): https://github.com/zixiangzhou916/AvatarGPT

18. [Multimodal] TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering

Paper: https://arxiv.org/pdf/2311.16465
Code: https://github.com/microsoft/unilm/tree/master/textdiffuser-2

19. [Multimodal] Removing NSFW Concepts from Vision-and-Language Models for Text-to-Image Retrieval and Generation

Paper: https://arxiv.org/pdf/2311.16254
Code (coming soon): https://github.com/aimagelab/safe-clip

20. [Multimodal] SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

Paper: https://arxiv.org/pdf/2311.16241
Code: https://github.com/google-research/semivl

21. [Autonomous Driving: BEV] Towards Full-scene Domain Generalization in Multi-agent Collaborative Bird's Eye View Segmentation for Connected and Autonomous Driving

Paper: https://arxiv.org/pdf/2311.16754
Code (coming soon): https://github.com/DG-CAVs/DG-CoPerception

22. [Diffusion] Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models

Paper: https://arxiv.org/pdf/2311.17050
Project page: Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models
Code (coming soon): https://github.com/Yzmblog/SurfD

23. [Diffusion] Adversarial Diffusion Distillation

Paper: https://arxiv.org/pdf/2311.17042
Code: https://github.com/Stability-AI/generative-models

24. [Diffusion] Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using Diffusion Models

Paper: https://arxiv.org/pdf/2311.16555
Code (coming soon): https://github.com/99Franklin/DiffText

25. [Diffusion] Fine-grained Appearance Transfer with Diffusion Models

Paper: https://arxiv.org/pdf/2311.16513
Code (coming soon): https://github.com/babahui/Fine-grained-Appearance-Transfer

26. [Diffusion] DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models

Paper: https://arxiv.org/pdf/2311.17053
Project page: DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models
Code: coming soon

27. [Network Pruning] Filter-Pruning of Lightweight Face Detectors Using a Geometric Median Criterion

Paper: https://arxiv.org/pdf/2311.16613
Code: https://github.com/IDT-ITI/Lightweight-Face-Detector-Pruning

28. [Image Editing] Text-Driven Image Editing via Learnable Regions

Paper: https://arxiv.org/pdf/2311.16432
Project page: https://yuanze-lin.me/LearnableRegions_page
Code (coming soon): https://github.com/yuanze-lin/Learnable_Regions

29. [Human Motion Generation] A Unified Framework for Multimodal, Multi-Part Human Motion Synthesis

Paper: https://arxiv.org/pdf/2311.16471
Project page: UDE-2
Code (coming soon): https://github.com/zixiangzhou916/UDE-2

30. [NeRF] UC-NeRF: Neural Radiance Field for Under-Calibrated multi-view cameras in autonomous driving

Paper: https://arxiv.org/pdf/2311.16945
Project page: UC-NeRF: Neural Radiance Field for Under-Calibrated multi-view cameras in autonomous driving
Code (coming soon): https://github.com/kcheng1021/UC-NeRF

31. [Image Synthesis] DemoFusion: Democratising High-Resolution Image Generation With No $$$

Paper: https://arxiv.org/pdf/2311.16973
Project page: DemoFusion
Code (coming soon): https://github.com/PRIS-CV/DemoFusion

32. [Human Reconstruction] Deceptive-Human: Prompt-to-NeRF 3D Human Generation with 3D-Consistent Synthetic Images

Paper: https://arxiv.org/pdf/2311.16499
Code (coming soon): https://github.com/DanielSHKao/DeceptiveHuman

33. [Network Quantization] (NeurIPS 2023) Effective Quantization for Diffusion Models on CPUs

Paper: https://arxiv.org/pdf/2311.16133
Code: https://github.com/intel/intel-extension-for-transformers

34. [Continual Learning] Class-Adaptive Sampling Policy for Efficient Continual Learning

Paper: https://arxiv.org/pdf/2311.16485
Code: https://github.com/hossein-rezaei624/CASP


