Highlights
1. [Rotated Object Detection] Linear Gaussian Bounding Box Representation and Ring-Shaped Rotated Convolution for Oriented Object Detection
- Paper address: https://arxiv.org//pdf/2311.05410
2. [Point Cloud] (BMVC 2023) 3D-QAE: Fully Quantum Auto-Encoding of 3D Point Clouds
- Paper address: https://arxiv.org//pdf/2311.05604
- Project page: 3D-QAE: Fully Quantum Auto-Encoding of 3D Point Clouds
- Open source code: GitHub - rishabhdabral/3D-QAE: Official code release for the project "3D-QAE: A Fully Quantum Auto-Encoder for 3D Shapes"
3. [Medical Image Segmentation: 3D] Transfer learning from a sparsely annotated dataset of 3D medical images
- Paper address: https://arxiv.org//pdf/2311.05032
4. [Medical Image Segmentation: 3D] CSAM: A 2.5D Cross-Slice Attention Module for Anisotropic Volumetric Medical Image Segmentation
- Paper address: https://arxiv.org//pdf/2311.04942
- Open source code: GitHub - aL3x-Oo-Hung/CSAM
5. [Multimodal] LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
- Paper address: https://arxiv.org//pdf/2311.05437
- Project page: LLaVA-Plus
6. [Multimodal] On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving
- Paper address: https://arxiv.org//pdf/2311.05332
- Open source code: https://github.com/PJLab-ADG/GPT4V-AD-Exploration
7. [Multimodal] (NeurIPS 2023) Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Tasks
- Paper address: https://arxiv.org//pdf/2311.05152
- Open source code: GitHub - haoyi-duan/DG-SCT (official NeurIPS 2023 implementation)
8. [Diffusion] (ACM MM 2023) 3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models
- Paper address: https://arxiv.org//pdf/2311.05464
- Open source code (to be released soon): GitHub - yanghb22-fdu/3DStyle-Diffusion-Official (official code and datasets for the ACM MM 2023 paper)
9. [Human Pose Estimation] (WACV 2024) Active Transfer Learning for Efficient Video-Specific Human Pose Estimation
- Paper address: https://arxiv.org//pdf/2311.05041
- Open source code: https://github.com/ImIntheMiddle/VATL4Pose-WACV2024
CV computer vision exchange group
The group brings together experts in object detection, image segmentation, object tracking, Transformers, multimodal learning, NeRF, GANs, defect detection, salient object detection, keypoint detection, super-resolution reconstruction, SLAM, face-related tasks, OCR, biomedical imaging, 3D reconstruction, pose estimation, autonomous driving perception, depth estimation, video understanding, action recognition, image dehazing, image deraining, image restoration, image retrieval, lane detection, point cloud object detection, point cloud segmentation, image compression, motion prediction, neural network quantization, network deployment, and other fields. Members share technical knowledge, interview tips, and internal-referral job postings from time to time.
To join the group, add the WeChat ID PingShanHai666 to contact the administrator. When sending the friend request, include a note in the form: school/company + research direction + nickname.