Multiple papers from Professor Liu Yong's research group at Zhejiang University accepted to ICCV 2023, plus a compilation of recent academic results


ICCV (International Conference on Computer Vision) is one of the world's top academic conferences in computer science. Held every two years, its accepted papers represent the latest directions and state of the art in computer vision; this year's edition takes place in Paris, France in October. ICCV 2023 recently announced its acceptance results: the research group of Professor Liu Yong at the College of Control Science and Engineering, Zhejiang University, had 5 papers accepted, covering indoor scene reconstruction, image synthesis, action recognition, pruning/quantization, and 3D object tracking.

Professor Liu Yong's group is a research team of researchers, postdoctoral fellows, and PhD and master's students. Its research directions are broad, and it has produced fruitful results in computer vision and in robot perception and navigation. In the past two months, the team has also had 1 ACM MM (ACM International Conference on Multimedia) paper, 5 IROS (IEEE/RSJ International Conference on Intelligent Robots and Systems) papers, and 7 journal papers accepted. The topics span indoor scene reconstruction, image synthesis, image anomaly detection, action recognition, pruning/quantization, 3D object tracking, point cloud panoptic segmentation, point cloud semantic scene completion, monocular depth estimation, large-scale pursuit-evasion with collision avoidance, LiDAR-inertial SLAM, geo-localization, deepfake detection, multi-agent cooperation, neural network acceleration, and more.

Research group website link:

https://april.zju.edu.cn/our-team/

Research group Github:

https://github.com/APRIL-ZJU

Professor Liu Yong's homepage:

https://scholar.google.com/citations?user=qYcgBbEAAAAJ

01 ICCV 2023

RICO: Regularizing the Unobservable for Indoor Compositional Reconstruction

Efficient Decoupled Reconstruction of Indoor Scenes by Constraining Invisible Regions

Authors: Li Zizhang, Lu Xiaoyang, Ding Yuanyuan, Wang Mengmeng, Liao Yiyi, Liu Yong

Accurate implicit 3D scene geometry can be reconstructed from 2D images by volume rendering of a signed distance function (SDF). Building on this, several recent works represent the scene as separate SDFs for multiple objects and select among them during volume rendering, achieving decoupled reconstruction of each object's geometry given multi-view mask inputs. However, these methods only decouple the visible surfaces effectively. Many objects in indoor scenes can only be observed from partial viewpoints (e.g., a sofa against a wall), so the SDF field cannot be constrained from all directions, and reconstruction in the unobservable regions is often very poor, degrading the overall decoupling. Motivated by this, the paper introduces three geometric priors to constrain the SDF field in unobservable regions. First, for the unobservable background region, it constrains the rendered normals and depths to be locally continuous, preventing holes and highly irregular geometry. Then, once a reasonably complete background surface is available, it exploits the prior that all objects in an indoor scene lie within the background envelope, constraining each object's SDF field both discretely (point-wise sampling) and continuously (reversed rendered depth), finally achieving clean decoupled reconstruction of the individual objects. The method is analyzed qualitatively and quantitatively on real and synthetic indoor scene datasets, with strong improvements in reconstruction quality on both.
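To make the local-continuity prior concrete, here is a minimal PyTorch-style sketch of what such a smoothness loss on rendered background geometry could look like. This is an illustrative assumption, not the paper's actual implementation; the tensor shapes and the equal weighting of the two terms are hypothetical.

```python
import torch
import torch.nn.functional as F

def background_smoothness_loss(patch_depth, patch_normal):
    """Illustrative local-continuity prior for an unobservable background
    region: for depths (P, 1) and normals (P, 3) rendered at P nearby
    pixels, penalize deviation from the patch average so the SDF cannot
    produce holes or wildly irregular geometry there."""
    depth_term = (patch_depth - patch_depth.mean(dim=0, keepdim=True)).abs().mean()
    mean_normal = F.normalize(patch_normal.mean(dim=0, keepdim=True), dim=-1)
    normal_term = (1.0 - F.cosine_similarity(patch_normal, mean_normal, dim=-1)).mean()
    return depth_term + normal_term
```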


The overall framework of the proposed three constraints


Comparison of decoupled reconstruction results between the proposed RICO and previous baseline methods in indoor scenes

02 ICCV 2023

Learning Global-Aware Kernel for Image Harmonization

Image Harmonization by Learning a Global Perceptual Kernel

Authors: Shen Xintian, Zhang Jiangning, Chen Jun, Bai Shipeng, Han Yue, Wang Yabiao, Wang Chengjie, Liu Yong

The paper addresses how to inject background reference information in image harmonization, proposing a new harmonization method based on globally aware dynamic kernels that considers background information comprehensively and in a balanced way. To avoid the background-consistency issues caused by region-matching approaches, it achieves targeted harmonization by learning dynamic kernels over the composite image's foreground; it also designs a mechanism that injects global features during kernel prediction, so that the locally predicted dynamic kernels have access to effective global background information. Through this two-part mechanism of harmonization-kernel prediction and modulation, the method provides a baseline with more reasonable background color references for image harmonization.
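As a rough illustration of the idea, the sketch below predicts per-pixel dynamic kernels from local features concatenated with a pooled global background descriptor, then applies them via dynamic convolution. The architecture is an assumption for exposition, not the paper's actual network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalAwareKernelHead(nn.Module):
    """Hypothetical head: per-pixel k*k kernels conditioned on a pooled
    global context vector, so local harmonization sees the whole background."""
    def __init__(self, c, k=3):
        super().__init__()
        self.k = k
        self.pred = nn.Conv2d(2 * c, c * k * k, kernel_size=1)

    def forward(self, feat):                                      # feat: (B, C, H, W)
        g = feat.mean(dim=(2, 3), keepdim=True).expand_as(feat)   # global context
        return self.pred(torch.cat([feat, g], dim=1))             # (B, C*k*k, H, W)

def apply_dynamic_kernels(x, kernels, k=3):
    """Apply per-pixel kernels with unfold (dynamic convolution)."""
    B, C, H, W = x.shape
    patches = F.unfold(x, k, padding=k // 2).view(B, C, k * k, H, W)
    weights = kernels.view(B, C, k * k, H, W).softmax(dim=2)      # normalize per pixel
    return (patches * weights).sum(dim=2)                         # (B, C, H, W)
```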


03 ICCV 2023

Boosting Few-Shot Action Recognition with Graph-Guided Hybrid Matching

A few-shot action recognition method based on a graph-guided hybrid matching mechanism

Authors: Xing Jiazheng, Wang Mengmeng, Ruan Yudi, Chen Bofan, Guo Yaowei, Mu Boyu, Dai Guang, Wang Jingdong, Liu Yong

In this work, we propose a new few-shot action recognition framework, GgHM, which achieves excellent performance in similar category recognition without any dataset or task preference. Specifically, we explicitly optimize feature intra- and inter-class correlations by utilizing the guidance of a graph network during class prototyping to learn task-oriented features. Furthermore, we propose a hybrid-class prototype matching strategy that exploits frame- and sequence-level prototype matching to efficiently handle video tasks with different styles. Finally, we propose a learnable dense temporal modeling module to enhance video feature temporal representation, which helps to build a more solid foundation for the matching process.
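The hybrid matching idea can be pictured with a toy scoring function that mixes frame-level and sequence-level prototype similarity. The form below (cosine similarity, a fixed mixing weight) is our assumption for illustration, not the paper's exact matching metric.

```python
import torch
import torch.nn.functional as F

def hybrid_match_score(query, prototype, alpha=0.5):
    """query, prototype: (T, D) per-frame features of a video and a class
    prototype. Frame-level matching compares frames one by one;
    sequence-level matching compares temporally pooled features."""
    frame_score = F.cosine_similarity(query, prototype, dim=-1).mean()
    seq_score = F.cosine_similarity(query.mean(dim=0), prototype.mean(dim=0), dim=0)
    return alpha * frame_score + (1 - alpha) * seq_score
```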


04 ICCV 2023

Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning

A unified data-free compression framework: pruning and quantization without fine-tuning

Authors: Bai Shipeng, Chen Jun, Shen Xintian, Qian Yixuan, Liu Yong

The paper addresses scenarios where data is unavailable and retraining is cumbersome. It proposes, for the first time, a unified compression framework requiring neither data nor retraining that covers both model pruning and model quantization. The original information is preserved by assuming that a corrupted (pruned or quantized) channel can be replaced by a linear combination of other channels. The method completes quantization and pruning without using any data, and loses no accuracy at 8-bit precision.
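To see the core assumption at work, here is a toy least-squares version: fit a pruned channel's response as a linear combination of the remaining channels and fold the coefficients into the next layer. Note the paper does this data-free; this sketch uses sample activations only to make the linear-combination idea concrete, and all names are hypothetical.

```python
import numpy as np

def compensate_pruned_channel(acts, next_weight, pruned):
    """acts: (N, C) activations of the layer being pruned; next_weight:
    (C_out, C) weights of the layer consuming them. Approximate the pruned
    channel as a linear combination of the kept ones, then absorb that
    combination into the next layer's weights."""
    keep = [c for c in range(acts.shape[1]) if c != pruned]
    coef, *_ = np.linalg.lstsq(acts[:, keep], acts[:, pruned], rcond=None)
    W = next_weight[:, keep] + np.outer(next_weight[:, pruned], coef)
    return keep, W  # the next layer now only needs the kept channels
```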


05 ICCV 2023

Synchronize Feature Extracting and Matching: A Single Branch Framework for 3D Object Tracking

Simultaneous feature extraction and matching: a single-branch network framework for 3D object tracking

Authors: Ma Teli, Wang Mengmeng, Xiao Jimin, Wu Huifeng, Liu Yong

In 3D point-cloud single-object tracking, existing methods run a Siamese network twice, extracting features for the template (initial) frame and the search (current) frame separately. To address this, the paper proposes a new Transformer-based backbone that extracts features for both frames simultaneously while also letting the two sets of features interact. This avoids designing a separate feature-matching network as in previous methods, so a single backbone completes all the work and the network design stays simple and elegant. In addition, departing from previous point-cloud downsampling schemes, the paper proposes downsampling and aggregating point features according to the attention maps of the self-attention mechanism, so that the multi-scale features most relevant to the target object are extracted during the network's forward pass.
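The attention-guided downsampling can be sketched in a few lines: score every point by the attention it receives in the self-attention map and keep the top-k. The scoring rule (column sum of the attention matrix) is our simplification, not necessarily the paper's exact aggregation.

```python
import torch

def attention_downsample(points, feats, attn, k):
    """points: (N, 3), feats: (N, D), attn: (N, N) row-normalized
    self-attention weights. Points that receive the most attention are
    assumed to carry the information most relevant to the target."""
    scores = attn.sum(dim=0)            # total attention each point receives
    idx = scores.topk(k).indices
    return points[idx], feats[idx]
```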


06 ACM MM 2023

CenterLPS: Segment Instances by Centers for LiDAR Panoptic Segmentation

Point cloud panoptic segmentation based on instance-center encoding

Authors: Mei Jianbiao, Yang Yu, Wang Mengmeng, Li Zizhang, Hou Xiaojun, Luo Zhongyuan, Li Laijian, Liu Yong

LiDAR panoptic segmentation (LPS) has broad application prospects in autonomous driving and robotics and has received increasing attention in recent years. Mainstream LPS methods either adopt a top-down strategy, relying on 3D object detectors to discover instances, or group points into instances bottom-up with time-consuming heuristic clustering algorithms. Inspired by instance-center representations and dynamic-kernel-based segmentation, this paper designs a center-based instance encoding/decoding paradigm and proposes a new detection-free and clustering-free framework, CenterLPS. Specifically, to encode instance features efficiently, it proposes a sparse center proposal network that generates sparse 3D instance centers with corresponding center feature embeddings. A center-aware Transformer then aggregates context among the feature embeddings of different instance centers and the points around those centers. Furthermore, dynamic kernel weights are generated from the enhanced center embeddings and used to initialize dynamic convolutions that decode the final instance masks. Finally, a mask fusion module unifies the semantic and instance predictions and improves the final panoptic quality. Extensive experiments on SemanticKITTI and nuScenes demonstrate the effectiveness of the proposed center-based framework.
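At its simplest, dynamic-kernel mask decoding is a dot product between per-point features and one kernel per instance center; the sketch below reduces the paper's dynamic convolutions to that 1x1 case for illustration.

```python
import torch

def decode_instance_masks(point_feats, center_kernels):
    """point_feats: (N, D) features for N points; center_kernels: (M, D),
    one dynamic kernel generated from each instance-center embedding.
    Returns (M, N) soft masks, one per proposed instance."""
    return torch.sigmoid(center_kernels @ point_feats.T)
```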


07 IROS 2023

SSC-RS: Elevate LiDAR Semantic Scene Completion with Representation Separation and BEV Fusion

Point cloud semantic scene completion based on representation separation and BEV fusion

Authors: Mei Jianbiao, Yang Yu, Wang Mengmeng, Huang Tianxin, Yang Xuemeng, Liu Yong

Point cloud semantic scene completion (SSC) predicts the semantic occupancy of every voxel in a 3D scene from a sparse LiDAR point cloud. Because outdoor scenes are complex (varied shapes and sizes, occlusions), accurately estimating the semantic and geometric structure of the full 3D scene from partial observations is challenging. Most outdoor SSC methods treat semantic context (semantic representation) and geometric structure (geometric representation) in a hybrid or semi-hybrid way; how to learn semantic/geometric representations effectively and exploit the relationship between them remains open. This paper proposes a new SSC network built on separating the feature representations (semantic vs. geometric) and fusing them in bird's-eye view (BEV). Exploiting the distinct properties of the two representations, a sparse semantic branch with hierarchical supervision and a lightweight geometry branch extract multi-scale semantic context and geometric structure respectively, strengthening the network's representational power while reducing computation. In addition, a BEV fusion network adaptively fuses the semantic and geometric features, which is more convenient and efficient than conventional dense feature fusion in 3D space.
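One plausible realization of adaptive BEV fusion is a learned per-location gate between the two feature maps, as sketched below; the paper's actual fusion network may be more elaborate.

```python
import torch
import torch.nn as nn

class AdaptiveBEVFusion(nn.Module):
    """Gate-based fusion of semantic and geometric BEV maps (B, C, H, W):
    the gate decides per location how much of each branch to keep."""
    def __init__(self, c):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * c, c, 3, padding=1), nn.Sigmoid())

    def forward(self, sem_bev, geo_bev):
        g = self.gate(torch.cat([sem_bev, geo_bev], dim=1))
        return g * sem_bev + (1 - g) * geo_bev
```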


08 IROS 2023

PANet: LiDAR Panoptic Segmentation with Sparse Instance Proposal and Aggregation

Point cloud panoptic segmentation based on sparse instance proposal and aggregation

Authors: Mei Jianbiao, Yang Yu, Wang Mengmeng, Hou Xiaojun, Li Laijian, Liu Yong

LiDAR panoptic segmentation (LPS) combines point cloud semantic segmentation and instance segmentation in one framework, providing semantic labels for points in the scene and instance IDs for points belonging to instances (things). To remove mainstream clustering methods' dependence on offset branches and to improve performance on large objects, this paper proposes a new LPS framework. First, it introduces a learning-free Sparse Instance Proposal (SIP) module with a "sampling-offset-clustering" scheme that efficiently clusters foreground points of the raw point cloud directly into instances: balanced point sampling generates sparse seed points more uniformly distributed over the distance range; an offset method dubbed "bubble shrinkage" moves the seed points toward cluster centers; and a connected-component labeling algorithm then generates the instance proposals. Furthermore, an instance aggregation module merges potentially fragmented instances, improving the SIP module's performance on large objects.
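The final grouping step of SIP can be pictured as connected-component labeling over the offset seed points. The union-find sketch below, with a plain distance threshold, is an illustrative stand-in rather than the paper's implementation.

```python
import numpy as np

def group_seeds(seeds, radius=0.5):
    """seeds: (N, 3) seed points after offsetting toward cluster centers.
    Seeds closer than `radius` are merged into one instance (union-find)."""
    n, parent = len(seeds), list(range(len(seeds)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(seeds[i] - seeds[j]) < radius:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]     # instance id per seed
```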


09 IROS 2023

Self-Supervised Event-Based Monocular Depth Estimation using Cross-Modal Consistency

Self-supervised Monocular Depth Estimation from Event Cameras Using Cross-Modal Consistency

Authors: Zhu Junyu, Liu Lina, Jiang Bofeng, Wen Feng, Zhang Hongbo, Li Wanlong, Liu Yong

An event camera is a new type of camera that outputs asynchronous event data, offering high temporal resolution, high dynamic range, low bandwidth, and low power consumption. In recent years many works have explored various vision tasks with event cameras; this paper focuses on self-supervised monocular depth estimation from event data. Because event data, unlike image data, does not satisfy the brightness-constancy assumption, the paper performs monocular depth estimation on pixel-aligned event and image data together with pose estimation between adjacent image frames, and obtains a self-supervision signal by constructing a cross-modal consistency loss based on the brightness-constancy assumption of images. In addition, since event data is very sparse on the image plane, the paper proposes a multi-scale skip-connection structure to improve the network's feature extraction. Experiments on the MVSEC and DSEC datasets verify the effectiveness of these contributions, and the model's accuracy surpasses existing related methods.
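The cross-modal loss follows the standard self-supervised depth recipe: depth predicted from events warps one image frame into another, and the photometric difference supervises the network. The sketch below shows that generic warping loss; the shapes and details are our assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

def cross_modal_photometric_loss(img_t, img_s, depth, pose, K):
    """img_t/img_s: (B, 3, H, W) target/source images; depth: (B, 1, H, W)
    predicted from events, aligned with img_t; pose: (B, 4, 4) target-to-
    source transform; K: (3, 3) intrinsics. Reproject, sample, compare."""
    B, _, H, W = img_t.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).float().view(3, -1)
    cam = torch.linalg.inv(K) @ pix                      # pixel rays, (3, H*W)
    cam = cam.unsqueeze(0) * depth.view(B, 1, -1)        # back-projected points
    cam = pose[:, :3, :3] @ cam + pose[:, :3, 3:]        # into the source frame
    uv = K @ cam
    uv = uv[:, :2] / uv[:, 2:].clamp(min=1e-6)           # perspective divide
    u = uv[:, 0].view(B, H, W) / (W - 1) * 2 - 1         # to [-1, 1] for grid_sample
    v = uv[:, 1].view(B, H, W) / (H - 1) * 2 - 1
    warped = F.grid_sample(img_s, torch.stack([u, v], dim=-1), align_corners=True)
    return (warped - img_t).abs().mean()
```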


10 IROS 2023

Large Scale Pursuit-Evasion Under Collision Avoidance Using Deep Reinforcement Learning

Large-scale pursuit-evasion with collision avoidance based on deep reinforcement learning

Authors: Yang Helei, Ge Peng, Cao Junjie, Yang Yifan, Liu Yong

This paper studies multi-robot pursuit-evasion in multiple-pursuer multiple-evader (MPME) scenarios. Robots on both sides make decisions in a distributed manner and carry out their respective pursuit/evasion tasks while satisfying kinematic constraints and avoiding collisions. Evaders, being more agile than pursuers, pose a key challenge to pursuer cooperation. The learned strategies of the two sides co-evolve through game-theoretic confrontation, effectively reducing the overfitting risk that comes with training against rule-based opponents. To tackle the curse of dimensionality in large-scale scenarios, the paper proposes a Mix-Attention method based on the self-attention mechanism. Experimental results show that as the scale grows, combining Mix-Attention with IPPO significantly outperforms other methods on the MPME problem. The trained strategies also show a degree of generalization and robustness, adapting to scenes with different numbers of agents and obstacles without additional training.
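One reason attention helps with scale is that it turns a variable number of neighbor observations into a fixed-size embedding. The toy encoder below illustrates that property; its structure is our assumption, not the paper's Mix-Attention design.

```python
import torch
import torch.nn as nn

class NeighborAttentionEncoder(nn.Module):
    """Toy attention-based observation encoder for a variable number of
    agents: self features query neighbor features, then the result is a
    fixed-size embedding regardless of how many neighbors there are.
    `d` must be divisible by num_heads."""
    def __init__(self, d, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d, num_heads=num_heads, batch_first=True)

    def forward(self, self_feat, neighbor_feats):
        # self_feat: (B, 1, d) query; neighbor_feats: (B, N, d) keys/values
        ctx, _ = self.attn(self_feat, neighbor_feats, neighbor_feats)
        return ctx.squeeze(1)   # (B, d), independent of N
```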


11 IROS 2023

LiDAR-Inertial SLAM with Efficiently Extracted Planes

LiDAR-inertial SLAM assisted by efficiently extracted planes

Authors: Chen Chao, Wu Hangyu, Ma Yukai, Lu Jiajun, Li Laijian, Liu Yong

To address the accumulated positioning drift of existing LiDAR-inertial SLAM caused by insufficient observation constraints in small indoor environments, this paper proposes a LiDAR-inertial SLAM system assisted by efficiently extracted planes. It introduces a point→line→plane extraction method based on region growing: line segments are first fitted to the points on each scan ring of a mechanical LiDAR, a connectivity graph is then built over adjacent line segments, and finally region growing over this graph fits the planes. Compared with plane extraction methods of the same category, this approach runs faster and extracts more complete planes. The extracted explicit planes are integrated into the SLAM front-end odometry to obtain more accurate initial poses for back-end optimization; in the back end, the explicit planes serve as new landmarks in factor graph optimization. A voxel-hash-based plane map management scheme and a more robust plane matching strategy are also proposed, significantly reducing the system's accumulated drift.
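Region growing over the line-segment connectivity graph might look like the following sketch: absorb neighboring segments as long as a least-squares plane still fits the accumulated points well. The residual test, data layout, and refit-everything strategy are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def grow_plane(segments, adjacency, seed, tol=0.02):
    """segments: dict id -> (N_i, 3) points of a fitted line segment;
    adjacency: dict id -> neighbor ids from the connectivity graph.
    Grow a plane from `seed` while the RMS point-to-plane residual of a
    least-squares fit stays below `tol` (meters)."""
    region, frontier = {seed}, [seed]
    while frontier:
        cur = frontier.pop()
        for nb in adjacency[cur]:
            if nb in region:
                continue
            pts = np.vstack([segments[s] for s in region | {nb}])
            centered = pts - pts.mean(axis=0)
            normal = np.linalg.svd(centered)[2][-1]    # least-variance direction
            if np.sqrt(np.mean((centered @ normal) ** 2)) < tol:
                region.add(nb)
                frontier.append(nb)
    return region   # segment ids belonging to one plane
```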


12 Journal TIP

Omni-Frequency Channel-Selection Representations for Unsupervised Anomaly Detection

Unsupervised Image Anomaly Detection Based on Full-Band Reconstruction and Channel Selection

Authors: Liang Yufei, Zhang Jiangning, Zhao Shiwei, Wu Runze, Liu Yong, Pan Shuwen

In recent years, methods based on density estimation and classification prediction have dominated unsupervised anomaly detection, while reconstruction-based methods have received little attention due to weak reconstruction ability and insufficient performance. This paper therefore focuses on improving reconstruction-based image anomaly detection and proposes a novel OCR-GAN that handles the task from a frequency perspective. Specifically, based on the observation that normal and abnormal images differ significantly in frequency distribution, a frequency decoupling (FD) module decouples the input image into different frequency components, and the reconstruction process is modeled as a combination of parallel omni-frequency image restorations. Considering the correlation among frequencies, a channel selection (CS) module further enables information interaction between the encoders of different frequency bands by adaptively selecting channel features. Extensive experiments demonstrate the method's effectiveness and superiority: for example, on the MVTec AD dataset, OCR-GAN achieves 98.3 detection AUC, 38.1 points above the reconstruction-based baseline and 0.3 points above the then-current SOTA.
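Frequency decoupling can be illustrated with a simple low-pass/high-pass split: blur for the low-frequency component and take the residual as the high-frequency one, with each component then restored by its own branch. This is one plausible realization for exposition, not necessarily the paper's FD module.

```python
import torch
import torch.nn.functional as F

def frequency_decouple(img, k=5):
    """img: (B, C, H, W). Box blur (a stand-in for any low-pass filter)
    gives the low-frequency component; the residual is high frequency."""
    c = img.size(1)
    kernel = torch.ones(c, 1, k, k, device=img.device) / (k * k)
    low = F.conv2d(img, kernel, padding=k // 2, groups=c)
    high = img - low
    return low, high   # reconstructed in parallel, then recombined
```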


13 Journal PR

Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning

Data-free quantization: mixed-precision compensation without fine-tuning

Authors: Chen Jun, Bai Shipeng, Huang Tianxin, Wang Mengmeng, Tian Guanzhong, Liu Yong

Neural network quantization is a promising solution for model compression, but the resulting accuracy depends heavily on the training/fine-tuning process and requires the original data. This not only costs substantial computation and time but also works against the protection of privacy and sensitive information, so recent work has begun to focus on data-free quantization. However, data-free quantization performs poorly at ultra-low precision. Researchers have partially addressed this with synthetic data generation, but data synthesis is computation- and time-intensive. In this paper, we propose a data-free mixed-precision compensation (DF-MPC) method that recovers the performance of ultra-low-precision quantized models without any data or fine-tuning. Assuming the quantization error induced by a low-precision quantized layer can be restored by reconstructing a high-precision quantized layer, we mathematically formulate the reconstruction loss between the pretrained full-precision model and its layer-wise mixed-precision quantized counterpart, and theoretically derive a closed-form solution by minimizing the reconstruction loss of the feature maps. Since DF-MPC requires no original or synthetic data, it is a more efficient way to approximate the full-precision model. Experimentally, DF-MPC achieves higher accuracy for ultra-low-precision quantized models than recent comparable methods, again without any data or fine-tuning.
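The flavor of such a closed-form solution can be shown with explicit feature maps: given a layer's inputs before and after quantizing its predecessor, the compensating weights minimize a Frobenius-norm reconstruction loss. The paper derives its solution without any data; the calibration matrices here are only to make the algebra concrete.

```python
import numpy as np

def compensate_next_layer(V, Y, Y_q):
    """V: (C_out, C) full-precision weights of the layer after the
    quantized one; Y / Y_q: (C, N) its input features before/after
    quantizing the previous layer. Closed-form minimizer of
    ||V @ Y - V' @ Y_q||_F:  V' = V Y Yq^T (Yq Yq^T)^+ ."""
    return V @ Y @ Y_q.T @ np.linalg.pinv(Y_q @ Y_q.T)
```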


14 Journal RAL

Geo-Localization with Transformer-Based 2D-3D Match Network

A geo-localization method based on a 2D-3D matching network

Authors: Li Laijian, Ma Yukai, Tang Kai, Zhao Xiangrui, Chen Chao, Huang Jianxin, Mei Jianbiao, Liu Yong

This paper proposes a geo-localization method based on cross-domain matching between satellite maps and LiDAR point clouds. It introduces a 2D-3D matching network, D-GLSNet, that learns the matching relationship between LiDAR point clouds and satellite imagery end to end. D-GLSNet provides accurate point-to-pixel associations between the point cloud and the satellite image, from which the horizontal offset (Δx, Δy) and angular deviation Δθ between them are computed, achieving accurate registration. To exploit the network's potential for localization, a Geo-Localization Node (GLN) is built on top of it; the node performs geo-localization and integrates seamlessly with a SLAM system. Compared to GPS, GLN is less susceptible to external interference such as building shadows. In urban scenes, D-GLSNet outputs high-quality matches, enabling GLN to run stably and provide more accurate localization results.
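Given point-to-pixel matches, (Δx, Δy, Δθ) follow from a standard 2D rigid alignment. The Procrustes/Kabsch sketch below is one conventional way to compute it from D-GLSNet-style correspondences, not necessarily the paper's estimator.

```python
import numpy as np

def estimate_pose_2d(p_lidar, p_sat):
    """p_lidar, p_sat: (N, 2) matched coordinates in a common metric
    frame. Returns (dx, dy, dtheta) of the rigid transform mapping the
    LiDAR points onto the satellite-image points (2D Kabsch)."""
    mu_a, mu_b = p_lidar.mean(axis=0), p_sat.mean(axis=0)
    A, B = p_lidar - mu_a, p_sat - mu_b
    U, _, Vt = np.linalg.svd(A.T @ B)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = mu_b - R @ mu_a
    return t[0], t[1], np.arctan2(R[1, 0], R[0, 0])
```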


15 Journal PRL

Hierarchical Supervisions with Two-Stream Network for Deepfake Detection

Deepfake detection based on hierarchical supervision and a two-stream network

Authors: Liang Yufei, Wang Mengmeng, Jin Yining, Pan Shuwen, Liu Yong

Recently, the quality of face generation and forgery has reached impressive levels, making it difficult even for humans to tell real faces from fake ones. Methods for distinguishing real from fake faces, i.e., deepfake detection, are emerging at the same time. The task remains challenging, however: the low-resolution fake images circulating on the Internet and the diversity of face generation methods make it particularly hard. This work proposes a novel deepfake detection network that effectively distinguishes fake faces of different resolutions produced by different generation methods. First, the paper designs a two-stream framework with a regular image-domain branch and a frequency-domain branch to handle low-resolution forgery detection, based on the finding that frequency-domain artifacts survive in low-resolution images. Second, it introduces hierarchical supervision in a coarse-to-fine manner: a coarse classification branch distinguishes real from fake images, and a fine classification branch further subdivides samples into real images and four different types of fake images. Extensive experiments demonstrate the effectiveness of the proposed framework on the widely used FaceForensics++ dataset.
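The coarse-to-fine supervision can be sketched as two classification heads over a shared feature, trained with a combined loss; the head layout below is an assumption for illustration (FaceForensics++ does contain four manipulation types, matching the five-way fine branch).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalHead(nn.Module):
    """Hypothetical head layout: a coarse real/fake branch plus a fine
    branch over real + four forgery types, sharing the backbone feature."""
    def __init__(self, d):
        super().__init__()
        self.coarse = nn.Linear(d, 2)   # real vs. fake
        self.fine = nn.Linear(d, 5)     # real + 4 manipulation methods

    def forward(self, feat):
        return self.coarse(feat), self.fine(feat)

def hierarchical_loss(coarse_logits, fine_logits, y2, y5):
    """Combined objective: y2 is the binary label, y5 the five-way label."""
    return F.cross_entropy(coarse_logits, y2) + F.cross_entropy(fine_logits, y5)
```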


16 Journal ACM TIST

Fast Real-Time Video Object Segmentation with a Tangled Memory Network

Fast real-time video object segmentation based on a tangled memory network

Authors: Mei Jianbiao, Wang Mengmeng, Yang Yu, Li Yanjun, Liu Yong

This paper proposes a fast, real-time tangled memory network (TMN) for efficient video object segmentation. Specifically, it introduces a tangled reference encoder and a state-estimation-based memory organization mechanism to fully exploit mask features while reducing the memory overhead and computational burden that dynamic memory storage imposes on memory-based methods. First, a tangled two-stream reference encoder extracts and fuses features from RGB frames and predicted masks, exploiting the rich edge and contour information carried by the masks. Second, an object state estimator learns the IoU between the predicted mask and the ground truth, indicating prediction quality and feeding the online-predicted state back for memory organization. Moreover, to speed up inference and avoid memory overflow, an efficient memory organization mechanism built on the estimator's mask state scores stores historical features in a fixed-size memory.
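A minimal sketch of score-based, fixed-size memory organization is shown below: keep at most `capacity` frames and evict the lowest-quality entry, where quality is the state estimator's predicted mask IoU. The eviction policy is an assumption for illustration.

```python
class QualityMemory:
    """Toy fixed-size memory keyed by predicted mask quality."""
    def __init__(self, capacity=5):
        self.capacity = capacity
        self.entries = []                     # list of (score, features)

    def update(self, score, feats):
        """Insert a new frame's features; evict the worst if over capacity."""
        self.entries.append((score, feats))
        if len(self.entries) > self.capacity:
            self.entries.remove(min(self.entries, key=lambda e: e[0]))
```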


17 Journal NCA

Expert Demonstrations Guide Reward Decomposition for Multi-Agent Cooperation

Expert-demonstration-guided reward decomposition for multi-agent cooperation

Authors: Liu Weiwei, Jing Wei, Liu Shanqi, Ruan Yudi, Zhang Kexin, Yang Jian, Liu Yong

Humans accomplish team tasks well through cooperation because each member knows the contribution of their own behavior to the team; in multi-agent systems, by contrast, agents share the same team reward, which creates a credit assignment problem. This work uses expert demonstrations to decompose the reward at each time step. Specifically, GAIL and MAGAIL are introduced to measure the similarity between an agent's behavior and the expert's and to learn a reward decomposition strategy. As shown in the figure, a discriminator trained on expert samples and experience from the reinforcement learning replay buffer judges how similar each agent's state-action pair is to the expert samples, and the reward is decomposed accordingly. Overall, the algorithm provides each agent with a per-step reward signal and guides agents to behave more like experts. Extensive experiments on three open-source platforms verify the algorithm's effectiveness; it performs much better than baseline algorithms in some scenarios, and with the same number of expert samples it significantly outperforms imitation learning algorithms such as MAGAIL and MAVEN+BC. Finally, compared to the baselines, the algorithm produces data distributions closer to the expert sample distribution, since the expert data indirectly guides policy updates.
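As a toy illustration of discriminator-guided credit assignment, the sketch below scores each agent's state-action pair by its similarity to expert behavior and splits the shared team reward in proportion; the softmax-proportional split and the discriminator interface are our assumptions, not the paper's exact rule.

```python
import torch

def decomposed_rewards(discriminator, states, actions, team_reward):
    """discriminator: callable (state, action) -> scalar expert-similarity
    score; states/actions: per-agent lists. Returns a per-agent share of
    the shared team reward."""
    with torch.no_grad():
        sims = torch.stack([discriminator(s, a).reshape(())
                            for s, a in zip(states, actions)])
    weights = torch.softmax(sims, dim=0)      # more expert-like -> larger share
    return weights * team_reward
```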


Block diagram of the expert-guided reward decomposition algorithm

18 Journal EAAI

Single-Shot Pruning and Quantization for Hardware-Friendly Neural Network Acceleration

One-shot pruning and quantization: a hardware-friendly approach to neural network acceleration

Authors: Jiang Bofeng, Chen Jun, Liu Yong

Deploying convolutional neural networks (CNNs) in embedded systems is promising but challenging: modern CNNs have many parameters and heavy inference-time computation, while the underlying hardware resources are limited. Pruning and model quantization are widely used to address this, but they are time-consuming and can only compress a network serially. This paper proposes a single-shot pruning and quantization strategy that quantizes and prunes a CNN simultaneously within a single training process, so that both the quantization error and the structural pruning error are taken into account when updating network parameters. The method is evaluated on two commonly used datasets, CIFAR-10 and CIFAR-100. Experiments show that compared with the original model, the compressed model is 69.4% smaller with only a small drop in accuracy, and runs 6-8x faster on hardware (NVIDIA Xavier NX).
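A joint transform in this spirit might apply a magnitude-pruning mask and uniform fake quantization in one pass, with a straight-through estimator so both error sources reach the gradients during training. The sketch below is a simplified illustration under those assumptions, not the paper's exact scheme.

```python
import torch

def prune_and_quantize(w, sparsity=0.5, bits=8):
    """Single-pass weight transform: magnitude pruning mask + uniform
    fake quantization, with a straight-through estimator (forward uses
    the pruned/quantized weights, backward flows to the dense ones)."""
    k = max(1, int(sparsity * w.numel()))
    thresh = w.abs().flatten().kthvalue(k).values     # pruning threshold
    mask = (w.abs() > thresh).float()
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax      # symmetric uniform quant
    w_q = torch.round(w / scale).clamp(-qmax - 1, qmax) * scale
    return (w_q * mask - w).detach() + w              # straight-through estimator
```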

