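GitHub: https://github.com/amusi/daily-paper-computer-vision
This post shares 46 papers in total, covering CNN, Face, image classification, object detection, image segmentation, GAN, Re-ID, SLAM, transfer learning, and other topics.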

Preamble

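The computer vision paper digest series is currently published once a week. As Amusi has said many times, compiling these posts for the official account "eats" a lot of time, so the original daily format has temporarily become a weekly one.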

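Amusi usually shares each day's papers and useful resources in the dedicated CVer Knowledge Planet (知识星球) group, but there are no plans to open it to the public yet. Expect that around January 2019. Let's build something together!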

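Today's post was originally very long, over 50,000 characters, because Amusi had included the abstracts as well. WeChat then pointed out that the body text cannot exceed 50,000 characters, so most of the abstracts were cut. If you want the full version, click "Read the original".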

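Paper Categories

CNN
Face
Image Classification
Object Detection
Saliency Detection
Scene Text Detection
Image Segmentation
Object Tracking
GAN
3D
Re-ID
SLAM
Transfer Learning
Style Transfer
Image Caption
Few-Shot Learning
Datasets
Other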

CNN

《Deeper Interpretability of Deep Networks》

arXiv：https://arxiv.org/abs/1811.07807

Deep Convolutional Neural Networks (CNNs) have been one of the most influential recent developments in computer vision, particularly for categorization. There is an increasing demand for explainable AI as these systems are deployed in the real world. However, understanding the information represented and processed in CNNs remains in most cases challenging. Within this paper, we explore the use of new information theoretic techniques developed in the field of neuroscience to enable novel understanding of how a CNN represents information. We trained a 10-layer ResNet architecture to identify 2,000 face identities from 26M images generated using a rigorously controlled 3D face rendering model that produced variations of intrinsic (i.e. face morphology, gender, age, expression and ethnicity) and extrinsic factors (i.e. 3D pose, illumination, scale and 2D translation). With our methodology, we demonstrate that unlike human’s network overgeneralizes face identities even with extreme changes of face shape, but it is more sensitive to changes of texture. To understand the processing of information underlying these counterintuitive properties, we visualize the features of shape and texture that the network processes to identify faces. Then, we shed a light into the inner workings of the black box and reveal how hidden layers represent these features and whether the representations are invariant to pose. We hope that our methodology will provide an additional valuable tool for interpretability of CNNs.

《Deep Shape-from-Template: Wide-Baseline, Dense and Fast Registration and Deformable Reconstruction from a Single Image》

arXiv：https://arxiv.org/abs/1811.07791

《Do Normalization Layers in a Deep ConvNet Really Need to Be Distinct?》

arXiv：https://arxiv.org/abs/1811.07727

《Self-Referenced Deep Learning》

arXiv：https://arxiv.org/abs/1811.07598

《Multimodal Densenet》

arXiv：https://arxiv.org/abs/1811.07407

《RePr: Improved Training of Convolutional Filters》

arXiv：https://arxiv.org/abs/1811.07275

《PydMobileNet: Improved Version of MobileNets with Pyramid Depthwise Separable Convolution》

arXiv：https://arxiv.org/abs/1811.07083

Face

《Aff-Wild2: Extending the Aff-Wild Database for Affect Recognition》

arXiv：https://arxiv.org/abs/1811.07770

Image Classification

《High Order Neural Networks for Video Classification》

arXiv：https://arxiv.org/abs/1811.07519

《DeepConsensus: using the consensus of features from multiple layers to attain robust image classification》

arXiv：https://arxiv.org/abs/1811.07266

Object Detection

《Weakly Supervised Soft-detection-based Aggregation Method for Image Retrieval》

arXiv：https://arxiv.org/abs/1811.07619

《Fast Efficient Object Detection Using Selective Attention》

arXiv：https://arxiv.org/abs/1811.07502

《FotonNet: A HW-Efficient Object Detection System Using 3D-Depth Segmentation and 2D-DNN Classifier》

arXiv：https://arxiv.org/abs/1811.07493

《R2CNN++: Multi-Dimensional Attention Based Rotation Invariant Detector with Robust Anchor Strategy》

arXiv：https://arxiv.org/abs/1811.07126

Saliency Detection

《Global and Local Sensitivity Guided Key Salient Object Re-augmentation for Video Saliency Detection》

arXiv：https://arxiv.org/abs/1811.07480

Scene Text Detection

《Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks》

arXiv：https://arxiv.org/abs/1811.07432

《Improving Rotated Text Detection with Rotation Region Proposal Networks》

arXiv：https://arxiv.org/abs/1811.07031

Image Segmentation

《OrthoSeg: A Deep Multimodal Convolutional Neural Network for Semantic Segmentation of Orthoimagery》

arXiv：https://arxiv.org/abs/1811.07859

《M2U-Net: Effective and Efficient Retinal Vessel Segmentation for Resource-Constrained Environments》

arXiv：https://arxiv.org/abs/1811.07738

Object Tracking

《Robust Visual Tracking using Multi-Frame Multi-Feature Joint Modeling》

arXiv：https://arxiv.org/abs/1811.07498

《Deep Siamese Networks with Bayesian non-Parametrics for Video Object Tracking》

arXiv：https://arxiv.org/abs/1811.07386

《Exploit the Connectivity: Multi-Object Tracking with TrackletNet》

arXiv：https://arxiv.org/abs/1811.07258

GAN

《Injecting and removing malignant features in mammography with CycleGAN: Investigation of an automated adversarial attack using neural networks》

arXiv：https://arxiv.org/abs/1811.07767

《SEIGAN: Towards Compositional Image Generation by Simultaneously Learning to Segment, Enhance, and Inpaint》

arXiv：https://arxiv.org/abs/1811.07630

《GAN-QP: A Novel GAN Framework without Gradient Vanishing and Lipschitz Constraint》

arXiv：https://arxiv.org/abs/1811.07296

3D

《Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN》

arXiv：https://arxiv.org/abs/1811.07782

《PointConv: Deep Convolutional Networks on 3D Point Clouds》

arXiv：https://arxiv.org/abs/1811.07246

《Topology-Aware Non-Rigid Point Cloud Registration》

arXiv：https://arxiv.org/abs/1811.07014

Re-ID

《Past, Present, and Future Approaches Using Computer Vision for Animal Re-Identification from Camera Trap Data》

arXiv：https://arxiv.org/abs/1811.07749

《CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification》

arXiv：https://arxiv.org/abs/1811.07544

《Re-Identification with Consistent Attentive Siamese Networks》

arXiv：https://arxiv.org/abs/1811.07487

SLAM

《Collaborative Dense SLAM》

arXiv：https://arxiv.org/abs/1811.07632

Transfer Learning

《An Efficient Transfer Learning Technique by Using Final Fully-Connected Layer Output Features of Deep Networks》

arXiv：https://arxiv.org/abs/1811.07459

《Transfer Learning with Deep CNNs for Gender Recognition and Age Estimation》

arXiv：https://arxiv.org/abs/1811.07344

Style Transfer

《GLStyleNet: Higher Quality Style Transfer Combining Global and Local Pyramid Features》

arXiv：https://arxiv.org/abs/1811.07260

Image Caption

《Intention Oriented Image Captions with Guiding Objects》

arXiv：https://arxiv.org/abs/1811.07662

Few-Shot Learning

《Deep Comparison: Relation Columns for Few-Shot Learning》

arXiv：https://arxiv.org/abs/1811.07100

Datasets

《iQIYI-VID: A Large Dataset for Multi-modal Person Identification》

arXiv：https://arxiv.org/abs/1811.07548

Other

《Addressing the Invisible: Street Address Generation for Developing Countries with Deep Learning》

NIPS 2018 Workshop

arXiv：https://arxiv.org/abs/1811.07769

《Handwriting Recognition of Historical Documents with few labeled data》

arXiv：https://arxiv.org/abs/1811.07768

《GroundNet: Segmentation-Aware Monocular Ground Plane Estimation with Geometric Consistency》

arXiv：https://arxiv.org/abs/1811.07222

《Image-to-GPS Verification Through A Bottom-Up Pattern Matching Network》

arXiv：https://arxiv.org/abs/1811.07288

《Matching RGB Images to CAD Models for Object Pose Estimation》

arXiv：https://arxiv.org/abs/1811.07249

《Optical Flow Dataset and Benchmark for Visual Crowd Analysis》

arXiv：https://arxiv.org/abs/1811.07170

《Simulating LIDAR Point Cloud for Autonomous Driving using Real-world Scenes and Traffic Flows》

arXiv：https://arxiv.org/abs/1811.07112

《DSCnet: Replicating Lidar Point Clouds with Deep Sensor Cloning》

arXiv：https://arxiv.org/abs/1811.07070

If you enjoyed this post, feel free to like, save, and share it.
