Paper address: https://arxiv.org/pdf/2303.16900.pdf
Code: GitHub - sail-sg/inceptionnext: InceptionNeXt: When Inception Meets ConvNeXt
Affiliation: NUS, Sea AI Lab (Yan Shuicheng et al.)
1. Introduction to InceptionNeXt
Abstract: Inspired by ViT's long-range modeling capability, large-kernel convolutions have been used to enlarge the receptive field and improve model performance; for example, ConvNeXt uses a 7x7 depthwise convolution. Although this depthwise operator consumes only a small number of FLOPs, its high memory access cost largely harms model efficiency on powerful computing devices. To solve this problem, we propose decomposing the large-kernel depthwise convolution into four parallel branches along the channel dimension: a small square kernel, two orthogonal band kernels, and an identity mapping. With this new Inception depthwise convolution, we build a family of networks, named InceptionNeXt, that not only achieve high throughput but also maintain competitive performance.
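The four-branch decomposition above can be sketched in PyTorch as a drop-in depthwise module. This is a minimal illustration, not the authors' implementation: the branch channel ratio (1/8 per convolution branch), the 3x3 square kernel, and the band-kernel size 11 are assumptions chosen for the sketch.

```python
import torch
import torch.nn as nn

class InceptionDWConv2d(nn.Module):
    """Sketch of the Inception depthwise convolution from the abstract:
    channels are split into four parallel branches — identity,
    a small square kernel, and two orthogonal band kernels —
    then concatenated back along the channel dimension.
    Kernel sizes and branch ratio are illustrative assumptions."""

    def __init__(self, channels, square_k=3, band_k=11, branch_ratio=0.125):
        super().__init__()
        gc = int(channels * branch_ratio)  # channels per conv branch (assumed ratio)
        # remaining channels pass through untouched (identity branch)
        self.split_sizes = (channels - 3 * gc, gc, gc, gc)
        # small square kernel (depthwise: groups == channels)
        self.dwconv_hw = nn.Conv2d(gc, gc, square_k, padding=square_k // 2, groups=gc)
        # horizontal band kernel (1 x k)
        self.dwconv_w = nn.Conv2d(gc, gc, (1, band_k), padding=(0, band_k // 2), groups=gc)
        # vertical band kernel (k x 1)
        self.dwconv_h = nn.Conv2d(gc, gc, (band_k, 1), padding=(band_k // 2, 0), groups=gc)

    def forward(self, x):
        x_id, x_hw, x_w, x_h = torch.split(x, self.split_sizes, dim=1)
        return torch.cat(
            (x_id, self.dwconv_hw(x_hw), self.dwconv_w(x_w), self.dwconv_h(x_h)),
            dim=1,
        )

# usage: output shape matches input shape, like a plain depthwise conv
m = InceptionDWConv2d(64)
y = m(torch.randn(2, 64, 32, 32))
```

Because only 3/8 of the channels go through any convolution at all (under the assumed ratio), the memory traffic of the large square kernel is avoided while the band kernels still cover a large receptive field along each axis.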
Figure 1: Trade-off between accuracy and training throughput. All models were trained with DeiT training hyperparameters [61, 37, 38, 69]; training throughput is measured at batch size 128.