Papers to read
Each entry below is marked as read or unread.
- AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE [Completed] Notes: 2022.11.18
- CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification [Completed] Notes: 2022.11.19
- Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
- ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
- Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
- Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet