BiFormer: Vision Transformer with Bi-Level Routing Attention

Abstract

Paper link: https://arxiv.org/abs/2303.08810
Code link: https://github.com/rayleizhu/BiFormer

As the core building block of vision transformers, attention is a powerful tool for capturing long-range dependencies. However, this power comes at a price: it imposes a huge computational burden and a heavy memory footprint, since pairwise token interactions are computed across all spatial positions.
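To see where that cost comes from, here is a minimal sketch of vanilla full self-attention (an illustration only, not the BiFormer code; the tensor shapes and the example feature-map size are assumptions for demonstration). The (N, N) score matrix is what makes the computation quadratic in the number of spatial positions:

```python
import torch

def full_attention(q, k, v):
    # q, k, v: (batch, N, d), where N = H * W spatial tokens.
    # The (N, N) score matrix below is the quadratic bottleneck:
    # every token attends to every other token.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5  # (batch, N, N)
    return torch.softmax(scores, dim=-1) @ v               # (batch, N, d)

# Even a modest 56x56 feature map yields a 3136x3136 score matrix.
q = k = v = torch.randn(1, 56 * 56, 64)
out = full_attention(q, k, v)
print(out.shape)  # torch.Size([1, 3136, 64])
```

BiFormer's bi-level routing attention is aimed at exactly this term: instead of attending over all N positions, it prunes the candidate set before the fine-grained token-to-token attention is computed.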
