Paper: https://arxiv.org/pdf/2010.03045.pdf
This paper proposes triplet attention, which effectively captures cross-dimensional interactions. Compared with previous attention methods, it has two main advantages:
1. Negligible computational overhead
2. Emphasizes the importance of multidimensional interactions without reducing dimensionality, thus eliminating the indirect correspondence between channels and weights.
Traditional channel attention methods compute per-channel weights by collapsing each channel of the input tensor to a single value with global average pooling. This discards most of the spatial information, so the interdependence between the channel dimension and the spatial dimensions is absent when attention is computed from a single value per channel. CBAM, which combines spatial and channel attention, was proposed later to alleviate this loss of spatial interdependence, but its channel attention and spatial attention are separate and are computed independently of each other. Building on the way spatial attention is constructed, this paper proposes the concept of cross-dimension interaction, which addresses the problem by capturing the interaction between the spatial dimensions and the channel dimension of the input tensor.
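To make the loss of spatial information concrete, here is a minimal sketch in plain Python. It is an illustration only, not the paper's module. The function names and the toy inputs are hypothetical, and feature maps are represented as nested C x H x W lists to avoid any dependencies. The second function pools over the width axis alone, which is a simplified assumption about how keeping one spatial axis next to the channel axis preserves their interaction.

```python
def global_average_pool(x):
    """Squeeze a C x H x W feature map (nested lists) to one scalar per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in x
    ]


def average_pool_over_width(x):
    """Average over W only, keeping a C x H map of channel-height responses."""
    return [[sum(row) / len(row) for row in channel] for channel in x]


# Two single-channel 2 x 2 maps with the same mean but different spatial layouts.
top_heavy = [[[1.0, 1.0], [0.0, 0.0]]]
bottom_heavy = [[[0.0, 0.0], [1.0, 1.0]]]

# Global average pooling cannot tell the two layouts apart.
squeezed_top = global_average_pool(top_heavy)
squeezed_bottom = global_average_pool(bottom_heavy)

# Pooling over a single spatial axis keeps the row structure, so they differ.
rows_top = average_pool_over_width(top_heavy)
rows_bottom = average_pool_over_width(bottom_heavy)

print(squeezed_top == squeezed_bottom)  # True
print(rows_top == rows_bottom)          # False
```

Both inputs squeeze to the same channel descriptor, so any weight computed from it is blind to where the activation sits. The width-pooled maps still differ, which is the kind of channel-to-spatial dependence that cross-dimension interaction aims to keep.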