YOLOv5 improvement: adding D-LKA Attention at different positions in the C3 module (combining the benefits of self-attention and large convolution kernels)

1. This article belongs to the YOLOv5/YOLOv7/YOLOv8 improvement column, which contains a large number of improvement methods, mainly drawn from papers published in 2022 and 2023.
2. It provides detailed improvement recipes, such as adding attention mechanisms at different positions in the network, which makes experiments easier to run and can also serve as an innovation point in a paper.

3. Accuracy gain: the D-LKA Attention mechanism delivers an effective accuracy improvement!

Transformer models have significantly improved medical image segmentation, excelling at capturing long-range and global contextual information. However, the computational demands of these models scale with the square of the token count, limiting their depth and resolution. Most current methods process three-dimensional volumetric image data slice by slice (so-called pseudo-3D), discarding key inter-slice information and thereby reducing overall model performance. To address these challenges, we introduce deformable large kernel attention (D-LKA Attention), a streamlined attention mechanism that employs large convolutional kernels to fully capture volumetric context. This mechanism operates over a receptive field similar to that of self-attention while avoiding its computational overhead. Furthermore, the proposed attention mechanism benefits from deformable convolutions that flexibly warp the sampling grid, enabling the model to adapt appropriately to diverse data patterns.
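The large-kernel core of this idea can be sketched as a decomposed large-kernel attention block: a small depthwise convolution, a depthwise dilated convolution, and a pointwise convolution together approximate a large receptive field, and their output gates the input multiplicatively. The sketch below is a minimal 2D version of that core; in the actual D-LKA design the depthwise convolutions are additionally made deformable (e.g. via `torchvision.ops.DeformConv2d` with a learned offset branch), which is omitted here for brevity. Kernel sizes follow the common 5 + 7-dilated-by-3 decomposition (an effective 21x21 field), but treat the exact sizes as an assumption.

```python
import torch
import torch.nn as nn

class LargeKernelAttention(nn.Module):
    """Minimal sketch of the large-kernel attention core (deformable sampling omitted)."""
    def __init__(self, dim: int):
        super().__init__()
        # Depthwise 5x5 conv captures local context per channel.
        self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
        # Depthwise 7x7 conv with dilation 3 extends the effective field to ~21x21.
        self.conv_spatial = nn.Conv2d(dim, dim, 7, padding=9, groups=dim, dilation=3)
        # Pointwise conv mixes channels to form the attention map.
        self.conv1 = nn.Conv2d(dim, dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.conv1(self.conv_spatial(self.conv0(x)))
        return x * attn  # multiplicative gating, as in large-kernel attention
```

Because every convolution preserves spatial size, the block is a drop-in, shape-preserving attention layer.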

We designed 2D and 3D versions of D-LKA Attention; the latter performs well at understanding data across the depth dimension. Together, these components form our novel hierarchical Vision Transformer architecture, the D-LKA network.
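For the YOLOv5 side of the title, one common way to wire such an attention block into the C3 module is to apply it to the concatenated branch features before the final 1x1 convolution. The sketch below is a hypothetical `C3_DLKA` module under that assumption, with a simplified `Bottleneck` and a stand-in attention layer (raw `Conv2d` instead of YOLOv5's Conv+BN+SiLU, and no deformable sampling) so it stays self-contained; it is an illustration of the placement, not the repository's actual code.

```python
import torch
import torch.nn as nn

class SimpleAttention(nn.Module):
    """Stand-in large-kernel attention (depthwise 5x5 + pointwise gating)."""
    def __init__(self, c: int):
        super().__init__()
        self.dw = nn.Conv2d(c, c, 5, padding=2, groups=c)
        self.pw = nn.Conv2d(c, c, 1)

    def forward(self, x):
        return x * self.pw(self.dw(x))

class Bottleneck(nn.Module):
    """Simplified C3 bottleneck: 1x1 reduce, 3x3 expand, optional residual."""
    def __init__(self, c1: int, c2: int, shortcut: bool = True):
        super().__init__()
        c_ = c2 // 2
        self.cv1 = nn.Conv2d(c1, c_, 1)
        self.cv2 = nn.Conv2d(c_, c2, 3, padding=1)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        y = self.cv2(self.cv1(x))
        return x + y if self.add else y

class C3_DLKA(nn.Module):
    """C3-style block with attention applied after the branch concat (one placement option)."""
    def __init__(self, c1: int, c2: int, n: int = 1):
        super().__init__()
        c_ = c2 // 2
        self.cv1 = nn.Conv2d(c1, c_, 1)   # main branch entry
        self.cv2 = nn.Conv2d(c1, c_, 1)   # shortcut branch
        self.m = nn.Sequential(*(Bottleneck(c_, c_) for _ in range(n)))
        self.attn = SimpleAttention(2 * c_)
        self.cv3 = nn.Conv2d(2 * c_, c2, 1)

    def forward(self, x):
        y = torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1)
        return self.cv3(self.attn(y))
```

Other placements discussed in this column (attention inside each Bottleneck, or on the module output after `cv3`) follow the same pattern: the attention layer is shape-preserving, so it can be inserted at any of these points without changing the channel plan.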


Origin blog.csdn.net/m0_51530640/article/details/132883115