YOLOv5/v7 Improvement No. 28: ODConv, a plug-and-play dynamic convolution from ICLR 2022 for boosting accuracy

Foreword: YOLOv5 and YOLOv7 are among the current state-of-the-art deep learning object detection algorithms and already bundle a large number of tricks, yet they are still prone to false detections and missed detections against complex backgrounds. This series of articles looks in detail at how to improve the YOLO family of algorithms. I hope it offers some modest help and reference to students who need new ideas for their research and to friends working on engineering projects who want better results.

For the details of a specific improvement, follow me and leave a private message. Follow and get deep learning materials for free!

Problem addressed: The ICLR 2022 accepted-paper list was released some time ago, and a lot of excellent work has appeared. One of these papers deals with dynamic convolution: ODConv, which uses a multi-dimensional attention mechanism with a parallel strategy to learn complementary attentions along all four dimensions of the kernel space. As a "plug and play" operation it can be easily embedded into existing CNN networks, and the experimental results show that it improves the performance of both large models and lightweight models.

Main principle:

Paper: Omni-Dimensional Dynamic Convolution | OpenReview

ODConv can be regarded as a continuation of CondConv. CondConv makes the kernel dynamic along a single dimension (the number of kernels); ODConv extends this to the spatial, input-channel and output-channel dimensions as well, which is why it is called omni-dimensional (full-dimensional) dynamic convolution. It uses a multi-dimensional attention mechanism with a parallel strategy to learn complementary attentions along these four dimensions of the kernel space. As a "plug and play" operation, it can be easily embedded into existing CNN networks. Experiments on ImageNet classification and COCO detection support the approach: it improves the performance of both large models and lightweight models. It is worth mentioning that, thanks to its improved feature extraction capability, ODConv with only a single convolution kernel can still match or even outperform existing multi-kernel dynamic convolutions.
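
The difference from CondConv can be written compactly. The two formulas below are a sketch based on my reading of the paper, so check the notation against the original before relying on it. Here x is the input, y the output, W_i the i-th of n convolution kernels, * the convolution operation, and \odot a multiplication along the corresponding dimension of the kernel. \alpha_{wi} is the attention scalar for the whole kernel, while \alpha_{si}, \alpha_{ci} and \alpha_{fi} are the attentions over the spatial positions, the input channels and the output channels (filters).

```latex
% CondConv-style dynamic convolution: one attention scalar per kernel
y = (\alpha_{w1} W_1 + \dots + \alpha_{wn} W_n) * x

% ODConv: four complementary attentions per kernel
y = (\alpha_{w1} \odot \alpha_{f1} \odot \alpha_{c1} \odot \alpha_{s1} \odot W_1
     + \dots +
     \alpha_{wn} \odot \alpha_{fn} \odot \alpha_{cn} \odot \alpha_{sn} \odot W_n) * x
```

With n = 1 the first formula degenerates to an ordinary convolution scaled by one scalar, whereas the second still modulates the kernel along three dimensions, which is consistent with the single-kernel result mentioned above.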

How to add it:

Step 1: Build the ODConv module in common.py. Part of the code is shown below.

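import torch
import torch.nn as nn


class ODConv2d(nn.Module):
    # Excerpt only: the Attention class, _initialize_weights and the two
    # _forward_impl_* methods belong to the full module and are not shown here.
    def __init__(self, in_planes, out_planes, kernel_size, stride=1, padding=0, dilation=1, groups=1,
                 reduction=0.0625, kernel_num=4):
        super(ODConv2d, self).__init__()
        self.in_planes = in_planes
        self.out_planes = out_planes
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding  # note: defaults to 0, there is no automatic padding
        self.dilation = dilation
        self.groups = groups
        self.kernel_num = kernel_num
        # Attention branch that predicts the attentions over the kernel space
        self.attention = Attention(in_planes, out_planes, kernel_size, groups=groups,
                                   reduction=reduction, kernel_num=kernel_num)
        # kernel_num candidate kernels, each shaped like an ordinary conv weight
        self.weight = nn.Parameter(torch.randn(kernel_num, out_planes, in_planes//groups, kernel_size, kernel_size),
                                   requires_grad=True)
        self._initialize_weights()

        # A single 1x1 kernel uses a dedicated forward path; every other case uses the common one
        if self.kernel_size == 1 and self.kernel_num == 1:
            self._forward_impl = self._forward_impl_pw1x
        else:
            self._forward_impl = self._forward_impl_common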
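
Since the excerpt above leaves out the attention branch and the forward pass, here is a minimal sketch of how the pieces can fit together. It is a simplified illustration, not the code from the paper or from my common.py: the class name TinyODConv2d and its attribute names are hypothetical, groups is fixed to 1, and BatchNorm, dilation, the softmax temperature and the special 1x1 path are omitted. I also assume sigmoid gates for the channel, filter and spatial attentions and a softmax over the kernels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyODConv2d(nn.Module):
    """Simplified omni-dimensional dynamic convolution (hypothetical, groups=1)."""

    def __init__(self, in_planes, out_planes, kernel_size, stride=1, padding=0,
                 reduction=0.0625, kernel_num=4, min_channel=16):
        super().__init__()
        self.in_planes, self.out_planes = in_planes, out_planes
        self.kernel_size, self.kernel_num = kernel_size, kernel_num
        self.stride, self.padding = stride, padding
        hidden = max(int(in_planes * reduction), min_channel)
        # Shared squeeze branch: global average pooling followed by a 1x1 bottleneck
        self.squeeze = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(in_planes, hidden, 1), nn.ReLU(inplace=True))
        # One small head per attended dimension of the kernel space
        self.channel_fc = nn.Conv2d(hidden, in_planes, 1)         # input channels
        self.filter_fc = nn.Conv2d(hidden, out_planes, 1)         # output channels
        self.spatial_fc = nn.Conv2d(hidden, kernel_size ** 2, 1)  # k x k positions
        self.kernel_fc = nn.Conv2d(hidden, kernel_num, 1)         # which kernel
        self.weight = nn.Parameter(
            0.02 * torch.randn(kernel_num, out_planes, in_planes, kernel_size, kernel_size))

    def forward(self, x):
        b, c, h, w = x.shape
        k = self.kernel_size
        s = self.squeeze(x)                                       # (b, hidden, 1, 1)
        channel_att = torch.sigmoid(self.channel_fc(s))           # (b, c_in, 1, 1)
        filter_att = torch.sigmoid(self.filter_fc(s))             # (b, c_out, 1, 1)
        spatial_att = torch.sigmoid(self.spatial_fc(s)).view(b, 1, 1, 1, k, k)
        kernel_att = torch.softmax(self.kernel_fc(s).view(b, -1), dim=1).view(b, -1, 1, 1, 1, 1)
        # Blend the kernel_num candidate kernels into one kernel per sample
        weight = (spatial_att * kernel_att * self.weight.unsqueeze(0)).sum(dim=1)
        weight = weight.view(b * self.out_planes, self.in_planes, k, k)
        # Fold the batch into the channel axis so that a grouped convolution
        # applies each sample's own kernel to that sample only
        x = (x * channel_att).reshape(1, b * c, h, w)
        out = F.conv2d(x, weight, stride=self.stride, padding=self.padding, groups=b)
        out = out.view(b, self.out_planes, out.size(-2), out.size(-1))
        return out * filter_att


layer = TinyODConv2d(16, 32, kernel_size=3, stride=2, padding=1)
y = layer(torch.randn(2, 16, 20, 20))
print(y.shape)  # torch.Size([2, 32, 10, 10])
```

The input-channel attention is applied to the input and the output-channel attention to the output rather than to the weight tensor; this gives the same result as scaling the kernel along those dimensions and keeps the per-sample weight tensor smaller.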

Step 2: Register the ODConv module in yolo.py so that the model parser recognizes it when it reads the yaml file.
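
To make the idea of "registering" concrete, the sketch below imitates the channel bookkeeping that the model parser in yolo.py performs for convolution-like modules. It is a stand-alone illustration under my own assumptions, not the actual parser: CONV_LIKE and resolve_args are hypothetical names, and the real code works with module classes rather than strings. In practice, registering usually amounts to adding ODConv2d to the corresponding set of modules in your version of yolo.py.

```python
import math

# Hypothetical stand-in for the set of "convolution-like" modules the parser
# knows about; adding "ODConv2d" here plays the role of the registration step.
CONV_LIKE = {"Conv", "C3", "SPPF", "ODConv2d"}


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def resolve_args(module, args, ch_in, width_multiple, num_outputs):
    """Turn the args of one yaml row into constructor arguments."""
    if module not in CONV_LIKE:
        return list(args)  # unregistered modules receive their yaml args unchanged
    c2 = args[0]
    if c2 != num_outputs:  # do not rescale the final detection outputs
        c2 = make_divisible(c2 * width_multiple, 8)
    # Prepend the input channels and replace the raw output channels
    return [ch_in, c2, *args[1:]]


ctor_args = resolve_args("ODConv2d", [128, 3, 2, 1], ch_in=32,
                         width_multiple=0.5, num_outputs=255)
print(ctor_args)  # [32, 64, 3, 2, 1]
```

The resulting list lines up with the constructor shown in Step 1: in_planes=32, out_planes=64, kernel_size=3, stride=2, padding=1. Without the registration, the module would receive [128, 3, 2, 1] directly and the channel counts would be wrong.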

Step 3: Modify the model yaml file, putting the ODConv module at the positions in the network that you want to change.
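
One detail to watch when editing the yaml: the ODConv2d constructor in Step 1 defaults to padding=0, so the padding has to be written out in the row's argument list if the feature map size should stay the same as before. The sketch below checks this with the standard convolution output-size formula. The two rows are illustrative values in the [from, number, module, args] layout, not the rows of my yaml file, and conv_out_size is a hypothetical helper.

```python
def conv_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution along one axis."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# yaml rows written as Python lists: [from, number, module, args]
conv_row = [-1, 1, "Conv", [128, 3, 2]]            # illustrative original row
odconv_row = [-1, 1, "ODConv2d", [128, 3, 2, 1]]   # replacement with explicit padding

_, kernel_size, stride, padding = odconv_row[3]
with_padding = conv_out_size(320, kernel_size, stride, padding)
without_padding = conv_out_size(320, kernel_size, stride)
print(with_padding, without_padding)  # 160 159
```

Dropping the trailing 1 would shrink a 320-pixel feature map to 159 instead of 160, and later layers that concatenate feature maps would then fail on the size mismatch.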

Step 4: Point train.py at the yaml file from this article (for example through its --cfg argument) and start training.

Results: I have run many experiments on several datasets. The effect differs from dataset to dataset, so you will need to run your own experiments, but in most cases the change was effective and improved the results.

Preview: I will continue to share content related to deep learning. If you are interested, follow me; if you have any questions, leave a comment or send me a private message.

PS: The improvements in this series are not limited to YOLOv5; they can also be added to other deep learning networks. Whether the task is classification, detection or segmentation, mainly within computer vision, they may bring varying degrees of improvement.

Finally, I hope that we can follow each other, be friends, and learn and communicate together.


Origin blog.csdn.net/m0_70388905/article/details/127031843