"Analysis" SPP feature pyramid series


A CNN model can be divided into two parts:

The feature extraction network, made up of convolutional layers, activation layers, and pooling layers, hereinafter referred to as CNN_Pre.

The fully connected network that follows it, hereinafter referred to as CNN_Post.

Many CNN models require input images of a fixed size. The constraint does not come from CNN_Pre, which accepts an input of any size and, to a first approximation, simply downsamples the image by a fixed factor. It comes from CNN_Post, whose fully connected layers expect an input of fixed dimension.
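
To make the size constraint concrete, here is a small arithmetic sketch. The downsampling factor of 32 and the 512 output channels are assumptions chosen for illustration, not values from any particular model, and the function names are hypothetical.

```python
def feature_map_size(height, width, total_stride=32):
    """Spatial size after a feature extractor that downsamples by total_stride."""
    return height // total_stride, width // total_stride


def flattened_dim(height, width, channels=512, total_stride=32):
    """Length of the vector a fully connected layer would receive after flattening."""
    h, w = feature_map_size(height, width, total_stride)
    return channels * h * w


# CNN_Pre handles both inputs, but the flattened lengths differ,
# so one fixed-size fully connected layer cannot serve both.
dim_224 = flattened_dim(224, 224)  # 512 * 7 * 7
dim_416 = flattened_dim(416, 416)  # 512 * 13 * 13
print(dim_224, dim_416)
```

The two lengths differ, which is exactly the mismatch CNN_Post cannot absorb on its own.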

SPP (spatial pyramid pooling): whatever the size of the feature maps output by CNN_Pre, SPP produces an output of fixed dimension and passes it to CNN_Post.
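
A minimal PyTorch sketch of this idea follows. The class name `SPPLayer` and the pyramid levels (1, 2, 4) are assumptions for illustration, and `F.adaptive_max_pool2d` is used here as a convenient way to pool onto a fixed grid; the original post does not show its own implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SPPLayer(nn.Module):
    """Hypothetical SPP sketch: max-pool onto several n x n grids and concatenate."""

    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.levels = levels

    def forward(self, x):
        batch = x.shape[0]
        # Each level yields (batch, C, n, n); flatten it to (batch, C * n * n)
        pooled = [F.adaptive_max_pool2d(x, n).reshape(batch, -1) for n in self.levels]
        return torch.cat(pooled, dim=1)


spp = SPPLayer()
out_a = spp(torch.randn(1, 256, 13, 13))
out_b = spp(torch.randn(1, 256, 7, 10))
print(out_a.shape, out_b.shape)
```

Both outputs have length 256 * (1 + 4 + 16) regardless of the spatial size of the input, so a fully connected layer with that many input features can follow.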

In essence, SPP is a multi-level max pooling: at each level it produces an output of fixed size $n \times n$ from feature maps of varying size $a \times a$.
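
How an $a \times a$ map becomes $n \times n$ can be sketched in plain Python for a single channel. The bin boundaries used here (start at floor(i*a/n), end at ceil((i+1)*a/n)) are an assumption in the style of adaptive pooling; the source does not say how its bins are chosen, and the helper names are hypothetical.

```python
def adaptive_max_pool(grid, n):
    """Max-pool a 2-D list of size a_h x a_w onto an n x n grid of bins."""
    a_h, a_w = len(grid), len(grid[0])
    out = []
    for i in range(n):
        # Row range of bin i: [floor(i*a/n), ceil((i+1)*a/n))
        r0, r1 = (i * a_h) // n, ((i + 1) * a_h + n - 1) // n
        row = []
        for j in range(n):
            c0, c1 = (j * a_w) // n, ((j + 1) * a_w + n - 1) // n
            row.append(max(grid[r][c] for r in range(r0, r1) for c in range(c0, c1)))
        out.append(row)
    return out


def make_grid(a):
    """Toy a x a feature map whose values increase row by row."""
    return [[r * a + c for c in range(a)] for r in range(a)]


small = adaptive_max_pool(make_grid(5), 2)   # 5 x 5  -> 2 x 2
large = adaptive_max_pool(make_grid(13), 2)  # 13 x 13 -> 2 x 2
print(small, large)
```

The bin size grows with $a$ while the number of bins stays at $n \times n$, which is why the output size no longer depends on the input.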


Conv module

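import torch.nn as nn


def autopad(k, p=None, d=1):
    # Assumed helper (not shown in the original listing): padding for 'same' output shape
    if d > 1:
        k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k]
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]
    return p


class Conv(nn.Module):
    # Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation)
    default_act = nn.SiLU()  # default activation

    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()

    def forward(self, x):
        # Conv -> BatchNorm -> activation
        return self.act(self.bn(self.conv(x)))

    def forward_fuse(self, x):
        # Skips self.bn; intended for use once BatchNorm has been folded into self.conv
        return self.act(self.conv(x))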
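
The listing does not show how the BatchNorm layer gets folded into the convolution so that `forward_fuse` can skip it. The sketch below is one common way to do that folding in eval mode; `fuse_conv_bn` is a hypothetical helper written for this illustration, not a function from the original post.

```python
import torch
import torch.nn as nn


def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Return a new Conv2d whose output matches bn(conv(x)) in eval mode."""
    fused = nn.Conv2d(
        conv.in_channels, conv.out_channels, conv.kernel_size,
        stride=conv.stride, padding=conv.padding, dilation=conv.dilation,
        groups=conv.groups, bias=True,
    )
    with torch.no_grad():
        # Per-output-channel scale: gamma / sqrt(running_var + eps)
        scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
        fused.weight.copy_(conv.weight * scale.view(-1, 1, 1, 1))
        conv_bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
        fused.bias.copy_(bn.bias + (conv_bias - bn.running_mean) * scale)
    return fused


torch.manual_seed(0)
conv = nn.Conv2d(3, 8, 3, padding=1, bias=False)
bn = nn.BatchNorm2d(8)
with torch.no_grad():
    # Give BatchNorm non-trivial statistics so the comparison is meaningful
    bn.running_mean.uniform_(-1, 1)
    bn.running_var.uniform_(0.5, 2.0)
    bn.weight.uniform_(0.5, 1.5)
    bn.bias.uniform_(-1, 1)
conv.eval()
bn.eval()

x = torch.randn(2, 3, 16, 16)
fused = fuse_conv_bn(conv, bn).eval()
with torch.no_grad():
    max_diff = (bn(conv(x)) - fused(x)).abs().max().item()
print(max_diff)
```

The difference should be at floating-point noise level, so replacing `self.conv` with the fused layer and calling `forward_fuse` gives the same result with one operation fewer at inference time.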

Origin blog.csdn.net/ViatorSun/article/details/129846468