Attention Mechanisms: ECANet (Efficient Channel Attention Network)

ECANet (Efficient Channel Attention Network) is a lightweight channel attention mechanism for feature extraction in deep neural networks, proposed by Wang et al. at CVPR 2020. It adds channel attention at almost no extra parameter or computational cost, improving model performance.

The ECANet attention mechanism weights features along the channel dimension. Its basic idea is to learn the dependencies between channels and adaptively rescale each channel's weight, improving the network's performance. ECANet performs channel attention weighting in two steps:
  1. Extract per-channel features (global average pooling).
  2. Compute channel weights (a fast 1D convolution followed by a sigmoid gate).

Implementing the ECANet attention mechanism in PyTorch:

import math
import torch
import torch.nn as nn

class ECANet(nn.Module):
    def __init__(self, in_channels, gamma=2, b=1):
        super(ECANet, self).__init__()
        # Choose the 1D kernel size adaptively from the channel count:
        # k = |(log2(C) + b) / gamma|, forced to be odd
        t = int(abs((math.log2(in_channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # Squeeze: global average pooling gives one descriptor per channel, (b, c, 1, 1)
        y = self.avg_pool(x)
        # Excitation: slide a k-wide 1D convolution across the channel dimension, (b, 1, c)
        y = self.conv(y.squeeze(-1).transpose(-1, -2))
        # Gate: restore the (b, c, 1, 1) shape and map the weights into (0, 1)
        y = self.sigmoid(y.transpose(-1, -2).unsqueeze(-1))
        # Rescale each input channel by its learned weight (broadcasting)
        return x * y
  • nn.AdaptiveAvgPool2d(1) squeezes each feature map to a single 1x1 value, i.e. global average pooling over the spatial dimensions.
  • nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False) computes each channel's weight from its k nearest neighbor channels. Unlike SE-style blocks, there is no dimensionality reduction through fully connected layers, so the module adds only k parameters.
  • The kernel size k is derived from the channel count C as |(log2(C) + b) / gamma| truncated to an integer and made odd (gamma = 2, b = 1 by default), so layers with more channels get a wider interaction range.
  • nn.Sigmoid() is a nonlinear gate that limits the weights to the range 0 to 1 before they rescale the input channels.
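
A minimal sanity check of the module (the tensor sizes here are arbitrary, and the kernel-size loop assumes the default gamma = 2, b = 1): it confirms that the output keeps the input shape and shows how the adaptive kernel size grows with the channel count.

import math
import torch

x = torch.randn(2, 64, 32, 32)  # batch of 2, 64 channels, 32x32 feature maps
eca = ECANet(64)
print(eca(x).shape)             # torch.Size([2, 64, 32, 32]) -- shape is preserved

# Reproduce the kernel-size rule for a few common channel counts
for c in (64, 128, 256, 512):
    t = int(abs((math.log2(c) + 1) / 2))
    k = t if t % 2 else t + 1
    print(c, '->', k)           # 64 -> 3, 128 -> 5, 256 -> 5, 512 -> 5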

Adding the ECANet attention mechanism to a neural network:

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Each ECANet block re-weights the channels produced by the preceding convolution
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, padding=1)
        self.ecanet1 = ECANet(64)
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.ecanet2 = ECANet(128)
        self.conv3 = nn.Conv2d(128, 256, kernel_size=3, padding=1)
        self.ecanet3 = ECANet(256)
        # After three 2x2 max-pools, a 64x64 input is reduced to 8x8
        self.fc1 = nn.Linear(256 * 8 * 8, 512)
        self.fc2 = nn.Linear(512, 10)

    def forward(self, x):
        # Repeated pattern: convolution -> ReLU -> channel attention -> downsampling
        x = F.relu(self.conv1(x))
        x = self.ecanet1(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(self.conv2(x))
        x = self.ecanet2(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(self.conv3(x))
        x = self.ecanet3(x)
        x = F.max_pool2d(x, 2)
        x = x.view(-1, 256 * 8 * 8)  # flatten; assumes 64x64 input images
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
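
A minimal forward-pass sketch to verify the wiring; the 4-image batch is arbitrary, and the 64x64 input size is an assumption implied by the 256 * 8 * 8 flatten (32x32 CIFAR-style inputs would need fc1 changed to 256 * 4 * 4):

net = Net()
dummy = torch.randn(4, 3, 64, 64)  # batch of 4 RGB images at the assumed 64x64 size
logits = net(dummy)
print(logits.shape)                # torch.Size([4, 10]) -- one logit per class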
