Dropout and DropPath

Dropout

Dropout is the technique described in AlexNet, the seminal 2012 paper of deep learning for computer vision, "ImageNet Classification with Deep Convolutional Neural Networks". It is used to improve a network's generalization ability and prevent overfitting.

Nowadays Dropout is generally applied to fully connected layers. Convolutional layers usually do not use Dropout; they rely on BN (batch normalization) to prevent overfitting instead, and the convolution outputs also pass through nonlinearities such as ReLU, which reduces the direct correlation between features.
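
As a minimal sketch of this typical placement (the layer sizes and module choices below are my own illustration, not from the original post), Dropout sits in the fully connected head while the convolutional stage uses BatchNorm + ReLU:

import torch.nn as nn

# Hypothetical network: BN + ReLU regularize the convolutional stage,
# Dropout regularizes only the fully connected part.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Dropout(p=0.5),   # applied before the fully connected classifier
    nn.Linear(16, 10),
)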

Python implementation

Two variants are shown below: vanilla dropout, which rescales activations at test time, and inverted dropout, which rescales by 1/keep_prob during training so that inference needs no extra step.

import numpy as np

# Vanilla dropout: drop units at train time, rescale activations at test time.
def dropout_vanilla(x, p, mode='train'):
    keep_prob = 1 - p
    if mode == 'train':
        # each unit is kept independently with probability keep_prob
        x = x * np.random.binomial(1, keep_prob, size=x.shape)
    else:
        # at inference, scale down to match the expected training activation
        x = x * keep_prob
    return x

# Inverted dropout: rescale by 1/keep_prob at train time, so inference is a no-op.
def dropout_inverted(x, p, mode='train'):
    keep_prob = 1 - p
    if mode == 'train':
        x = x * np.random.binomial(1, keep_prob, size=x.shape) / keep_prob
    return x
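
A quick usage sketch (the all-ones tensor and the drop probability are arbitrary choices for illustration): averaged over many units, inverted dropout keeps the expected activation roughly equal to the input, and does nothing at inference.

x = np.ones((1000, 100))
out = dropout_inverted(x, p=0.3, mode='train')
print(out.mean())   # close to 1.0: the 1/keep_prob scaling preserves the expectation
print(dropout_inverted(x, p=0.3, mode='test').mean())  # exactly 1.0: no-op at inference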

DropPath

Drop-path is a regularization strategy that randomly deactivates branches of a multi-branch deep learning model.
Paper: "FractalNet: Ultra-Deep Neural Networks without Residuals" (ICLR 2017), where drop-path was proposed together with FractalNet.

Drop-path is a new regularization protocol for ultra-deep fractal networks. Without data augmentation, fractal networks trained with drop-path and dropout surpass the performance of residual networks regularized by stochastic depth. Like stochastic depth, it randomly removes macro-scale components, but drop-path further exploits the fractal structure to choose which components to disable.

Drop-path is a natural extension of dropout that regularizes the co-adaptation of sub-paths in a fractal architecture. This regularization also makes it possible to extract high-performance fixed-depth subnetworks.

Python implementation:

import torch
import torch.nn as nn


def drop_path(x, drop_prob: float = 0., training: bool = False):
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    # one random value per sample, broadcast over all remaining dimensions
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize: 1 with probability keep_prob, else 0
    # scale surviving samples by 1/keep_prob to preserve the expected value
    output = x.div(keep_prob) * random_tensor
    return output


class DropPath(nn.Module):
    """Drop paths (stochastic depth) per sample, applied in the main path of residual blocks."""
    def __init__(self, drop_prob=None):
        super(DropPath, self).__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        return drop_path(x, self.drop_prob, self.training)

Usage

self.drop_path = DropPath(drop_prob) if drop_prob > 0. else nn.Identity()

x = x + self.drop_path(self.token_mixer(self.norm1(x)))
x = x + self.drop_path(self.mlp(self.norm2(x)))
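
Putting the pieces together, here is a minimal sketch of a Transformer-style block whose residual branches are dropped per sample. The norm1/token_mixer/norm2/mlp internals are placeholders of my own, not from the original post:

class Block(nn.Module):
    # Hypothetical block: the sub-modules stand in for the real token mixer and MLP.
    def __init__(self, dim, drop_prob=0.):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mixer = nn.Linear(dim, dim)   # placeholder for attention / mixing
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.drop_path = DropPath(drop_prob) if drop_prob > 0. else nn.Identity()

    def forward(self, x):                        # x: (batch, tokens, dim)
        # during training, each sample's residual branch is kept or dropped as a whole
        x = x + self.drop_path(self.token_mixer(self.norm1(x)))
        x = x + self.drop_path(self.mlp(self.norm2(x)))
        return x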

Experiment

import torch

drop_prob = 0.2
keep_prob = 1 - drop_prob
x = torch.randn(4, 3, 2, 2)                    # batch of 4 samples
shape = (x.shape[0],) + (1,) * (x.ndim - 1)    # (4, 1, 1, 1): one mask value per sample
random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
random_tensor.floor_()                         # each sample is kept (1) or dropped (0)
output = x.div(keep_prob) * random_tensor      # dropped samples become all-zero
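
Inspecting the result (the print statements below are my own addition) shows that each sample is either zeroed out entirely or scaled by 1/keep_prob:

print(random_tensor.flatten())              # per-sample mask, e.g. tensor([0., 1., 1., 1.])
print((output.flatten(1) == 0).all(dim=1))  # True exactly for the dropped samples
# kept samples equal x / keep_prob, so the expectation of output over many runs matches x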

Recommended reading: the blog post "[YOLO v4 related theory] Regularizations: DropOut, DropBlock (most important), Spatial DropOut, DropPath, DropConnect".

Reprinted from blog.csdn.net/wuli_xin/article/details/127266407