[Activation function] SELU activation function

1. Introduction

        SELU (Scaled Exponential Linear Unit) is an improvement on the ELU activation function. By introducing a self-normalizing mechanism, it lets the hidden layers of a neural network automatically keep the mean of their outputs close to 0 and the variance close to 1 during training.

import torch

# Define the SELU activation function
def selu(x, alpha=1.67326, lambda_=1.0507):
    return lambda_ * torch.where(x > 0, x, alpha * (torch.exp(x) - 1))
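As a quick sanity check (a minimal sketch, assuming PyTorch is installed), the hand-written selu can be compared with PyTorch's built-in torch.nn.SELU, which uses the same pair of constants:

import torch

# Compare the hand-written selu above with PyTorch's built-in SELU module.
x = torch.linspace(-3.0, 3.0, steps=7)
print(selu(x))             # hand-written version
print(torch.nn.SELU()(x))  # built-in module; values should agree to ~5 decimals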

2. Formula

$f(x)=\lambda \cdot\left\{\begin{array}{ll}x & \text { if } x>0 \\ \alpha \cdot\left(e^x-1\right) & \text { if } x \leq 0\end{array}\right.$

where $\lambda$ and $\alpha$ are two constants, usually set to:

$\lambda \approx 1.0507$ and $\alpha \approx 1.67326$
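A useful consequence of these constants is that the negative branch saturates at a finite value:

$\lim _{x \rightarrow-\infty} f(x)=\lim _{x \rightarrow-\infty} \lambda \alpha\left(e^{x}-1\right)=-\lambda \alpha \approx-1.758$

so SELU is bounded below by about $-1.758$, while both branches give $f(0)=0$, keeping the function continuous at zero.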

3. Image
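The original plot is not reproduced here; the curve can be regenerated with a short script (a minimal sketch, assuming matplotlib is installed and using the selu function defined above):

import torch
import matplotlib.pyplot as plt

# Plot SELU on [-5, 5]: linear for x > 0, exponential saturation toward
# -lambda*alpha (about -1.758) for x < 0.
x = torch.linspace(-5.0, 5.0, steps=200)
y = selu(x)  # selu as defined above

plt.plot(x.numpy(), y.numpy())
plt.axhline(0.0, color="gray", linewidth=0.5)
plt.axvline(0.0, color="gray", linewidth=0.5)
plt.title("SELU activation")
plt.xlabel("x")
plt.ylabel("SELU(x)")
plt.show()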

4. Features 

  • Self-normalization: The SELU activation function has a self-normalizing property: the outputs of each hidden layer tend to keep a mean close to 0 and a variance close to 1 during training. This helps mitigate the exploding and vanishing gradient problems in deep networks, making them easier to train.

  • Scope of application: SELU's self-normalizing guarantees only hold under certain conditions; in particular, the inputs should usually be standardized (zero mean, unit variance) before being fed into a SELU network.

  • Activation range: For negative inputs SELU saturates exponentially toward $-\lambda\alpha \approx -1.758$, while for positive inputs it is linear (scaled by $\lambda$). This nonlinearity lets SELU outperform activation functions such as ReLU in some cases.

It should be noted that SELU may not suit every task or network architecture. When using SELU, you also need to pay attention to weight initialization (the paper recommends LeCun normal initialization), because the self-normalizing property depends on the distribution of the network's activations. If SELU is used inappropriately, the mean and variance of the layer outputs can drift, which degrades the model's performance.
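A rough illustration of this behaviour: pushing standardized inputs through a deep stack of SELU layers with LeCun-normal-style weights (standard deviation $1/\sqrt{\text{fan-in}}$, as the paper recommends) keeps the layer statistics close to mean 0 and variance 1. A minimal sketch, assuming PyTorch is installed:

import torch

torch.manual_seed(0)

# Push standardized inputs through a deep stack of linear maps + SELU and
# watch the per-layer statistics. With LeCun-normal weights (std = 1/sqrt(fan_in))
# the mean stays near 0 and the variance near 1, layer after layer.
dim, depth = 512, 20
x = torch.randn(1024, dim)  # inputs are already standardized

for layer in range(depth):
    w = torch.randn(dim, dim) / dim ** 0.5  # LeCun normal, no bias
    x = torch.nn.functional.selu(x @ w)
    if (layer + 1) % 5 == 0:
        print(f"layer {layer + 1:2d}: mean={x.mean().item():+.3f}  var={x.var().item():.3f}")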

Paper link:

[1706.02515] Self-Normalizing Neural Networks (arxiv.org)

For more deep learning content, please visit my homepage. The following are quick links:

[Activation Function] Several activation functions you must know in deep learning: Sigmoid, Tanh, ReLU, LeakyReLU and ELU activation functions (2024 latest compilation) - CSDN Blog
