torch.nn.GELU
Prototype
CLASS torch.nn.GELU(approximate='none')
Parameters
- approximate (str, optional) – the GELU approximation algorithm to use: 'none' or 'tanh'. Default: 'none'.
Definition
The Gaussian Error Linear Unit function:

$\text{GELU}(x) = x * \Phi(x)$

where $\Phi(x)$ is the cumulative distribution function of the standard Gaussian distribution.
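To make the definition concrete, here is a small sketch checking that $x * \Phi(x)$, with $\Phi$ written via `torch.erf` as $\Phi(x) = 0.5\,(1 + \text{erf}(x/\sqrt{2}))$, matches `nn.GELU` in its default exact mode:

```python
import math
import torch
import torch.nn as nn

x = torch.linspace(-3.0, 3.0, steps=7)

# Gaussian CDF via the error function: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
phi = 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))
manual = x * phi

# Reference: nn.GELU with approximate='none' (the default)
reference = nn.GELU()(x)
assert torch.allclose(manual, reference, atol=1e-6)
```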
When approximate='tanh', GELU is estimated as:

$\text{GELU}(x) = 0.5 * x * (1 + \text{Tanh}(\sqrt{2/\pi} * (x + 0.044715 * x^3)))$
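The tanh estimate above can be checked against PyTorch directly. This sketch assumes PyTorch ≥ 1.12, where the `approximate` argument of `nn.GELU` was added; note the coefficient is $\sqrt{2/\pi}$, not $2/\pi$:

```python
import math
import torch
import torch.nn as nn

x = torch.linspace(-3.0, 3.0, steps=7)

# Tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
manual = 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

reference = nn.GELU(approximate='tanh')(x)
assert torch.allclose(manual, reference, atol=1e-6)
```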
Figure: plot of the GELU activation curve (image omitted).
Code
import torch
import torch.nn as nn
m = nn.GELU()
input = torch.randn(4)
output = m(input)
print("input: ", input)    # e.g. input:  tensor([-1.2732, -0.4936, -0.8219, 0.1772])
print("output: ", output)  # e.g. output: tensor([-0.1292, -0.1534, -0.1690, 0.1010])
# the exact values differ per run, since the input is random