Research on activation functions has never stopped. ReLU still dominates deep learning, but that may change with Mish. In a new paper titled "Mish: A Self Regularized Non-Monotonic Neural Activation Function," Diganta Misra introduces a new deep-learning activation function that improves final accuracy over Swish (+0.494%) and over ReLU (+1.671%).
Mish functional form
import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    '''
    Applies the mish function element-wise:
    mish(x) = x * tanh(softplus(x)) = x * tanh(ln(1 + exp(x)))

    Shape:
        - Input: (N, *) where * means any number of additional
          dimensions
        - Output: (N, *), same shape as the input

    Examples:
        >>> m = Mish()
        >>> input = torch.randn(2)
        >>> output = m(input)
    '''
    def __init__(self):
        '''
        Init method.
        '''
        super().__init__()

    def forward(self, input):
        '''
        Forward pass of the function.
        '''
        return input * torch.tanh(F.softplus(input))
To summarize briefly: Mish(x) = x * tanh(ln(1 + e^x)).
For comparison with other activation functions: ReLU(x) = max(0, x), and Swish(x) = x * sigmoid(x).
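The three formulas can be compared directly on a few sample values. This is a minimal sketch using plain PyTorch ops, with Mish written out from its closed form rather than imported from any package:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])

relu = torch.relu(x)                  # max(0, x)
swish = x * torch.sigmoid(x)          # x * sigmoid(x)
mish = x * torch.tanh(F.softplus(x))  # x * tanh(ln(1 + e^x))

# Unlike ReLU, Mish is slightly negative for small negative inputs
# (it is non-monotonic) and smoothly approaches x for large positive x.
for name, y in [("relu", relu), ("swish", swish), ("mish", mish)]:
    print(name, y)
```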
How to use Mish
First, download the Mish package, then copy mish.py into your project directory and import it. Then use it as your network's activation function:
from mish import Mish
act_fun = Mish()
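The activation can then be dropped into an ordinary PyTorch model wherever ReLU would normally go. A minimal sketch (the layer sizes are illustrative, and Mish is redefined inline here so the snippet runs without mish.py on the path):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Inline stand-in for the Mish class from mish.py.
class Mish(nn.Module):
    def forward(self, x):
        return x * torch.tanh(F.softplus(x))

# Use Mish exactly where an nn.ReLU() layer would normally appear.
model = nn.Sequential(
    nn.Linear(10, 32),
    Mish(),
    nn.Linear(32, 2),
)

out = model(torch.randn(4, 10))
print(out.shape)  # torch.Size([4, 2])
```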