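Summary of Common Activation Functions for Building Neural Networks

This post reviews the activation functions commonly used when building neural networks: sigmoid, tanh, ReLU, and Leaky ReLU.

Sigmoid function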

tf.nn.sigmoid(x)
$f(x) = \frac{1}{1 + e^{-x}}$, $f'(x) = e^{-x}(1 + e^{-x})^{-2} = f(x)(1 - f(x))$
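The identity $f'(x) = f(x)(1 - f(x))$ can be checked numerically. The following is a minimal NumPy sketch rather than part of the original post; the helper names `sigmoid` and `sigmoid_grad` and the step size `h` are illustrative choices.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Closed-form derivative f'(x) = f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.linspace(-5.0, 5.0, 11)
h = 1e-5  # step size for the central difference (illustrative choice)
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)

# The closed form and the finite-difference estimate should agree closely.
max_error = np.max(np.abs(numeric - sigmoid_grad(x)))
print(max_error)

# The derivative peaks at x = 0, where f(0) = 0.5 and f'(0) = 0.5 * 0.5.
print(sigmoid_grad(0.0))
```

A central difference is used because its error shrinks with the square of `h`, which makes the comparison with the closed form tight enough to be convincing.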
(1) Figure: graph of the sigmoid function

(2) Figure: graph of the sigmoid derivative
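Characteristics of the sigmoid function:
1) It easily causes vanishing gradients.
2) Its output is not zero-centered, which slows convergence.
3) It involves exponentiation, which is computationally expensive and lengthens training.

Sigmoid has been used less in recent years. When a deep network updates its parameters, the gradient is computed with the chain rule, layer by layer, from the output layer back to the input layer. The derivative of sigmoid lies in (0, 0.25], so multiplying the derivatives of many layers together drives the product toward 0. The gradient vanishes and the parameters of the early layers effectively stop updating.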
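The shrinking product can be made concrete with a few lines of NumPy. This sketch is an illustration only: it ignores the weight matrices that also appear in the chain rule and assumes the best case for sigmoid, where every layer's pre-activation is exactly 0 and the derivative takes its maximum value of 0.25. The name `best_case` and the depths tried are hypothetical.

```python
import numpy as np

def sigmoid_grad(x):
    """Derivative of the logistic sigmoid, f(x) * (1 - f(x))."""
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

# Best case for sigmoid: the derivative reaches its maximum at x = 0.
best_case = sigmoid_grad(0.0)

for depth in (1, 5, 10, 20):
    # Upper bound on the product of `depth` sigmoid derivatives
    # (weights are ignored in this simplified picture).
    bound = best_case ** depth
    print(f"depth={depth:2d}  bound on gradient factor={bound:.3e}")
```

Even in this most favorable setting the factor contributed by the activations alone falls below 1e-12 at 20 layers; real pre-activations are rarely all 0, so the actual product is smaller still.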

Tanh function

tf.math.tanh(x)
$f(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}}$
The derivative of tanh is $f'(x) = \frac{4}{(e^{x} + e^{-x})^{2}} = 1 - f(x)^{2}$.
(1) Figure: graph of the tanh function

(2) Figure: graph of the tanh derivative
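Characteristics:
(1) Its output is zero-centered.
(2) It easily causes vanishing gradients.
(3) It involves exponentiation, which is computationally expensive and lengthens training.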
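The formulas and the zero-centered property above can be checked with a short NumPy sketch. It is an illustration under stated assumptions: the symmetric input grid `x` and the variable names are chosen here, and `np.tanh` serves as the reference implementation.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid, used here only for comparison with tanh."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-3.0, 3.0, 13)  # symmetric grid around 0 (illustrative)

# tanh in the form (1 - e^(-2x)) / (1 + e^(-2x))
tanh_text = (1.0 - np.exp(-2.0 * x)) / (1.0 + np.exp(-2.0 * x))

# Derivative in the form 4 / (e^x + e^(-x))^2
dtanh_text = 4.0 / (np.exp(x) + np.exp(-x)) ** 2

# Both forms should match NumPy's tanh and the identity 1 - tanh(x)^2.
forms_agree = np.allclose(tanh_text, np.tanh(x))
derivative_agrees = np.allclose(dtanh_text, 1.0 - np.tanh(x) ** 2)
print(forms_agree, derivative_agrees)

# On symmetric inputs tanh averages to about 0, sigmoid to about 0.5.
print(tanh_text.mean(), sigmoid(x).mean())
```

The derivative of tanh peaks at 1 for x = 0, compared with 0.25 for sigmoid, but it still decays toward 0 for large |x|, which is why vanishing gradients remain a concern.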

ReLU function

tf.nn.relu(x)
f(x)=max(0,x)
Figure: graph of the ReLU function

Figure: graph of the ReLU derivative

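Advantages of ReLU:
1) It solves the vanishing gradient problem in the positive region.
2) It only needs to check whether the input is greater than 0, so it is fast to compute.
3) It converges much faster than sigmoid and tanh.
Disadvantages:
1) Its output is not zero-centered, which slows convergence.
2) Dead ReLU problem: some neurons may never be activated, so their parameters are never updated.

When the feature fed into the activation function is negative, the output is 0 and the gradient obtained in backpropagation is also 0, so the parameters cannot be updated and the neuron dies. The root cause is that too many negative features pass through ReLU. There are two mitigations: improve the random initialization so that fewer negative features reach ReLU, or use a smaller learning rate, which reduces large shifts in the parameter distribution and avoids producing too many negative features during training.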
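The Dead ReLU mechanism described above can be seen on a single neuron. The sketch below is hypothetical: the inputs `x`, weights `w`, and bias `b` are made-up values chosen so that every pre-activation is negative, and the loss is taken to be the sum of the outputs to keep the chain rule short.

```python
import numpy as np

# Hypothetical single ReLU neuron: y = relu(x . w + b)
x = np.array([[0.5, 1.0],
              [1.5, 0.2],
              [0.3, 0.8]])      # three non-negative input samples
w = np.array([-1.0, -2.0])     # weights that have drifted strongly negative
b = -0.1

z = x @ w + b                       # pre-activations, all negative here
relu_out = np.maximum(0.0, z)       # ReLU output
relu_grad = (z > 0).astype(float)   # dReLU/dz: 1 where z > 0, else 0

# Gradient of sum(relu_out) with respect to w, via the chain rule.
grad_w = relu_grad @ x

print(z)
print(relu_out)
print(grad_w)
```

Because every sample produces a negative pre-activation, the gradient with respect to `w` is exactly zero, so gradient descent leaves the weights unchanged and the neuron stays inactive on this data.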

Leaky ReLU function

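$f(x) = \max(\alpha x, x)$
tf.nn.leaky_relu(x)
In this example α is set to 0.5. Note that tf.nn.leaky_relu defaults to alpha=0.2, so pass alpha=0.5 to reproduce the plots.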
Figure: graph of the Leaky ReLU function

Figure: graph of the Leaky ReLU derivative

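In theory, Leaky ReLU has all the advantages of ReLU and additionally avoids the Dead ReLU problem. In practice, however, it has not been shown to be consistently better than ReLU.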
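To see why Leaky ReLU avoids dead neurons, it helps to look at its derivative on negative inputs. This is a minimal NumPy sketch with α = 0.5 as in this post; the function names `leaky_relu` and `leaky_relu_grad` and the sample vector `z` are illustrative.

```python
import numpy as np

def leaky_relu(x, alpha=0.5):
    """Leaky ReLU: x for x >= 0, alpha * x otherwise."""
    return np.where(x < 0, alpha * x, x)

def leaky_relu_grad(x, alpha=0.5):
    """Derivative: 1 for x >= 0, alpha otherwise (the value at 0 is a convention)."""
    return np.where(x < 0, alpha, 1.0)

z = np.array([-2.0, -0.5, 0.0, 1.5])
out = leaky_relu(z)
grad = leaky_relu_grad(z)
print(out)
print(grad)

# For 0 < alpha < 1 the piecewise form matches max(alpha * x, x).
matches_max_form = np.allclose(out, np.maximum(0.5 * z, z))
print(matches_max_form)
```

The gradient on the negative side is α rather than 0, so a neuron whose pre-activations are all negative still receives a nonzero update.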
The code used to plot the function graphs is as follows:

from matplotlib import pyplot as plt
import numpy as np

def sigmoid(x):
    # Logistic sigmoid: 1 / (1 + e^(-x))
    return 1./(1+np.exp(-x))
def tanh(x):
    # Hyperbolic tangent written out explicitly
    return (np.exp(x)-np.exp(-x))/(np.exp(x)+np.exp(-x))
def relu(x):
    # ReLU: 0 for negative inputs, x otherwise
    return np.where(x<0,0,x)
def prelu(x):
    # Leaky ReLU with a fixed slope alpha=0.5 on the negative side
    return np.where(x<0,0.5*x,x)
def plot_sigmoid():
    # Plot sigmoid on [-10, 10) with the y-axis drawn through x=0
    x=np.arange(-10,10,0.1)
    y=sigmoid(x)
    fig=plt.figure()
    ax=fig.add_subplot(111)
    # Hide the top and right borders
    ax.spines['top'].set_color('none')
    ax.spines['right'].set_color('none')
    # Move the left spine to x=0
    ax.spines['left'].set_position(('data',0))
    plt.plot(x,y)
    plt.xlim([-10.05,10.05])
    plt.ylim([-0.02,1.02])
    plt.tight_layout()
    plt.savefig('sigmoid.png')
    plt.show()
def plot_dsigmoid():
    x=np.arange(-10,10,0.1)
    y=1./(1+np.exp(-x))
    dy=y*(1-y)
    fig=plt.figure()
    ax=fig.add_subplot(111)
    ax.spines['top'].set_color('none')
    ax.spines['right'].set_color('none')
    ax.spines['bottom'].set_position(('data',0))
    ax.spines['left'].set_position(('data',0))
    ax.plot(x,dy)
    plt.savefig('dsigmoid.png')
    plt.show()
def plot_tanh():
    x=np.arange(-10,10,0.1)
    y=tanh(x)
    fig=plt.figure()
    ax=fig.add_subplot(111)
    ax.spines['top'].set_color('none')
    ax.spines['right'].set_color('none')
    ax.spines['bottom'].set_position(('data',0))
    ax.spines['left'].set_position(('data',0))
    ax.plot(x,y)
    plt.xlim([-10.05,10.05])
    plt.ylim([-1.02,1.02])
    ax.set_yticks([-1.0,-0.5,0.5,1.0])
    ax.set_xticks([-10,-5,5,10])
    plt.tight_layout()
    plt.savefig('tanh.png')
    plt.show()
def plot_dtanh():
    x=np.linspace(-10,10,100)
    y=4/pow((np.exp(x)+np.exp(-x)),2)
    fig=plt.figure()
    ax=fig.add_subplot(111)
    ax.spines['top'].set_color('none')
    ax.spines['right'].set_color('none')
    ax.spines['left'].set_position(('data',0))
    ax.spines['bottom'].set_position(('data',0))
    ax.plot(x,y)
    plt.tight_layout()
    plt.savefig('dtanh.png')
    plt.show()
def plot_relu():
    x=np.arange(-10,10,0.1)
    y=relu(x)
    fig=plt.figure()
    ax=fig.add_subplot(111)
    ax.spines['top'].set_color('none')
    ax.spines['right'].set_color('none')
    ax.spines['left'].set_position(('data',0))
    ax.plot(x,y,color='b',label='relu')
    plt.xlim([-10.05,10.05])
    # Start the y-axis just below 0 so the flat segment for x<0 stays visible
    plt.ylim([-0.02,10.02])
    ax.set_yticks([2,4,6,8,10])
    plt.tight_layout()
    plt.savefig("relu.png")
    plt.show()
def plot_drelu():
    x=np.arange(-10,10,0.1)
    y=np.where(x<0,0,1)
    fig=plt.figure()
    ax=fig.add_subplot()
    ax.spines['top'].set_color('none')
    ax.spines['right'].set_color('none')
    ax.spines['bottom'].set_position(('data',0))
    ax.spines['left'].set_position(('data',0))
    ax.plot(x,y)
    plt.savefig('drelu.png')
    plt.show()

def plot_prelu():
    x=np.arange(-10,10,0.1)
    y=prelu(x)
    fig=plt.figure()
    ax=fig.add_subplot(111)
    ax.spines['top'].set_color('none')
    ax.spines['right'].set_color('none')
    ax.spines['left'].set_position(('data',0))
    ax.spines['bottom'].set_position(('data',0))
    ax.plot(x,y)
    plt.xticks([-10,-5,0,5,10])
    plt.yticks([-10,-5,0,5,10])
    plt.tight_layout()
    plt.savefig('prelu.png')
    plt.show()
def plot_dprelu():
    x = np.arange(-10, 10, 0.1)
    y = np.where(x < 0, 0.5, 1)  # slope alpha=0.5 for x<0, matching prelu()
    fig = plt.figure()
    ax = fig.add_subplot()
    ax.spines['top'].set_color('none')
    ax.spines['right'].set_color('none')
    ax.spines['bottom'].set_position(('data', 0))
    ax.spines['left'].set_position(('data', 0))
    ax.plot(x, y)
    plt.savefig('dprelu.png')
    plt.show()
if __name__=="__main__":
    plot_sigmoid()
    plot_dsigmoid()
    plot_tanh()
    plot_dtanh()
    plot_relu()
    plot_drelu()
    plot_prelu()
    plot_dprelu()