TensorFlow activation functions: tf.nn.dropout

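Preface: At run time, an activation function activates a subset of the neurons in a neural network and passes the activation on to the next layer. Neural networks are trained with gradient-based methods, so the activation function should be chosen such that the mapping from input to output stays differentiable (at least almost everywhere, as with ReLU).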

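### The role of activation functions

Without an activation function, each layer effectively computes f(x) = ax + b, so the output of every layer is a linear function of the previous layer's output. However many layers the network has, the output remains a linear function of the input. Such a network is no more expressive than one with no hidden layer, which is the original perceptron. A perceptron famously cannot solve even the basic XOR problem, let alone more complex nonlinear problems. Neural networks can handle nonlinear problems because of the nonlinear expressive power of their activation functions.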
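To make the collapse concrete, here is a minimal sketch in plain Python showing that stacking two layers of the form f(x) = ax + b gives another function of the same form. The helper names `linear` and `compose` and the coefficients are made up for illustration; scalars stand in for weight matrices.

```python
def linear(a, b):
    """Return the function f(x) = a * x + b."""
    return lambda x: a * x + b


def compose(a1, b1, a2, b2):
    """Coefficients (a, b) of f2(f1(x)) for f1 = a1*x + b1 and f2 = a2*x + b2."""
    # a2 * (a1 * x + b1) + b2 = (a2 * a1) * x + (a2 * b1 + b2)
    return a2 * a1, a2 * b1 + b2


layer1 = linear(2.0, 1.0)
layer2 = linear(-3.0, 0.5)

# A single linear layer that is equivalent to the two stacked layers.
a, b = compose(2.0, 1.0, -3.0, 0.5)
single = linear(a, b)

for x in [-2.0, 0.0, 1.5, 4.0]:
    stacked = layer2(layer1(x))
    print(x, stacked, single(x))  # the last two columns agree
```

The same argument applies layer by layer, so any depth of purely linear layers reduces to one linear map.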

### Activation functions provided by TensorFlow:

[Activation Functions](https://www.tensorflow.org/api_guides/python/nn#activation-functions)

tf.nn.relu
tf.nn.relu6
tf.nn.crelu
tf.nn.elu
tf.nn.selu
tf.nn.softplus
tf.nn.softsign
tf.nn.dropout
tf.nn.bias_add
tf.sigmoid
tf.tanh

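The dropout function keeps each neuron with probability keep_prob and suppresses it otherwise. A suppressed neuron outputs 0; a neuron that is kept outputs its input multiplied by 1/keep_prob.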
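The rule above can be sketched in a few lines of plain Python. This is only an illustration of the keep-or-zero-and-rescale behaviour, not TensorFlow's implementation; the function name `dropout` and the `rng` parameter are choices made for this sketch.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale it by 1 / keep_prob."""
    out = []
    for v in values:
        if rng.random() < keep_prob:
            out.append(v / keep_prob)  # kept: scaled up
        else:
            out.append(0.0)            # suppressed
    return out


rng = random.Random(0)  # fixed seed so the run is reproducible
x = [1.0] * 10000
y = dropout(x, keep_prob=0.5, rng=rng)

print(sorted(set(y)))   # the distinct output values
print(sum(y) / len(y))  # sample mean of the output
```

With an input of all ones and keep_prob = 0.5, every output is either 0 or 2, and the sample mean stays close to the input mean of 1. That is the purpose of the 1/keep_prob scaling.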

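By default, each neuron is kept or suppressed independently. The noise_shape argument controls which of these decisions are shared: only the dimensions where noise_shape[i] == shape(x)[i] make independent decisions. For example, suppose shape(x) = [k, l, m, n], where k is the number of samples, l the number of rows, m the number of columns, and n the number of channels. With noise_shape = [k, 1, 1, n], decisions are independent across samples and channels but shared across rows and columns: within each sample and channel, all elements are either set to 0 together or scaled by 1/keep_prob together.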
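The sharing rule can be illustrated with a small plain-Python sketch for the 2-D case. The helper `dropout_2d` is hypothetical and only mimics the broadcasting behaviour described above: one keep/drop flag is drawn per entry of noise_shape, and an axis of size 1 reuses the same flag along that axis.

```python
import random


def dropout_2d(x, keep_prob, noise_shape, rng):
    """Dropout on a 2-D list; noise_shape[i] is either the size of axis i or 1."""
    rows, cols = len(x), len(x[0])
    for size, ns in zip((rows, cols), noise_shape):
        if ns not in (1, size):
            raise ValueError("noise_shape must be broadcastable to the shape of x")
    # Draw one keep/drop flag per entry of noise_shape.
    flags = [[rng.random() < keep_prob for _ in range(noise_shape[1])]
             for _ in range(noise_shape[0])]
    out = []
    for i in range(rows):
        fi = i if noise_shape[0] > 1 else 0  # an axis of size 1 shares its flag
        row = []
        for j in range(cols):
            fj = j if noise_shape[1] > 1 else 0
            row.append(x[i][j] / keep_prob if flags[fi][fj] else 0.0)
        out.append(row)
    return out


d = [[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.], [13., 14., 15., 16.]]
rng = random.Random(0)
by_row = dropout_2d(d, 0.5, (4, 1), rng)  # each row is kept or dropped as a whole
by_col = dropout_2d(d, 0.5, (1, 4), rng)  # each column is kept or dropped as a whole

for row in by_row:
    print(row)
```

A noise_shape entry that is neither 1 nor the matching dimension cannot be broadcast, so the sketch rejects it.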

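The signature and docstring below appear to come from TFLearn's `dropout` layer (note the `incoming` argument), which follows the same semantics as `tf.nn.dropout`:

def dropout(incoming, keep_prob, noise_shape=None, name="Dropout"):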
    """ Dropout.
    Outputs the input element scaled up by `1 / keep_prob`. The scaling is so
    that the expected sum is unchanged.
    By default, each element is kept or dropped independently. If noise_shape
    is specified, it must be broadcastable to the shape of x, and only dimensions
    with noise_shape[i] == shape(x)[i] will make independent decisions. For
    example, if shape(x) = [k, l, m, n] and noise_shape = [k, 1, 1, n], each
    batch and channel component will be kept independently and each row and column
    will be kept or not kept together.
    Arguments:
        incoming : A `Tensor`. The incoming tensor.
        keep_prob : A float representing the probability that each element
            is kept.
        noise_shape : A 1-D Tensor of type int32, representing the shape for
            randomly generated keep/drop flags.
        name : A name for this layer (optional).
    """

The following examples illustrate this behavior.

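import tensorflow as tf  # TensorFlow 1.x API (placeholders and sessions)

# Feed keep_prob at run time so it can differ between training and evaluation.
keep_prob = tf.placeholder(tf.float32)
x = tf.Variable(tf.ones([10, 10]))
# Each element is kept with probability keep_prob and scaled by 1 / keep_prob.
# Note: in TensorFlow 2.x the second argument is rate = 1 - keep_prob.
y = tf.nn.dropout(x, keep_prob)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
a = sess.run(y, feed_dict={keep_prob: 0.5})
print(a)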

Output:
[[0. 2. 0. 2. 2. 2. 0. 2. 2. 2.]
 [2. 0. 0. 0. 2. 2. 0. 2. 0. 2.]
 [0. 0. 2. 2. 2. 0. 2. 2. 2. 2.]
 [0. 0. 2. 2. 0. 0. 2. 2. 0. 2.]
 [0. 2. 0. 0. 2. 0. 0. 0. 0. 0.]
 [2. 0. 0. 0. 0. 2. 0. 0. 0. 0.]
 [0. 2. 0. 0. 2. 2. 2. 0. 2. 0.]
 [0. 2. 2. 2. 0. 0. 0. 2. 0. 2.]
 [0. 0. 2. 0. 2. 2. 0. 2. 0. 0.]
 [0. 2. 2. 2. 2. 0. 2. 0. 2. 2.]]


import tensorflow as tf  # TensorFlow 1.x API

with tf.Session() as sess:
    d = tf.constant([[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.], [13., 14., 15., 16.]])
    print(sess.run(tf.shape(d)))

    # noise_shape [4, 4] equals shape(d), so every element is decided independently.
    dropout_a44 = tf.nn.dropout(d, 0.5, noise_shape=[4, 4])
    result_dropout_a44 = sess.run(dropout_a44)
    print(result_dropout_a44)

    # noise_shape[0] = 4 == tf.shape(d)[0] = 4
    # noise_shape[1] = 1 != tf.shape(d)[1] = 4
    # Decisions are independent along axis 0 and shared along axis 1:
    # each row is either entirely zero or entirely kept.
    dropout_a41 = tf.nn.dropout(d, 0.5, noise_shape=[4, 1])
    result_dropout_a41 = sess.run(dropout_a41)
    print(result_dropout_a41)

    # noise_shape[0] = 1 != tf.shape(d)[0] = 4
    # noise_shape[1] = 4 == tf.shape(d)[1] = 4
    # Decisions are independent along axis 1 and shared along axis 0:
    # each column is either entirely zero or entirely kept.
    dropout_a14 = tf.nn.dropout(d, 0.5, noise_shape=[1, 4])
    result_dropout_a14 = sess.run(dropout_a14)
    print(result_dropout_a14)
    # A noise_shape entry that differs from the matching dimension of d must be 1.

Output:
[4 4]
[[  0.   4.   0.   8.]
 [  0.   0.  14.   0.]
 [  0.   0.  22.   0.]
 [  0.   0.  30.   0.]]
[[  2.   4.   6.   8.]
 [  0.   0.   0.   0.]
 [ 18.  20.  22.  24.]
 [ 26.  28.  30.  32.]]
[[  0.   0.   6.   0.]
 [  0.   0.  14.   0.]
 [  0.   0.  22.   0.]
 [  0.   0.  30.   0.]]

### Definition of dropout

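Dropout is a function that TensorFlow provides to prevent or reduce overfitting. It is typically applied to fully connected layers.
During training, dropout temporarily removes units from the network with a given probability. The removal is only temporary: because units are dropped at random, under stochastic gradient descent each mini-batch effectively trains a different network.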
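The "different network per mini-batch" idea can be sketched in plain Python as follows. The helper `dropout_layer` and its `training` flag are hypothetical. Passing activations through unchanged outside training is the usual practice and is an assumption here, since the text above only describes the training phase.

```python
import random


def dropout_layer(activations, keep_prob, training, rng):
    """Apply dropout during training; pass activations through unchanged otherwise."""
    if not training:
        return list(activations), None
    # One keep/drop flag per unit, drawn afresh on every call.
    mask = tuple(rng.random() < keep_prob for _ in activations)
    out = [a / keep_prob if keep else 0.0 for a, keep in zip(activations, mask)]
    return out, mask


rng = random.Random(0)
activations = [1.0] * 32

masks = []
for step in range(3):  # three mini-batches, three independently drawn masks
    out, mask = dropout_layer(activations, 0.5, training=True, rng=rng)
    masks.append(mask)
    print(step, sum(mask), "of", len(mask), "units kept")

# Outside training no unit is dropped, so the full network is used.
eval_out, _ = dropout_layer(activations, 0.5, training=False, rng=rng)
```

Each training step draws a new mask, so a different subset of units is active each time, while the weights behind those units are shared across all steps.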

