PaddlePaddle Zero-Basics Introduction to Deep Learning (5): Convolutional Neural Network Fundamentals: BatchNorm and Dropout

This course is Baidu's official zero-basics introductory deep learning course, aimed mainly at students with little or no background in deep learning, and designed to help you make the leap from 0 to 1 in the field. From this course you will learn:

  • The basics of deep learning
  • Implementing artificial neural networks and the gradient descent algorithm with numpy
  • Principles and practice of the main directions in computer vision
  • Principles and practice of the main directions in natural language processing
  • Theory and practice of personalized recommendation algorithms

This article is brought to you by Sun Gaofeng, senior R&D engineer on Baidu's deep learning technology platform, and covers two fundamental building blocks of convolutional neural networks: BatchNorm and Dropout.

Batch Normalization (BatchNorm)

Batch normalization (Batch Normalization, BatchNorm) was proposed by Ioffe and Szegedy in 2015 and has been widely adopted in deep learning. Its goal is to normalize the outputs of a neural network's intermediate layers so that those outputs become more stable.
Usually the data fed to a neural network is standardized beforehand, so that the samples follow a distribution with mean 0 and variance 1. A relatively fixed input distribution is beneficial to the stability and convergence of the training algorithm. For deep neural networks, however, the parameters are updated continuously, so even if the input data has been standardized, the inputs received by the later layers still vary drastically, which often leads to numerical instability and makes the model hard to converge. BatchNorm makes the outputs of the intermediate layers more stable and brings the following three advantages (a small numpy sketch of the input standardization mentioned above follows the list):

  • Makes learning faster (a larger learning rate can be used)
  • Reduces the model's sensitivity to the initial parameter values
  • Suppresses overfitting to a certain extent
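
Here is a minimal numpy sketch of the input standardization described above; the data values and shapes are toy choices for illustration only, and in practice the per-feature mean and standard deviation are computed on the training set:

# Standardize a toy dataset so each feature has mean 0 and variance 1
import numpy as np

samples = np.array([[1., 10.], [2., 20.], [3., 30.]], dtype='float32')  # [N, K] toy data
mean = samples.mean(axis=0)          # per-feature mean
std = samples.std(axis=0)            # per-feature standard deviation
standardized = (samples - mean) / std
print('standardized data: \n {}'.format(standardized))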

The main idea of BatchNorm is to normalize the values of the neurons in units of mini-batches during training, so that the data distribution has zero mean and unit variance. The calculation proceeds as follows:

1. Compute the mean of the samples in the mini-batch: $\mu_B = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}$

2. Compute the variance of the mini-batch: $\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m} \left(x^{(i)} - \mu_B\right)^2$

3. Normalize each sample with the mini-batch statistics, adding a small constant $\epsilon$ (1e-05 in the examples below) to the denominator for numerical stability: $\hat{x}^{(i)} = \frac{x^{(i)} - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$

4. Scale and shift the normalized value with the learnable parameters $\gamma$ and $\beta$ (initialized to 1 and 0 respectively): $y^{(i)} = \gamma \hat{x}^{(i)} + \beta$


# Example with input data of shape [N, K]
import numpy as np

import paddle
import paddle.fluid as fluid
from paddle.fluid.dygraph.nn import BatchNorm
# Create the data
data = np.array([[1,2,3], [4,5,6], [7,8,9]]).astype('float32')
# Use BatchNorm to compute the normalized output
with fluid.dygraph.guard():
    # Input shape is [N, K]; num_channels equals K
    bn = BatchNorm('bn', num_channels=3)    
    x = fluid.dygraph.to_variable(data)
    y = bn(x)
    print('output of BatchNorm Layer: \n {}'.format(y.numpy()))

# Use numpy to compute the mean, standard deviation, and normalized output
# Verify feature 0 here
a = np.array([1,4,7])
a_mean = a.mean()
a_std = a.std()
b = (a - a_mean) / a_std
print('mean {}, std {}, \n output {}'.format(a_mean, a_std, b))

# Readers are encouraged to verify features 1 and 2 as well, and to check whether the numpy results match the paddle results
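
Following the suggestion in the last comment, here is a minimal numpy check of features 1 and 2, reusing the `data` array created above; the results should match the corresponding columns of the paddle output (up to the small epsilon discussed below):

# Verify features 1 and 2 with numpy (columns of the [N, K] input above)
for k in [1, 2]:
    col = data[:, k]
    col_mean = col.mean()
    col_std = col.std()
    col_norm = (col - col_mean) / col_std
    print('feature {}: mean {}, std {}, \n output {}'.format(k, col_mean, col_std, col_norm))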

When the input has shape [N, C, H, W] (for example, the output of a convolutional layer), BatchNorm computes the mean and variance separately for each of the C channels, over the N, H, and W dimensions, and normalizes each channel with its own statistics. The example below verifies this for channel 0.


# BatchNorm example with input data of shape [N, C, H, W]
import numpy as np

import paddle
import paddle.fluid as fluid
from paddle.fluid.dygraph.nn import BatchNorm

# Set the random seed so that each run produces the same results
np.random.seed(100)
# Create the data
data = np.random.rand(2,3,3,3).astype('float32')
# Use BatchNorm to compute the normalized output
with fluid.dygraph.guard():
    # Input shape is [N, C, H, W]; num_channels equals C
    bn = BatchNorm('bn', num_channels=3)
    x = fluid.dygraph.to_variable(data)
    y = bn(x)
    print('input of BatchNorm Layer: \n {}'.format(x.numpy()))
    print('output of BatchNorm Layer: \n {}'.format(y.numpy()))

# Take channel 0 of data and
# use numpy to compute the mean, standard deviation, and normalized output
a = data[:, 0, :, :]
a_mean = a.mean()
a_std = a.std()
b = (a - a_mean) / a_std
print('channel 0 of input data: \n {}'.format(a))
print('mean {}, std {}, \n output: \n {}'.format(a_mean, a_std, b))

# Note: the output computed with numpy here differs slightly
# from the result of the BatchNorm operator, because for
# numerical stability the BatchNorm operator adds a small
# floating-point number epsilon=1e-05 to the denominator
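
To reproduce the operator's output more closely, the same epsilon can be added inside the square root; a small sketch continuing the channel-0 check above:

# Add epsilon=1e-05 to the denominator, as the BatchNorm operator does
epsilon = 1e-05
b_eps = (a - a_mean) / np.sqrt(a.var() + epsilon)
print('output with epsilon: \n {}'.format(b_eps))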


Dropout

Dropout is a commonly used technique for suppressing overfitting in deep learning. In practice, it randomly drops a portion of the neurons during training: a fraction of the neurons is randomly selected and their outputs are set to 0, so these neurons no longer pass signals to the layers behind them (a small numpy sketch of this masking follows Figure 11 below).

Figure 11 is a schematic of Dropout. On the left is the complete neural network, and on the right is the network structure after applying Dropout. After Dropout is applied, the neurons marked with an X are removed from the network, so they do not pass signals to the following layers. During training, which neurons are dropped is decided at random, so the model cannot rely too heavily on any particular neurons, which suppresses overfitting to a certain extent.

[Figure 11: Dropout schematic. Left: the complete network; right: the network after applying Dropout, with the crossed-out neurons removed.]
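
The random masking described above can be sketched in a few lines of numpy; this is a toy illustration with dropout probability 0.5, independent of the paddle API used in the example below:

# Minimal numpy sketch of dropout masking during training
import numpy as np

np.random.seed(0)
p = 0.5                                  # probability of dropping a neuron
x = np.arange(1, 7).astype('float32')    # toy activations of one layer
mask = (np.random.rand(*x.shape) >= p).astype('float32')  # 0 drops a neuron, 1 keeps it
out_train = x * mask                     # dropped neurons output 0 during training
print('mask \n {}, \n out_train \n {}'.format(mask, out_train))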


# Dropout operation
import numpy as np

import paddle
import paddle.fluid as fluid

# Set the random seed so that each run produces the same results
np.random.seed(100)
# Create data of shape [N, C, H, W], typically the output of a convolutional layer
data1 = np.random.rand(2,3,3,3).astype('float32')
# Create data of shape [N, K], typically the output of a fully connected layer
data2 = np.arange(1,13).reshape([-1, 3]).astype('float32')
# Apply dropout to the input data
with fluid.dygraph.guard():
    x1 = fluid.dygraph.to_variable(data1)
    out1_1 = fluid.layers.dropout(x1, dropout_prob=0.5, is_test=False)
    out1_2 = fluid.layers.dropout(x1, dropout_prob=0.5, is_test=True)

    x2 = fluid.dygraph.to_variable(data2)
    out2_1 = fluid.layers.dropout(x2, dropout_prob=0.5, \
                    dropout_implementation='upscale_in_train')
    out2_2 = fluid.layers.dropout(x2, dropout_prob=0.5, \
                    dropout_implementation='upscale_in_train', is_test=True)

    print('x1 {}, \n out1_1 \n {}, \n out1_2 \n {}'.format(data1, out1_1.numpy(),  out1_2.numpy()))
    print('x2 {}, \n out2_1 \n {}, \n out2_2 \n {}'.format(data2, out2_1.numpy(), out2_2.numpy()))
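
As a rough numpy sketch of what the two settings above are documented to do (an illustration of the described semantics, not the operator itself): with the default 'downgrade_in_infer', training multiplies the input by a random 0/1 mask and testing scales the input by (1 - dropout_prob); with 'upscale_in_train', training multiplies by the mask and divides by (1 - dropout_prob), and testing returns the input unchanged:

# Illustrative numpy sketch of the two dropout_implementation modes (the mask is random)
p = 0.5
x = data2                                    # reuse the [N, K] data created above
mask = (np.random.rand(*x.shape) >= p).astype('float32')
# 'downgrade_in_infer' (default): train = x * mask, test = x * (1 - p)
down_train, down_test = x * mask, x * (1 - p)
# 'upscale_in_train': train = x * mask / (1 - p), test = x
up_train, up_test = x * mask / (1 - p), x
print('downgrade test output \n {}, \n upscale test output \n {}'.format(down_test, up_test))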

Summary

This article explained common modules used inside convolutional neural networks, namely BatchNorm and Dropout. In later parts of the course, we will continue to bring richer content to help students master deep learning.

[How to learn]

How can I watch the complete videos and practice the code?

The videos and code have been published on the AI Studio practice platform. The videos can be watched on both PC and mobile, and everyone is encouraged to run the code themselves. Scan the QR code or open the following link:
https://aistudio.baidu.com/aistudio/course/introduce/888

What if I have questions during the learning process?

Join the deep learning training camp QQ group: 726887660, where the class teachers and PaddlePaddle R&D engineers answer questions and distribute learning materials.

How to learn more?

Baidu will continue to update the "Zero-Basics Introduction to Deep Learning" course in the form of the PaddlePaddle deep learning training camp, taught personally by Baidu's senior deep learning R&D engineers, every Tuesday and Thursday from 8:00 to 9:00, in a format that combines live streaming, recorded lectures, hands-on practice, and Q&A. Welcome to follow along!



Source: blog.csdn.net/PaddleLover/article/details/103897544