Why do traditional multi-layer networks find it hard to express the identity mapping through their nonlinear layers as depth increases, so that the model suffers from the network degradation problem? And what is an identity mapping?


1. What is identity mapping?

An identity mapping is one in which the output is exactly the same as the input, that is, y = x. It is a linear function with no nonlinear transformation.

Each layer of a deep neural network has a nonlinear activation function, such as ReLU. This makes it hard for a deep network to learn an exact identity mapping.

The reasons are as follows:

  1. The nonlinear activation in each layer applies some degree of nonlinear transformation to the input signal. As the number of layers increases, these transformations accumulate and become difficult to cancel out completely, so the output cannot easily be restored to the original input.

  2. Deep networks have strong expressive power and overfit easily, so the functions they learn may be more complex than a simple identity mapping.

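Therefore, as the number of layers increases, a deep network has difficulty learning a linear identity mapping, and network degradation often appears: in some cases a deep network performs worse than a shallower one.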
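The point about accumulated nonlinearity can be illustrated with a small sketch. It assumes the simplest possible "plain" layer (identity weights, zero bias, then ReLU); the helper names `plain_stack` and `handcrafted_identity` are hypothetical and chosen only for this illustration. Even with ideal weights, the stack discards negative inputs. An exact identity does exist for ReLU units, but only with a very specific weight arrangement, which is the kind of solution that training rarely finds on its own.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def plain_stack(v, depth):
    """Apply `depth` plain layers, each with identity weights followed by ReLU."""
    for _ in range(depth):
        v = relu(v)  # W = I and b = 0, so the layer reduces to ReLU alone
    return v


def handcrafted_identity(v):
    """Exact identity built from two ReLU units per input: relu(x) - relu(-x)."""
    pos = relu(v)
    neg = relu([-x for x in v])
    return [p - n for p, n in zip(pos, neg)]


x = [-2.0, -0.5, 0.0, 1.5]
y_plain = plain_stack(x, depth=10)   # negative entries are clipped to 0.0
y_exact = handcrafted_identity(x)    # reproduces x, but needs hand-set weights

print(y_plain)
print(y_exact)
```

The plain stack loses the sign information after the first layer, and no later layer can recover it.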

2. Deep neural networks do not always need to maintain an identity mapping

For deep neural networks, maintaining an identity mapping is not required, especially when handling complex tasks. The main purpose of a deep neural network is to learn high-level abstract representations or features of the input data through multiple layers of nonlinear transformation, so as to better solve the specific learning task.

However, in some cases an identity mapping is needed. Identity mapping means that the input and output are exactly the same, that is, the network applies no transformation to the input. In some tasks this is close to the desired result because the input itself is largely the target output. In image denoising or restoration, for example, the output image should be as similar to the input image as possible.

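In addition, identity mapping is introduced mainly to address the network degradation problem. As the number of layers increases, nonlinear transformations may degrade network performance. Introducing an identity mapping preserves the input information to some extent, which alleviates vanishing and exploding gradients and lets the deep network train more effectively.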
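One common way to introduce an identity mapping is the residual (skip) connection used in ResNet-style networks, y = x + F(x). The sketch below assumes that form, with a hypothetical two-layer residual branch F(x) = W2 · relu(W1 · x); the function names and sizes are illustrative only. When the branch weights are zero, each block passes its input through unchanged, so a deep stack of such blocks starts out as an exact identity.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def residual_block(x, W1, W2):
    """Compute y = x + W2 * relu(W1 * x); the skip path adds the input back."""
    hidden = relu(matvec(W1, x))
    branch = matvec(W2, hidden)
    return [xi + fi for xi, fi in zip(x, branch)]


x = [-2.0, 0.5, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]  # residual branch contributes nothing

y = x
for _ in range(20):  # 20 stacked residual blocks
    y = residual_block(y, zeros, zeros)

print(y)  # same values as x, including the negative entry
```

Because the identity is the default behaviour, the residual branch only has to learn a correction to it, rather than having to reconstruct the input through a chain of nonlinear layers.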

3. Identity mapping can be used as a simple benchmark task to evaluate and analyze some important properties of the network

  1. Testing whether the network is overfitting: if a very deep network performs poorly on a simple data set (such as an identity mapping data set), it may have overfitted to more complex patterns and lost the ability to learn simple ones.

  2. Analyzing whether expressive power degrades as layers are added: if a network cannot learn the identity mapping, its feature extraction and modeling ability may decline as the number of layers increases.

  3. Testing whether the optimization algorithm is effective: if a network cannot learn a function as simple as the identity mapping through training, there may be a problem with the network structure or the optimization algorithm.

  4. The identity mapping is a basic linear model. If the network cannot learn it, the network's ability to learn both linear and nonlinear patterns needs further improvement.
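As a minimal sketch of such a benchmark, the example below fits a single linear unit y = w·x + b to an identity data set with plain gradient descent on the mean squared error. The function name `train_identity`, the learning rate, and the step count are assumptions made for this illustration. It is far smaller than a real deep network, but the same check (does training recover w ≈ 1 and b ≈ 0?) can be applied to a deeper model by swapping in that model and its optimizer.

```python
def train_identity(xs, lr=0.1, steps=200):
    """Fit y = w * x + b to the targets y = x by gradient descent on the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of mean((w * x + b - x) ** 2) with respect to w and b
        grad_w = sum(2 * (w * x + b - x) * x for x in xs) / n
        grad_b = sum(2 * (w * x + b - x) for x in xs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [-1.0, -0.5, 0.0, 0.5, 1.0]  # identity data set: each target equals its input
w, b = train_identity(xs)

print(w, b)  # w should be close to 1 and b close to 0
```

If a training setup cannot pass a check this simple, the problem is more likely in the architecture or the optimizer settings than in the data.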



Origin blog.csdn.net/qlkaicx/article/details/135025225