Attention mechanism + soft thresholding = deep residual shrinkage network

As the name suggests, the deep residual shrinkage network is composed of two parts, the "residual network" and "shrinkage", and is an improved version of the residual network.

Among them, the residual network won the 2015 ImageNet image recognition competition and has since become a foundational network in the field of deep learning; "shrinkage" refers to "soft thresholding", a key step in many signal denoising methods.

The deep residual shrinkage network is also a deep learning algorithm built on the "attention mechanism": the threshold required for soft thresholding is, in essence, set by the attention mechanism.

This article first briefly reviews the related fundamentals (the residual network, soft thresholding, and the attention mechanism), and then explains the motivation, algorithm, and applications of the deep residual shrinkage network.


1. Related fundamentals

1.1 Residual Network

The residual network (also known as the deep residual network or deep residual learning; ResNet in English) is a type of convolutional neural network. Compared with an ordinary convolutional neural network, the residual network uses cross-layer identity connections to reduce the difficulty of training. A basic module of the residual network is shown in Figure 1.

(Figure 1: a basic module of the residual network)
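To make the structure concrete, here is a minimal PyTorch-style sketch of such a basic module (an illustration following common ResNet conventions, not code from this article or the paper): the output is the input plus the result of a small stack of convolutional layers.

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """Basic residual module: output = input + F(input)."""

        def __init__(self, channels: int):
            super().__init__()
            # residual branch F: BN -> ReLU -> conv, twice (pre-activation style)
            self.body = nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # the cross-layer identity connection: gradients can bypass F
            return x + self.body(x)

The identity shortcut means each module only has to learn a correction to its input rather than a full transformation, which is what eases the training of very deep networks.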

1.2 Soft thresholding

Soft thresholding is the core step of many signal denoising methods. It sets features whose absolute values are below a certain threshold to zero, and shrinks the other features toward zero as well. Here, the threshold is a parameter that must be set in advance, and its value has a direct impact on the denoising result. The relationship between the input and output of soft thresholding is shown in Figure 2.

(Figure 2: the relationship between the input and output of soft thresholding)

As can be seen from Figure 2, soft thresholding is a nonlinear transformation with a property very similar to the ReLU activation function: its gradient is either 0 or 1. Soft thresholding can therefore also serve as an activation function for a neural network. In fact, soft thresholding has already been used as the activation function in some neural networks.
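As a minimal sketch, soft thresholding with threshold tau can be written as y = sign(x) * max(|x| - tau, 0):

    import torch

    def soft_threshold(x: torch.Tensor, tau: float) -> torch.Tensor:
        # Zero out features with |x| <= tau; shrink the rest toward zero by tau.
        # The gradient w.r.t. x is 0 where |x| < tau and 1 elsewhere, like ReLU.
        return torch.sign(x) * torch.clamp(torch.abs(x) - tau, min=0.0)

    # Example: with tau = 1.0, small (presumably noisy) values become zero.
    x = torch.tensor([-3.0, -0.5, 0.2, 1.5])
    print(soft_threshold(x, 1.0))  # tensor([-2.0000, 0.0000, 0.0000, 0.5000])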

1.3 Attention mechanism

The attention mechanism is a mechanism for focusing on key local information. It can be divided into two steps: first, scan the global information to find the useful local information; second, enhance the useful information and suppress the redundant information.

The Squeeze-and-Excitation Network is a classic attention mechanism in deep learning. It uses a small sub-network to automatically learn a set of weights, one for each channel of the feature map. The implication is that the features of some channels are important while the features of other channels are redundant; in this way, we can enhance the useful feature channels and weaken the redundant ones. A basic module of the Squeeze-and-Excitation Network is shown below.

(Figure: a basic module of the Squeeze-and-Excitation Network)

It is worth noting that, in this way, each sample can have its own unique set of weights: the weighting of the feature channels is adjusted according to the characteristics of the sample itself. For example, the first feature channel may be important for sample A while its second is not, and for sample B the first channel may be unimportant while the second is important. Then sample A can get a set of weights that strengthens its first channel and weakens its second; likewise, sample B can get a set of weights that weakens its first channel and strengthens its second.
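A minimal PyTorch-style sketch of such a module follows (the class name and the reduction ratio of 16 are assumptions based on the common SENet design, not details from this article):

    import torch
    import torch.nn as nn

    class SEBlock(nn.Module):
        """Learn one weight per channel, separately for every sample."""

        def __init__(self, channels: int, reduction: int = 16):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction),  # squeeze to a bottleneck
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),  # expand back
                nn.Sigmoid(),                                # weights in (0, 1)
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            b, c, _, _ = x.shape             # x: (batch, channels, height, width)
            s = x.mean(dim=(2, 3))           # global average pooling -> (b, c)
            w = self.fc(s).view(b, c, 1, 1)  # per-sample channel weights
            return x * w                     # reweight each channel

Because the weights are computed from each sample's own pooled features, two samples in the same batch receive different channel weightings, exactly as in the example above.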


2. Theory of the deep residual shrinkage network

2.1 Motivation

First, data in the real world more or less contain some redundant information. We can therefore try to embed soft thresholding into the residual network in order to eliminate that redundant information.

Second, the amount of redundant information often differs from sample to sample. We can therefore use the attention mechanism to adaptively set a different threshold for each sample according to its own characteristics.

2.2 Algorithm

Similar to the residual network and the Squeeze-and-Excitation Network, the deep residual shrinkage network is formed by stacking a number of basic modules. Each basic module contains a sub-network that automatically learns a set of thresholds to use for soft thresholding of the feature map. It is worth noting that, in this way, each sample gets its own unique set of thresholds. A basic module of the deep residual shrinkage network is shown below.

(Figure: a basic module of the deep residual shrinkage network)
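As a rough sketch of this idea, here is a channel-wise variant in which each threshold is a learned fraction (between 0 and 1) of that channel's average absolute feature value; the names and layer details are illustrative, not the authors' exact implementation:

    import torch
    import torch.nn as nn

    def soft_threshold(x: torch.Tensor, tau: torch.Tensor) -> torch.Tensor:
        # tau broadcasts over the spatial dimensions
        return torch.sign(x) * torch.clamp(torch.abs(x) - tau, min=0.0)

    class ResidualShrinkageBlock(nn.Module):
        """Residual module with per-sample, per-channel soft thresholding."""

        def __init__(self, channels: int):
            super().__init__()
            self.body = nn.Sequential(  # residual branch, as in a plain ResNet block
                nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
            )
            self.fc = nn.Sequential(    # sub-network that learns the thresholds
                nn.Linear(channels, channels), nn.ReLU(inplace=True),
                nn.Linear(channels, channels), nn.Sigmoid(),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            r = self.body(x)
            avg_abs = r.abs().mean(dim=(2, 3))   # (batch, channels)
            alpha = self.fc(avg_abs)             # scaling factors in (0, 1)
            tau = (alpha * avg_abs).view(*alpha.shape, 1, 1)
            return x + soft_threshold(r, tau)    # shrink, then add the shortcut

Computing each threshold as (a sigmoid output) times (the mean absolute feature value) keeps it positive and below the typical feature magnitude, so the soft thresholding removes the small, presumably redundant responses without zeroing out everything.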

The overall structure of the deep residual shrinkage network is shown below. It consists of an input layer, a number of basic modules, and a final fully connected output layer, among other components.

(Figure: the overall structure of the deep residual shrinkage network)
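Continuing the sketch above, the basic modules can be stacked into a complete network (the layer counts and sizes here are illustrative assumptions, not the configuration from the paper):

    import torch
    import torch.nn as nn

    class DeepResidualShrinkageNet(nn.Module):
        """Input conv -> stacked shrinkage modules -> pooling -> FC output."""

        def __init__(self, in_channels: int = 1, channels: int = 16,
                     num_blocks: int = 4, num_classes: int = 10):
            super().__init__()
            self.stem = nn.Conv2d(in_channels, channels, 3, padding=1)
            # uses ResidualShrinkageBlock from the sketch above
            self.blocks = nn.Sequential(
                *[ResidualShrinkageBlock(channels) for _ in range(num_blocks)]
            )
            self.head = nn.Linear(channels, num_classes)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = self.blocks(self.stem(x))
            x = x.mean(dim=(2, 3))   # global average pooling
            return self.head(x)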

2.3 Application

In the paper, the deep residual shrinkage network is applied to fault diagnosis based on the vibration signals of rotating machinery. In principle, however, the deep residual shrinkage network is suitable for any dataset that contains redundant information, and redundant information is everywhere. For example, in image recognition an image always contains regions unrelated to its label; in speech recognition audio often contains various kinds of noise. Therefore, the deep residual shrinkage network, or rather the idea of "attention mechanism" + "soft thresholding", has broad research value and application prospects.


Reference

M. Zhao, S. Zhong, X. Fu, et al. Deep residual shrinkage networks for fault diagnosis. IEEE Transactions on Industrial Informatics, 2019. DOI: 10.1109/TII.2019.2943898

https://ieeexplore.ieee.org/document/8850096/

 
