Learning machine learning: neural networks

Table of Contents

0. What does a neural network do?

1. Neurons

2. Neuron learning

(1) Learning method

<1> Supervised learning

<2> Unsupervised learning

<3> Reinforcement learning

(2) Learning algorithm

<1> Error correction learning

<2> Hebb learning

<3> Competitive learning

3. Activation function

4. BP neural network

5. The training and learning process of BP network


0. What does a neural network do?

A neural network is a nonlinear network model studied in nonlinear science. It is a dynamic, distributed, parallel information-processing model that imitates the behavioral characteristics of biological neural networks. Each unit receives multiple input stimuli and produces an "excited" output when the weighted sum of those inputs exceeds a certain threshold, imitating the way biological neurons work. Through the interconnection of these neurons and the weight coefficients that reflect the strength of each connection, the network acquires a variety of complex information-processing capabilities. This capability does not come from continually improving the performance of any single unit, but from the complex interconnections among units, so the neural network is a connectionist model with many of the important characteristics of a complex system.

In essence, a neural network expresses a mathematical mapping from inputs to outputs. This mapping is determined by the structure of the network, and the structure must be designed and trained for the specific problem at hand. Neural networks are particularly well suited to clustering problems in which similar patterns lie close to one another. A nervous system has the following characteristics: (1) it consists of neurons and the connections between them; (2) the strength of the connections between neurons can change with training; (3) signals can be excitatory or inhibitory; (4) the cumulative effect of the signals a neuron receives determines its state; (5) the connection strength between neurons determines the strength of signal transmission; (6) each neuron has a threshold.

1. Neurons

Neurons are the most basic units of a neural network. Each neuron consists of a cell body, an axon, and dendrites that connect it to other neurons. The axon transmits this neuron's output signal to other neurons; the many nerve endings at its tip allow the output to reach several neurons at once. The dendrites receive signals from other neurons. The cell body performs a simple processing of all the received signals and produces an output. The artificial neuron model (without an activation function) is shown in the figure below, where X denotes the information coming from other neurons and W the connection weights. This neuron outputs the weighted sum of its inputs.
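As a minimal illustrative sketch (not code from the original), the following Python snippet computes this weighted-sum output for a single neuron; the names neuron_sum, x, and w are placeholders of my own choosing:

```python
import numpy as np

def neuron_sum(x, w):
    """Weighted-sum output of a single artificial neuron (no activation)."""
    x = np.asarray(x, dtype=float)   # signals coming from other neurons
    w = np.asarray(w, dtype=float)   # connection weights
    return float(np.dot(w, x))       # sum_i w_i * x_i

# Example: three input signals with different connection strengths
print(neuron_sum([1.0, 0.5, -1.0], [0.2, 0.8, 0.1]))  # 0.5
```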

2. Neuron learning

Acquiring knowledge from the environment through learning and thereby improving its own performance is an important feature of artificial neural networks.

(1) Learning method

Depending on how much information the environment provides, learning methods can be divided into the following three types.

<1>Supervised learning

This learning method requires an external teacher. The teacher's role is to provide the expected output for each given input, and the learning system adjusts its parameters according to the difference between the expected output and the actual output.

<2> Unsupervised learning

The learning system adjusts its own parameters or structure completely in accordance with certain statistical laws of the data provided by the environment.

<3> Reinforcement learning

The external environment only gives an evaluation of the system's output, not the correct output itself. The learning system improves its performance by reinforcing the actions that are rewarded.

(2) Learning algorithm

Learning algorithms are also divided into three types.

<1>Error correction learning

The ultimate goal is to minimize an objective function based on the error signal, so that the actual output of each output unit in the network is, in some statistical sense, as close as possible to the expected output. Once the form of the objective function is chosen, error correction learning becomes a typical optimization problem. The most commonly used objective function is the mean square error criterion.
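As a rough sketch of the idea (my own illustration, not from the original), the snippet below applies the mean square error criterion to a single linear unit and moves its weights against the error gradient; the learning rate lr=0.1 and the toy data are arbitrary assumptions:

```python
import numpy as np

def error_correction_step(w, x, y_expected, lr=0.1):
    """One error-correction update for a single linear unit (MSE criterion)."""
    y_actual = np.dot(w, x)          # actual output of the unit
    error = y_expected - y_actual    # error signal
    # Gradient of E = 0.5 * error**2 w.r.t. w is -error * x,
    # so stepping against the gradient gives:
    return w + lr * error * x

w = np.array([0.0, 0.0])
x = np.array([1.0, 2.0])
for _ in range(50):
    w = error_correction_step(w, x, y_expected=1.0)
print(w, np.dot(w, x))   # the output approaches the expected value 1.0
```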

<2>Hebb learning

During learning, the relevant synapses change: the synaptic connection is strengthened and the transmission efficiency improves. The Hebb learning rule is the basis of connection learning: when the neurons at both ends of a synapse are activated at the same time, the strength of the connection should increase, and vice versa.
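A minimal sketch of the Hebb rule for one post-synaptic unit (illustrative only; the learning rate and activity values are made up):

```python
import numpy as np

def hebb_update(w, pre, post, lr=0.01):
    """Hebbian rule: strengthen a connection when the pre- and post-synaptic
    neurons are active at the same time (delta_w proportional to pre * post)."""
    return w + lr * pre * post

w = np.zeros(3)
pre = np.array([1.0, 0.0, 1.0])   # pre-synaptic neuron activities
post = 1.0                         # post-synaptic neuron activity
w = hebb_update(w, pre, post)
print(w)   # only the connections whose inputs were active are strengthened
```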

<3>Competitive learning

Multiple output units in the network compete with one another, and in the end only one of them, the one with the strongest activation, wins.
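A simple winner-take-all sketch of competitive learning (my own illustration, assuming a Euclidean-distance competition and a learning rate of 0.1):

```python
import numpy as np

def competitive_step(W, x, lr=0.1):
    """Winner-take-all update: only the output unit whose weight vector is
    closest to the input learns; all other units are left unchanged."""
    distances = np.linalg.norm(W - x, axis=1)   # one row of W per output unit
    winner = int(np.argmin(distances))          # the single strongest activation
    W[winner] += lr * (x - W[winner])           # move the winner toward the input
    return winner

rng = np.random.default_rng(0)
W = rng.random((3, 2))                          # 3 competing output units, 2-D inputs
for x in rng.random((100, 2)):
    competitive_step(W, x)
print(W)                                        # weight vectors drift toward input clusters
```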

3. Activation function

Combining the basic artificial neuron model with an activation function gives the processing unit of a neural network.

Excitation function: y=f\left(\sum_{t=0}^{n-1}W_{t}X_{t}-\Theta\right), where f is called the activation function or action function. The output is 1 or 0 depending on whether the weighted sum of the inputs is greater than or less than the internal threshold \Theta; that is, with \sigma =\sum_{t=0}^{n-1}W_{t}X_{t}-\Theta, the function f is defined as f(\sigma)=\left\{\begin{matrix} 1&\sigma > 0\\ 0&\sigma \leq 0 \end{matrix}\right., where 1 means the neuron is activated and 0 means the neuron is suppressed. The activation function needs to be nonlinear. The common activation functions are as follows:
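A tiny sketch of this threshold unit in Python (illustrative; the weights and threshold below are arbitrary):

```python
import numpy as np

def neuron_output(x, w, theta):
    """Threshold unit: fires (1) when the weighted sum exceeds theta, else 0."""
    sigma = np.dot(w, x) - theta       # sigma = sum_t w_t * x_t - theta
    return 1 if sigma > 0 else 0       # 1 = activated, 0 = suppressed

print(neuron_output([1, 1], [0.6, 0.6], theta=1.0))  # 1: weighted sum 1.2 > 1.0
print(neuron_output([1, 0], [0.6, 0.6], theta=1.0))  # 0: weighted sum 0.6 < 1.0
```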

| Function name | Function expression | Function application |
| --- | --- | --- |
| Step function | f(x)=\left\{\begin{matrix} 1&x\geq 0\\ 0&x< 0 \end{matrix}\right. or f(x)=\left\{\begin{matrix} 1&x\geq 0\\ -1&x< 0 \end{matrix}\right. | Discrete neural networks |
| Piecewise linear function | | |
| Sigmoid function | f(x)=\frac{1}{1+e^{-x}} | Continuous neural networks |
| Hyperbolic tangent function | f(x)=\tanh(x)=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}} | |
| Gaussian function | f(x)=e^{-\frac{1}{2\sigma _{i}^{2}}\sum_{j}(x_{j}-W_{ij})^{2}} | Radial basis function neural networks |
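The functions in the table can be written down directly; the sketch below is illustrative, with the Gaussian width sigma=1.0 chosen arbitrarily:

```python
import numpy as np

def step(x):                      # discrete neural networks
    return np.where(x >= 0, 1.0, 0.0)

def sigmoid(x):                   # continuous neural networks
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):                      # hyperbolic tangent
    return np.tanh(x)

def gaussian(x, w, sigma=1.0):    # radial basis function networks
    return np.exp(-np.sum((x - w) ** 2) / (2.0 * sigma ** 2))

x = np.linspace(-3, 3, 7)
print(step(x), sigmoid(x), tanh(x), sep="\n")
print(gaussian(np.array([1.0, 2.0]), np.array([0.5, 1.5])))
```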

4.BP neural network

The BP neural network, also known as the error back-propagation neural network, is a multilayer feedforward neural network. It is a unidirectional, multi-layer forward network: the first layer contains the input nodes, the last layer the output nodes, and in between there are one or more hidden layers. The neurons in the hidden layers all use S-type (sigmoid) activation functions, while the neurons in the output layer use a purely linear transfer function.

The input-output relationship is a highly nonlinear mapping. If the number of input nodes is n and the number of output nodes is m, the network implements a mapping from n-dimensional Euclidean space to m-dimensional space. Two basic theorems apply: (1) Any given continuous function f:[0,1]^{n}\rightarrow R^{m} can be realized by a three-layer forward neural network whose input layer has n neurons, whose middle layer has 2n+1 neurons, and whose output layer has m neurons. (2) For any \varepsilon > 0 and any continuous function f:[0,1]^{n}\rightarrow R^{m}, there exists a three-layer BP network that approximates f to within \varepsilon in the mean square error sense.
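To make the structure in theorem (1) concrete, the following sketch builds a three-layer network with n inputs, 2n+1 sigmoid hidden units, and m linear outputs and runs one forward pass; the random weights and the dimensions n=3, m=2 are placeholder assumptions of my own:

```python
import numpy as np

def three_layer_forward(x, W1, b1, W2, b2):
    """Forward pass of a three-layer feedforward network:
    n inputs -> 2n+1 sigmoid hidden units -> m linear outputs."""
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))   # hidden layer (sigmoid)
    return W2 @ h + b2                          # output layer (linear)

n, m = 3, 2                       # input and output dimensions
hidden = 2 * n + 1                # hidden width from theorem (1)
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((hidden, n)), np.zeros(hidden)
W2, b2 = rng.standard_normal((m, hidden)), np.zeros(m)
print(three_layer_forward(rng.random(n), W1, b1, W2, b2))   # a point in R^m
```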

Increasing the number of network layers can reduce the error and improve accuracy, but it also makes the network structure more complex and sharply increases the number of network weights, thereby increasing training time. Accuracy can also be improved by adjusting the number of nodes in the hidden layer, and the training results are easier to observe, so network structures with fewer hidden layers are usually considered first. If there are too few hidden-layer nodes, the learning process may not converge; if there are too many, the network may also fail to converge for a long time and, because of over-fitting, its fault tolerance and generalization ability will decline.

The final performance of a BP neural network is determined not only by the network structure but also by the initial weights and the order in which the training data are presented. The network topology should therefore be chosen so that the network performs best. From the perspective of function fitting, the BP network acts as an interpolator.

5. The training and learning process of BP network

Suppose the BP network has N processing units. The net input received by the i-th hidden-layer neuron is net_{i}=x_{1}w_{1i}+x_{2}w_{2i}+\cdots+x_{n}w_{ni}. The excitation function is typically the S (sigmoid) function or the hyperbolic tangent function, so the output is o=f(net)=\frac{1}{1+e^{-net}}.
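The formula above, written out as a small sketch (the input and weight values are illustrative only):

```python
import numpy as np

def hidden_neuron(x, w_i):
    """Net input and sigmoid output of the i-th hidden-layer neuron."""
    net_i = np.dot(x, w_i)                 # net_i = x_1*w_1i + ... + x_n*w_ni
    o_i = 1.0 / (1.0 + np.exp(-net_i))     # o = f(net) = 1 / (1 + e^(-net))
    return net_i, o_i

print(hidden_neuron(np.array([0.5, -1.0, 2.0]), np.array([0.4, 0.1, 0.3])))
```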

In the forward propagation stage, a sample is taken from the sample set and fed into the network, and the corresponding output is computed.

In the back-propagation phase, the difference between the actual output and the corresponding expected output is computed, and the weight matrices are adjusted so as to minimize the error. The error metric for the p-th sample is E_{p}=\frac{1}{2}\sum_{j=1}^{m}(y_{pj}-o_{pj})^{2}, and the error metric for the entire sample set is E=\sum_{p}E_{p}. Using the \delta learning rule, i.e. steepest gradient descent, the weights are changed along the negative gradient direction of the error function: \Delta w_{ji}\propto -\frac{\partial E_{p}}{\partial w_{ji}}.
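A small sketch of these error metrics (the expected and actual outputs below are made-up numbers):

```python
import numpy as np

def sample_error(y_p, o_p):
    """Error metric for the p-th sample: E_p = 0.5 * sum_j (y_pj - o_pj)^2."""
    return 0.5 * np.sum((np.asarray(y_p) - np.asarray(o_p)) ** 2)

y = np.array([[1.0, 0.0], [0.0, 1.0]])     # expected outputs for two samples
o = np.array([[0.8, 0.2], [0.3, 0.6]])     # actual network outputs
E_p = [sample_error(y[p], o[p]) for p in range(len(y))]
E = sum(E_p)                                # error over the whole sample set
print(E_p, E)                               # [0.04, 0.125] 0.165
```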

In the error-propagation analysis, the output-layer weights are adjusted as w_{pq}=w_{pq}+\Delta w_{pq}, where w_{pq} denotes the connection weight between the q-th output-layer neuron and the p-th hidden-layer neuron, and \Delta w_{pq}=\alpha f_{n}^{\prime}(net_{q})(y_{q}-o_{q})o_{p}. The hidden-layer weights are adjusted in the same way, except that \Delta v_{hq}=\alpha f_{k-1}^{\prime}(net_{p})(w_{p1}\delta _{1k}+w_{p2}\delta _{2k}+\cdots+w_{pm}\delta _{mk})o_{h,k-2}.
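Putting the forward pass and these update rules together, the sketch below performs one BP training step for a hidden-plus-output network with sigmoid units, so that f'(net)=o(1-o); the learning rate, network sizes, training pair, and the names V and W are arbitrary assumptions of my own:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bp_train_step(x, y, V, W, lr=0.5):
    """One BP training step. V: hidden-layer weights (n_hidden, n_in),
    W: output-layer weights (n_out, n_hidden); both layers use sigmoid units."""
    # Forward propagation
    o_h = sigmoid(V @ x)                       # hidden-layer outputs o_p
    o = sigmoid(W @ o_h)                       # output-layer outputs o_q
    # Back-propagation: for sigmoid units, f'(net) = o * (1 - o)
    delta_out = o * (1 - o) * (y - o)          # output-layer delta
    delta_hid = o_h * (1 - o_h) * (W.T @ delta_out)   # error fed back through W
    # Weight adjustments: delta_w_pq = lr * delta_q * o_p (and similarly for V)
    W += lr * np.outer(delta_out, o_h)
    V += lr * np.outer(delta_hid, x)
    return 0.5 * np.sum((y - o) ** 2)          # E_p for this sample before the update

rng = np.random.default_rng(0)
V = rng.standard_normal((4, 2))                # 2 inputs -> 4 hidden units
W = rng.standard_normal((1, 4))                # 4 hidden units -> 1 output
x, y = np.array([0.5, 1.0]), np.array([1.0])
for _ in range(200):
    E_p = bp_train_step(x, y, V, W)
print(E_p)                                     # the sample error shrinks toward 0
```

Repeating this step over the whole sample set drives E=\sum_{p}E_{p} down along the negative gradient, which is the \delta rule described above.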
