Machine Learning - A Preliminary Study of Neural Networks

1. Overview of Neural Networks
A neural network is an important class of machine learning algorithms. Its most basic component is the neuron model, the "simple unit" in the definition above. In a biological neural network, each neuron is connected to other neurons; when it is "excited", it sends chemical signals to the neurons it connects to, changing the electrical potential inside them. If a neuron's potential exceeds a "threshold", it is activated, i.e. becomes "excited", and in turn sends chemical signals to other neurons. Abstracting this process into a simple model gives the "M-P" (McCulloch-Pitts) neuron model.
In this model, a neuron receives input signals from n other neurons, passed along weighted connections. The neuron's total input is compared against its threshold, and the result is put through an "activation function" to produce the neuron's output. (Here the Sigmoid function is used in place of the ideal step function.)
Sigmoid function: σ(x) = 1 / (1 + e^(−x)). It is continuous, smooth, strictly monotonic, and symmetric about the point (0, 0.5), which makes it a very good threshold function; it is also known as a "squashing function".
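As a minimal sketch, the Sigmoid function and the properties described above can be checked directly:

```python
import math

def sigmoid(x: float) -> float:
    """Sigmoid (logistic) function: continuous, smooth, strictly monotonic,
    symmetric about (0, 0.5); squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))    # 0.5, the center of symmetry
print(sigmoid(10.0))   # close to 1
print(sigmoid(-10.0))  # close to 0
```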

2. Neural Network Structure

A classic neural network has three layers: an input layer, a hidden layer, and an output layer.
Note:
1. When designing a neural network, the number of nodes in the input and output layers is usually fixed by the problem, while the number of hidden-layer nodes can be chosen freely;
2. The topology and arrows in a neural-network diagram represent the flow of data during prediction, which differs from the data flow during training;
3. The key elements of the diagram are not the circles (the "neurons") but the connecting lines (the connections between neurons). Each connection carries its own weight, and these weights are what must be learned during training.
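Since each connecting line carries one trainable weight and each non-input neuron carries one trainable threshold, the parameter count of a three-layer network follows directly. A small sketch with hypothetical layer sizes:

```python
# Hypothetical layer sizes: d inputs, q hidden neurons, l outputs.
d, q, l = 4, 5, 3
weights = d * q + q * l     # one weight per connecting line
thresholds = q + l          # one threshold per hidden/output neuron
print(weights, thresholds)  # 35 8
```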

3. The BP Algorithm (Error Back-Propagation)
To train a multi-layer network, we need a more powerful learning algorithm. The BP algorithm is by far the most successful neural network learning algorithm: when neural networks are used in real-world tasks, most are trained with BP. It is worth pointing out that BP can be used not only for multi-layer feedforward networks but also for other types, such as recurrent neural networks. However, the term "BP network" usually refers to a multi-layer feedforward network trained with the BP algorithm.
Given a training set D = {(x1, y1), (x2, y2), ..., (xm, ym)}, with xk ∈ R^d and yk ∈ R^l; that is, each input example is described by d attributes, and the output is an l-dimensional real-valued vector. Consider a multi-layer feedforward network with d input neurons, l output neurons, and q hidden neurons, where:
- the threshold of the j-th output-layer neuron is θ_j;
- the threshold of the h-th hidden-layer neuron is γ_h;
- the connection weight between the i-th input neuron and the h-th hidden neuron is v_ih;
- the connection weight between the h-th hidden neuron and the j-th output neuron is w_hj;
- the input received by the h-th hidden neuron is α_h = Σ_{i=1}^{d} v_ih · x_i;
- the input received by the j-th output neuron is β_j = Σ_{h=1}^{q} w_hj · b_h, where b_h is the output of the h-th hidden neuron.
Assume both the hidden-layer and output-layer neurons use the Sigmoid function introduced above.
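The forward computation defined by this notation can be sketched in a few lines of NumPy; the layer sizes, random weights, and input below are illustrative values, not from the book:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
d, q, l = 3, 4, 2                # toy layer sizes: input / hidden / output
v = rng.normal(size=(d, q))      # v[i, h]: weight, input i -> hidden h
w = rng.normal(size=(q, l))      # w[h, j]: weight, hidden h -> output j
gamma = np.zeros(q)              # hidden-layer thresholds gamma_h
theta = np.zeros(l)              # output-layer thresholds theta_j

x = rng.normal(size=d)           # one input example
alpha = x @ v                    # alpha_h = sum_i v_ih * x_i
b = sigmoid(alpha - gamma)       # hidden outputs b_h
beta = b @ w                     # beta_j = sum_h w_hj * b_h
y_hat = sigmoid(beta - theta)    # network outputs
print(y_hat.shape)               # (2,)
```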

For a training example (x_k, y_k), suppose the network's output is ŷ_k = (ŷ_1^k, ŷ_2^k, ..., ŷ_l^k), where

ŷ_j^k = f(β_j − θ_j),    (5.3)

so the mean squared error of the network on (x_k, y_k) is

E_k = (1/2) Σ_{j=1}^{l} (ŷ_j^k − y_j^k)².    (5.4)

The network has (d + l + 1)q + l parameters to determine: d×q weights from the input layer to the hidden layer, q×l weights from the hidden layer to the output layer, q hidden-neuron thresholds, and l output-neuron thresholds. BP is an iterative learning algorithm; in each round it updates the parameters with a generalized perceptron learning rule, i.e. any parameter v is updated as v ← v + Δv.

Take the hidden-to-output connection weight w_hj as an example. BP is based on the gradient-descent strategy, adjusting parameters in the direction of the negative gradient of the error. For the error E_k in equation (5.4), given a learning rate η,

Δw_hj = −η · ∂E_k/∂w_hj.    (5.6)

Note that w_hj first affects the input β_j of the j-th output-layer neuron, then its output ŷ_j^k, and finally the error E_k, so by the chain rule

∂E_k/∂w_hj = (∂E_k/∂ŷ_j^k) · (∂ŷ_j^k/∂β_j) · (∂β_j/∂w_hj).    (5.7)

By the definition of β_j, clearly

∂β_j/∂w_hj = b_h.    (5.8)

The Sigmoid function has a very good property:

f′(x) = f(x)(1 − f(x)),    (5.9)

so, according to equations (5.4) and (5.3),

g_j = −(∂E_k/∂ŷ_j^k) · (∂ŷ_j^k/∂β_j) = ŷ_j^k (1 − ŷ_j^k)(y_j^k − ŷ_j^k).    (5.10)

Substituting (5.10) and (5.8) into (5.7), and then into (5.6), gives the BP update formula for w_hj:

Δw_hj = η g_j b_h.    (5.11)

Similarly one obtains

Δθ_j = −η g_j,    (5.12)
Δv_ih = η e_h x_i,    (5.13)
Δγ_h = −η e_h,    (5.14)

where

e_h = b_h (1 − b_h) Σ_{j=1}^{l} w_hj g_j.    (5.15)
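A single BP gradient step over one example can be sketched with these formulas; the toy sizes, random data, and learning rate below are illustrative choices. A small step in the negative gradient direction should reduce E_k:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
d, q, l, eta = 3, 4, 2, 0.05
v, w = rng.normal(size=(d, q)), rng.normal(size=(q, l))
gamma, theta = np.zeros(q), np.zeros(l)
x, y = rng.normal(size=d), np.array([0.0, 1.0])

def E_k():
    """E_k = 1/2 * sum_j (y_hat_j - y_j)^2, eq. (5.4)."""
    b = sigmoid(x @ v - gamma)
    y_hat = sigmoid(b @ w - theta)
    return 0.5 * np.sum((y_hat - y) ** 2)

e_before = E_k()

# Forward pass.
b = sigmoid(x @ v - gamma)             # hidden outputs b_h
y_hat = sigmoid(b @ w - theta)         # network outputs

# Error terms: g_j from eq. (5.10), e_h from eq. (5.15).
g = y_hat * (1 - y_hat) * (y - y_hat)
e = b * (1 - b) * (w @ g)

# Parameter updates, eqs. (5.11)-(5.14).
w += eta * np.outer(b, g)
theta -= eta * g
v += eta * np.outer(x, e)
gamma -= eta * e

print(E_k() < e_before)  # True: the step reduced the error
```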

The learning rate η controls the size of the update in each iteration of the algorithm: if it is too large, the updates tend to oscillate; if it is too small, convergence is too slow. Sometimes, for fine-tuning, formulas (5.11) and (5.12) can use one learning rate while formulas (5.13) and (5.14) use another; the two need not be equal.

4. Workflow of the BP Algorithm
First the input example is fed to the input-layer neurons, and the signal is propagated forward layer by layer until the output layer produces a result; then the output-layer error is computed and propagated back to the hidden-layer neurons; finally the connection weights and thresholds are adjusted according to the neurons' error terms. This process iterates until some stopping condition is reached, for example the training error falling below a small value.
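The whole workflow can be sketched as a training loop that applies the update formulas above once per example, per round. The XOR dataset, layer sizes, seed, and learning rate below are illustrative choices, not from the book:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # toy task: XOR
Y = np.array([[0], [1], [1], [0]], dtype=float)
d, q, l, eta = 2, 4, 1, 0.5
v, w = rng.normal(size=(d, q)), rng.normal(size=(q, l))
gamma, theta = np.zeros(q), np.zeros(l)

def mse():
    b = sigmoid(X @ v - gamma)
    return np.mean((sigmoid(b @ w - theta) - Y) ** 2)

before = mse()
for _ in range(5000):
    for x, y in zip(X, Y):                    # standard BP: one example at a time
        b = sigmoid(x @ v - gamma)            # forward pass
        y_hat = sigmoid(b @ w - theta)
        g = y_hat * (1 - y_hat) * (y - y_hat) # eq. (5.10)
        e = b * (1 - b) * (w @ g)             # eq. (5.15)
        w += eta * np.outer(b, g)             # eqs. (5.11)-(5.14)
        theta -= eta * g
        v += eta * np.outer(x, e)
        gamma -= eta * e
print(f"MSE before: {before:.4f}, after: {mse():.4f}")
```

Here the stopping condition is simply a fixed number of rounds; in practice one would stop once the training error falls below a chosen tolerance.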

Reference: "Machine Learning" by Zhou Zhihua
