Data Mining (6.1)--Neural Network

Table of contents

Introduction to Neural Networks

BP algorithm

Fundamentals of the Delta Learning Rule

Structure of BP Neural Network 

Algorithm Description of BP Neural Network

General steps of neural network training

The main steps of the backpropagation algorithm 

Advantages and disadvantages

Simple example of BP algorithm


Introduction to Neural Networks

A neural network is a computational model inspired by the way biological neural networks in the human brain process information. An artificial neural network (ANN) is also commonly called simply a neural network (NN).

A neural network is composed of many neurons; each neuron has inputs and an output, and neurons are connected to one another by weights. As input data passes through the network, each neuron computes its output from a weighted sum of the outputs of the neurons feeding into it.
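As a minimal sketch of this idea (the input values, weights, and bias below are made up for illustration), a single neuron can be written in a few lines of NumPy:

import numpy as np

def neuron(x, w, b):
    # A single neuron: weighted sum of the inputs plus a bias,
    # passed through a sigmoid activation
    z = np.dot(w, x) + b
    return 1 / (1 + np.exp(-z))

# Illustrative values: two inputs, two weights, one bias
x = np.array([0.5, -1.2])
w = np.array([0.8, 0.3])
print(neuron(x, w, b=0.1))  # a value between 0 and 1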

BP algorithm

The BP algorithm (Back Propagation algorithm) is a commonly used algorithm for training feed-forward neural networks in supervised learning.

The BP algorithm is based on the principle of error backpropagation: it computes each neuron's contribution to the output error, propagates that error backward from the output layer to the input layer, and updates the weights and biases layer by layer so as to minimize the error between the predicted results and the actual results.

Fundamentals of the Delta Learning Rule

The Delta learning rule, also known as the gradient method or the method of steepest descent, changes the connection weights between units so as to reduce the error between the system's actual output and its expected output.
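Below is a minimal sketch of the Delta rule for a single linear unit; the input, target, and learning rate are illustrative values, not taken from this text:

import numpy as np

x = np.array([1.0, 0.5])   # input vector
w = np.array([0.2, -0.1])  # connection weights
target, lr = 1.0, 0.1      # expected output and learning rate

for _ in range(50):
    output = np.dot(w, x)            # actual output of the unit
    w += lr * (target - output) * x  # adjust weights against the error gradient

print(np.dot(w, x))  # output has moved close to the target of 1.0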

Structure of BP Neural Network 

The BP neural network is a layered neural network with three or more layers, consisting of an input layer, one or more hidden layers, and an output layer. Neurons in adjacent layers are fully interconnected, while neurons in the same layer are not connected.
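Because adjacent layers are fully interconnected, the connections between each pair of adjacent layers can be stored as a single weight matrix. A small sketch with illustrative layer sizes (4 inputs, 5 hidden units, 3 outputs):

import numpy as np

n_in, n_hidden, n_out = 4, 5, 3  # illustrative layer sizes

# One weight matrix per pair of adjacent layers; entry (i, j) connects
# unit i of the lower layer to unit j of the upper layer
W1 = np.random.randn(n_in, n_hidden)   # input layer -> hidden layer
W2 = np.random.randn(n_hidden, n_out)  # hidden layer -> output layer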

Algorithm Description of BP Neural Network

The main idea of the BP algorithm is to propagate the error of the output layer backward, layer by layer, in order to indirectly compute the error of each hidden layer.

The algorithm is divided into two phases:

In the first stage (the forward propagation process), the input information is passed from the input layer through the hidden layers, computing the output value of each unit layer by layer.

In the second stage (the back propagation process), the output error is computed and propagated backward layer by layer: the error of each hidden-layer unit is computed in turn, and these errors are used to correct the weights of the preceding layer.

Specifically, the BP algorithm includes the following steps:

  1. Initialize model parameters (such as neuron weights and bias values).

  2. Forward propagation computes the output of the network.

  3. Calculate the error between the output of the network and the true result.

  4. Following the principle of error backpropagation, compute each neuron's contribution to the error, and update each neuron's weights and bias using gradient descent.

  5. Repeat the above steps until a certain number of iterations is reached or the loss function converges. 

General steps of neural network training

Initialize weights

Feed the input tuples into the network one by one

For each input tuple, the following process is performed:

The net input of each unit is computed as a weighted linear combination of the unit's inputs (the outputs of the previous layer)

Use the activation function to calculate the output value

Update the weights and thresholds (biases)

The main steps of the backpropagation algorithm 

①Initialize the weights with random values

②Propagate the input forward: for each hidden or output unit, compute its net input and net output, and finally obtain the prediction result

Net input to unit j:

I_j = \sum_i w_{ij} O_i + \theta_j

Suppose j is a hidden or output unit. The input of unit j comes from the outputs of the previous layer: w_{ij} is the connection weight from unit i in the previous layer to unit j, O_i is the output of unit i in the previous layer, and \theta_j is the bias of unit j.

Net output of unit j:

O_j = \frac{1}{1 + e^{-I_j}}
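These two formulas translate directly into NumPy; the values below are illustrative:

import numpy as np

O_prev = np.array([0.2, 0.8, 0.5])   # outputs O_i of the previous layer
w_j = np.array([0.4, -0.6, 0.1])     # weights w_ij into unit j
theta_j = 0.3                        # bias of unit j

I_j = np.dot(w_j, O_prev) + theta_j  # net input: sum_i w_ij * O_i + theta_j
O_j = 1 / (1 + np.exp(-I_j))         # net output: sigmoid of the net input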

③Backpropagate the error, updating the weights and thresholds (starting from the output layer).

The error of unit j in the output layer:

Err_j = O_j (1 - O_j) (T_j - O_j)

where T_j is the true (target) value of unit j.

The error of hidden layer unit j:

Err_j = O_j (1 - O_j) \sum_k Err_k w_{jk}

where w_{jk} is the connection weight from unit j to unit k in the next higher layer, and Err_k is the error of unit k.

Weight and bias updates:

\Delta w_{ij} = l \cdot Err_j \cdot O_i, \qquad w_{ij} = w_{ij} + \Delta w_{ij}

\Delta \theta_j = l \cdot Err_j, \qquad \theta_j = \theta_j + \Delta \theta_j

where \Delta w_{ij} is the change in weight w_{ij}, and l is the learning rate, usually a constant between 0.0 and 1.0. An empirical rule is to set the learning rate to 1/t, where t is the number of iterations through the training set so far.
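A worked micro-example of these update rules for a single output unit (all numbers are illustrative):

O_i, O_j, T_j, l = 0.6, 0.7, 1.0, 0.9  # previous-layer output, unit output, target value, learning rate

Err_j = O_j * (1 - O_j) * (T_j - O_j)  # output-unit error: 0.7 * 0.3 * 0.3 = 0.063
dw_ij = l * Err_j * O_i                # weight change: 0.9 * 0.063 * 0.6 = 0.03402
dtheta_j = l * Err_j                   # bias change: 0.9 * 0.063 = 0.0567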

Advantages and disadvantages

Advantages

  • Nonlinear mapping capability
  • Self-learning and adaptive ability
  • Generalization ability
  • Fault tolerance

Disadvantages

  • No unified rule for choosing the network structure
  • Local minima and slow convergence
  • The contradiction between predictive ability and training ability
  • Dependence on the training samples

Simple example of BP algorithm

import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Define the sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define the BP algorithm
def backpropagation(X, y, hidden_nodes, output_nodes, learning_rate, epochs):
    # Initialize the weights
    input_nodes = X.shape[1]
    hidden_weights = np.random.normal(scale=1 / input_nodes ** 0.5, size=(input_nodes, hidden_nodes))
    output_weights = np.random.normal(scale=1 / hidden_nodes ** 0.5, size=(hidden_nodes, output_nodes))

    # Training loop
    for i in range(epochs):
        # Forward propagation
        hidden_inputs = np.dot(X, hidden_weights)
        hidden_outputs = sigmoid(hidden_inputs)
        output_inputs = np.dot(hidden_outputs, output_weights)
        output_outputs = sigmoid(output_inputs)

        # Compute the errors
        output_error = y - output_outputs
        output_delta = output_error * output_outputs * (1 - output_outputs)

        hidden_error = np.dot(output_delta, output_weights.T)
        hidden_delta = hidden_error * hidden_outputs * (1 - hidden_outputs)

        # Update the weights
        output_weights += learning_rate * np.dot(hidden_outputs.T, output_delta)
        hidden_weights += learning_rate * np.dot(X.T, hidden_delta)

    return output_outputs

# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Preprocess the data
scaler = StandardScaler()
X = scaler.fit_transform(X)
# Convert the labels to one-hot encoding
y_onehot = np.zeros((y.size, y.max()+1))
y_onehot[np.arange(y.size), y] = 1

# Train the model
output = backpropagation(X, y_onehot, hidden_nodes=5, output_nodes=3, learning_rate=0.1, epochs=10000)

# Predicted classes (index of the largest output per row)
y_pred = np.argmax(output, axis=1)

# Compute the training-set accuracy
accuracy = np.mean(y_pred == y)
print('Accuracy:', accuracy)

 Output result:

Accuracy: 0.9933333333333333
