Artificial Intelligence and Machine Learning Course Assignment (2. Function Approximation)


This article is the second part of the artificial intelligence and machine learning course assignment (2. Function approximation)

This article is for learning reference only!


 Jump to other chapters:

1. Basics of Knowledge Engineering

2. Function approximation

3. Fuzzy logic

4. Function optimization


Table of contents

2. Function approximation

2.1 BP network

2.1.1 Principle of BP neural network

2.1.2 Nonlinear function approximation based on BP neural network

2.2 Changing BP network model parameters and analysis of approximation results

2.2.1 Change the number of hidden layers

2.2.2 Change the number of neurons in each layer

2.2.3 Changing the learning rate

2.2.4 Changing the training algorithm

2.2.5 Changing the activation function

2.2.6 Analysis of approximation results

2.3 Principle of RBF neural network

2.4 Changing RBF network model parameters and analysis of approximation results

2.4.1 Changing the radial basis expansion speed parameter

2.4.2 Change the maximum number of neurons parameter

2.4.3 Change the step size parameter

2.5 Analysis and experimental summary of approximation results of BP network and RBF network

2.6 MATLAB source code for approximating nonlinear functions based on BP and RBF neural networks

 2. Function approximation


Q: Use a BP network and an RBF network to study the approximation of the following nonlinear function:

y=(1+3x-2x^{2})e^{-x^{2}/2},\quad x\in [-10,10]

Requirements:

1. First obtain two sets of data, one as a training set and the other as a test set, use the training set to train the network, and use the test set to verify the training results;

2. Change parameters such as the number of hidden layer neurons, training algorithm, learning rate, and maximum number of iterations, and analyze their impact on the approximation accuracy;

3. Compare and analyze the approximation results of the two networks.

2.1 BP network

2.1.1 Principle of BP neural network

       The BP neural network, also called the error back-propagation neural network, is a feedforward network composed of nonlinear transformation units; the commonly used multi-layer feedforward network generally refers to the BP neural network [1].

       The BP learning algorithm is the basic method for training artificial neural networks and a very important, classic learning algorithm. Its essence is to minimize the error function, and it can be used to adjust the weights of multi-layer feedforward neural networks. The proposal of this learning algorithm greatly promoted the development of artificial neural networks.

       The information-processing capability of a neural network is determined by the input-output (activation) characteristics of its units (neurons), the topology of the network (how the neurons are connected), the sizes of the connection weights (synaptic connection strengths) and the neuron thresholds (which can be regarded as special connection weights). Neurons are the most basic components of neural networks; their structural model is shown in Figure 2-1.

 Figure 2-1 Neuron structural model

Among them, xi (i = 1, 2, ..., R) are the neuron inputs, wi (i = 1, 2, ..., R) are the connection weights between neurons, and b = w0 is the threshold (bias). If x0 = 1 is also regarded as a neuron input, then w0 can be treated as a special connection weight; f is the transfer function (activation function) and y is the neuron output, as follows:

y=f(\sum_{i=1}^{R}x_{i}w_{i}+b)

Let X=(x_{1}\ x_{2}\ \cdots\ x_{R}), W=(w_{1}\ w_{2}\ \cdots\ w_{R})^{T} and n=XW+b; then:

y=f(n)

       The transfer function f can be a linear or nonlinear function. Commonly used transfer functions include hardlim (hard limit transfer function), purelin (linear transfer function) and logsig (logarithmic transfer function).
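       As a minimal illustration (not part of the original assignment code), the neuron output y = f(XW + b) and the three transfer functions mentioned above can be evaluated directly; the input and weight values below are assumptions chosen only for this example, and the functions are written out explicitly so the snippet does not depend on any toolbox:

% Minimal sketch of a single neuron with R = 3 inputs (illustrative values)
x = [0.5; -1.2; 2.0];                 % inputs x1..xR (assumed)
w = [0.8; 0.3; -0.5];                 % connection weights w1..wR (assumed)
b = 0.1;                              % threshold (bias)
logsig_f  = @(n) 1 ./ (1 + exp(-n));  % logarithmic sigmoid transfer function
purelin_f = @(n) n;                   % linear transfer function
hardlim_f = @(n) double(n >= 0);      % hard-limit transfer function
n = w' * x + b;                       % net input n = sum(xi*wi) + b
y = logsig_f(n)                       % neuron output with f = logsig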

       The BP model is the most widely used type of neural network model. Structurally, a BP network is a typical hierarchical multi-layer network with an input layer, hidden layer(s) and an output layer. Adjacent layers are usually fully connected, while there are no connections between units within the same layer. Figure 2-2 shows a typical 3-layer BP neural network structure. A BP network is similar to a multi-layer perceptron, but the differences are also significant: the main difference is that the weights of every layer of a BP network can be adjusted through learning. Because the logsig function is differentiable, BP networks commonly use it as the transfer function.

       A BP network can be viewed as a highly nonlinear mapping from input to output, F: R^m -> R^n. For a sample set with inputs xi ∈ R^m and outputs yi ∈ R^n, it can be assumed that there exists some mapping g such that g(xi) = yi, i = 1, 2, ..., p. The task is to find a mapping f that is the best approximation of g in some sense (usually the least-squares sense). Hecht-Nielsen, building on Kolmogorov's theorem, proved the following result: any continuous function f: U^m -> R^n, where U is the closed unit interval [0,1], can be realized exactly by a 3-layer feedforward network whose first layer has m processing units, whose middle layer has 2m+1 processing units and whose third layer has n processing units.

 Figure 2-2 3-layer BP neural network structure diagram
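In the least-squares sense mentioned above, training amounts to choosing the network weights W so as to minimize the total error over the p samples, which can be written as:

E(W)=\frac{1}{2}\sum_{i=1}^{p}\left \| f(x_{i};W)-y_{i} \right \|^{2}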

2.1.2 Nonlinear function approximation based on BP neural network

       The function that needs to be approximated is:

y=(1+3x-2x^{2})e^{-x^{2}/2},\quad x\in [-10,10]

       It should be noted that the maximum number of iterations has little practical effect on this nonlinear function approximation problem: because the function is relatively simple, the analysis focuses on evaluation indicators such as MSE, SSE and RMSE rather than on computational complexity. Therefore the epochs parameter of the BP network is effectively set to unlimited, represented here by 1000000.

        The original nonlinear function is shown in Figure 2-3. The interval [-10,10] is divided into a training set and a test set in an 8:2 ratio, i.e. the training set is [-10,6) and the test set is [6,10]. It should be pointed out that, because network model parameters such as the number of hidden layers, the number of neurons, the learning rate, the training algorithm and the activation function need to be modified, a validation set is required to prevent overfitting; this is standard practice for neural networks. Therefore, when modifying the network parameters, part of the training set is split off as a validation set, giving a final ratio of training set : validation set : test set = 6:2:2.

 Figure 2-3 Original nonlinear function
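       A hedged sketch of how such a training : validation : test = 6:2:2 split can be implemented with the MATLAB neural network toolbox is given below; the index boundaries and the single 20-neuron hidden layer are assumptions for this example, not taken from the original source code:

% Sketch: 6:2:2 train/validation/test split by explicit indices (assumed boundaries)
n = 10000;
x  = linspace(-10,10,n);
fx = (1 + 3.*x - 2.*x.^2).*exp(-x.^2./2);
net = newff(x, fx, 20);                 % example BP network with one 20-neuron hidden layer
net.divideFcn = 'divideind';            % divide the data by index
net.divideParam.trainInd = 1:6000;      % training set, roughly [-10, 2)
net.divideParam.valInd   = 6001:8000;   % validation set, roughly [2, 6), used for early stopping
net.divideParam.testInd  = 8001:10000;  % test set, [6, 10]
net = train(net, x, fx);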

        The training functions in newff, their characteristics and usage scenarios are shown in Table 2-1.

Table 2-1 Training functions, their characteristics and usage scenarios

| Function name | Characteristics and usage scenarios |
|---|---|
| traingd | Basic gradient descent method; convergence is relatively slow |
| traingdm | Gradient descent with a momentum term; usually faster than traingd |
| traingdx | Adaptive-learning-rate algorithm with a momentum term; faster than traingdm |
| trainrp | Resilient (elastic) BP algorithm; fast convergence and small memory footprint |
| traincgf | Fletcher-Reeves conjugate gradient method; the conjugate gradient algorithm with the smallest storage requirements |
| traincgp | Polak-Ribiere conjugate gradient algorithm; slightly larger storage than traincgf, but converges faster on some problems |
| trainscg | Scaled conjugate gradient method; the only conjugate gradient method that does not require a line search |
| trainbfg | BFGS quasi-Newton method; requires more storage than the conjugate gradient methods and more time per iteration, but usually needs fewer iterations to converge, and is better suited to small networks |
| trainoss | One-step secant method; a compromise between the conjugate gradient methods and the quasi-Newton method |
| trainlm | Levenberg-Marquardt algorithm; the fastest training algorithm for medium-sized networks, at the cost of large memory usage. For large networks, the Jacobian matrix can be split into several sub-matrices by setting the parameter mem_reduc to 1, 2, 3, ...; the trade-off is that the system overhead becomes closely tied to computing each sub-matrix of the Jacobian |

2.2 Changing BP network model parameters and analysis of approximation results

       The model parameters to be varied are: the number of hidden layers, the number of neurons, the training algorithm, the learning rate and the activation function. Each of these parameters is changed in turn and its impact on the approximation accuracy is analyzed. The parameter combinations are listed in Table 2-2.

Table 2-2 Changing the parameters of the BP neural network model

| No. | Hidden layers | Neurons per layer | Training algorithm | Learning rate | Activation function |
|---|---|---|---|---|---|
| #1 | 1 | 10 | trainlm | 0.001 | logsig |
| #2 | 2 | 10, 10 | trainlm | 0.001 | logsig |
| #3 | 3 | 10, 10, 10 | trainlm | 0.001 | logsig |
| #4 | 1 | 20 | trainlm | 0.001 | logsig |
| #5 | 2 | 20, 10 | trainlm | 0.001 | logsig |
| #6 | 5 | 20, 20, 10, 10, 5 | trainlm | 0.01 | logsig |
| #7 | 5 | 20, 20, 10, 10, 5 | trainlm | 0.1 | logsig |
| #8 | 5 | 20, 20, 10, 10, 5 | trainscg | 0.01 | logsig |
| #9 | 5 | 20, 20, 10, 10, 5 | trainscg | 0.01 | tansig |
| #10 | 5 | 20, 20, 10, 10, 5 | trainrp | 0.01 | logsig |
| #11 | 5 | 20, 20, 10, 10, 5 | traingd | 0.01 | logsig |

       The source code of BP neural network is as follows:

%% Source code: approximating the nonlinear function with a BP network
%%
clear
clc
close all
%% Data generation
n = 10000; % total number of data points
% x = 20 * rand(n,1)-10; % generate x randomly
x = linspace(-10,10,n); % generate x
fx = (1 + 3.*x - 2.*x.^2).*exp(-x.^2./2); % generate fx
dataOrigin = [x; fx]; % original data, stored as a 2-by-n matrix
%%
figure(1)
plot(x,fx,'LineWidth',2)
grid on
title('Original nonlinear function')
xlabel('X Label')
ylabel('Y Label')
%% Split the dataset
X_train = x(1:8000);
X_test = x(8001:end);
Y_train = fx(1:8000);
Y_test = fx(8001:end);
disp('1*************************************************')
epochs=1000000;
goal = 1e-7;
mid = [20];
transfun = {'logsig'};
stra = 'trainlm';
lr = 0.001;
[netALL] = trainAndModel(X_train,Y_train,epochs,goal,mid,transfun,stra,lr); % trainAndModel: user-defined helper that builds and trains the network (see the sketch after this listing)
% test
SimYALL = sim(netALL,X_test);
% cal score
deltaY = Y_test - SimYALL;
fprintf('Maximum absolute error (absolute error bound): ')
[MaxindeltaY,p1] = max(abs(deltaY))
deltaToY = deltaY./Y_test;
fprintf('Maximum relative error (relative error bound): ')
[MaxindeltaToY,p2] = max(abs(deltaToY))
% Mean squared error (MSE)
fprintf('MSE: ')
test_mse = mse(deltaY)
% Sum of squared errors (SSE)
fprintf('SSE: ')
test_sse = test_mse*n % SSE scaled by the total sample size n, consistent with the reported tables
% Root mean squared error (RMSE)
fprintf('RMSE: ')
test_rmse = sqrt(test_mse)
%%
figure(2)
hold on
plot(X_test,Y_test,'LineWidth',1.5)
plot(X_test,SimYALL,'LineWidth',1.5)
plot(X_train,Y_train,'LineWidth',1.5)
legend('Ideal nonlinear function curve','BP approximation curve','Training portion')
title('BP network training results')
xlabel('X Label')
ylabel('Y Label')
grid on
axes('Position',[0.54 0.17 0.34 0.33])
hold on
plot(X_test,Y_test,'LineWidth',2)
plot(X_test,SimYALL,'LineWidth',2)
grid on
axis([6 10 -0.00006 0.0002])
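       The helper function trainAndModel called above is not included in the original post. The following is a minimal sketch of what it might look like, assuming the classic newff interface of the MATLAB neural network toolbox; the appended 'purelin' output layer and the try/catch around the learning rate are assumptions, not the author's original implementation:

function net = trainAndModel(X_train,Y_train,epochs,goal,mid,transfun,stra,lr)
% Hypothetical reconstruction of the helper used above.
% mid      : vector of hidden-layer sizes, e.g. [20] or [20 20 10 10 5]
% transfun : cell array holding a single hidden-layer transfer function, e.g. {'logsig'}
% stra     : training function name, e.g. 'trainlm'
% lr       : learning rate
tf  = [repmat(transfun,1,numel(mid)), {'purelin'}]; % assume a linear output layer
net = newff(X_train, Y_train, mid, tf, stra);       % build the feedforward network
net.trainParam.epochs = epochs;                     % maximum number of iterations
net.trainParam.goal   = goal;                       % MSE goal
try
    net.trainParam.lr = lr;                         % learning rate (gradient-based algorithms)
catch
    % algorithms such as trainlm do not expose a learning-rate parameter
end
net = train(net, X_train, Y_train);                 % train with the chosen algorithm
end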

2.2.1 Change the number of hidden layers

       According to the network model parameters numbered #1, #2 and #3 in Table 2-2, the number of hidden layers is set to 1, 2 and 3 respectively, the number of neurons in each layer is 10, the training algorithm is trainlm, the learning rate is 0.001 and the activation function is logsig. The mean square error (MSE) is used as the optimization index and is evaluated on the validation set. The performance results are as follows:

 Figure 2-4 #1 model parameter training results

 Figure 2-5 #2 model parameter training results

 

 Figure 2-6 #3 model parameter training results

2.2.2 Change the number of neurons in each layer

       As shown in Table 2-2, for number #1, change the number of neurons to 20 and keep other parameters unchanged. The training results can be obtained as shown in Figure 2-7.

 Figure 2-7 #4 model parameter training results

       Similarly, for number #2, only change the number of neurons in the first hidden layer to 20, and keep other parameters unchanged. The training results can be obtained as shown in Figure 2-8.

 Figure 2-8 #5 model parameter training results

2.2.3 Changing the learning rate

       As shown by numbers #6 and #7 in Table 2-2, the number of hidden layers is set to 5, the numbers of neurons are 20, 20, 10, 10, 5, and the learning rates are 0.01 and 0.1 respectively; the training algorithm and activation function are trainlm and logsig. The training results are shown in Figure 2-9 and Figure 2-10.

 Figure 2-9 #6 model parameter training results

 Figure 2-10 #7 model parameter training results

2.2.4 Change the training algorithm

       As shown by the model parameters numbered #6, #8, #10 and #11 in Table 2-2, the hidden layers are all set to 5, the numbers of neurons are 20, 20, 10, 10, 5 in order, and the learning rates are all 0.01. The training algorithms are trainlm, trainscg, trainrp and traingd respectively, and the activation functions are all logsig. As described in Table 2-1, these are the Levenberg-Marquardt algorithm (the fastest training algorithm for medium-sized networks), the scaled conjugate gradient method (the only conjugate gradient method that does not require a line search), the resilient (elastic) BP algorithm (fast convergence and small memory usage) and the basic gradient descent method. The training results are shown in Figure 2-9, Figure 2-11, Figure 2-13 and Figure 2-14.

2.2.5 Changing the activation function

       As shown by the model parameters numbered #8 and #9 in Table 2-2, the hidden layers are set to 5, the numbers of neurons are 20, 20, 10, 10, 5, the learning rate is 0.01 and the training algorithm is trainscg; the activation functions are logsig and tansig respectively. The training results are shown in Figure 2-11 and Figure 2-12.

 Figure 2-11 #8 model parameter training results

 Figure 2-12 #9 model parameter training results

 Figure 2-13 #10 model parameter training results

 Figure 2-14 #11 model parameter training results

2.2.6 Analysis of approximation results

       The number of iterations and training time of different BP network model parameters are shown in Table 2-3. The results of MSE, SSE, RMSE, maximum relative error and maximum absolute error are shown in Table 2-4.

Table 2-3 Results of iteration times and training duration of different BP network models

| No. | Iterations | Training time (seconds) |
|---|---|---|
| #1 | 849 | 5 |
| #2 | 216 | 42 |
| #3 | 289 | 81 |
| #4 | 88 | < 1 |
| #5 | 283 | 73 |
| #6 | 460 | 176 |
| #7 | 413 | 112 |
| #8 | 443 | 35 |
| #9 | 621 | 71 |
| #10 | 937 | 243 |
| #11 | 10000+ | 1000+ |

Table 2-4 MSE, SSE, RMSE and other results of different BP network models

| No. | MSE | SSE | RMSE | Max absolute error | Max relative error |
|---|---|---|---|---|---|
| #1 | 0.9255 | 9255.0110 | 0.9620 | 1.7650 | 5.4148e+19 |
| #2 | 2.3173e-05 | 0.2317 | 0.0048 | 0.0056 | 1.4902e+17 |
| #3 | 1.7493 | 1.7493e+04 | 1.3226 | 1.5567 | 4.7757e+19 |
| #4 | 8.3655e-09 | 8.3655e-05 | 9.1463e-05 | 9.1786e-05 | 2.8150e+15 |
| #5 | 0.7609 | 7.6091e+03 | 0.8723 | 0.9602 | 2.9458e+19 |
| #6 | 0.0486 | 485.7135 | 0.2204 | 0.2393 | 7.3422e+18 |
| #7 | 0.0644 | 644.3497 | 0.2538 | 0.2780 | 8.5292e+18 |
| #8 | 0.0040 | 40.4638 | 0.0636 | 0.0658 | 2.0186e+18 |
| #9 | 0.1512 | 1.5121e+03 | 0.3889 | 0.4072 | 1.2492e+19 |
| #10 | 6.0798e-09 | 6.0798e-05 | 7.7973e-05 | 7.8006e-05 | 2.3931e+15 |
| #11 | 0.0055 | 54.8824 | 0.0741 | 0.0746 | 2.2898e+18 |

       In summary, the best-performing network is #10, with an MSE of 6.0798e-09, an SSE of 6.0798e-05, an RMSE of 7.7973e-05 and a training time of 243 seconds. However, network #4 achieves similar indicators (MSE 8.3655e-09, SSE 8.3655e-05, RMSE 9.1463e-05) with a training time of less than 1 second, so #4 is chosen as the final BP neural network. The optimal and suboptimal BP model parameters are therefore: #4 model: 1 hidden layer, 20 neurons, training algorithm trainlm, learning rate 0.001, activation function logsig; #10 model: 5 hidden layers, neurons 20 20 10 10 5, training algorithm trainrp, learning rate 0.01, activation function logsig.

       The optimal BP network model training and testing results are shown in the figure below.

 Figure 2-15 Optimal BP network model training and test results

2.3 Principle of RBF neural network

       In the 1980s, J. Moody and C. Darken proposed the radial basis function (RBF) network, a three-layer feedforward neural network [2]. Its theoretical basis is that radial basis functions serve as the basis of the hidden-layer neurons and span the hidden space: the hidden layer transforms the input vector from a low-dimensional space into a high-dimensional space, so that a problem that cannot be solved in the low-dimensional space becomes easy to solve in the high-dimensional space. Radial basis function networks have clear advantages on both the input and output sides, and because of these advantages they have been widely used in fields such as function approximation, pattern recognition and prediction.

       The radial basis function network is a three-layer feedforward neural network. The input layer receives external information; the hidden layer performs the transformation from the input space to the hidden space and realizes the nonlinear conversion; the output layer produces the final result. Through this structure, information passes from the input layer to the hidden layer, undergoes a nonlinear transformation there, and is finally mapped to the output layer in a linear form.

       A radial basis function is symmetric about its center point: neurons far from the center show very low activity, and this activity decreases further as the distance increases. There are many forms of radial basis functions, the most commonly used being the Gaussian function.
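       As an illustrative sketch (the center and width values are assumptions, not from the original code), the Gaussian radial basis function and the toolbox transfer function radbas, which is defined as radbas(n) = exp(-n^2), can be compared directly:

% Gaussian radial basis function: activity decays with distance from the center c
c     = 0;                                      % center of the basis function (assumed)
sigma = 1;                                      % width / spread parameter (assumed)
phi   = @(x) exp(-((x - c).^2) ./ (2*sigma^2)); % Gaussian form
xg = linspace(-5,5,200);
plot(xg, phi(xg), xg, radbas(xg - c), '--')     % radbas corresponds to sigma = 1/sqrt(2)
legend('Gaussian \phi(x)','radbas(x - c)')
grid on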

 Figure 2-16 RBF neural network structure

2.4 Changing RBF network model parameters and analysis of approximation results

       I use the newrb function to achieve this. In newrb, the following parameters can be adjusted:

Table 2-5 newrb parameter table

newrb(P, T, goal, spread, MN, DF)

| Parameter | Meaning |
|---|---|
| P | Input variable matrix |
| T | Output matrix (target/label values) |
| goal | Mean square error goal |
| spread | Radial basis expansion speed |
| MN | Maximum number of neurons; training stops once the number of neurons reaches MN |
| DF | Step size: the number of neurons added between progress displays |
       First, determine the mean square error goal. To fit the nonlinear function adequately, the ideal MSE would be 0; however, to prevent overfitting, the goal is set to 1e-16. The RBF network parameters are then varied as shown in Table 2-6.

Table 2-6 Different RBF neural network model parameters

| Spread | MN | DF | Spread | MN | DF | Spread | MN | DF |
|---|---|---|---|---|---|---|---|---|
| 1 | 26 | 1 | 1 | 10 | 1 | 1 | 90 | 1 |
| 2 | 26 | 1 | 1 | 14 | 1 | 1 | 90 | 2 |
| 3 | 26 | 1 | 1 | 18 | 1 | 1 | 90 | 3 |
| 4 | 26 | 1 | 1 | 22 | 1 | 1 | 90 | 5 |
| 5 | 26 | 1 | 1 | 26 | 1 | 1 | 90 | 9 |
| 6 | 26 | 1 | 1 | 30 | 1 | 1 | 90 | 10 |
| 7 | 26 | 1 | 1 | 38 | 1 | 1 | 90 | 15 |
| 8 | 26 | 1 | 1 | 50 | 1 | 1 | 90 | 30 |
| 9 | 26 | 1 | 1 | 70 | 1 | 1 | 90 | 45 |
| 10 | 26 | 1 | 1 | 90 | 1 | 1 | 90 | 90 |

2.4.1 Change the radial basis expansion speed parameter

       The RBF neural network source code for changing the parameter radial basis expansion speed spread is as follows:

%% RBF network source code for varying the spread parameter
%% RBF spread
goal_rbf = 1e-16;
spread_rbf = 1;
MN_rbf = 26;
DF_rbf = 1;
SimRBF = zeros(10,2000);
for i = 1:10
    netRBF = newrb(X_train,Y_train,goal_rbf,spread_rbf,MN_rbf,DF_rbf);
    SimRBF(i,:) = sim(netRBF,X_test);
    spread_rbf = spread_rbf + 1;
end

       As with the BP neural network, MSE is used as the objective function, and the radial basis expansion speed is set to 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10. The training MSE curves of the RBF networks with spread = 1, 2, 3 are shown in Figure 2-17, and the test-set results of all 10 cases are shown in Figure 2-18.

Figure 2-17 RBF neural network training MSE curve with spread=1, 2, 3

 Figure 2-18 Nonlinear function curve fitting results of RBF neural network test set with different spreads

       According to Figure 2-18, it is obvious that the results diverge when spread = 2, 3, 4, 5, 6, 7, 8, 9 or 10, and the MSE curve converges only when spread = 1. Therefore only spread = 1 is used as the radial basis expansion speed when the other parameters are varied.

2.4.2 Change the maximum number of neurons parameter

       The RBF neural network source code for changing the maximum number of neurons MN in parameters is as follows:

%% RBF network source code for varying the MN parameter
%% RBF MN
matrix_MN = [10,14,18,22,26,30,38,50,70,90];
spread_rbf = 1;
for i = 1:10
    DF_rbf = 1;
    MN_rbf = matrix_MN(i);
    netRBF_mn = newrb(X_train,Y_train,goal_rbf,spread_rbf,MN_rbf,DF_rbf);
    SimRBF_MN(i,:) = sim(netRBF_mn,X_test);
end

       The maximum number of neurons is set to 10, 14, 18, 22, 26, 30, 38, 50, 70 and 90 respectively. The training MSE curves of the RBF networks with MN = 50, 70 and 90 are shown in Figure 2-19, and the test-set results of the 10 cases are shown in Figure 2-20.

Figure 2-19 RBF neural network training MSE curve with MN=50, 70, and 90

Figure 2-20 Nonlinear function curve fitting results of RBF neural network test set for different MNs

       The MSE results of the different RBF network models are shown in Table 2-7. It can be observed that, within a certain range, the larger MN is, the smaller the MSE of the RBF model and the better the network. The optimal MN is therefore 90, with an MSE of 1.83317e-11.

Table 2-7 MSE results of different RBF neural network models

| MN | MSE |
|---|---|
| 10 | 2.18221e-06 |
| 14 | 2.07757e-08 |
| 18 | 2.12097e-10 |
| 22 | 7.57382e-11 |
| 26 | 7.57828e-11 |
| 30 | 5.99356e-11 |
| 38 | 5.95365e-11 |
| 50 | 3.70175e-11 |
| 70 | 1.8505e-11 |
| 90 | 1.83317e-11 |
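       The MSE values in Table 2-7 can be obtained from the simulation results; a minimal sketch, assuming SimRBF_MN, Y_test and matrix_MN from the code above:

% Test-set MSE for each MN value (Table 2-7)
mse_MN = zeros(10,1);
for i = 1:10
    mse_MN(i) = mse(Y_test - SimRBF_MN(i,:));
end
[bestMSE, idx] = min(mse_MN);
fprintf('Best MN = %d, test MSE = %.5e\n', matrix_MN(idx), bestMSE);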

2.4.3 Change the step size parameter

       The maximum number of neurons MN = 90 is selected as the optimal parameter value. Finally, different values of the step size parameter DF are given to analyze their impact on the prediction results. The RBF neural network source code for changing the parameter DF is as follows:

%% RBF network source code for varying the DF parameter
%% RBF DF
matrix_DF = [1,2,3,5,9,10,15,30,45,90]; % DF (step size) values to test
for i = 1:10
    DF_rbf = matrix_DF(i);
    MN_rbf = 90;                        % optimal maximum number of neurons from Section 2.4.2
    netRBF_df = newrb(X_train,Y_train,goal_rbf,spread_rbf,MN_rbf,DF_rbf);
    SimRBF_DF(i,:) = sim(netRBF_df,X_test);
end

       The RBF neural network performance curves with step sizes set to: 1, 2, and 3 are shown in Figure 2-21.

 Figure 2-21 RBF neural network training MSE curve with DF=1, 2, 3

       As can be seen from the figure above, different DF values have little impact on the RBF training MSE curve, so DF = 1 is used for more convenient visualization. The optimal RBF network model parameters are therefore: radial basis expansion speed spread = 1, maximum number of neurons MN = 90, and step size DF = 1. Figure 2-22 shows the test-set results of the optimal RBF network model.

 Figure 2-22 The optimal RBF network test set approximates the nonlinear function curve
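       As a brief sketch (assuming the data split from the source code above), the optimal RBF model can be rebuilt and scored on the test set as follows:

% Rebuild the optimal RBF network: goal = 1e-16, spread = 1, MN = 90, DF = 1
netRBF_best  = newrb(X_train, Y_train, 1e-16, 1, 90, 1);
SimRBF_best  = sim(netRBF_best, X_test);       % test-set predictions
rbf_test_mse = mse(Y_test - SimRBF_best)       % test-set MSE of the optimal RBF model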

2.5 Analysis and experimental summary of approximation results of BP network and RBF network

       The best-performing BP neural network model and RBF neural network model were tested in the test set respectively, and the obtained nonlinear function approximation results are shown in Figure 2-23.

 Figure 2-23 Comparison of optimal BP and RBF network test set approximation nonlinear function curves

Result analysis and experimental summary

       1. The RBF neural network program is noticeably simpler than that of the BP neural network, and its training is simpler as well.

       2. For this nonlinear function approximation problem, because the problem is very simple, the training times of the optimal BP network and the optimal RBF network are both very short, so the complexity of the two models cannot be compared directly. Taking the effort of tuning model parameters into account, the better model here is the RBF network; in practice, however, this conclusion is debatable, because the BP network parameters could be optimized further. The characteristics of the two network models are summarized as follows:

       A) The BP neural network has the following characteristics: 1. with enough hidden layers and hidden-layer nodes, a BP network can approximate any nonlinear mapping; 2. the BP learning algorithm is a global approximation algorithm with strong generalization ability; 3. all weights of the network must be re-adjusted for every training sample, the convergence speed is slow, and it easily falls into local minima, so it is not well suited to real-time control.

       B) The RBF neural network has the following characteristics: 1. an RBF network can approximate any continuous function to arbitrary precision; 2. unlike the BP network, the RBF network is a local approximation network; 3. the mapping from input to hidden layer is nonlinear while the mapping from hidden layer to output is linear, which greatly speeds up learning and helps avoid local minima, making it suitable for real-time control.

       C) The BP network and the RBF network share the following features: strong robustness and adaptability, and the numbers of hidden layers and hidden-layer nodes are not determined in advance, so they must be chosen by trial and error based on experience.

       3. Finally, the optimal BP neural network parameters are: 1 hidden layer, 20 neurons, training algorithm trainlm, learning rate 0.001, activation function logsig; the performance indices are MSE = 8.3655e-09, SSE = 8.3655e-05 and RMSE = 9.1463e-05. The optimal RBF neural network parameters are: radial basis expansion speed spread = 1, maximum number of neurons MN = 90, step size DF = 1; the performance index is MSE = 1.83317e-11.

2.6 MATLAB source code for approximating nonlinear functions based on BP and RBF neural networks

       The source code for approximating nonlinear functions based on BP and RBF neural networks is as follows:

%% Source code: approximating the nonlinear function with BP and RBF networks
%%
clear
clc
close all
%% Data generation
n = 10000; % total number of data points
% x = 20 * rand(n,1)-10; % generate x randomly
x = linspace(-10,10,n); % generate x
fx = (1 + 3.*x - 2.*x.^2).*exp(-x.^2./2); % generate fx
dataOrigin = [x; fx]; % original data, stored as a 2-by-n matrix
%%
figure(1)
plot(x,fx,'LineWidth',2)
grid on
title('Original nonlinear function')
xlabel('X Label')
ylabel('Y Label')
%% Split the dataset
X_train = x(1:8000);
X_test = x(8001:end);
Y_train = fx(1:8000);
Y_test = fx(8001:end);
%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% train
disp('1*************************************************')
epochs=1000000;
goal = 1e-7;
mid = [20];
transfun = {'logsig'};
stra = 'trainlm';
lr = 0.001;
[netALL] = trainAndModel(X_train,Y_train,epochs,goal,mid,transfun,stra,lr); % user-defined helper (see the sketch in Section 2.2)
% test
SimYALL = sim(netALL,X_test);
% cal score
deltaY = Y_test - SimYALL;
fprintf('Maximum absolute error (absolute error bound): ')
[MaxindeltaY,p1] = max(abs(deltaY))
deltaToY = deltaY./Y_test;
fprintf('Maximum relative error (relative error bound): ')
[MaxindeltaToY,p2] = max(abs(deltaToY))
% Mean squared error (MSE)
fprintf('MSE: ')
test_mse = mse(deltaY)
% Sum of squared errors (SSE)
fprintf('SSE: ')
test_sse = test_mse*n % SSE scaled by the total sample size n, consistent with the reported tables
% Root mean squared error (RMSE)
fprintf('RMSE: ')
test_rmse = sqrt(test_mse)
%%
figure(2)
hold on
plot(X_test,Y_test,'LineWidth',1.5)
plot(X_test,SimYALL,'LineWidth',1.5)
plot(X_train,Y_train,'LineWidth',1.5)
legend('Ideal nonlinear function curve','BP approximation curve','Training portion')
title('BP network training results')
xlabel('X Label')
ylabel('Y Label')
grid on
axes('Position',[0.54 0.17 0.34 0.33])
hold on
plot(X_test,Y_test,'LineWidth',2)
plot(X_test,SimYALL,'LineWidth',2)
grid on
axis([6 10 -0.00006 0.0002])

%% RBF spread
goal_rbf = 1e-16;
spread_rbf = 1;
MN_rbf = 26;
DF_rbf = 1;
SimRBF = zeros(10,2000);
for i = 1:10
    netRBF = newrb(X_train,Y_train,goal_rbf,spread_rbf,MN_rbf,DF_rbf);
    SimRBF(i,:) = sim(netRBF,X_test);
    spread_rbf = spread_rbf + 1;
end

% goal_rbf = 0;
% spread_rbf = 1;
% MN_rbf = 26;
% DF_rbf = 1;
% netRBF = newrbe(X_train,Y_train);
% SimRBF = netRBF(X_test);
%%
figure(3)
hold on
% plot(x,fx)
for i = 1:10
    plot(X_test,SimRBF(i,:),'LineWidth',1)
end
grid on
legend('spread = 1','spread = 2','spread = 3','spread = 4','spread = 5','spread = 6','spread = 7','spread = 8','spread = 9','spread = 10')
title('RBF test-set curves for different spread values')
xlabel('X Label')
ylabel('Y Label')
%%
figure(4)
hold on
for j = 1:9
    subplot(3,3,j)
    plot(X_test,SimRBF(j,:),'LineWidth',1.5)
    grid on
    title(j)
end
%% best spread = 1
SimRBF_best_spread = SimRBF(1,:);
%% RBF MN
matrix_MN = [10,14,18,22,26,30,38,50,70,90];

spread_rbf = 1;
for i = 1:10
    DF_rbf = 1;
    MN_rbf = matrix_MN(i);
    netRBF_mn = newrb(X_train,Y_train,goal_rbf,spread_rbf,MN_rbf,DF_rbf);
    SimRBF_MN(i,:) = sim(netRBF_mn,X_test);
end
%%
figure(5)
hold on
% plot(x,fx)
for i = 1:10
    plot(X_test,SimRBF_MN(i,:),'LineWidth',1)
end
grid on
legend('MN = 10','MN = 14','MN = 18','MN = 22','MN = 26','MN = 30','MN = 38','MN = 50','MN = 70','MN = 90')
title('RBF test-set curves for different MN values')
xlabel('X Label')
ylabel('Y Label')
%%
figure(6)
hold on
for j = 1:9
    subplot(3,3,j)
    plot(X_test,SimRBF_MN(j+1,:),'LineWidth',1.5)
    grid on
    title(matrix_MN(j+1))
end
%% best MN = 90 (last entry of matrix_MN)
SimRBF_best_mn = SimRBF_MN(10,:);
%% RBF DF
matrix_DF = [1,2,3,5,9,10,15,30,45,90];
for i = 1:10
    DF_rbf = matrix_DF(i);
    MN_rbf = 90;
    netRBF_df = newrb(X_train,Y_train,goal_rbf,spread_rbf,MN_rbf,DF_rbf);
    SimRBF_DF(i,:) = sim(netRBF_df,X_test);
end
%% best DF = 1
SimRBF_best_df = SimRBF_DF(1,:);
%% Compare
figure(7)
hold on
plot(X_test,Y_test,'-.','LineWidth',1)
plot(X_test,SimYALL,'LineWidth',1.5)
plot(X_test,SimRBF_best_mn,'LineWidth',1)
plot(X_train,Y_train,'--','LineWidth',1)
legend('Ideal nonlinear function curve','BP approximation curve','RBF approximation curve','Training portion')
title('BP and RBF network training results')
xlabel('X Label')
ylabel('Y Label')
grid on
axes('Position',[0.54 0.17 0.34 0.33])
hold on
plot(X_test,Y_test,'-.','LineWidth',1)
plot(X_test,SimYALL,'LineWidth',1.5)
plot(X_test,SimRBF_best_mn,'LineWidth',1)
grid on
axis([6 10 -5*10^(-5) 15*10^(-5)])

Other chapters of the Artificial Intelligence and Machine Learning Course Assignment: 1. Basics of Knowledge Engineering ; 3. Fuzzy Logic ; 4. Function Optimization


Source: blog.csdn.net/HISJJ/article/details/130404679