This program recognizes handwritten digits 0-9 using a two-layer neural network.
1. Handwritten numbers 0-9
The training set consists of 5000 handwritten digit images (0-9), each 20*20 pixels. The figure below shows 100 random samples:
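Each training example is therefore a 400-dimensional row vector. A minimal NumPy sketch of how one row maps back to a 20*20 pixel grid, assuming (as in the course data) that MATLAB flattened each image column-major; the random matrix below is only a stand-in for the real contents of data.mat:

```python
import numpy as np

# Stand-in for the real data.mat: 5000 rows, each a flattened 20x20 image.
X = np.random.rand(5000, 400)

# MATLAB stores arrays column-major, so order='F' recovers the pixel grid.
img = X[0].reshape(20, 20, order='F')
print(img.shape)      # the 20x20 image for the first training example
print(X.shape[0])     # number of training examples
```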
2. Neural network structure
The two-layer neural network structure is shown in the figure below:
401 input-layer neurons (including 1 bias unit), 26 hidden-layer neurons (including a bias unit), and 10 output-layer neurons.
The weight matrix Theta1 is 25*401, and Theta2 is 10*26.
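These dimensions can be verified with a quick NumPy shape check; the random matrices below are placeholders for the trained Theta1 and Theta2, since only the shapes matter here:

```python
import numpy as np

m = 5000
X = np.random.rand(m, 400)          # 5000 examples, 400 pixels each
Theta1 = np.random.randn(25, 401)   # hidden layer weights (25x401)
Theta2 = np.random.randn(10, 26)    # output layer weights (10x26)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

a1 = np.hstack([np.ones((m, 1)), X])            # 5000x401 (bias column added)
z2 = a1 @ Theta1.T                              # 5000x25
a2 = np.hstack([np.ones((m, 1)), sigmoid(z2)])  # 5000x26 (bias column added)
z3 = a2 @ Theta2.T                              # 5000x10, one score per digit
print(z3.shape)
```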
3. Mathematical derivation
The mathematical derivation is similar to the derivation process in the article "Simple Two-Layer BP Neural Network - Implementing Logical AND Gate (Matlab and Python)", so I won't repeat it here.
Article Link: Simple Two-Layer BP Neural Network - Implementing Logical AND Gates (Matlab and Python)
4. Program (Matlab)
clear all;clc
% Data initialization
num_labels = 10;
% Load training data
load('data.mat');
m = size(X, 1);
% Randomly select 100 data points to display
sel = randperm(size(X, 1));
sel = sel(1:100);
figure
displayData(X(sel, :));
% Load parameters
load('weights.mat');
alpha = 3; % learning rate
number_iters = 5000; % number of training iterations
% Training iterations
for i = 1:number_iters
    % Forward propagation
    a1 = [ones(m, 1) X];                     % 5000x401
    z2 = a1 * Theta1';                       % 5000x25  (Theta1: 25x401)
    a2 = sigmoid(z2);                        % 5000x25
    a2 = [ones(m, 1) a2];                    % 5000x26
    z3 = a2 * Theta2';                       % 5000x10  (Theta2: 10x26)
    a3 = sigmoid(z3);                        % 5000x10
    h = a3;                                  % 5000x10
    u = eye(num_labels);
    y1 = u(y, :);                            % one-hot labels, 5000x10
    % Back propagation
    delta3 = a3 - y1;                        % 5000x10
    delta2 = delta3 * Theta2;                % 5000x26
    delta2 = delta2(:, 2:end);               % 5000x25 (drop the bias column)
    delta2 = delta2 .* sigmoidGradient(z2);  % 5000x25
    Delta1 = delta2' * a1;                   % 25x401 = (5000x25)' * 5000x401
    Delta2 = delta3' * a2;                   % 10x26  = (5000x10)' * 5000x26
    Theta1_grad = 1/m * Delta1;
    Theta2_grad = 1/m * Delta2;
    Theta1 = Theta1 - alpha * Theta1_grad;
    Theta2 = Theta2 - alpha * Theta2_grad;
    J(i) = 1/m * sum(sum(-y1 .* log(h) - (1 - y1) .* log(1 - h)));
end
% Plot the cost-function curve
figure
plot(J);
xlabel('number of iterations')
ylabel('Cost function in the output layer');
% Prediction
pred = predict(Theta1, Theta2, X);
fprintf('\nTraining Set Accuracy: %f\n', mean(double(pred == y)) * 100);
Output result:
Training Set Accuracy: 100.000000
Cost function curve:
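For readers without MATLAB, the same training loop can be sketched in NumPy. The block below runs on a small synthetic problem (the real data.mat and weights.mat are not reproduced here), and writes out the course helpers sigmoid and sigmoidGradient inline; the cost J should fall over the iterations just as in the plot above:

```python
import numpy as np

# Synthetic stand-ins for data.mat / weights.mat (sizes reduced for speed).
rng = np.random.default_rng(0)
m, n_in, n_hid, num_labels = 200, 400, 25, 10
X = rng.random((m, n_in))
y = rng.integers(0, num_labels, size=m)          # labels 0..9
Y = np.eye(num_labels)[y]                        # one-hot, m x 10

Theta1 = rng.standard_normal((n_hid, n_in + 1)) * 0.1   # 25x401
Theta2 = rng.standard_normal((num_labels, n_hid + 1)) * 0.1  # 10x26

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))     # course helper, inlined
alpha, iters = 1.0, 200
J = np.zeros(iters)

for i in range(iters):
    # Forward propagation
    a1 = np.hstack([np.ones((m, 1)), X])             # m x 401
    z2 = a1 @ Theta1.T                               # m x 25
    a2 = np.hstack([np.ones((m, 1)), sigmoid(z2)])   # m x 26
    h = sigmoid(a2 @ Theta2.T)                       # m x 10
    # Back propagation
    delta3 = h - Y                                   # m x 10
    # sigmoidGradient(z2) = sigmoid(z2) * (1 - sigmoid(z2)), inlined:
    delta2 = (delta3 @ Theta2)[:, 1:] * sigmoid(z2) * (1 - sigmoid(z2))
    Theta1 -= alpha / m * (delta2.T @ a1)
    Theta2 -= alpha / m * (delta3.T @ a2)
    J[i] = (-Y * np.log(h) - (1 - Y) * np.log(1 - h)).sum() / m

print(J[0], J[-1])   # the cost should decrease over the iterations
```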
Note: This content is adapted from Andrew Ng's machine learning course and its programming assignments; it will be removed upon request in case of infringement.