NN (Neural Network) in deep learning

Based on my own understanding, here is a brief introduction to what each function in the NN module of the DeepLearn Toolbox does; there may be errors in some wording or in my understanding. Just writing casually for fun ( ⊙ o ⊙ )!

Reference blogs: a detailed code explanation, and a brief introduction to forward propagation and back propagation.

This toolbox download link: click to download

First you need to know how to train a NN. Open the test_example_NN.m file under the tests folder; the comment "vanilla neural net" on line 13 marks the plain, original neural network, and the code on lines 15 to 20 trains a NN.
The following code is a generic network model; here we call it the main program:

load mnist_uint8;
train_x = double(train_x) / 255;
test_x  = double(test_x)  / 255;
train_y = double(train_y);
test_y  = double(test_y);

% normalize
[train_x, mu, sigma] = zscore(train_x);
test_x = normalize(test_x, mu, sigma);

%% ex1 vanilla neural net
rand('state',0)
nn = nnsetup([784 100 10]);
opts.numepochs =  1;   %  Number of full sweeps through data
opts.batchsize = 100;  %  Take a mean gradient step over this many samples
[nn, L] = nntrain(nn, train_x, train_y, opts);
[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.08, 'Too big error');

Three functions are used here: nnsetup, nntrain, and nntest, which are introduced below. Two parameters are also set. opts.numepochs is the number of training passes (epochs) over the data; equal to 1 means the data is swept once. opts.batchsize is the number of samples used per gradient step; equal to 100 means each update averages the gradient over 100 samples, i.e. 100 pictures.
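As a quick illustration (assuming the usual 60000-image MNIST training set in mnist_uint8), the number of mini-batch updates per epoch is the sample count divided by the batch size:

m = 60000;                          % training samples in mnist_uint8
numbatches = m / opts.batchsize;    % 60000 / 100 = 600 updates per epoch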

nnsetup: builds the overall structure of the network from the architecture vector and sets the initial value of each parameter.
For example, architecture = [784 100 10] means the input layer is 784-dimensional (a handwritten image is 28*28 = 784 pixels), the hidden layer has 100 units (this can be changed freely), and the output layer has 10 units, because the handwritten digits fall into the 10 classes 0-9.
nn.n is the number of layers of the network, followed by a lot of hyperparameters. The for loop then initializes the structure of each layer. There are three per-layer parameters, w, vw, and p: w is the main parameter (the weights), vw is a temporary variable used when the parameters are updated (the momentum term), and p is the sparsity (a term from sparse coding; you can look it up yourself).
That is all there is to nnsetup.
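A minimal sketch of the kind of initialization nnsetup performs (simplified; the toolbox sets many more hyperparameter fields, and the exact weight-scaling formula may differ):

function nn = nnsetup_sketch(architecture)
% Stripped-down nnsetup, e.g. architecture = [784 100 10]
nn.size = architecture;
nn.n    = numel(nn.size);                 % number of layers
for i = 2 : nn.n
    % w: weights of this layer (plus one bias column), small random values
    nn.W{i-1}  = (rand(nn.size(i), nn.size(i-1)+1) - 0.5) * ...
                 2 * 4 * sqrt(6 / (nn.size(i) + nn.size(i-1)));
    % vw: momentum buffer used when the weights are updated
    nn.vW{i-1} = zeros(size(nn.W{i-1}));
    % p: running average activation, used for the sparsity penalty
    nn.p{i}    = zeros(1, nn.size(i));
end
end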

nntrain: trains the neural network. It is called from the main program above and returns the updated network nn. nn.a, nn.e, and nn.W are the updated activations, errors, and weights (the biases are folded into W as an extra column), and L records the squared error of each training batch.
Inside nntrain, m is the number of training samples, and the opts. fields are the parameters we set.
assert(rem(numbatches, 1) == 0, 'numbatches must be a integer');
rem takes a remainder and assert is an assertion, which feels like a judgment: that line requires numbatches to be an integer.
zeros(m, n) generates an m*n zero matrix. The outer for loop runs the training numepochs times.
tic starts a stopwatch timer; the elapsed time is measured from tic to toc.
randperm(m) generates a random permutation of 1 to m, so the batches are trained in shuffled order. The nested for loop runs over the numbatches mini-batches.
The if statement adds noise, i.e. it sets some of the input data to 0; inputZeroMaskedFraction is the fraction that gets zeroed.
Then three functions are called: nnff does forward propagation, nnbp does backward propagation, and nnapplygrads does the gradient descent update:

        nn = nnff(nn, batch_x, batch_y);   % forward propagation
        nn = nnbp(nn);                     % backward propagation
        nn = nnapplygrads(nn);             % gradient descent update
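Putting these pieces together, the core of nntrain is roughly the following double loop (a simplified sketch; the real function also records the loss L and can evaluate on validation data):

m = size(train_x, 1);
numbatches = m / opts.batchsize;
assert(rem(numbatches, 1) == 0, 'numbatches must be a integer');
for i = 1 : opts.numepochs
    tic;
    kk = randperm(m);                      % shuffle the sample order
    for l = 1 : numbatches
        batch_x = train_x(kk((l-1)*opts.batchsize+1 : l*opts.batchsize), :);
        batch_y = train_y(kk((l-1)*opts.batchsize+1 : l*opts.batchsize), :);
        % denoising: randomly zero a fraction of the inputs
        if nn.inputZeroMaskedFraction ~= 0
            batch_x = batch_x .* (rand(size(batch_x)) > nn.inputZeroMaskedFraction);
        end
        nn = nnff(nn, batch_x, batch_y);   % forward propagation
        nn = nnbp(nn);                     % backward propagation
        nn = nnapplygrads(nn);             % update the weights
    end
    toc;
end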

nnff runs the network forward once; along the way it computes dropout, sparsity, the error (e), and the loss.
Dropout helps prevent overfitting, and sparsity means sparseness (encouraging hidden units to stay mostly inactive).
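A minimal sketch of what nnff computes, assuming sigmoid activations and a sigmoid output layer (the real function also supports other activation and output types, dropout, and the sparsity bookkeeping):

sigm = @(z) 1 ./ (1 + exp(-z));            % logistic activation
m = size(batch_x, 1);                      % samples in this mini-batch
nn.a{1} = [ones(m,1) batch_x];             % input plus a bias column
for i = 2 : nn.n - 1
    nn.a{i} = sigm(nn.a{i-1} * nn.W{i-1}');     % hidden-layer activation
    nn.a{i} = [ones(m,1) nn.a{i}];              % bias column for the next layer
end
nn.a{nn.n} = sigm(nn.a{nn.n-1} * nn.W{nn.n-1}');   % output layer
nn.e = batch_y - nn.a{nn.n};                       % error
nn.L = 1/2 * sum(sum(nn.e .^ 2)) / m;              % squared-error loss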

The core of the backpropagation algorithm is the partial derivative of the cost function C with respect to the parameters in the network (the weights w and biases b of each layer). The idea of the BP algorithm is: if the current cost function value is far from the expected value, adjust the values of w and b to bring the new cost function value closer to the expected value (the larger the gap from the expected value, the larger the adjustment to w and b). Repeat this process until the cost function value falls within the error tolerance, and then the algorithm stops.
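In formulas, this is just standard gradient descent (a generic textbook update, not something specific to this toolbox), where \eta is the learning rate:

w \leftarrow w - \eta \frac{\partial C}{\partial w}, \qquad
b \leftarrow b - \eta \frac{\partial C}{\partial b}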

Gradient descent: nn.weightPenaltyL2 is the weight-decay part, and it is another parameter that can be set in nnsetup. If it is nonzero, the weight penalty is added to the gradient to prevent overfitting; the step is then scaled according to the momentum, and finally nn.W{i} is updated.
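A sketch of the nnapplygrads update for one layer's weights (simplified; the toolbox's actual code excludes the bias column from the L2 penalty):

for i = 1 : nn.n - 1
    dW = nn.dW{i};                               % gradient from nnbp
    if nn.weightPenaltyL2 > 0
        dW = dW + nn.weightPenaltyL2 * nn.W{i};  % L2 weight decay
    end
    dW = nn.learningRate * dW;
    if nn.momentum > 0
        nn.vW{i} = nn.momentum * nn.vW{i} + dW;  % accumulate momentum
        dW = nn.vW{i};
    end
    nn.W{i} = nn.W{i} - dW;                      % take the step
end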

nntest is the last function; it calls nnpredict and compares the predictions against the test set. The function needs the test data x and the labels y. With y, the accuracy (error rate) can be computed; without y, you can directly call labels = nnpredict(nn, x) to get the predicted labels.

nnpredict just runs nnff once and returns the labels: for each row of the output it finds the maximum value and returns the column index where it occurs.
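A sketch of the pair, assuming the one-hot label format used by mnist_uint8:

% nnpredict: one forward pass, then the column index of the max output per row
nn = nnff(nn, x, zeros(size(x,1), nn.size(end)));
[~, labels] = max(nn.a{end}, [], 2);

% nntest: error rate = fraction of predictions that disagree with the labels
[~, expected] = max(y, [], 2);          % one-hot rows -> class indices
bad = find(labels ~= expected);         % misclassified samples
er  = numel(bad) / size(x, 1);          % the error rate checked by the assert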

The program runs as follows.
Run the main program directly; I saved it under MATLAB's default script name, untitled.
If the following error occurs:

>> untitled
Error using load
Unable to read file 'mnist_uint8'. No such file or directory.

Error in untitled (line 1)
load mnist_uint8;

it means that you have not added the data file to the path. mnist_uint8 is under the data folder; add it with the following command:

>> addpath data

Run untitled again. If the following error occurs:

>> untitled
Error using normalize>parseInputs (line 218)
Dimension must be a positive integer.

Error in normalize (line 83)
[dim,method,methodType,dataVars,AisTablular] = parseInputs(A,varargin{:});

Error in untitled (line 9)
test_x = normalize(test_x, mu, sigma);

This is because MATLAB's built-in normalize is being called instead of the toolbox's own normalize, so you need to add the util folder to the path:

>> addpath util

Run untitled again. If there is an error as follows:

Undefined function or variable 'nntrain'.

Error in untitled (line 16)
[nn, L] = nntrain(nn, train_x, train_y, opts);

This. . . is again because of a missing path; add the NN folder to the path:

>> addpath NN

This time it should succeed; whatever is still missing, just keep adding it to the path.
The following results will appear:

>> untitled
epoch 1/1. Took 1.5443 seconds. Mini-batch mean squared error on training 
set is 0.16243; Full-batch train err = 0.075586

The result means: the epoch took 1.5443 seconds; the mini-batch mean squared error on the training set is 0.16243, and the full-batch training error is 0.075586.

Because we only trained for one epoch, there is only one line of results.
More training examples are in tests\test_example_NN.m, which outputs several results; try running more yourself.

Source: blog.csdn.net/qq_36693723/article/details/103322323