Building a Neural Network by Hand, Part III: Reading the Data and Full Training


Produced by Big Data Digest

Author: Jiang Shang Bao

Hello everyone! Welcome to the third installment of our series on building a neural network with NumPy. In Part I, we showed you how to build a simple neural network with NumPy and completed the feedforward pass. In Part II, we covered the essentials of gradient descent.

In this installment, we show you how to read the dataset and use it to train the neural network; both steps, as before, are implemented in NumPy. Before we get to the code, let's take a look at the dataset we'll be using.

Dataset Introduction

We use the famous MNIST handwritten digit dataset. According to the official website, this dataset contains 70,000 samples: 60,000 training samples and 10,000 test samples.

Once downloaded, the dataset comes as four files: training-set images, training-set labels, test-set images, and test-set labels. The data is stored in binary format.


The first 16 bytes of the training-image file store metadata such as a magic number, the number of images, and the number of rows and columns per image. The first 8 bytes of the training-label file store metadata such as the number of labels. The two test-set files follow the same layout.

*(Screenshot: the local folder where the downloaded files are stored)*

Read data

train_img_path=r'C:\Users\Dell\MNIST\train-images.idx3-ubyte'

We extract the local paths where the files are stored, producing four path strings. The `r` prefix in the code above creates a raw string: because `\` has special meaning in Python string literals, file paths need the escape behavior disabled.

To help the model perform better later, we split the 60,000-image training set into a 50,000-image training set and a 10,000-image validation set.

Note: the validation set is a sample set held out separately during training; it can be used to tune the model's hyperparameters and to make a preliminary assessment of the model's ability.

import struct

Because the files are stored in binary format, we open them in mode 'rb'. And because we need the data displayed as ordinary numbers, we use Python's `struct` package. In `struct.unpack('>4i', f.read(16))`, `>` specifies big-endian byte order, `i` is a 4-byte integer, and `4` means four such integers are expected. `f.read(16)` reads 16 bytes, i.e. four integers, since each integer occupies 4 bytes.
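To make the byte layout concrete, here is a minimal sketch of parsing the 16-byte image-file header. Rather than reading the real file, we pack a synthetic header in memory; the values (magic number 2051, 60,000 images, 28 rows, 28 columns) are the ones documented for MNIST.

```python
import struct

# Synthetic 16-byte header with MNIST's documented values:
# magic number 2051, 60000 images, 28 rows, 28 columns.
header = struct.pack('>4i', 2051, 60000, 28, 28)

# '>' = big-endian byte order, 'i' = 4-byte integer, '4' = four of them.
# With the real file this would be struct.unpack('>4i', f.read(16)).
magic, num_images, rows, cols = struct.unpack('>4i', header)
print(magic, num_images, rows, cols)  # 2051 60000 28 28
```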

`reshape(-1, 28*28)`: a `-1` tells NumPy to infer that dimension from the other arguments, converting the one-dimensional array into a two-dimensional matrix in which each row holds 28*28 values.

Note on `np.fromfile` usage: `np.fromfile(file, dtype=float, count=-1, sep='')`, where `file` is the file (or its path), `dtype` is the type of the data to read, `count` is the number of elements to read (-1 means read the entire file), and `sep` is the separator between items (an empty string means binary data).
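As a sketch of the whole read, the snippet below writes a fake image file (a 16-byte placeholder header plus two 28x28 images, with a made-up file name) and reads it back as described above. Note that MNIST pixels are unsigned bytes, so `dtype=np.uint8` is the appropriate type here.

```python
import os
import tempfile
import numpy as np

# Build a fake image file: 16 placeholder header bytes + two 28x28 images.
path = os.path.join(tempfile.mkdtemp(), 'fake-images.idx3-ubyte')
pixels = np.random.randint(0, 256, size=2 * 28 * 28).astype(np.uint8)
with open(path, 'wb') as f:
    f.write(bytes(16))        # header placeholder
    f.write(pixels.tobytes())

# Read it back: skip the header, then read the rest as unsigned bytes
# and reshape into one 784-pixel row per image (-1 lets NumPy infer 2).
with open(path, 'rb') as f:
    f.read(16)
    train_img = np.fromfile(f, dtype=np.uint8).reshape(-1, 28 * 28)

print(train_img.shape)  # (2, 784)
```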

Once the files are read, we display the data as images.

import matplotlib.pyplot as plt

Note that if you do not set `cmap='gray'`, the image's background colors will look very strange.
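A minimal sketch of the display step, using a random array in place of a real MNIST row (and the non-interactive Agg backend so it runs anywhere):

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; omit this when working locally
import matplotlib.pyplot as plt
import numpy as np

def show_img(img_row):
    """Render one flattened 784-pixel row as a 28x28 grayscale image."""
    return plt.imshow(img_row.reshape(28, 28), cmap='gray')

# A random array stands in for a real MNIST image here.
im = show_img(np.random.randint(0, 256, size=28 * 28).astype(np.uint8))
```

Call `plt.show()` afterwards to bring up the window; without `cmap='gray'`, matplotlib applies its default colormap and the digit comes out oddly colored.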

   ![image](https://upload-images.jianshu.io/upload_images/10137682-d258846d6fe2a0af?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

After defining the function, a test run displays the image like this:

With reading and display complete, we can start training the parameters.

Training data

Before we begin, to tie things together, let's first paste in the code from the previous installments.

def tanh(x):

We first define the derivatives of the two activation functions. The detailed derivation of these derivatives is not presented here; interested readers can look it up themselves.

def d_softmax(data):

The derivative of tanh is `1/(np.cosh(data))**2`; this is an optimized form of `np.diag(1/(np.cosh(data))**2)`.

Note: `np.diag` generates a diagonal matrix, and `np.outer` multiplies each element of its first argument by each element of its second, one by one, yielding a matrix.
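Putting the pieces above together, here is a sketch of both activation functions and their derivatives, following the formulas in the text (the exact code from the series may differ in details):

```python
import numpy as np

def tanh(x):
    return np.tanh(x)

def d_tanh(x):
    # Optimized vector form of np.diag(1 / (np.cosh(x))**2): the Jacobian
    # of an elementwise function is diagonal, so the diagonal suffices.
    return 1 / (np.cosh(x))**2

def softmax(x):
    exp = np.exp(x - x.max())  # subtract the max for numerical stability
    return exp / exp.sum()

def d_softmax(x):
    sm = softmax(x)
    # Jacobian of softmax: diag(sm) minus the outer product of sm with itself.
    return np.diag(sm) - np.outer(sm, sm)
```

A quick finite-difference check confirms `d_tanh`, and every row of the softmax Jacobian sums to zero, as it should for a function whose outputs always sum to 1.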

Then we define a dictionary, plus a one-hot encoding that maps a digit to a 10-dimensional vector with a 1 at the digit's position.

differential = {softmax:d_softmax,tanh:d_tanh}

The squared-error loss function. Its `parameters` argument holds the parameters we initialized in the first installment; during training they are updated automatically.

def sqr_loss(img,lab,parameters):
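A sketch of the idea, with a hypothetical stand-in for the real forward pass (the actual `predict` comes from the network built in Part I):

```python
import numpy as np

# One-hot table: onehot[3] is a length-10 vector with a 1 in position 3.
onehot = np.identity(10)

def predict(img, parameters):
    # Hypothetical stand-in for the real forward pass: it just "predicts"
    # whatever digit is stored under parameters['guess'].
    out = np.zeros(10)
    out[parameters['guess']] = 1.0
    return out

def sqr_loss(img, lab, parameters):
    y_pred = predict(img, parameters)
    y = onehot[lab]
    return ((y_pred - y)**2).sum()

img = np.zeros(28 * 28)
print(sqr_loss(img, lab=3, parameters={'guess': 3}))  # 0.0 (correct guess)
print(sqr_loss(img, lab=3, parameters={'guess': 7}))  # 2.0 (wrong guess)
```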

Gradient calculation

def grad_parameters(img,lab,init_parameters):

The gradient calculation uses the loss formula (y_predict - y)^2. Differentiating this composite function by the chain rule yields a factor of -2(y - y_predict) multiplied by the derivative of the prediction, which is where the -2 behind grad_b1 comes from.
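We can check that factor numerically. The sketch below compares -2(y - y_predict) against a centered finite difference of the loss, for a made-up target and prediction:

```python
import numpy as np

y = np.zeros(10)
y[3] = 1.0                     # one-hot target for the digit 3
y_pred = np.full(10, 0.1)      # a made-up prediction

def loss(pred):
    return ((y - pred)**2).sum()

analytic = -2 * (y - y_pred)   # the -2 factor from the chain rule

# Centered finite differences, one output component at a time.
h = 1e-6
numeric = np.array([
    (loss(y_pred + h * np.eye(10)[i]) - loss(y_pred - h * np.eye(10)[i])) / (2 * h)
    for i in range(10)
])
print(np.allclose(analytic, numeric, atol=1e-6))  # True
```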

Strictly speaking, we should also verify our gradients numerically with the derivative definition [f(x+h) - f(x)]/h, but to keep things easy for newcomers to follow, we skip that step here.

Now we enter the training phase. We feed the data in batches, with each batch containing 100 images: `batch_size = 100`. The gradients are averaged over the batch, which in code is `grad_accu[key] /= batch_size`.

def train_batch(current_batch,parameters):

We use a copy here so that parameter changes within a batch do not affect the overall training: `copy.deepcopy` makes an independent copy that leaves the original data untouched.
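The batch structure can be sketched as below. `grad_parameters` here is a hypothetical stand-in (the real one is the gradient function above), but the accumulation, the averaging, and the role of `deepcopy` are the same:

```python
import copy
import numpy as np

batch_size = 100

def grad_parameters(img, lab, parameters):
    # Hypothetical stand-in for the real per-image gradient.
    return {'b1': np.ones(3) * lab}

def train_batch(imgs, labs, parameters):
    # Accumulate the per-image gradients over the batch ...
    grad_accu = grad_parameters(imgs[0], labs[0], parameters)
    for i in range(1, batch_size):
        grad_tmp = grad_parameters(imgs[i], labs[i], parameters)
        for key in grad_accu:
            grad_accu[key] += grad_tmp[key]
    # ... then average them.
    for key in grad_accu:
        grad_accu[key] /= batch_size
    return grad_accu

parameters = {'b1': np.zeros(3)}
working = copy.deepcopy(parameters)   # updates to `working` leave `parameters` untouched

imgs = [None] * batch_size            # images are unused by the stand-in
labs = list(range(batch_size))
grad = train_batch(imgs, labs, working)
print(grad['b1'])  # [49.5 49.5 49.5] -- the average over labels 0..99
```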

And here we used the formula:

   ![image](https://upload-images.jianshu.io/upload_images/10137682-4b009fd89bddba41?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

Then define the learning rate:

def learn_self(learn_rate):

The `if` statement inside lets us watch the progress of the neural network's training.
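The update rule itself is plain gradient descent: new parameter = old parameter - learn_rate * gradient. A toy sketch on a one-dimensional quadratic (standing in for the network loss) shows the shape of the loop, including the progress printout:

```python
learn_rate = 0.1

# Toy "loss" f(x) = (x - 3)^2; its gradient is 2 * (x - 3).
x = 0.0
for i in range(300):
    grad = 2 * (x - 3)
    x -= learn_rate * grad        # the gradient-descent update
    if i % 100 == 99:             # the if statement: report progress
        print(f'step {i + 1}: x = {x:.4f}')
```

Here `x` converges to the minimum at 3; in the real training loop, the same subtraction is applied to every entry of the parameter dictionary.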

   ![image](https://upload-images.jianshu.io/upload_images/10137682-8eceaa6fac56fe2f?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

At this point, we have completed one full training of the neural network. To see how accurate it is, we can check it on the validation set.

Define the loss on the validation set:

def valid_loss(parameters):

Compute the accuracy:

def valid_accuracy(parameters):
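The accuracy computation works like this sketch: take the argmax of the 10-dimensional output as the predicted digit, then average the matches. `predict` is again a hypothetical stand-in for the real forward pass:

```python
import numpy as np

def predict(img, parameters):
    # Hypothetical stand-in: "predicts" a one-hot vector from a fake image id.
    return np.eye(10)[img % 10]

def valid_accuracy(imgs, labs, parameters):
    correct = [predict(img, parameters).argmax() == lab
               for img, lab in zip(imgs, labs)]
    return np.mean(correct)

imgs = np.arange(20)                      # fake "images": just the ids 0..19
labs = np.array([i % 10 for i in imgs])   # labels that match the stand-in
labs[0] = 5                               # one deliberate mismatch
print(valid_accuracy(imgs, labs, None))   # 0.95
```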

Finally, we get the results:

   ![image](https://upload-images.jianshu.io/upload_images/10137682-7a1e260982c54fa7?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

90% accuracy. Not a bad result, considering we have not yet learned how to tune the learning rate or deal with overfitting.

Well, that's all for this installment. Stay tuned for the next one, where we'll talk about how to tune the learning rate and look at a more complex neural network.


Origin blog.csdn.net/weixin_33883178/article/details/90783668