Reading Notes (II): A Multilayer Perceptron for Handwritten Digit Recognition

1. Introduction

1. The multilayer perceptron model

(Figure: the multilayer perceptron model)

2. Training and prediction with the multilayer perceptron model

After the multilayer perceptron model is built, it must be trained before it can predict (recognize) handwritten digits.

Training:
the MNIST training data are preprocessed to produce features and labels, which are then fed into the multilayer perceptron model for training; the trained model can then be used for prediction.
Prediction:
a digit image given as input is preprocessed to produce features; the trained model then predicts a result, which is a digit from 0 to 9.

3. Steps to build the multilayer perceptron model

The steps, covered in the sections below, are: data preprocessing, building the model, training, evaluating accuracy with the test data, and prediction.

2. Data preprocessing

1. Import required modules

from keras.utils import np_utils
import numpy as np
np.random.seed(10)   # fix the random seed so results are reproducible

2. Read MNIST data

from keras.datasets import mnist
(x_train_image, y_train_label),\
(x_test_image, y_test_label) = mnist.load_data()

3. Convert the features with reshape

Each original 28x28 digit image is reshaped into a vector of 784 float values.

x_Train = x_train_image.reshape(60000, 784).astype('float32')
x_Test = x_test_image.reshape(10000, 784).astype('float32')
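
As a quick sanity check (a minimal sketch, using only the arrays defined above), the shapes before and after the reshape can be printed:

print(x_train_image.shape)   # (60000, 28, 28): 60000 images of 28x28 pixels
print(x_Train.shape)         # (60000, 784): each image flattened into 784 values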

4. Normalize the features

Normalizing the features improves the model's prediction accuracy and speeds up convergence.

x_Train_normalize = x_Train/255   # scale pixel values from 0-255 down to 0-1
x_Test_normalize = x_Test/255
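
To see the effect of the normalization (again using only the arrays above), compare the value ranges before and after dividing by 255:

print(x_Train.min(), x_Train.max())                       # 0.0 255.0: raw pixel values
print(x_Train_normalize.min(), x_Train_normalize.max())   # 0.0 1.0: normalized values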

5. One-hot encode the labels

y_Train_OneHot = np_utils.to_categorical(y_train_label)
y_Test_OneHot = np_utils.to_categorical(y_test_label)
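
To see what one-hot encoding does, the first training label can be inspected (in the standard MNIST ordering the first label happens to be 5):

print(y_train_label[0])    # 5: the true value of the first training image
print(y_Train_OneHot[0])   # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]: a 1 in position 5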

3. Build the model

1. Import required modules

from keras.models import Sequential
from keras.layers import Dense

2. Create a Sequential model

This establishes a linear stack of layers; afterwards, each layer of the neural network can be added with the model.add() method.

model = Sequential()

3. Establish "input layer" and "hidden layer"

Dense layer neural network characteristics are: all of the upper layer are fully connected to the neurons in the next layer.

model.add(Dense(units=256,                    # the "hidden layer" has 256 neurons
                input_dim=784,                # the "input layer" has 784 neurons
                kernel_initializer='normal',  # initialize the weights and biases with normally distributed random numbers
                activation='relu'))           # use relu as the activation function

4. Establish "output layer"

model.add(Dense(units=10,                     # the "output layer" has 10 neurons (the digits 0~9)
                kernel_initializer='normal',
                activation='softmax'))        # softmax converts the neurons' outputs into the predicted probability of each digit

5. View the model summary

print(model.summary())

  • Hidden layer: 256 neurons in total; because the input layer and the hidden layer are built together, the input layer is not shown separately
  • Output layer: 10 neurons in total

About Param:

  • The Param of each layer counts its trainable parameters: the connection weights and biases that the back-propagation algorithm updates during training.
  • The Param of each layer is calculated as: Param = (number of neurons in the previous layer) x (number of neurons in this layer) + (number of neurons in this layer).
  • Hidden-layer Param: 200960 = 784 x 256 + 256; output-layer Param: 2570 = 256 x 10 + 10.
  • The total number of trainable parameters (Trainable Params) is the sum of the Param of every layer: 200960 + 2570 = 203530. In general, the larger Trainable Params is, the more complex the model and the more time it takes to train.
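
The arithmetic above can be verified directly in plain Python:

hidden_params = 784 * 256 + 256   # weights plus biases of the hidden layer
output_params = 256 * 10 + 10     # weights plus biases of the output layer
print(hidden_params, output_params, hidden_params + output_params)   # 200960 2570 203530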

4. Train the model

1. Define the training method

model.compile(loss='categorical_crossentropy',
              optimizer='adam', metrics=['acc'])

The compile method configures how the model is trained:

loss: the loss function; in deep learning, training with cross entropy usually gives better results
optimizer: the optimizer used during training; in deep learning, adam makes training converge faster and improves accuracy
metrics: how the model is evaluated, set here to accuracy

2. Start training

Train with model.fit; the training history is stored in the variable train_history.

train_history = model.fit(x=x_Train_normalize,   # features: the digit images
                          y=y_Train_OneHot,      # labels: the true values of the digit images
                          validation_split=0.2,  # 80% of the data for training, 20% for validation
                          epochs=10,             # train for 10 epochs
                          batch_size=200,        # 200 items per batch
                          verbose=2)             # display the training process
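
The metrics recorded during training can be listed from train_history; with this version of Keras and metrics=['acc'], the keys are typically loss, acc, val_loss and val_acc (the exact names and order may differ between versions):

print(train_history.history.keys())   # e.g. dict_keys(['val_loss', 'val_acc', 'loss', 'acc'])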

3. Display the training process

import matplotlib.pyplot as plt
def show_train_history(train_history, train, validation):
    plt.plot(train_history.history[train])
    plt.plot(train_history.history[validation])
    plt.title('Train History')    # chart title
    plt.ylabel(train)             # y-axis label
    plt.xlabel('Epoch')           # the x-axis label is 'Epoch'
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()

4. Plot the accuracy evaluation results

show_train_history(train_history,'acc','val_acc')

Why the training accuracy acc is higher than the validation accuracy val_acc:

acc (training accuracy): computed with the training data; since the model has already been trained on these same data, the accuracy computed on them is higher
val_acc (validation accuracy): computed with the validation data, which the model never saw during training, so the accuracy is lower. However, this accuracy is more objective and closer to the real situation
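
One simple way to quantify this gap, using the train_history recorded above (a minimal sketch):

final_acc = train_history.history['acc'][-1]           # training accuracy after the last epoch
final_val_acc = train_history.history['val_acc'][-1]   # validation accuracy after the last epoch
print('train acc:', final_acc)
print('val acc:', final_val_acc)
print('gap:', final_acc - final_val_acc)               # a persistently large gap suggests overfitting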

5. Plot the loss evaluation results

show_train_history(train_history,'loss','val_loss')

5. Evaluate model accuracy with the test data

Use the test data to evaluate the accuracy of the model.

scores = model.evaluate(x_Test_normalize, y_Test_OneHot)   # evaluate the model's accuracy and store the result in scores
print()
print('accuracy=', scores[1])   # print the accuracy

Running the code above gives an accuracy of about 0.97.

6. Prediction

1. Make predictions

prediction = model.predict_classes(x_Test_normalize)   # predict with the normalized features, matching the preprocessing used in training
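
Note that predict_classes belongs to the standalone Keras API of this era and was removed from later TensorFlow/Keras releases. If it is unavailable in your environment, an equivalent sketch (under that assumption) takes the argmax of the predicted probabilities:

import numpy as np
# for each image, pick the digit with the highest predicted probability
prediction = np.argmax(model.predict(x_Test_normalize), axis=1)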

2. Display the prediction results

prediction

Running it returns array([7, 2, 1, ..., 4, 5, 6]): the first item is predicted to be 7, the second 2, and so on.

3. Display the images, labels, and predictions

import matplotlib.pyplot as plt
def plot_images_labels_prediction(images, labels, prediction, idx, num=10):
    fig = plt.gcf()
    fig.set_size_inches(12, 14)
    if num > 25: num = 25                  # display at most 25 images
    for i in range(0, num):
        ax = plt.subplot(5, 5, 1 + i)
        ax.imshow(images[idx], cmap='binary')
        title = "label=" + str(labels[idx])
        if len(prediction) > 0:            # append the prediction if one was passed in
            title += ",predict=" + str(prediction[idx])
        ax.set_title(title, fontsize=10)
        ax.set_xticks([]); ax.set_yticks([])
        idx += 1
    plt.show()                             # display the figure

plot_images_labels_prediction(x_test_image, y_test_label, prediction, idx=340)

7. Display the confusion matrix

A confusion matrix (also called an error matrix; a table that gives a visual picture of how a supervised learning algorithm performs) shows which digits are predicted most accurately and which are most easily confused (for example, a true value of 5 predicted as 3).

1. Build the confusion matrix

import pandas as pd
pd.crosstab(y_test_label, prediction, rownames=['label'], colnames=['predict'])   # cross-tabulate true labels against predictions


From the confusion matrix above, the following can be observed:

  • The numbers on the diagonal are the correct predictions. The true value "1" was correctly predicted as "1" 1124 times, the highest count, so "1" has the highest prediction accuracy and is the least easily confused. The true value "5" was correctly predicted as "5" only 854 times, the lowest count, so "5" is the most easily confused.
  • The numbers off the diagonal are labels mispredicted as other labels; for example, the true value "5" is sometimes predicted as "3".

2. Build a DataFrame of the true and predicted values

df = pd.DataFrame({'label':y_test_label,'predict':prediction})
df[:2]

3. Query the data

A pandas DataFrame makes it easy to query the data, for example to find all items whose true value is "5" but whose predicted value is "3".

df[(df.label == 5)&(df.predict == 3)]

4. View the result

View the result at index 340.

plot_images_labels_prediction(x_test_image,y_test_label,prediction,idx=340,num=1)


Origin blog.csdn.net/bboyliang67/article/details/104123123