Handwritten digit recognition with a multilayer perceptron
1, Introduction
1. Introduction to the multilayer perceptron model
2. Training and prediction with the multilayer perceptron model
Once the multilayer perceptron model has been built, it must be trained before it can predict (recognize) handwritten digits.
Training
After preprocessing, the MNIST training data yields Features (the digit images) and Labels (the true digits). These are fed into the multilayer perceptron model for training, and the trained model can then be used for prediction.
Prediction
An input digit image is preprocessed into Features, and the trained model then predicts a digit from 0 to 9.
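3. Steps for building the multilayer perceptron model
The remaining sections follow these steps: data preprocessing, building the model, training, evaluating accuracy, and prediction.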
2, Data preprocessing
1. Import required modules
from keras.utils import np_utils
import numpy as np
np.random.seed(10)
2. Read MNIST data
from keras.datasets import mnist
(x_train_image, y_train_label),\
(x_test_image, y_test_label) = mnist.load_data()
3. Use reshape to convert the features
Each original 28x28 digit image is reshaped into a vector of 784 floating-point numbers.
x_Train = x_train_image.reshape(60000, 784).astype('float32')
x_Test = x_test_image.reshape(10000, 784).astype('float32')
4. Normalize the features
Pixel values range from 0 to 255, so dividing by 255 scales the features to the range 0 to 1. Normalized features can improve prediction accuracy and speed up convergence.
x_Train_normalize = x_Train/255
x_Test_normalize = x_Test/255
5. Convert the labels with One-Hot Encoding
y_Train_OneHot = np_utils.to_categorical(y_train_label)
y_Test_OneHot = np_utils.to_categorical(y_test_label)
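The preprocessing steps above can be sketched with NumPy alone. This is a minimal illustration, not the Keras implementation: `preprocess` is a hypothetical helper, and the two all-white images are made-up stand-ins for MNIST data.

```python
import numpy as np

def preprocess(images, labels, num_classes=10):
    """Flatten each image, scale pixels to [0, 1], and one-hot encode the labels.

    Hypothetical helper for illustration; it mirrors the reshape, /255 and
    to_categorical steps used in this section.
    """
    features = images.reshape(len(images), -1).astype('float32') / 255
    # Row i of the identity matrix is the one-hot vector for class i.
    one_hot = np.eye(num_classes, dtype='float32')[labels]
    return features, one_hot

# Two fake 28x28 "images" stand in for MNIST data (assumption).
fake_images = np.full((2, 28, 28), 255, dtype='uint8')
fake_labels = np.array([7, 2])

features, one_hot = preprocess(fake_images, fake_labels)
print(features.shape)  # (2, 784)
print(one_hot.shape)   # (2, 10)
```

Each label becomes a vector of ten values with a single 1 at the position of the digit, which is the form that a 10-neuron softmax output layer is compared against.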
3, Building the model
1. Import required modules
from keras.models import Sequential
from keras.layers import Dense
2. Build the Sequential model
Create a linear stack of layers; each layer of the neural network is then added with the model.add() method.
model = Sequential()
3. Build the "input layer" and the "hidden layer"
A Dense layer is fully connected: every neuron in the previous layer is connected to every neuron in the next layer.
model.add(Dense(units=256,                    # the hidden layer has 256 neurons
                input_dim=784,                # the input layer has 784 neurons
                kernel_initializer='normal',  # initialize the weights with normally distributed random numbers
                activation='relu'))           # use relu as the activation function
4. Build the "output layer"
model.add(Dense(units=10,                     # the output layer has 10 neurons (digits 0 to 9)
                kernel_initializer='normal',
                activation='softmax'))        # softmax converts the neuron outputs into a probability for each digit
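To make the two layers concrete, here is a rough NumPy sketch of the forward pass they describe: a fully connected layer with relu followed by a fully connected layer with softmax. The weight scale (0.05), the zero biases, and the random input are assumptions for illustration; this is not how Keras stores or initializes its weights internally.

```python
import numpy as np

rng = np.random.default_rng(10)

def relu(z):
    # relu keeps positive values and replaces negative values with 0.
    return np.maximum(z, 0)

def softmax(z):
    # Subtract the row maximum for numerical stability, then normalize.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Randomly initialized weights, zero biases (illustrative values only).
W1 = rng.normal(0.0, 0.05, size=(784, 256))
b1 = np.zeros(256)
W2 = rng.normal(0.0, 0.05, size=(256, 10))
b2 = np.zeros(10)

x = rng.random((3, 784))            # three fake normalized images
hidden = relu(x @ W1 + b1)          # hidden layer: 784 -> 256
probs = softmax(hidden @ W2 + b2)   # output layer: 256 -> 10

print(hidden.shape)  # (3, 256)
print(probs.shape)   # (3, 10)
```

Every row of `probs` sums to 1, which is why the softmax output can be read as one probability per digit.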
5. View the model summary
model.summary()  # summary() prints the table itself, so no print() is needed
- Hidden layer: 256 neurons. The input layer and the hidden layer are created together, so the summary does not show the input layer separately.
- Output layer: 10 neurons.
About Param:
- The Param of each layer counts its trainable parameters: the connection weights and biases that backpropagation updates during training.
- The Param of each layer is calculated as: Param = (number of neurons in the previous layer) x (number of neurons in this layer) + (number of neurons in this layer)
- Hidden layer Param: 200960 = 784 x 256 + 256. Output layer Param: 2570 = 256 x 10 + 10.
- The total number of parameters to train (Trainable Params) is the sum of the Param of every layer: 200960 + 2570 = 203,530. In general, a larger Trainable Params value means a more complex model that takes longer to train.
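The Param arithmetic above can be checked with a few lines of Python. `dense_params` is a hypothetical helper name introduced only for this check.

```python
def dense_params(n_inputs, n_units):
    """Parameter count of a fully connected layer: one weight per
    input/unit pair plus one bias per unit (hypothetical helper)."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 256)   # input layer -> hidden layer
output = dense_params(256, 10)    # hidden layer -> output layer
total = hidden + output

print(hidden, output, total)  # 200960 2570 203530
```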
4, Training
1. Define the training method
model.compile(loss='categorical_crossentropy',
optimizer='adam',metrics=['acc'])
Before training, the compile method configures how the model is trained:
loss: the loss function. In deep learning, cross-entropy (categorical_crossentropy) usually gives better training results for classification.
optimizer: the optimization method used during training. In deep learning, adam usually makes training converge faster and improves accuracy.
metrics: how the model is evaluated; here it is accuracy.
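As a rough illustration of what the cross-entropy loss measures, the sketch below computes it by hand for two made-up predictions over three classes. The function is a simplified stand-in written for this example, not the Keras implementation, and the `eps` clipping value is an assumption.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot labels and predicted
    probabilities (simplified sketch)."""
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=1)

# Both samples belong to class 2 (made-up data).
y_true = np.array([[0, 0, 1],
                   [0, 0, 1]], dtype=float)
y_pred = np.array([[0.1, 0.1, 0.8],    # confident and correct
                   [0.6, 0.3, 0.1]])   # confident and wrong

losses = categorical_crossentropy(y_true, y_pred)
print(losses)
```

The first sample, where the true class receives probability 0.8, gets a much smaller loss than the second, where it receives only 0.1. Training lowers this loss by pushing probability toward the true class.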
2. Start training
Train with model.fit; the training history is stored in the variable train_history.
train_history = model.fit(x=x_Train_normalize,   # features: the digit image feature values
                          y=y_Train_OneHot,      # labels: the true digit of each image
                          validation_split=0.2,  # 80% of the data for training, 20% for validation
                          epochs=10,             # train for 10 epochs
                          batch_size=200,        # 200 samples per batch
                          verbose=2)             # print one line of progress per epoch
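The fit settings above imply a simple piece of arithmetic, sketched here in plain Python. `fit_schedule` is a hypothetical helper that only reproduces the counts; Keras itself takes the validation portion from the end of the supplied arrays before shuffling.

```python
import math

def fit_schedule(n_samples, validation_split, batch_size):
    """Return (training samples, validation samples, batches per epoch).

    Hypothetical helper that mirrors the arithmetic only, not Keras internals.
    """
    n_train = int(math.floor(n_samples * (1.0 - validation_split)))
    n_val = n_samples - n_train
    batches_per_epoch = math.ceil(n_train / batch_size)
    return n_train, n_val, batches_per_epoch

n_train, n_val, batches = fit_schedule(60000, 0.2, 200)
print(n_train, n_val, batches)  # 48000 12000 240
```

With 10 epochs, the model therefore performs 2,400 weight updates in total, each computed from 200 images.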
3. Display the training history
import matplotlib.pyplot as plt
def show_train_history(train_history, train, validation):
    plt.plot(train_history.history[train])
    plt.plot(train_history.history[validation])
    plt.title('Train History')   # chart title
    plt.ylabel(train)            # y-axis label
    plt.xlabel('Epoch')          # x-axis label is 'Epoch'
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()
4. Plot the accuracy
show_train_history(train_history,'acc','val_acc')
Why the training accuracy acc is higher than the validation accuracy val_acc:
acc (training accuracy): calculated on the training data. The model has already been trained on this same data, so the accuracy measured on it is higher.
val_acc (validation accuracy): calculated on the validation data. The model never saw this data during training, so the measured accuracy is lower. However, it is more objective and closer to real-world performance.
5. Plot the loss
show_train_history(train_history,'loss','val_loss')
5, Evaluating model accuracy with the test data
Use the test data to evaluate the accuracy of the model.
scores = model.evaluate(x_Test_normalize, y_Test_OneHot)  # evaluate the model; scores holds [loss, accuracy]
print()
print('accuracy=', scores[1])  # show the accuracy
Running the code above gives an accuracy of about 0.97.
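6, Prediction
1. Run the prediction
# Predict on the normalized test features, the same preprocessing used for training.
# model.predict_classes() exists only in older Keras releases; taking the argmax of predict() gives the same class labels.
prediction = np.argmax(model.predict(x_Test_normalize), axis=1)
2. Display the predictions
prediction
The result is array([7, 2, 1, ..., 4, 5, 6]): the first prediction is 7, the second is 2, and so on.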
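Each predicted digit is simply the index of the largest softmax probability for that image. A minimal sketch with made-up probabilities over three classes (the real model outputs ten per image):

```python
import numpy as np

# Made-up softmax outputs for two images (assumption, three classes only).
probs = np.array([[0.05, 0.05, 0.90],
                  [0.70, 0.20, 0.10]])

# The predicted class is the column index of the largest probability in each row.
predicted = np.argmax(probs, axis=1)
print(predicted)  # [2 0]
```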
3. Display the images with their predictions
import matplotlib.pyplot as plt
def plot_images_labels_prediction(images, labels, prediction, idx, num=10):
    fig = plt.gcf()
    fig.set_size_inches(12, 14)
    if num > 25: num = 25          # the 5x5 grid holds at most 25 images
    for i in range(0, num):
        ax = plt.subplot(5, 5, 1 + i)
        ax.imshow(images[idx], cmap='binary')
        title = "label=" + str(labels[idx])
        if len(prediction) > 0:
            title = "label=" + str(labels[idx]) + ",predict=" + str(prediction[idx])
        ax.set_title(title, fontsize=10)
        ax.set_xticks([]); ax.set_yticks([])
        idx += 1
    plt.show()
plot_images_labels_prediction(x_test_image, y_test_label, prediction, idx = 340)
7, Displaying the confusion matrix
A confusion matrix (also called an error matrix) is a table layout that visualizes the results of a supervised learning algorithm. It shows which digits are predicted most accurately and which are most easily confused (for example, the true value is 5 but the predicted value is 3).
1. Build the confusion matrix
import pandas as pd
pd.crosstab(y_test_label,prediction,rownames=['label'],colnames=['predict'])
From the confusion matrix above, the following observations can be made:
- The diagonal holds the counts of correct predictions. The true value "1" was correctly predicted as "1" 1124 times, the highest count, so it is the least easily confused. The true value "5" was correctly predicted as "5" only 854 times, the lowest count, so it is the most easily confused.
- The off-diagonal numbers count labels that were wrongly predicted as another label, for example a true value of "5" predicted as "3".
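To show what the table counts, the sketch below builds a small confusion matrix by hand with NumPy and derives a per-class accuracy from the diagonal. The labels and predictions are made up, and `confusion_matrix` is a hypothetical helper; it is not a replacement for pd.crosstab, only an illustration of the same counting.

```python
import numpy as np

def confusion_matrix(labels, predictions, num_classes):
    """Rows are true labels, columns are predicted labels (hypothetical helper)."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for true, pred in zip(labels, predictions):
        matrix[true, pred] += 1
    return matrix

# Made-up labels and predictions for illustration.
labels      = np.array([1, 1, 1, 3, 3, 5, 5, 5, 5])
predictions = np.array([1, 1, 1, 3, 5, 5, 3, 3, 5])

cm = confusion_matrix(labels, predictions, num_classes=6)

# Per-class accuracy: correct predictions (diagonal) divided by the row total.
row_totals = cm.sum(axis=1)
per_class_accuracy = np.divide(np.diag(cm), row_totals,
                               out=np.zeros(6), where=row_totals > 0)

print(cm[5, 3])               # 2: true "5" predicted as "3" twice
print(per_class_accuracy[1])  # 1.0
print(per_class_accuracy[5])  # 0.5
```

Dividing the diagonal by the row totals matters because the test set does not contain the same number of images for every digit, so raw counts alone can overstate how well a frequent digit is recognized.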
2. Build a DataFrame of the true and predicted values
df = pd.DataFrame({'label':y_test_label,'predict':prediction})
df[:2]
3. Query the data
A Pandas DataFrame makes it easy to query the data, for example to find the rows where the true value is "5" but the predicted value is "3".
df[(df.label == 5)&(df.predict == 3)]
4. Check a result
View the result at index 340.
plot_images_labels_prediction(x_test_image,y_test_label,prediction,idx=340,num=1)