1. Single-layer neural network
2. Multilayer neural network
3. The three steps of MLP learning
The MLP learning procedure can be summarized in three simple steps:
- Starting at the input layer, we forward propagate the patterns of the training data through the network to generate an output.
- Based on the network's output, we calculate the error that we want to minimize using a cost function that we will describe later.
- We backpropagate the error, find its derivative with respect to each weight in the network, and update the model.
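The three steps above can be sketched on a toy problem. This is a minimal illustration, not the book's NeuralNetMLP: the data is random, the network has one hidden layer with sigmoid units, the cost is assumed to be the mean cross-entropy, and all names (`W_h`, `W_out`, `eta`, `costs`) are chosen here for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.RandomState(1)
X = rng.randn(8, 4)                                # 8 toy samples, 4 features
y = rng.randint(0, 2, size=(8, 1)).astype(float)   # toy binary targets

# Small random weights: 4 inputs -> 3 hidden units -> 1 output unit
W_h = rng.normal(scale=0.1, size=(4, 3))
b_h = np.zeros(3)
W_out = rng.normal(scale=0.1, size=(3, 1))
b_out = np.zeros(1)
eta = 0.1                                          # learning rate (assumed value)

costs = []
for step in range(50):
    # Step 1: forward propagate the inputs to get the network output
    a_h = sigmoid(X @ W_h + b_h)
    a_out = sigmoid(a_h @ W_out + b_out)

    # Step 2: compute the cost to minimize (mean cross-entropy, an assumption)
    cost = -np.mean(y * np.log(a_out) + (1 - y) * np.log(1 - a_out))
    costs.append(cost)

    # Step 3: backpropagate the error and update every weight
    delta_out = (a_out - y) / X.shape[0]           # d(cost)/d(net input of output)
    delta_h = (delta_out @ W_out.T) * a_h * (1 - a_h)
    W_out -= eta * (a_h.T @ delta_out)
    b_out -= eta * delta_out.sum(axis=0)
    W_h -= eta * (X.T @ delta_h)
    b_h -= eta * delta_h.sum(axis=0)

print('cost before: %.4f, after: %.4f' % (costs[0], costs[-1]))
```

With a small learning rate and full-batch updates, the recorded cost should go down over the steps, which is the behaviour the procedure aims for.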
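Forward propagation
Each unit in the hidden layer is connected to all units in the input layer; the activations of the hidden units are computed from these connections.
The output layer is computed in the same way from the hidden-layer activations.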
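To make the "every hidden unit is connected to every input" point concrete, the sketch below computes the hidden activations twice: once unit by unit, and once as a single matrix product. The sigmoid activation, the layer sizes, and the names `W_h`, `b_h`, `W_out`, `b_out` are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.RandomState(0)
x = rng.randn(5)         # one input pattern with 5 features
W_h = rng.randn(5, 3)    # W_h[i, j]: weight from input i to hidden unit j
b_h = rng.randn(3)       # one bias per hidden unit

# Unit by unit: hidden unit j sums over ALL inputs, then applies the activation
a_h_loop = np.zeros(3)
for j in range(3):
    z_j = b_h[j]
    for i in range(5):
        z_j += x[i] * W_h[i, j]
    a_h_loop[j] = sigmoid(z_j)

# Vectorized: the same computation as one matrix product
a_h_vec = sigmoid(x @ W_h + b_h)

# The output layer repeats the pattern, with the hidden activations as input
W_out = rng.randn(3, 2)
b_out = rng.randn(2)
a_out = sigmoid(a_h_vec @ W_out + b_out)

print(np.allclose(a_h_loop, a_h_vec))
```

The vectorized form is the one normally used in practice, since it also handles a whole batch of patterns when `x` is replaced by a 2-D array.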
4. Obtaining the MNIST dataset
Load the 60,000 training samples and the 10,000 test samples, and unroll each original 28*28 image into a row of 784 pixel values.
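As a small illustration of the 28*28 to 784 conversion, the sketch below flattens a few stand-in images and rescales the pixels to the range [-1, 1], using the same scaling expression as the loader. The random `raw` array and the helper name `flatten_and_scale` are hypothetical and only serve the example.

```python
import numpy as np

def flatten_and_scale(images):
    """Unroll (n, 28, 28) uint8 images into (n, 784) rows scaled to [-1, 1]."""
    flat = images.reshape(images.shape[0], 28 * 28)
    return ((flat / 255.) - .5) * 2

# Hypothetical stand-in for raw MNIST pixels: 3 random 28x28 byte images
raw = np.random.RandomState(0).randint(0, 256, size=(3, 28, 28)).astype(np.uint8)
scaled = flatten_and_scale(raw)

print(scaled.shape)
print(scaled.min() >= -1.0, scaled.max() <= 1.0)
```

A pixel value of 0 maps to -1 and a value of 255 maps to 1, so the features end up centered around zero.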
# -*- coding: utf-8 -*-
"""
Created on Sat Nov 10 14:30:38 2018
@author:YRP
"""
import os
import struct
import numpy as np
# load_mnist returns two arrays: the images (n_samples x 784 features) and the labels
def load_mnist(path, kind='train'):
    """Load MNIST data from `path`"""
    labels_path = os.path.join(path,
                               '%s-labels.idx1-ubyte' % kind)
    images_path = os.path.join(path,
                               '%s-images.idx3-ubyte' % kind)

    with open(labels_path, 'rb') as lbpath:
        # Read the 8-byte big-endian header (magic number, item count)
        magic, n = struct.unpack('>II',
                                 lbpath.read(8))
        labels = np.fromfile(lbpath,
                             dtype=np.uint8)

    with open(images_path, 'rb') as imgpath:
        # Read the 16-byte header (magic number, image count, rows, columns)
        magic, num, rows, cols = struct.unpack(">IIII",
                                               imgpath.read(16))
        images = np.fromfile(imgpath,
                             dtype=np.uint8).reshape(
                                 len(labels), 784)
        # Rescale pixel values from [0, 255] to [-1, 1]
        images = ((images / 255.) - .5) * 2

    return images, labels
# Load the 60,000 training samples and the 10,000 test samples
X_train, y_train = load_mnist('', kind='train')
print('Rows: %d, columns: %d'
      % (X_train.shape[0], X_train.shape[1]))
X_test, y_test = load_mnist('', kind='t10k')
print('Rows: %d, columns: %d'
      % (X_test.shape[0], X_test.shape[1]))
# Display the first training image of each digit, 0 to 9
import matplotlib.pyplot as plt
fig, ax = plt.subplots(nrows=2, ncols=5,
                       sharex=True, sharey=True)
ax = ax.flatten()
for i in range(10):
    img = X_train[y_train == i][0].reshape(28, 28)
    ax[i].imshow(img, cmap='Greys')
ax[0].set_xticks([])
ax[0].set_yticks([])
plt.tight_layout()
plt.show()
# Save the scaled training and test sets to a compressed file
np.savez_compressed('mnist_scaled.npz',
                    X_train=X_train,
                    y_train=y_train,
                    X_test=X_test,
                    y_test=y_test)
# Load the arrays back from the file
mnist = np.load('mnist_scaled.npz')
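Figure: the displayed result, one sample image for each digit.
5. Classifying handwritten digits
Implement an MLP with one input layer, one hidden layer, and one output layer to classify the images in the MNIST dataset.
Train on 55,000 samples and hold out the remaining 5,000 for validation.
The parameters set in NeuralNetMLP: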
- l2: This is the λ parameter for L2 regularization to decrease the degree of overfitting.
- epochs: This is the number of passes over the training set.
- eta: This is the learning rate η.
- shuffle: This is for shuffling the training set prior to every epoch to prevent the algorithm from getting stuck in cycles.
- seed: This is a random seed for shuffling and weight initialization.
- minibatch_size: This is the number of training samples in each mini-batch when splitting the training data in each epoch for stochastic gradient descent. The gradient is computed for each mini-batch separately instead of for the entire training data, for faster learning.
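How shuffle, seed, minibatch_size, and l2 typically interact can be sketched as follows. This is an illustrative sketch under assumptions, not the actual code of NeuralNetMLP: the helper names `minibatch_indices` and `l2_penalty` are made up here, and dropping an incomplete final mini-batch is one possible design choice.

```python
import numpy as np

def minibatch_indices(n_samples, minibatch_size, rng, shuffle=True):
    """Yield index arrays, one per mini-batch, for a single epoch."""
    indices = np.arange(n_samples)
    if shuffle:
        rng.shuffle(indices)      # new sample order at the start of the epoch
    # An incomplete final mini-batch is dropped (an assumption of this sketch)
    for start in range(0, n_samples - minibatch_size + 1, minibatch_size):
        yield indices[start:start + minibatch_size]

def l2_penalty(weights, l2):
    """L2 term added to the cost: l2 times the sum of squared weights."""
    return l2 * sum(np.sum(w ** 2) for w in weights)

rng = np.random.RandomState(1)    # the seed makes the shuffling reproducible
for epoch in range(2):
    batches = list(minibatch_indices(10, 4, rng))
    print('epoch %d:' % (epoch + 1), [b.tolist() for b in batches])

print(l2_penalty([np.ones((2, 2))], l2=0.01))
```

In a real training loop, the gradient and the weight update would be computed once per yielded index array, so the weights are updated several times per epoch rather than once.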
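The cost recorded over the 200 epochs is plotted against the epoch number (saved as images/costEpochs.png by the code below).
The training and validation accuracy over the 200 epochs are plotted in the same way (images/accuracyEpochs.png).
Finally, the generalization ability of the model is assessed by comparing the accuracy on the validation set with the accuracy on the training set.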
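One simple way to read such accuracy curves is to look at the gap between training and validation accuracy per epoch. The sketch below uses hypothetical accuracy values invented for illustration; they are not results from the run described here.

```python
import numpy as np

# Hypothetical per-epoch accuracies in percent (NOT measured results)
train_acc = np.array([90.0, 94.0, 96.5, 98.0, 99.0])
valid_acc = np.array([89.5, 93.0, 95.0, 95.5, 95.4])

# A growing gap suggests the model is starting to overfit the training data
gap = train_acc - valid_acc

# Epoch (1-based) at which validation accuracy peaks
best_epoch = int(np.argmax(valid_acc)) + 1

print('gap per epoch:', gap)
print('best validation epoch:', best_epoch)
```

When validation accuracy flattens while training accuracy keeps rising, stronger regularization (a larger l2) or fewer epochs are common things to try.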
Test accuracy: 97.54%
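Inspect a 5*5 grid of subplots showing misclassified test images, where the first number in each subplot title is the plot index, the second is the true class label (t), and the third is the predicted class label (p):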
import os
import mlp
import numpy as np
import matplotlib.pyplot as plt
mnist = np.load('./mnist/mnist_scaled.npz')
X_train, y_train, X_test, y_test = [mnist[f] for f in mnist.files]
n_epochs = 200
if 'TRAVIS' in os.environ:
    n_epochs = 20
nn = mlp.NeuralNetMLP(n_hidden=100,
                      l2=0.01,
                      epochs=n_epochs,
                      eta=0.0005,
                      minibatch_size=100,
                      shuffle=True,
                      seed=1)
nn.fit(X_train=X_train[:55000],
       y_train=y_train[:55000],
       X_valid=X_train[55000:],
       y_valid=y_train[55000:])
plt.plot(range(nn.epochs), nn.eval_['cost'])
plt.ylabel('Cost')
plt.xlabel('Epochs')
plt.savefig('images/costEpochs.png', dpi=300)
plt.show()
plt.plot(range(nn.epochs), nn.eval_['train_acc'], label='training')
plt.plot(range(nn.epochs), nn.eval_['valid_acc'], label='validation', linestyle='--')
plt.ylabel('Accuracy')
plt.xlabel('Epochs')
plt.legend()
plt.savefig('images/accuracyEpochs.png', dpi=300)
plt.show()
y_test_pred = nn.predict(X_test)
# np.float was removed in NumPy 1.24; the builtin float is equivalent
acc = (np.sum(y_test == y_test_pred)
       .astype(float) / X_test.shape[0])
print('Test accuracy: %.2f%%' % (acc * 100))
miscl_img = X_test[y_test != y_test_pred][:25]
correct_lab = y_test[y_test != y_test_pred][:25]
miscl_lab = y_test_pred[y_test != y_test_pred][:25]
fig, ax = plt.subplots(nrows=5, ncols=5, sharex=True, sharey=True)
ax = ax.flatten()
for i in range(25):
    img = miscl_img[i].reshape(28, 28)
    ax[i].imshow(img, cmap='Greys', interpolation='nearest')
    ax[i].set_title('%d) t: %d p: %d' % (i+1, correct_lab[i], miscl_lab[i]))
ax[0].set_xticks([])
ax[0].set_yticks([])
plt.tight_layout()
plt.savefig('images/misclassifying.png', dpi=300)
plt.show()
Reference: Python Machine Learning (2nd edition)