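Contents

Chapter 2: DIY with Python

2.5 The MNIST Dataset of Handwritten Digits

2.5.1 Preparing the MNIST Training Data
2.5.2 Testing the Network
2.5.3 Training and Testing with the Full Dataset
2.5.4 Some Improvements: Tweaking the Learning Rate
2.5.5 Some Improvements: Doing Multiple Runs
2.5.6 Changing the Network Shape
2.5.7 All Done
2.5.8 Final Code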

Chapter 2: DIY with Python

2.5 The MNIST Dataset of Handwritten Digits

2.5.1 Preparing the MNIST Training Data

1. Download the training and test sets
Full version (60,000 training records, 10,000 test records):
https://pjreddie.com/projects/mnist-in-csv/
Reduced version (100 training records, 10 test records):
https://raw.githubusercontent.com/makeyourownneuralnetwork/makeyourownneuralnetwork/master/mnist_dataset/mnist_train_100.csv
https://raw.githubusercontent.com/makeyourownneuralnetwork/makeyourownneuralnetwork/master/mnist_dataset/mnist_test_10.csv
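2. Data layout
The data is stored in CSV files.
Each record occupies one line.
The first value on each line is the label, i.e. the digit the writer actually intended to write.
The remaining values, separated by commas, are the pixel values of the handwritten digit. The image is 28*28 pixels, so each label is followed by 784 pixel values.
Each pixel value ranges from 0 to 255 and represents the intensity of that pixel.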
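As a rough sketch of the layout just described, one record can be split into its label and a 28*28 pixel grid. The helper name parse_record and the synthetic record are made up for illustration; they are not from the book.

```python
import numpy

def parse_record(line):
    """Split one CSV record into (label, 28x28 pixel array)."""
    values = line.strip().split(',')
    # one label followed by 28*28 = 784 pixel values
    if len(values) != 785:
        raise ValueError("expected 785 values, got %d" % len(values))
    label = int(values[0])
    pixels = numpy.asarray(values[1:], dtype=float).reshape((28, 28))
    return label, pixels

# synthetic record: label 5, then 783 zero pixels and one pixel of 255
fake_line = "5," + ",".join(["0"] * 783 + ["255"]) + "\n"
label, pixels = parse_record(fake_line)
print(label, pixels.shape, pixels[27, 27])
```

The length check is an extra safeguard added for this sketch; a truncated line would otherwise only fail later, at the reshape.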
3. Inspecting the file in Python
Open the file, read its contents, then close it:

data_file = open("mnist_dataset/mnist_train.csv",'r')
data_list = data_file.readlines()
data_file.close()

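Notes:
(1) The first argument of open() is the file to open and the second is the mode; 'r' here means read-only.
(2) readlines() reads the whole file and returns a list of strings, one string per line.
(3) An opened file should be closed with close(); otherwise the file handle stays open longer than needed, which can cause problems such as running out of file handles.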
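A with block is a common way to make sure the file gets closed even if reading fails. The sketch below is self-contained: it writes a tiny throwaway file (the name demo.csv is invented for this example) and reads it back with readlines().

```python
import os
import tempfile

# create a small throwaway CSV file so the example can run anywhere
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.csv")
with open(path, 'w') as f:
    f.write("5,0,0,255\n0,12,34,56\n")

# the file is closed automatically when the with block ends
with open(path, 'r') as data_file:
    data_list = data_file.readlines()

print(len(data_list))
print(data_file.closed)
```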
4. Reading one record

import numpy
import matplotlib.pyplot
# the training set has 60,000 records and the test set has 10,000
# open the file; the first argument of open() is the path, the second is the mode ('r' means read-only)
data_file = open("mnist_dataset/mnist_train.csv", 'r')
# readlines() reads the whole file into a list of strings, one per line
data_list = data_file.readlines()
# close the file once it has been read
data_file.close()
# print the first record
print(data_list[0])

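5. Drawing the pixel array with imshow()

import numpy
import matplotlib.pyplot
# the training set has 60,000 records and the test set has 10,000
# open the file; the first argument of open() is the path, the second is the mode ('r' means read-only)
data_file = open("mnist_dataset/mnist_train.csv", 'r')
# readlines() reads the whole file into a list of strings, one per line
data_list = data_file.readlines()
# close the file once it has been read
data_file.close()
# split the first record on commas to turn the long text string into individual values
all_values = data_list[0].split(',')
# skip the first value (the label) and convert the remaining text values to floats
# (numpy.asarray with dtype=float is used because numpy.asfarray() was removed in NumPy 2.0),
# then reshape the 28*28 = 784 values into a 28-by-28 array
image_array = numpy.asarray(all_values[1:], dtype=float).reshape((28, 28))
# draw the array; cmap='Greys' selects the greyscale palette
matplotlib.pyplot.imshow(image_array, cmap='Greys', interpolation='None')
# show the figure drawn above
matplotlib.pyplot.show()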

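6. Preparing the input data
As mentioned earlier, input values should lie between 0.0 and 1.0 but should not be 0.0: a zero input multiplies the corresponding weight update by zero, so that weight learns nothing from the record. Very large inputs are also a problem because they saturate the sigmoid.
So we map the values 0 to 255 into the range 0.01 to 1.00; the scaling expression is the scaled_input line in the code of step 7.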
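The mapping can be sketched on its own as below, using the same expression that appears in the later listings. The function name scale_pixels is only for illustration.

```python
import numpy

def scale_pixels(raw_values):
    """Map pixel values from the range 0..255 to the range 0.01..1.00."""
    raw = numpy.asarray(raw_values, dtype=float)
    # dividing by 255 gives 0..1, multiplying by 0.99 gives 0..0.99, adding 0.01 shifts to 0.01..1.00
    return (raw / 255.0 * 0.99) + 0.01

# text values, as they come out of split(',')
scaled = scale_pixels(["0", "128", "255"])
print(scaled)
```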
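7. The target output scheme
The output layer has 10 nodes, one for each possible answer (label). If the answer is 0, the first node fires and the remaining nodes stay quiet.
As mentioned earlier, target outputs should lie between 0.0 and 1.0, excluding 0.0 and 1.0 themselves, because the sigmoid never actually reaches those values. So we use 0.01 in place of 0 and 0.99 in place of 1.0. For example, for the label 5 the target output is [0.01, 0.01, 0.01, 0.01, 0.01, 0.99, 0.01, 0.01, 0.01, 0.01].
The following code sets the target output for one record: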

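# build the target output
import numpy
import matplotlib.pyplot
# the training set has 60,000 records and the test set has 10,000
# open the file; the first argument of open() is the path, the second is the mode ('r' means read-only)
data_file = open("mnist_dataset/mnist_train.csv", 'r')
# readlines() reads the whole file into a list of strings, one per line
data_list = data_file.readlines()
# close the file once it has been read
data_file.close()
# split the first record on commas to turn the long text string into individual values
all_values = data_list[0].split(',')
# scale the input data to the range 0.01 to 1.00
scaled_input = (numpy.asarray(all_values[1:], dtype=float)/255.0*0.99)+0.01
# number of output nodes is 10 (example)
onodes = 10
# create an array of 10 zeros and add 0.01, so every entry is 0.01
targets = numpy.zeros(onodes)+0.01
# then set the entry selected by the label to 0.99
targets[int(all_values[0])] = 0.99
# print the result
print(targets)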

8. The code so far:

# make your own neural network
# code for a 3-layer neural network, and code for learning the MNIST dataset
import numpy
# scipy.special for the sigmoid function expit()
import scipy.special
# library for plotting arrays
import matplotlib.pyplot

# neural network class definition
class neuralNetwork:

    # initialise the neural network
    # inputnodes, hiddennodes and outputnodes are the numbers of nodes in the input, hidden and output layers
    # learningrate is the learning rate
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):

        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes

        # link weight matrices, wih and who
        # wih holds the weights between the input and hidden layers (W_input_hidden)
        # who holds the weights between the hidden and output layers (W_hidden_output)
        # weights inside the arrays are w_i_j, where link is from node i to node j in the next layer
        # w11 w21
        # numpy.random.normal(a, b, (X, Y)) returns an X-by-Y array of samples drawn from
        # a normal distribution with mean a and standard deviation b
        # the standard deviation used here is 1/sqrt(number of incoming links)
        self.wih = numpy.random.normal(
            0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
        self.who = numpy.random.normal(
            0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))

        # learning rate
        self.lr = learningrate
        # activation function is the sigmoid function
        self.activation_function = lambda x: scipy.special.expit(x)

    # train the neural network
    def train(self, inputs_list, targets_list):
        # convert the inputs and targets lists to 2d column arrays
        inputs = numpy.array(inputs_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T
        # calculate signals into hidden layer
        # numpy.dot(X, Y) is the matrix product of the two arrays
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        # output layer error is the (target - actual)
        output_error = targets-final_outputs
        # hidden layer error is the output_error, split by weights, recombined at hidden nodes
        hidden_error = numpy.dot(self.who.T, output_error)
        # update the weights for the links between the hidden and output layers
        # a trailing backslash continues the statement on the next line;
        # inside (), {} or [] no backslash is needed and the line can simply be broken
        self.who += self.lr * \
            numpy.dot((output_error*final_outputs*(1.0-final_outputs)),
                      numpy.transpose(hidden_outputs))
        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * \
            numpy.dot((hidden_error*hidden_outputs *
                       (1.0-hidden_outputs)), numpy.transpose(inputs))

    # query the neural network
    def query(self, inputs_list):
        # convert inputs list to 2d array
        inputs = numpy.array(inputs_list, ndmin=2).T
        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        return final_outputs

# number of input, hidden and output nodes
# the input layer has 28*28 = 784 nodes, one per pixel
input_nodes = 784
# the hidden layer size is a free choice
hidden_nodes = 100
# there are 10 possible labels, so 10 output nodes
output_nodes = 10

# learning rate is 0.3
learning_rate = 0.3

# create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

# load the mnist training data CSV file into a list
# the training set has 60,000 records and the test set has 10,000
training_data_file = open("mnist_dataset/mnist_train.csv", 'r')
# readlines() reads the whole file into a list of strings, one per line
training_data_list = training_data_file.readlines()
# close the file once it has been read
training_data_file.close()

# train the neural network
# go through all records in the training data set
for record in training_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # scale and shift the inputs
    inputs = (numpy.asarray(all_values[1:], dtype=float)/255.0*0.99)+0.01
    # PS: possible improvement: build the target vectors once up front
    # instead of creating a new one for every record
    # create the target output values (all 0.01, except the desired label which is 0.99)
    targets = numpy.zeros(output_nodes)+0.01
    # all_values[0] is the target label for this record
    targets[int(all_values[0])] = 0.99
    # train on this record
    n.train(inputs, targets)
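The PS note in the training loop asks whether the target vector really has to be rebuilt for every record. One possible approach, sketched here and not taken from the book, is to build all ten target vectors once and look them up by label. The name target_table is invented for this example.

```python
import numpy

output_nodes = 10

# row k is the target vector for label k: 0.99 at index k, 0.01 everywhere else
target_table = numpy.full((output_nodes, output_nodes), 0.01)
numpy.fill_diagonal(target_table, 0.99)

# inside the training loop, this lookup would replace the zeros()+0.01 construction
label = 5
targets = target_table[label]
print(targets)
```

Since train() copies its arguments with numpy.array(), handing it a row of the table does not modify the table.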

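2.5.2 Testing the Network

The test code mainly relies on the query function written earlier, adapted slightly here.
A scorecard variable is also added to keep track of the accuracy.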

scorecard_array = numpy.asarray(scorecard)
print("performance = ", scorecard_array.sum()/scorecard_array.size)
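Because the scorecard holds 1 for a correct answer and 0 for a wrong one, the fraction of correct answers is simply its mean. A small sketch with a made-up scorecard:

```python
import numpy

# hypothetical scorecard: 1 = correct answer, 0 = wrong answer
scorecard = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

scorecard_array = numpy.asarray(scorecard)
# number of correct answers divided by the number of answers
performance = scorecard_array.sum() / scorecard_array.size
print("performance = ", performance)

# numpy.mean computes the same fraction in one call
print("mean = ", numpy.mean(scorecard))
```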

The final code is as follows:

# make your own neural network
# code for a 3-layer neural network, and code for learning the MNIST dataset
import numpy
# scipy.special for the sigmoid function expit()
import scipy.special
# library for plotting arrays
import matplotlib.pyplot

# neural network class definition
# (the neuralNetwork class is unchanged from step 8 of section 2.5.1 and is not repeated here)

# number of input, hidden and output nodes
# the input layer has 28*28 = 784 nodes, one per pixel
input_nodes = 784
# the hidden layer size is a free choice
hidden_nodes = 100
# there are 10 possible labels, so 10 output nodes
output_nodes = 10

# learning rate is 0.3
learning_rate = 0.3

# create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

# load the mnist training data CSV file into a list
# the training set has 60,000 records and the test set has 10,000
training_data_file = open("mnist_dataset/mnist_train.csv", 'r')
# readlines() reads the whole file into a list of strings, one per line
training_data_list = training_data_file.readlines()
# close the file once it has been read
training_data_file.close()

# train the neural network
# go through all records in the training data set
for record in training_data_list:
    # split the record by the ',' commas
    training_all_values = record.split(',')
    # scale and shift the inputs
    training_inputs = (numpy.asarray(training_all_values[1:], dtype=float)/255.0*0.99)+0.01
    # PS: possible improvement: build the target vectors once up front
    # instead of creating a new one for every record
    # create the target output values (all 0.01, except the desired label which is 0.99)
    training_targets = numpy.zeros(output_nodes)+0.01
    # training_all_values[0] is the target label for this record
    training_targets[int(training_all_values[0])] = 0.99
    # train on this record
    n.train(training_inputs, training_targets)
# test code
# load the mnist test data CSV file into a list
test_data_file = open("mnist_dataset/mnist_test.csv", 'r')
test_data_list = test_data_file.readlines()
test_data_file.close()

# test the neural network
# scorecard for how well the network perform, initially empty
scorecard = []
# go through all the records in the test data set
for record in test_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # correct answer is first value
    correct_label = int(all_values[0])
    print(correct_label, "correct label")
    # scale and shift the inputs
    inputs = (numpy.asarray(all_values[1:], dtype=float)/255.0*0.99)+0.01
    # query the network
    outputs = n.query(inputs)
    # the index of the highest value corresponds to the label
    label = numpy.argmax(outputs)
    print(label, "network's answer")
    # append correct or incorrect to list
    if (label == correct_label):
        # network's answer matches correct answer, add 1 to scorecard
        scorecard.append(1)
    else:
        # network's answer doesn't match correct answer, add 0 to scorecard
        scorecard.append(0)
        pass
    pass
# calculate the performance score, the fraction of correct answers
scorecard_array = numpy.asarray(scorecard)
print("performance = ", scorecard_array.sum()/scorecard_array.size)


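2.5.3 Training and Testing with the Full Dataset

The dataset also comes in a reduced version with 100 training records and 10 test records. I used the full version (60,000 training records, 10,000 test records) from the start, so everything above already runs on the full dataset.
The code at the end of 2.5.2 is the code for this section.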

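2.5.4 Some Improvements: Tweaking the Learning Rate

The book's author ran a comparison across different learning rates and plotted the results in a chart.
The conclusion: a learning rate between 0.1 and 0.3 is likely to perform well, so the learning rate was changed to 0.2, which improved accuracy slightly.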
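To get a feel for this kind of experiment without the full dataset, the sketch below trains a small network of the same kind on synthetic two-class data for several learning rates and prints the accuracy for each. The data, the network size and the helper names (sigmoid, make_data, run_once) are all assumptions made for this example, and the numbers it prints say nothing about MNIST.

```python
import numpy

def sigmoid(x):
    """Same function as scipy.special.expit, written with numpy only."""
    return 1.0 / (1.0 + numpy.exp(-x))

def make_data(rng, count):
    """Synthetic task: 4 inputs in 0.01..1.0; label is 1 if x0 + x1 > x2 + x3, else 0."""
    x = rng.uniform(0.01, 1.0, (count, 4))
    labels = (x[:, 0] + x[:, 1] > x[:, 2] + x[:, 3]).astype(int)
    return x, labels

def run_once(learning_rate, seed=0, hidden_nodes=8, epochs=5):
    """Train a 4-8-2 sigmoid network with the given learning rate; return test accuracy."""
    rng = numpy.random.default_rng(seed)
    train_x, train_y = make_data(rng, 500)
    test_x, test_y = make_data(rng, 200)
    wih = rng.normal(0.0, pow(4, -0.5), (hidden_nodes, 4))
    who = rng.normal(0.0, pow(hidden_nodes, -0.5), (2, hidden_nodes))
    for _ in range(epochs):
        for x, y in zip(train_x, train_y):
            inputs = x.reshape(-1, 1)
            targets = numpy.full((2, 1), 0.01)
            targets[y, 0] = 0.99
            # forward pass
            hidden = sigmoid(wih @ inputs)
            final = sigmoid(who @ hidden)
            # errors, then the same weight updates as in train()
            output_error = targets - final
            hidden_error = who.T @ output_error
            who += learning_rate * (output_error * final * (1.0 - final)) @ hidden.T
            wih += learning_rate * (hidden_error * hidden * (1.0 - hidden)) @ inputs.T
    correct = 0
    for x, y in zip(test_x, test_y):
        final = sigmoid(who @ sigmoid(wih @ x.reshape(-1, 1)))
        if numpy.argmax(final) == y:
            correct += 1
    return correct / len(test_y)

# same seed for every run, so only the learning rate differs between runs
results = {lr: run_once(lr) for lr in (0.01, 0.1, 0.2, 0.3, 0.6)}
for lr, accuracy in results.items():
    print("learning rate", lr, "accuracy", accuracy)
```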
I made two changes:
1. Set the learning rate to 0.2.
2. Commented out the lines that print the labels.
The complete code is as follows:

# make your own neural network
# code for a 3-layer neural network, and code for learning the MNIST dataset
import numpy
# scipy.special for the sigmoid function expit()
import scipy.special
# library for plotting arrays
import matplotlib.pyplot

# neural network class definition
# (the neuralNetwork class is unchanged from step 8 of section 2.5.1 and is not repeated here)

# number of input, hidden and output nodes
# the input layer has 28*28 = 784 nodes, one per pixel
input_nodes = 784
# the hidden layer size is a free choice
hidden_nodes = 100
# there are 10 possible labels, so 10 output nodes
output_nodes = 10

# learning rate is 0.2
learning_rate = 0.2

# create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

# load the mnist training data CSV file into a list
# the training set has 60,000 records and the test set has 10,000
training_data_file = open("mnist_dataset/mnist_train.csv", 'r')
# readlines() reads the whole file into a list of strings, one per line
training_data_list = training_data_file.readlines()
# close the file once it has been read
training_data_file.close()

# train the neural network
# go through all records in the training data set
for record in training_data_list:
    # split the record by the ',' commas
    training_all_values = record.split(',')
    # scale and shift the inputs
    training_inputs = (numpy.asarray(training_all_values[1:], dtype=float)/255.0*0.99)+0.01
    # PS: possible improvement: build the target vectors once up front
    # instead of creating a new one for every record
    # create the target output values (all 0.01, except the desired label which is 0.99)
    training_targets = numpy.zeros(output_nodes)+0.01
    # training_all_values[0] is the target label for this record
    training_targets[int(training_all_values[0])] = 0.99
    # train on this record
    n.train(training_inputs, training_targets)
# test code
# load the mnist test data CSV file into a list
test_data_file = open("mnist_dataset/mnist_test.csv", 'r')
test_data_list = test_data_file.readlines()
test_data_file.close()

# test the neural network
# scorecard for how well the network perform, initially empty
scorecard = []
# go through all the records in the test data set
for record in test_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # correct answer is first value
    correct_label = int(all_values[0])
    # print(correct_label, "correct label")
    # scale and shift the inputs
    inputs = (numpy.asarray(all_values[1:], dtype=float)/255.0*0.99)+0.01
    # query the network
    outputs = n.query(inputs)
    # the index of the highest value corresponds to the label
    label = numpy.argmax(outputs)
    # print(label, "network's answer")
    # append correct or incorrect to list
    if (label == correct_label):
        # network's answer matches correct answer, add 1 to scorecard
        scorecard.append(1)
    else:
        # network's answer doesn't match correct answer, add 0 to scorecard
        scorecard.append(0)
        pass
    pass
# calculate the performance score, the fraction of correct answers
scorecard_array = numpy.asarray(scorecard)
print("performance = ", scorecard_array.sum()/scorecard_array.size)

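2.5.5 Some Improvements: Doing Multiple Runs

The book's author ran a comparison across different numbers of epochs and plotted the results in a chart.
The conclusion: performance improves as the number of epochs grows, but goes downhill beyond about 6 epochs. The cause may be overfitting, getting stuck in a local optimum, or a learning rate that is too high.
So the author lowered the learning rate to 0.1, repeated the comparison and plotted the results again.
The conclusion: with a larger number of epochs, a smaller learning rate performs better.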
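I changed my code to add an epoch loop, so that the pass over the whole training file is repeated several times; epochs is set to 2.
The complete code is as follows: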

# make your own neural network
# code for a 3-layer neural network, and code for learning the MNIST dataset
import numpy
# scipy.special for the sigmoid function expit()
import scipy.special
# library for plotting arrays
import matplotlib.pyplot

# neural network class definition
# (the neuralNetwork class is unchanged from step 8 of section 2.5.1 and is not repeated here)

# number of input, hidden and output nodes
# the input layer has 28*28 = 784 nodes, one per pixel
input_nodes = 784
# the hidden layer size is a free choice
hidden_nodes = 100
# there are 10 possible labels, so 10 output nodes
output_nodes = 10

# learning rate is 0.3
learning_rate = 0.3

# create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

# load the mnist training data CSV file into a list
# the training set has 60,000 records and the test set has 10,000
training_data_file = open("mnist_dataset/mnist_train.csv", 'r')
# readlines() reads the whole file into a list of strings, one per line
training_data_list = training_data_file.readlines()
# close the file once it has been read
training_data_file.close()

# epochs is the number of times the whole training data set is used for training
epochs = 2
# train the neural network
for e in range(epochs):
    # go through all records in the training data set
    for record in training_data_list:
        # split the record by the ',' commas
        training_all_values = record.split(',')
        # scale and shift the inputs
        training_inputs = (numpy.asarray(
            training_all_values[1:], dtype=float)/255.0*0.99)+0.01
        # PS: possible improvement: build the target vectors once up front
        # instead of creating a new one for every record
        # create the target output values (all 0.01, except the desired label which is 0.99)
        training_targets = numpy.zeros(output_nodes)+0.01
        # training_all_values[0] is the target label for this record
        training_targets[int(training_all_values[0])] = 0.99
        # train on this record
        n.train(training_inputs, training_targets)
# test code
# load the mnist test data CSV file into a list
test_data_file = open("mnist_dataset/mnist_test.csv", 'r')
test_data_list = test_data_file.readlines()
test_data_file.close()

# test the neural network
# scorecard for how well the network perform, initially empty
scorecard = []
# go through all the records in the test data set
for record in test_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # correct answer is first value
    correct_label = int(all_values[0])
    print(correct_label, "correct label")
    # scale and shift the inputs
    inputs = (numpy.asarray(all_values[1:], dtype=float)/255.0*0.99)+0.01
    # query the network
    outputs = n.query(inputs)
    # the index of the highest value corresponds to the label
    label = numpy.argmax(outputs)
    print(label, "network's answer")
    # append correct or incorrect to list
    if (label == correct_label):
        # network's answer matches correct answer, add 1 to scorecard
        scorecard.append(1)
    else:
        # network's answer doesn't match correct answer, add 0 to scorecard
        scorecard.append(0)
        pass
    pass
# calculate the performance score, the fraction of correct answers
scorecard_array = numpy.asarray(scorecard)
print("performance = ", scorecard_array.sum()/scorecard_array.size)


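2.5.6 Changing the Network Shape

The book's author ran a comparison with different numbers of hidden nodes and plotted the results in a chart.
The conclusion: as the number of hidden nodes grows, accuracy improves somewhat, but the gain is not significant, while the extra computation makes training take noticeably longer.
So the number of hidden nodes has to be chosen so that training finishes within a time we can tolerate.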
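The extra cost is easy to estimate by counting link weights: a 784-h-10 network has 784*h weights into the hidden layer and h*10 into the output layer, so the count grows linearly with h. The helper name count_weights is only illustrative.

```python
def count_weights(input_nodes, hidden_nodes, output_nodes):
    """Number of link weights in a 3-layer network without bias terms."""
    return input_nodes * hidden_nodes + hidden_nodes * output_nodes

# every weight is updated once per training record, so this is a rough proxy for the work per record
for hidden_nodes in (100, 200, 500):
    print(hidden_nodes, count_weights(784, hidden_nodes, 10))
```

Doubling the hidden layer from 100 to 200 nodes doubles the number of weights from 79,400 to 158,800.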
I changed the number of hidden nodes to 200; the code is not pasted again here.

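2.5.7 All Done

Looking back over this work, when we face a different problem we can make the following changes to get the result we want:
1. Change the number of layers in the network
2. Change the number of nodes in the hidden layer
3. Use a different activation function
4. Use a different learning rate
5. Even use a varying learning rate, tied to some other parameter
6. Train for multiple epochs
7. Change the number of records used per training step (the batch size)
8. Change the dataset
Or:
1. Switch to a more powerful computer (I have already done this...)
2. Use GPU acceleration (I plan to try this with NVIDIA's CUDA, but have not worked out the details yet)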
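As one illustration of item 5 in the first list, the learning rate can be made to shrink as training goes on. The schedule below (multiply by a fixed factor after each epoch) is just one common choice and is not from the book; the names decayed_rate, initial_rate and decay are invented here.

```python
def decayed_rate(initial_rate, decay, epoch):
    """Learning rate for a given epoch (0-based) under exponential decay."""
    return initial_rate * (decay ** epoch)

epochs = 5
for e in range(epochs):
    lr = decayed_rate(0.3, 0.5, e)
    # in the training script, this value would be assigned to n.lr
    # before that epoch's pass over the training data
    print("epoch", e, "learning rate", lr)
```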

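2.5.8 Final Code

The final code was already given at the end of an earlier section, so it is not repeated here.
In addition, I noticed in this section that every run has to train all the weights from scratch. Could the trained weights be saved, so that the next run can load them and continue training, or so that testing can load them directly and skip retraining?
1. Saving to CSV
The main job is to write the two weight arrays to CSV files during training: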

numpy.savetxt("who.csv", n.who,delimiter = ',')
numpy.savetxt("wih.csv", n.wih,delimiter = ',')
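A quick round trip shows that the weights survive being written and read back. This sketch uses a random stand-in matrix and a temporary directory so that it can run on its own; only numpy.savetxt and numpy.loadtxt come from the approach described here.

```python
import os
import tempfile
import numpy

# stand-in weight matrix with the same shape as who in a 784-100-10 network
who = numpy.random.normal(0.0, pow(100, -0.5), (10, 100))

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "who.csv")

# savetxt writes one matrix row per line, with values separated by the delimiter
numpy.savetxt(path, who, delimiter=',')
restored = numpy.loadtxt(path, delimiter=',')

print(restored.shape)
print(numpy.allclose(who, restored))
```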

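2. An improved save
Originally I saved the weights after every training record, but the disk I/O cost was too high, so I dropped that idea.
Instead, I added a counter variable no outside the loop and update the weight files once every 3000 training records.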

if (no % 3000) == 0:
    numpy.savetxt("who.csv", n.who,delimiter = ',')
    numpy.savetxt("wih.csv", n.wih,delimiter = ',')
    print(no)
no += 1

3. Loading the data back from the CSV files

self.wih = numpy.loadtxt("wih.csv", delimiter=",")
self.who = numpy.loadtxt("who.csv", delimiter=",")

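4. On the first run the files and their data do not exist yet, so first check whether each file exists; if it does not, fall back to a randomly initialised array as before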

        # use the saved weights if the file exists, otherwise initialise randomly
        # (requires "import os" at the top of the script)
        if os.access("wih.csv", os.F_OK):
            self.wih = numpy.loadtxt("wih.csv", delimiter=",")
        else:
            self.wih = numpy.random.normal(
                0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
        if os.access("who.csv", os.F_OK):
            self.who = numpy.loadtxt("who.csv", delimiter=",")
        else:
            self.who = numpy.random.normal(
                0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))
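The same check can be wrapped in a small helper. The sketch below also verifies that a saved matrix has the expected shape, since a file written with a different number of hidden nodes would otherwise be loaded without complaint. The function name load_or_init and the shape check are additions for illustration, not part of the original code.

```python
import os
import tempfile
import numpy

def load_or_init(path, shape, scale):
    """Load a weight matrix from path if it exists and has the right shape, else draw a random one."""
    if os.path.exists(path):
        weights = numpy.loadtxt(path, delimiter=',', ndmin=2)
        if weights.shape == shape:
            return weights
    return numpy.random.normal(0.0, scale, shape)

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "wih.csv")

first = load_or_init(path, (3, 4), pow(4, -0.5))    # no file yet: random matrix
numpy.savetxt(path, first, delimiter=',')
second = load_or_init(path, (3, 4), pow(4, -0.5))   # file exists and fits: loaded from disk
third = load_or_init(path, (5, 4), pow(4, -0.5))    # shape mismatch: random matrix again
print(numpy.allclose(first, second), third.shape)
```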

5. The loading code above naturally belongs in the class's initialisation function.
The complete code is as follows:

# my final code at the end of chapter 2: the author's complete code with my own changes
# make your own neural network
# code for a 3-layer neural network, and code for learning the MNIST dataset
import numpy
# scipy.special for the sigmoid function expit()
import scipy.special
# library for plotting arrays
import matplotlib.pyplot
# used to check whether a file exists
import os

# neural network class definition
class neuralNetwork:

    # initialise the neural network
    # inputnodes, hiddennodes and outputnodes are the numbers of nodes in the input, hidden and output layers
    # learningrate is the learning rate
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):

        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes

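        # link weight matrices, wih and who
        # wih is the link weight matrix between the input and hidden layers (W_input_hidden)
        # who is the link weight matrix between the hidden and output layers (W_hidden_output)
        # weights inside the arrays are w_i_j, where link is from node i to node j in the next layer
        # w11 w21
        # numpy.random.normal(a, b, (X, Y)) returns an X-by-Y array of samples drawn from a
        # normal distribution with mean a and standard deviation b
        # if weights saved to file exist, use them; otherwise initialise randomly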
        if os.access("wih.csv", os.F_OK):
            self.wih = numpy.loadtxt(open("wih.csv","rb"),delimiter=",",skiprows=0)
        else:
            self.wih = numpy.random.normal(
                0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
        if os.access("who.csv", os.F_OK):
            self.who = numpy.loadtxt(open("who.csv","rb"),delimiter=",",skiprows=0)
        else:
            self.who = numpy.random.normal(
                0.0, pow(self.inodes, -0.5), (self.onodes, self.hnodes))

        # learning rate
        self.lr = learningrate
        # activation function is the sigmoid function
        self.activation_function = lambda x: scipy.special.expit(x)
        pass

    # train the neural network
    def train(self, inputs_list, targets_list):
        # convert inputs list to 2d array
        inputs = numpy.array(inputs_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T
        # calculate signals into hidden layer
        # numpy.dot(X, Y) is the matrix product of the two arrays
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        # output layer error is the (target - actual)
        output_error = targets-final_outputs
        # hidden layer error is the output_error, split by weights, recombined at hidden nodes
        hidden_error = numpy.dot(self.who.T, output_error)
        # update the weights for the links between the hidden and output layers
        # the trailing backslash \ below continues the statement onto the next line
        # use it when a statement does not fit on one line
        # inside brackets () {} [] no backslash is needed; a plain line break has the same effect
        self.who += self.lr * \
            numpy.dot((output_error*final_outputs*(1.0-final_outputs)),
                      numpy.transpose(hidden_outputs))
        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * \
            numpy.dot((hidden_error*hidden_outputs *
                       (1.0-hidden_outputs)), numpy.transpose(inputs))
        pass

    # query the neural network
    def query(self, inputs_list):
        # convert inputs list to 2d array
        inputs = numpy.array(inputs_list, ndmin=2).T
        # calculate signals into hidden layer
        # numpy.dot(X, Y) is the matrix product of the two arrays
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        return final_outputs

# number of input, hidden and output nodes
# the input layer has 28*28 = 784 values
input_nodes = 784
# chosen arbitrarily
hidden_nodes = 200
# in this example there are 10 possible outputs, so there are 10 output nodes
output_nodes = 10

# learning rate is 0.2
learning_rate = 0.2

# create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

# load the mnist training data CSV file into a list
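# the training set has 60,000 records and the test set has 10,000
# open() takes the file to open as its first argument and the mode as its second; 'r' means read-only
training_data_file = open("mnist_dataset/mnist_train.csv", 'r')
# readlines() reads the whole file and returns a list of strings, one per line
training_data_list = training_data_file.readlines()
# an opened file must be closed with close(), otherwise all sorts of problems can occur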
training_data_file.close()

epochs = 3
# train the neural network
for e in range(epochs):
    no = 0
    # go through all records in the training data set
    for record in training_data_list:
        # split the record by the ',' commas
        training_all_values = record.split(',')
        # scale and shift the inputs into the range 0.01 to 1.00
        # numpy.asfarray was removed in NumPy 2.0; numpy.asarray with dtype=float is the equivalent
        training_inputs = (numpy.asarray(
            training_all_values[1:], dtype=float)/255.0*0.99)+0.01
        # PS: possible improvement: avoid building a new target array for every record by
        # preparing the targets in advance and looking up the one for the current label
        # create the target output values (all 0.01, except the desired label which is 0.99)
        training_targets = numpy.zeros(output_nodes)+0.01
        # training_all_values[0] is the target label for this record
        training_targets[int(training_all_values[0])] = 0.99
        # train on this record
        n.train(training_inputs, training_targets)
        # write the updated weights to file
        if (no % 3000) == 0:
            numpy.savetxt("who.csv", n.who,delimiter = ',')
            numpy.savetxt("wih.csv", n.wih,delimiter = ',')
            print(no)
        no += 1
        pass
    pass
# test code
# load the mnist test data CSV file into a list
test_data_file = open("mnist_dataset/mnist_test.csv", 'r')
test_data_list = test_data_file.readlines()
test_data_file.close()

# test the neural network
# scorecard for how well the network performs, initially empty
scorecard = []
# go through all the records in the test data set
for record in test_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # correct answer is first value
    correct_label = int(all_values[0])
    # print(correct_label, "correct label")
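    # scale and shift the inputs
    # numpy.asfarray was removed in NumPy 2.0; numpy.asarray with dtype=float is the equivalent
    inputs = (numpy.asarray(all_values[1:], dtype=float)/255.0*0.99)+0.01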
    # query the network
    outputs = n.query(inputs)
    # the index of the highest value corresponds to the label
    label = numpy.argmax(outputs)
    # print(label, "network's answer")
    # append correct or incorrect to list
    if (label == correct_label):
        # network's answer matches correct answer, add 1 to scorecard
        scorecard.append(1)
    else:
        # network's answer doesn't match correct answer, add 0 to scorecard
        scorecard.append(0)
        pass
    pass
# calculate the performance score, the fraction of correct answers
scorecard_array = numpy.asarray(scorecard)
print("performance = ", scorecard_array.sum()/scorecard_array.size)
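The PS comment in the training loop asks whether the target array really has to be rebuilt for every record. One possible answer, sketched here under my own naming (target_table is hypothetical, and label stands in for int(training_all_values[0])), is to build all ten target vectors once before the loop and pick the matching row by label.

```python
import numpy

output_nodes = 10

# Row k is the target vector for label k: 0.01 everywhere, 0.99 at index k
target_table = numpy.full((output_nodes, output_nodes), 0.01)
numpy.fill_diagonal(target_table, 0.99)

label = 7  # stand-in for int(training_all_values[0])
training_targets = target_table[label]
print(training_targets)
```

Because train() copies its arguments with numpy.array(...) before using them, passing a row of the shared table should not alter the table itself.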

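Also: a few notes
1、This blog is for study and exchange only, to make learning easier for everyone; you are welcome to have a look around.
2、If the original author believes this infringes their rights, please contact me promptly; my QQ is 244509154 and my email is [email protected]. I will remove the infringing article promptly.
3、If you find my articles helpful or you like them, please cite the source when reposting, whether it is me or another original author. I hope the authors of these useful articles will be remembered.
4、Finally, I hope everyone will exchange ideas often and keep improving, creating greater value for society and for themselves.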
