See GitHub for the full code.

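Contents

1 Problem
2 Code practice

2.1 Loading the data
2.2 Model variables
2.3 Training hyperparameters
2.4 Defining the model
2.5 Training the model

3 Evaluating the model
4 Applying the model
5 Discussion of results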
1 Problem

[Figure: problem statement]

2 Code practice

2.1 Loading the data

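# This code uses the TensorFlow 1.x API (placeholders and sessions);
# the tensorflow.examples.tutorials module is not shipped with TensorFlow 2.x.
import tensorflow as tf
import tensorflow.examples.tutorials.mnist.input_data as input_data
mnist=input_data.read_data_sets("MNIST/",one_hot=True)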

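If downloading through TensorFlow's own helper is slow, you can download the dataset manually and place the files in the MNIST folder.

print("Training examples:",mnist.train.num_examples,", validation examples:",mnist.validation.num_examples,", test examples:",mnist.test.num_examples)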

2.2 Model variables

Define the placeholders for x and y

# Each MNIST image has 28*28 = 784 pixels
x=tf.placeholder(tf.float32,[None,784],name="X")
# The digits 0-9 give 10 classes
y=tf.placeholder(tf.float32,[None,10],name="Y")

Create the variables

W=tf.Variable(tf.random_normal([784,10]),name='W')
b=tf.Variable(tf.zeros([10]),name="b")

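Build the network as a single layer of neurons (no hidden layer)

forward=tf.matmul(x,W)+b  # forward computation

About softmax regression: for multi-class classification tasks we usually use a softmax regression model. Softmax estimates a probability for each class. How it works: the weighted features that count as evidence for a given class are summed, and these sums are then converted into the probability that the input belongs to that class.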

pred=tf.nn.softmax(forward)
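
To make the softmax step concrete, here is a minimal pure-Python sketch of the same computation on a single row of scores. The helper name `softmax` and the sample scores are illustrative assumptions, not part of the original code; `tf.nn.softmax` performs the equivalent operation on each row of `forward`.

```python
import math

def softmax(logits):
    """Convert a list of raw class scores into probabilities that sum to 1."""
    # Subtracting the largest score does not change the result,
    # but it keeps math.exp from overflowing on large scores.
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three classes; the largest score gets the largest probability.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The ordering of the scores is preserved, so taking the argmax of the probabilities picks the same class as taking the argmax of the raw scores.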

2.3 Training hyperparameters

train_epochs=100
batch_size=50
total_batch=int(mnist.train.num_examples/batch_size)
display_step=1
learning_rate=0.02
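
As a quick check of what these settings imply, the sketch below works out the number of parameter updates. It assumes the default split produced by `read_data_sets` (55,000 training examples); the variable `num_train_examples` is a stand-in for `mnist.train.num_examples`.

```python
# Assumed size of the default MNIST training split (stand-in for mnist.train.num_examples)
num_train_examples = 55000
batch_size = 50
train_epochs = 100

# Batches needed to pass over the training set once
total_batch = num_train_examples // batch_size
# Gradient updates performed over the whole training run
total_updates = total_batch * train_epochs

print(total_batch, total_updates)
```

Under the same assumption, doubling the batch size to 100 halves the number of updates per epoch, which is worth keeping in mind when comparing runs with different batch sizes.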

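2.4 Defining the model

Define the loss function

# Cross-entropy loss, averaged over the batch.
# Note: tf.log(pred) produces inf or NaN if a predicted probability underflows to 0;
# tf.nn.softmax_cross_entropy_with_logits_v2 applied to `forward` is a more stable alternative.
loss_function=tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred),axis=1))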
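
The cross-entropy formula can be checked by hand on a single example. The following is a minimal pure-Python sketch; the function name `cross_entropy`, the `eps` guard against log(0), and the sample vectors are assumptions made for illustration and do not appear in the original code.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-10):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    # Only the true class contributes, because y_true is 0 everywhere else.
    # eps is a hypothetical guard so that log(0) is never evaluated.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

label = [0, 1, 0]                      # one-hot label: the true class is index 1
confident = [0.1, 0.8, 0.1]            # puts most probability on the true class
unsure = [0.4, 0.3, 0.3]               # puts little probability on the true class

loss_confident = cross_entropy(label, confident)
loss_unsure = cross_entropy(label, unsure)
print(loss_confident, loss_unsure)
```

With a one-hot label the loss reduces to the negative log of the probability assigned to the true class, so a more confident correct prediction gives a smaller loss.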

Define the optimizer

# Gradient descent optimizer
optimizer=tf.train.GradientDescentOptimizer(learning_rate).minimize(loss_function)
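
The update rule behind gradient descent can be illustrated on a one-variable toy problem. This sketch is only an illustration: the loss f(w) = (w - 3)^2, the starting point, and the step count are made up for the example; only the learning rate 0.02 is taken from the hyperparameters above.

```python
def gradient(w):
    # Derivative of the toy loss f(w) = (w - 3) ** 2
    return 2.0 * (w - 3.0)

learning_rate = 0.02
w = 0.0  # arbitrary starting point

for step in range(500):
    # Move against the gradient, scaled by the learning rate
    w = w - learning_rate * gradient(w)

print(w)
```

Each step shrinks the distance to the minimum at w = 3 by a constant factor, which is why a larger learning rate (within a stable range) reaches the minimum in fewer steps.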

Define the accuracy

# Check whether the predicted class tf.argmax(pred,1) matches the true class tf.argmax(y,1)
correct_prediction=tf.equal(tf.argmax(pred,1),tf.argmax(y,1))
# Accuracy: cast the booleans to floats and take the mean
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
sess=tf.Session()
init=tf.global_variables_initializer()
sess.run(init)
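
The accuracy computation above can be mirrored in plain Python to see what each step does. This is a small sketch with made-up predictions and labels; the helper names `argmax` and `accuracy_of` are hypothetical and not part of the original code.

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy_of(pred_rows, label_rows):
    """Fraction of rows whose predicted class matches the one-hot label."""
    matches = [argmax(p) == argmax(y) for p, y in zip(pred_rows, label_rows)]
    # True counts as 1 and False as 0, so the mean of matches is the accuracy.
    return sum(matches) / len(matches)

# Three made-up predictions over three classes, with their one-hot labels
preds = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.5, 0.4, 0.1]]
labels = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]

acc = accuracy_of(preds, labels)
print(acc)
```

The first two rows are classified correctly and the third is not, so the sketch reports two correct predictions out of three.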

2.5 Training the model

# Start training
for epoch in range(train_epochs):
    for batch in range(total_batch):
        xs,ys=mnist.train.next_batch(batch_size)  # read one batch of data
        sess.run(optimizer,feed_dict={x:xs,y:ys})  # run one training step on the batch
    # After all total_batch batches of the epoch, compute loss and accuracy on the validation set (not batched)
    loss,acc=sess.run([loss_function,accuracy],feed_dict={x:mnist.validation.images,y:mnist.validation.labels})
    # Print training progress
    if (epoch+1)%display_step==0:
        print("Train Epoch:","%02d"%(epoch+1),"Loss=","{:.9f}".format(loss),
              "Accuracy=","{:.4f}".format(acc))
print("Train Finished")

3 Evaluating the model

accu_test=sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels})
print("test accuracy:",accu_test)

accu_train=sess.run(accuracy,feed_dict={x:mnist.train.images,y:mnist.train.labels})
print("train accuracy:",accu_train)

accu_validation=sess.run(accuracy,feed_dict={x:mnist.validation.images,y:mnist.validation.labels})
print("validation accuracy:",accu_validation)

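4 Applying the model

# pred holds 10 class probabilities per image (laid out like a one-hot vector), so take the argmax to get a digit 0-9
prediction_result=sess.run(tf.argmax(pred,1),feed_dict={x:mnist.test.images})
# Show the first 10 predictions
print(prediction_result[0:10])

Define a visualization function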

import matplotlib.pyplot as plt
import numpy as np
def plot_images_labels_prediction(images,      # list of images
                                  labels,      # list of one-hot labels
                                  prediction,  # list of predicted digits
                                  index,       # start displaying from this index
                                  num=10):     # show 10 images by default
    fig=plt.gcf()  # get the current figure
    fig.set_size_inches(10,12)  # 1 inch = 2.54 cm
    if num>25:  # show at most 25 images
        num=25
    for i in range(0,num):
        ax=plt.subplot(5,5,i+1)  # select the subplot to draw into
        ax.imshow(np.reshape(images[index],(28,28)),cmap='binary')
        title='label='+str(np.argmax(labels[index]))
        if len(prediction)>0:
            title+=",predict="+str(prediction[index])
        ax.set_title(title,fontsize=10)
        ax.set_xticks([])
        ax.set_yticks([])
        index+=1
    plt.show()

plot_images_labels_prediction(mnist.test.images,
                             mnist.test.labels,
                             prediction_result,10,10)

5 Discussion of results

With the following hyperparameters, the accuracy is 88.4%:

train_epochs=100
batch_size=100
total_batch=int(mnist.train.num_examples/batch_size)
display_step=1
learning_rate=0.01

With the following hyperparameters, the accuracy reaches 90.7%, which meets the requirement:

train_epochs=100
batch_size=50
total_batch=int(mnist.train.num_examples/batch_size)
display_step=1
learning_rate=0.02

#楚歌


Deep Learning Application Practice 7: A Neuron Model for the MNIST Handwritten Digit Recognition Problem
