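Implementing Softmax Regression in TensorFlow to Recognize Handwritten Digits

This article follows the book 《TensorFlow实战》 by 黄文坚 (Huang Wenjian) and 唐源 (Tang Yuan). Alongside the hands-on TensorFlow work, it tries to explain concisely the machine learning and deep learning principles involved, rather than simply calling TensorFlow's Python API.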

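1 Principles of Softmax Regression

To explain Softmax regression, we first need Logistic regression. Qualitatively, Softmax regression is a generalization of Logistic regression: Logistic regression is normally used for binary classification, while Softmax regression handles multi-class problems such as the handwritten digit recognition implemented later.

Before implementing Softmax regression in TensorFlow, this section briefly explains how it works, mainly by comparing it with Logistic regression.

Comparison of Logistic regression and Softmax regression: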

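Premise: the training set consists of m labeled samples, $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}$, where $x^{(i)}$ is the input feature vector of sample i and $y^{(i)}$ is its label.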
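To make the relationship between the two models concrete, here is a minimal sketch in plain Python (standard library only). The function names and the score values are illustrative assumptions, not taken from the book. It checks that a two-class softmax over the scores [z, 0] gives the same probability as the logistic (sigmoid) function applied to z, and that softmax outputs over several classes sum to 1.

```python
import math


def sigmoid(z):
    """Logistic function used by binary Logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Softmax over a list of class scores, shifted by the max for numerical stability."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two-class case: softmax over [z, 0] matches sigmoid(z).
z = 1.7  # arbitrary illustrative score
p_softmax = softmax([z, 0.0])[0]
p_sigmoid = sigmoid(z)
print(p_softmax, p_sigmoid)

# Multi-class case: the outputs are positive and sum to 1.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

In this sense Logistic regression can be read as the special case of Softmax regression with two classes, where one class score is fixed at zero.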

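For both Logistic regression and Softmax regression, no closed-form solution is known for minimizing J(θ) (that is, there is no exact formula that yields the minimizing θ directly). We therefore use an iterative method such as gradient descent. This requires computing the partial derivatives of J(θ) and then updating the weights with a step size set by the learning rate.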
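The iterative update can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the toy data set is made up, J(θ) is taken to be the average cross-entropy, and the helper names (`predict`, `cost`, `gradient_step`) are hypothetical. For that cost, the partial derivative with respect to the weights of class k is the average of (p_k - 1{y = k}) · x over the samples.

```python
import math


def softmax(scores):
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(theta, x):
    """Class probabilities for one sample; theta[k] is the weight vector of class k."""
    return softmax([sum(t * xi for t, xi in zip(theta_k, x)) for theta_k in theta])


def cost(theta, xs, ys):
    """Average cross-entropy J(theta) over the data set."""
    return -sum(math.log(predict(theta, x)[y]) for x, y in zip(xs, ys)) / len(xs)


def gradient_step(theta, xs, ys, learning_rate):
    """One batch gradient-descent update: theta_k -= learning_rate * dJ/dtheta_k."""
    num_classes, num_features = len(theta), len(theta[0])
    grad = [[0.0] * num_features for _ in range(num_classes)]
    for x, y in zip(xs, ys):
        probs = predict(theta, x)
        for k in range(num_classes):
            error = probs[k] - (1.0 if k == y else 0.0)
            for j in range(num_features):
                grad[k][j] += error * x[j] / len(xs)
    return [[t - learning_rate * g for t, g in zip(theta_k, grad_k)]
            for theta_k, grad_k in zip(theta, grad)]


# Toy data (made up): the first feature is a constant 1 that acts as the bias term.
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
ys = [0, 0, 1, 1]
theta = [[0.0, 0.0], [0.0, 0.0]]  # zero initialization

initial_cost = cost(theta, xs, ys)
for _ in range(200):
    theta = gradient_step(theta, xs, ys, learning_rate=0.5)
final_cost = cost(theta, xs, ys)
print(initial_cost, final_cost)
```

With zero weights every class gets probability 0.5, so the initial cost is log 2; the repeated updates then drive the cost down.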

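2 Implementation Steps

A Load the MNIST dataset

MNIST download site: http://yann.lecun.com/exdb/mnist/. TensorFlow provides a function that downloads the MNIST data automatically; if the files have already been downloaded, the program starts faster.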

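B Initialize the parameters

In TensorFlow, a placeholder declares the input x, and Variables hold the weights w and the bias b together with their initial values.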

C Build the model

This program uses the Softmax model, which TensorFlow provides directly as tf.nn.softmax.
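As a rough picture of what the model computes for a single input, the sketch below evaluates softmax(x·w + b) in plain Python. The sizes (3 features, 4 classes) and all values are illustrative assumptions, far smaller than the 784 inputs and 10 classes used for MNIST, and `forward` is a hypothetical helper name.

```python
import math


def softmax(scores):
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, w, b):
    """Compute softmax(x·w + b) for one sample; w is indexed as w[feature][class]."""
    logits = [sum(x[j] * w[j][k] for j in range(len(x))) + b[k]
              for k in range(len(b))]
    return softmax(logits)


x = [0.5, 1.0, 0.0]                      # one input sample with 3 features
w_zero = [[0.0] * 4 for _ in range(3)]   # zero-initialized weights
b_zero = [0.0] * 4                       # zero-initialized biases
y_zero = forward(x, w_zero, b_zero)
print(y_zero)  # [0.25, 0.25, 0.25, 0.25]

# Non-zero weights (made-up values) shift probability toward some classes.
w_demo = [[1.0, 0.0, -1.0, 0.0],
          [0.0, 2.0, 0.0, -2.0],
          [0.5, 0.5, 0.5, 0.5]]
y_demo = forward(x, w_demo, b_zero)
print(y_demo)
```

With all-zero weights and biases, every class receives the same probability, so the model only becomes informative once training has moved the parameters away from zero.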

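D Train the parameters iteratively

The cost function is the cross-entropy. It is minimized with gradient descent, and the parameters are updated over many iterations.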
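The cross-entropy cost can be worked through by hand on a tiny batch. The sketch below uses plain Python; the labels and predicted probabilities are made-up values chosen for illustration. With a one-hot label, the per-sample loss reduces to the negative log of the probability assigned to the true class, and the batch cost is the mean of those losses.

```python
import math


def cross_entropy(y_true, y_pred):
    """-sum(y_true * log(y_pred)) for one sample; zero-label terms are skipped."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


# Two samples, three classes (illustrative values).
labels = [[0, 0, 1], [1, 0, 0]]                 # one-hot true labels
preds = [[0.1, 0.2, 0.7], [0.3, 0.4, 0.3]]      # predicted probabilities

losses = [cross_entropy(t, p) for t, p in zip(labels, preds)]
mean_loss = sum(losses) / len(losses)
print(losses, mean_loss)
```

The first sample is predicted confidently and correctly, so its loss is small; the second assigns only 0.3 to the true class, so its loss is larger.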

E Report the accuracy on the test set

3 Implementation
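Accuracy here means the fraction of samples whose most probable predicted class equals the labeled class. A minimal plain-Python sketch, with made-up predictions and labels and hypothetical helper names, could look like this:

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Fraction of samples where the predicted class matches the one-hot label."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    return sum(correct) / len(correct)


# Four samples, three classes (illustrative values).
predictions = [[0.1, 0.7, 0.2],
               [0.8, 0.1, 0.1],
               [0.3, 0.3, 0.4],
               [0.2, 0.5, 0.3]]
labels = [[0, 1, 0],
          [1, 0, 0],
          [0, 1, 0],
          [0, 1, 0]]

acc = accuracy(predictions, labels)
print(acc)  # 0.75
```

Three of the four predictions match their labels, which gives 0.75.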

# -*- coding:utf-8 -*-
# Basic TensorFlow example: handwritten digit recognition with softmax regression.
# Uses the TensorFlow 1.x API (tf.placeholder, tf.InteractiveSession).
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # show errors only; set before importing tensorflow

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Download (if needed) and load MNIST with one-hot labels
mnist = input_data.read_data_sets("./MNIST_data/", one_hot=True)
print(mnist.train.images.shape, mnist.train.labels.shape)
print(mnist.test.images.shape, mnist.test.labels.shape)
print(mnist.validation.images.shape, mnist.validation.labels.shape)
# print(mnist.train.labels[0])

sess = tf.InteractiveSession()
x = tf.placeholder(tf.float32, [None, 784])  # input images; a placeholder takes a data type and a shape
w = tf.Variable(tf.zeros([784, 10]))  # weights
b = tf.Variable(tf.zeros([10]))  # biases

y = tf.nn.softmax(tf.matmul(x, w) + b)  # predicted class probabilities

y_ = tf.placeholder(tf.float32, [None, 10])  # true labels (one-hot)
# Cross-entropy: sum over the classes of each sample, then average over the batch
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

# Gradient descent with learning rate 0.5
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

tf.global_variables_initializer().run()

# Train for 1000 iterations on mini-batches of 100 samples
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    train_step.run({x: batch_xs, y_: batch_ys})
    # print(sess.run(y, feed_dict={x: batch_xs}))  # show the softmax output at each step, i.e. the probability of each class

# Fraction of test images whose most probable class matches the label
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

print(accuracy.eval({x: mnist.test.images, y_: mnist.test.labels}))

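4 Experimental Results

Part of the Softmax regression output is shown below; as the figure shows, each output value is a probability.

The recognition accuracy is roughly 92%.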