This article was first published on my personal blog, QIMING.INFO. Please include a link and attribution when reposting.

In the previous article ("TensorFlow Quick Start"), we introduced some basic TensorFlow concepts and implemented a linear regression example.

In this article we strike while the iron is hot and go on to implement neural networks with TensorFlow.

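Neural networks in TensorFlow can be used for both regression and classification, and this article gives code for each. Along the way, it also introduces the placeholder, an important and frequently used TensorFlow concept, and a famous dataset: MNIST.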

1 placeholder

Before we begin, a word about placeholders: a placeholder stands in for a value that is supplied later.

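Tensors are not only stored as constants or variables. TensorFlow also provides a feed mechanism, which can temporarily replace the tensor produced by any operation in the computation graph. In effect, you can patch any operation in the graph by inserting a tensor directly. The way to do this is to create placeholders for these operations with tf.placeholder(). A simple example: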

import tensorflow as tf  # TensorFlow 1.x API

# Create two placeholders, input1 and input2
input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
output = tf.multiply(input1,input2)

with tf.Session() as sess:
    # Feed values to input1 and input2 through a dictionary
    print(sess.run(output,feed_dict={input1:[7.],input2:[2.]}))

# Output: [14.]
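
To make the feed idea more concrete, here is a toy sketch in plain Python of a graph whose inputs are only supplied at run time. The Placeholder and Multiply classes and the run() function are hypothetical names made up for this illustration. They are not TensorFlow's API and say nothing about how TensorFlow is implemented internally.

```python
# Toy illustration only: these classes are hypothetical, not TensorFlow's API.
class Placeholder:
    """A graph node with no value of its own; the value is supplied at run time."""
    def __init__(self, name):
        self.name = name


class Multiply:
    """A graph node that multiplies the values of two input nodes."""
    def __init__(self, left, right):
        self.left = left
        self.right = right


def run(node, feed_dict):
    """Evaluate a node, looking up placeholder values in feed_dict."""
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise KeyError("no value fed for placeholder " + node.name)
        return feed_dict[node]
    if isinstance(node, Multiply):
        return run(node.left, feed_dict) * run(node.right, feed_dict)
    raise TypeError("unknown node type")


a = Placeholder("a")
b = Placeholder("b")
product = Multiply(a, b)

# The same graph can be evaluated many times with different fed values.
result_1 = run(product, {a: 7.0, b: 2.0})
result_2 = run(product, {a: 3.0, b: 5.0})
print(result_1, result_2)
```

The point of the sketch is that the graph is built once and the data arrives later, which is what lets the training loops in the following sections feed a different batch on every step.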

2 Regression with a neural network

2.1 Code and explanation

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# Use numpy to generate 200 evenly spaced points, then add Gaussian noise to make synthetic data
x_data = np.linspace(-0.5,0.5,200)[:,np.newaxis]
noise = np.random.normal(0,0.02,x_data.shape)
y_data = np.square(x_data)+noise

# Define two placeholders
x = tf.placeholder(tf.float32,[None,1])
y = tf.placeholder(tf.float32,[None,1])

# Define the hidden layer of the network (1 input -> 10 units)
Weights_L1 = tf.Variable(tf.random_normal([1,10]))
biases_L1 = tf.Variable(tf.zeros([1,10]))
Wx_plus_b_L1 = tf.matmul(x,Weights_L1) + biases_L1
L1 = tf.nn.tanh(Wx_plus_b_L1)

# Define the output layer of the network (10 units -> 1 output)
Weights_L2 = tf.Variable(tf.random_normal([10,1]))
biases_L2 = tf.Variable(tf.zeros([1,1]))
Wx_plus_b_L2 = tf.matmul(L1,Weights_L2)+biases_L2
prediction = tf.nn.tanh(Wx_plus_b_L2)

# Quadratic cost function (mean squared error)
loss = tf.reduce_mean(tf.square(y-prediction))
# Define a gradient descent optimizer for training, with learning rate 0.1
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    # Initialize the variables
    sess.run(tf.global_variables_initializer())
    # Train for 2000 steps
    for step in range(2000):
        sess.run(train_step,feed_dict={x:x_data,y:y_data})
    # Get the predicted values
    prediction_value = sess.run(prediction,feed_dict={x:x_data})
    # Plot the results
    plt.figure()
    plt.scatter(x_data,y_data)
    plt.plot(x_data,prediction_value,'r-',lw=5)
    plt.show()
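
The optimizer in the code above repeatedly moves each variable a small step against the gradient of the loss. As a minimal sketch of that update rule, assuming a one-dimensional toy function f(w) = (w - 3)^2 rather than the network's real loss, the loop can be written in plain Python. The function name gradient_descent and the toy objective are chosen only for this illustration.

```python
def gradient_descent(grad, w0, learning_rate, steps):
    """Apply the update w <- w - learning_rate * grad(w) a fixed number of times."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w


# Toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_final = gradient_descent(lambda w: 2.0 * (w - 3.0),
                           w0=0.0, learning_rate=0.1, steps=100)
print(w_final)  # very close to the minimizer w = 3
```

With a learning rate of 0.1, each step shrinks the distance to the minimum by a factor of 0.8 in this toy problem. The network's loss surface is not this simple, so this only illustrates the mechanics.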

2.2 Results

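This network is fairly simple: it uses tanh() as the activation function, gradient descent as the optimizer, and the quadratic cost function as the loss.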
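
As a rough sketch of what one forward pass and the quadratic cost compute, here is the same 1 -> 10 -> 1 structure written in plain Python without TensorFlow. The helper names (dense, forward, mean_squared_error) and the three sample inputs are assumptions made for illustration. The weights are random and untrained, so the predictions are not meaningful.

```python
import math
import random


def dense(inputs, weights, biases):
    """Compute inputs @ weights + biases for a single sample (lists of floats)."""
    return [
        sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j]
        for j in range(len(biases))
    ]


def forward(x_value, w1, b1, w2, b2):
    """Run one input through the hidden layer and the output layer, both with tanh."""
    hidden = [math.tanh(v) for v in dense([x_value], w1, b1)]  # 1 -> 10
    output = [math.tanh(v) for v in dense(hidden, w2, b2)]     # 10 -> 1
    return output[0]


def mean_squared_error(targets, predictions):
    """Quadratic cost: the mean of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


random.seed(0)
w1 = [[random.gauss(0, 1) for _ in range(10)]]  # shape [1, 10]
b1 = [0.0] * 10                                 # zero biases, as in the article
w2 = [[random.gauss(0, 1)] for _ in range(10)]  # shape [10, 1]
b2 = [0.0]

xs = [-0.5, 0.0, 0.5]
ys = [x ** 2 for x in xs]
preds = [forward(x, w1, b1, w2, b2) for x in xs]
loss = mean_squared_error(ys, preds)
print(preds, loss)
```

Training then amounts to adjusting w1, b1, w2, and b2 so that this loss gets smaller.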

The fitted result is the red line in the plot drawn by the code above. As you can see, it is roughly a quadratic curve.

3 Classification with a neural network

3.1 Introduction to the MNIST dataset

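MNIST is an entry-level computer vision dataset. It contains images of handwritten digits, together with a label for each image that tells us which digit it is. For example, four such images might have the labels 5, 0, 4, and 1.

The MNIST dataset has two parts: a training set of 60,000 examples and a test set of 10,000 examples. With the loader used in this article (input_data.read_data_sets with its default settings), 5,000 of the training examples are held out as mnist.validation, so mnist.train contains 55,000 examples and mnist.test contains 10,000.

Each MNIST example has two parts: an image of a handwritten digit and the corresponding label. We call the images "xs" and the labels "ys". Both the training set and the test set contain xs and ys. For example, the training images are mnist.train.images and the training labels are mnist.train.labels.

Each image is 28 pixels by 28 pixels, so it can be represented as a 28 x 28 array of numbers.

We flatten this array into a vector of length 28 x 28 = 784. As a result, mnist.train.images is a tensor of shape [55000, 784]: the first dimension indexes the images and the second indexes the pixels within each image. The corresponding labels are digits from 0 to 9 that say which digit the image shows. For this tutorial, the labels are encoded as "one-hot vectors". A one-hot vector is 0 in every dimension except one, where it is 1. So in this tutorial the digit n is represented as a 10-dimensional vector that is 1 only in dimension n (counting from 0). For example, the label 0 is represented as ([1,0,0,0,0,0,0,0,0,0]). Therefore mnist.train.labels is a [55000, 10] matrix of numbers.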
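
The flattening and one-hot encoding described above can be sketched in a few lines of plain Python. The helper names flatten and one_hot and the all-zero dummy image are assumptions for illustration. The real loader already returns the data in this form.

```python
def flatten(image):
    """Turn a 2-D list of pixel rows into one flat list, row by row."""
    return [pixel for row in image for pixel in row]


def one_hot(label, num_classes=10):
    """Return a vector that is 1.0 at index `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in the range [0, num_classes)")
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


# A dummy 28 x 28 image of zeros stands in for a real MNIST image.
image = [[0.0] * 28 for _ in range(28)]
flat = flatten(image)

print(len(flat))
print(one_hot(0))
print(one_hot(3))
```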

3.2 Code and explanation

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load the dataset
mnist = input_data.read_data_sets("MNIST_data",one_hot=True)

# Size of each batch
batch_size = 100
# Compute the total number of batches per epoch
n_batch = mnist.train.num_examples // batch_size

# Define two placeholders
x = tf.placeholder(tf.float32,[None,784])
y = tf.placeholder(tf.float32,[None,10])

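# Create a simple neural network (no hidden layer)
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(x,W)+b
prediction = tf.nn.softmax(logits)

# Cross-entropy; this function applies softmax internally, so it takes the raw logits
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=logits))
# Define a gradient descent optimizer for training, with learning rate 0.2
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)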

# Initialize the variables
init = tf.global_variables_initializer()

# Store the per-example results as a boolean tensor
correct_prediction = tf.equal(tf.argmax(y,1),tf.argmax(prediction,1)) # argmax returns the index of the largest value along the given axis
# Compute the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

with tf.Session() as sess:
    sess.run(init)
    # Train for 21 epochs
    for epoch in range(21):
        for batch in range(n_batch):
            batch_xs,batch_ys = mnist.train.next_batch(batch_size)
            sess.run(train_step,feed_dict={x:batch_xs,y:batch_ys})
        # Compute the model's accuracy on the test data
        acc = sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels})
        print("Iter "+str(epoch)+",Testing Accuracy "+str(acc))
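
The accuracy computation above compares the index of the largest value in each label with the index of the largest value in each prediction, then averages the matches. Here is a minimal plain-Python sketch of the same idea, using a made-up 3-class example instead of the 10 MNIST classes. The helper names argmax and accuracy_of are chosen for this illustration.

```python
def argmax(values):
    """Return the index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy_of(labels, predictions):
    """Fraction of examples whose predicted class matches the labeled class."""
    correct = [argmax(l) == argmax(p) for l, p in zip(labels, predictions)]
    return sum(correct) / len(correct)


# Made-up one-hot labels and predicted probabilities for four examples.
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
predictions = [
    [0.7, 0.2, 0.1],  # predicts class 0, label is 0: correct
    [0.1, 0.8, 0.1],  # predicts class 1, label is 1: correct
    [0.3, 0.4, 0.3],  # predicts class 1, label is 2: wrong
    [0.2, 0.5, 0.3],  # predicts class 1, label is 1: correct
]

acc_value = accuracy_of(labels, predictions)
print(acc_value)  # 0.75
```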

3.3 Results

Iter 0,Testing Accuracy 0.8488
Iter 1,Testing Accuracy 0.8941
Iter 2,Testing Accuracy 0.9013
Iter 3,Testing Accuracy 0.9053
Iter 4,Testing Accuracy 0.9093
Iter 5,Testing Accuracy 0.91
Iter 6,Testing Accuracy 0.9119
Iter 7,Testing Accuracy 0.914
Iter 8,Testing Accuracy 0.9138
Iter 9,Testing Accuracy 0.916
Iter 10,Testing Accuracy 0.9174
Iter 11,Testing Accuracy 0.9191
Iter 12,Testing Accuracy 0.9184
Iter 13,Testing Accuracy 0.9194
Iter 14,Testing Accuracy 0.9196
Iter 15,Testing Accuracy 0.9203
Iter 16,Testing Accuracy 0.9211
Iter 17,Testing Accuracy 0.9215
Iter 18,Testing Accuracy 0.9211
Iter 19,Testing Accuracy 0.9218
Iter 20,Testing Accuracy 0.9222

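In this example, the output layer of the network uses the softmax() function for classification, the loss function is cross-entropy, and gradient descent is again used as the optimizer.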
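
For readers who want to see what softmax and cross-entropy compute, here is a small plain-Python sketch for a single example. The three logit values are made up for illustration, and the helper names are chosen for this sketch. TensorFlow's own implementation works on whole batches and is more careful numerically.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_label, probabilities):
    """Negative log of the probability assigned to the true class."""
    return -sum(t * math.log(p)
                for t, p in zip(one_hot_label, probabilities) if t > 0)


logits = [2.0, 1.0, 0.1]  # made-up raw scores for three classes
probs = softmax(logits)

loss_if_class_0 = cross_entropy([1, 0, 0], probs)  # true class has the top score
loss_if_class_2 = cross_entropy([0, 0, 1], probs)  # true class has the lowest score
print(probs, loss_if_class_0, loss_if_class_2)
```

The loss is small when the model puts high probability on the true class and large when it does not, which is why minimizing it pushes the predictions toward the labels.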

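The results show that after 21 epochs of training, the model's test accuracy reaches 92.2%. That is not especially high, so the model needs further optimization, which the next article ("Further Optimizing Neural Networks with TensorFlow") will cover.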

4 Summary

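In this article we implemented one neural network for regression and one for classification. For the neural network concepts mentioned along the way, such as activation functions, loss functions, and optimizers, readers are asked to consult other references for now. I may cover them in later posts.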

5 References

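[1] @Bilibili. 深度学习框架Tensorflow学习与应用 (Deep Learning Framework TensorFlow: Learning and Application). 2018-03
[2] TensorFlow中文社区 (TensorFlow Chinese Community). Basic Usage | TensorFlow Official Documentation, Chinese Edition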

Implementing a Simple Neural Network with TensorFlow