TensorFlow Notes IV: MNIST Handwritten Digit Recognition

Dataset

A Brief Introduction to the MNIST Dataset

[figure]

• The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits commonly used to train image processing systems; it is also widely used for training and testing in the field of machine learning. It was created by remixing samples from NIST's original datasets. Because NIST's training set was drawn from American Census Bureau employees while its test set came from American high school students, the original split was not well suited to machine learning experiments. In addition, the black-and-white NIST images were normalized to fit a 28x28-pixel bounding box and anti-aliased, which introduced grayscale levels.

• The MNIST database contains 60,000 training images and 10,000 test images. Half of the training set and half of the test set come from NIST's training data, and the other halves come from NIST's test data. The original creators maintain a list of methods that have been tested on the database; in their original paper they reported a 0.8% error rate using a support vector machine. EMNIST, an extended MNIST-like dataset released in 2017, contains 240,000 training images and 40,000 test images of handwritten digits and characters.

• MNIST dataset download:

File                         Contents (size)
train-images-idx3-ubyte.gz   training set images (9912422 bytes)
train-labels-idx1-ubyte.gz   training set labels (28881 bytes)
t10k-images-idx3-ubyte.gz    test set images (1648877 bytes)
t10k-labels-idx1-ubyte.gz    test set labels (4542 bytes)

As read by the TensorFlow 1.x loader — training set: 55,000; validation set: 5,000; test set: 10,000.

How TensorFlow Reads the MNIST Dataset

TensorFlow 1.x and TensorFlow 2.x read the MNIST dataset in different ways.
• TensorFlow 1.x does not ship the tutorial MNIST reader by default. To read the mnist dataset you need to copy the tutorials folder from the TensorFlow source tree into your environment; the detailed steps are:
1. Download the TensorFlow source code.

2. Copy the examples folder from the TensorFlow source into your installed environment. Taking my download path and environment path as an example, copy E:\AI\tensorflow\tensorflow\examples into E:\Anaconda3\envs\tensorflow1.x\Lib\site-packages\tensorflow and into E:\Anaconda3\envs\tensorflow1.x\Lib\site-packages\tensorflow_core. This avoids ModuleNotFoundError: No module named 'tensorflow.examples.tutorials'. If GitHub downloads slowly, you can get the files from my gitee mirror.

• TensorFlow 2.x has folded mnist into tf.keras.datasets and works from a single cached mnist.npz file; like TensorFlow 1.x, it downloads the dataset at runtime. A minimal loading sketch for both versions follows.
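Here is a minimal sketch of the two loading styles (each half assumes the corresponding TensorFlow version is installed; the local path is illustrative):

# TensorFlow 1.x: reads the four idx .gz files, downloading them if absent
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('./mnist_dataset/', one_hot=True)

# TensorFlow 2.x: downloads and caches mnist.npz under ~/.keras/datasets
import tensorflow as tf
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.mnist.load_data()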

Fundamentals

From prediction problems to classification problems
From linear regression to logistic regression

Logistic Regression

Logistic regression is a log-odds model, one of the discrete-choice family of models, and falls under multivariate analysis. Many real-world problems call for a probabilistic answer: whether it will rain tomorrow, classifying cell morphology, or the Kaggle cats-vs-dogs competition mentioned earlier. Our computers process this kind of data constantly, and many social apps decide what to push to you based on the probability that you will like it. Handwritten digit recognition outputs the probability of each digit 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, so the predicted output must be constrained to the interval [0, 1]. A binary classification problem aims to correctly predict one of two possible labels, and logistic regression handles this class of problems well.

The Sigmoid Function

How does a logistic regression model guarantee that its output always falls between 0 and 1?
The sigmoid function (S-shaped function) produces outputs with exactly these properties. It is defined as:
y = \frac{1}{1+e^{-z}}
[figure: sigmoid curve]
z = x_1 w_1 + x_2 w_2 + \dots + x_n w_n + b
Its domain is all real numbers and its range lies in (0, 1), with z = 0 mapping to 0.5. The sigmoid function is continuous and differentiable. In classification tasks, especially binary classification, sigmoid is a common activation function: feeding the linear model above through it maps the output into (0, 1).
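A quick numeric check of these properties (a NumPy sketch, independent of TensorFlow):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))                      # 0.5 at z = 0
print(sigmoid(np.array([-10.0, 10.0])))  # squashed toward 0 and 1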

The Loss Function in Logistic Regression

Linear regression uses a squared loss. If the loss for logistic regression were also defined as a squared loss, it would take the form:
J(w) = \frac{1}{n}\sum_{i=0}^{n}(\varphi(z_i)-y_i)^2
where:
     i denotes the i-th sample point
     z_i = x_i w + b
     \varphi(z_i) is the prediction for the i-th sample
     y_i is the label value of the i-th sample
Combining the following equations:

\left\{ \begin{aligned} J(w) &= \frac{1}{n}\sum_{i=0}^{n}(\varphi(z_i)-y_i)^2 \\ \varphi(z_i) &= \frac{1}{1+e^{-z_i}} \end{aligned} \right.
gives:
J(w) = \frac{1}{n}\sum_{i=0}^{n}\left(\frac{1}{1+e^{-z_i}}-y_i\right)^2
where
z_i = x_1 w_1 + x_2 w_2 + \dots + x_n w_n + b
The loss for the one-dimensional case is plotted below. Intuitively, if we treat (\frac{1}{1+e^{-z_i}}-y_i)^2 as a single term, the outer expression is the mean of a sum of squares, which produces the overall quadratic outline of the figure; but within each small region, the sign of w changes how each term affects the overall monotonicity, which produces the local extrema seen below.
[figure]
Simulating the one-dimensional w case in MATLAB: the first figure evaluates the loss at 20,000 values of w, and the second zooms in on a short segment. It is easy to see that the loss is a non-convex function.

w = linspace(-100, 100, 20000);   % grid of candidate weights
p = randperm(10000)./10000;       % positive inputs in (0, 1]
q = -randperm(10000)./10000;      % negative inputs in [-1, 0)
x = [p, q];
total = 0;                        % renamed from `sum` to avoid shadowing the builtin
for i = 1:1:20000
    a = 1./(1+exp(x(i).*w));      % sigmoid term of sample i at every w
    total = total + a;
end
plot(w, total/20000)              % average over all 20000 samples
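For readers following along in Python, here is a rough NumPy equivalent of the MATLAB simulation above (a sketch; the random draws differ slightly from MATLAB's randperm, and the 20,000 x 20,000 evaluation is slow, so shrink both grids for a quick look):

import numpy as np
import matplotlib.pyplot as plt

w = np.linspace(-100, 100, 20000)                 # grid of candidate weights
x = np.concatenate([np.random.permutation(10000) / 10000.0,
                    -np.random.permutation(10000) / 10000.0])
total = np.zeros_like(w)
for xi in x:
    total += 1.0 / (1.0 + np.exp(xi * w))         # sigmoid term of one sample
plt.plot(w, total / len(x))                       # jagged, non-convex curve
plt.show()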

To avoid this problem, binary logistic regression typically uses the log loss, defined as:

J(W, b)= \sum_{(x,y)\in D} -y\log(y')-(1-y)\log(1-y')
where:
     (x, y)\in D ranges over the dataset of labeled samples (x, y)
     y is the label of a labeled sample, either 0 or 1
     y' is the prediction for the feature set x, in the range 0~1
Substituting the sigmoid again, one finds that J(W, b) is a convex function.
[figure]
A brief justification: since y is the label, if y = 0 then J(W, b)= \sum_{(x,y)\in D} -\log(1-y'), which is an increasing, convex function of y'; and y' = \frac{1}{1+e^{-z}} is increasing in w and b, so J(W, b) is convex. Likewise, if y = 1 then J(W, b)= \sum_{(x,y)\in D} -\log(y'), which is a decreasing, convex function of y'; again y' is increasing in w and b, so J(W, b) is decreasing and convex.
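As a sanity check on this claim, here is a small one-dimensional sketch (toy data with b fixed at 0; the labels and noise level are made up for illustration) that plots J(w) over a grid of w — the curve comes out convex, in contrast to the squared-loss simulation above:

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 1000)
y = (x > 0).astype(float)              # toy binary labels
flip = rng.random(1000) < 0.1
y[flip] = 1 - y[flip]                  # 10% label noise keeps the minimum finite
w_grid = np.linspace(-10, 10, 400)
eps = 1e-12                            # guard against log(0)
J = []
for w in w_grid:
    y_hat = 1.0 / (1.0 + np.exp(-w * x))
    J.append(np.mean(-y * np.log(y_hat + eps) - (1 - y) * np.log(1 - y_hat + eps)))
plt.plot(w_grid, J)                    # a single convex bowl, no local extrema
plt.show()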

Multi-class Classification: the Softmax Idea

Logistic regression produces a decimal between 0 and 1.0.
For example, a logistic-regression output of 0.8 from an email classifier means the email is spam with probability 80% and not spam with probability 20%; clearly, the probabilities of an email being spam or not spam sum to 1.0. Softmax extends this idea to the multi-class setting: in a multi-class problem, softmax assigns each class a probability expressed as a decimal, and these probabilities must sum to 1.0.
The softmax formula is:
p_i = \frac{e^{y_i}}{\sum_{k=1}^{c} e^{y_k}}

Class    Score   Probability
dog      9       0.8737043
cat      7       0.11824302
rabbit   4       0.00588697
monkey   3       0.0021657

The table values are computed as follows (taking the dog row as an example):
p_{dog} = \frac{e^{9}}{e^{9}+e^{7}+e^{4}+e^{3}} = 0.8737043
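The same computation in plain NumPy (a sketch; subtracting the maximum is the usual numerical-stability trick and does not change the result):

import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))   # shift by the max for numerical stability
    return e / e.sum()

print(softmax(np.array([9.0, 7.0, 4.0, 3.0])))
# [0.8737043  0.11824302 0.00588697 0.0021657 ]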
TensorFlow 1.x example code:

X = tf.constant([9, 7, 4, 3], dtype=tf.float32)
with tf.Session() as sess:   # one session, closed automatically
    prob = sess.run(tf.nn.softmax(X))
print('X softmax value:', prob)
X softmax value: [0.8737043  0.11824302 0.00588697 0.0021657 ]

The Cross-Entropy Loss Function

Cross entropy is a concept from information theory, originally used to estimate the average encoding length. Given two probability distributions p and q, the cross entropy of p as represented by q is:
H(p, q)= -\sum_{x} p(x)\log q(x)

Cross entropy measures the distance between two probability distributions: p is the ground truth and q is the prediction. The smaller the cross entropy, the closer the two distributions.

For example,
suppose a 3-class problem where the correct answer for some sample is (1, 0, 0).
Model A's prediction after softmax regression is (0.5, 0.2, 0.3).
Model B's prediction after softmax regression is (0.7, 0.1, 0.2).
H((1, 0, 0), (0.5, 0.2, 0.3)) = -\log 0.5 = 0.301
H((1, 0, 0), (0.7, 0.1, 0.2)) = -\log 0.7 = 0.155
(base-10 logarithms, so model B is closer to the truth)

The cross-entropy loss function is defined as:
Loss = -\sum_{i=1}^{n} y_i \log y_i'
where y_i is the label value and y_i' the predicted value.
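Both worked examples are easy to verify in NumPy (a sketch; base-10 logarithms, matching the values above):

import numpy as np

p = np.array([1.0, 0.0, 0.0])                 # ground-truth distribution
for q in ([0.5, 0.2, 0.3], [0.7, 0.1, 0.2]):
    print(-np.sum(p * np.log10(q)))           # 0.3010..., 0.1549...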

MNIST Handwritten Digit Recognition: TensorFlow 1.x Implementation

Prerequisite Functions

reshape

reshape reshapes an array. For an array array:
array.reshape([d0, d1,…,dn]) or array.reshape((d0, d1,…,dn))
A reshape with a negative entry such as reshape((-1, 4, 2)) produces a three-dimensional array in which, per the NumPy documentation, -1 is understood as an unspecified value: the library computes the size of that dimension automatically. For the length-64 array below, reshape((-1, 4, 2)) fills the -1 dimension with 64 / 4 / 2 = 8, so the final shape is (8, 4, 2). At most one dimension may be -1.
Reshape the dimensions as follows:

import numpy as np


array = np.array([i for i in range(64)], dtype=float)
print('array value:\n', array,
      '\narray shape', array.shape)
array value:
 [ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17.
 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35.
 36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 50. 51. 52. 53.
 54. 55. 56. 57. 58. 59. 60. 61. 62. 63.] 
array shape (64,)

Reshape to 8x8 and to 4x4x4:

array_8_8 = array.reshape([8, 8])
array_4_4_4 = array.reshape((4, 4, 4))
array_unspecified = array.reshape((-1, 4, 2))
print('array_8_8 value:\n', array_8_8,
      '\narray_8_8 shape:', array_8_8.shape,
      '\narray_4_4_4 value:\n', array_4_4_4,
      '\narray_4_4_4 shape:', array_4_4_4.shape,
      '\narray_unspecified value:\n', array_unspecified,
      '\narray_unspecified shape:', array_unspecified.shape)
array_8_8 value:
 [[ 0.  1.  2.  3.  4.  5.  6.  7.]
 [ 8.  9. 10. 11. 12. 13. 14. 15.]
 [16. 17. 18. 19. 20. 21. 22. 23.]
 [24. 25. 26. 27. 28. 29. 30. 31.]
 [32. 33. 34. 35. 36. 37. 38. 39.]
 [40. 41. 42. 43. 44. 45. 46. 47.]
 [48. 49. 50. 51. 52. 53. 54. 55.]
 [56. 57. 58. 59. 60. 61. 62. 63.]] 
array_8_8 shape: (8, 8) 
array_4_4_4 value:
 [[[ 0.  1.  2.  3.]
  [ 4.  5.  6.  7.]
  [ 8.  9. 10. 11.]
  [12. 13. 14. 15.]]

 [[16. 17. 18. 19.]
  [20. 21. 22. 23.]
  [24. 25. 26. 27.]
  [28. 29. 30. 31.]]

 [[32. 33. 34. 35.]
  [36. 37. 38. 39.]
  [40. 41. 42. 43.]
  [44. 45. 46. 47.]]

 [[48. 49. 50. 51.]
  [52. 53. 54. 55.]
  [56. 57. 58. 59.]
  [60. 61. 62. 63.]]] 
array_4_4_4 shape: (4, 4, 4) 
array_unspecified value:
 [[[ 0.  1.]
  [ 2.  3.]
  [ 4.  5.]
  [ 6.  7.]]

 [[ 8.  9.]
  [10. 11.]
  [12. 13.]
  [14. 15.]]

 [[16. 17.]
  [18. 19.]
  [20. 21.]
  [22. 23.]]

 [[24. 25.]
  [26. 27.]
  [28. 29.]
  [30. 31.]]

 [[32. 33.]
  [34. 35.]
  [36. 37.]
  [38. 39.]]

 [[40. 41.]
  [42. 43.]
  [44. 45.]
  [46. 47.]]

 [[48. 49.]
  [50. 51.]
  [52. 53.]
  [54. 55.]]

 [[56. 57.]
  [58. 59.]
  [60. 61.]
  [62. 63.]]] 
array_unspecified shape: (8, 4, 2)

The array produced by reshape shares memory with the original array; modifying either one affects the other.
Set array[0] to 520:

array[0] = 520
print('array_8_8 value:\n', array_8_8,
      '\narray_8_8 shape:', array_8_8.shape,
      '\narray_4_4_4 value:\n', array_4_4_4,
      '\narray_4_4_4 shape:', array_4_4_4.shape,
      '\narray_unspecified value:\n', array_unspecified,
      '\narray_unspecified shape:', array_unspecified.shape)
array_8_8 value:
 [[520.   1.   2.   3.   4.   5.   6.   7.]
 [  8.   9.  10.  11.  12.  13.  14.  15.]
 [ 16.  17.  18.  19.  20.  21.  22.  23.]
 [ 24.  25.  26.  27.  28.  29.  30.  31.]
 [ 32.  33.  34.  35.  36.  37.  38.  39.]
 [ 40.  41.  42.  43.  44.  45.  46.  47.]
 [ 48.  49.  50.  51.  52.  53.  54.  55.]
 [ 56.  57.  58.  59.  60.  61.  62.  63.]] 
array_8_8 shape: (8, 8) 
array_4_4_4 value:
 [[[520.   1.   2.   3.]
  [  4.   5.   6.   7.]
  [  8.   9.  10.  11.]
  [ 12.  13.  14.  15.]]

 [[ 16.  17.  18.  19.]
  [ 20.  21.  22.  23.]
  [ 24.  25.  26.  27.]
  [ 28.  29.  30.  31.]]

 [[ 32.  33.  34.  35.]
  [ 36.  37.  38.  39.]
  [ 40.  41.  42.  43.]
  [ 44.  45.  46.  47.]]

 [[ 48.  49.  50.  51.]
  [ 52.  53.  54.  55.]
  [ 56.  57.  58.  59.]
  [ 60.  61.  62.  63.]]] 
array_4_4_4 shape: (4, 4, 4) 
array_unspecified value:
 [[[520.   1.]
  [  2.   3.]
  [  4.   5.]
  [  6.   7.]]

 [[  8.   9.]
  [ 10.  11.]
  [ 12.  13.]
  [ 14.  15.]]

 [[ 16.  17.]
  [ 18.  19.]
  [ 20.  21.]
  [ 22.  23.]]

 [[ 24.  25.]
  [ 26.  27.]
  [ 28.  29.]
  [ 30.  31.]]

 [[ 32.  33.]
  [ 34.  35.]
  [ 36.  37.]
  [ 38.  39.]]

 [[ 40.  41.]
  [ 42.  43.]
  [ 44.  45.]
  [ 46.  47.]]

 [[ 48.  49.]
  [ 50.  51.]
  [ 52.  53.]
  [ 54.  55.]]

 [[ 56.  57.]
  [ 58.  59.]
  [ 60.  61.]
  [ 62.  63.]]] 
array_unspecified shape: (8, 4, 2)

Set array_8_8[0][0] to 521:

array_8_8[0][0] = 521
print('array_8_8 value:\n', array_8_8,
      '\narray_8_8 shape:', array_8_8.shape,
      '\narray_4_4_4 value:\n', array_4_4_4,
      '\narray_4_4_4 shape:', array_4_4_4.shape,
      '\narray_unspecified value:\n', array_unspecified,
      '\narray_unspecified shape:', array_unspecified.shape)
array_8_8 value:
 [[521.   1.   2.   3.   4.   5.   6.   7.]
 [  8.   9.  10.  11.  12.  13.  14.  15.]
 [ 16.  17.  18.  19.  20.  21.  22.  23.]
 [ 24.  25.  26.  27.  28.  29.  30.  31.]
 [ 32.  33.  34.  35.  36.  37.  38.  39.]
 [ 40.  41.  42.  43.  44.  45.  46.  47.]
 [ 48.  49.  50.  51.  52.  53.  54.  55.]
 [ 56.  57.  58.  59.  60.  61.  62.  63.]] 
array_8_8 shape: (8, 8) 
array_4_4_4 value:
 [[[521.   1.   2.   3.]
  [  4.   5.   6.   7.]
  [  8.   9.  10.  11.]
  [ 12.  13.  14.  15.]]

 [[ 16.  17.  18.  19.]
  [ 20.  21.  22.  23.]
  [ 24.  25.  26.  27.]
  [ 28.  29.  30.  31.]]

 [[ 32.  33.  34.  35.]
  [ 36.  37.  38.  39.]
  [ 40.  41.  42.  43.]
  [ 44.  45.  46.  47.]]

 [[ 48.  49.  50.  51.]
  [ 52.  53.  54.  55.]
  [ 56.  57.  58.  59.]
  [ 60.  61.  62.  63.]]] 
array_4_4_4 shape: (4, 4, 4) 
array_unspecified value:
 [[[521.   1.]
  [  2.   3.]
  [  4.   5.]
  [  6.   7.]]

 [[  8.   9.]
  [ 10.  11.]
  [ 12.  13.]
  [ 14.  15.]]

 [[ 16.  17.]
  [ 18.  19.]
  [ 20.  21.]
  [ 22.  23.]]

 [[ 24.  25.]
  [ 26.  27.]
  [ 28.  29.]
  [ 30.  31.]]

 [[ 32.  33.]
  [ 34.  35.]
  [ 36.  37.]
  [ 38.  39.]]

 [[ 40.  41.]
  [ 42.  43.]
  [ 44.  45.]
  [ 46.  47.]]

 [[ 48.  49.]
  [ 50.  51.]
  [ 52.  53.]
  [ 54.  55.]]

 [[ 56.  57.]
  [ 58.  59.]
  [ 60.  61.]
  [ 62.  63.]]] 
array_unspecified shape: (8, 4, 2)

numpy.argmax(a, axis=None, out=None)

Returns the indices of the maximum values along an axis.
a: the input array; axis: the axis to reduce along; out: not covered here.
For a 2-D matrix:
    axis = 0: the row index of the maximum in each column
    axis = 1: the column index of the maximum in each row
    axis = -1: the maximum along the last axis
numpy.argmin works the same way.

array = np.array([[10, 11, 12], [13, 14, 15]])
print('array value:\n', array,
      '\narray type:', array.shape,
      '\n1-D[every column max index] max index:', np.argmax(array, 0),
      '\n2-D[every row max index] max index:', np.argmax(array, 1),
      '\n2-D[every row max index] max index:', np.argmax(array, -1),
      '\n1-D[every column min index] min index:', np.argmin(array, 0),
      '\n2-D[every row min index] min index:', np.argmin(array, 1),
      '\n2-D[every row min index] min index:', np.argmin(array, -1))

print('*******line*******')

array = np.array([[[10, 11, 12], [13, 14, 15]], [[2, 9, 4], [7, 3, 5]], [[9, 6, 4], [8, 7, 7]]])
print('array value:\n', array,
      '\narray type:', array.shape,
      '\n1-D:\n', np.argmax(array, 0),
      '\n2-D:\n', np.argmax(array, 1),
      '\n3-D:\n', np.argmax(array, 2),
      '\n3-D:\n', np.argmax(array, -1),
      '\n1-D:\n', np.argmin(array, 0),
      '\n2-D:\n', np.argmin(array, 1),
      '\n2-D:\n', np.argmin(array, 2),
      '\n3-D:\n', np.argmin(array, -1))
array value:
 [[10 11 12]
 [13 14 15]] 
array type: (2, 3) 
1-D[every column max index] max index: [1 1 1] 
2-D[every row max index] max index: [2 2] 
2-D[every row max index] max index: [2 2] 
1-D[every column min index] min index: [0 0 0] 
2-D[every row min index] min index: [0 0] 
2-D[every row min index] min index: [0 0]
*******line*******
array value:
 [[[10 11 12]
  [13 14 15]]

 [[ 2  9  4]
  [ 7  3  5]]

 [[ 9  6  4]
  [ 8  7  7]]] 
array type: (3, 2, 3) 
1-D:
 [[0 0 0]
 [0 0 0]] 
2-D:
 [[1 1 1]
 [1 0 1]
 [0 1 1]] 
3-D:
 [[2 2]
 [1 0]
 [0 0]] 
3-D:
 [[2 2]
 [1 0]
 [0 0]] 
1-D:
 [[1 2 1]
 [1 1 1]] 
2-D:
 [[0 0 0]
 [0 1 0]
 [1 0 0]] 
2-D:
 [[0 0]
 [0 1]
 [2 1]] 
3-D:
 [[0 0]
 [0 1]
 [2 1]]

tensorflow.argmax behaves the same as numpy.argmax

Example:

import tensorflow as tf


array_1 = np.array([[10, 11, 12], [13, 14, 15]])
array_2 = np.array([[[10, 11, 12], [13, 14, 15]], [[2, 9, 4], [7, 3, 5]], [[9, 6, 4], [8, 7, 7]]])
with tf.Session() as sess:
    print('array_1 value:\n', array_1,
      '\narray_1 shape:', array_1.shape,
      '\n1-D[every column max index] max index:', tf.argmax(array_1, 0).eval(),
      '\n2-D[every row max index] max index:', tf.argmax(array_1, 1).eval(),
      '\n2-D[every row max index] max index:', tf.argmax(array_1, -1).eval(),
      '\n1-D[every column min index] min index:', tf.argmin(array_1, 0).eval(),
      '\n2-D[every row min index] min index:', tf.argmin(array_1, 1).eval(),
      '\n2-D[every row min index] min index:', tf.argmin(array_1, -1).eval())
    print('array_2 value:\n', array_2,
      '\narray_2 shape:', array_2.shape,
      '\n1-D:\n', tf.argmax(array_2, 0).eval(),
      '\n2-D:\n', tf.argmax(array_2, 1).eval(),
      '\n3-D:\n', tf.argmax(array_2, 2).eval(),
      '\n3-D:\n', tf.argmax(array_2, -1).eval(),
      '\n1-D:\n', tf.argmin(array_2, 0).eval(),
      '\n2-D:\n', tf.argmin(array_2, 1).eval(),
      '\n2-D:\n', tf.argmin(array_2, 2).eval(),
      '\n3-D:\n', tf.argmin(array_2, -1).eval())
array_1 value:
 [[10 11 12]
 [13 14 15]] 
array_1 shape: (2, 3) 
1-D[every column max index] max index: [1 1 1] 
2-D[every row max index] max index: [2 2] 
2-D[every row max index] max index: [2 2] 
1-D[every column min index] min index: [0 0 0] 
2-D[every row min index] min index: [0 0] 
2-D[every row min index] min index: [0 0]
array_2 value:
 [[[10 11 12]
  [13 14 15]]

 [[ 2  9  4]
  [ 7  3  5]]

 [[ 9  6  4]
  [ 8  7  7]]] 
array_2 shape: (3, 2, 3) 
1-D:
 [[0 0 0]
 [0 0 0]] 
2-D:
 [[1 1 1]
 [1 0 1]
 [0 1 1]] 
3-D:
 [[2 2]
 [1 0]
 [0 0]] 
3-D:
 [[2 2]
 [1 0]
 [0 0]] 
1-D:
 [[1 2 1]
 [1 1 1]] 
2-D:
 [[0 0 0]
 [0 1 0]
 [1 0 0]] 
2-D:
 [[0 0]
 [0 1]
 [2 1]] 
3-D:
 [[0 0]
 [0 1]
 [2 1]]

This applies directly to the MNIST dataset's one-hot encoding: argmax recovers the digit label.

array = np.array([0, 0, 0, 0, 0, 1, 0, 0, 0, 0])
print('array value:\n', array,
      '\narray type:', array.shape,
      '\narray max index:', np.argmax(array))
array value:
 [0 0 0 0 0 1 0 0 0 0] 
array type: (10,) 
array max index: 5
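For a whole batch of one-hot labels, the same idea with axis=1 recovers every digit at once (a small sketch):

labels = np.array([[0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1]])
print(np.argmax(labels, axis=1))   # [2 0 3]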

tensorflow.random_normal()

This was covered earlier in the series; here is a quick walkthrough again.

import matplotlib.pyplot as plt


norm = tf.random_normal([1000])
with tf.Session() as sess:
    norm_data=norm.eval()
plt.hist(norm_data, bins=50)
plt.show()

[figure]

matplotlib.pyplot.hist

matplotlib.pyplot.hist(x, bins=None, range=None, density=False, weights=None, cumulative=False, bottom=None, histtype='bar', align='mid', orientation='vertical', rwidth=None, log=False, color=None, label=None, stacked=False, *, data=None, **kwargs)
Official plt.hist documentation

Parameter     Description                          Type / values
x             the data                             numeric
bins          number of bars                       int
color         color                                "r", "g", "y", "c"
density       display as a density                 bool
range         range of the x axis                  numeric tuple (start, end)
bottom        starting position of the y axis      numeric
histtype      bar style                            "bar": rectangles, "barstacked": stacked bars, "step": unfilled line, "stepfilled": filled line
align         alignment                            "left", "mid", "right"
orientation   orientation                          "horizontal", "vertical"
log           logarithmic scale                    bool
Example:

mu = 100
sigma = 20 
x = mu + sigma * np.random.randn(2000)
plt.hist(x=x, bins=10)
plt.show()

[figure]

The softmax() function in TensorFlow

Maps the input values across the whole list into the range 0~1 (the outputs sum to 1):

x = np.array([-1.1, 2.2, 3.3, 9.6])
pred = tf.nn.softmax(x)
with tf.Session() as sess:
    probability = sess.run(pred)
print('x value:\n', x,
      '\nafter softmax value:\n', probability,
      '\nsum of probability:', sum(probability))
x value:
 [-1.1  2.2  3.3  9.6] 
after softmax value:
 [2.24893868e-05 6.09746624e-04 1.83178009e-03 9.97535984e-01] 
sum of probability: 1.0

Getting Started

This model performs image classification:
from prediction problems to classification problems,
from linear regression to logistic regression.

import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
import math
import matplotlib.pyplot as plt


mnist = input_data.read_data_sets('./mnist_dataset/', one_hot=True)
Extracting ./mnist_dataset/train-images-idx3-ubyte.gz
Extracting ./mnist_dataset/train-labels-idx1-ubyte.gz
Extracting ./mnist_dataset/t10k-images-idx3-ubyte.gz
Extracting ./mnist_dataset/t10k-labels-idx1-ubyte.gz
print('Number of training sets:', mnist.train.num_examples,
      '\nNumber of validation sets:', mnist.validation.num_examples,
      '\nNumber of test sets:', mnist.test.num_examples)
Number of training sets: 55000 
Number of validation sets: 5000 
Number of test sets: 10000

Inspect the train data:

print('train images shape:', mnist.train.images.shape,
      '\nlabels shape:', mnist.train.labels.shape)
train images shape: (55000, 784) 
labels shape: (55000, 10)

The shape and values of a single image:

print('Shape:', mnist.train.images[520].shape)
print('Value:', mnist.train.images[520])
Shape: (784,)
Value: [0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.21960786
 0.7568628  0.65882355 0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.5254902  0.9921569  0.8313726
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.2509804  0.9921569  0.8313726  0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.14901961
 0.92549026 0.94117653 0.16470589 0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.427451   0.9921569  0.8588236
 0.04313726 0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.6666667  0.9960785  0.8352942  0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.6627451
 0.9921569  0.8313726  0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.6627451  0.9921569  0.8313726
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.6666667  0.9921569  0.8313726  0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.6627451
 0.9921569  0.8313726  0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.6666667  1.         0.8352942
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.6627451  0.9921569  0.8313726  0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.6666667
 0.9921569  0.8313726  0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.6627451  0.9921569  0.8313726
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.6627451  0.9921569  0.8313726  0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.6666667
 1.         0.8352942  0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.6627451  0.9921569  0.8313726
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.6313726  0.9921569  0.8470589  0.02352941 0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.2509804
 0.9921569  0.9960785  0.24705884 0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.04313726 0.68235296 0.9294118
 0.14509805 0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.         0.         0.
 0.         0.         0.         0.        ]

Randomly display 9 images:

def plot_num_images(num):
    if num < 1:
        print('INFO:The number of input pictures must be greater than zero!')
    else:
        choose_list = []
        for i in range(num):                              # pick `num` random training images
            choose_n = np.random.randint(len(mnist.train.images))
            choose_list.append(choose_n)
        fig = plt.gcf()
        fig.set_size_inches(18, 5 * math.ceil(num / 3))   # 3 images per row
        for i in range(num):
            ax_img = plt.subplot(math.ceil(num / 3), 3, i + 1)
            plt_img = mnist.train.images[choose_list[i]].reshape(28, 28)  # 784 -> 28x28
            ax_img.imshow(plt_img, cmap='binary')
            ax_img.set_title('label:' + str(np.argmax(mnist.train.labels[choose_list[i]])),
                             fontsize=10)
        plt.show()
plot_num_images(9)

[figure]

MNIST also provides plain integer labels; display 10 of them:

mnist_no_one_hot = input_data.read_data_sets('./mnist_dataset/', one_hot=False)
print(mnist_no_one_hot.train.labels[0:10])
Extracting ./mnist_dataset/train-images-idx3-ubyte.gz
Extracting ./mnist_dataset/train-labels-idx1-ubyte.gz
Extracting ./mnist_dataset/t10k-images-idx3-ubyte.gz
Extracting ./mnist_dataset/t10k-labels-idx1-ubyte.gz
[7 3 4 6 1 8 1 0 9 8]

Model Construction

Build the model from a single layer of neurons, passing the neurons' output through a softmax layer for classification.
Placeholders:
x = tf.placeholder(tf.float32, [None, 784], name='X')
y = tf.placeholder(tf.float32, [None, 10], name='Y')

Variables:

W = tf.Variable(tf.random_normal([784, 10]), name='W')
b = tf.Variable(tf.zeros([10]), name='b')
forward = tf.matmul(x, W) + b
pred = tf.nn.softmax(forward)

Training the Model

Set the training parameters:

epochs = 50
batch_size = 100
total_batch = int(mnist.train.num_examples / batch_size)
display_step = 1
learning_rate = 0.01

Define the loss function:

loss_function = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), reduction_indices=1))
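A side note: taking log(pred) of an explicit softmax can underflow when a predicted probability reaches 0. A numerically safer variant (a sketch of an optional alternative, not the loss used in the rest of this post) lets TensorFlow fuse softmax and cross-entropy on the raw logits:

# Optional, numerically safer variant; assumes the `forward` logits defined above
loss_function_stable = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=forward))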

Choose the optimizer:

optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss_function)

Define accuracy and run training:

correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess = tf.Session()
loss_list = []
acc_list = []
sess.run(tf.global_variables_initializer())
for epoch in range(epochs):
    for batch in range(total_batch):
        xs, ys = mnist.train.next_batch(batch_size)
        sess.run(optimizer, feed_dict={x:xs, y:ys})
    loss, acc = sess.run([loss_function, accuracy],
                         feed_dict={x:mnist.validation.images, y:mnist.validation.labels})
    loss_list.append(loss)
    acc_list.append(acc)
    if (epoch + 1) % display_step == 0:
        print('Epoch: %2d' % (epoch + 1), 'Loss= %6f' % loss, 'Accuracy=%4f' % acc)
print('INFO:Train Finished!')
Epoch:  1 Loss= 5.181013 Accuracy=0.289400
Epoch:  2 Loss= 3.226336 Accuracy=0.463000
Epoch:  3 Loss= 2.442250 Accuracy=0.557600
Epoch:  4 Loss= 2.023449 Accuracy=0.615400
Epoch:  5 Loss= 1.761311 Accuracy=0.662400
Epoch:  6 Loss= 1.580896 Accuracy=0.687800
Epoch:  7 Loss= 1.448330 Accuracy=0.714400
Epoch:  8 Loss= 1.344897 Accuracy=0.732200
Epoch:  9 Loss= 1.263207 Accuracy=0.748000
Epoch: 10 Loss= 1.195717 Accuracy=0.759400
Epoch: 11 Loss= 1.139762 Accuracy=0.770400
Epoch: 12 Loss= 1.091881 Accuracy=0.776400
Epoch: 13 Loss= 1.050353 Accuracy=0.783200
Epoch: 14 Loss= 1.014509 Accuracy=0.789000
Epoch: 15 Loss= 0.983110 Accuracy=0.794000
Epoch: 16 Loss= 0.954650 Accuracy=0.799000
Epoch: 17 Loss= 0.929542 Accuracy=0.804800
Epoch: 18 Loss= 0.906525 Accuracy=0.808400
Epoch: 19 Loss= 0.885733 Accuracy=0.811800
Epoch: 20 Loss= 0.866608 Accuracy=0.815400
Epoch: 21 Loss= 0.849514 Accuracy=0.819400
Epoch: 22 Loss= 0.833153 Accuracy=0.821600
Epoch: 23 Loss= 0.818339 Accuracy=0.826200
Epoch: 24 Loss= 0.804542 Accuracy=0.828200
Epoch: 25 Loss= 0.791879 Accuracy=0.832000
Epoch: 26 Loss= 0.779769 Accuracy=0.832400
Epoch: 27 Loss= 0.768629 Accuracy=0.834600
Epoch: 28 Loss= 0.758255 Accuracy=0.835400
Epoch: 29 Loss= 0.748280 Accuracy=0.837400
Epoch: 30 Loss= 0.738828 Accuracy=0.840000
Epoch: 31 Loss= 0.730522 Accuracy=0.841800
Epoch: 32 Loss= 0.721803 Accuracy=0.843400
Epoch: 33 Loss= 0.713824 Accuracy=0.844000
Epoch: 34 Loss= 0.706201 Accuracy=0.845800
Epoch: 35 Loss= 0.699364 Accuracy=0.847600
Epoch: 36 Loss= 0.692261 Accuracy=0.849000
Epoch: 37 Loss= 0.685300 Accuracy=0.850600
Epoch: 38 Loss= 0.679159 Accuracy=0.852000
Epoch: 39 Loss= 0.673282 Accuracy=0.853000
Epoch: 40 Loss= 0.667670 Accuracy=0.853000
Epoch: 41 Loss= 0.661976 Accuracy=0.854600
Epoch: 42 Loss= 0.656577 Accuracy=0.855800
Epoch: 43 Loss= 0.651724 Accuracy=0.857200
Epoch: 44 Loss= 0.646391 Accuracy=0.858400
Epoch: 45 Loss= 0.641690 Accuracy=0.859000
Epoch: 46 Loss= 0.637001 Accuracy=0.859600
Epoch: 47 Loss= 0.632599 Accuracy=0.860400
Epoch: 48 Loss= 0.628048 Accuracy=0.861800
Epoch: 49 Loss= 0.623791 Accuracy=0.863800
Epoch: 50 Loss= 0.620104 Accuracy=0.863800
INFO:Train Finished!

Visualizing loss and accuracy:

fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.set_title('Train Picture')
ax1.set_ylabel('Loss value')
line1, = ax1.plot(loss_list, color='b', label='Loss')
ax2 = ax1.twinx()
ax2.set_ylabel('Accuracy value')
line2, = ax2.plot(acc_list, color='r', label='Accuracy')
plt.legend(handles=(line1, line2), loc='best')
plt.show()

[figure]

Model Evaluation

Evaluate on the test set:

acc_test = sess.run(accuracy, feed_dict={x:mnist.test.images, y:mnist.test.labels})
print('Test Accuracy:', acc_test)
Test Accuracy: 0.8624

Evaluate on the validation set:

acc_valid = sess.run(accuracy, feed_dict={x:mnist.validation.images, y:mnist.validation.labels})
print('Validation Accuracy:', acc_valid)
Validation Accuracy: 0.8638

Evaluate on the training set:

acc_train = sess.run(accuracy, feed_dict={x:mnist.train.images, y:mnist.train.labels})
print('Train Accuracy:', acc_train)
Train Accuracy: 0.85856366

Applying the Model

def plot_apply_images(num):
    if num < 1:
        print('INFO:The number of input pictures must be greater than zero!')
    else:
        choose_list = []
        for i in range(num):                     # pick `num` random test images
            choose_n = np.random.randint(len(mnist.test.images))
            choose_list.append(choose_n)
        fig = plt.gcf()
        fig.set_size_inches(18, 5 * math.ceil(num / 3))
        # run the trained model once over the whole test set
        prediction_result = sess.run(tf.argmax(pred, 1), feed_dict={x:mnist.test.images})
        for i in range(num):
            ax_img = plt.subplot(math.ceil(num / 3), 3, i + 1)
            plt_img = mnist.test.images[choose_list[i]].reshape(28, 28)
            ax_img.imshow(plt_img, cmap='binary')
            ax_img.set_title('Original label:' \
                             + str(np.argmax(mnist.test.labels[choose_list[i]])) \
                             + ' Predict label:' \
                             + str(prediction_result[choose_list[i]]),
                             fontsize=10)
            ax_img.set_xticks([])                # hide axis ticks
            ax_img.set_yticks([])
        plt.show()
plot_apply_images(9)

[figure]

MNIST Handwritten Digit Recognition: TensorFlow 2.x Implementation

Prerequisite Functions

tensorflow.one_hot

tf.one_hot(
          indices,
          depth,
          on_value=None,
          off_value=None,
          axis=None,
          dtype=None,
          name=None
)
Only the most common usage is covered here:
depth: the length of the one-hot vectors
indices: the input data; may be a list, array, matrix, etc.
dtype: the data type, integer or floating point

import tensorflow as tf
import numpy as np


x_list = [3, 4]
x_list_onehot = tf.one_hot(x_list, depth=10)
print('x_list type:', type(x_list),
      '\nx_list:', x_list,
      '\nx_list_onehot:\n',x_list_onehot.numpy())

x_array = np.array([3, 4])
x_array_onehot = tf.one_hot(x_array, depth=10, dtype=tf.int32)
print('x_array type:', type(x_array),
      '\nx_array:', x_array,
      '\nx_array_onehot:\n',x_array_onehot.numpy())
x_list type: <class 'list'> 
x_list: [3, 4] 
x_list_onehot:
 [[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]]
x_array type: <class 'numpy.ndarray'> 
x_array: [3 4] 
x_array_onehot:
 [[0 0 0 1 0 0 0 0 0 0]
 [0 0 0 0 1 0 0 0 0 0]]

Using One-Hot Encoding

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import math


tf.__version__
'2.0.0'

Loading the data

The files are cached at C:\Users\your_user_name\.keras\datasets

mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Display the data dimensions

TensorFlow 2.x stores the data in a different form than TensorFlow 1.x, so the image-display function below differs from the 1.x version; the dataset stores its pixels as uint8 values in the range 0~255.

print('Train image shape:', train_images.shape, 'Train label shape:', train_labels.shape)
print('Test image shape:', test_images.shape, 'Test label shape:', test_labels.shape)
print('images dtype:', type(train_images[0, 0, 0]))
Train image shape: (60000, 28, 28) Train label shape: (60000,)
Test image shape: (10000, 28, 28) Test label shape: (10000,)
images dtype: <class 'numpy.uint8'>

Display training-set images:

def plot_num_images(num):
    if num < 1:
        print('INFO:The number of input pictures must be greater than zero!')
    else:
        choose_list = []
        for i in range(num):
            choose_n = np.random.randint(train_images.shape[0])
            choose_list.append(choose_n)
        fig = plt.gcf()
        fig.set_size_inches(18, 5 * math.ceil(num / 3))
        for i in range(num):
            ax_img = plt.subplot(math.ceil(num / 3), 3, i + 1)
            plt_img = train_images[choose_list[i]]
            ax_img.imshow(plt_img, cmap='binary')
            ax_img.set_title('label:' + str(train_labels[choose_list[i]]),
                             fontsize=10)
        plt.show()
plot_num_images(9)

[figure]

Split off a validation set:

total_num = len(train_images)
valid_split = 0.2
train_num = int(total_num * (1 - valid_split))

train_x = train_images[:train_num]
train_y = train_labels[:train_num]

valid_x = train_images[train_num:]
valid_y = train_labels[train_num:]

test_x = test_images
test_y = test_labels

Check the validation-set shape:

print('validation dataset scale:', valid_x.shape)
validation dataset scale: (12000, 28, 28)

Flatten the data:

train_x = train_x.reshape(-1, 784)
valid_x = valid_x.reshape(-1, 784)
test_x = test_x.reshape(-1, 784)

Normalize the data and one-hot encode the labels:

train_x = tf.cast(train_x / 255.0, tf.float32)
valid_x = tf.cast(valid_x / 255.0, tf.float32)
test_x = tf.cast(test_x / 255.0, tf.float32)

train_y = tf.one_hot(train_y, depth=10)
valid_y = tf.one_hot(valid_y, depth=10)
test_y = tf.one_hot(test_y, depth=10)

print('*******demo*******')
valid_y
*******demo*******

<tf.Tensor: id=25, shape=(12000, 10), dtype=float32, numpy=
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 0., 1., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 1., 0.]], dtype=float32)>

Model Definition

Forward computation:

def model(x, w, b):
    pred = tf.matmul(x, w) + b
    return tf.nn.softmax(pred)

Create the variables:

W = tf.Variable(tf.random.normal([784, 10], mean=0.0, stddev=1.0, dtype=tf.float32))
B = tf.Variable(tf.zeros([10]), name='B')

Define the loss function:

def loss(x, y, w, b):
    pred = model(x, w, b)
    loss_ = tf.keras.losses.categorical_crossentropy(y_true=y, y_pred=pred)
    return tf.reduce_mean(loss_)

Set the training hyperparameters:

epochs = 40
batch_size = 50
learning_rate = 0.001

Define the gradient computation:

def grad(x, y, w, b):
    with tf.GradientTape() as tape:
        loss_ = loss(x, y, w, b)
    return tape.gradient(loss_, [w, b])

Set the optimizer:

optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

Define accuracy:

def accuracy(x, y, w, b):
    pred = model(x, w, b)
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    return tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

Model Training

total_step = int(train_num / batch_size)

loss_list_train = []
loss_list_valid = []
acc_list_train = []
acc_list_valid = []

for epoch in range(epochs):
    for step in range(total_step):
        xs = train_x[step * batch_size:(step + 1) * batch_size]
        ys = train_y[step * batch_size:(step + 1) * batch_size]
        
        grads = grad(xs, ys, W, B)
        optimizer.apply_gradients(zip(grads, [W, B]))
    loss_train = loss(train_x, train_y, W, B).numpy()
    loss_valid = loss(valid_x, valid_y, W, B).numpy()
    acc_train = accuracy(train_x, train_y, W, B).numpy()
    acc_valid = accuracy(valid_x, valid_y, W, B).numpy()
    loss_list_train.append(loss_train)
    loss_list_valid.append(loss_valid)
    acc_list_train.append(acc_train)
    acc_list_valid.append(acc_valid)
    print('Epoch: %2d' % (epoch + 1),
          'train_loss= %6f' % loss_train,
          'train_acc=%6f' % acc_train,
          'val_loss= %6f' % loss_valid,
          'val_acc=%4f' % acc_valid)
print('INFO:Train Finished!')
Epoch:  1 train_loss= 1.794028 train_acc=0.663083 val_loss= 1.688922 val_acc=0.675667
Epoch:  2 train_loss= 1.026453 train_acc=0.787437 val_loss= 0.956573 val_acc=0.796583
Epoch:  3 train_loss= 0.784511 train_acc=0.833396 val_loss= 0.735276 val_acc=0.839833
Epoch:  4 train_loss= 0.665004 train_acc=0.856875 val_loss= 0.629650 val_acc=0.864167
Epoch:  5 train_loss= 0.592476 train_acc=0.871167 val_loss= 0.567846 val_acc=0.875583
Epoch:  6 train_loss= 0.543093 train_acc=0.880000 val_loss= 0.526859 val_acc=0.883833
Epoch:  7 train_loss= 0.506531 train_acc=0.886771 val_loss= 0.497269 val_acc=0.889583
Epoch:  8 train_loss= 0.478374 train_acc=0.891437 val_loss= 0.474750 val_acc=0.893000
Epoch:  9 train_loss= 0.455545 train_acc=0.894813 val_loss= 0.456882 val_acc=0.895000
Epoch: 10 train_loss= 0.436355 train_acc=0.898396 val_loss= 0.442191 val_acc=0.898250
Epoch: 11 train_loss= 0.420245 train_acc=0.901312 val_loss= 0.429889 val_acc=0.900250
Epoch: 12 train_loss= 0.406171 train_acc=0.903313 val_loss= 0.419219 val_acc=0.901833
Epoch: 13 train_loss= 0.393859 train_acc=0.905500 val_loss= 0.410134 val_acc=0.902167
Epoch: 14 train_loss= 0.383010 train_acc=0.907292 val_loss= 0.401944 val_acc=0.903417
Epoch: 15 train_loss= 0.373411 train_acc=0.909083 val_loss= 0.394878 val_acc=0.904167
Epoch: 16 train_loss= 0.364686 train_acc=0.910500 val_loss= 0.388374 val_acc=0.905333
Epoch: 17 train_loss= 0.356754 train_acc=0.911667 val_loss= 0.382529 val_acc=0.906333
Epoch: 18 train_loss= 0.349559 train_acc=0.912833 val_loss= 0.377245 val_acc=0.907083
Epoch: 19 train_loss= 0.342927 train_acc=0.914021 val_loss= 0.372456 val_acc=0.907417
Epoch: 20 train_loss= 0.336870 train_acc=0.914729 val_loss= 0.368039 val_acc=0.908333
Epoch: 21 train_loss= 0.331268 train_acc=0.916083 val_loss= 0.364073 val_acc=0.908667
Epoch: 22 train_loss= 0.326073 train_acc=0.917188 val_loss= 0.360308 val_acc=0.909500
Epoch: 23 train_loss= 0.321280 train_acc=0.918042 val_loss= 0.356898 val_acc=0.910583
Epoch: 24 train_loss= 0.316829 train_acc=0.918854 val_loss= 0.353810 val_acc=0.911167
Epoch: 25 train_loss= 0.312685 train_acc=0.919375 val_loss= 0.351033 val_acc=0.912667
Epoch: 26 train_loss= 0.308808 train_acc=0.919667 val_loss= 0.348400 val_acc=0.912667
Epoch: 27 train_loss= 0.305181 train_acc=0.920479 val_loss= 0.345925 val_acc=0.913083
Epoch: 28 train_loss= 0.301788 train_acc=0.921229 val_loss= 0.343634 val_acc=0.913333
Epoch: 29 train_loss= 0.298610 train_acc=0.921792 val_loss= 0.341496 val_acc=0.913750
Epoch: 30 train_loss= 0.295608 train_acc=0.922375 val_loss= 0.339517 val_acc=0.913833
Epoch: 31 train_loss= 0.292766 train_acc=0.923021 val_loss= 0.337612 val_acc=0.914250
Epoch: 32 train_loss= 0.290099 train_acc=0.923458 val_loss= 0.335835 val_acc=0.915000
Epoch: 33 train_loss= 0.287562 train_acc=0.923813 val_loss= 0.334138 val_acc=0.915333
Epoch: 34 train_loss= 0.285154 train_acc=0.924458 val_loss= 0.332562 val_acc=0.915750
Epoch: 35 train_loss= 0.282870 train_acc=0.924750 val_loss= 0.331062 val_acc=0.915583
Epoch: 36 train_loss= 0.280705 train_acc=0.925313 val_loss= 0.329646 val_acc=0.915583
Epoch: 37 train_loss= 0.278647 train_acc=0.925646 val_loss= 0.328310 val_acc=0.916083
Epoch: 38 train_loss= 0.276685 train_acc=0.925958 val_loss= 0.327048 val_acc=0.916583
Epoch: 39 train_loss= 0.274815 train_acc=0.926458 val_loss= 0.325852 val_acc=0.916750
Epoch: 40 train_loss= 0.273008 train_acc=0.926813 val_loss= 0.324666 val_acc=0.916583
INFO:Train Finished!

Visualizing loss and accuracy:

fig = plt.gcf()
fig.set_size_inches(10, 5)
ax1 = fig.add_subplot(111)
ax1.set_title('Train and Validation Picture')
ax1.set_ylabel('Loss value')
line1, = ax1.plot(loss_list_train, color=(0.5, 0.5, 1.0), label='Loss train')
line2, = ax1.plot(loss_list_valid, color=(0.5, 1.0, 0.5), label='Loss valid')
ax2 = ax1.twinx()
ax2.set_ylabel('Accuracy value')
line3, = ax2.plot(acc_list_train, color=(0.5, 0.5, 0.5), label='Accuracy train')
line4, = ax2.plot(acc_list_valid, color=(1, 0, 0), label='Accuracy valid')
plt.legend(handles=(line1, line2, line3, line4), loc='best')
plt.show()

[figure]

Model Evaluation

acc_test = accuracy(test_x, test_y, W, B).numpy()
print('Test Accuracy:', acc_test)
Test Accuracy: 0.9162

Applying the Model

def plot_apply_images(num):
    if num < 1:
        print('INFO:The number of input pictures must be greater than zero!')
    else:
        choose_list = []
        for i in range(num):
            choose_n = np.random.randint(len(test_x))
            choose_list.append(choose_n)
        fig = plt.gcf()
        fig.set_size_inches(18, 5 * math.ceil(num / 3))
        
        pred = model(test_x, W, B)
        prediction_result = tf.argmax(pred, 1).numpy()

        for i in range(num):
            ax_img = plt.subplot(math.ceil(num / 3), 3, i + 1)
            plt_img = test_images[choose_list[i]]
            ax_img.imshow(plt_img, cmap='binary')
            ax_img.set_title('Original label:' \
                             + str(test_labels[choose_list[i]]) \
                             + ' Predict label:' \
                             + str(prediction_result[choose_list[i]]),
                             fontsize=10)
            ax_img.set_xticks([])
            ax_img.set_yticks([])
        plt.show()
plot_apply_images(9)

[figure]

Using Integer Labels

# Reload and re-split the data exactly as before, but keep the labels as integers
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
total_num = len(train_images)
valid_split = 0.2
train_num = int(total_num * (1 - valid_split))

train_x = train_images[:train_num]
train_y = train_labels[:train_num]

valid_x = train_images[train_num:]
valid_y = train_labels[train_num:]

test_x = test_images
test_y = test_labels

# Flatten and normalize the images; the labels are not one-hot encoded this time
train_x = train_x.reshape(-1, 784)
valid_x = valid_x.reshape(-1, 784)
test_x = test_x.reshape(-1, 784)
train_x = tf.cast(train_x / 255.0, tf.float32)
valid_x = tf.cast(valid_x / 255.0, tf.float32)
test_x = tf.cast(test_x / 255.0, tf.float32)

def model(x, w, b):
    pred = tf.matmul(x, w) + b
    return tf.nn.softmax(pred)

W = tf.Variable(tf.random.normal([784, 10], mean=0.0, stddev=1.0, dtype=tf.float32))
B = tf.Variable(tf.zeros([10]), name='B')

# sparse_categorical_crossentropy accepts integer labels directly
def loss(x, y, w, b):
    pred = model(x, w, b)
    loss_ = tf.keras.losses.sparse_categorical_crossentropy(y_true=y, y_pred=pred)
    return tf.reduce_mean(loss_)

epochs = 40
batch_size = 50
learning_rate = 0.001

def grad(x, y, w, b):
    with tf.GradientTape() as tape:
        loss_ = loss(x, y, w, b)
    return tape.gradient(loss_, [w, b])

optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

# Compare the argmax prediction against the integer label directly
def accuracy(x, y, w, b):
    pred = model(x, w, b)
    correct_prediction = tf.equal(tf.argmax(pred, 1), y)
    return tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

total_step = int(train_num / batch_size)

loss_list_train = []
loss_list_valid = []
acc_list_train = []
acc_list_valid = []

for epoch in range(epochs):
    for step in range(total_step):
        xs = train_x[step * batch_size:(step + 1) * batch_size]
        ys = train_y[step * batch_size:(step + 1) * batch_size]
        
        grads = grad(xs, ys, W, B)
        optimizer.apply_gradients(zip(grads, [W, B]))
    loss_train = loss(train_x, train_y, W, B).numpy()
    loss_valid = loss(valid_x, valid_y, W, B).numpy()
    acc_train = accuracy(train_x, train_y, W, B).numpy()
    acc_valid = accuracy(valid_x, valid_y, W, B).numpy()
    loss_list_train.append(loss_train)
    loss_list_valid.append(loss_valid)
    acc_list_train.append(acc_train)
    acc_list_valid.append(acc_valid)
    print('Epoch: %2d' % (epoch + 1),
          'train_loss= %6f' % loss_train,
          'train_acc=%6f' % acc_train,
          'val_loss= %6f' % loss_valid,
          'val_acc=%4f' % acc_valid)
print('INFO:Train Finished!')
Epoch:  1 train_loss= 1.630926 train_acc=0.684563 val_loss= 1.520867 val_acc=0.700000
Epoch:  2 train_loss= 1.012482 train_acc=0.790854 val_loss= 0.959439 val_acc=0.802167
Epoch:  3 train_loss= 0.793968 train_acc=0.833063 val_loss= 0.764571 val_acc=0.839833
Epoch:  4 train_loss= 0.678524 train_acc=0.853562 val_loss= 0.661618 val_acc=0.858167
Epoch:  5 train_loss= 0.604963 train_acc=0.865708 val_loss= 0.595957 val_acc=0.870250
Epoch:  6 train_loss= 0.553632 train_acc=0.875583 val_loss= 0.550357 val_acc=0.876667
Epoch:  7 train_loss= 0.515726 train_acc=0.882583 val_loss= 0.516921 val_acc=0.882417
Epoch:  8 train_loss= 0.486057 train_acc=0.887333 val_loss= 0.491046 val_acc=0.886333
Epoch:  9 train_loss= 0.462157 train_acc=0.890958 val_loss= 0.470663 val_acc=0.889917
Epoch: 10 train_loss= 0.442267 train_acc=0.893729 val_loss= 0.453869 val_acc=0.893250
Epoch: 11 train_loss= 0.425563 train_acc=0.896375 val_loss= 0.439977 val_acc=0.895083
Epoch: 12 train_loss= 0.411025 train_acc=0.899437 val_loss= 0.427945 val_acc=0.897250
Epoch: 13 train_loss= 0.398382 train_acc=0.901667 val_loss= 0.417644 val_acc=0.899000
Epoch: 14 train_loss= 0.387267 train_acc=0.903792 val_loss= 0.408653 val_acc=0.899750
Epoch: 15 train_loss= 0.377387 train_acc=0.905229 val_loss= 0.400638 val_acc=0.901250
Epoch: 16 train_loss= 0.368553 train_acc=0.907229 val_loss= 0.393522 val_acc=0.902583
Epoch: 17 train_loss= 0.360529 train_acc=0.908813 val_loss= 0.387201 val_acc=0.902833
Epoch: 18 train_loss= 0.353216 train_acc=0.910312 val_loss= 0.381401 val_acc=0.903583
Epoch: 19 train_loss= 0.346536 train_acc=0.911604 val_loss= 0.376152 val_acc=0.904667
Epoch: 20 train_loss= 0.340360 train_acc=0.912938 val_loss= 0.371333 val_acc=0.905667
Epoch: 21 train_loss= 0.334666 train_acc=0.914521 val_loss= 0.367024 val_acc=0.905917
Epoch: 22 train_loss= 0.329405 train_acc=0.915313 val_loss= 0.363046 val_acc=0.906167
Epoch: 23 train_loss= 0.324505 train_acc=0.916396 val_loss= 0.359412 val_acc=0.907333
Epoch: 24 train_loss= 0.319910 train_acc=0.917125 val_loss= 0.356028 val_acc=0.908083
Epoch: 25 train_loss= 0.315597 train_acc=0.918187 val_loss= 0.352929 val_acc=0.909250
Epoch: 26 train_loss= 0.311598 train_acc=0.918771 val_loss= 0.349997 val_acc=0.909833
Epoch: 27 train_loss= 0.307848 train_acc=0.919583 val_loss= 0.347245 val_acc=0.910250
Epoch: 28 train_loss= 0.304321 train_acc=0.920458 val_loss= 0.344667 val_acc=0.910833
Epoch: 29 train_loss= 0.301005 train_acc=0.921229 val_loss= 0.342330 val_acc=0.911500
Epoch: 30 train_loss= 0.297862 train_acc=0.921792 val_loss= 0.340088 val_acc=0.912083
Epoch: 31 train_loss= 0.294898 train_acc=0.922333 val_loss= 0.337978 val_acc=0.912417
Epoch: 32 train_loss= 0.292092 train_acc=0.922896 val_loss= 0.336003 val_acc=0.912667
Epoch: 33 train_loss= 0.289436 train_acc=0.923396 val_loss= 0.334153 val_acc=0.912917
Epoch: 34 train_loss= 0.286909 train_acc=0.924062 val_loss= 0.332400 val_acc=0.913500
Epoch: 35 train_loss= 0.284467 train_acc=0.924646 val_loss= 0.330701 val_acc=0.914167
Epoch: 36 train_loss= 0.282164 train_acc=0.925083 val_loss= 0.329103 val_acc=0.914583
Epoch: 37 train_loss= 0.279960 train_acc=0.925812 val_loss= 0.327613 val_acc=0.915000
Epoch: 38 train_loss= 0.277867 train_acc=0.926104 val_loss= 0.326200 val_acc=0.915333
Epoch: 39 train_loss= 0.275865 train_acc=0.926375 val_loss= 0.324861 val_acc=0.915833
Epoch: 40 train_loss= 0.273955 train_acc=0.926771 val_loss= 0.323592 val_acc=0.916000
INFO:Train Finished!
Visualizing loss and accuracy:

fig = plt.gcf()
fig.set_size_inches(10, 5)
ax1 = fig.add_subplot(111)
ax1.set_title('Train and Validation Picture')
ax1.set_ylabel('Loss value')
line1, = ax1.plot(loss_list_train, color=(0.5, 0.5, 1.0), label='Loss train')
line2, = ax1.plot(loss_list_valid, color=(0.5, 1.0, 0.5), label='Loss valid')
ax2 = ax1.twinx()
ax2.set_ylabel('Accuracy value')
line3, = ax2.plot(acc_list_train, color=(0.5, 0.5, 0.5), label='Accuracy train')
line4, = ax2.plot(acc_list_valid, color=(1, 0, 0), label='Accuracy valid')
plt.legend(handles=(line1, line2, line3, line4), loc='best')
plt.show()

[figure]

Wrap-Up

TensorFlow 1.x and TensorFlow 2.x read the data in different ways. This post only covered building a single neuron layer; the next post will build a multi-layer neural network. The complete notebook has been uploaded to this page for download, and once the whole series is finished I will also push everything to my GitHub.

Reposted from blog.csdn.net/qq_39567427/article/details/105755147