TensorFlow Notes Ⅶ: CIFAR-10 Convolutional Neural Network


Dataset

Dataset Source

CIFAR-10 and CIFAR-100 are labeled subsets of the 80 Million Tiny Images dataset. They were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.

Dataset Overview

The CIFAR-10 dataset consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.

The dataset is divided into five training batches and one test batch, each containing 10,000 images. The test batch contains exactly 1,000 randomly selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images of one class than another. Between them, the training batches contain exactly 5,000 images of each class.

The classes in the dataset are listed below (the original page also shows 10 random images from each class):

airplane
automobile
bird
cat
deer
dog
frog
horse
ship
truck

Dataset Download

Version                                    Size     md5sum
CIFAR-10 python version                    163 MB   c58f30108f718f92721af3b95e74349a
CIFAR-10 Matlab version                    175 MB   70270af85842c9e89bb428ec9976c926
CIFAR-10 binary version (for C programs)   162 MB   c32a1d4ab5d03f1284b67883e8d87530

Reading the Dataset

After extraction, the data files include data_batch_1, data_batch_2, …, data_batch_5 (the training set) and test_batch (the test set). Each file is a Python "pickled" object produced with cPickle, so the pickle package is needed to read the data. The TensorFlow 1.x section below demonstrates basic use of pickle and the raw way of reading the data (TensorFlow 2.x has a built-in one-stop loading method). The following is the data-reading routine shown on the CIFAR-10 website

def unpickle(file):
    import pickle
    with open(file, 'rb') as fo:
        # each batch file unpickles to a dict with byte-string keys such as b'data' and b'labels'
        dict = pickle.load(fo, encoding='bytes')
    return dict
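
As a quick illustration (not from the original post; the path is the local extraction directory used later in this article), a batch file unpickles to a dict keyed by byte strings:

batch = unpickle('./data/cifar-10-batches-py/data_batch_1')
print(batch.keys())           # dict_keys([b'batch_label', b'labels', b'data', b'filenames'])
print(batch[b'data'].shape)   # (10000, 3072): each row is a flattened 32x32x3 image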

Convolutional Neural Networks

Convolutional Neural Network (CNN)

A convolutional neural network (CNN or ConvNet) is a class of deep neural networks most commonly applied to analyzing visual imagery. Because of their shared-weight architecture and translation-invariance characteristics, they are also known as shift-invariant or space-invariant artificial neural networks (SIANN). CNNs are used widely across many fields and are especially dominant in computer vision

Applications of Convolutional Neural Networks

Image classification (the topic of this post)


A CNN uses convolution kernels to extract the features that distinguish different classes of objects; its accuracy on this task has already surpassed that of humans

Object detection (to be covered in a later post)

Object detection is one of the hottest areas of computer vision: besides classifying an image, the classified objects must also be localized. TensorFlow ships a ready-made object detection API; for details see NO.1 Tensorflow在win10下实现object detection. The object detection API built into TensorFlow uses Faster R-CNN. The field also includes the YOLO family; the difference is that Faster R-CNN is a two-stage detector while the YOLO family is one-stage (for YOLOv3 usage see Python OpenCV 实现 Yolo v3). The main practical difference is that one-stage networks are much faster, while two-stage networks are more accurate than one-stage networks

As networks grow deeper and architectures are refined, object-detection accuracy keeps improving

Beyond these areas, CNNs (Convolutional Neural Networks) are also widely used for instance segmentation, image captioning, and visual question answering, and they show the same appeal in autonomous driving

How Convolutional Neural Networks Work

Limitations of Fully Connected Networks

In the earlier MNIST handwritten-digit example (see Tensorflow 笔记 Ⅴ——mnist全连接神经网络) we built a fully connected network directly, feeding every pixel of the image into the classifier. That worked because MNIST images are small: a 28x28x1 single-channel grayscale image has only 784 pixels. The numbers below show where fully connected networks break down
For MNIST, a fully connected network with a single 500-unit hidden layer needs 28 × 28 × 1 × 500 + 500 ≈ 400,000 parameters to fit.
For CIFAR-10, the same network needs 32 × 32 × 3 × 500 + 500 ≈ 1,500,000 parameters.
For a dataset of 960x720x3 color images, it needs 960 × 720 × 3 × 500 + 500 ≈ 1,036,800,500 ≈ 1 billion parameters.
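
As a quick sanity check on these numbers, a minimal sketch (added here, not part of the original notebook):

def fc_params(h, w, c, hidden=500):
    """Weights plus biases of a single-hidden-layer fully connected network."""
    return h * w * c * hidden + hidden

print(fc_params(28, 28, 1))    # 392500      (~0.4 million, MNIST)
print(fc_params(32, 32, 3))    # 1536500     (~1.5 million, CIFAR-10)
print(fc_params(960, 720, 3))  # 1036800500  (~1 billion, 960x720 color images)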

Two problems follow directly from the growth in trainable parameters: computation slows down and the network overfits. We need fewer parameters, and this is where the CNN comes in

The Origin of Convolutional Neural Networks

• In 1962 Hubel and Wiesel studied the visual cortex of cats, and in their 1968 paper they described how the visual cortex of cats and monkeys contains neurons that respond individually to small regions of the visual field. When the eyes are not moving, the region of visual space within which a stimulus affects a single neuron is called its receptive field. Their conclusion was that neurons in the visual cortex receive information locally, responding only to stimuli in specific regions rather than perceiving the whole image
• In 1980 the Japanese researcher Fukushima proposed the neocognitron based on the receptive-field concept, which is regarded as the origin of the CNN architecture
• Later networks such as time-delay neural networks and shift-invariant neural networks gradually refined and expanded the CNN family (for more on the origins of CNNs, click here). Advances in hardware and the rise of big data then accelerated progress, providing both the compute and the data CNNs need

Structure of a Convolutional Neural Network

• Input layer: feeds each pixel of the image into the network
• Convolutional layer: convolves the image to extract features, strengthening the original signal and reducing noise
• Downsampling layer: pooling layers and convolutions with stride > 1; the goal is to reduce the number of trainable parameters, speed up training, and prevent overfitting
• Fully connected layer: learns weights on the extracted features
So the overall flow of a CNN is: input → convolution/pooling (several layers) → fully connected (several layers) → output layer

Operations in a Convolutional Neural Network

Convolution

(1) Dot product: multiply each element in the 3×3 dark-blue region of the 5×5 input matrix by the weight at the corresponding position (red numbers), then add the products; the result becomes the first element of the 3×3 output matrix (green).
(2) Slide the window: move the 3×3 weight matrix one cell to the right (i.e., a stride of 1).
(3) Repeat: multiply each element in the new dark region by its corresponding weight and sum again to obtain the second element of the output matrix; repeat the dot-product and slide steps until every value of the output matrix is filled. A stride can be chosen for the convolution, i.e., the number of pixels the kernel moves at each step.

Properties (a minimal NumPy sketch follows below):
    Weighted sum: every output is a weighted sum of input values
    Local connectivity: each output feature looks only at a patch of the input rather than every input feature
    Weight sharing: the kernel stays the same as it slides across the image
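
The following sketch (an illustration added here, not code from the original post) performs the sliding-window dot product described above as a "VALID" convolution of a 5×5 input with a 3×3 kernel:

import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image and take a dot product at each position."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0       # simple averaging kernel
print(conv2d_valid(image, kernel))   # a 3x3 output matrix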

Animated Illustration


Zero Padding

As the kernel slides, the edges of the image are effectively cropped, turning the 5×5 feature matrix into a 3×3 one. Padding the edges with extra "fake" pixels (usually zeros) lets the kernel's center sit on the original edge pixels while extending over the fake pixels beyond the edge, so the output (5×5, green) has the same size as the input (5×5, blue)
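
A minimal padding sketch (added here, reusing the conv2d_valid helper defined above): padding the 5×5 input with one ring of zeros makes the output of the same convolution 5×5 again:

padded = np.pad(image, pad_width=1, mode='constant', constant_values=0)
print(padded.shape)                        # (7, 7)
print(conv2d_valid(padded, kernel).shape)  # (5, 5), same size as the original input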

Multi-Channel Convolution

Convolving an image with a single kernel is essentially a filtering operation. Each kernel is one way of extracting features, like a sieve that picks out the parts of the image matching some pattern, and each kernel turns the image into another feature map: one kernel extracts one kind of feature. To extract features more fully, several kernels can be added to extract different features; this is multi-channel convolution. The values at the same position across the per-channel results are then summed to produce a feature map, and a bias term is added to each feature map to produce the final output feature maps
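
As an illustration of multi-channel convolution (added here, again assuming the conv2d_valid helper from above): a 3-channel input convolved with 2 kernels, summing the per-channel results and adding a bias, yields 2 output feature maps:

in_channels, out_channels = 3, 2
x = np.random.rand(5, 5, in_channels)                      # H x W x C input
kernels = np.random.rand(out_channels, 3, 3, in_channels)
biases = np.zeros(out_channels)

feature_maps = []
for o in range(out_channels):
    # sum the single-channel convolutions over all input channels, then add the bias
    fm = sum(conv2d_valid(x[:, :, c], kernels[o, :, :, c]) for c in range(in_channels))
    feature_maps.append(fm + biases[o])

print(np.stack(feature_maps, axis=-1).shape)   # (3, 3, 2): one 3x3 map per kernel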


Pooling

• A convolutional layer is usually followed by a downsampling layer, which shrinks the height and width of the feature matrix and thereby reduces the number of parameters
• Downsampling is the process of lowering the sampling rate of a signal
• Computing the mean or the maximum of some feature over a region of the image is an aggregation operation called pooling
• A convolutional layer detects local patterns in the features of the previous layer, while pooling merges semantically similar features, achieving dimensionality reduction
(1) Average pooling: take the mean of the pixels in the pooling region; the resulting features are more sensitive to background information
(2) Max pooling: take the maximum of all pixels in the pooling region; the resulting features are more sensitive to texture information
(3) Pooling can also use a stride, i.e., the number of pixels between one region and the next; a minimal sketch of both pooling types is given below
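
A minimal NumPy sketch of max and average pooling (added here for illustration), using a non-overlapping 2×2 window with stride 2:

def pool2d(x, k=2, mode='max'):
    """Non-overlapping k x k pooling with stride k (assumes H and W are divisible by k)."""
    h, w = x.shape
    blocks = x.reshape(h // k, k, w // k, k)
    return blocks.max(axis=(1, 3)) if mode == 'max' else blocks.mean(axis=(1, 3))

x = np.arange(16).reshape(4, 4)
print(pool2d(x, mode='max'))   # [[ 5  7] [13 15]]
print(pool2d(x, mode='avg'))   # [[ 2.5  4.5] [10.5 12.5]]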

Visualizing the Operations

I packaged the visualization page from the Stanford website into an exe executable; click here to explore image convolution interactively.
The same webpage also explains some of the principles behind the convolution operation.

CIFAR-10 with the TensorFlow 1.x Low-Level API

Functions Used Later

np.random.rand(d0, d1, d2…dn)

This function returns one or more samples drawn from a uniform distribution over [0, 1) (1 excluded); d0, d1, d2, …, dn give the dimensions of the generated array

import tensorflow as tf
from sklearn.preprocessing import OneHotEncoder
import numpy as np
import matplotlib.pyplot as plt
import pickle as p
import math
from time import time
import cv2
matrix = np.random.rand(5, 5)
print('matrix value:\n', matrix)
matrix value:
 [[0.78158819 0.4468711  0.46988561 0.54041034 0.62936808]
 [0.38545259 0.77640702 0.28374112 0.73323367 0.88914412]
 [0.29065348 0.73538156 0.21179341 0.22007978 0.91955227]
 [0.99769469 0.61212948 0.45331221 0.52345979 0.71789871]
 [0.59133094 0.66875222 0.3672258  0.63603345 0.88606763]]

The histogram makes it easy to see that this function produces uniformly distributed random numbers

data = np.random.rand(10000)
plt.hist(data, bins=50)
plt.show()

(figure: output_5_0.png)

The tf.nn.conv2d() Function

tf.nn.conv2d(
    input,
    filter=None,
    strides=None,
    padding=None,
    use_cudnn_on_gpu=True,
    data_format='NHWC',
    dilations=[1, 1, 1, 1],
    name=None,
    filters=None
)
• input: the data to convolve. Note: this is a 4-D tensor ([batch, in_height, in_width, in_channels]) and its type must be float32 or float64

• filter: the convolution kernel, shaped [filter_height, filter_width, in_channels, out_channels]
• strides: the stride along each dimension of the input, a 1-D vector of length 4

• padding: controls how the border is handled, either "SAME" or "VALID"; this value determines the convolution mode. "SAME" pads the edges so the output keeps the full input size; "VALID" does not pad

• use_cudnn_on_gpu: bool, whether to use cuDNN acceleration

• name: a name for the operation

• Return value: a tensor, i.e., the feature map

matrix = np.random.rand(1, 5, 5, 1)
kernel = np.random.rand(3, 3, 1, 1)
input_data = tf.Variable(matrix, dtype=tf.float32)
filter_data = tf.Variable(kernel, dtype=tf.float32)

y_same = tf.nn.conv2d(input_data, filter_data, strides=[1, 1, 1, 1], use_cudnn_on_gpu=True, padding='SAME')
y_valid = tf.nn.conv2d(input_data, filter_data, strides=[1, 1, 1, 1], use_cudnn_on_gpu=True, padding='VALID')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    y_1 = sess.run(y_same)
    y_2 = sess.run(y_valid)
    y_3 = sess.run(input_data)

print('The input data:\n', y_3,
      '\nThe input data shape:\n', y_3.shape,
      '\nThe \'SAME\' conv:\n', y_1,
      '\nThe \'SAME\' shape:\n', y_1.shape,
      '\nThe \'VALID\' conv:\n', y_2,
      '\nThe \'VALID\' shape:\n', y_2.shape,)
The input data:
 [[[[0.83288574]
   [0.7839328 ]
   [0.78448915]
   [0.27689502]
   [0.69270074]]

  [[0.1779074 ]
   [0.06221103]
   [0.80587375]
   [0.79026604]
   [0.7586798 ]]

  [[0.404163  ]
   [0.3389252 ]
   [0.80162454]
   [0.7276099 ]
   [0.04177679]]

  [[0.32564884]
   [0.70365876]
   [0.4756321 ]
   [0.5159088 ]
   [0.22641866]]

  [[0.42596477]
   [0.9774817 ]
   [0.6001973 ]
   [0.83533174]
   [0.28074622]]]] 
The input data shape:
 (1, 5, 5, 1) 
The 'SAME' conv:
 [[[[0.64746857]
   [1.5796132 ]
   [2.012709  ]
   [2.2981427 ]
   [1.5702264 ]]

  [[2.0084417 ]
   [2.713911  ]
   [2.7838511 ]
   [2.7585385 ]
   [1.6316812 ]]

  [[1.2610247 ]
   [2.3821883 ]
   [3.1639755 ]
   [2.8346455 ]
   [1.3397884 ]]

  [[1.9888229 ]
   [3.0812583 ]
   [3.632366  ]
   [2.2990842 ]
   [1.1451306 ]]

  [[1.1747122 ]
   [1.5842456 ]
   [1.5154519 ]
   [1.1904594 ]
   [0.5789158 ]]]] 
The 'SAME' shape:
 (1, 5, 5, 1) 
The 'VALID' conv:
 [[[[2.713911 ]
   [2.7838511]
   [2.7585385]]

  [[2.3821883]
   [3.1639755]
   [2.8346455]]

  [[3.0812583]
   [3.632366 ]
   [2.2990842]]]] 
The 'VALID' shape:
 (1, 3, 3, 1)

The tf.nn.max_pool() Function

tf.nn.max_pool(
    value,
    ksize,
    strides,
    padding,
    data_format='NHWC',
    name=None,
    input=None
)

The tf.nn.avg_pool() Function

tf.nn.avg_pool(
    value,
    ksize,
    strides,
    padding,
    data_format='NHWC',
    name=None,
    input=None
)
• value: the input to pool. A pooling layer usually follows a convolutional layer, so the input is typically the feature map produced by conv2d, again a 4-D tensor ([batch, height, width, channels])
• ksize: the size of the pooling window; pooling is normally not applied over the batch or channel dimensions, so ksize is usually [1, height, width, 1]
• strides: the stride along each dimension of the input, a 1-D vector of length 4
• padding: same meaning as in the convolution function
• data_format: a string, either NHWC or NCHW
• name: a name for the operation
• Return value: a tensor

The example below uses only tf.nn.max_pool()

matrix = np.random.rand(1, 5, 5, 1)
kernel = np.random.rand(3, 3, 1, 1)
input_data = tf.Variable(matrix, dtype=tf.float32)
filter_data = tf.Variable(kernel, dtype=tf.float32)

y = tf.nn.conv2d(input_data, filter_data, strides=[1, 1, 1, 1], use_cudnn_on_gpu=True, padding='SAME')
output_same = tf.nn.max_pool(value=y, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
output_valid = tf.nn.max_pool(value=y, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    y0 = sess.run(y)
    y1 = sess.run(output_same)
    y2 = sess.run(output_valid)
    
print('y shape:', y0.shape,
      '\noutput_same value:\n', y1,
      '\noutput_same shape:', y1.shape,
      '\noutput_valid value:\n', y2,
      '\noutput_valid shape:', y2.shape,)
y shape: (1, 5, 5, 1) 
output_same value:
 [[[[1.5538048 ]
   [1.5930182 ]
   [0.6173942 ]]

  [[1.8491311 ]
   [2.1718354 ]
   [1.7262425 ]]

  [[0.98157835]
   [1.1464852 ]
   [1.2454841 ]]]] 
output_same shape: (1, 3, 3, 1) 
output_valid value:
 [[[[1.5538048]
   [1.5930182]]

  [[1.8491311]
   [2.1718354]]]] 
output_valid shape: (1, 2, 2, 1)

2-D Convolution Example

Define the kernels

# horizontal-gradient (Sobel x) kernel
kernel_1 = np.array(
            [[-1, 0, 1],
            [-2, 0, 2],
            [-1, 0, 1]])

# vertical-gradient (Sobel y) kernel
kernel_2 = np.array(
            [[-1, -2, -1],
            [0, 0, 0],
            [1, 2, 1]])

# Laplacian-style edge-detection kernel
kernel_3 = np.array(
            [[1, 1, 1],
            [1, -15, 1],
            [1, 1, 1]])

Define the convolution function

def Conv(image, kernel):
    img_height, img_width = image.shape
    k_dim1, k_dim2 = kernel.shape

    AddH = int((k_dim1 - 1) / 2)   # rows of zero padding on each side
    AddW = int((k_dim2 - 1) / 2)   # columns of zero padding on each side

    # pad one ring of zeros around the image so edge pixels can be convolved
    image = np.row_stack((np.zeros([1, image.shape[1]]), image, np.zeros([1, image.shape[1]])))
    image = np.column_stack((np.zeros([image.shape[0], 1]), image, np.zeros([image.shape[0], 1])))

    output = np.zeros_like(image)

    for i in range(AddH, AddH + img_height):
        for j in range(AddW, AddW + img_width):
            output[i][j] = int(np.sum(image[i - AddH:i + AddH + 1,
                                            j - AddW:j + AddW + 1] * kernel))

    return output[AddH:AddH + img_height, AddW:AddW + img_width]

Convolve an image

image_row = cv2.cvtColor(cv2.imread('./data/demo.jpg'), cv2.COLOR_BGR2RGB)
# image_row is already RGB at this point, so convert with RGB2GRAY
image = cv2.cvtColor(image_row, cv2.COLOR_RGB2GRAY)

sobel_x = Conv(image, kernel_1)
sobel_y = Conv(image, kernel_2)
laplace = Conv(image, kernel_3)

Visualization

pic = [image_row, sobel_x, sobel_y, laplace]
fig = plt.gcf()
fig.set_size_inches(20, 20)
for i in range(len(pic)):
    ax_img = plt.subplot(1, 4, i + 1)
    plt_img = pic[i]
    if i == 0:
        ax_img.imshow(plt_img)
    else:
        ax_img.imshow(plt_img, cmap='gray')
plt.show()

(figure: output_18_0.png)

Simple Data Loading with the pickle Module

The pickle module can only be used within Python; almost every Python data type (lists, dicts, sets, classes, and so on) can be serialized with it. Pickled data is not human-readable, and pickle can only read data that pickle itself wrote. For example, the string I am Iron Man! written by pickle appears in the file as €X I am Iron Man!q .

file_data = 'I am Iron Man!'

print(file_data)
with open('./data/demo.txt', 'wb', 0) as f:
    p.dump(file_data,f)

print('*******line*******')

with open('./data/demo.txt', 'rb', 0) as f:
    data = p.load(f)
print(data)
I am Iron Man!
*******line*******
I am Iron Man!

The numpy.concatenate() Function

numpy.concatenate((a1,a2,…), axis = 0, out = None)
NumPy provides numpy.concatenate((a1,a2,…), axis=0), which joins several arrays in a single call; a1, a2, … are array-like arguments

Parameters:
a1, a2, …: sequence of array_like
    The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default)
axis: int, optional
    The axis along which the arrays will be joined. If axis is None, the arrays are flattened before use. The default is 0; the value indicates which dimension to concatenate along
out: ndarray, optional
    If provided, the destination to place the result. The shape must be correct, matching what the concatenation would have returned if no out argument were specified
Returns: ndarray
    The concatenated array

a=np.array([1,2,3])
b=np.array([11,22,33])
c=np.array([44,55,66])
result = np.concatenate((a,b,c),axis=0)
result
array([ 1,  2,  3, 11, 22, 33, 44, 55, 66])

Row-wise concatenation

a=np.array([[1,2,3],[4,5,6]])
b=np.array([[11,21,31],[7,8,9]])
result = np.concatenate((a,b),axis=0)
result
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [11, 21, 31],
       [ 7,  8,  9]])

Column-wise concatenation

a=np.array([[1,2,3],[4,5,6]])
b=np.array([[11,21,31],[7,8,9]])
result = np.concatenate((a,b),axis=1)
result
array([[ 1,  2,  3, 11, 21, 31],
       [ 4,  5,  6,  7,  8,  9]])

The numpy.transpose() Function

numpy.transpose(a, axes=None)

Reverses or permutes the axes of an array; returns the modified array
Parameters:
a: array_like
axes: a tuple of ints giving the permutation of the axes
Returns an array_like value
In short, for a matrix this is simply the transpose; for general input the axes are rearranged according to the tuple
Matrix

x = np.arange(4).reshape((2,2))
print('x:\n', x)
print('x transpose:\n', x.transpose())
x:
 [[0 1]
 [2 3]]
x transpose:
 [[0 2]
 [1 3]]

Multi-dimensional arrays

To unpack this cell: x = np.ones((1, 2, 3)) has shape 1x2x3; take the mapping 1→axis0, 2→axis1, 3→axis2. transpose(1, 0, 2) then means transpose(axis1, axis0, axis2), and mapping back gives x_transpose a shape of 2x1x3

x = np.ones((1, 2, 3))
print('x:\n', x)
print('x shape:\n', x.shape)
x_transpose = x.transpose(1, 0, 2)
print('x transpose:\n', x_transpose)
print('x transpose shape:\n', x_transpose.shape)
x:
 [[[1. 1. 1.]
  [1. 1. 1.]]]
x shape:
 (1, 2, 3)
x transpose:
 [[[1. 1. 1.]]

 [[1. 1. 1.]]]
x transpose shape:
 (2, 1, 3)

Likewise, with the mapping 10000→axis0, 3→axis1, 32→axis2, 32→axis3, transpose(0, 2, 3, 1) means transpose(axis0, axis2, axis3, axis1), so x_transpose has shape 10000x32x32x3

x = np.ones((10000, 3, 32, 32))
print('x shape:\n', x.shape)
x_transpose = x.transpose(0, 2, 3, 1)
print('x transpose shape:\n', x_transpose.shape)
x shape:
 (10000, 3, 32, 32)
x transpose shape:
 (10000, 32, 32, 3)

OneHotEncoder from sklearn.preprocessing

OneHotEncoder performs one-hot encoding

a = np.array([[1], [2], [3], [4], [5]])
one_hot = [[0],[1],[2],[3],[4],[5],[6],[7],[8],[9]]
encoder = OneHotEncoder(sparse=False)
encoder.fit(one_hot)
OneHotEncoder(categories='auto', drop=None, dtype=<class 'numpy.float64'>,
              handle_unknown='error', sparse=False)
encoder.transform(a)
array([[0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.]])

Getting Started

Download the Data

After extraction the data files are laid out as follows

batches.meta

data_batch_1

data_batch_2

data_batch_3

data_batch_4

data_batch_5

test_batch

import urllib.request
import os
import tarfile

url = 'https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'
filepath = './data/cifar-10-python.tar.gz'
if not os.path.exists('./data'):
    os.makedirs('./data')
if not os.path.isfile(filepath):
    result = urllib.request.urlretrieve(url, filepath)
    print('downloaded:', result)
else:
    print('Data file already exists.')

if not os.path.exists('./data/cifar-10-batches-py'):
    tfile = tarfile.open('./data/cifar-10-python.tar.gz', 'r:gz')
    result = tfile.extractall('./data/')
    print('Extracted to ./data/cifar-10-batches-py/')
else:
    print('Directory already exists.')
Data file already exists.
Directory already exists.

Define a function to load each batch

def load_CIFAR_batch(filename):
    with open(filename, 'rb') as f:
        data_dict = p.load(f, encoding='bytes')
        images = data_dict[b'data']
        labels = data_dict[b'labels']

        # each row is a flat 3072-value image stored channel-first;
        # reshape to (N, C, H, W) then transpose to (N, H, W, C)
        images = images.reshape(10000, 3, 32, 32)
        images = images.transpose(0, 2, 3, 1)
        labels = np.array(labels)

        return images, labels

Define the data-loading function that concatenates all the batches

The del statement can reduce memory usage and improve efficiency; leaving it out makes little difference

def load_CIFAR_data(data_dir):
    images_train = []
    labels_train = []
    for i in range(5):
        f = os.path.join(data_dir, 'data_batch_%d' % (i + 1))
        print('loading', f)
        image_batch, label_batch = load_CIFAR_batch(f)
        images_train.append(image_batch)
        labels_train.append(label_batch)
        print('train images nums:', len(images_train))
        Xtrain = np.concatenate(images_train)
        Ytrain = np.concatenate(labels_train)
        
        # del image_batch, label_batch

    Xtest, Ytest = load_CIFAR_batch(os.path.join(data_dir, 'test_batch'))
    print('finished loading CIFAR-10 data')

    return Xtrain, Ytrain, Xtest, Ytest

Load the data and inspect it

data_dir = 'data/cifar-10-batches-py'
Xtrain, Ytrain, Xtest, Ytest = load_CIFAR_data(data_dir)
loading data/cifar-10-batches-py\data_batch_1
train images nums: 1
loading data/cifar-10-batches-py\data_batch_2
train images nums: 2
loading data/cifar-10-batches-py\data_batch_3
train images nums: 3
loading data/cifar-10-batches-py\data_batch_4
train images nums: 4
loading data/cifar-10-batches-py\data_batch_5
train images nums: 5
finished loading CIFAR-10 data
print('training data shape:', Xtrain.shape)
print('training labels shape:', Ytrain.shape)
print('test data shape:', Xtest.shape)
print('test labels shape:', Ytest.shape)
training data shape: (50000, 32, 32, 3)
training labels shape: (50000,)
test data shape: (10000, 32, 32, 3)
test labels shape: (10000,)

Define an image-visualization function

label_dict = {0:"airplane", 1:"automobile", 2:"bird", 3:"cat", 4:"deer",
              5:"dog", 6:"frog", 7:"horse", 8:"ship", 9:"truck"}


def plot_images_labels(images, labels, num):
    total = len(images)
    fig = plt.gcf()
    fig.set_size_inches(15, math.ceil(num / 10) * 7)
    for i in range(0, num):
        choose_n = np.random.randint(0, total)
        ax = plt.subplot(math.ceil(num / 5), 5, 1 + i)
        ax.imshow(images[choose_n], cmap='binary')

        title = str(i) + ',' + label_dict[labels[choose_n]]
        ax.set_title(title, fontsize=10)
    plt.show()
plot_images_labels(Xtrain, Ytrain, 20)

(figure: output_50_0.png)

Data Preprocessing

Xtrain_normalize = Xtrain.astype('float32') / 255.0
Xtest_normalize = Xtest.astype('float32') / 255.0
encoder = OneHotEncoder(sparse=False)

one_hot = [[0],[1],[2],[3],[4],[5],[6],[7],[8],[9]]
encoder.fit(one_hot)
Ytrain_reshape = Ytrain.reshape(-1, 1)
Ytrain_onehot = encoder.transform(Ytrain_reshape)
Ytest_reshape = Ytest.reshape(-1, 1)
Ytest_onehot = encoder.transform(Ytest_reshape)
Ytrain_reshape.shape
(50000, 1)
one_hot = [[0],[1],[2],[3],[4],[5],[6],[7],[8],[9]]
encoder.fit(one_hot)
OneHotEncoder(categories='auto', drop=None, dtype=<class 'numpy.float64'>,
              handle_unknown='error', sparse=False)
Ytrain_reshape[:6]
array([[6],
       [9],
       [9],
       [4],
       [1],
       [1]])
encoder.transform(Ytrain_reshape[:6])
array([[0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.]])

Define the Network Structure

def weight(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1), name='W')


def bias(shape):
    return tf.Variable(tf.constant(0.1, shape=shape), name='b')


def conv2d(x,W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')


def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')
with tf.name_scope('input_layer'):
    x = tf.placeholder('float32', shape=[None, 32, 32, 3], name='x')

    
with tf.name_scope('conv_1'):
    # k_width, k_height, input_chn, output_chn
    W1 = weight([3, 3, 3, 32])
    b1 = bias([32])
    conv_1 = conv2d(x, W1) + b1
    conv_1 = tf.nn.relu(conv_1)


with tf.name_scope('pool_1'):
    pool_1 = max_pool_2x2(conv_1)


with tf.name_scope('conv_2'):
    W2 = weight([3, 3, 32, 64])
    b2 = bias([64])
    conv_2 = conv2d(pool_1, W2) + b2
    conv_2 = tf.nn.relu(conv_2)


with tf.name_scope('pool_2'):
    pool_2 = max_pool_2x2(conv_2)


with tf.name_scope('fcn'):
    W3 = weight([4096, 128])
    b3 = bias([128])
    flat = tf.reshape(pool_2, [-1, 4096])
    h = tf.nn.relu(tf.matmul(flat, W3) + b3)
    h_dropout = tf.nn.dropout(h, rate=0.2)


with tf.name_scope('output_layer'):
    W4 = weight([128, 10])
    b4 = bias([10])
    forward = tf.matmul(h_dropout, W4) + b4
    pred = tf.nn.softmax(forward)

Define the Optimizer

with tf.name_scope('optimizer'):
    y = tf.placeholder('float32', shape=[None, 10], name='label')
    loss_function = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=forward, labels=y))
    optimizer = tf.train.AdamOptimizer(
        learning_rate=0.0001).minimize(loss_function)

Define the Accuracy Metric

with tf.name_scope('evalution'):
    correct_prediction = tf.equal(tf.argmax(pred, 1),
                   tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction,'float32'))

Define Training Parameters

train_epochs = 100
batch_size = 20
total_batch = int(len(Xtrain)/batch_size)
epoch_list = []
accuracy_list = []
loss_list = []

epoch = tf.Variable(0, name='epoch', trainable=False)

startTime = time()

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
ckpt_dir = './CIFAR10_ckpt/'
if not os.path.exists(ckpt_dir):
    os.makedirs(ckpt_dir)

log_dir = './CIFAR10_log/'
if not os.path.exists(log_dir):
    os.makedirs(log_dir)

saver = tf.train.Saver(max_to_keep = 5)
writer = tf.summary.FileWriter(log_dir, sess.graph)

image_shape_input = tf.reshape(x, [-1, 32, 32, 3])  # CIFAR-10 images have 3 channels
tf.summary.image('input', image_shape_input, 9)
tf.summary.histogram('forward', forward)
tf.summary.scalar('loss', loss_function)
tf.summary.scalar('accuracy', accuracy)
merged_summary_op = tf.summary.merge_all()

ckpt = tf.train.latest_checkpoint(ckpt_dir)
if ckpt != None:
    saver.restore(sess, ckpt)
else:
    print("Training from scratch.")

start = sess.run(epoch)
print("Training starts from {} epoch.".format(start + 1))
Training from scratch.
Training starts from 1 epoch.

Start Training

def get_train_batch(number, batch_size):
    return Xtrain_normalize[number*batch_size:(number+1)*batch_size],\
              Ytrain_onehot[number*batch_size:(number+1)*batch_size]

for ep in range(start, train_epochs):
    for i in range(total_batch):
        batch_x, batch_y = get_train_batch(i,batch_size)
        sess.run(optimizer, feed_dict={x:batch_x,y:batch_y})
        # print progress every 100 batches
        if i % 100 == 0:
            print("Step {}".format(i), "finished")
  
    summary_str, loss, acc = sess.run([merged_summary_op, loss_function, accuracy],feed_dict={x:batch_x, y:batch_y})
    writer.add_summary(summary_str, ep)
    epoch_list.append(ep+1)
    loss_list.append(loss)
    accuracy_list.append(acc)

    print("Train epoch:", '%02d' % (sess.run(epoch) + 1),
        "Loss=", "{:.6f}".format(loss), "Accuracy=", acc)
  
    saver.save(sess, ckpt_dir + "CIFAR10_cnn_model.ckpt", global_step=ep+1)
    sess.run(epoch.assign(ep + 1))

duration = time() - startTime
print("Train finished takes:", duration)
Partial training output:
Train epoch: 80 Loss= 0.093959 Accuracy= 1.0
Train epoch: 81 Loss= 0.131211 Accuracy= 0.95
Train epoch: 82 Loss= 0.112236 Accuracy= 0.95
Train epoch: 83 Loss= 0.085586 Accuracy= 1.0
Train epoch: 84 Loss= 0.095652 Accuracy= 1.0
Train epoch: 85 Loss= 0.095077 Accuracy= 1.0
Train epoch: 86 Loss= 0.175582 Accuracy= 0.95
Train epoch: 87 Loss= 0.025582 Accuracy= 1.0
Train epoch: 88 Loss= 0.495855 Accuracy= 0.85
Train epoch: 89 Loss= 0.190243 Accuracy= 0.95
Train epoch: 90 Loss= 0.237407 Accuracy= 0.95
Train epoch: 91 Loss= 0.102441 Accuracy= 0.95
Train epoch: 92 Loss= 0.096885 Accuracy= 1.0
Train epoch: 93 Loss= 0.300082 Accuracy= 0.9
Train epoch: 94 Loss= 0.083318 Accuracy= 0.95
Train epoch: 95 Loss= 0.245765 Accuracy= 0.85
Train epoch: 96 Loss= 0.073099 Accuracy= 1.0
Train epoch: 97 Loss= 0.036506 Accuracy= 1.0
Train epoch: 98 Loss= 0.030477 Accuracy= 1.0
Train epoch: 99 Loss= 0.028218 Accuracy= 1.0
Train epoch: 100 Loss= 0.058660 Accuracy= 1.0
Train finished takes: 1393.7970790863037

Training Visualization

plt.plot(epoch_list, accuracy_list, label='accuracy')
plt.plot(epoch_list, loss_list, label='loss')
fig = plt.gcf()
fig.set_size_inches(4, 2)
plt.ylim(0.1, 1)
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()

(figure: output_72_0.png)

Model Evaluation

test_total_batch = int(len(Xtest_normalize) / batch_size)
test_acc_sum = 0.0
for i in range(test_total_batch):
    test_image_batch = Xtest_normalize[i*batch_size:(i+1)*batch_size]
    test_label_batch = Ytest_onehot[i*batch_size:(i+1)*batch_size]
    test_batch_acc = sess.run(accuracy, feed_dict={x:test_image_batch,y:test_label_batch})
    test_acc_sum += test_batch_acc
test_acc = float(test_acc_sum / test_total_batch)
print("Test accuracy:{:.6f}".format(test_acc))
Test accuracy:0.670200

Prediction Visualization

def plot_images_prediction_labels(images, labels, num):
    total = len(images)
    fig = plt.gcf()
    fig.set_size_inches(15, math.ceil(num / 10) * 7)
    for i in range(0, num):
        choose_n = np.random.randint(0, total)
        ax = plt.subplot(math.ceil(num / 5), 5, 1 + i)
        ax.imshow(images[choose_n], cmap='binary')
        test_pred = sess.run(pred, feed_dict={x:(Xtest.astype('float32') / 255.0)[choose_n:choose_n + 1]})
        title = 'label:' + label_dict[labels[choose_n]] + \
                ' pred:' + label_dict[np.argmax(test_pred, 1)[0]]
        ax.set_title(title, fontsize=10)
    plt.show()
plot_images_prediction_labels(Xtest, Ytest, 10)

(figure: output_77_0.png)

CIFAR-10 with the TensorFlow 2.x Low-Level API

Functions Used Later

import numpy as np
import matplotlib.pyplot as plt
import math
import tensorflow as tf

tf.__version__
'2.0.0'

numpy.random.sample(size=None)

np.random.sample(size=None) generates random numbers; size specifies the dimensions of the output, and if it is omitted a scalar is returned. Values lie in [0, 1); size may be an int or a tuple of ints

scalar = np.random.sample()
print('scalar value:', scalar)

matrix = np.random.sample((4, 4))
print('matrix value:\n', matrix)
scalar value: 0.8916937281165576
matrix value:
 [[0.21804301 0.23273453 0.1989421  0.91833431]
 [0.96301494 0.02083241 0.44543199 0.52984906]
 [0.59214349 0.20801405 0.41085002 0.01566343]
 [0.2869535  0.22004528 0.91109743 0.24408777]]

tensorflow.data.Dataset.from_tensor_slices

tf.data.Dataset.from_tensor_slices((train, label)) pairs features with their labels. For example, create a matrix with 6 rows (any number of columns); 6 labels are then needed, one per row of the matrix. After tf.data.Dataset.from_tensor_slices(), each row corresponds to one label, and since each row here holds three values, the element shape of data is ((3,), (1,))

features, labels = (np.random.sample((6, 3)), np.random.sample((6, 1)))
 
print('features:\n', features, 
      '\nlabels:\n', labels)

data = tf.data.Dataset.from_tensor_slices((features, labels))
print(data)
features:
 [[0.79680143 0.7589638  0.21623016]
 [0.02987241 0.83818346 0.02416548]
 [0.3339702  0.7937907  0.36895411]
 [0.93513608 0.64110223 0.149116  ]
 [0.02685498 0.32784008 0.26686279]
 [0.92430634 0.76099528 0.08407499]] 
labels:
 [[0.76455789]
 [0.37542639]
 [0.93846915]
 [0.05775337]
 [0.48292575]
 [0.32334656]]
<TensorSliceDataset shapes: ((3,), (1,)), types: (tf.float64, tf.float64)>

Taking CIFAR-10 as an example: train has 50,000 samples, each a 3-channel 32x32x3 image, and there are likewise 50,000 labels, so images and labels correspond one to one. Each 32x32x3 image maps to one label, and the element shape printed for data is (32, 32, 3), (1,)

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

print('x_train length:', len(x_train),
      '\nx_train shape:', x_train.shape,
      '\nx_test length:', len(x_test),
      '\nx_test shape:', x_test.shape)

data = tf.data.Dataset.from_tensor_slices((x_train, y_train))
print(data)
x_train length: 50000 
x_train shape: (50000, 32, 32, 3) 
x_test length: 10000 
x_test shape: (10000, 32, 32, 3)
<TensorSliceDataset shapes: ((32, 32, 3), (1,)), types: (tf.uint8, tf.uint8)>

shuffle(), repeat(), batch(), and prefetch() on a TensorFlow Dataset

repeat(count=None) repeats the dataset count times; if count is None or -1 it repeats indefinitely. It returns a Dataset. To keep the number of passes flexible, repeat() is usually called without an argument

batch(batch_size, drop_remainder=False) takes batch_size elements in order; the last batch may be smaller than batch_size. If the program requires every batch to have exactly the given size, set drop_remainder to True to drop the smaller final batch; the default is False

shuffle(buffer_size, seed=None, reshuffle_each_iteration=None) shuffles the data; the larger buffer_size is, the more thorough the shuffling. For a complete shuffle, buffer_size should equal the size of the dataset. If the dataset contains 10,000 elements but buffer_size is set to 1,000, shuffle initially shuffles only the first 1,000 elements and keeps refilling the buffer as elements are read
buffer_size: int, the number of elements from this dataset from which the new dataset will sample
seed: int, optional, the random seed used to create the distribution
reshuffle_each_iteration: optional bool; if True, the dataset is pseudo-randomly reshuffled each time it is iterated over (defaults to True)
Returns:
Dataset: a Dataset

prefetch(buffer_size) creates a Dataset that prefetches elements from this dataset. Note: examples.prefetch(2) prefetches 2 elements (2 samples), whereas examples.batch(20).prefetch(2) prefetches 2 elements (2 batches of 20 samples each). buffer_size is the maximum number of elements that will be buffered; it returns a Dataset

repeat() is not demonstrated with printed output below, since an infinite dataset cannot be shown in full; a minimal bounded sketch is given instead
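
A minimal sketch of repeat() (added here; take() simply bounds the infinite repeated stream so it can be printed):

small = tf.data.Dataset.from_tensor_slices(np.arange(3))
# repeat() with no argument repeats indefinitely; take(7) stops after 7 elements
for value in small.repeat().take(7):
    print(value.numpy())   # prints 0 1 2 0 1 2 0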

features, labels = (np.random.sample((10, 3)), np.random.sample((10, 1)))

data = tf.data.Dataset.from_tensor_slices((features, labels))
data_batch = data.batch(4)
for step, (batch_x, batch_y) in enumerate(data_batch.take(1), 1):
    print('features:\n', batch_x, '\nlabels:\n', batch_y)
features:
 tf.Tensor(
[[0.55558959 0.51862803 0.79967034]
 [0.63969568 0.97605433 0.94048536]
 [0.1573988  0.02895822 0.40891006]
 [0.24109004 0.35168058 0.4326879 ]], shape=(4, 3), dtype=float64) 
labels:
 tf.Tensor(
[[0.67783353]
 [0.96823252]
 [0.47094708]
 [0.09315025]], shape=(4, 1), dtype=float64)
data_shuffle = data.shuffle(2)
for step, (batch_x, batch_y) in enumerate(data_shuffle.take(4), 1):
    print('features:', batch_x, 'labels:', batch_y)

print(('******************************************************************line'
       '******************************************************************'))

for step, (batch_x, batch_y) in enumerate(data.take(4), 1):
    print('features:', batch_x, 'labels:', batch_y)
features: tf.Tensor([0.63969568 0.97605433 0.94048536], shape=(3,), dtype=float64) labels: tf.Tensor([0.96823252], shape=(1,), dtype=float64)
features: tf.Tensor([0.1573988  0.02895822 0.40891006], shape=(3,), dtype=float64) labels: tf.Tensor([0.47094708], shape=(1,), dtype=float64)
features: tf.Tensor([0.55558959 0.51862803 0.79967034], shape=(3,), dtype=float64) labels: tf.Tensor([0.67783353], shape=(1,), dtype=float64)
features: tf.Tensor([0.24109004 0.35168058 0.4326879 ], shape=(3,), dtype=float64) labels: tf.Tensor([0.09315025], shape=(1,), dtype=float64)
******************************************************************line******************************************************************
features: tf.Tensor([0.55558959 0.51862803 0.79967034], shape=(3,), dtype=float64) labels: tf.Tensor([0.67783353], shape=(1,), dtype=float64)
features: tf.Tensor([0.63969568 0.97605433 0.94048536], shape=(3,), dtype=float64) labels: tf.Tensor([0.96823252], shape=(1,), dtype=float64)
features: tf.Tensor([0.1573988  0.02895822 0.40891006], shape=(3,), dtype=float64) labels: tf.Tensor([0.47094708], shape=(1,), dtype=float64)
features: tf.Tensor([0.24109004 0.35168058 0.4326879 ], shape=(3,), dtype=float64) labels: tf.Tensor([0.09315025], shape=(1,), dtype=float64)

Prefetch two elements and take one at a time; each element printed is a single row

data_pre = data.prefetch(2)
for step, (batch_x, batch_y) in enumerate(data_pre.take(1), 1):
    print('features:', batch_x, 'labels:', batch_y)
features: tf.Tensor([0.55558959 0.51862803 0.79967034], shape=(3,), dtype=float64) labels: tf.Tensor([0.67783353], shape=(1,), dtype=float64)

Prefetch two elements and take one at a time as before, but now with batching, so each element contains 5 rows of data

data_bat_pre = data.batch(5).prefetch(2)
for step, (batch_x, batch_y) in enumerate(data_bat_pre.take(1), 1):
    print('features:\n', batch_x, '\nlabels:\n', batch_y)
features:
 tf.Tensor(
[[0.55558959 0.51862803 0.79967034]
 [0.63969568 0.97605433 0.94048536]
 [0.1573988  0.02895822 0.40891006]
 [0.24109004 0.35168058 0.4326879 ]
 [0.32295129 0.4973173  0.20489247]], shape=(5, 3), dtype=float64) 
labels:
 tf.Tensor(
[[0.67783353]
 [0.96823252]
 [0.47094708]
 [0.09315025]
 [0.17250509]], shape=(5, 1), dtype=float64)

Getting Started

import numpy as np
import matplotlib.pyplot as plt
import math
import tensorflow as tf

tf.__version__
'2.0.0'

Load the Data

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

Data Preprocessing and Parameter Settings

x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
train_num = len(x_train)
num_classes = 10

learning_rate = 0.0001
batch_size = 64
training_steps = 20000
display_step = 20

conv1_filters = 32
conv2_filters = 64
fc1_units = 256

Set Up the Input Pipeline

train_data = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_data = train_data.repeat().shuffle(5000).batch(batch_size).prefetch(1)

Define the Convolution and Pooling Operations

def conv2d(x, W, b, strides=1):
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)


def maxpool2d(x, k=2):
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding='SAME')

Define the Weights and Biases

random_normal = tf.initializers.RandomNormal()

weights = {
    'wc1': tf.Variable(random_normal([3, 3, 3, conv1_filters])),
    'wc2': tf.Variable(random_normal([3, 3, conv1_filters, conv2_filters])),
    'wd1': tf.Variable(random_normal([4096, fc1_units])),
    'out': tf.Variable(random_normal([fc1_units, num_classes]))
}

biases = {
    'bc1': tf.Variable(tf.zeros([conv1_filters])),
    'bc2': tf.Variable(tf.zeros([conv2_filters])),
    'bd1': tf.Variable(tf.zeros([fc1_units])),
    'out': tf.Variable(tf.zeros([num_classes]))
}

Define the Forward Pass

def conv_net(x):
    x = tf.reshape(x, [-1, 32, 32, 3])
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    pool1 = maxpool2d(conv1, k=2)
    conv2 = conv2d(pool1, weights['wc2'], biases['bc2'])
    pool2 = maxpool2d(conv2, k=2)
    flat = tf.reshape(pool2, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(flat, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    
    return tf.nn.softmax(out)

Define the Loss Function and Accuracy

def cross_entropy(y_pred, y_true):
    y_pred = tf.clip_by_value(y_pred, 1e-9, 1.)
    loss_ = tf.keras.losses.sparse_categorical_crossentropy(y_true=y_true, y_pred=y_pred)
    
    return tf.reduce_mean(loss_)


def accuracy(y_pred, y_true):
    correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.reshape(tf.cast(y_true, tf.int64), [-1]))

    return tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

optimizer = tf.optimizers.Adam(learning_rate)

Gradient Descent

def run_optimization(x, y):
    with tf.GradientTape() as g:
        pred = conv_net(x)
        loss = cross_entropy(pred, y)
        
    trainable_variables = list(weights.values()) + list(biases.values())

    gradients = g.gradient(loss, trainable_variables)

    optimizer.apply_gradients(zip(gradients, trainable_variables))

Start Training

train_loss_list = []
train_acc_list = []

for step, (batch_x, batch_y) in enumerate(train_data.take(training_steps), 1):
    run_optimization(batch_x, batch_y)
    
    if step % display_step == 0:
        pred = conv_net(batch_x)
        loss = cross_entropy(pred, batch_y)
        acc = accuracy(pred, batch_y)
        train_loss_list.append(loss)
        train_acc_list.append(acc)
        print("step: %i, loss: %f, accuracy: %f" % (step, loss, acc))
Partial training output:
step: 18000, loss: 0.739505, accuracy: 0.750000
step: 18020, loss: 0.656400, accuracy: 0.796875
step: 18040, loss: 0.653373, accuracy: 0.812500
step: 18060, loss: 0.625071, accuracy: 0.812500
step: 18080, loss: 0.511153, accuracy: 0.875000
step: 18100, loss: 0.576216, accuracy: 0.843750
step: 18120, loss: 0.752268, accuracy: 0.750000
step: 18140, loss: 0.892846, accuracy: 0.718750
step: 18160, loss: 0.894469, accuracy: 0.703125
step: 18180, loss: 0.690331, accuracy: 0.796875
step: 18200, loss: 0.516968, accuracy: 0.796875
step: 18220, loss: 0.672123, accuracy: 0.765625
step: 18240, loss: 0.803179, accuracy: 0.703125
step: 18260, loss: 0.687715, accuracy: 0.796875
step: 18280, loss: 0.767748, accuracy: 0.750000
step: 18300, loss: 0.895533, accuracy: 0.718750
step: 18320, loss: 0.602742, accuracy: 0.796875
step: 18340, loss: 0.659845, accuracy: 0.750000
step: 18360, loss: 0.894678, accuracy: 0.687500
step: 18380, loss: 0.702030, accuracy: 0.734375
step: 18400, loss: 0.739138, accuracy: 0.687500
step: 18420, loss: 0.694246, accuracy: 0.796875
step: 18440, loss: 0.749393, accuracy: 0.687500
step: 18460, loss: 0.612656, accuracy: 0.796875
step: 18480, loss: 0.441922, accuracy: 0.796875
step: 18500, loss: 0.870660, accuracy: 0.718750
step: 18520, loss: 0.705603, accuracy: 0.734375
step: 18540, loss: 0.656777, accuracy: 0.765625
step: 18560, loss: 0.918324, accuracy: 0.718750
step: 18580, loss: 0.774196, accuracy: 0.703125
step: 18600, loss: 0.730768, accuracy: 0.781250
step: 18620, loss: 0.850247, accuracy: 0.703125
step: 18640, loss: 0.495881, accuracy: 0.843750
step: 18660, loss: 0.789647, accuracy: 0.625000
step: 18680, loss: 0.424281, accuracy: 0.890625
step: 18700, loss: 0.649424, accuracy: 0.765625
step: 18720, loss: 0.444974, accuracy: 0.843750
step: 18740, loss: 0.656202, accuracy: 0.765625
step: 18760, loss: 0.439435, accuracy: 0.859375
step: 18780, loss: 0.645392, accuracy: 0.718750
step: 18800, loss: 0.667705, accuracy: 0.734375
step: 18820, loss: 0.743165, accuracy: 0.765625
step: 18840, loss: 0.665061, accuracy: 0.781250
step: 18860, loss: 0.737702, accuracy: 0.734375
step: 18880, loss: 0.756636, accuracy: 0.765625
step: 18900, loss: 0.677796, accuracy: 0.812500
step: 18920, loss: 0.832764, accuracy: 0.734375
step: 18940, loss: 0.580744, accuracy: 0.843750
step: 18960, loss: 0.746122, accuracy: 0.750000
step: 18980, loss: 0.660987, accuracy: 0.703125
step: 19000, loss: 0.505439, accuracy: 0.812500
step: 19020, loss: 0.664603, accuracy: 0.796875
step: 19040, loss: 0.698589, accuracy: 0.750000
step: 19060, loss: 0.730500, accuracy: 0.750000
step: 19080, loss: 0.601688, accuracy: 0.781250
step: 19100, loss: 0.769905, accuracy: 0.718750
step: 19120, loss: 0.667629, accuracy: 0.796875
step: 19140, loss: 0.675744, accuracy: 0.781250
step: 19160, loss: 0.551231, accuracy: 0.781250
step: 19180, loss: 0.568437, accuracy: 0.828125
step: 19200, loss: 0.862550, accuracy: 0.703125
step: 19220, loss: 0.649084, accuracy: 0.828125
step: 19240, loss: 0.692427, accuracy: 0.734375
step: 19260, loss: 0.631112, accuracy: 0.734375
step: 19280, loss: 0.772355, accuracy: 0.765625
step: 19300, loss: 0.628002, accuracy: 0.781250
step: 19320, loss: 0.615280, accuracy: 0.781250
step: 19340, loss: 0.736602, accuracy: 0.734375
step: 19360, loss: 0.767952, accuracy: 0.765625
step: 19380, loss: 0.890533, accuracy: 0.703125
step: 19400, loss: 0.738237, accuracy: 0.750000
step: 19420, loss: 0.619683, accuracy: 0.796875
step: 19440, loss: 0.664785, accuracy: 0.828125
step: 19460, loss: 0.998178, accuracy: 0.656250
step: 19480, loss: 0.640450, accuracy: 0.828125
step: 19500, loss: 0.678016, accuracy: 0.765625
step: 19520, loss: 0.738600, accuracy: 0.734375
step: 19540, loss: 0.567391, accuracy: 0.859375
step: 19560, loss: 0.560070, accuracy: 0.812500
step: 19580, loss: 0.526925, accuracy: 0.812500
step: 19600, loss: 0.755863, accuracy: 0.718750
step: 19620, loss: 0.760550, accuracy: 0.687500
step: 19640, loss: 0.645194, accuracy: 0.765625
step: 19660, loss: 0.779244, accuracy: 0.765625
step: 19680, loss: 0.692456, accuracy: 0.765625
step: 19700, loss: 0.977755, accuracy: 0.671875
step: 19720, loss: 0.566006, accuracy: 0.828125
step: 19740, loss: 0.673255, accuracy: 0.781250
step: 19760, loss: 0.654281, accuracy: 0.750000
step: 19780, loss: 0.708996, accuracy: 0.750000
step: 19800, loss: 0.462604, accuracy: 0.843750
step: 19820, loss: 0.700370, accuracy: 0.750000
step: 19840, loss: 0.731024, accuracy: 0.734375
step: 19860, loss: 0.733060, accuracy: 0.734375
step: 19880, loss: 0.899182, accuracy: 0.718750
step: 19900, loss: 0.974528, accuracy: 0.687500
step: 19920, loss: 0.830438, accuracy: 0.812500
step: 19940, loss: 0.599154, accuracy: 0.875000
step: 19960, loss: 0.740960, accuracy: 0.796875
step: 19980, loss: 0.811868, accuracy: 0.703125
step: 20000, loss: 0.721007, accuracy: 0.796875
plt.title('Train and Validation Picture')
plt.xlabel('Times')
plt.ylabel('Loss value')
plt.plot(train_loss_list, color=(1, 0, 0), label='Loss train')
plt.plot(train_acc_list, color=(0, 0, 1), label='Accuracy train')
plt.legend(loc='best')
plt.show()

(figure: output_36_0.png)

CIFAR-10 with the TensorFlow 2.x High-Level Keras API

Import the Required Packages

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import math
import datetime
import os

Load the Data

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('training data shape:', x_train.shape)
print('training labels shape:', y_train.shape)
print('test data shape:', x_test.shape)
print('test labels shape:', y_test.shape)
training data shape: (50000, 32, 32, 3)
training labels shape: (50000, 1)
test data shape: (10000, 32, 32, 3)
test labels shape: (10000, 1)

Build the Model

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(filters=32,
                                 kernel_size=(3, 3),
                                 input_shape=(32, 32, 3),
                                 activation='relu',
                                 padding='same'))
model.add(tf.keras.layers.Dropout(rate=0.3))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Conv2D(filters=64,
                                 kernel_size=(3, 3),
                                 input_shape=(32, 32, 3),
                                 activation='relu',
                                 padding='same'))
model.add(tf.keras.layers.Dropout(rate=0.3))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(10, activation='softmax'))
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 32, 32, 32)        896       
_________________________________________________________________
dropout (Dropout)            (None, 32, 32, 32)        0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 16, 16, 64)        18496     
_________________________________________________________________
dropout_1 (Dropout)          (None, 16, 16, 64)        0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 8, 8, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 4096)              0         
_________________________________________________________________
dense (Dense)                (None, 10)                40970     
=================================================================
Total params: 60,362
Trainable params: 60,362
Non-trainable params: 0
_________________________________________________________________
epochs = 150
batch_size = 100
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
checkpoint_dir = './checkpoint2.x/'

model_filename = tf.train.latest_checkpoint(checkpoint_dir)

if model_filename != None:
    model.load_weights(model_filename)
    print('Load weights successful'.format(model_filename))
else:
    print('No weights saved, train from scratch!')
Load weights successful
log_dir = os.path.join(
    'logs2.x',
    'train',
    'plugins',
    'profile',
    datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))

checkpoint_path = './checkpoint2.x/Cifar10.{epoch:02d}.ckpt'

if not os.path.exists('./checkpoint2.x'):
    os.mkdir('./checkpoint2.x')


callbacks = [
    tf.keras.callbacks.TensorBoard(log_dir=log_dir,
                         histogram_freq=2),
    tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                       save_weights_only=True,
                                       verbose=0,
                                       save_freq='epoch'),
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
]

Start Training

train_history = model.fit(x_train, y_train,
                          validation_split=0.2,
                          epochs=epochs,
                          batch_size=batch_size,
                          callbacks=callbacks,
                          verbose=1)
Train on 40000 samples, validate on 10000 samples
Epoch 1/150
  100/40000 [..............................] - ETA: 13:24 - loss: 1.1153 - accuracy: 0.6500WARNING:tensorflow:Method (on_train_batch_end) is slow compared to the batch update (0.123614). Check your callbacks.
40000/40000 [==============================] - 11s 268us/sample - loss: 1.0233 - accuracy: 0.6450 - val_loss: 1.1700 - val_accuracy: 0.6077
Epoch 2/150
40000/40000 [==============================] - 8s 207us/sample - loss: 1.0237 - accuracy: 0.6435 - val_loss: 1.1060 - val_accuracy: 0.6231
Epoch 3/150
40000/40000 [==============================] - 8s 211us/sample - loss: 1.0076 - accuracy: 0.6500 - val_loss: 1.1132 - val_accuracy: 0.6252
Epoch 4/150
40000/40000 [==============================] - 8s 207us/sample - loss: 1.0128 - accuracy: 0.6500 - val_loss: 1.2060 - val_accuracy: 0.5831
Epoch 5/150
40000/40000 [==============================] - 8s 211us/sample - loss: 0.9775 - accuracy: 0.6612 - val_loss: 1.1151 - val_accuracy: 0.6189
Epoch 6/150
40000/40000 [==============================] - 8s 212us/sample - loss: 0.9689 - accuracy: 0.6634 - val_loss: 1.1025 - val_accuracy: 0.6330
Epoch 7/150
40000/40000 [==============================] - 8s 209us/sample - loss: 0.9699 - accuracy: 0.6620 - val_loss: 1.1830 - val_accuracy: 0.5909
Epoch 8/150
40000/40000 [==============================] - 8s 207us/sample - loss: 0.9497 - accuracy: 0.6681 - val_loss: 1.1083 - val_accuracy: 0.6277
Epoch 9/150
40000/40000 [==============================] - 8s 209us/sample - loss: 0.9633 - accuracy: 0.6704 - val_loss: 1.1267 - val_accuracy: 0.6142
Epoch 10/150
40000/40000 [==============================] - 8s 206us/sample - loss: 0.9462 - accuracy: 0.6721 - val_loss: 1.1273 - val_accuracy: 0.6175
Epoch 11/150
40000/40000 [==============================] - 8s 211us/sample - loss: 0.9384 - accuracy: 0.6759 - val_loss: 1.1619 - val_accuracy: 0.5958

Training Visualization

fig = plt.gcf()
fig.set_size_inches(10, 5)
ax1 = fig.add_subplot(111)
ax1.set_title('Train and Validation Picture')
ax1.set_ylabel('Loss value')
line1, = ax1.plot(train_history.history['loss'], color=(0.5, 0.5, 1.0), label='Loss train')
line2, = ax1.plot(train_history.history['val_loss'], color=(0.5, 1.0, 0.5), label='Loss valid')
ax2 = ax1.twinx()
ax2.set_ylabel('Accuracy value')
line3, = ax2.plot(train_history.history['accuracy'], color=(0.5, 0.5, 0.5), label='Accuracy train')
line4, = ax2.plot(train_history.history['val_accuracy'], color=(1, 0, 0), label='Accuracy valid')
plt.legend(handles=(line1, line2, line3, line4), loc='best')
plt.show()

(figure: output_18_0.png)

test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print('test_loss:', test_loss,
      '\ntest_acc:', test_acc,
      '\nmetrics_names:', model.metrics_names)
10000/1 - 1s - loss: 1.0161 - accuracy: 0.5990
test_loss: 1.1660289595603943 
test_acc: 0.599 
metrics_names: ['loss', 'accuracy']

Prediction Visualization

label_dict = {0:"airplane", 1:"automobile", 2:"bird", 3:"cat", 4:"deer",
              5:"dog", 6:"frog", 7:"horse", 8:"ship", 9:"truck"}


def plot_images_prediction_labels(images, labels, num):
    total = len(images)
    fig = plt.gcf()
    fig.set_size_inches(15, math.ceil(num / 10) * 7)
    for i in range(0, num):
        choose_n = np.random.randint(0, total)
        ax = plt.subplot(math.ceil(num / 5), 5, 1 + i)
        ax.imshow(images[choose_n], cmap='binary')
        test_pred = model.predict_classes(images.astype('float32')[choose_n:choose_n + 1])
        title = 'label:' + label_dict[labels[choose_n][0]] + ' pred:' + label_dict[test_pred[0]]
        ax.set_title(title, fontsize=10)
    plt.show()
plot_images_prediction_labels(x_test, y_test, 10)

(figure: output_21_0.png)

Notes

1. Because the TensorFlow 2.x high-level API is easy to pick up, I have not explained it at length. The Keras code sets up checkpoint-based resumable training and early stopping; the relevant snippet is repeated below

checkpoint_dir = './checkpoint2.x/'

model_filename = tf.train.latest_checkpoint(checkpoint_dir)

if model_filename != None:
    model.load_weights(model_filename)
    print('Load weights successful'.format(model_filename))
else:
    print('No weights saved, train from scratch!')

log_dir = os.path.join(
    'logs2.x',
    'train',
    'plugins',
    'profile',
    datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S'))

checkpoint_path = './checkpoint2.x/Cifar10.{epoch:02d}.ckpt'

if not os.path.exists('./checkpoint2.x'):
    os.mkdir('./checkpoint2.x')


callbacks = [
    tf.keras.callbacks.TensorBoard(log_dir=log_dir,
                         histogram_freq=2),
    tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                       save_weights_only=True,
                                       verbose=0,
                                       save_freq='epoch'),
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
]

2. tf.train.latest_checkpoint() only works with files saved as .ckpt
3. For the generated log files and how to view TensorBoard inside Jupyter, see Tensorflow 笔记 Ⅰ——TensorFlow 编程基础; a minimal notebook sketch follows below
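
For reference, a minimal sketch (an assumption about the notebook setup, using the standard TensorBoard notebook magics and the log directory created above):

# inside a Jupyter notebook cell
%load_ext tensorboard
%tensorboard --logdir logs2.x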

Summary

The related files and code have been placed in the download area; once the whole TensorFlow notes series is finished I will upload all the files to my GitHub. Feel free to take whatever you like from the download area


Reposted from blog.csdn.net/qq_39567427/article/details/105850059