Keras Beginner Tutorial 00: mnist_mlp

Learning goals

  1. Use plot_model to visualize the model structure
  2. Use TensorBoard to monitor model training
  3. Learn the basic workflow of training a Sequential model
  4. Learn how to look things up in the documentation

Prerequisites

  1. Basic Python programming
  2. TensorFlow basics (helpful for understanding, since Keras still runs on top of TensorFlow)
  3. Keras (a little; skimming the Keras documentation once is enough)
  4. Deep learning fundamentals, at least the core concepts such as gradient descent and regularization; Andrew Ng's deep learning videos are recommended
  5. numpy, matplotlib, pandas, etc.

Importing the modules

The first thing to do in Keras is import the modules:

import keras
from keras.datasets import mnist
from keras.models import Sequential # the Sequential model; Keras has two model types in total
from keras.layers import Dense, Dropout
from keras.callbacks import TensorBoard # using TensorBoard requires a callback
from keras.utils import plot_model
output: Using TensorFlow backend.

Setting the hyperparameters

The first step is to set the model's hyperparameters. The simplest ones are the batch size, the number of classes, and the number of epochs (an epoch is one complete pass over the whole dataset):

batch_size = 128 # number of samples per mini-batch for mini-batch gradient descent
num_classes = 10
epochs = 20
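As a quick sanity check on these numbers: with MNIST's 60000 training images, a batch size of 128 determines how many weight updates the optimizer performs per epoch (a small illustrative calculation, not part of the original tutorial):

```python
import math

# 60000 training images split into mini-batches of 128 samples each
steps_per_epoch = math.ceil(60000 / 128)
print(steps_per_epoch)  # 469 weight updates per epoch
```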

Preparing the dataset

Loading the data

The dataset needs preprocessing; for grayscale images the most common step is normalization, i.e. dividing by 255. First, load the dataset:

(x_train, y_train), (x_test, y_test) = mnist.load_data()

Inspecting the data

Let's take a look at the data. Right after loading a dataset we often have no idea what it looks like, and without knowing that we cannot process it.

print("x_train shape:",x_train.shape)
print("x_train dtype:",x_train.dtype) # use dtype to check the numeric type of a numpy array
print("y_train shape:",y_train.shape)
print("y_train dtype:",y_train.dtype)
x_train shape: (60000, 28, 28)
x_train dtype: uint8
y_train shape: (60000,)
y_train dtype: uint8

Converting the dtype, flattening, and normalizing

Since X has dtype uint8 with values in the range 0-255, normalizing it directly would produce "strange" results, so it must be cast to float first to preserve precision. Also, the input to a fully connected layer must be flattened:

x_train = x_train.reshape(x_train.shape[0], -1).astype('float32')
x_test = x_test.reshape(x_test.shape[0], -1).astype('float32') # the test set needs the same treatment

Let's look again:

print("x_train shape:",x_train.shape)
print("x_train dtype:",x_train.dtype) # use dtype to check the numeric type of a numpy array
print("y_train shape:",y_train.shape)
print("y_train dtype:",y_train.dtype)
x_train shape: (60000, 784)
x_train dtype: float32
y_train shape: (60000,)
y_train dtype: uint8

Normalization

x_train /= 255
x_test /= 255
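The flatten, cast, and scale recipe above can be seen end-to-end on a toy array (the shapes here are illustrative stand-ins for the real 60000 x 28 x 28 data):

```python
import numpy as np

# Two fake 3x3 "images" with uint8 pixels, standing in for 28x28 MNIST digits
batch = np.array([[[0, 128, 255]] * 3] * 2, dtype='uint8')

# Flatten each image to a vector, cast to float32, then scale into [0, 1]
flat = batch.reshape(batch.shape[0], -1).astype('float32') / 255

print(flat.shape)               # (2, 9)
print(flat.min(), flat.max())   # 0.0 1.0
```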

One-hot encoding the labels

The Y labels still need processing: computing the cross-entropy loss requires one-hot encoded labels. Let's look at Y first:

print("y_train:",y_train[0:10])
print("y_test:",y_test[0:10])
y_train: [5 0 4 1 9 2 1 3 1 4]
y_test: [7 2 1 0 4 1 4 9 5 9]

The labels are the ten digits 0-9, which Keras can encode very conveniently:

y_train = keras.utils.to_categorical(y=y_train, num_classes=num_classes)
y_test = keras.utils.to_categorical(y=y_test, num_classes=num_classes)
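to_categorical is easy to reproduce by hand, which makes it clear what it does; here is a minimal NumPy sketch (the function name to_onehot is our own, not a Keras API):

```python
import numpy as np

def to_onehot(labels, num_classes):
    # One row per label; put a 1.0 in the column given by the label
    out = np.zeros((len(labels), num_classes), dtype='float32')
    out[np.arange(len(labels)), labels] = 1.0
    return out

y = to_onehot([5, 0, 4], num_classes=10)
print(y[0])  # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
```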

Let's look at the Y labels again:

print("y_train:",y_train[0:10])
print("y_test:",y_test[0:10])
y_train: [[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]]
y_test: [[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]

The Y labels are now one-hot encoded, and with that the simplest image preprocessing is done.

Building the model

Here we use the Sequential model, which is the simplest kind. For API usage, consult the Keras documentation: simple layers can just be add-ed, while mastering the more complex APIs takes reading good code and projects over time, so be sure to train your ability to learn on your own:

# Instantiate a Sequential model
model = Sequential()
# The first layer must be given input_shape; subsequent layers infer their input shape automatically
# input_shape is not fixed; practice looking it up in the official docs
# There are not many cases anyway: usually input_shape for a Dense layer or for a convolutional layer
model.add(Dense(units=512, activation='relu', input_shape=(x_train.shape[1],)))
model.add(Dropout(rate=0.2))
model.add(Dense(units=512, activation='relu'))
model.add(Dropout(rate=0.2))

# The last layer is the output layer: its size must equal the number of classes,
# and softmax limits each output to the range 0-1
model.add(Dense(units=num_classes, activation='softmax'))

Compiling the model

Print the model summary

model.summary() # print a summary of the model
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 512)               401920    
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656    
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                5130      
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
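The parameter counts in the summary can be checked by hand: a Dense layer has inputs × units weights plus units biases:

```python
# Params of a Dense layer = inputs * units (weights) + units (biases)
dense_1 = 784 * 512 + 512  # 401920
dense_2 = 512 * 512 + 512  # 262656
dense_3 = 512 * 10 + 10    # 5130
total = dense_1 + dense_2 + dense_3
print(total)  # 669706, matching "Total params" above
```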
# Compile the model
model.compile(loss='categorical_crossentropy',
              optimizer='RMSprop',
              metrics=['accuracy']) # see the official docs for how each API is used
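The categorical_crossentropy loss chosen here is simply -sum(y_true * log(y_pred)) per sample, averaged over the batch; a small NumPy sketch of the per-sample case (the values are made up for illustration):

```python
import numpy as np

y_true = np.array([0.0, 0.0, 1.0])   # one-hot label: class 2
y_pred = np.array([0.1, 0.1, 0.8])   # softmax output of the network
loss = -np.sum(y_true * np.log(y_pred))
print(round(loss, 4))  # 0.2231: only the true class's predicted probability matters
```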

Once the model is compiled, training can begin; this is quite similar to sklearn.

Training the model

TensorBoard

This step needs a callback function, which will be covered later, as will the details of TensorBoard. The basic usage looks like this:

# create the TensorBoard callback
tbck = TensorBoard(log_dir='../logs/00mnist_mpl')
# pass the callback to fit when training
history = model.fit(x=x_train, y=y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test),
                    callbacks=[tbck]) # callbacks takes a list, so multiple callbacks can be passed
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
60000/60000 [==============================] - 5s 80us/step - loss: 0.2438 - acc: 0.9247 - val_loss: 0.1146 - val_acc: 0.9643
Epoch 2/20
60000/60000 [==============================] - 4s 66us/step - loss: 0.1045 - acc: 0.9681 - val_loss: 0.0867 - val_acc: 0.9751
Epoch 3/20
60000/60000 [==============================] - 4s 67us/step - loss: 0.0763 - acc: 0.9771 - val_loss: 0.0727 - val_acc: 0.9792
Epoch 4/20
60000/60000 [==============================] - 4s 67us/step - loss: 0.0604 - acc: 0.9826 - val_loss: 0.0852 - val_acc: 0.9766
Epoch 5/20
60000/60000 [==============================] - 4s 67us/step - loss: 0.0508 - acc: 0.9846 - val_loss: 0.0774 - val_acc: 0.9813
Epoch 6/20
60000/60000 [==============================] - 4s 69us/step - loss: 0.0434 - acc: 0.9872 - val_loss: 0.0820 - val_acc: 0.9794
Epoch 7/20
60000/60000 [==============================] - 4s 66us/step - loss: 0.0382 - acc: 0.9887 - val_loss: 0.0856 - val_acc: 0.9793
Epoch 8/20
60000/60000 [==============================] - 4s 67us/step - loss: 0.0339 - acc: 0.9896 - val_loss: 0.0831 - val_acc: 0.9807
Epoch 9/20
60000/60000 [==============================] - ETA: 0s - loss: 0.0316 - acc: 0.990 - 4s 67us/step - loss: 0.0316 - acc: 0.9907 - val_loss: 0.0782 - val_acc: 0.9816
Epoch 10/20
60000/60000 [==============================] - 4s 68us/step - loss: 0.0281 - acc: 0.9921 - val_loss: 0.0891 - val_acc: 0.9809
Epoch 11/20
60000/60000 [==============================] - 4s 67us/step - loss: 0.0287 - acc: 0.9921 - val_loss: 0.0854 - val_acc: 0.9828
Epoch 12/20
60000/60000 [==============================] - 4s 68us/step - loss: 0.0267 - acc: 0.9926 - val_loss: 0.0873 - val_acc: 0.9837
Epoch 13/20
60000/60000 [==============================] - 4s 70us/step - loss: 0.0228 - acc: 0.9938 - val_loss: 0.0917 - val_acc: 0.9834
Epoch 14/20
60000/60000 [==============================] - 4s 68us/step - loss: 0.0216 - acc: 0.9941 - val_loss: 0.1013 - val_acc: 0.9829
Epoch 15/20
60000/60000 [==============================] - 5s 85us/step - loss: 0.0217 - acc: 0.9939 - val_loss: 0.1001 - val_acc: 0.9840
Epoch 16/20
60000/60000 [==============================] - 4s 71us/step - loss: 0.0186 - acc: 0.9946 - val_loss: 0.1039 - val_acc: 0.9835
Epoch 17/20
60000/60000 [==============================] - 6s 100us/step - loss: 0.0193 - acc: 0.9949 - val_loss: 0.1110 - val_acc: 0.9838
Epoch 18/20
60000/60000 [==============================] - 5s 81us/step - loss: 0.0189 - acc: 0.9948 - val_loss: 0.1154 - val_acc: 0.9830
Epoch 19/20
60000/60000 [==============================] - 4s 70us/step - loss: 0.0203 - acc: 0.9950 - val_loss: 0.1078 - val_acc: 0.9844
Epoch 20/20
60000/60000 [==============================] - 4s 72us/step - loss: 0.0187 - acc: 0.9955 - val_loss: 0.1102 - val_acc: 0.9832

Afterwards, run tensorboard --logdir path in a terminal to view the loss curves; PyCharm is recommended, since the command can be run directly in its built-in terminal.

Evaluating the model

We have actually already used this data as the validation set during training, so this is just a demonstration:

score = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
10000/10000 [==============================] - 1s 74us/step
Test loss: 0.11015431576028721
Test accuracy: 0.9832
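evaluate only returns aggregate scores; to inspect individual predictions, one normally calls model.predict and takes the argmax over each softmax row. The decoding step itself looks like this (the probability rows below are made up; in the tutorial they would come from model.predict(x_test)):

```python
import numpy as np

# Each row is a softmax output over the 10 digit classes (made-up values)
probs = np.array([[0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.60, 0.05, 0.00],
                  [0.02, 0.02, 0.80, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.04]])
pred_labels = probs.argmax(axis=1)
print(pred_labels)  # [7 2], matching the first two test labels shown earlier
```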

Viewing the model with plot_model()

Because we used TensorBoard without dropping into TensorFlow and wrapping ops in with tf.name_scope(), the resulting graph is messy. Keras's built-in plot_model helps here; just add a plot_model call at the end:

    plot_model(model=model, to_file='wdcnn.png', show_shapes=True)

How to install these tools will be covered in the next update.


Reposted from blog.csdn.net/weixin_40920290/article/details/80594573