Foreword
This post records how to use a convolutional neural network in both tf1.x and tf2.x to solve the multi-class recognition task on the CIFAR-10 dataset, and how to resume training from a checkpoint after an interruption.
One, Convolutional neural network (CNN)
1. Fully connected network: many parameters, slow training, prone to overfitting
2. Convolutional neural network: each layer is composed of multiple two-dimensional planes, and each plane is composed of multiple independent neurons.
Input layer → (convolutional layer+ → pooling layer?)+ → fully connected layer+
Here + means one or more repetitions and ? means optional.
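To make the "many parameters" point concrete, here is a small sketch that counts the parameters of one fully connected layer and one convolutional layer applied to a 32x32x3 image. The 128-unit dense layer is an illustrative assumption; the convolution uses the 3x3, 3-to-32-channel shape that appears later in this post.

```python
# Hypothetical comparison of parameter counts on a 32x32x3 input.

def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus biases of a 2D convolution (weights are shared across positions)."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels

n_inputs = 32 * 32 * 3               # one flattened CIFAR-10 image
fc = dense_params(n_inputs, 128)     # 128 units is an illustrative choice
conv = conv_params(3, 3, 3, 32)      # 3x3 kernels, 3 -> 32 channels

print(fc)    # 393344
print(conv)  # 896
```

The convolutional layer is so much smaller because its kernel weights do not depend on the image size.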
(1) Convolutional layer: enhances features and reduces noise
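# Convolution
y = tf.nn.conv2d(x, w, strides, padding) + b
'''
x: 4-D input tensor [batch, height, width, channel]
w: kernel [height, width, input_channel, output_channel]
b: bias, one value per output_channel
strides: stride along each dimension
'''
# Convolutional layer
tf.keras.layers.Conv2D(filters,                # number of convolution kernels
                       kernel_size,            # size of each kernel
                       activation=activation,  # activation function
                       padding=padding)        # padding mode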
$\text{image} \underset{\text{weight matrix}}{\overset{\text{convolution kernel}}{\longrightarrow}} \text{feature map}$
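As a rough illustration of how a kernel turns an image into a feature map, the sketch below slides one kernel over a single-channel image with NumPy. It is a simplified stand-in, not the implementation behind tf.nn.conv2d: there is no batch dimension, no channels, and no padding, and the function name conv2d_single is made up for this example.

```python
import numpy as np

def conv2d_single(image, kernel, stride=1):
    """Slide `kernel` over a 2D `image` (no padding) and return the feature map."""
    img_h, img_w = image.shape
    k_h, k_w = kernel.shape
    out_h = (img_h - k_h) // stride + 1
    out_w = (img_w - k_w) // stride + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = image[top:top + k_h, left:left + k_w]
            # The same kernel weights are reused at every position (weight sharing)
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.arange(16, dtype=np.float64).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])
feature_map = conv2d_single(image, kernel)
print(feature_map.shape)  # (3, 3)
```

Each output value depends only on a small patch of the input, which is the local connection property.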
Local connections and weight sharing
Stride: the number of cells the convolution kernel moves at each step; a larger stride gives a smaller output (down-sampling)
$[N1, N1] \underset{S}{\overset{[N2, N2]}{\longrightarrow}} [(N1 - N2)/S + 1,\ (N1 - N2)/S + 1]$
where N1 is the input size, N2 is the kernel size, and S is the stride.
Zero padding: pad the edges with zeros so that the output has the same size as the input (at stride 1)
Multi-channel convolution: use multiple convolution kernels to extract different features
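The effect of stride and zero padding on the output size can be checked with a few lines of arithmetic. The helper names below are made up for this sketch; the 'SAME' rule (output size is the input size divided by the stride, rounded up) is the convention TensorFlow uses for padding='SAME'.

```python
import math

def valid_output_size(n1, n2, s):
    """Output size without padding: (N1 - N2) // S + 1."""
    return (n1 - n2) // s + 1

def same_output_size(n1, s):
    """Output size with 'SAME' zero padding: ceil(N1 / S)."""
    return math.ceil(n1 / s)

print(valid_output_size(32, 3, 1))  # 30
print(valid_output_size(32, 3, 2))  # 15
print(same_output_size(32, 1))      # 32
print(same_output_size(32, 2))      # 16
```

With a 32x32 input and stride 1, a 3x3 kernel without padding shrinks the image to 30x30, while 'SAME' padding keeps it at 32x32.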
(2) Downsampling (pooling) layer: reduces the number of parameters and mitigates overfitting
# Pooling
y1 = tf.nn.max_pool(y, ksize, strides, padding)
'''
y: 4-D input tensor [batch, height, width, channel]
ksize: pooling window [1, height, width, 1]
'''
# Pooling layer
tf.keras.layers.MaxPooling2D(pool_size)
Pooling: compute the average value (keeps background features) or the maximum value (keeps texture features) of each region to merge features
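A minimal NumPy sketch of non-overlapping pooling on a single 2D feature map may help here. It assumes the window size equals the stride (as with a 2x2 window and stride 2) and simply drops rows or columns that do not fill a whole window; pool2d is a name invented for this example.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Pool a 2D array over non-overlapping size x size windows."""
    h, w = x.shape
    # Drop trailing rows/columns that do not fill a whole window
    x = x[:h - h % size, :w - w % size]
    blocks = x.reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 10, 13, 14],
              [11, 12, 15, 16]], dtype=np.float64)

max_pooled = pool2d(x, 2, "max")    # [[ 4.  8.] [12. 16.]]
avg_pooled = pool2d(x, 2, "mean")   # [[ 2.5  6.5] [10.5 14.5]]
```

Either way, a 4x4 map becomes 2x2, so the layers that follow see a quarter as many values.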
Two, TensorFlow 1.x
1. Load the dataset
The CIFAR-10 dataset is loaded through the Keras API of tensorflow2.x. It contains 32x32 RGB color images in 10 classes, with 50000 images in the training set and 10000 in the test set.
x_train: (50000, 32, 32, 3)
y_train: (50000, 1)
x_test: (10000, 32, 32, 3)
y_test: (10000, 1)
import tensorflow as tf2
import matplotlib.pyplot as plt
import numpy as np
cifar10 = tf2.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Display 16 images with their labels and predictions
def show(images, labels, preds):
    label_dict = {
        0:'plane', 1:'car', 2:'bird', 3:'cat',
        4:'deer', 5:'dog', 6:'frog', 7:'horse',
        8:'ship', 9:'truck'}
    fig1 = plt.figure(1, figsize=(12, 12))
    for i in range(16):
        ax = fig1.add_subplot(4, 4, i+1)
        ax.imshow(images[i], cmap='binary')
        # labels are one-hot vectors and preds are score vectors, so take the argmax
        label = label_dict[np.argmax(labels[i])]
        pred = label_dict[np.argmax(preds[i])]
        title = 'label:%s,pred:%s' % (label, pred)
        ax.set_title(title)
        ax.set_xticks([])
        ax.set_yticks([])
2. Data processing
import tensorflow.compat.v1 as tf
from sklearn.preprocessing import OneHotEncoder
from sklearn.utils import shuffle
from time import time
import os
tf.disable_eager_execution()
# Normalize pixel values to [0, 1] and one-hot encode the labels
x_train = (x_train/255.0).astype(np.float32)
x_test = (x_test/255.0).astype(np.float32)
# Fit the encoder on the training labels and reuse it for the test labels
encoder = OneHotEncoder()
y_train = encoder.fit_transform(y_train).toarray()
y_test = encoder.transform(y_test).toarray()
# Of the 50000 training samples, hold out the last 5000 as a validation set; the test set has 10000 samples
x_valid, y_valid = x_train[45000:], y_train[45000:]
x_train, y_train = x_train[:45000], y_train[:45000]
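For readers who want to see what the one-hot step produces, the following NumPy sketch builds the same kind of dense array by hand on four made-up labels. It assumes the class indices run from 0 to num_classes - 1, so the column order matches the sorted categories that OneHotEncoder would use; one_hot is a helper written for this example.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn an (n, 1) or (n,) integer label array into an (n, num_classes) array."""
    labels = np.asarray(labels).reshape(-1)
    return np.eye(num_classes, dtype=np.float32)[labels]

y = np.array([[3], [0], [9], [3]])   # same (n, 1) layout as the CIFAR-10 labels
y_onehot = one_hot(y, 10)

# Hold out the tail as a validation set (here the last of the four samples)
y_train_part, y_valid_part = y_onehot[:3], y_onehot[3:]

print(y_onehot.shape)                # (4, 10)
print(np.argmax(y_onehot, axis=1))   # [3 0 9 3]
```

Taking the argmax of each row recovers the original integer label, which is what the show function relies on.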
3. Define the model
# Input layer
# 3 input channels, image size 32*32
with tf.name_scope('Input'):
    x = tf.placeholder(tf.float32, [None, 32, 32, 3], name='X')
# Convolutional layer 1
# 32 output channels, 3*3 kernel, stride 1, image size 32*32
with tf.name_scope('Conv_1'):
    w1 = tf.Variable(
        tf.truncated_normal((3,3,3,32), stddev=0.1), name='W1')
    b1 = tf.Variable(tf.zeros(32), name='B1')
    y1 = tf.nn.conv2d(x, w1, strides=[1,1,1,1], padding='SAME') + b1
    y1 = tf.nn.relu(y1)
# Pooling layer 1
# 32 output channels, 2*2 max pooling, stride 2, image size 16*16
with tf.name_scope('Pool_1'):
    y2 = tf.nn.max_pool(y1, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')
# Convolutional layer 2
# 64 output channels, 3*3 kernel, stride 1, image size 16*16
with tf.name_scope('Conv_2'):
    w3 = tf.Variable(
        tf.truncated_normal((3,3,32,64), stddev=0.1), name='W3')
    b3 = tf.Variable(tf.zeros(64), name='B3')
    y3 = tf.nn.conv2d(y2, w3, strides=[1,1,1,1], padding='SAME') + b3
    y3 = tf.nn.relu(y3)
# Pooling layer 2
# 64 output channels, 2*2 max pooling, stride 2, image size 8*8
with tf.name_scope('Pool_2'):
    y4 = tf.nn.max_pool(y3, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')
# Fully connected layer
# Input 8*8*64, 128 neurons
with tf.name_scope('FC'):
    w5 = tf.Variable(
        tf.truncated_normal((8*8*64, 128), stddev=0.1), name='W5')
    b5 = tf.Variable(tf.zeros((128)), name='B5')
    y5 = tf.reshape(y4, (-1, 8*8*64))
    y5 = tf.matmul(y5, w5) + b5
    y5 = tf.nn.relu(y5)
    # Note: keep_prob is fixed here, so dropout also stays active during evaluation
    y5 = tf.nn.dropout(y5, keep_prob=0.8)
# Output layer
# 10 output neurons
with tf.name_scope('Output'):
    w6 = tf.Variable(
        tf.truncated_normal((128, 10), stddev=0.1), name='W6')
    b6 = tf.Variable(tf.zeros((10)), name='B6')
    # pred holds the raw logits; softmax is applied inside the loss function
    pred = tf.matmul(y5, w6) + b6
# Optimizer
with tf.name_scope('Optimizer'):
    y = tf.placeholder(tf.float32, [None, 10], name='Y')
    loss_function = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(
            logits=pred, labels=y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001)\
        .minimize(loss_function)
# Accuracy
equal = tf.equal(tf.argmax(y, axis=1), tf.argmax(pred, axis=1))
accuracy = tf.reduce_mean(tf.cast(equal, tf.float32))
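The loss and accuracy defined above can be reproduced on a tiny example with NumPy. This is a sketch of the underlying arithmetic (mean softmax cross-entropy on logits, and the share of rows whose argmax matches the label), not TensorFlow's own implementation; the logits and labels are made-up values.

```python
import numpy as np

def softmax_cross_entropy(logits, onehot_labels):
    """Mean cross-entropy between softmax(logits) and one-hot labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(onehot_labels * log_probs).sum(axis=1).mean())

def accuracy_from_logits(logits, onehot_labels):
    """Fraction of samples whose highest logit matches the labelled class."""
    hits = logits.argmax(axis=1) == onehot_labels.argmax(axis=1)
    return float(hits.mean())

logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

loss = softmax_cross_entropy(logits, labels)
acc = accuracy_from_logits(logits, labels)
print(acc)  # 0.5
```

Because argmax is unchanged by softmax, accuracy can be computed directly on the logits, which is why the model above never applies softmax to pred.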
4. Train the model
Training parameters
train_epoch = 10
batch_size = 1000
batch_num = x_train.shape[0] // batch_size
# Records of loss and accuracy
step = 0
display_step = 10
loss_list = []
acc_list = []
# Non-trainable variable that stores the number of completed epochs in the checkpoint
epoch = tf.Variable(0, name='epoch', trainable=False)
# Initialize the variables
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
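Resume training from a checkpoint
ckpt_dir = './ckpt_dir/cifar10'
if not os.path.exists(ckpt_dir):
    os.makedirs(ckpt_dir)
# Save every variable except those with 'Adam' in the name (optimizer state)
vl = [v for v in tf.global_variables() if 'Adam' not in v.name]
saver = tf.train.Saver(var_list=vl, max_to_keep=1)
ckpt = tf.train.latest_checkpoint(ckpt_dir)
if ckpt is not None:
    saver.restore(sess, ckpt)
# 0 on a fresh run, otherwise the epoch count restored from the checkpoint
start_ep = sess.run(epoch)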
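The resume logic is independent of TensorFlow: store the number of completed epochs next to the weights, and start the loop from that number on the next run. The sketch below shows only that bookkeeping, with a JSON file standing in for the checkpoint; the file name, the train helper, and the stop_after parameter are all assumptions made for this example.

```python
import json
import os
import tempfile

def load_epoch(state_path):
    """Return the number of completed epochs, or 0 when no checkpoint exists."""
    if not os.path.exists(state_path):
        return 0
    with open(state_path) as f:
        return json.load(f)["epoch"]

def train(state_path, total_epochs, stop_after=None):
    """Run epochs from the saved position; `stop_after` simulates an interruption."""
    start_ep = load_epoch(state_path)
    ran = []
    for ep in range(start_ep, total_epochs):
        ran.append(ep + 1)                    # stand-in for one epoch of training
        with open(state_path, "w") as f:
            json.dump({"epoch": ep + 1}, f)   # save progress after each epoch
        if stop_after is not None and ep + 1 == stop_after:
            break
    return ran

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "state.json")
    first_run = train(path, total_epochs=10, stop_after=8)
    second_run = train(path, total_epochs=10)

print(first_run)   # [1, 2, 3, 4, 5, 6, 7, 8]
print(second_run)  # [9, 10]
```

The second call picks up at epoch 9 because the saved counter, not the loop variable, decides where training starts.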
Iterative training
start_time = time()
for ep in range(start_ep, train_epoch):
    # Shuffle the training set
    x_train, y_train = shuffle(x_train, y_train)
    print('epoch:{}/{}'.format(ep+1, train_epoch))
    for batch in range(batch_num):
        xi = x_train[batch*batch_size:(batch+1)*batch_size]
        yi = y_train[batch*batch_size:(batch+1)*batch_size]
        sess.run(optimizer, feed_dict={x: xi, y: yi})
        step = step + 1
        # Every display_step steps, record loss and accuracy on the validation set
        if step % display_step == 0:
            loss, acc = sess.run([loss_function, accuracy],
                                 feed_dict={x: x_valid, y: y_valid})
            loss_list.append(loss)
            acc_list.append(acc)
    # Save a checkpoint at the end of each epoch
    sess.run(epoch.assign(ep+1))
    saver.save(sess, os.path.join(ckpt_dir,
               'cifar10_model.ckpt'), global_step=ep+1)
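The shuffle-then-slice pattern in the loop above can be isolated as a small generator. This NumPy sketch uses one permutation for both arrays so that samples and labels stay aligned, and drops a trailing partial batch, as the integer division in batch_num does; iterate_minibatches and the toy data are assumptions for illustration.

```python
import numpy as np

def iterate_minibatches(x, y, batch_size, rng):
    """Yield shuffled (x, y) mini-batches; a trailing partial batch is dropped."""
    perm = rng.permutation(len(x))     # one permutation keeps x and y aligned
    x, y = x[perm], y[perm]
    for b in range(len(x) // batch_size):
        start, stop = b * batch_size, (b + 1) * batch_size
        yield x[start:stop], y[start:stop]

rng = np.random.default_rng(0)
x = np.arange(10).reshape(10, 1)       # toy samples
y = np.arange(10) * 10                 # toy labels, y = 10 * x
batches = list(iterate_minibatches(x, y, batch_size=4, rng=rng))
print(len(batches))  # 2
```

With 45000 training samples and a batch size of 1000, the same arithmetic gives 45 batches per epoch with nothing dropped.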
5. Results Visualization
end_time = time()
# Evaluate on the test set
y_pred, acc = sess.run([pred, accuracy],
                       feed_dict={x: x_test, y: y_test})
# Plot the validation loss and accuracy recorded during training
fig2 = plt.figure(2, figsize=(12, 6))
ax = fig2.add_subplot(1, 2, 1)
ax.plot(loss_list, 'r-')
ax.set_title('loss')
ax = fig2.add_subplot(1, 2, 2)
ax.plot(acc_list, 'b-')
ax.set_title('acc')
print('Elapsed time: %.1fs' % (end_time - start_time))
print('Accuracy:{:.2%}'.format(acc))
show(x_test[0:16], y_test[0:16], y_pred[0:16])
Training was interrupted after the 8th epoch; when the program was run again, training resumed from the 9th epoch
Loss and accuracy on the validation set
Labels and predictions for the first 16 images
Three, TensorFlow 2.x
1. Load the dataset
The labels are kept as integers and are not one-hot encoded
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from time import time
cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Display 16 images with their labels and predictions
def show(images, labels, preds):
    label_dict = {
        0:'plane', 1:'car', 2:'bird', 3:'cat',
        4:'deer', 5:'dog', 6:'frog', 7:'horse',
        8:'ship', 9:'truck'}
    fig1 = plt.figure(1, figsize=(12, 12))
    for i in range(16):
        ax = fig1.add_subplot(4, 4, i+1)
        ax.imshow(images[i], cmap='binary')
        # labels and preds are integer class indices here
        label = label_dict[labels[i]]
        pred = label_dict[preds[i]]
        title = 'label:%s,pred:%s' % (label, pred)
        ax.set_title(title)
        ax.set_xticks([])
        ax.set_yticks([])
2. Data processing
The validation set is not split off by hand here
# Normalize pixel values to [0, 1]; the labels stay as integers
x_train = (x_train/255.0).astype(np.float32)
x_test = (x_test/255.0).astype(np.float32)
3. Define the model
Create the model
model = tf.keras.models.Sequential()
# Add the layers
model.add(tf.keras.layers.Conv2D(filters=32, kernel_size=(3,3),
          input_shape=(32,32,3), activation='relu', padding='same'))
model.add(tf.keras.layers.Dropout(rate=0.3))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2,2)))
model.add(tf.keras.layers.Conv2D(filters=64, kernel_size=(3,3),
          activation='relu', padding='same'))
model.add(tf.keras.layers.Dropout(rate=0.3))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2,2)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(units=10,
          kernel_initializer='normal', activation='softmax'))
# Model summary
model.summary()
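As a cross-check for the summary, the output shapes and parameter counts of this model can be tallied by hand. The sketch below does the arithmetic in plain Python; the layer names are only illustrative labels, and the Dropout layers are left out because they have no parameters and do not change the shape.

```python
def conv_params(k, c_in, c_out):
    """Weights plus biases of a k x k convolution from c_in to c_out channels."""
    return k * k * c_in * c_out + c_out

h = w = 32
rows = []                                           # (layer, output shape, parameters)
rows.append(("conv 1", (h, w, 32), conv_params(3, 3, 32)))    # 'same' padding keeps 32x32
h, w = h // 2, w // 2
rows.append(("pool 1", (h, w, 32), 0))
rows.append(("conv 2", (h, w, 64), conv_params(3, 32, 64)))
h, w = h // 2, w // 2
rows.append(("pool 2", (h, w, 64), 0))
flat = h * w * 64
rows.append(("flatten", (flat,), 0))
rows.append(("dense", (10,), flat * 10 + 10))

total = sum(params for _, _, params in rows)
for name, shape, params in rows:
    print(name, shape, params)
print(total)  # 60362
```

Most of the parameters sit in the final dense layer, since it connects all 4096 flattened values to the 10 outputs.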
Load the model
'''
# Load the model weights saved as a single file
mpath = './model/cifar10.h5'
model.load_weights(mpath)
'''
# Load the model weights from the latest checkpoint
cdir = './model/'
ckpt = tf.train.latest_checkpoint(cdir)
if ckpt is not None:
    model.load_weights(ckpt)
Compile the model
# The labels are integers, so use the sparse version of the loss
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
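4. Train the model
# Callbacks: save a checkpoint after every epoch, and stop early when val_loss stops improving
# Note: the suffix is '.H5' rather than '.h5'. This assumes a TF 2.x release in which
# save_weights writes TensorFlow-format checkpoints for that suffix, which is what
# tf.train.latest_checkpoint looks for when the model is loaded.
cpath = './model/cifar10.{epoch:02d}-{val_loss:.4f}.H5'
callbacks = [tf.keras.callbacks.ModelCheckpoint(filepath=cpath,
                                                save_weights_only=True,
                                                verbose=1,
                                                save_freq='epoch'),
             tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                              patience=3)]
# Train the model; 20% of the training data is held out for validation
start_time = time()
history = model.fit(x_train, y_train,
                    validation_split=0.2, epochs=10, batch_size=1000,
                    callbacks=callbacks, verbose=1)
end_time = time()
print('Elapsed time: %.1fs' % (end_time-start_time))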
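Two details of the callbacks are easy to check without training anything: how the checkpoint file name is filled in, and when early stopping with patience=3 would trigger. The sketch below is a simplified model of the patience rule (stop once the monitored loss has failed to improve for that many consecutive epochs), not the Keras source; stopping_epoch and the loss values are made up.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training stops, or None if it never stops early."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0      # improvement: reset the counter
        else:
            wait += 1                 # no improvement this epoch
            if wait >= patience:
                return epoch
    return None

# The file name template is filled with the epoch number and the validation loss
cpath = './model/cifar10.{epoch:02d}-{val_loss:.4f}.H5'
name = cpath.format(epoch=3, val_loss=1.23456)
print(name)  # ./model/cifar10.03-1.2346.H5

losses = [1.9, 1.6, 1.5, 1.55, 1.52, 1.51, 1.4]   # made-up validation losses
stopped_at = stopping_epoch(losses, patience=3)
print(stopped_at)  # 6
```

In this made-up run the best loss occurs at epoch 3, and three epochs without improvement end training at epoch 6, before the later improvement at epoch 7 is ever seen.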
5. Results Visualization
history.history is a dictionary with the keys loss, accuracy, val_loss, and val_accuracy
# Plot the validation loss and accuracy for each epoch
fig2 = plt.figure(2, figsize=(12, 6))
ax = fig2.add_subplot(1, 2, 1)
ax.plot(history.history['val_loss'], 'r-')
ax.set_title('loss')
ax = fig2.add_subplot(1, 2, 2)
ax.plot(history.history['val_accuracy'], 'b-')
ax.set_title('acc')
Model evaluation
# Evaluate on the test set
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=1)
print('Loss:%.2f' % test_loss)
print('Accuracy:{:.2%}'.format(test_acc))
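Model prediction
# Predict the class of each test image
# (predict_classes was removed in later TF 2.x releases, so take the argmax of the predicted probabilities)
preds = np.argmax(model.predict(x_test), axis=1)
show(x_test[0:16], y_test[0:16].flatten(), preds[0:16])
# Save the model weights
#model.save_weights(mpath)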
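Since the output layer uses softmax, the model returns one probability per class, and the predicted class is the index of the largest one. The NumPy sketch below shows that step on made-up probabilities for three samples and three classes, together with the flatten() call that turns (n, 1) labels into a 1-D array for comparison.

```python
import numpy as np

# Made-up softmax outputs for three samples and three classes
probs = np.array([[0.1, 0.7, 0.2],
                  [0.5, 0.3, 0.2],
                  [0.2, 0.2, 0.6]])
pred_classes = np.argmax(probs, axis=1)      # index of the most probable class per row

labels = np.array([[1], [0], [1]])           # (n, 1) layout, as returned by load_data()
acc = float(np.mean(pred_classes == labels.flatten()))

print(pred_classes)  # [1 0 2]
```

Without flatten(), comparing a (3,) array with a (3, 1) array would broadcast to a 3x3 matrix and give a misleading accuracy.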
Network structure
The model is saved after training
Loss and accuracy on the validation set
Labels and predictions for the first 16 images
After loading the model, the accuracy is the same as before
The program was stopped after the second round of training
The accuracy then improves on top of the previous training
Summary
A convolutional neural network extracts features from the data with far fewer parameters than a fully connected network, which speeds up training;
after one-hot encoding with OneHotEncoder, call toarray() to convert the result into a dense array before using it in computation;
when training takes a long time, save checkpoints regularly so that an interrupted run can resume from the last checkpoint;
when saving a checkpoint in tf1.x, multiple weights with the same name may be saved, while weights with Adam in their names may not be saved; restart the console between saving and loading so that the weight names match when the checkpoint is loaded.