Code Practice | Image Classification with Convolutional Neural Networks

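In this section we run our experiments on image classification with the Fashion MNIST dataset. In August 2017, Zalando Research, a German research organization, released this new dataset on GitHub. The training set contains 60,000 examples and the test set contains 10,000, divided into 10 classes, and every class has the same number of training samples (6,000) and test samples (1,000). The samples are everyday clothing, trousers, shoes, and bags; each one is a 28×28 grayscale image carrying one of the 10 class labels. The Fashion MNIST dataset is shown in Figure 4.59.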

Figure 4.59 The Fashion MNIST dataset
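As a small aid for reading the integer labels, the ten classes can be mapped to names. The names below follow the label list published with the dataset; `CLASS_NAMES` and `label_to_name` are illustrative names and are not part of the original notebook.

```python
# Fashion MNIST label index -> class name (as listed in the dataset's README).
CLASS_NAMES = [
    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot',
]


def label_to_name(label):
    """Return the class name for an integer label in [0, 9]."""
    return CLASS_NAMES[int(label)]


print(label_to_name(9))
```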

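We use this dataset so that you can see the whole image-classification pipeline: how to convert image data into a format the computer can read, feed it into a neural network model for training, and finally obtain the classification results we want.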

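So why not simply use a dataset bundled with Keras? Because the datasets bundled with Keras, such as CIFAR-10, are already preprocessed and can be loaded with a single call, which would leave the preprocessing step out of your understanding. In addition, the later section on image classification with transfer learning uses this same Fashion MNIST dataset, so that everyone ends up with a clear picture of the overall image-classification workflow.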

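Image preprocessing is somewhat simpler than text preprocessing: read the data in with Python, convert it into arrays of the same size, and then normalize the arrays so that the neural network can consume them. All of these operations appear in the code later in this section.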
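The read, reshape, and normalize steps can be sketched on a tiny synthetic buffer, without the real dataset files. This is only an illustration: the buffer imitates the IDX layout used by the Fashion MNIST image files (a 16-byte header followed by raw pixel bytes), and the two-image size and pixel values are made up for the demonstration.

```python
import struct

import numpy as np

# Build a tiny synthetic IDX image buffer: 16-byte header + raw pixel bytes.
# Header = magic number 2051, image count, rows, cols (big-endian 32-bit ints).
n, rows, cols = 2, 28, 28
header = struct.pack('>IIII', 2051, n, rows, cols)
pixels = (np.arange(n * rows * cols) % 256).astype(np.uint8)
raw = header + pixels.tobytes()

# Skip the header, then reshape to (samples, height, width, channels).
images = np.frombuffer(raw, dtype=np.uint8, offset=16).reshape(n, rows, cols, 1)

# Convert to float32 and scale pixel values from [0, 255] to [0, 1].
images = images.astype('float32') / 255.0

print(images.shape, images.dtype, images.min(), images.max())
```

Note that `astype` makes a writable copy; the array returned by `np.frombuffer` itself is read-only.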

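One more thing to cover here is Keras's image generator, ImageDataGenerator. It offers many operations, such as flipping, rotation, and zooming, whose purpose is to generate more, and more varied, image data. A model trained on this data tends to generalize better and is therefore more accurate on unseen images.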

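from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    featurewise_center=False,  # set the input mean over the dataset to 0
    samplewise_center=False,  # set each sample's mean to 0
    featurewise_std_normalization=False,  # divide inputs by the dataset std
    samplewise_std_normalization=False,  # divide each input by its own std
    zca_whitening=False,  # apply ZCA whitening
    zca_epsilon=1e-06,  # epsilon for ZCA whitening
    rotation_range=0,  # range (in degrees) for random rotations
    validation_split=0.0)  # fraction of the data reserved for validation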

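The code above is the core of image augmentation, and it covers only some of the available operations. The rest can be looked up in the official documentation, which describes every parameter in detail, so I will not repeat it here.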

Here is the link:

https://keras.io/zh/preprocessing/image/
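To make the flip and shift operations concrete, here is a NumPy-only sketch of what a horizontal flip and a horizontal shift do to a single image. It is not Keras's implementation (ImageDataGenerator draws random amounts and supports several fill modes); the helper names `horizontal_flip` and `shift_right` and the zero fill are assumptions made for this illustration.

```python
import numpy as np


def horizontal_flip(img):
    """Mirror an (H, W, C) image left to right."""
    return img[:, ::-1, :]


def shift_right(img, pixels):
    """Shift an (H, W, C) image right by `pixels`, filling the gap with zeros."""
    if pixels <= 0:
        return img.copy()
    out = np.zeros_like(img)
    out[:, pixels:, :] = img[:, :-pixels, :]
    return out


# A 4x4 single-channel toy image whose first row is [0, 1, 2, 3].
img = np.arange(16, dtype=np.float32).reshape(4, 4, 1)
flipped = horizontal_flip(img)
shifted = shift_right(img, 1)

print(flipped[0, :, 0])  # first row, mirrored
print(shifted[0, :, 0])  # first row, moved one pixel to the right
```

Both operations keep the image shape unchanged, which is why augmented images can be fed to the same network as the originals.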

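2. Experiment workflow

(1) Load the image data

(2) Preprocess the image data

(3) Train the model

(4) Save and visualize the model

(5) Visualize the training process

3. Code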

# chapter4/4_7_4_Tradition_cnn_image.ipynb
import gzip
import os

import numpy as np
import keras
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D

# os.environ["CUDA_VISIBLE_DEVICES"] = "2"  # use the third GPU (index 2)

1) Loading and preprocessing the data

# Place the dataset files in the same directory as this code
def load_data():
    paths = [
        'train-labels-idx1-ubyte.gz', 'train-images-idx3-ubyte.gz',
        't10k-labels-idx1-ubyte.gz', 't10k-images-idx3-ubyte.gz'
    ]

    # Label files: skip the 8-byte header, then one byte per label
    with gzip.open(paths[0], 'rb') as lbpath:
        y_train = np.frombuffer(lbpath.read(), np.uint8, offset=8)

    # Image files: skip the 16-byte header, then 28x28 bytes per image
    with gzip.open(paths[1], 'rb') as imgpath:
        x_train = np.frombuffer(
            imgpath.read(), np.uint8, offset=16).reshape(len(y_train), 28, 28, 1)

    with gzip.open(paths[2], 'rb') as lbpath:
        y_test = np.frombuffer(lbpath.read(), np.uint8, offset=8)

    with gzip.open(paths[3], 'rb') as imgpath:
        x_test = np.frombuffer(
            imgpath.read(), np.uint8, offset=16).reshape(len(y_test), 28, 28, 1)
    return (x_train, y_train), (x_test, y_test)


(x_train, y_train), (x_test, y_test) = load_data()

batch_size = 32
num_classes = 10
epochs = 5
data_augmentation = True  # enable image augmentation
num_predictions = 20
save_dir = os.path.join(os.getcwd(), 'saved_models_cnn')
model_name = 'keras_fashion_trained_model.h5'

# Convert the class labels to one-hot encoding
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# astype() returns a writable float32 copy of the read-only buffer
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

x_train /= 255  # normalize to [0, 1]
x_test /= 255  # normalize to [0, 1]
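The one-hot step can be illustrated with plain NumPy. The sketch below is not how `keras.utils.to_categorical` is implemented; the helper name `one_hot` and the three sample labels are assumptions chosen for the demonstration.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer labels of shape (N,) into an (N, num_classes) float32 matrix."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Set a single 1 in each row, at the column given by that row's label.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded


y = one_hot([0, 3, 9], num_classes=10)
print(y.shape)
print(y[1])
```

Each row contains exactly one 1, and `np.argmax(y, axis=1)` recovers the original integer labels, which is convenient when reading model predictions.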

2) Building a traditional CNN model

model = Sequential()
# 32 and (3, 3) are the number and size of the convolution kernels;
# the first layer must also be told the shape of the input images
model.add(Conv2D(32, (3, 3), padding='same',
                 input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

# Initialize the RMSprop optimizer
# (lr and decay follow the older Keras API that this notebook targets)
opt = keras.optimizers.RMSprop(lr=0.0001, decay=1e-6)

# Compile the model with the RMSprop optimizer
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
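To see how the feature maps shrink through this network, the shapes and parameter counts can be worked out by hand. The sketch below is plain Python arithmetic, not a Keras call. It assumes stride-1 3×3 convolutions, 2×2 pooling with stride 2, and a 28×28×1 input, as in the model above; the helper names `conv_out` and `conv_params` are illustrative.

```python
def conv_out(size, kernel, padding):
    """Spatial output size of a stride-1 convolution."""
    return size if padding == 'same' else size - kernel + 1


def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a kernel x kernel convolution."""
    return kernel * kernel * in_ch * out_ch + out_ch


size, channels, total = 28, 1, 0
# (filters, padding) for the four Conv2D layers; a pool follows every second conv.
layers = [(32, 'same'), (32, 'valid'), (64, 'same'), (64, 'valid')]
for i, (filters, padding) in enumerate(layers):
    total += conv_params(3, channels, filters)
    size, channels = conv_out(size, 3, padding), filters
    if i % 2 == 1:
        size //= 2  # 2x2 max pooling halves the size, rounding down

flat = size * size * channels  # length of the vector produced by Flatten
total += flat * 512 + 512      # Dense(512)
total += 512 * 10 + 10         # Dense(num_classes) with 10 classes
print(size, flat, total)
```

Under these assumptions the feature map before Flatten is 5×5×64 (1,600 values) and the network has 889,834 trainable parameters, which can be compared against the output of `model.summary()`. Most of the parameters sit in the first Dense layer.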

3) Training

if not data_augmentation:
    print('Not using data augmentation.')
    history = model.fit(x_train, y_train,
                        batch_size=batch_size,
                        epochs=epochs,
                        validation_data=(x_test, y_test),
                        shuffle=True)
else:
    print('Using real-time data augmentation.')
    # Data preprocessing and real-time data augmentation
    datagen = ImageDataGenerator(
        featurewise_center=False,
        samplewise_center=False,
        featurewise_std_normalization=False,
        samplewise_std_normalization=False,
        zca_whitening=False,
        zca_epsilon=1e-06,
        rotation_range=0,
        width_shift_range=0.1,  # random horizontal shift, as a fraction of the width
        height_shift_range=0.1,  # random vertical shift, as a fraction of the height
        shear_range=0.,
        zoom_range=0.,
        channel_shift_range=0.,
        fill_mode='nearest',
        cval=0.,
        horizontal_flip=True,  # randomly flip images horizontally
        vertical_flip=False,
        rescale=None,
        preprocessing_function=None,
        data_format=None,
        validation_split=0.0)

    # Only required when featurewise_* or zca_whitening is enabled; harmless here
    datagen.fit(x_train)
    print(x_train.shape[0] // batch_size)  # integer division
    print(x_train.shape[0] / batch_size)  # true division, keeps the fractional part

    # Fit the model.
    # datagen.flow() generates augmented batches of size batch_size from x and y.
    # flow_from_directory() generates augmented data from a directory path instead;
    # its main advantage over flow() is that it does not read all the data into
    # memory at once, which reduces memory pressure and avoids OOM errors.
    history = model.fit_generator(
        datagen.flow(x_train, y_train, batch_size=batch_size),
        epochs=epochs,
        steps_per_epoch=x_train.shape[0] // batch_size,
        validation_data=(x_test, y_test),
        # Maximum number of workers to start (threads by default,
        # processes when use_multiprocessing=True)
        workers=10)
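The two `print` calls above show why `steps_per_epoch` uses integer division. A short sketch of the arithmetic, assuming the 60,000-image training set and the batch size of 32 used here; the batch size of 64 in the last line is a hypothetical value added only to show a case where the two results differ.

```python
import math

num_samples = 60000  # size of the Fashion MNIST training set
batch_size = 32

steps_floor = num_samples // batch_size           # drops a trailing partial batch
steps_ceil = math.ceil(num_samples / batch_size)  # counts a trailing partial batch
print(steps_floor, steps_ceil)

# With a batch size that does not divide the dataset evenly, the two differ.
print(num_samples // 64, math.ceil(num_samples / 64))
```

With a batch size of 32 the division is exact, so both approaches give 1,875 steps and every image is seen once per epoch.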

4) Visualizing and saving the model

# Print a layer-by-layer summary of the model
model.summary()
# Save the model
if not os.path.isdir(save_dir):
    os.makedirs(save_dir)
model_path = os.path.join(save_dir, model_name)
model.save(model_path)
print('Saved trained model at %s ' % model_path)

5) Visualizing the training process

import matplotlib.pyplot as plt

# Older Keras versions record accuracy under 'acc', newer ones under 'accuracy'
acc_key = 'acc' if 'acc' in history.history else 'accuracy'

# Plot training & validation accuracy
plt.plot(history.history[acc_key])
plt.plot(history.history['val_' + acc_key])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Valid'], loc='upper left')
plt.savefig('tradition_cnn_valid_acc.png')
plt.show()

# Plot training & validation loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Valid'], loc='upper left')
plt.savefig('tradition_cnn_valid_loss.png')
plt.show()