I. Overview
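VGG is a convolutional neural network proposed by the University of Oxford in 2014. Its simplicity and practicality quickly made it one of the most popular convolutional network models of its time, and it performs very well on both image classification and object detection. In the ILSVRC 2014 competition, VGG reached a top-5 accuracy of 92.7%.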
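[Figure: VGG architecture diagram]
VGG-16
VGG comes in several variants, the most popular of which is VGG-16, a model with 16 weight layers (13 convolutional and 3 fully connected). As the diagram shows, it expects input of shape 224 × 224 × 3.
II. Implementing VGG16 in Keras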
from keras import Sequential
from keras.layers import Dense, Activation, Conv2D, MaxPooling2D, Flatten, Dropout
from keras.layers import Input
from keras.optimizers import SGD

model = Sequential()

# BLOCK 1
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block1_conv1', input_shape=(224, 224, 3)))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block1_conv2'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='block1_pool'))

# BLOCK 2
model.add(Conv2D(filters=128, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block2_conv1'))
model.add(Conv2D(filters=128, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block2_conv2'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='block2_pool'))

# BLOCK 3
model.add(Conv2D(filters=256, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block3_conv1'))
model.add(Conv2D(filters=256, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block3_conv2'))
model.add(Conv2D(filters=256, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block3_conv3'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='block3_pool'))

# BLOCK 4
model.add(Conv2D(filters=512, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block4_conv1'))
model.add(Conv2D(filters=512, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block4_conv2'))
model.add(Conv2D(filters=512, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block4_conv3'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='block4_pool'))

# BLOCK 5
model.add(Conv2D(filters=512, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block5_conv1'))
model.add(Conv2D(filters=512, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block5_conv2'))
model.add(Conv2D(filters=512, kernel_size=(3, 3), activation='relu', padding='same',
                 name='block5_conv3'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='block5_pool'))

# Classifier
model.add(Flatten())
model.add(Dense(4096, activation='relu', name='fc1'))
model.add(Dropout(0.5))
model.add(Dense(4096, activation='relu', name='fc2'))
model.add(Dropout(0.5))
model.add(Dense(1000, activation='softmax', name='prediction'))

model.summary()
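The model.summary() call above prints each layer's output shape and parameter count. As a rough cross-check that needs no Keras, the same totals can be derived by hand from the layer configuration. This is only a sketch: it assumes 3x3 'same' convolutions with biases and 2x2 stride-2 pooling, as in the code above, and the names BLOCKS, FC_UNITS and describe_vgg16 are illustrative, not part of Keras.

```python
# (number of conv layers, filters) for the five convolutional blocks
BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
# Units of the three fully connected layers
FC_UNITS = [4096, 4096, 1000]


def describe_vgg16(height=224, width=224, channels=3):
    """Return (weight layers, shape after block 5, total parameters)."""
    params = 0
    weight_layers = 0
    for n_convs, filters in BLOCKS:
        for _ in range(n_convs):
            # 3x3 kernel weights for every input/output channel pair, plus biases
            params += 3 * 3 * channels * filters + filters
            channels = filters
            weight_layers += 1
        # 2x2 max pooling with stride 2 halves the spatial size
        height //= 2
        width //= 2
    conv_shape = (height, width, channels)

    features = height * width * channels  # size of the Flatten output
    for units in FC_UNITS:
        params += features * units + units
        features = units
        weight_layers += 1
    return weight_layers, conv_shape, params


layers, conv_shape, total_params = describe_vgg16()
print(layers, conv_shape, total_params)
```

With the default 224 × 224 × 3 input this gives 16 weight layers, a 7 × 7 × 512 feature map after block 5, and 138,357,544 parameters, most of which sit in the first fully connected layer.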
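That completes a hand-written Keras implementation of VGG16; the next step would be to train the model.
However, the Keras Applications module (keras.applications) already provides Keras models with pre-trained weights, which can be used for prediction, feature extraction, and fine-tuning.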
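III. The VGG16 model in keras.applications
1. Model information
The VGG16 weights were obtained by training on ImageNet. The model works with both the Theano and TensorFlow backends, accepts both the channels_first and channels_last dimension orderings, and has a default input size of 224x224.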
import keras

model = keras.applications.vgg16.VGG16(include_top=True, weights='imagenet',
                                       input_tensor=None, input_shape=None,
                                       pooling=None, classes=1000)
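Parameters
- include_top: whether to keep the 3 fully connected layers at the top of the network.
- weights: None means random initialization, i.e. no pre-trained weights are loaded. 'imagenet' loads the weights pre-trained on ImageNet.
- input_tensor: optional Keras tensor to use as the image input of the model.
- input_shape: optional, only takes effect when include_top=False (with include_top=True the input must be 224 × 224 × 3). It should be a tuple of length 3 giving the shape of the input images, whose width and height must be at least 48, e.g. (200, 200, 3).
- pooling: specifies the pooling mode when include_top=False. None means no pooling, so the output of the last convolutional block is a 4D tensor. 'avg' applies global average pooling and 'max' applies global max pooling.
- classes: optional number of classes to classify images into; only usable when include_top=True and no pre-trained weights are loaded.
Return value
A Keras model object.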
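How include_top and pooling interact is easiest to see from the shape of the model output. The helper below is a hypothetical, Keras-free sketch of the behaviour described in the parameter list, not the library's own code. It assumes channels_last ordering and a fully specified input_shape, and vgg16_output_shape is a name made up for this illustration.

```python
def vgg16_output_shape(include_top=True, pooling=None,
                       input_shape=(224, 224, 3), classes=1000):
    """Expected output shape (batch dimension shown as None)."""
    if include_top:
        # The fully connected head ends in a softmax over `classes`
        return (None, classes)
    height, width, _ = input_shape
    if pooling in ('avg', 'max'):
        # Global pooling collapses the spatial dimensions
        return (None, 512)
    # Five 2x2 stride-2 poolings shrink each spatial side by a factor of 32
    return (None, height // 32, width // 32, 512)


print(vgg16_output_shape())                                        # classifier
print(vgg16_output_shape(include_top=False))                       # 4D feature map
print(vgg16_output_shape(include_top=False, pooling='avg'))        # 2D features
print(vgg16_output_shape(include_top=False, input_shape=(200, 200, 3)))
```

For feature extraction, include_top=False with pooling='avg' is a common choice because it yields one fixed-length 512-dimensional vector per image regardless of the input size.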
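2. Pre-trained weights
Note: the weight file (.h5) must be placed in ~/.keras/models. If the file is not in that directory, it is downloaded automatically when the model is loaded in the previous step.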
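To see whether loading the model will trigger a download, you can check the cache directory yourself. The sketch below only builds the expected path and tests whether the file exists. The two file names are an assumption based on the names used by Keras releases for the TensorFlow-ordered VGG16 weights and may differ in your version; vgg16_weights_path is a hypothetical helper.

```python
from pathlib import Path

# Assumed file names; check your Keras version if the lookup never succeeds.
WEIGHTS_WITH_TOP = "vgg16_weights_tf_dim_ordering_tf_kernels.h5"
WEIGHTS_NO_TOP = "vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5"


def vgg16_weights_path(include_top=True):
    """Return where the cached VGG16 weight file is expected to live."""
    name = WEIGHTS_WITH_TOP if include_top else WEIGHTS_NO_TOP
    return Path.home() / ".keras" / "models" / name


path = vgg16_weights_path()
status = "already cached" if path.exists() else "will be downloaded on first load"
print(path, "->", status)
```

If the machine has no network access, copying the .h5 file into this directory by hand avoids the download attempt.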
3. Making predictions
from keras.preprocessing import image
from keras.applications.imagenet_utils import preprocess_input, decode_predictions
import numpy as np
import time

t0 = time.time()
img = image.load_img('VGG_16_CAT.jpg', target_size=(224, 224))
x = image.img_to_array(img)      # 3D array of shape (224, 224, 3)
x = np.expand_dims(x, axis=0)    # 4D batch of shape (1, 224, 224, 3)
x = preprocess_input(x)          # ImageNet preprocessing
print(x.shape)
y_pred = model.predict(x)        # predicted class probabilities
t1 = time.time()
# Print the five most probable classes as (class ID, description, probability)
print("Test image:", decode_predictions(y_pred))
print("Elapsed:", str((t1 - t0) * 1000), "ms")

Output:
(1, 224, 224, 3)
Test image: [[('n02123045', 'tabby', 0.73477006), ('n02124075', 'Egyptian_cat', 0.07941937), ('n02123159', 'tiger_cat', 0.07054488), ('n02883205', 'bow_tie', 0.019230891), ('n04553703', 'washbasin', 0.013854385)]]
Elapsed: 539.731502532959 ms
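The two helper calls in the prediction code hide most of the work. As a rough NumPy sketch of what they do, the block below assumes channels_last input and the default "caffe" mode of preprocess_input, which converts RGB to BGR and subtracts the per-channel ImageNet means without scaling. The functions caffe_style_preprocess and top_k are illustrative stand-ins, not Keras functions, and top_k returns class indices only, whereas decode_predictions also maps them to ImageNet IDs and names.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by the "caffe" mode
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype="float32")


def caffe_style_preprocess(batch_rgb):
    """RGB -> BGR, then zero-center each channel (no scaling)."""
    x = np.asarray(batch_rgb, dtype="float32")
    x = x[..., ::-1]                 # reverse the channel axis
    return x - IMAGENET_BGR_MEAN


def top_k(probabilities, k=5):
    """Return the k most probable (class index, probability) pairs."""
    order = np.argsort(probabilities)[::-1][:k]
    return [(int(i), float(probabilities[i])) for i in order]


# A pure red image: R = 255, G = B = 0
batch = np.zeros((1, 224, 224, 3), dtype="float32")
batch[..., 0] = 255.0
processed = caffe_style_preprocess(batch)
print(processed.shape, processed[0, 0, 0])

# Toy probability vector standing in for one row of model.predict(x)
probs = np.array([0.10, 0.60, 0.05, 0.25])
best_two = top_k(probs, k=2)
print(best_two)
```

Because the means are subtracted from raw pixel values, the input should stay in the 0 to 255 range; dividing by 255 before calling preprocess_input would skew the predictions.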
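4. Saving and restoring a trained model with Keras
Once you have finished training with Keras, you can save your network in HDF5 format, which requires h5py to be installed first. HDF5 is well suited to storing large amounts of numerical data and working with that data from NumPy. For example, a multi-terabyte dataset stored on disk can easily be sliced as if it were a real NumPy array. You can also store multiple datasets in a single file, iterate over them, and inspect their .shape and .dtype attributes.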
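The HDF5 features mentioned above can be tried with h5py directly. The following is a minimal sketch on a tiny file written to a temporary directory; the dataset names "features" and "labels" are made up for the example.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "demo.h5")

# Store two datasets in a single file
with h5py.File(path, "w") as f:
    f.create_dataset("features", data=np.arange(20, dtype="float32").reshape(4, 5))
    f.create_dataset("labels", data=np.array([0, 1, 1, 0], dtype="int64"))

with h5py.File(path, "r") as f:
    names = sorted(f.keys())          # iterate over the datasets in the file
    features = f["features"]          # a handle; no data is read yet
    shape, dtype = features.shape, features.dtype
    block = features[1:3, :2]         # only this slice is read from disk

print(names, shape, dtype)
print(block)
```

The slice comes back as an ordinary NumPy array, which is why a dataset much larger than memory can still be processed piece by piece.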
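Saving weights
To save the trained weights, call the save_weights function directly.
model.save_weights("my_model_weights.h5")
Loading pre-trained weights
To restore them, call load_weights with the same file name on a model that has the same architecture.
model.load_weights("my_model_weights.h5")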
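A quick way to convince yourself that the round trip works is to save the weights of one model and load them into a freshly built copy. The sketch below uses a hypothetical two-layer network instead of VGG16 so that it runs in a moment. It also assumes the ".weights.h5" file suffix, which recent Keras releases require for save_weights and which older releases accept as an ordinary HDF5 file name.

```python
import os
import tempfile

import numpy as np
from keras.models import Sequential
from keras.layers import Dense


def build_model():
    # Hypothetical tiny network standing in for VGG16
    return Sequential([
        Dense(8, activation='relu', input_shape=(4,)),
        Dense(3, activation='softmax'),
    ])


# Assumption: ".weights.h5" suffix, needed by recent Keras releases
weights_path = os.path.join(tempfile.mkdtemp(), "my_model.weights.h5")

trained = build_model()
trained.save_weights(weights_path)

restored = build_model()               # same architecture, new random weights
restored.load_weights(weights_path)

# Every weight array should now be identical in both models
weights_match = all(
    np.array_equal(a, b)
    for a, b in zip(trained.get_weights(), restored.get_weights())
)
print("weights restored:", weights_match)
```

Note that load_weights only restores parameter values, so the receiving model has to be built with the same layers first.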
References:
Very Deep Convolutional Networks for Large-Scale Image Recognition
How to develop a complex deep learning model from scratch with Keras + TensorFlow (如何从零使用 Keras + TensorFlow 开发一个复杂深度学习模型)