Cat and dog recognition with a Keras pre-trained model
VGG16 is a deep convolutional neural network proposed in 2014 by the Visual Geometry Group (VGG) at the University of Oxford. In the ImageNet ILSVRC 2014 competition it won the localization task and placed second in classification, and it remains a common baseline for image recognition. The architecture is very simple: the feature extraction part consists of 13 convolutional layers and 5 pooling layers, and the classifier part has 3 fully connected layers. Every convolutional layer uses 3×3 kernels, and every pooling layer is 2×2 max pooling. The number of convolution kernels grows from block to block (64, 128, 256, 512, 512), so deeper layers can represent increasingly complex features.
VGG16 can be divided into two parts: feature extraction and classification. The feature extraction part is organized into five blocks, each ending in a max pooling layer. The first two blocks each contain two convolutional layers, with 64 and 128 kernels respectively. The last three blocks each contain three convolutional layers, with 256, 512, and 512 kernels. All convolutional layers use the ReLU activation function, and the later blocks abstract the image features more deeply. After the feature extraction part comes the classifier: 3 fully connected layers, of which the first two have 4096 nodes each and the last has 1000 nodes, corresponding to the 1000 categories of ImageNet.
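The size of the feature extraction part can be checked with a little arithmetic. The sketch below assumes the standard VGG16 layout (five blocks with 2, 2, 3, 3, 3 convolutional layers of 64, 128, 256, 512, 512 kernels) and counts weights plus biases for each 3×3 convolution. The names `VGG16_BLOCKS`, `conv_params`, and `vgg16_conv_base_params` are made up for this illustration and are not part of Keras.

```python
# (number of conv layers, kernels per layer) for each of the five VGG16 blocks
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of one kernel x kernel convolutional layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def vgg16_conv_base_params(input_channels=3):
    """Total trainable parameters of the 13 convolutional layers."""
    total = 0
    channels = input_channels
    for n_layers, filters in VGG16_BLOCKS:
        for _ in range(n_layers):
            total += conv_params(channels, filters)
            channels = filters
        # The max pooling layer that closes each block has no parameters.
    return total


print(sum(n for n, _ in VGG16_BLOCKS))  # 13 convolutional layers
print(conv_params(3, 64))               # 1792, the first layer
print(vgg16_conv_base_params())         # 14714688
```

The total matches the parameter count that Keras reports for the VGG16 convolutional base.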
VGG16 performs well, but it has many parameters (about 138 million including the fully connected layers) and requires large storage space and computing resources. The same authors also proposed VGG19, which adds three more convolutional layers to VGG16. It therefore has even more parameters and consumes more computing resources.
In general, VGG16 is a simple and effective deep convolutional neural network with strong feature extraction capabilities, which makes its convolutional layers a good starting point for other image classification tasks such as the cat and dog recognition below.
1. Import the Keras library
from keras import layers
import tensorflow as tf
import keras
import numpy as np
import os
import shutil
import warnings
warnings.filterwarnings('ignore')
Using TensorFlow backend.
2. Import the data set
base_dir = './dataset/cat_dog'
train_dir = base_dir + '/train'
train_dog_dir = train_dir + '/dog'
train_cat_dir = train_dir + '/cat'
test_dir = base_dir + '/test'
test_dog_dir = test_dir + '/dog'
test_cat_dir = test_dir + '/cat'
dc_dir = './dataset/dc/train'
# Create the train/test directory tree on the first run
if not os.path.exists(base_dir):
    os.mkdir(base_dir)
    os.mkdir(train_dir)
    os.mkdir(train_dog_dir)
    os.mkdir(train_cat_dir)
    os.mkdir(test_dir)
    os.mkdir(test_dog_dir)
    os.mkdir(test_cat_dir)

# First 1000 cat images go to the training set
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(dc_dir, fname)
    dst = os.path.join(train_cat_dir, fname)
    shutil.copyfile(src, dst)

# Next 500 cat images go to the test set
fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(dc_dir, fname)
    dst = os.path.join(test_cat_dir, fname)
    shutil.copyfile(src, dst)

# First 1000 dog images go to the training set
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(dc_dir, fname)
    dst = os.path.join(train_dog_dir, fname)
    shutil.copyfile(src, dst)

# Next 500 dog images go to the test set
fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(dc_dir, fname)
    dst = os.path.join(test_dog_dir, fname)
    shutil.copyfile(src, dst)
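The four copy loops above differ only in the file prefix, the index range, and the destination. As a minimal sketch, they could be folded into one helper. `copy_range` is a hypothetical name introduced here, not something from the original code or from any library.

```python
import os
import shutil


def copy_range(src_dir, dst_dir, prefix, start, stop):
    """Copy '<prefix>.<i>.jpg' for i in [start, stop) from src_dir to dst_dir.

    Creates dst_dir if needed and returns the number of files copied.
    """
    os.makedirs(dst_dir, exist_ok=True)
    copied = 0
    for i in range(start, stop):
        fname = '{}.{}.jpg'.format(prefix, i)
        shutil.copyfile(os.path.join(src_dir, fname),
                        os.path.join(dst_dir, fname))
        copied += 1
    return copied
```

With this helper, the training cats would be copied with `copy_range(dc_dir, train_cat_dir, 'cat', 0, 1000)` and the test dogs with `copy_range(dc_dir, test_dog_dir, 'dog', 1000, 1500)`.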
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(200, 200),
    batch_size=20,
    class_mode='binary'
)
test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=(200, 200),
    batch_size=20,
    class_mode='binary'
)
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
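`flow_from_directory` treats every subdirectory as one class and assigns integer labels in alphanumeric order of the directory names, so with `class_mode='binary'` the `cat` folder becomes label 0 and `dog` becomes label 1. The sketch below only imitates that mapping with the standard library to make the rule concrete. `infer_class_indices` is a hypothetical helper, not Keras code.

```python
import os


def infer_class_indices(directory):
    """Map each subdirectory name to an integer label, in sorted order."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}
```

For the training directory built above, `infer_class_indices(train_dir)` would be expected to return `{'cat': 0, 'dog': 1}`. The mapping Keras actually used can be read from `train_generator.class_indices`.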
3. Keras built-in classic network implementation
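# Load the VGG16 convolutional base with ImageNet pre-trained weights (downloaded on first use).
# include_top=False drops the three fully connected layers of the original classifier.
# weights=None would give a randomly initialized base, which is useless once it is frozen.
covn_base = keras.applications.VGG16(weights='imagenet', include_top=False)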
covn_base.summary()
Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
_________________________________________________________________
input_1 (InputLayer)         (None, None, None, 3)     0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, None, None, 512)   0
_________________________________________________________________
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
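The summary shows `None` for height and width because the base was built without a fixed input shape. For the 200×200 images produced by the generators, the output size can be worked out by hand: the convolutions in Keras's VGG16 use same-padding and keep the spatial size, while each of the five 2×2 max pooling layers halves it, rounding down. A small sketch, with `vgg16_feature_map_size` as a made-up helper name:

```python
def vgg16_feature_map_size(height, width, n_pools=5):
    """Spatial size after n_pools rounds of 2x2, stride-2 max pooling."""
    for _ in range(n_pools):
        height //= 2  # 200 -> 100 -> 50 -> 25 -> 12 -> 6
        width //= 2
    return height, width


print(vgg16_feature_map_size(200, 200))  # (6, 6)
print(vgg16_feature_map_size(224, 224))  # (7, 7), the original ImageNet input size
```

So each 200×200 image leaves the convolutional base as a 6×6×512 feature map.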
model = keras.Sequential()
model.add(covn_base)
model.add(layers.GlobalAveragePooling2D())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
_________________________________________________________________
vgg16 (Model)                (None, None, None, 512)   14714688
_________________________________________________________________
global_average_pooling2d_1 ( (None, 512)               0
_________________________________________________________________
dense_1 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 513
_________________________________________________________________
Total params: 14,977,857
Trainable params: 14,977,857
Non-trainable params: 0
covn_base.trainable = False  # Freeze the convolutional base so its weights are not updated during training
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
_________________________________________________________________
vgg16 (Model)                (None, None, None, 512)   14714688
_________________________________________________________________
global_average_pooling2d_1 ( (None, 512)               0
_________________________________________________________________
dense_1 (Dense)              (None, 512)               262656
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 513
_________________________________________________________________
Total params: 14,977,857
Trainable params: 263,169
Non-trainable params: 14,714,688
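After freezing, only the two Dense layers are trained. Their parameter counts in the summary follow from inputs × units + units (weights plus one bias per unit), and global average pooling adds none. A quick check, with `dense_params` as a hypothetical helper name:

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


conv_base = 14_714_688                 # frozen VGG16 convolutional base
dense_1 = dense_params(512, 512)       # pooled 512-vector -> 512 units
dense_2 = dense_params(512, 1)         # 512 units -> 1 sigmoid output

print(dense_1)                         # 262656
print(dense_2)                         # 513
print(dense_1 + dense_2)               # 263169 trainable
print(conv_base + dense_1 + dense_2)   # 14977857 in total
```

Fewer than 2% of the model's parameters are updated during training, which is why this setup trains quickly even on a small dataset.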
model.compile(optimizer=keras.optimizers.Adam(lr=0.001),
              loss='binary_crossentropy',
              metrics=['acc'])
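The model ends in a single sigmoid unit and is compiled with `binary_crossentropy`, so for a label y (0 or 1) and a predicted probability p the per-sample loss is −[y·log(p) + (1−y)·log(1−p)]. The following is a plain-Python sketch of that formula for one sample, not Keras's implementation. The `eps` clipping is an assumption added here to avoid log(0).

```python
import math


def binary_crossentropy(y_true, p_pred, eps=1e-7):
    """Cross-entropy loss for one sample with label 0 or 1."""
    p = min(max(p_pred, eps), 1.0 - eps)  # keep p away from 0 and 1
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


print(round(binary_crossentropy(1, 0.5), 4))  # 0.6931, an uninformed guess
print(round(binary_crossentropy(1, 0.9), 4))  # 0.1054, confident and correct
print(round(binary_crossentropy(0, 0.9), 4))  # 2.3026, confident and wrong
```

A loss close to 0.693 (ln 2) at the start of training therefore means the classifier is still predicting roughly 0.5 for every image.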
4. Train the model
history = model.fit_generator(
    train_generator,
    steps_per_epoch=10,
    epochs=15,
    validation_data=test_generator,
    validation_steps=50)
Epoch 1/15
9/10 [==========================>...] - ETA: 5s - loss: 0.6912 - acc: 0.5500
……
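One step consumes one batch, so the number of steps needed to see every image once is the image count divided by the batch size, rounded up. The sketch below applies that to the generators defined earlier (2000 training and 1000 test images, batch size 20). `steps_for` is a made-up helper name.

```python
import math


def steps_for(num_images, batch_size):
    """Number of batches needed to cover num_images once."""
    return math.ceil(num_images / batch_size)


print(steps_for(2000, 20))  # 100 steps for a full pass over the training set
print(steps_for(1000, 20))  # 50 steps for a full pass over the test set
print(10 * 20)              # 200 training images per epoch with steps_per_epoch=10
```

With `validation_steps=50` the whole test set is evaluated after each epoch, while `steps_per_epoch=10` trains on only a tenth of the training images per epoch. Raising it to 100 would make each epoch a full pass, at the cost of longer training.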
5. Analyze the model
import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(history.epoch, history.history['loss'], 'r', label='loss')
plt.plot(history.epoch, history.history['val_loss'], 'b--', label='val_loss')
plt.plot(history.epoch, history.history['acc'], 'g', label='acc')
plt.plot(history.epoch, history.history['val_acc'], 'k--', label='val_acc')
plt.legend()
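Besides reading the curves by eye, the same `history.history` dictionary can be searched for the epoch with the lowest validation loss or the highest validation accuracy. The helper below is a hypothetical sketch, and the numbers in `example` are invented purely for illustration. They are not results from this training run.

```python
def best_epoch(history, metric='val_loss', mode='min'):
    """Return (epoch index, value) of the best entry for the given metric."""
    values = history[metric]
    pick = min if mode == 'min' else max
    index = pick(range(len(values)), key=values.__getitem__)
    return index, values[index]


# Invented values for illustration only, shaped like history.history
example = {
    'val_loss': [0.69, 0.61, 0.58, 0.60],
    'val_acc': [0.52, 0.66, 0.70, 0.69],
}

print(best_epoch(example))                    # (2, 0.58)
print(best_epoch(example, 'val_acc', 'max'))  # (2, 0.7)
```

In the notebook, the call would be `best_epoch(history.history)`. If validation loss starts rising after that epoch while training loss keeps falling, the model is beginning to overfit.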