Classification of Dogs and Cats Dataset with Convolutional Neural Networks

Environment build

Install Anaconda

For the detailed installation steps, search online (for example, on Baidu).

Configure TensorFlow, Keras

Create a virtual environment

Enter the following command:

conda create -n ppqppl_tfl python=3.6
# ppqppl_tfl is the name chosen for the virtual environment; choose the Python version that suits your needs

Activate the environment

Use the following command:

activate
conda activate ppqppl_tfl

Install tensorflow and keras

Use the following command:

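pip install <package-name>
# Installing directly like this may fail or be very slow because of network issues
# Workaround: install from a mirror
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple <package-name>
# The commands used for this setup:
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow==1.14.0
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple keras==2.2.5
# Install the nb_conda_kernels package
conda install nb_conda_kernels
# If the commands above fail, switch to a different download mirror
# In testing, the Tsinghua mirror was the fastest; note that a VPN or proxy must not be used here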

After installation, register the virtual environment with Jupyter Notebook so that notebooks can be created in it. Run the following commands:

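# Step 1: switch to the virtual environment you want to add (named data in this example; use your own environment name, such as ppqppl_tfl)
conda activate data

# Step 2: install the ipykernel package
conda install ipykernel

# Step 3: run the following command, specifying the environment name (data here)
python -m ipykernel install --name data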

Then start Jupyter Notebook and create a notebook that uses the kernel of that environment.

The environment is now configured.
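To confirm that the notebook is really running in the new environment, it can help to print the versions of the installed packages from a notebook cell. The following is a minimal sketch; the helper name `check_packages` is hypothetical and not part of any library.

```python
import importlib


def check_packages(names):
    """Return {name: version string, "unknown", or None if the import fails}."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None  # not installed in the active environment
            continue
        # Most packages expose __version__; fall back to "unknown" otherwise.
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_packages(["tensorflow", "keras", "numpy"]).items():
        print(name, "->", version if version is not None else "NOT INSTALLED")
```

With the commands above, the expectation is that tensorflow reports 1.14.0 and keras reports 2.2.5; a `NOT INSTALLED` line suggests the notebook kernel is not the intended environment.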

Cat and dog data classification modeling

Cat and dog image dataset download: https://pan.baidu.com/s/1f-MvZl7_J6DF7P9CGBY3SQ (extraction code: ruyn)

After downloading the dataset, decompress it and place it in a path that contains no Chinese characters, as shown in the following figure:

Cat and dog image preprocessing

The following code sorts the cat and dog images into training, validation, and test directories:

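import os, shutil
# Path to the directory holding the original dataset
original_dataset_dir = 'D:\\code-file\\conda\\kaggle_Dog&Cat\\train\\'

# Directory that will hold the sorted dataset
base_dir = 'D:\\code-file\\conda\\kaggle_Dog&Cat\\train1'
os.mkdir(base_dir)

# Directories for the training, validation, and test sets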
train_dir = os.path.join(base_dir, 'train')
os.mkdir(train_dir)
validation_dir = os.path.join(base_dir, 'validation')
os.mkdir(validation_dir)
test_dir = os.path.join(base_dir, 'test')
os.mkdir(test_dir)

# Directory for cat training images
train_cats_dir = os.path.join(train_dir, 'cats')
os.mkdir(train_cats_dir)

# Directory for dog training images
train_dogs_dir = os.path.join(train_dir, 'dogs')
os.mkdir(train_dogs_dir)

# Directory for cat validation images
validation_cats_dir = os.path.join(validation_dir, 'cats')
os.mkdir(validation_cats_dir)

# Directory for dog validation images
validation_dogs_dir = os.path.join(validation_dir, 'dogs')
os.mkdir(validation_dogs_dir)

# Directory for cat test images
test_cats_dir = os.path.join(test_dir, 'cats')
os.mkdir(test_cats_dir)

# Directory for dog test images
test_dogs_dir = os.path.join(test_dir, 'dogs')
os.mkdir(test_dogs_dir)

# Copy the first 1000 cat images to train_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_cats_dir, fname)
    shutil.copyfile(src, dst)

# Copy the next 500 cat images to validation_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_cats_dir, fname)
    shutil.copyfile(src, dst)

# Copy the next 500 cat images to test_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_cats_dir, fname)
    shutil.copyfile(src, dst)

# Copy the first 1000 dog images to train_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_dogs_dir, fname)
    shutil.copyfile(src, dst)

# Copy the next 500 dog images to validation_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_dogs_dir, fname)
    shutil.copyfile(src, dst)

# Copy the next 500 dog images to test_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_dogs_dir, fname)
    shutil.copyfile(src, dst)
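The code above repeats the same copy loop six times and fails with FileExistsError if the cell is run a second time, because `os.mkdir` refuses to recreate a directory. One possible alternative is sketched below. The helpers `split_dataset` and `count_images` are hypothetical names introduced here, and the demo uses a temporary directory with empty placeholder files rather than the real images.

```python
import os
import shutil
import tempfile


def split_dataset(src_dir, base_dir, ranges, classes=("cat", "dog")):
    """Copy <class>.<i>.jpg files from src_dir into base_dir/<split>/<class>s/."""
    for split, (start, stop) in ranges.items():
        for cls in classes:
            dst_dir = os.path.join(base_dir, split, cls + "s")
            # exist_ok=True lets the cell be re-run without FileExistsError
            os.makedirs(dst_dir, exist_ok=True)
            for i in range(start, stop):
                fname = "{}.{}.jpg".format(cls, i)
                shutil.copyfile(os.path.join(src_dir, fname),
                                os.path.join(dst_dir, fname))


def count_images(base_dir):
    """Return {(split, class directory): number of files}."""
    counts = {}
    for split in sorted(os.listdir(base_dir)):
        split_path = os.path.join(base_dir, split)
        for cls_dir in sorted(os.listdir(split_path)):
            counts[(split, cls_dir)] = len(os.listdir(os.path.join(split_path, cls_dir)))
    return counts


# Demo on a throwaway directory with empty placeholder files (not real images).
demo_root = tempfile.mkdtemp()
demo_src = os.path.join(demo_root, "train")
os.makedirs(demo_src)
for cls in ("cat", "dog"):
    for i in range(8):
        open(os.path.join(demo_src, "{}.{}.jpg".format(cls, i)), "w").close()

demo_ranges = {"train": (0, 4), "validation": (4, 6), "test": (6, 8)}
demo_base = os.path.join(demo_root, "train1")
split_dataset(demo_src, demo_base, demo_ranges)
demo_counts = count_images(demo_base)
print(demo_counts)
shutil.rmtree(demo_root)  # clean up the demo files
```

For the real dataset, the equivalent call would pass `original_dataset_dir`, `base_dir`, and the ranges `{"train": (0, 1000), "validation": (1000, 1500), "test": (1500, 2000)}`.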

The directory structure after sorting is shown in the figure below:

Check the number of images in each directory:

# Print the number of images in each dataset directory
print('total training cat images:', len(os.listdir(train_cats_dir)))
print('total training dog images:', len(os.listdir(train_dogs_dir)))
print('total validation cat images:', len(os.listdir(validation_cats_dir)))
print('total validation dog images:', len(os.listdir(validation_dogs_dir)))
print('total test cat images:', len(os.listdir(test_cats_dir)))
print('total test dog images:', len(os.listdir(test_dogs_dir)))

Each class (cats and dogs) has 1000 training images, 500 validation images, and 500 test images.

Example of Cat and Dog Classification - Benchmark Model

Build a network model:

# Build the network model
from keras import layers
from keras import models
# Keras Sequential model
model = models.Sequential()
# Convolutional layer, 3x3 kernel, ReLU activation
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
# Max pooling layer
model.add(layers.MaxPooling2D((2, 2)))
# Convolutional layer, 3x3 kernel, ReLU activation
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# Max pooling layer
model.add(layers.MaxPooling2D((2, 2)))
# Convolutional layer, 3x3 kernel, ReLU activation
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
# Max pooling layer
model.add(layers.MaxPooling2D((2, 2)))
# Convolutional layer, 3x3 kernel, ReLU activation
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
# Max pooling layer
model.add(layers.MaxPooling2D((2, 2)))
# Flatten layer: turns the multi-dimensional input into one dimension, bridging the convolutional and fully connected layers
model.add(layers.Flatten())
# Fully connected layer, ReLU activation
model.add(layers.Dense(512, activation='relu'))
# Fully connected layer, sigmoid activation
model.add(layers.Dense(1, activation='sigmoid'))

Running this code produced the following error:

The error is caused by a mismatch between the installed numpy version and the tensorflow version. Installing numpy 1.16.4 resolves it:

pip install numpy==1.16.4 -i "https://pypi.doubanio.com/simple/"

Run the program again. There are no more errors, only warnings (because the versions are relatively old), and these can be ignored.

View the parameters of each layer of the model:

# Print the output shape and parameter count of each layer
model.summary()

The result is shown in the figure below:
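The output shapes and parameter counts that `model.summary()` reports can also be worked out by hand. The sketch below assumes the Keras defaults used in the model above (3x3 kernels, stride 1, 'valid' padding, 2x2 pooling); the helper functions are illustrative and not part of Keras.

```python
def conv2d(shape, filters, kernel=3):
    """Output shape and parameter count of a 'valid', stride-1 Conv2D layer."""
    h, w, c = shape
    params = kernel * kernel * c * filters + filters  # weights + biases
    return (h - kernel + 1, w - kernel + 1, filters), params


def max_pool(shape, pool=2):
    """Output shape of a MaxPooling2D layer with 'valid' padding (no parameters)."""
    h, w, c = shape
    return (h // pool, w // pool, c)


def dense(units_in, units_out):
    """Parameter count of a fully connected layer."""
    return units_in * units_out + units_out


shape = (150, 150, 3)
total_params = 0
for filters in (32, 64, 128, 128):
    shape, params = conv2d(shape, filters)
    total_params += params
    print("Conv2D      ", shape, params)
    shape = max_pool(shape)
    print("MaxPooling2D", shape, 0)

flat_units = shape[0] * shape[1] * shape[2]
print("Flatten     ", (flat_units,), 0)
for units_in, units_out in ((flat_units, 512), (512, 1)):
    params = dense(units_in, units_out)
    total_params += params
    print("Dense       ", (units_out,), params)

print("Total params:", total_params)
```

For this network the calculation gives 6,272 flattened units and 3,453,121 parameters in total, of which 3,211,776 belong to the first Dense layer.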

Configure the optimizer:

optimizer: the optimization algorithm; here, RMSprop with a learning rate of 1e-4

loss: the loss function to minimize; here, binary cross-entropy

metrics: a list of metrics used to evaluate the performance of the model during training and testing

from keras import optimizers

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])
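To make the `loss` and `metrics` settings concrete, here is a plain-Python sketch of what binary cross-entropy and accuracy compute for a batch. It is an illustration, not the Keras implementation; the probabilities are hypothetical sigmoid outputs, and the 0.5 decision threshold and the clipping constant `eps` are assumptions of this sketch.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy for labels (0 or 1) and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of predictions that fall on the correct side of the threshold."""
    hits = sum(1 for y, p in zip(y_true, y_pred) if (p > threshold) == (y == 1))
    return hits / len(y_true)


labels = [1, 0, 1, 0]          # illustrative labels: 1 = dog, 0 = cat
probs = [0.9, 0.2, 0.4, 0.6]   # hypothetical sigmoid outputs of the model
print("loss:", binary_crossentropy(labels, probs))
print("acc: ", binary_accuracy(labels, probs))
```

A confident wrong prediction is penalized much more heavily by the loss than an uncertain one, which is why the loss curve can keep changing even when accuracy has stopped moving.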

Image format conversion

All images are resized to 150x150. The ImageDataGenerator tool decodes the local .jpg files into RGB pixel grids, converts them into floating-point tensors, and feeds them to the network in batches.
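Two details of this conversion can be sketched in plain Python: the 1/255 rescaling of pixel values, and the way `flow_from_directory` assigns integer labels to the class sub-directories in alphanumeric order. The function names below are hypothetical and serve only as illustration.

```python
def class_indices(subdirectories):
    """Map class sub-directory names to integer labels in alphanumeric order."""
    return {name: index for index, name in enumerate(sorted(subdirectories))}


def rescale(pixels, factor=1.0 / 255):
    """Scale raw 0-255 channel values into the [0, 1] range."""
    return [value * factor for value in pixels]


label_map = class_indices(["dogs", "cats"])
print(label_map)               # cats comes first alphabetically, so it gets label 0
print(rescale([0, 51, 255]))   # raw channel values scaled to the range 0 to 1
```

With the directory layout used here, cats therefore receive label 0 and dogs label 1, so a sigmoid output close to 1 means "dog".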

from keras.preprocessing.image import ImageDataGenerator

# All images are rescaled by 1/255
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images are resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # Because we use binary cross-entropy loss, we need binary labels
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')

Output result:

Inspect one batch produced by the generator:

# Check the result of the image preprocessing above
for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch.shape)
    break

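If you get the error `ImportError: Could not import PIL.Image. The use of load_img requires PIL`, the pillow library is not installed. Install it in the ppqppl_tfl virtual environment with the following command:

pip install pillow -i https://pypi.tuna.tsinghua.edu.cn/simple

After installation, close Jupyter Notebook, reopen it, and run the program again.

The output is as follows:

Start training the model (at this point my computer was running at full speed...)

# Train the model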
history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=30,
      validation_data=validation_generator,
      validation_steps=50)
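The values `steps_per_epoch=100` and `validation_steps=50` correspond to one full pass over the 2000 training images and the 1000 validation images at a batch size of 20. A small sketch of that arithmetic follows; the function name is hypothetical.

```python
import math


def steps_for_one_pass(num_images, batch_size):
    """Number of generator batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


# 2 classes x 1000 training images and 2 classes x 500 validation images
train_steps = steps_for_one_pass(2 * 1000, batch_size=20)
val_steps = steps_for_one_pass(2 * 500, batch_size=20)
print(train_steps, val_steps)
```

The generators loop over the data indefinitely, so the step counts decide where one epoch ends. With a batch size of 32, the same function gives 63 and 32 steps for one pass.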

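The more powerful your computer, the faster the training runs.

Save the model

# Save the trained model
model.save('G:\\Cat_And_Dog\\kaggle\\cats_and_dogs_small_1.h5')

Visualize the results

# Plot training and validation accuracy and loss to evaluate the model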
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

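The training results are shown in the figure above. The model clearly overfits almost from the start. The main cause is too little data or, put differently, a model that is too complex for the amount of data (the training loss has fallen to nearly 0 by epoch 30). Training accuracy rises roughly linearly over time until it approaches 100%, while validation accuracy stalls at 70-72%. Validation loss reaches its minimum after about 5 epochs and then stops improving, while training loss keeps falling roughly linearly until it is close to 0.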
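The same reading can be taken from the numbers in `history.history` rather than from the plot. The sketch below finds the epoch with the lowest validation loss and the final gap between training and validation accuracy. The helper `summarize_history` is a hypothetical name, and the values in `toy_history` are made up for illustration; they are not results from this tutorial.

```python
def summarize_history(history):
    """Return the 1-based epoch with the lowest val_loss and the final accuracy gap."""
    val_loss = history["val_loss"]
    best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
    gap = history["acc"][-1] - history["val_acc"][-1]
    return best_index + 1, gap


# Made-up numbers for illustration only; NOT results from this tutorial.
toy_history = {
    "acc": [0.55, 0.65, 0.75, 0.85, 0.95],
    "val_acc": [0.54, 0.62, 0.68, 0.70, 0.70],
    "loss": [0.69, 0.62, 0.50, 0.35, 0.15],
    "val_loss": [0.69, 0.64, 0.58, 0.60, 0.75],
}
best_epoch, acc_gap = summarize_history(toy_history)
print("lowest validation loss at epoch", best_epoch)
print("final train/validation accuracy gap: {:.2f}".format(acc_gap))
```

In a notebook, calling `summarize_history(history.history)` after training would apply the same check to the real run. A large gap together with an early best epoch is the overfitting pattern described above.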

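Data augmentation

What is data augmentation?

Data augmentation is mainly used to reduce overfitting. Applying transformations to the training images yields a network that generalizes better and adapts better to real application scenarios.

Common data augmentation methods include the ones used below: random rotation, horizontal and vertical shifts, shearing, zooming, and horizontal flipping.

Rebuilding the model:

Keep the model built above. Create a new .ipynb file and start modeling again.

First preprocess the cat and dog images as before, except that the sorted dataset is placed in a train2 folder; everything else is the same.

Then configure the network model, build the optimizer, and apply data augmentation. The code is as follows:

Augmenting data with the image data generator: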

from keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(
      rotation_range=40,
      width_shift_range=0.2,
      height_shift_range=0.2,
      shear_range=0.2,
      zoom_range=0.2,
      horizontal_flip=True,
      fill_mode='nearest')

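Parameter explanation:

rotation_range: range in degrees (0-180) within which images are randomly rotated

width_shift_range, height_shift_range: ranges, as a fraction of the total width or height, within which images are randomly shifted horizontally or vertically

shear_range: intensity of random shear transformations

zoom_range: range for random zooming

horizontal_flip: randomly flips images horizontally

fill_mode: strategy for filling pixels newly created by a rotation or shift; 'nearest' repeats the nearest existing pixel

View the effect of data augmentation: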

import matplotlib.pyplot as plt
# Module with image preprocessing utilities
from keras.preprocessing import image
fnames = [os.path.join(train_cats_dir, fname) for fname in os.listdir(train_cats_dir)]
# We pick one image to "augment"
img_path = fnames[3]
# Read the image and resize it
img = image.load_img(img_path, target_size=(150, 150))
# Convert it to a Numpy array with shape (150, 150, 3)
x = image.img_to_array(img)
# Reshape it to (1, 150, 150, 3)
x = x.reshape((1,) + x.shape)
# The .flow() command below generates batches of randomly transformed images.
# It will loop indefinitely, so we need to `break` the loop at some point!
i = 0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 4 == 0:
        break
plt.show()
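To show what such random transformations do at the pixel level, here is a toy sketch of two of them, a horizontal flip and a shift with 'nearest' filling, on a tiny 2x3 "image" stored as a list of rows. It is a simplified illustration under these assumptions, not how ImageDataGenerator is implemented, and all function names are hypothetical.

```python
import random


def horizontal_flip(image):
    """Mirror each row of a 2-D image (list of rows)."""
    return [row[::-1] for row in image]


def shift_right(image, pixels):
    """Shift columns right; fill the gap by repeating the edge pixel ('nearest')."""
    if pixels == 0:
        return [row[:] for row in image]
    return [[row[0]] * pixels + row[:-pixels] for row in image]


def random_augment(image, rng, max_shift=1):
    """Apply a random flip followed by a random right shift."""
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    return shift_right(image, rng.randint(0, max_shift))


tiny = [[1, 2, 3],
        [4, 5, 6]]
print(horizontal_flip(tiny))
print(shift_right(tiny, 1))
print(random_augment(tiny, random.Random(0)))
```

Each call to `random_augment` can return a different variant of the same image, which is why the network rarely sees exactly the same input twice during training.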

Image format conversion

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,)
# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=32,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')

Start training and save the results

history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=100,
      validation_data=validation_generator,
      validation_steps=50)
model.save('E:\\Cat_And_Dog\\kaggle\\cats_and_dogs_small_2.h5')

Visualize the results:

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()

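As shown in the figure below:

Because augmentation effectively increases the amount of training data, the curves no longer show obvious overfitting compared with the baseline model: the training curve closely tracks the validation curve. This is the effect of data augmentation, although the curves still fluctuate considerably.

Next, on top of this data augmentation, add a dropout layer and train again to see the result.

Dropout layer

What is a dropout layer?

What is a Dropout layer for in a neural network? It is a structure that reduces overfitting. How does it work?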

Assume the image below is the original neural network used for training:

There are four inputs \(x_i\) and one output \(y\). Dropout randomly removes some neurons in each training batch, and the programmer sets the dropout probability (the fraction of neurons removed) for each layer. With that set, the network for the first training batch looks like this:

In the image above, some neurons are disconnected, meaning they have been dropped out. As the name suggests, dropout means removal: because some neurons in the network are removed, the layer is called a dropout layer.
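The mechanism can be sketched in a few lines of plain Python. This is an illustration of the commonly used "inverted dropout" scheme, in which the surviving activations are scaled up by 1/(1 - rate) during training so that their expected sum stays the same and no change is needed at inference time. The function name and the sample activations are hypothetical.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(activations)  # dropout is inactive at inference time
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(42)              # fixed seed so the demo is repeatable
layer_output = [0.2, 0.5, 1.3, 0.8]  # hypothetical activations of one layer
for batch in range(3):
    print("batch", batch, dropout(layer_output, rate=0.5, rng=rng))
print("inference", dropout(layer_output, rate=0.5, rng=rng, training=False))
```

A different random subset is zeroed for every batch, so the network cannot rely on any single neuron, which is what discourages overfitting.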

Implementation:

On top of data augmentation, add a dropout layer:

# Dropout layer
model.add(layers.Dropout(0.5))

As shown in the figure below, only this one layer is added when building the network model; everything else remains unchanged:

Train the model again, and view the training results as follows:

Compared with using data augmentation alone, adding a dropout layer makes the training curve track the validation curve more closely and reduces the fluctuation, so the training result is better.
