Deep Learning in Practice: Convolutional Neural Networks to Identify Cats and Dogs

Public number: You Er Hut
Author: Peter
Editor: Peter

Hello everyone, my name is Peter~

This paper documents the first application based on convolutional neural networks in the field of image recognition: cat and dog image recognition . The main contents include:

  • data processing
  • Neural network model construction
  • Data Augmentation Implementation

The deep learning framework used in this article is Keras;

Image data from kaggle official website: www.kaggle.com/c/dogs-vs-c…

data processing

The amount of data

The dataset contains 25,000 images, 12,500 for cats and dogs; create a training set of 1,000 samples per category, a validation set of 500 samples, and a test set of 500 samples

Note: only part of the data is taken out for modeling

Create a directory

In [1]:

import os, shutil
复制代码

In [2]:

current_dir = !pwd  # 当前目录
current_dir[0]
复制代码

Out[2]:

'/Users/peter/Desktop/kaggle/kaggle_12_dogs&cats/dogs-vs-cats'
复制代码

Create a new directory to store the required datasets:

base_dir = current_dir[0] + '/cats_dogs_small'
os.mkdir(base_dir)  # 创建目录
复制代码
# 分别创建训练集、验证集和测试集的目录

train_dir = os.path.join(base_dir,"train")
os.mkdir(train_dir)
validation_dir = os.path.join(base_dir,"validation")
os.mkdir(validation_dir)
test_dir = os.path.join(base_dir,"test")
os.mkdir(test_dir)

# 猫、狗的训练、验证、测试图像目录
train_cats_dir = os.path.join(train_dir, "cats")
os.mkdir(train_cats_dir)
train_dogs_dir = os.path.join(train_dir, "dogs")
os.mkdir(train_dogs_dir)

validation_cats_dir = os.path.join(validation_dir, "cats")
os.mkdir(validation_cats_dir)
validation_dogs_dir = os.path.join(validation_dir, "dogs")
os.mkdir(validation_dogs_dir)

test_cats_dir = os.path.join(test_dir, "cats")
os.mkdir(test_cats_dir)
test_dogs_dir = os.path.join(test_dir, "dogs")
os.mkdir(test_dogs_dir)
复制代码

Data set replication

In [5]:

# 1000张当做训练集train

fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    # 源目录文件
    src = os.path.join(current_dir[0] + "/train", fname)
    # 目标目录
    dst = os.path.join(train_cats_dir, fname)
    shutil.copyfile(src, dst)
复制代码

In [6]:

# 500张当做验证集valiation

fnames = ['cat.{}.jpg'.format(i) for i in range(1000,1500)]
for fname in fnames:
    src = os.path.join(current_dir[0] + "/train", fname)
    dst = os.path.join(validation_cats_dir, fname)
    shutil.copyfile(src, dst)
复制代码

In [7]:

# 500张当做测试集test

fnames = ['cat.{}.jpg'.format(i) for i in range(1500,2000)]
for fname in fnames:
    src = os.path.join(current_dir[0] + "/train", fname)
    dst = os.path.join(test_cats_dir, fname)
    shutil.copyfile(src, dst)
复制代码

In [8]:

# 针对dog的3个同样操作

# 1、1000张当做训练集train
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(current_dir[0] + "/train", fname)
    dst = os.path.join(train_dogs_dir, fname)
    shutil.copyfile(src, dst)
    
# 2、500张当做验证集valiation
fnames = ['dog.{}.jpg'.format(i) for i in range(1000,1500)]
for fname in fnames:
    src = os.path.join(current_dir[0] + "/train", fname)
    dst = os.path.join(validation_dogs_dir, fname)
    shutil.copyfile(src, dst)
    
# 3、500张当做测试集test
fnames = ['dog.{}.jpg'.format(i) for i in range(1500,2000)]
for fname in fnames:
    src = os.path.join(current_dir[0] + "/train", fname)
    dst = os.path.join(test_dogs_dir, fname)
    shutil.copyfile(src, dst)
复制代码

Check the data

See how many images are in each set (train, validation, test) for both cat and dog categories:

Build Neural Networks

Review the composition of the convolutional neural network: Conv2D layers (using the relu activation function) + MaxPooling2D layers are alternately stacked .

When larger images and more complex problems are required, another Conv2D layer (using the relu activation function) + MaxPooling2D layer needs to be added.

The benefits of doing this:

  • increase network capacity
  • Reduce the size of the feature map

It should be noted that: cat and dog classification is a binary classification problem, so the last layer of the network is a single unit activated with sigmoid (Dense layer of size 1)

The depth of the feature map in the network is gradually increasing (from 32 to 128), but the size of the feature map is gradually decreasing (from 150-150 to 7-7)

  1. Increased depth: the original image is more complex and requires more filters
  2. Size reduction: more convolution and pooling layers are compressing and abstracting the image

Network construction

In [15]:

import tensorflow as tf
from keras import layers 
from keras import models

model = models.Sequential()
model.add(tf.keras.layers.Conv2D(32,(3,3),activation="relu",
                               input_shape=(150,150,3)))
model.add(tf.keras.layers.MaxPooling2D((2,2)))

model.add(tf.keras.layers.Conv2D(64,(3,3),activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2,2)))

model.add(tf.keras.layers.Conv2D(128,(3,3),activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2,2)))

model.add(tf.keras.layers.Conv2D(128,(3,3),activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2,2)))  # 

model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, activation="relu"))
model.add(tf.keras.layers.Dense(1, activation="sigmoid"))

model.summary()
复制代码

Model compilation (optimization)

网络最后一层是单一sigmoid单元,使用二元交叉熵作为损失函数

In [16]:

# 原文:from keras import optimizers

from tensorflow.keras import optimizers

model.compile(loss="binary_crossentropy",
             optimizer=optimizers.RMSprop(lr=1e-4),
             metrics=["acc"])
复制代码

数据预处理

数据输入到神经网络之前必须先转成浮点数张量

keras有个处理图像的模块:keras.preprocessing.image

它包含ImageDataGenerator类,可以快速创建Python生成器,将图形文件处理成张量批量

插播知识点:如何理解python中的生成器

数据预处理

  1. 读取文件
  2. 将文件JPEG文件转成RGB像素网络
  3. 像素网格转成浮点数张量

In [18]:

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255)  # 进行缩放
test_datagen = ImageDataGenerator(rescale=1./255)  # 进行缩放

train_generator = train_datagen.flow_from_directory(
    train_dir,  # 待处理的目录
    target_size=(150,150),  # 图像大小设置
    batch_size=20,
    class_mode="binary"  # 损失函数是binary_crossentropy 所以使用二进制标签
)

validation_generator = test_datagen.flow_from_directory(
    validation_dir,  # 待处理的目录
    target_size=(150,150),  # 图像大小设置
    batch_size=20,
    class_mode="binary"  # 损失函数是binary_crossentropy 所以使用二进制标签
)
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
复制代码

In [19]:

for data_batch, labels_batch in train_generator:
    print(data_batch.shape)
    print(labels_batch.shape)
    break
(20, 150, 150, 3)
(20,)
复制代码

生成器的输出是150-150的RGB图像和二进制标签,形状为(20,)组成的批量。每个批量包含20个样本(批量的大小)。

生成器会不断地生成这些批量,不断地循环目标文件夹中的图像。

keras模型使用fit_generator方法来拟合生成器的效果。模型有个参数steps_per_epoch参数:从生成器中抽取steps_per_epoch个批量后,拟合进入下一轮。

本例中:总共是2000个样本,每个批量是20个样本,所以需要100个批量

模型拟合

In [20]:

history = model.fit_generator(
    train_generator,  # 第一个参数必须是Python生成器
    steps_per_epoch=100,  # 2000 / 20
    epochs=30,  # 迭代次数
    validation_data=validation_generator,  # 待验证的数据集
    validation_steps=50 
)
复制代码

保存模型

In [21]:

# 保存模型
# model.save("cats_and_dogs_small.h5")
复制代码

损失和精度曲线

In [22]:

import matplotlib.pyplot as plt
%matplotlib inline
复制代码

In [23]:

history_dict = history.history  # 字典形式
for key, _ in history_dict.items():
    print(key)
loss
acc
val_loss
val_acc
复制代码

In [24]:

acc = history_dict["acc"]
val_acc = history_dict["val_acc"]

loss = history_dict["loss"]
val_loss = history_dict["val_loss"]
复制代码

In [25]:

epochs = range(1, len(acc)+1)

# acc
plt.plot(epochs, acc, "bo", label="Training acc")
plt.plot(epochs, val_acc, "b", label="Validation acc")
plt.title("Training and Validation acc")
plt.legend()

plt.figure()

# loss
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and Validation loss")
plt.legend()

复制代码

小结:得到过拟合的结论

  1. 随着时间的增加,训练精度在不断增加,接近100%,而验证精度则停留在70%
  2. 验证的损失差不多在第6轮后达到最小值,后面一定轮数内保持不变,训练的损失一直下降,直接接近0

数据增强-data augmentation

什么是数据增强

数据增强也是解决过拟合的一种方法,另外两种是:

  1. dropout
  2. 权重衰减正则化

什么是数据增强:从现有的训练样本中生成更多的训练数据,利用多种能够生成可信图像的随机变化来增加数据样本。

模型在训练时候不会查看两个完全相同的图像

设置数据增强

In [26]:

datagen = ImageDataGenerator(
    rotation_range=40,  # 0-180的角度值
    width_shift_range=0.2,  # 水平和垂直方向的范围;相对于总宽度或者高度的比例
    height_shift_range=0.2,
    shear_range=0.2,  # 随机错切变换的角度
    zoom_range=0.2,  # 图像随机缩放的角度
    horizontal_flip=True,  # 随机将一半图像进行水平翻转
    fill_mode="nearest"  # 用于填充新创建像素的方法
)
复制代码

Display the enhanced image

In [27]:

from keras.preprocessing import image

fnames = [os.path.join(train_cats_dir,fname) for fname in os.listdir(train_cats_dir)]
img_path = fnames[3]
复制代码

In [28]:

# 读取图片并调整大小
img = image.load_img(img_path, target_size=(150,150))  
# 转成数组
x = image.img_to_array(img)

# shape转成(1,150,150,3)
x = x.reshape((1,) + x.shape)

i = 0
for batch in datagen.flow(x, batch_size=1):  # 生成随机变换后的图像批量
    plt.figure()   
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 4 == 0:
        break  # 循环是无限,需要在某个时刻终止
        
plt.show()
复制代码

New Convolutional Neural Networks with Dropout Layers

With data augmentation to train the network, the network does not see the same input twice. But the inputs are still highly correlated and cannot completely eliminate overfitting.

Consider adding a dropout layer before the dense classification connector

In [29]:

import tensorflow as tf
from keras import layers 
from keras import models

model = models.Sequential()
model.add(tf.keras.layers.Conv2D(32,(3,3),activation="relu",
                               input_shape=(150,150,3)))
model.add(tf.keras.layers.MaxPooling2D((2,2)))

model.add(tf.keras.layers.Conv2D(64,(3,3),activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2,2)))

model.add(tf.keras.layers.Conv2D(128,(3,3),activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2,2)))

model.add(tf.keras.layers.Conv2D(128,(3,3),activation="relu"))
model.add(tf.keras.layers.MaxPooling2D((2,2)))  # 

model.add(tf.keras.layers.Flatten())

# 添加内容 
model.add(tf.keras.layers.Dropout(0.5))

model.add(tf.keras.layers.Dense(512, activation="relu"))
model.add(tf.keras.layers.Dense(1, activation="sigmoid"))

model.compile(loss="binary_crossentropy",
             optimizer=optimizers.RMSprop(lr=1e-4),
             metrics=["acc"])
复制代码

Using data augmenters to train convolutional neural networks (error resolution)

Regarding error resolution: we have 2000 training images, 1000 validation images, and 1000 test images.

  • steps_per_epoch=100, batch_size=32, so the data should be 3200, obviously the input training data is not enough.
  • validation_steps=50, batch_size=32, so the data should be 1600, obviously the validation data is not enough.

Therefore, change to steps_per_epoch=2000/32≈63, and validation_steps=1000/32≈32.

In [44]:

# 训练数据的增强
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True
)

# 不能增强验证数据
test_datagen = ImageDataGenerator(rescale=1.0 / 255)

train_generator = train_datagen.flow_from_directory(
    train_dir,  # 目标目录
    target_size=(150,150),  # 大小调整
    batch_size=32,
    class_mode="binary"
)

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150,150),
    batch_size=32,
    class_mode="binary"
)

# 优化:报错有修改
history = model.fit_generator(
    train_generator,
    # 原文 steps_per_epoch=100,
    steps_per_epoch=63,  # steps_per_epoch=2000/32≈63 取上限
    epochs=100,
    validation_data=validation_generator,
    # 原文 validation_steps=50
    validation_steps=32  # validation_steps=1000/32≈32
)
复制代码

Save the model:

# 保存模型
model.save("cats_and_dogs_small_2.h5")
复制代码

Loss and Accuracy Curves

In [46]:

history_dict = history.history  # 字典形式

acc = history_dict["acc"]
val_acc = history_dict["val_acc"]
loss = history_dict["loss"]
val_loss = history_dict["val_loss"]
复制代码

Specific drawing code:

epochs = range(1, len(acc)+1)

# acc
plt.plot(epochs, acc, "bo", label="Training acc")
plt.plot(epochs, val_acc, "b", label="Validation acc")
plt.title("Training and Validation acc")
plt.legend()

plt.figure()

# loss
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and Validation loss")
plt.legend()

plt.show()
复制代码

Conclusion: After using data augmentation, the model is no longer fitted, and the training set curve follows the validation curve; and the accuracy also becomes 81%, which is improved compared to before the regularization.

Guess you like

Origin juejin.im/post/7084597833340289038