Simple recognition model of cats and dogs

Disclaimer: This is an original article by the blogger, released under the CC 4.0 BY-SA license. When reproducing it, please include a link to the original source and this statement.
Original link: https://blog.csdn.net/hekaiyou/article/details/89256579

Download the zip archive of the cats and dogs training and validation sets from Google and extract it into the project directory. The extracted folder contains a training (train) and a validation (validation) subdirectory, and each of these contains a cats and a dogs subdirectory.

image.png

You can create a Python file called cats_and_dogs.py directly in this directory, and then configure the directories of the training set and the validation set.

import os

base_dir = '../cats_and_dogs_filtered'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

# Directory of cat pictures used for training.
train_cats_dir = os.path.join(train_dir, 'cats')
# Directory of dog pictures used for training.
train_dogs_dir = os.path.join(train_dir, 'dogs')

# Directory of cat pictures used for validation.
validation_cats_dir = os.path.join(validation_dir, 'cats')
# Directory of dog pictures used for validation.
validation_dogs_dir = os.path.join(validation_dir, 'dogs')
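Before going further, it can help to confirm that the archive was extracted where the script expects it. The following is only a minimal sketch: the helper name missing_dataset_dirs is hypothetical (it is not part of the original script), and it assumes the train/validation and cats/dogs layout described above.

```python
import os


def missing_dataset_dirs(base_dir):
    """Return the expected dataset subdirectories that do not exist."""
    expected = [os.path.join(base_dir, split, label)
                for split in ('train', 'validation')
                for label in ('cats', 'dogs')]
    return [path for path in expected if not os.path.isdir(path)]


# Same base path as configured above; adjust it to where the archive was extracted.
missing = missing_dataset_dirs('../cats_and_dogs_filtered')
if missing:
    print('Missing directories:', missing)
else:
    print('Dataset layout looks correct.')
```

If any directory is reported as missing, the later os.listdir calls would fail, so it is cheaper to catch the problem here.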

Now let's look at the file naming convention used for the cat (cats) and dog (dogs) pictures in the training (train) and validation (validation) directories.

train_cat_fnames = os.listdir(train_cats_dir)
print('File naming convention in the cat training/validation directories:\n%s' % (train_cat_fnames[:10]))

train_dog_fnames = os.listdir(train_dogs_dir)
train_dog_fnames.sort()
print('File naming convention in the dog training/validation directories:\n%s' % (train_dog_fnames[:10]))

'''
File naming convention in the cat training/validation directories:
['cat.952.jpg', 'cat.946.jpg', 'cat.6.jpg', 'cat.749.jpg', 'cat.991.jpg', 'cat.985.jpg', 'cat.775.jpg', 'cat.761.jpg', 'cat.588.jpg', 'cat.239.jpg']
File naming convention in the dog training/validation directories:
['dog.0.jpg', 'dog.1.jpg', 'dog.10.jpg', 'dog.100.jpg', 'dog.101.jpg', 'dog.102.jpg', 'dog.103.jpg', 'dog.104.jpg', 'dog.105.jpg', 'dog.106.jpg']
'''
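Note that os.listdir returns names in arbitrary order (the cat list above is unsorted), and a plain sort() orders them as text, which is why 'dog.10.jpg' comes right after 'dog.1.jpg'. If a numeric order is preferred, one option is to sort by the number embedded in the name. This is a small sketch, not part of the original script; the helper image_index is hypothetical and the sample names simply follow the pattern shown above.

```python
import re


def image_index(fname):
    """Extract the numeric index from a name such as 'dog.10.jpg'."""
    match = re.search(r'\.(\d+)\.', fname)
    return int(match.group(1)) if match else -1


# Sample names in the same pattern as the listing above.
names = ['dog.0.jpg', 'dog.1.jpg', 'dog.10.jpg', 'dog.100.jpg', 'dog.2.jpg']
sorted_names = sorted(names, key=image_index)
print(sorted_names)
# ['dog.0.jpg', 'dog.1.jpg', 'dog.2.jpg', 'dog.10.jpg', 'dog.100.jpg']
```

The order of the files does not affect training; it only makes the listing easier to read.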

We can also find out the total number of cat and dog images in the training and validation directories.

print('Total training cat images:', len(os.listdir(train_cats_dir)))
print('Total training dog images:', len(os.listdir(train_dogs_dir)))
print('Total validation cat images:', len(os.listdir(validation_cats_dir)))
print('Total validation dog images:', len(os.listdir(validation_dogs_dir)))

'''
Total training cat images: 1000
Total training dog images: 1000
Total validation cat images: 500
Total validation dog images: 500
'''

We now know that there are 1,000 training pictures each of cats and dogs, and 500 validation pictures each. Let's look at a few pictures to get a feel for what the cat and dog dataset contains before going further.

import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# Parameters for the output figure: show part of the cat and dog dataset in a 4x4 grid.
nrows = 4
ncols = 4

# Current index for iterating over the images.
pic_index = 0

# Set up the matplotlib figure (matplotlib is a 2D plotting library for Python) and size it to fit 4x4 pictures.
fig = plt.gcf()
fig.set_size_inches(ncols * 4, nrows * 4)

pic_index += 8
next_cat_pix = [os.path.join(train_cats_dir, fname)
                for fname in train_cat_fnames[pic_index-8:pic_index]]
next_dog_pix = [os.path.join(train_dogs_dir, fname)
                for fname in train_dog_fnames[pic_index-8:pic_index]]

for i, img_path in enumerate(next_cat_pix+next_dog_pix):
  # Set up the subplot; subplot indices start at 1.
  sp = plt.subplot(nrows, ncols, i + 1)
  # Do not show the axes (or gridlines).
  sp.axis('off')
  img = mpimg.imread(img_path)
  plt.imshow(img)

plt.show()

The output looks like the figure below. Each time pic_index is advanced and the plotting code is run again (for example, by re-running the cell in a notebook), a new batch of pictures is shown.

image.png

We are now ready to build a small convnet (convolutional neural network) from scratch, which is expected to reach about 72% accuracy. The images we feed in are 150x150 color images. The network stacks three modules, each made of a convolution layer (convolution), a rectified linear activation (relu) and a max pooling layer (maxpooling).

Our convolution layers (convolution) use 3x3 filters, and our max pooling layers (maxpooling) use 2x2 windows.

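  • First convolution layer (convolution): 16 filters of size 3x3, i.e. it learns to detect 16 kinds of contours and edges; after the rectified linear activation (relu) it outputs 16 feature maps (148x148x16 for a 150x150 input).
  • Second convolution layer (convolution): 32 filters of size 3x3, i.e. it learns to detect 32 kinds of contours and edges; after the rectified linear activation (relu) it outputs 32 feature maps (72x72x32).
  • Third convolution layer (convolution): 64 filters of size 3x3, i.e. it learns to detect 64 kinds of contours and edges; after the rectified linear activation (relu) it outputs 64 feature maps (34x34x64).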
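To make the idea of a 3x3 filter followed by relu concrete, here is a rough pure-Python sketch on a single-channel toy image. It is a simplification, not what Keras runs internally: a real Conv2D layer also sums over input channels, adds a bias, and learns its filter weights, whereas the kernel below is hand-picked to respond to a vertical edge. The function names are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide a square kernel over a square 2D image without padding."""
    k = len(kernel)
    out_size = len(image) - k + 1
    output = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            # Multiply the kernel with the patch under it and sum the products.
            total = sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(k) for j in range(k))
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# 5x5 toy image: dark on the left (0), bright on the right (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked 3x3 kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = relu(conv2d_valid(image, kernel))
print(feature_map)  # [[3, 3, 0], [3, 3, 0], [3, 3, 0]]
```

The 5x5 input becomes a 3x3 feature map, the same "shrink by 2" effect that turns 150x150 into 148x148 in the first layer of the model.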

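After each convolution layer (convolution), a max pooling layer (maxpooling) is appended. The pooling operation reduces the size of the features output by the convolution layer and also improves the results by making overfitting less likely.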
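As an illustration of what a 2x2 max pooling step does, here is a small pure-Python sketch on a toy 4x4 feature map. The function name max_pool_2x2 is hypothetical, and the sketch assumes non-overlapping 2x2 windows, which matches how MaxPooling2D(2) is used later in this article.

```python
def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    rows = len(feature_map) // 2
    cols = len(feature_map[0]) // 2
    return [[max(feature_map[2 * r][2 * c], feature_map[2 * r][2 * c + 1],
                 feature_map[2 * r + 1][2 * c], feature_map[2 * r + 1][2 * c + 1])
             for c in range(cols)]
            for r in range(rows)]


toy_map = [[1, 3, 2, 0],
           [4, 2, 1, 1],
           [0, 0, 5, 6],
           [1, 2, 7, 8]]

pooled = max_pool_2x2(toy_map)
print(pooled)  # [[4, 2], [2, 8]]
```

Each side of the map is halved, so only a quarter of the values are passed on to the next layer.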

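Note: This is a widely used configuration and is known to work well for image classification.

In addition, because we have relatively few training samples (1,000 per class), using only three convolution modules keeps the model small, which lowers the risk of overfitting. Overfitting means that the model classifies data it has seen extremely well but performs poorly on data it has not seen.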

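from tensorflow import keras

# The input feature map is 150x150x3: 150x150 for the image pixels and 3 for the three color channels (R, G and B).
img_input = keras.layers.Input(shape=(150, 150, 3))

# The first convolution layer applies 16 filters of size 3x3 with the rectified linear activation (`relu`), followed by a max pooling layer with a 2x2 window.
x = keras.layers.Conv2D(16, 3, activation='relu')(img_input)
x = keras.layers.MaxPooling2D(2)(x)

# The second convolution layer applies 32 filters of size 3x3 with the rectified linear activation (`relu`), followed by a max pooling layer with a 2x2 window.
x = keras.layers.Conv2D(32, 3, activation='relu')(x)
x = keras.layers.MaxPooling2D(2)(x)

# The third convolution layer applies 64 filters of size 3x3 with the rectified linear activation (`relu`), followed by a max pooling layer with a 2x2 window.
x = keras.layers.Conv2D(64, 3, activation='relu')(x)
x = keras.layers.MaxPooling2D(2)(x)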

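Next comes the most important step: creating two fully connected layers (dense). Because we are facing a two-class problem, i.e. a binary classification problem, we end the network with a sigmoid function (a commonly used neural network activation function), so that the output of the network is a single scalar between 0 and 1 that encodes the probability that the current image is class 1 (as opposed to class 0).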

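# Flatten the feature map into a one-dimensional (`1-dim`) tensor so that fully connected layers can be added.
x = keras.layers.Flatten()(x)

# Create a fully connected layer with the rectified linear activation (`relu`) and 512 hidden units (neurons).
x = keras.layers.Dense(512, activation='relu')(x)

# Create the output layer with a single node (neuron) and the `sigmoid` activation.
output = keras.layers.Dense(1, activation='sigmoid')(x)

# Create the model:
# input = the input feature map
# output = input feature map + stacked convolution/max pooling layers +
# fully connected layer + `sigmoid` output layer
model = keras.Model(img_input, output)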
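To see how a single sigmoid output can be read as a class decision, here is a minimal sketch. It assumes that label 1 means "dog" and label 0 means "cat", which is what flow_from_directory produces when it orders the class folders alphabetically, and it assumes a 0.5 decision threshold; both the threshold and the helper names are illustrative choices, not something the model itself fixes.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(z, threshold=0.5):
    """Turn the raw score of the last neuron into a (label, probability) pair."""
    probability_of_dog = sigmoid(z)
    label = 'dog' if probability_of_dog >= threshold else 'cat'
    return label, probability_of_dog


for score in (-2.0, 0.0, 2.0):
    print(score, predict_label(score))
```

Large positive scores map to probabilities near 1 and large negative scores to probabilities near 0, so one output neuron is enough for two classes.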

Finally, we can print a summary of the whole architecture and admire the convolutional neural network we have created.

model.summary()

'''
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 150, 150, 3)       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 148, 148, 16)      448       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 16)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 72, 72, 32)        4640      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 34, 34, 64)        18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 64)        0         
_________________________________________________________________
flatten (Flatten)            (None, 18496)             0         
_________________________________________________________________
dense (Dense)                (None, 512)               9470464   
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 513       
=================================================================
Total params: 9,494,561
Trainable params: 9,494,561
Non-trainable params: 0
_________________________________________________________________
'''
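The "Param #" column in the summary above can be reproduced by hand. The sketch below uses the usual counting rule (one weight per kernel element per input channel, plus one bias per filter or unit); the helper names are hypothetical.

```python
def conv_params(kernel_size, in_channels, filters):
    """Weights of a square-kernel convolution layer, plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights of a fully connected layer, plus one bias per unit."""
    return (inputs + 1) * units


counts = [
    conv_params(3, 3, 16),             # conv2d:   448
    conv_params(3, 16, 32),            # conv2d_1: 4640
    conv_params(3, 32, 64),            # conv2d_2: 18496
    dense_params(17 * 17 * 64, 512),   # dense:    9470464
    dense_params(512, 1),              # dense_1:  513
]
total = sum(counts)
print(counts, total)  # total is 9494561
```

Almost all of the parameters sit in the first fully connected layer, because it connects every one of the 18,496 flattened values to each of its 512 units.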

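The "Output Shape" column in the output above shows how the size of the feature map evolves in each successive layer. Because no padding is set, each convolution layer (conv) shrinks the feature map slightly, and each max pooling layer (max_pooling) halves it.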
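The sizes in the "Output Shape" column follow from two simple rules: an unpadded 3x3 convolution removes 2 pixels from each side length, and a 2x2 pooling step halves it (rounding down). A small sketch with hypothetical helper names:

```python
def conv_output_size(size, kernel_size=3):
    """Side length after an unpadded convolution with stride 1."""
    return size - kernel_size + 1


def pool_output_size(size, pool_size=2):
    """Side length after non-overlapping pooling."""
    return size // pool_size


size = 150
sizes = []
for _ in range(3):  # three convolution + pooling modules
    size = conv_output_size(size)
    sizes.append(size)
    size = pool_output_size(size)
    sizes.append(size)

flattened = size * size * 64  # 64 feature maps after the third module
print(sizes, flattened)  # [148, 74, 72, 36, 34, 17] 18496
```

The final value, 18496, is the length of the vector produced by the Flatten layer in the summary.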

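Next, we configure the specifications for model training. We will train our model with the binary cross-entropy loss (binary_crossentropy), because this is a binary classification problem and our final activation is a sigmoid function.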
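As a rough illustration of what this loss measures, the sketch below computes the average binary cross-entropy for a handful of predictions. The labels and probabilities are made-up toy values, and the clipping constant eps is an assumption added here to avoid log(0); Keras has its own handling of this detail.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Average binary cross-entropy over (label, probability) pairs."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]                    # toy labels
confident_right = [0.9, 0.1, 0.8, 0.2]   # toy predictions, mostly correct
mostly_wrong = [0.4, 0.6, 0.3, 0.7]      # toy predictions, mostly incorrect

good_loss = binary_crossentropy(labels, confident_right)
bad_loss = binary_crossentropy(labels, mostly_wrong)
print(good_loss, bad_loss)
```

The loss is small when the predicted probability is close to the true label and grows quickly as a confident prediction turns out to be wrong, which is the behavior we want training to penalize.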

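We will use the rmsprop optimizer (a commonly used deep learning optimization algorithm) with a learning rate of 0.001. During training, we also want to monitor classification accuracy.

Note: In this case, the rmsprop optimization algorithm is preferable to stochastic gradient descent (SGD), because rmsprop adapts the learning rate automatically.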

# Note: newer TensorFlow releases name the `lr` argument `learning_rate`.
model.compile(loss='binary_crossentropy',
              optimizer=keras.optimizers.RMSprop(lr=0.001),
              metrics=['acc'])
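To give a feel for how rmsprop adapts the step size, here is a simplified scalar sketch of the update rule: it keeps a running average of squared gradients and divides each step by its square root. This is an illustration under stated assumptions, not the Keras implementation (which offers further options such as momentum); the values of rho, eps, the larger learning rate and the toy function w**2 are all chosen here for demonstration.

```python
import math


def rmsprop_minimize(grad_fn, w, lr=0.01, rho=0.9, eps=1e-7, steps=500):
    """Minimize a one-dimensional function given its gradient."""
    avg_sq = 0.0  # running average of squared gradients
    for _ in range(steps):
        g = grad_fn(w)
        avg_sq = rho * avg_sq + (1.0 - rho) * g * g
        # The step is scaled by the recent gradient magnitude.
        w -= lr * g / (math.sqrt(avg_sq) + eps)
    return w


# Toy problem: minimize f(w) = w**2, whose gradient is 2*w.
w_final = rmsprop_minimize(lambda w: 2.0 * w, w=1.0)
print(w_final)
```

Because the gradient is divided by its own recent magnitude, the step size stays roughly constant whether the raw gradient is large or small, which is the sense in which the learning rate is adjusted automatically.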

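Now let's set up the data generators. They read the pictures in the source folders, convert them to float32 tensors, and feed them (with their labels) to our network. We will have one generator for the training images and one for the validation images; each generator yields batches of 20 images of size 150x150 together with their (binary) labels.

Data that goes into a neural network should usually be normalized in some way to make it more amenable to processing by the network. (It is uncommon to feed raw pixels directly into a network.) Here, we preprocess the images by normalizing the pixel values to the 0~1 range (originally all values are in the 0~255 range).

In Keras (a high-level neural network API written in Python), this can be done with the rescale parameter of the keras.preprocessing.image.ImageDataGenerator class. This class lets us instantiate generators of augmented image batches (and their labels) via .flow(data, labels) or .flow_from_directory(directory). These generators can then be used with the Keras model methods that accept data generators as input: fit_generator, evaluate_generator and predict_generator.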

# Rescale all images by the given factor so that values fall in the 0~1 range; the factor is usually 1./255.
train_datagen = keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
test_datagen = keras.preprocessing.image.ImageDataGenerator(rescale=1./255)

# Flow training images in batches of 20 using the train_datagen generator.
train_generator = train_datagen.flow_from_directory(
        # This is the source directory for training images.
        train_dir,
        # All images will be resized to 150x150.
        target_size=(150, 150),
        batch_size=20,
        # Since we use the binary_crossentropy loss, we need binary labels.
        class_mode='binary')

# Flow validation images in batches of 20 using the test_datagen generator.
validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')
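As a rough mental model of what such a generator does, the following pure-Python sketch yields rescaled batches with their labels forever. It is only an analogy: the toy samples stand in for decoded images, the function name batch_generator is hypothetical, and details of the real Keras generators (reading files from disk, resizing, shuffling) are not modeled.

```python
def batch_generator(samples, batch_size=20, rescale=1.0 / 255):
    """Yield (pixels, labels) batches forever, scaling pixel values by `rescale`.

    `samples` is a list of (pixel_values, label) pairs, a stand-in for
    images decoded from disk.
    """
    while True:
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            pixels = [[value * rescale for value in px] for px, _ in batch]
            labels = [label for _, label in batch]
            yield pixels, labels


# Toy data: 50 fake "images" of 3 pixel values each; label 0 = cat, 1 = dog.
toy_samples = [([0, 128, 255], i % 2) for i in range(50)]
gen = batch_generator(toy_samples)
first_pixels, first_labels = next(gen)
print(len(first_pixels), first_labels[:4])  # 20 [0, 1, 0, 1]
```

Because the generator never stops on its own, the training call has to be told how many batches make up one epoch.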

Now let's train on all 2,000 training images for 15 epochs, and validate on all 1,000 validation images.

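# Start training.
# Note: in newer TensorFlow releases fit_generator is deprecated; model.fit accepts generators directly.
history = model.fit_generator(
      train_generator,
      # 2000 images = batch size * steps.
      steps_per_epoch=100,
      epochs=15,
      validation_data=validation_generator,
      # 1000 images = batch size * steps.
      validation_steps=50,
      verbose=2)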
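The two step counts used above follow directly from the dataset size and the batch size. A tiny sketch (the helper name steps_for is hypothetical) that rounds up, so that a final partial batch would still be counted if the numbers did not divide evenly:

```python
import math


def steps_for(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


train_steps = steps_for(2000, 20)
validation_steps = steps_for(1000, 20)
print(train_steps, validation_steps)  # 100 50
```

If the batch size or the number of images changes, these two values should be recomputed rather than left at 100 and 50.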

One interesting thing we can do is visualize how an input is transformed as it passes through the network. Let's pick a random cat or dog image from the training set and generate a figure in which each row is the output of one layer, and each image in the row is the output feature map of one particular filter.

import numpy as np
import random

# Define a new model that takes an image as input and outputs the intermediate
# representations of all layers of the previous model after the first one.
successive_outputs = [layer.output for layer in model.layers[1:]]
visualization_model = keras.Model(img_input, successive_outputs)

# Prepare a random input image of a cat or dog from the training set.
cat_img_files = [os.path.join(train_cats_dir, f) for f in train_cat_fnames]
dog_img_files = [os.path.join(train_dogs_dir, f) for f in train_dog_fnames]
img_path = random.choice(cat_img_files + dog_img_files)

# This is a PIL image.
img = keras.preprocessing.image.load_img(img_path, target_size=(150, 150))
# Numpy array with shape (150, 150, 3).
x = keras.preprocessing.image.img_to_array(img)
# Numpy array with shape (1, 150, 150, 3).
x = x.reshape((1,) + x.shape)

# Rescale by 1/255.
x /= 255

# Run our image through the network to obtain all of its intermediate representations.
successive_feature_maps = visualization_model.predict(x)

# These are the layer names, used as plot titles. The input layer is skipped
# so that the names line up with successive_outputs.
layer_names = [layer.name for layer in model.layers[1:]]

# Now display the representations.
for layer_name, feature_map in zip(layer_names, successive_feature_maps):
  if len(feature_map.shape) == 4:
    # Only do this for the conv/maxpool (convolution/max pooling) layers, not the fully connected layers.
    # Number of features in the feature map.
    n_features = feature_map.shape[-1]
    # The feature map has shape (1, size, size, n_features).
    size = feature_map.shape[1]
    # We will tile the images in this matrix.
    display_grid = np.zeros((size, size * n_features))
    for i in range(n_features):
      # Post-process the feature to make it visually more pleasing.
      x = feature_map[0, :, :, i]
      x -= x.mean()
      x /= x.std()
      x *= 64
      x += 128
      x = np.clip(x, 0, 255).astype('uint8')
      # Tile each filter into this big horizontal grid.
      display_grid[:, i * size : (i + 1) * size] = x
    # Display the grid.
    scale = 20. / n_features
    plt.figure(figsize=(scale * n_features, scale))
    plt.title(layer_name)
    plt.grid(False)
    plt.imshow(display_grid, aspect='auto', cmap='viridis')
    # Show the figure.
    plt.show()

When the code above runs, it displays the following figures in turn.

Figure_1.png

Figure_2.png

Figure_3.png

Figure_4.png

Figure_5.png

Figure_6.png

As shown above, the image goes from raw pixels to increasingly abstract and compact representations. The representations further downstream start to highlight what the neural network pays attention to, and fewer and fewer features are "activated"; most are set to zero. This is called "sparsity", and representation sparsity is a key feature of deep learning.
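Sparsity can be made measurable as the fraction of activations that are exactly zero after relu. The following sketch uses a made-up 3x3 grid of pre-activation values purely for illustration; the helper names are hypothetical, and in practice one would apply the same idea to the feature maps returned by the visualization model.

```python
def relu(values):
    """Replace negative values with zero."""
    return [[max(0.0, v) for v in row] for row in values]


def sparsity(activations):
    """Fraction of activations that are exactly zero."""
    flat = [v for row in activations for v in row]
    return sum(1 for v in flat if v == 0) / len(flat)


# Made-up pre-activation values, for illustration only.
pre_activation = [[-1.2, 0.5, -0.3],
                  [2.0, -0.7, -0.1],
                  [-0.4, 0.9, -2.5]]

zero_fraction = sparsity(relu(pre_activation))
print(zero_fraction)  # 6 of the 9 values are zero after relu
```

A higher fraction means fewer filters respond to the image, which is what the darker rows in the later figures indicate.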

These representations carry less and less information about the original pixels of the image, but more and more refined information about the class of the image. We can think of a convnet (or a deep network in general) as an information distillation pipeline.

Next, we can evaluate the accuracy and loss of the model by plotting the training and validation accuracy and loss collected during training.

# Retrieve the list of accuracy results on the training and validation datasets for each training epoch.
acc = history.history['acc']
val_acc = history.history['val_acc']

# Retrieve the list of loss results on the training and validation datasets for each training epoch.
loss = history.history['loss']
val_loss = history.history['val_loss']

# Get the number of epochs.
epochs = range(len(acc))

# Plot training and validation accuracy per epoch.
plt.plot(epochs, acc)
plt.plot(epochs, val_acc)
plt.title('Training and validation accuracy')

plt.figure()

# Plot training and validation loss per epoch.
plt.plot(epochs, loss)
plt.plot(epochs, val_loss)
plt.title('Training and validation loss')

plt.show()

When the code above runs, it displays the following figures.

Figure_7.png

Figure_8.png

As the figures above show, our model is clearly overfitting: our training accuracy (blue line) gets close to 100%, while our validation accuracy (orange line) stalls at around 70%. Our validation loss reaches its minimum after only five epochs. This is because we have a relatively small number of training samples (2,000).
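The epoch at which validation loss bottoms out can also be read from the recorded history instead of from the plot. In the sketch below the loss values are hypothetical numbers chosen for illustration, not results from this article, and the helper best_epoch is likewise hypothetical; in the real script one would pass history.history['val_loss'].

```python
def best_epoch(val_losses):
    """Return (1-based epoch, loss) where the validation loss is lowest."""
    index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return index + 1, val_losses[index]


# Hypothetical values for illustration only; not measured results.
example_val_loss = [0.69, 0.64, 0.60, 0.58, 0.57, 0.59, 0.63, 0.70]

epoch, lowest = best_epoch(example_val_loss)
print(epoch, lowest)  # 5 0.57
```

When the validation loss starts rising again while the training loss keeps falling, further epochs mostly make the model memorize the training set.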

Overfitting should therefore be our primary concern. It occurs when a model that has seen too few examples learns patterns that do not generalize to new data, that is, when the model starts using irrelevant features to make predictions. For example, if you, as a human, only ever saw three images of lumberjacks and three images of sailors, and the only person wearing a hat among them was a lumberjack, you might start to think that wearing a hat is a sign of being a lumberjack rather than a sailor.

You would then make a very bad lumberjack/sailor classifier. Overfitting is the core problem in machine learning: given that we fit the parameters of the model to a given dataset, how do we make sure that the model works on data it has never seen before? How do we avoid learning things that are specific to the training data?

Finally, clean up: run the following code to terminate the kernel and release memory resources.

import os, signal  # signal.SIGKILL is not available on Windows
os.kill(os.getpid(), signal.SIGKILL)

With the steps above, we have created a simple model for recognizing cats and dogs.
