Fine-tuning the model for the image classification task of hot dog recognition

Let's work through a concrete example: hot dog recognition. We take a ResNet model pretrained on the ImageNet dataset and fine-tune it on a small dataset containing thousands of images, some with hot dogs and some without. We then use the fine-tuned model to recognize whether an image contains a hot dog.

First, import the packages needed for the experiment.

import tensorflow as tf
import numpy as np

Getting the dataset

We first put the dataset under the path hotdog/data, with one subfolder per category. Each category folder contains the image files.

In the previous section, we introduced ImageDataGenerator for image augmentation. We can read image files with the following method, which takes a folder path as a parameter and generates batches of augmented image data:

flow_from_directory(self, directory,
                            target_size=(256, 256), color_mode='rgb',
                            classes=None, class_mode='categorical',
                            batch_size=32, shuffle=True, seed=None,
                            save_to_dir=None)

The main parameters:

▪ directory: the target folder path. Each class corresponds to a subfolder, and any JPG, PNG, BMP, or PPM images in a subfolder are read.

▪ target_size: the size every image is resized to, default (256, 256).

▪ batch_size: the number of images per batch, default 32.

▪ shuffle: whether to shuffle the data, default True.
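
To make the folder convention concrete, here is a minimal sketch that builds a throwaway directory tree and derives a class-to-index mapping from the sorted subfolder names, which is how flow_from_directory assigns labels when classes=None. The folder names hotdog and not-hotdog and the helper class_indices_from_dir are hypothetical and used only for illustration.

```python
import pathlib
import tempfile

def class_indices_from_dir(directory):
    """Map each class subfolder name to an integer label, in sorted order."""
    root = pathlib.Path(directory)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(class_names)}

# Build a tiny example tree: <tmp>/train/<class>/<image file>
tmp_root = pathlib.Path(tempfile.mkdtemp())
for class_name in ("not-hotdog", "hotdog"):      # hypothetical class folders
    class_dir = tmp_root / "train" / class_name
    class_dir.mkdir(parents=True)
    (class_dir / "0.jpg").touch()                # empty placeholder file

example_indices = class_indices_from_dir(tmp_root / "train")
print(example_indices)
```

On a real generator, the same mapping is available as the class_indices attribute of the object returned by flow_from_directory.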

We create a tf.keras.preprocessing.image.ImageDataGenerator instance and use it to read all the image files in the training and testing datasets. All images are resized to a height and width of 224 pixels. In addition, we rescale the values of the three RGB (red, green, blue) color channels to the range [0, 1].

# Get the dataset
import pathlib
train_dir = 'transferdata/train'
test_dir = 'transferdata/test'
# Count the training images
train_dir = pathlib.Path(train_dir)
train_count = len(list(train_dir.glob('*/*.jpg')))
# Count the test images
test_dir = pathlib.Path(test_dir)
test_count = len(list(test_dir.glob('*/*.jpg')))
# Create an ImageDataGenerator that rescales pixel values to [0, 1]
image_generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
# Set the parameters
BATCH_SIZE = 32
IMG_HEIGHT = 224
IMG_WIDTH = 224
# Read the training data
train_data_gen = image_generator.flow_from_directory(directory=str(train_dir),
                                                    batch_size=BATCH_SIZE,
                                                    target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                    shuffle=True)
# Read the test data
test_data_gen = image_generator.flow_from_directory(directory=str(test_dir),
                                                    batch_size=BATCH_SIZE,
                                                    target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                    shuffle=True)
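
As a small illustration of what each batch contains, the sketch below reproduces two of the transformations by hand with NumPy: rescale=1./255 multiplies every pixel by 1/255, and class_mode='categorical' (the default) encodes each label as a one-hot vector. The random array stands in for a real image batch, and the label values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for 4 RGB images of 224x224 pixels with integer values in [0, 255]
raw_batch = rng.integers(0, 256, size=(4, 224, 224, 3)).astype("float32")

# Equivalent of rescale=1./255: every pixel is mapped into [0, 1]
scaled_batch = raw_batch * (1.0 / 255)

# Equivalent of class_mode='categorical' for two classes
class_ids = np.array([0, 1, 1, 0])          # made-up integer labels
one_hot_labels = np.eye(2, dtype="float32")[class_ids]

print(scaled_batch.shape, scaled_batch.min(), scaled_batch.max())
print(one_hot_labels)
```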

Next, we randomly take a batch of pictures and draw them.

import matplotlib.pyplot as plt
# Display the first 15 images of a batch on a 5x5 grid
def show_batch(image_batch, label_batch):
    plt.figure(figsize=(10,10))
    for n in range(15):
        ax = plt.subplot(5,5,n+1)
        plt.imshow(image_batch[n])
        plt.axis('off')
# Take one randomly shuffled batch of images
image_batch, label_batch = next(train_data_gen)
# Show the images
show_batch(image_batch, label_batch)

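
show_batch receives label_batch but does not use it. If image titles are wanted, one-hot labels can be turned back into class names with argmax and the generator's class_indices mapping. The sketch below uses a made-up mapping and label batch so that it runs without the dataset; with real data the mapping would come from train_data_gen.class_indices. The helper name decode_labels is hypothetical.

```python
import numpy as np

def decode_labels(label_batch, class_indices):
    """Convert one-hot label rows into class-name strings."""
    index_to_name = {index: name for name, index in class_indices.items()}
    return [index_to_name[int(i)] for i in np.argmax(label_batch, axis=1)]

# Made-up stand-ins for train_data_gen.class_indices and a label batch
example_class_indices = {"hotdog": 0, "not-hotdog": 1}
example_labels = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

names = decode_labels(example_labels, example_class_indices)
print(names)
```

Inside show_batch, calling plt.title(names[n]) after plt.imshow would then label each image.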

Model building and training

We use ResNet-50 pretrained on the ImageNet dataset as the source model. Here we specify weights='imagenet' to automatically download and load the pretrained model parameters. The parameters are downloaded from the internet the first time they are used.

Keras Applications (tf.keras.applications) provides fixed architectures with pre-trained weights. The module bundles many well-known heavyweight network architectures.

To use one, instantiate the model architecture:

tf.keras.applications.ResNet50(
    include_top=True, weights='imagenet', input_tensor=None, input_shape=None,
    pooling=None, classes=1000, **kwargs
)

The main parameters:

▪ include_top: whether to include the top fully connected layer.

▪ weights: None means random initialization, 'imagenet' means loading pre-trained weights on ImageNet.

▪ input_shape: optional input shape tuple, specified only when include_top=False; otherwise the input shape must be (224, 224, 3) (channels_last format) or (3, 224, 224) (channels_first format). It must have exactly 3 input channels, and the width and height must be at least 32; for example, (200, 200, 3) is a valid input shape.

In this case, we use the pretrained ResNet50 model to build our network:

# Load the pretrained model
ResNet50 = tf.keras.applications.ResNet50(weights='imagenet', input_shape=(224,224,3))
# Freeze all layers so they are not trained
for layer in ResNet50.layers:
    layer.trainable = False
# Build the model
net = tf.keras.models.Sequential()
# Pretrained model
net.add(ResNet50)
# Flatten
net.add(tf.keras.layers.Flatten())
# Fully connected layer for binary classification
net.add(tf.keras.layers.Dense(2, activation='softmax'))
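
Note that with the default include_top=True, ResNet50 ends in its 1000-way ImageNet classifier, so the Dense layer above receives class probabilities rather than image features. A common alternative, shown here as a sketch and not as this article's method, drops the top with include_top=False and pools the feature map with pooling='avg' before the new classifier. The function name build_feature_extractor_net is hypothetical, and the weights argument is exposed only so that the structure can be checked with weights=None without downloading anything.

```python
import tensorflow as tf

def build_feature_extractor_net(weights='imagenet'):
    """Frozen ResNet50 backbone without its top, plus a 2-class head."""
    base = tf.keras.applications.ResNet50(
        include_top=False,          # drop the 1000-class ImageNet classifier
        weights=weights,
        input_shape=(224, 224, 3),
        pooling='avg')              # global average pooling -> 2048 features
    base.trainable = False          # freeze every backbone layer
    model = tf.keras.models.Sequential([
        base,
        tf.keras.layers.Dense(2, activation='softmax'),
    ])
    return model

# weights=None builds the same architecture with random weights (no download)
sketch_net = build_feature_extractor_net(weights=None)
print(sketch_net.output_shape)
```

The Keras documentation also pairs the pretrained ResNet50 weights with tf.keras.applications.resnet50.preprocess_input, so it may be worth checking whether that preprocessing suits the inputs better than plain rescaling.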

Next, we use the data generators created earlier with ImageDataGenerator to feed the training set images to the model for training.

# Compile the model: specify the optimizer, loss function, and evaluation metric
net.compile(optimizer='adam',
            loss='categorical_crossentropy',
            metrics=['accuracy'])
# Train the model: specify the data, run only 10 iterations per epoch, and specify the validation dataset
history = net.fit(
                    train_data_gen,
                    steps_per_epoch=10,
                    epochs=3,
                    validation_data=test_data_gen,
                    validation_steps=10
                    )

Epoch 1/3
10/10 [==============================] - 28s 3s/step - loss: 0.6931 - accuracy: 0.5031 - val_loss: 0.6930 - val_accuracy: 0.5094
Epoch 2/3
10/10 [==============================] - 29s 3s/step - loss: 0.6932 - accuracy: 0.5094 - val_loss: 0.6935 - val_accuracy: 0.4812
Epoch 3/3
10/10 [==============================] - 31s 3s/step - loss: 0.6935 - accuracy: 0.4844 - val_loss: 0.6933 - val_accuracy: 0.4875
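
The logged loss stays close to 0.693 while accuracy hovers around 0.5. As a quick sanity check, the sketch below computes categorical cross-entropy by hand for a classifier that always predicts 0.5 for each of two classes. The result is ln 2, roughly 0.6931, so these logs are consistent with predictions at chance level. The arrays are made-up examples, and the helper categorical_crossentropy is a plain NumPy reimplementation for illustration, not the Keras function.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred):
    """Mean over the batch of -sum(y_true * log(y_pred))."""
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Made-up one-hot labels for four samples of a two-class problem
y_true = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
# A classifier that always outputs 50/50
y_uniform = np.full_like(y_true, 0.5)

chance_loss = categorical_crossentropy(y_true, y_uniform)
print(round(chance_loss, 4), round(float(np.log(2)), 4))
```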

Origin blog.csdn.net/cz_00001/article/details/131922395