Image classification: Use TensorFlow to build a convolutional neural network (CNN) that classifies images from datasets such as CIFAR-10 or ImageNet

Table of contents

CIFAR-10 dataset

ImageNet dataset

Step 1: Import necessary libraries

Step 2: Load and preprocess data

Step 3: Build a convolutional neural network (CNN) model

Step 4: Compile and train the model

Step 5: Evaluate model performance

Step 6: Visualize training results

Step 7: Use the model to make predictions

Challenges of ImageNet dataset

Step 1: Load the ImageNet dataset

Step 2: Data preprocessing

Step 3: Build ResNet model

Step 4: Compile and train the model

Step 5: Fine-tune the model

Step 6: Evaluate model performance


Image classification is a core task in computer vision: given an input image, assign it to one of a fixed set of categories. Deep learning has been very successful at this task, and TensorFlow is a widely used deep learning framework. This post shows how to use TensorFlow to build a convolutional neural network (CNN) for image classification. We will use two classic datasets: CIFAR-10 and ImageNet. First, let’s look at the two datasets.

CIFAR-10 dataset

The CIFAR-10 dataset contains 60,000 32x32 color images in 10 categories, with 6,000 images per category. It is split into 50,000 training images and 10,000 test images. The categories are airplanes, automobiles, birds, cats, deer, dogs, frogs, horses, ships, and trucks. CIFAR-10 is a small dataset, which makes it well suited to quickly validating and testing models.

ImageNet dataset

The ImageNet dataset is a massive image database containing over 14 million images. The subset most commonly used for classification benchmarks, ILSVRC-2012, has roughly 1.28 million training images in 1,000 categories. It is a demanding benchmark because the images are so varied, ranging from animals to food to natural landscapes. ImageNet is widely used in the deep learning community for image classification tasks.

Now let us build an image classification model step by step, training and testing it first on the CIFAR-10 dataset.

Step 1: Import necessary libraries

First, we need to import the necessary Python libraries, including TensorFlow, NumPy, and Matplotlib.

import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import numpy as np
import matplotlib.pyplot as plt

Step 2: Load and preprocess data

We will use the Keras datasets module that ships with TensorFlow to load the CIFAR-10 dataset and perform some basic preprocessing.

(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Scale pixel values to the range 0 to 1
train_images, test_images = train_images / 255.0, test_images / 255.0
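
As a quick sanity check on the scaling step, here is a minimal NumPy sketch that runs on synthetic data instead of the real dataset. The names fake_images, fake_labels, and flat_labels are made up for illustration; the shapes mirror what load_data() returns for CIFAR-10 (images as uint8 arrays, labels with shape (N, 1)).

```python
import numpy as np

# Synthetic stand-in for a small CIFAR-10 batch (assumption: 4 random images).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=(4, 1))  # labels arrive with shape (N, 1)

# Dividing a uint8 array by a Python float promotes it to float64.
scaled = fake_images / 255.0
print(scaled.dtype, scaled.min() >= 0.0, scaled.max() <= 1.0)

# Flatten labels to shape (N,) when a 1-D array is more convenient.
flat_labels = fake_labels.squeeze(axis=1)
print(flat_labels.shape)
```

If memory is a concern, casting to float32 before dividing is a common alternative; the sketch keeps the default behavior to match the code above.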

Step 3: Build a convolutional neural network (CNN) model

We will create a simple CNN model, including convolutional layers, pooling layers, and fully connected layers.

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))
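
To see how the tensor shape shrinks through this stack, the following sketch recomputes the output shapes and parameter counts by hand in plain Python. The helper names (conv2d_valid, max_pool, dense) are hypothetical; the sketch assumes the Keras defaults used above: stride 1 and 'valid' padding for the convolutions, and non-overlapping 2x2 pooling.

```python
def conv2d_valid(h, w, c_in, filters, k=3):
    """Output shape and parameter count of a stride-1, 'valid' Conv2D layer."""
    params = k * k * c_in * filters + filters  # kernel weights + one bias per filter
    return (h - k + 1, w - k + 1, filters), params


def max_pool(h, w, c, size=2):
    """Output shape of a non-overlapping max-pooling layer (no parameters)."""
    return (h // size, w // size, c)


def dense(n_in, units):
    """Output width and parameter count of a fully connected layer."""
    return units, n_in * units + units


total_params = 0
shape = (32, 32, 3)

shape, p = conv2d_valid(*shape, filters=32)   # (30, 30, 32)
total_params += p
shape = max_pool(*shape)                      # (15, 15, 32)
shape, p = conv2d_valid(*shape, filters=64)   # (13, 13, 64)
total_params += p
shape = max_pool(*shape)                      # (6, 6, 64)
shape, p = conv2d_valid(*shape, filters=64)   # (4, 4, 64)
total_params += p

flat = shape[0] * shape[1] * shape[2]         # Flatten: 4 * 4 * 64 = 1024
width, p = dense(flat, 64)
total_params += p
width, p = dense(width, 10)
total_params += p

print(shape, flat, total_params)  # (4, 4, 64) 1024 122570
```

Calling model.summary() on the real model should report the same shapes and totals, which makes it a convenient cross-check.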

Step 4: Compile and train the model

Before training, we need to compile the model and specify the loss function, optimizer, and evaluation metrics.

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=10, 
                    validation_data=(test_images, test_labels))
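
Because the last Dense layer has no activation, the model outputs raw scores (logits), which is why the loss is created with from_logits=True. The sketch below shows, in plain Python, the quantity such a loss computes for one example: the negative log-softmax of the true class. The function name sparse_ce_from_logits is hypothetical and this is an illustration of the formula, not the Keras implementation.

```python
import math


def sparse_ce_from_logits(logits, label):
    """Cross-entropy for one example, given raw logits and an integer label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# An untrained model that scores all 10 classes equally has loss ln(10).
uniform_loss = sparse_ce_from_logits([0.0] * 10, label=3)

# A confident, correct prediction has a loss close to zero.
confident_loss = sparse_ce_from_logits([10.0] + [0.0] * 9, label=0)

print(uniform_loss, confident_loss)
```

A first-epoch loss near ln(10), about 2.3, is therefore a sign that the model is starting from chance level on 10 classes.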

Step 5: Evaluate model performance

Let's see how the model performs on test data.

test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)
print(f"Test accuracy: {test_acc}")

Step 6: Visualize training results

We can use Matplotlib to visualize the model's loss and accuracy on training and validation data.

plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
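
The same history dictionary can also be inspected numerically. The sketch below finds the epoch with the lowest validation loss, which is one simple way to spot where overfitting starts. The helper best_epoch is hypothetical and the numbers in example_history are invented for illustration; with a real run, pass history.history instead.

```python
def best_epoch(history_dict, monitor="val_loss"):
    """Return (1-based epoch number, value) where the monitored metric is lowest."""
    values = history_dict[monitor]
    idx = min(range(len(values)), key=values.__getitem__)
    return idx + 1, values[idx]


# Invented values: training loss keeps falling while validation loss turns upward.
example_history = {
    "loss": [1.5, 1.2, 1.0, 0.8, 0.7],
    "val_loss": [1.4, 1.2, 1.1, 1.15, 1.3],
}

epoch, value = best_epoch(example_history)
print(epoch, value)  # 3 1.1
```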

Step 7: Use the model to make predictions

Finally, we can use the trained model to make image classification predictions.

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Pick a random test image
index = np.random.randint(0, len(test_images))
test_image = test_images[index]

# Predict the class of the image
predictions = model.predict(np.expand_dims(test_image, axis=0))
predicted_class = class_names[np.argmax(predictions)]

# Show the image with its predicted class
plt.imshow(test_image)
plt.title(f"Predicted: {predicted_class}")
plt.show()
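
Note that model.predict returns logits here, not probabilities. Taking the argmax works either way, but if a confidence value is wanted, the logits can be passed through a softmax first. The following NumPy sketch shows the idea on made-up logits; the softmax helper and the example_logits values are assumptions for illustration.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Made-up logits for a single image, shaped (1, 10) like model.predict output.
example_logits = np.array([[0.1, -1.2, 2.5, 3.0, 0.0, 1.1, -0.5, 0.3, -2.0, 0.7]])

probs = softmax(example_logits)[0]
best = int(np.argmax(probs))
print(f"Predicted: {class_names[best]} (confidence {probs[best]:.2f})")
```

In TensorFlow itself, tf.nn.softmax(predictions) performs the same conversion.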

These are the basic steps for building a convolutional neural network for CIFAR-10 image classification using TensorFlow. Now, let us turn to the more challenging ImageNet dataset.

Challenges of ImageNet dataset

The ImageNet dataset is larger and more complex, so it requires deeper and more powerful models. Deep convolutional neural networks such as AlexNet, VGG, ResNet, and Inception have all performed well in the ImageNet challenge. Rather than training such a network from scratch, we will build our model on top of a pretrained ResNet50.

Step 1: Load the ImageNet dataset

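First, we need a subset of the ImageNet dataset. Here we use TensorFlow Datasets to load it. Note that the ImageNet images cannot be downloaded automatically: the source archives have to be obtained manually and placed in the TensorFlow Datasets manual download directory before this call succeeds.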

import tensorflow_datasets as tfds

# Load a subset of the ImageNet dataset
(train_ds, validation_ds, test_ds), metadata = tfds.load(
    'imagenet2012_subset',
    split=['train[:80%]', 'train[80%:90%]', 'train[90%:]'],
    with_info=True,
    as_supervised=True,
)

Step 2: Data preprocessing

For the ImageNet dataset, we need to do more data preprocessing, including image resizing and normalization.

def preprocess_image(image, label):
    image = tf.image.resize(image, (224, 224))
    image = tf.keras.applications.resnet.preprocess_input(image)
    return image, label

batch_size = 64
train_ds = train_ds.map(preprocess_image).shuffle(1000).batch(batch_size)
validation_ds = validation_ds.map(preprocess_image).batch(batch_size)
test_ds = test_ds.map(preprocess_image).batch(batch_size)
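
For intuition about what the normalization step does, here is a NumPy sketch of "caffe"-style preprocessing, which to my understanding is the mode the ResNet preprocess_input uses: the channels are reordered from RGB to BGR and the ImageNet per-channel mean is subtracted, with no rescaling to 0..1. The function name caffe_style_preprocess is hypothetical, and the mean values are the commonly cited ones; treat this as an approximation of the library function, not a replacement for it.

```python
import numpy as np

# Commonly cited ImageNet channel means in BGR order (assumption for this sketch).
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def caffe_style_preprocess(rgb_batch):
    """Convert RGB to BGR and zero-center each channel; pixel range stays 0..255-based."""
    bgr = rgb_batch[..., ::-1].astype(np.float32)
    return bgr - IMAGENET_BGR_MEAN


# A uniform mid-gray image makes the per-channel offsets easy to read.
batch = np.full((1, 224, 224, 3), 128, dtype=np.uint8)
out = caffe_style_preprocess(batch)
print(out.shape, out[0, 0, 0])
```

This is why the ImageNet pipeline does not divide by 255 as the CIFAR-10 example did: the pretrained weights expect inputs in the form their original training used.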

Step 3: Build ResNet model

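We will use the pretrained ResNet50 model from the Keras Applications module and add a custom output head to suit our task. Note that these weights were themselves trained on ImageNet, so this example is best read as a demonstration of the transfer-learning workflow.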

base_model = tf.keras.applications.ResNet50(input_shape=(224, 224, 3),
                                           include_top=False,
                                           weights='imagenet')

# Freeze the weights of the pretrained model
base_model.trainable = False

# Add a custom output head
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
prediction_layer = tf.keras.layers.Dense(1000)  # ImageNet has 1,000 classes

model = tf.keras.Sequential([
    base_model,
    global_average_layer,
    prediction_layer
])
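
The two added layers can be illustrated with plain NumPy. With a 224x224 input and include_top=False, ResNet50 produces feature maps of shape (7, 7, 2048) per image; global average pooling collapses each map to a single number, and the Dense layer maps the resulting 2048 values to 1,000 logits. The sketch below uses random numbers in place of real features, so only the shapes and the parameter count are meaningful.

```python
import numpy as np

# Random stand-in for the base model output on a batch of 2 images (assumption).
rng = np.random.default_rng(0)
feature_maps = rng.normal(size=(2, 7, 7, 2048)).astype(np.float32)

# GlobalAveragePooling2D: average over the two spatial axes.
pooled = feature_maps.mean(axis=(1, 2))
print(pooled.shape)  # (2, 2048)

# Dense(1000): one weight per (feature, class) pair plus one bias per class.
num_classes = 1000
head_params = pooled.shape[1] * num_classes + num_classes
print(head_params)  # 2049000
```

While the base model is frozen, these roughly two million head parameters are the only ones being trained.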

Step 4: Compile and train the model

Similar to the previous example, we need to compile the model and specify the loss function, optimizer, and evaluation metrics.

base_learning_rate = 0.0001
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

initial_epochs = 10
history = model.fit(train_ds,
                    epochs=initial_epochs,
                    validation_data=validation_ds)

Step 5: Fine-tune the model

Fine-tuning the model is an important step that can further improve the performance of the model. In fine-tuning, we unfreeze some layers of the pre-trained model and adjust their weights.

# Unfreeze the pretrained base model
base_model.trainable = True

# Index of the first layer to fine-tune
fine_tune_at = 100

# Keep the first fine_tune_at layers frozen
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate / 10),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

fine_tune_epochs = 10
total_epochs = initial_epochs + fine_tune_epochs

history_fine = model.fit(train_ds,
                        epochs=total_epochs,
                        initial_epoch=len(history.epoch),  # resume right after the last completed epoch
                        validation_data=validation_ds)
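
The interaction between epochs and initial_epoch is easy to get wrong by one. In Keras, epochs is the index at which training stops, not the number of additional epochs, and training covers the indices from initial_epoch up to epochs - 1. The plain-Python sketch below models that counting; epochs_run is a hypothetical helper, and recorded stands in for history.epoch after the first 10 epochs.

```python
def epochs_run(initial_epoch, epochs):
    """Epoch indices that a fit call would train, following Keras' counting."""
    return list(range(initial_epoch, epochs))


initial_epochs = 10
fine_tune_epochs = 10
total_epochs = initial_epochs + fine_tune_epochs

# After the first phase, the recorded epoch indices are 0..9.
recorded = list(range(initial_epochs))

# Resuming from the number of completed epochs trains exactly 10 more.
from_count = epochs_run(len(recorded), total_epochs)

# Resuming from the last recorded index (9) trains 11, repeating index 9.
from_last_index = epochs_run(recorded[-1], total_epochs)

print(len(from_count), len(from_last_index))  # 10 11
```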

Step 6: Evaluate model performance

Finally, let’s see how the fine-tuned model performs on the test data.

loss, accuracy = model.evaluate(test_ds)
print(f"Test accuracy after fine-tuning: {accuracy}")
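
With 1,000 classes, results on ImageNet are often reported as top-5 accuracy in addition to the top-1 accuracy printed above: a prediction counts as correct if the true label is among the five highest-scoring classes. The NumPy sketch below shows the computation on a tiny made-up example with 4 classes; top_k_accuracy is a hypothetical helper and the scores are invented.

```python
import numpy as np


def top_k_accuracy(logits, labels, k=5):
    """Fraction of rows whose true label is among the k highest-scoring classes."""
    top_k = np.argsort(logits, axis=1)[:, -k:]     # indices of the k largest scores
    hits = (top_k == labels[:, None]).any(axis=1)  # is the label among them?
    return float(hits.mean())


# Invented scores for 3 images and 4 classes.
logits = np.array([[0.1, 0.9, 0.3, 0.2],
                   [0.8, 0.1, 0.05, 0.05],
                   [0.2, 0.3, 0.1, 0.4]])
labels = np.array([2, 0, 0])

top1 = top_k_accuracy(logits, labels, k=1)
top2 = top_k_accuracy(logits, labels, k=2)
print(top1, top2)
```

In Keras, a similar number can be tracked during training by adding tf.keras.metrics.SparseTopKCategoricalAccuracy(k=5) to the metrics list passed to compile.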

At this point, we have worked through a full image classification task: loading data, building a model, training it, and evaluating its performance. With TensorFlow, the same workflow scales from a small dataset like CIFAR-10 to a large one like ImageNet.

Origin blog.csdn.net/m0_68036862/article/details/133490703