[Image Classification Case] (2) DenseNet Four-Class Weather Image Classification (Transfer Learning with Pre-trained Weights), with Complete TensorFlow Code

Hello everyone, today I will share how to build a DenseNet convolutional neural network with TensorFlow and use the weights of a pre-trained model to classify four kinds of weather images.

The complete code is on my Gitee; grab it if you need it:

https://gitee.com/dgvv4/image-classification/tree/master


1. DenseNet

1.1 Network introduction

DenseNet adopts a dense connection mechanism: every layer is connected to all the layers before it, and each layer's input is formed by stacking the feature maps of all preceding layers along the channel dimension (layers.Concatenate). This feature reuse not only alleviates the vanishing-gradient problem, but also lets DenseNet reach better performance than ResNet with less computation.
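
To make the mechanism concrete, here is a minimal illustration (my own sketch, not part of the original code) of how layers.Concatenate stacks feature maps along the channel dimension:

# Minimal illustration of dense connectivity (sketch, not part of the original code)
import tensorflow as tf
from tensorflow.keras import layers

a = tf.zeros([1, 56, 56, 64])  # feature map of an earlier layer
b = tf.zeros([1, 56, 56, 32])  # feature map of the current layer
c = layers.Concatenate()([a, b])
print(c.shape)  # (1, 56, 56, 96) -- the channels simply stack up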

Dense connections require the feature maps being concatenated to have the same spatial size, so DenseNet is organized as a sequence of DenseBlock + Transition structures. A DenseBlock is a module containing many layers whose feature maps all have the same size and which are densely connected to one another. A Transition module connects two adjacent DenseBlocks and downsamples through a pooling layer.


1.2 DenseBlock code implementation

Suppose the input feature map of a DenseBlock has k0 channels, and each layer inside the block outputs a feature map with k channels. Then the input of the L-th layer has k0 + (L-1)*k channels; k is called the growth rate of the network (growth_rate).
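
As a quick worked example (using the DenseNet121 settings from later in this article: k0 = 64 and k = 32), the input channel counts inside the first DenseBlock grow like this:

# Worked example of the k0 + (L-1)*k formula (assumed values: k0=64, k=32)
k0, k = 64, 32
for L in range(1, 7):  # the first DenseBlock has 6 layers
    print(f'layer {L}: {k0 + (L - 1) * k} input channels')
# layer 1: 64, layer 2: 96, ..., layer 6: 224; the block output has 64 + 6*32 = 256 channels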

Each layer receives the feature maps of all previous layers: the outputs of every preceding layer are concatenated (layers.Concatenate) and passed directly to the next layer.

Each layer in a DenseBlock places the activation before the convolution, i.e. the pre-activation order BN + ReLU + Conv; in DenseNet this ordering performs better.

Inside each layer, a 1*1 convolution first reduces the channels stacked from the previous layers down to 4*k, which cuts the number of parameters and improves computational efficiency while limiting feature loss.

Code:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, Model

#(1) Build a single dense layer unit
def DenseLayer(x, growth_rate, dropout_rate=0.2):
    
    # BN + ReLU + 1*1 conv to reduce the number of feature maps
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(filters = growth_rate*4,  # bottleneck: reduce the stacked channels to 4*k
                      kernel_size = (1,1),
                      strides = 1,
                      padding = 'same')(x)
    
    # BN + ReLU + 3*3 conv
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(filters = growth_rate,
                      kernel_size = (3,3),
                      strides = 1,
                      padding = 'same')(x)

    # randomly drop activations to regularize
    x = layers.Dropout(rate = dropout_rate)(x)

    return x

#(2) Build a DenseBlock by combining several DenseLayers
def DenseBlock(x, num, growth_rate):

    # repeat the DenseLayer num times
    for _ in range(num):
        conv = DenseLayer(x, growth_rate)
        # stack the features of all previous layers and pass them to the next layer
        x = layers.Concatenate()([x, conv])
    
    return x

1.3 Transition code

The Transition module connects two adjacent DenseBlocks; it downsamples to reduce the feature-map size and compresses the model. A Transition layer consists of a 1*1 convolution and a 2*2 average pooling layer.

#(3) Transition layer connecting two adjacent DenseBlocks
def Transition(x, compression_rate=0.5):
    
    # the 1*1 conv halves the number of channels
    out_channel = int(x.shape[-1] * compression_rate)

    # BN + ReLU + 1*1 conv + 2*2 average pooling
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(filters = out_channel,  # number of output channels
                      kernel_size = (1,1),
                      strides = 1,
                      padding = 'same')(x)
    x = layers.AveragePooling2D(pool_size = (2,2),
                                strides = 2,  # downsampling
                                padding = 'same')(x)
    return x
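
As a quick sanity check (a sketch assuming the DenseLayer, DenseBlock, and Transition functions above are defined), one DenseBlock + Transition pair transforms shapes exactly as described:

# Shape check for one DenseBlock + Transition pair (sketch)
inputs = keras.Input(shape=(56, 56, 64))       # [56,56,64]
x = DenseBlock(inputs, num=6, growth_rate=32)  # ==> [56,56,64+6*32] = [56,56,256]
x = Transition(x)                              # ==> [28,28,128]: channels and spatial size halved
print(x.shape)  # (None, 28, 28, 128)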

1.4 Construct the backbone network

The network structure diagram appears in the original post (omitted here). I take DenseNet121 as an example to build the network backbone.

#(4) Backbone architecture
def densenet(input_shape, classes, growth_rate, include_top):

    # construct the input layer [224,224,3]
    inputs = keras.Input(shape=input_shape)

    # strided conv downsampling [224,224,3]==>[112,112,64]
    x = layers.Conv2D(filters = 2*growth_rate,  # the stem outputs twice the growth rate
                      kernel_size = (7,7),
                      strides = 2,
                      padding = 'same')(inputs)
    
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)

    # max pooling [112,112,64]==>[56,56,64]
    x = layers.MaxPooling2D(pool_size = (3,3), 
                            strides = 2, 
                            padding = 'same')(x)

    # [56,56,64]==>[56,56,64+6*32]
    x = DenseBlock(x, num=6,  growth_rate=growth_rate)
    # [56,56,256]==>[28,28,128]
    x = Transition(x)
    # [28,28,128]==>[28,28,128+12*32]
    x = DenseBlock(x, num=12, growth_rate=growth_rate)
    # [28,28,512]==>[14,14,256]
    x = Transition(x)
    # [14,14,256]==>[14,14,256+24*32]
    x = DenseBlock(x, num=24, growth_rate=growth_rate)
    # [14,14,1024]==>[7,7,512]
    x = Transition(x)
    # [7,7,512]==>[7,7,512+16*32]
    x = DenseBlock(x, num=16, growth_rate=growth_rate)

    # whether to include the output layer when building the model
    if include_top is True:

        # [7,7,1024]==>[None,1024]
        x = layers.GlobalAveragePooling2D()(x)
        # [None,1024]==>[None,classes]
        x = layers.Dense(classes)(x)  # raw logits; not converted to probabilities by softmax

    # build the model
    model = Model(inputs, x)

    return model

#(5) Instantiate the network model
if __name__ == '__main__':

    model = densenet(input_shape=[224,224,3],  # shape of the input image
                     classes = 1000,  # number of classes
                     growth_rate = 32,  # growth rate, i.e. output channels of each dense layer
                     include_top = True)  # include the output layer

    model.summary()  # print the network architecture
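
For reference, the deeper DenseNet variants differ only in how many DenseLayers each of the four DenseBlocks contains. The block configurations below are the standard ones from the DenseNet paper; threading them through the densenet function is a small refactor left to the reader:

# Standard block configurations from the DenseNet paper
block_config = {
    'densenet121': [6, 12, 24, 16],
    'densenet169': [6, 12, 32, 32],
    'densenet201': [6, 12, 48, 32],
}
# e.g. replace the hard-coded num=6/12/24/16 calls above with a loop over block_config['densenet169']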

2. Network training

2.1 Load the dataset

The dataset is constructed with tf.keras.preprocessing.image_dataset_from_directory(), which reads image data in batches. The image_size parameter resizes each image to the specified size. The label_mode parameter controls the target format: 'int' yields integer class indices (0, 1, 2, 3, ...); 'categorical' yields one-hot vectors, where the position of the correct class is 1 (with four classes, an image of the second class becomes [0, 1, 0, 0]); 'binary' is for binary classification.

import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt

# parameter settings (example values; point filepath at your own dataset folder,
# assumed to contain 'train' and 'val' subfolders)
filepath = './weather_dataset/'
height, width, batchsz = 224, 224, 32
checkData, plotShow = True, True  # whether to inspect / visualize the dataset

#(1) Load the dataset
def get_data(height, width, batchsz):
    
    # training set
    train_ds = keras.preprocessing.image_dataset_from_directory(
        directory = filepath + 'train',  # folder containing the training images
        label_mode = 'categorical',  # one-hot encoding
        image_size = (height, width),  # resize the input images
        batch_size = batchsz)  # 32 images per training batch

    # validation set
    val_ds = keras.preprocessing.image_dataset_from_directory(
        directory = filepath + 'val', 
        label_mode = 'categorical', 
        image_size = (height, width),  
        batch_size = batchsz)  

    # return the split datasets
    return train_ds, val_ds

# read the datasets
train_ds, val_ds = get_data(height, width, batchsz) 

#(2) Inspect the dataset
def check_data(train_ds):  # takes the training dataset
    
    # list the class names in the dataset
    class_names = train_ds.class_names
    print('classNames:', class_names)

    # inspect the shapes; x is the image data, y the class labels
    sample = next(iter(train_ds))  # iterator that yields one batch at a time
    print('x_batch.shape:', sample[0].shape, 'y_batch.shape:', sample[1].shape)
    print('first five labels:', sample[1][:5])

# whether to inspect the dataset
if checkData is True:
    check_data(train_ds)

#(3) Visualize some images
def plot_show(train_ds):

    # iterator that yields one batch at a time
    sample = next(iter(train_ds))  # sample[0] images, sample[1] labels
    # show the first 5 images
    for i in range(5):
        plt.subplot(1,5,i+1)  # draw 1 row and 5 columns of subplots
        plt.imshow(sample[0][i]/255.0)  # scale pixel values to 0-1 for display
        plt.xticks([])  # hide the x/y axis ticks
        plt.yticks([])
    plt.show()

# whether to show the images
if plotShow is True:
    plot_show(train_ds)

Sample weather images are shown in the original post (figure omitted here).


2.2 Preprocessing Improvements

This article trains the network by transfer learning, using the weights of a pre-trained DenseNet121 model. The overall procedure is similar to the from-scratch training method in the previous article, so only the points that change are explained here. See the previous case: https://blog.csdn.net/dgvv4/article/details/123714507

First, since we continue training from weights trained by someone else, we must apply the same data preprocessing that they used.

The ImageNet preprocessing used here subtracts the per-channel mean from the pixel values of each channel of the input image; the pixel values are not normalized to [0, 1].

# preprocessing uses mean subtraction instead of normalization; per-channel RGB means
_R_MEAN = 123.68
_G_MEAN = 116.78
_B_MEAN = 103.94

#(4) Data preprocessing
def processing(x,y):  # preprocessing function
    x = tf.cast(x, dtype=tf.float32) # convert images to float tensors
    y = tf.cast(y, dtype=tf.int32)  # convert labels to int tensors
    # subtract the per-channel mean; no normalization
    x = x - [_R_MEAN, _G_MEAN, _B_MEAN]

    return x,y

# preprocess all the data
train_ds = train_ds.map(processing).shuffle(10000)  # map applies the custom function; shuffle shuffles the dataset
val_ds = val_ds.map(processing)
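
Optionally (not in the original script), the input pipeline can overlap preprocessing with training by prefetching, which usually shortens each epoch:

# Optional: overlap data preprocessing with model execution
AUTOTUNE = tf.data.AUTOTUNE  # on older TensorFlow versions: tf.data.experimental.AUTOTUNE
train_ds = train_ds.prefetch(AUTOTUNE)
val_ds = val_ds.prefetch(AUTOTUNE)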

2.3 Load the model and weights

Second, load the network model and weights. We only need the feature-extraction layers of the network, not its output layer (the original network outputs 1000 classes). Setting model.trainable = False freezes all weights of the feature-extraction layers, so only the weights of the new output layer are updated during forward and backward propagation. It is recommended to freeze the feature extractor for the first 10 epochs of training and then set model.trainable = True for the remaining epochs, so that all weights of the network can be updated.

We also define a custom output layer: the original network's head predicts 1000 classes, while our own task has only a few, so the output layers must be written ourselves. The layers are stacked with the keras.Sequential() container.

# example settings (assumed values -- adjust to your task; pre_weights_path is a
# placeholder for your pre-trained DenseNet121 weight file)
input_shape = [224, 224, 3]
classes = 4
pre_weights_path = 'densenet121_weights.h5'

model = densenet(input_shape=input_shape, # size of the network's input image
                classes=classes,  # number of classes
                growth_rate = 32,  # growth rate, i.e. output channels of each dense layer
                include_top=False)  # do not load the output layer

# Fix for the error "You are trying to load a weight file containing 242 layers into a model with ...":
# add the argument by_name=True
model.load_weights(pre_weights_path, by_name=True)

# False freezes all backbone parameters so that only the output layer is updated;
# True (used here) lets the whole network be fine-tuned
model.trainable = True

# custom output layers; no softmax activation here, which benefits numerical stability
model = keras.Sequential([model,  # [7,7,1024]
                          layers.GlobalAveragePooling2D(), # ==>[None,1024]
                          layers.Dropout(rate=0.5),
                          layers.Dense(512),  # ==>[None,512]
                          layers.Dropout(rate=0.5),
                          layers.Dense(classes)])  #==>[None,classes] 
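
The freeze-then-unfreeze schedule recommended above can be written as two consecutive fit calls. The following is a minimal sketch (not part of the original script; it assumes the compile settings from the next section):

# Sketch of the two-phase schedule: freeze the backbone first, then fine-tune everything
backbone = model.layers[0]  # the densenet feature extractor inside the Sequential

# Phase 1: train only the new head for 10 epochs
backbone.trainable = False
model.compile(optimizer=keras.optimizers.Adam(1e-3),
              loss=keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(train_ds, validation_data=val_ds, epochs=10)

# Phase 2: unfreeze and fine-tune the whole network with a lower learning rate
backbone.trainable = True
model.compile(optimizer=keras.optimizers.Adam(1e-5),  # must recompile after changing trainable
              loss=keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(train_ds, validation_data=val_ds, epochs=20, initial_epoch=10)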


2.4 Training Results

Compared with training from scratch, only the two points above need to change. Next, start training and set a callback to save the weights of the epoch with the lowest validation loss.

import os
from tensorflow.keras import optimizers

# example training settings (assumed values)
weights_dir = 'save_weights'  # folder for the saved weight files
learning_rate = 0.0001
epochs = 10

#(7) Create the folder for the weight files
if not os.path.exists(weights_dir):  # check whether a save_weights folder exists here
    os.makedirs(weights_dir)  # create it if it does not


#(8) Compile the model
opt = optimizers.Adam(learning_rate=learning_rate)  # Adam optimizer

model.compile(optimizer=opt, # optimizer and learning rate
              loss=keras.losses.CategoricalCrossentropy(from_logits=True), # cross-entropy loss; the logits go through softmax internally
              metrics=['accuracy']) # evaluation metric


#(9) Define the callbacks (a list)
# save the model parameters
callbacks = [keras.callbacks.ModelCheckpoint(filepath = 'save_weights/densenet.h5',  # where to save the parameters
                                            save_best_only = True,  # keep only the best parameters
                                            save_weights_only = True,  # save only the weights
                                            monitor = 'val_loss')]  # judge "best" by the validation loss


#(10) Train the model; history stores the training information
history = model.fit(x = train_ds,  # training set
                    validation_data = val_ds,  # validation set
                    epochs = epochs,  # number of training epochs
                    callbacks = callbacks,
                    initial_epoch=0) 

After training, the training information is stored in history; we then plot the loss and accuracy curves.

#(11) Retrieve the training information
history_dict = history.history  # dictionary of the training data
train_loss = history_dict['loss']  # training loss
train_accuracy = history_dict['accuracy']  # training accuracy
val_loss = history_dict['val_loss']  # validation loss
val_accuracy = history_dict['val_accuracy']  # validation accuracy

#(12) Plot training and validation loss
plt.figure()
plt.plot(range(epochs), train_loss, label='train_loss')  # training loss
plt.plot(range(epochs), val_loss, label='val_loss')  # validation loss
plt.legend()  # show the legend
plt.xlabel('epochs')
plt.ylabel('loss')

#(13) Plot training and validation accuracy
plt.figure()
plt.plot(range(epochs), train_accuracy, label='train_accuracy')  # training accuracy
plt.plot(range(epochs), val_accuracy, label='val_accuracy')  # validation accuracy
plt.legend()
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.show()

Accuracy and loss during training

Epoch 1/10
99/99 [==============================] - 64s 354ms/step - loss: 1.8832 - accuracy: 0.5543 - val_loss: 117.8199 - val_accuracy: 0.4000
Epoch 2/10
99/99 [==============================] - 16s 140ms/step - loss: 1.5482 - accuracy: 0.6619 - val_loss: 1.3441 - val_accuracy: 0.6800
Epoch 3/10
99/99 [==============================] - 16s 138ms/step - loss: 2.0702 - accuracy: 0.6129 - val_loss: 5.8502 - val_accuracy: 0.5289
Epoch 4/10
99/99 [==============================] - 16s 137ms/step - loss: 1.2101 - accuracy: 0.7477 - val_loss: 14.2739 - val_accuracy: 0.5956
Epoch 5/10
99/99 [==============================] - 15s 137ms/step - loss: 0.7785 - accuracy: 0.7687 - val_loss: 0.4700 - val_accuracy: 0.8356
Epoch 6/10
99/99 [==============================] - 16s 138ms/step - loss: 1.1095 - accuracy: 0.7349 - val_loss: 0.8931 - val_accuracy: 0.8089
Epoch 7/10
99/99 [==============================] - 16s 138ms/step - loss: 0.8705 - accuracy: 0.7835 - val_loss: 0.2821 - val_accuracy: 0.8978
Epoch 8/10
99/99 [==============================] - 16s 138ms/step - loss: 0.9680 - accuracy: 0.7760 - val_loss: 0.6360 - val_accuracy: 0.8533
Epoch 9/10
99/99 [==============================] - 16s 139ms/step - loss: 0.9522 - accuracy: 0.8041 - val_loss: 0.3987 - val_accuracy: 0.8844
Epoch 10/10
99/99 [==============================] - 16s 138ms/step - loss: 1.0262 - accuracy: 0.7649 - val_loss: 0.4000 - val_accuracy: 0.8578

3. Prediction Phase

Taking prediction over the entire test set as an example: test_ds stores the test-set images and class labels, and the same preprocessing as for the training set is applied, subtracting the per-channel mean from the pixel values of each channel.

model.predict(img) returns the probability that each image belongs to each class. np.argmax(result) finds the index of the maximum probability, and the class name at that index is the final prediction.

import tensorflow as tf
from tensorflow import keras
from DenseNet import densenet
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

# Fix for the error "NotFoundError: No algorithm worked! when using Conv2D"
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)


# ------------------------------------ #
# prediction settings
# ------------------------------------ #
im_height = 224  # height of the input image
im_width = 224   # width of the input image
# class names
class_names = ['cloudy', 'rain', 'shine', 'sunrise']
# weights path
weight_dir = 'save_weights/densenet_new.h5'
# ------------------------------------ #
# preprocessing: per-channel RGB means
# ------------------------------------ #
_R_MEAN = 123.68
_G_MEAN = 116.78
_B_MEAN = 103.94
# ------------------------------------ #
# single-image prediction
# ------------------------------------ #
# whether to predict only one image
single_pic = False
# path of the folder containing the image
single_filepath = 'D:/deeplearning/test/....../'  
# pick a specific image
picture = single_filepath + 'rain94.jpg'

# ------------------------------------ #
# prediction over the test set
# ------------------------------------ #
test_pack = True
# test set folder path
test_filepath = 'D:/deeplearning/test/数据集/...../test/'


#(1) Load the model without the output layer
model = densenet(input_shape=[224,224,3], classes=4, growth_rate = 32, include_top=False)
print('model is loaded')

#(2) Build the output layers
model = keras.Sequential([model,
                          keras.layers.GlobalAveragePooling2D(), # ==>[None,1024]
                          keras.layers.Dropout(rate=0.5),
                          keras.layers.Dense(512),  # ==>[None,512]
                          keras.layers.Dropout(rate=0.5),
                          keras.layers.Dense(len(class_names)),  #==>[None,classes] 
                          keras.layers.Softmax()])  # apply softmax here to get class probabilities

#(3) Load the .h5 weight file
model.load_weights(weight_dir, by_name=True)
print('weights are loaded')


#(4) Predict a single image
if single_pic is True:
    
    # load the image
    img = Image.open(picture)
    # resize the image
    img = img.resize((im_height, im_width))
    # show the image
    plt.figure()
    plt.imshow(img)
    plt.xticks([])
    plt.yticks([])
    
    # preprocessing: subtract the per-channel mean
    img = np.array(img).astype(np.float32)  # convert to a float array
    img = img - [_R_MEAN, _G_MEAN, _B_MEAN]

    # the network expects a batch dimension: [h,w,c]==>[b,h,w,c]
    img = np.expand_dims(img, axis=0)

    # predict the image; the result keeps the batch dimension [b,n]
    result = model.predict(img)
    # squeeze out the batch dimension to get a 1-D array
    result = np.squeeze(result)
    
    # index of the highest probability
    predict_class = np.argmax(result)
    
    # print the predicted class and its probability
    print('class:', class_names[predict_class], 
          'prob:', result[predict_class])
    
    plt.title(f'{class_names[predict_class]}')
    plt.show()

#(5) Predict the whole test set
if test_pack is True:
    
    # load the test set
    test_ds = keras.preprocessing.image_dataset_from_directory(
        directory = test_filepath, 
        label_mode = 'int',  # integer labels (no one-hot encoding): 0, 1, 2, 3, ... 
        image_size = (im_height, im_width),  # resize the test images
        batch_size = 32)  # 32 images per batch
    
    # test set preprocessing
    def processing(image, label): 
        
        image = tf.cast(image, tf.float32)  # change the data type
        label = tf.cast(label, tf.int32)
        # subtract the per-channel mean; no normalization
        image = image - [_R_MEAN, _G_MEAN, _B_MEAN]

        return (image, label)
 
    test_ds = test_ds.map(processing) # preprocess


    test_true = []  # ground-truth labels
    test_pred = []  # predicted labels
    
    # iterate over all batches of the test set
    for imgs, labels in test_ds:
        # take one image and one label at a time from the batch
        for img, label in zip(imgs, labels):
            
            # the network expects a batch dimension: [h,w,c]==>[b,h,w,c]
            image_array = tf.expand_dims(img, axis=0)
            # predict one image; returns the probabilities for each class
            prediction = model.predict(image_array)

            # class corresponding to the highest predicted probability
            test_pred.append(class_names[np.argmax(prediction)])
            # label is the index of the true class
            test_true.append(class_names[label])
            
    # show the results
    print('true labels: ', test_true[:10])
    print('predictions: ', test_pred[:10])
    
    # plot the confusion matrix
    from sklearn.metrics import confusion_matrix
    import seaborn as sns
    import pandas as pd
    plt.rcParams['font.size'] = 15  # set the font size
    
    # compute the confusion matrix
    conf_numpy = confusion_matrix(test_true, test_pred)
    # convert to a DataFrame with row/column labels
    conf_df = pd.DataFrame(conf_numpy, index=class_names, columns=class_names)
    
    # create the figure
    plt.figure(figsize=(8,7))
    
    # draw the heatmap
    sns.heatmap(conf_df, annot=True, fmt="d", cmap="BuPu")
    
    # set the labels
    plt.title('Confusion_Matrix')
    plt.xlabel('Predict')
    plt.ylabel('True')
    plt.show()
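
Beyond the confusion matrix, the collected test_true and test_pred lists can also be summarized with per-class precision/recall/F1 scores (an optional addition, not in the original script):

# Optional: per-class metrics from the test_true/test_pred lists collected above
from sklearn.metrics import classification_report
print(classification_report(test_true, test_pred))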

Original article: blog.csdn.net/dgvv4/article/details/123753612