MobileNet in practice: MobileNetV2 image classification with TensorFlow 2.X (large datasets)

Summary

This example uses a subset of the Plant Seedlings dataset, which contains 12 categories. Today I will walk you through an image classification task in TensorFlow 2.X, using MobileNetV2 as the classification model. The implementation in this article has the following features:

1. Image loading is customized, which is more flexible and efficient: images are not loaded into memory all at once, which saves memory and suits large-scale datasets.

2. The model's pre-trained weights are loaded, which shortens training time.

3. Albumentations is used for data augmentation.

For a more detailed explanation of MobileNetV2, you can refer to the following article:

wanghao.blog.csdn.net/article/det…

Project structure

MobileNetV2_demo
├─data
│  ├─test
│  └─train
│      ├─Black-grass
│      ├─Charlock
│      ├─Cleavers
│      ├─Common Chickweed
│      ├─Common wheat
│      ├─Fat Hen
│      ├─Loose Silky-bent
│      ├─Maize
│      ├─Scentless Mayweed
│      ├─Shepherds Purse
│      ├─Small-flowered Cranesbill
│      └─Sugar beet
├─train.py
├─test1.py
└─test.py

Training

Create a new train.py.

Step 1: import the required packages and set the global parameters

import numpy as np
from tensorflow.keras.optimizers import Adam
import cv2
from tensorflow.keras.preprocessing.image import img_to_array
from sklearn.model_selection import train_test_split
# Import callbacks and layers from tensorflow.keras rather than
# tensorflow.python.keras; mixing the two namespaces causes
# compatibility errors in TF 2.x.
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.applications import MobileNetV2
import os
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
import albumentations

norm_size = 224
datapath = 'data/train'
EPOCHS = 100
INIT_LR = 1e-3
labelList = []
dicClass = {'Black-grass': 0, 'Charlock': 1, 'Cleavers': 2, 'Common Chickweed': 3, 'Common wheat': 4, 'Fat Hen': 5, 'Loose Silky-bent': 6,
            'Maize': 7, 'Scentless Mayweed': 8, 'Shepherds Purse': 9, 'Small-flowered Cranesbill': 10, 'Sugar beet': 11}
classnum = 12
batch_size = 16
np.random.seed(42)

As you can see here, TensorFlow 2.0 and above integrates Keras, so there is no need to install Keras separately. To upgrade older code to TensorFlow 2.0+, simply add tensorflow. in front of the keras imports.
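
For example, a minimal before/after of the import change (an illustrative sketch, not code from this project):

# Standalone Keras style (pre-TF2):
# from keras.layers import Dense
# TensorFlow 2.x integrated style:
from tensorflow.keras.layers import Dense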

With the imports in place, let me explain a few important global parameters:

  • norm_size = 224: the size of the input image; MobileNetV2's default input size is 224×224.

  • datapath = 'data/train': the path where the images are stored. Note that if there are many images, do not put them inside the project directory, otherwise PyCharm will index all of them when loading the project, which is very slow.

  • EPOCHS = 100: the number of epochs. How many epochs to use is a tangled question; generally 300 is enough, and if the results are still unsatisfactory you can load the saved model and continue training.

  • INIT_LR = 1e-3: the learning rate. In general it decays gradually from 0.001 and should not be reduced below 1e-6.

  • classnum = 12: the number of categories. The dataset has 12 categories, so the task is a 12-class classification.

  • batch_size = 16: set according to your hardware and the dataset size. Too small and the loss fluctuates too much; too large and convergence suffers. Based on experience, it is usually set to a power of 2. On Windows you can check GPU memory usage in Task Manager; on Ubuntu you can use nvidia-smi.

  • np.random.seed(42): fix the random seed for numpy.random so that the random indices are reproducible, as the sketch below demonstrates.
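
A quick check of what the fixed seed buys us (an illustrative snippet, not part of the training script):

import numpy as np

np.random.seed(42)
# The same five "random" indices are produced on every run,
# so index-based batch sampling is reproducible.
print(np.random.randint(0, 100, size=5))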

Step 2: load the images

Unlike previous approaches, we do not process the images here; instead, we only return a list of image paths.

See the code for details:

def loadImageData():
    imageList = []
    listClasses = os.listdir(datapath)  # category folders
    print(listClasses)
    for class_name in listClasses:
        label_id = dicClass[class_name]
        class_path = os.path.join(datapath, class_name)
        image_names = os.listdir(class_path)
        for image_name in image_names:
            image_full_path = os.path.join(class_path, image_name)
            labelList.append(label_id)
            imageList.append(image_full_path)
    return imageList


print("开始加载数据")
imageArr = loadImageData()
labelList = np.array(labelList)
print("加载数据完成")
print(labelList)

After preparing the data, we need to split it into a training set and a validation set, usually at a ratio of 4:1 or 7:3. The split uses the train_test_split() method, which requires from sklearn.model_selection import train_test_split. For example:

trainX, valX, trainY, valY = train_test_split(imageArr, labelList, test_size=0.2, random_state=42)

Step 3: image augmentation

train_transform = albumentations.Compose([
        albumentations.OneOf([
            albumentations.RandomGamma(gamma_limit=(60, 120), p=0.9),
            albumentations.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.9),
            albumentations.CLAHE(clip_limit=4.0, tile_grid_size=(4, 4), p=0.9),
        ]),
        albumentations.HorizontalFlip(p=0.5),
        albumentations.ShiftScaleRotate(shift_limit=0.2, scale_limit=0.2, rotate_limit=20,
                                        interpolation=cv2.INTER_LINEAR, border_mode=cv2.BORDER_CONSTANT, p=1),
        albumentations.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, p=1.0)
    ])
val_transform = albumentations.Compose([
        albumentations.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, p=1.0)
    ])
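
As a quick usage note (an illustrative sketch; the file path is hypothetical): an Albumentations pipeline is called with a named image argument and returns a dict, so a single image is augmented like this:

# img = cv2.imread('data/train/Maize/example.png')  # hypothetical file
# augmented = train_transform(image=img)['image']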

These settings were chosen casually; for concrete settings, you can refer to my earlier article:

A Summary of the Albumentations Image Augmentation Library (AI浩, CSDN blog)

Two augmentation pipelines are defined: one for training and one for validation. The validation set only needs the images normalized.

Step 4: define the image-processing method

The generator's main job is to process the images and return one batch of images with their corresponding labels per iteration.

Idea:

Inside a while loop:

  • Initialize input_samples and input_labels, two lists that hold the images and their labels respectively.

  • Loop batch_size times:

    • Pick a random index.
    • Get the image path and label from file_pathList and labels.
    • Read the image.
    • Apply the training transform when training, otherwise the validation transform.
    • Resize the image.
    • Convert the image to an array.
    • Append the image and label to input_samples and input_labels.
  • Convert the lists to numpy arrays.

  • Yield one iteration.

def generator(file_pathList, labels, batch_size, train_action=False):
    L = len(file_pathList)
    while True:
        input_labels = []
        input_samples = []
        for row in range(0, batch_size):
            # Sample a random index and fetch the matching path/label pair.
            temp = np.random.randint(0, L)
            X = file_pathList[temp]
            Y = labels[temp]
            # np.fromfile + cv2.imdecode also handles paths with non-ASCII characters.
            image = cv2.imdecode(np.fromfile(X, dtype=np.uint8), -1)
            if image.shape[2] > 3:
                # Drop the alpha channel of RGBA images.
                image = image[:, :, :3]
            if train_action:
                image = train_transform(image=image)['image']
            else:
                image = val_transform(image=image)['image']
            image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
            image = img_to_array(image)
            input_samples.append(image)
            input_labels.append(Y)
        batch_x = np.asarray(input_samples)
        batch_y = np.asarray(input_labels)
        yield (batch_x, batch_y)
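
A quick way to sanity-check the generator (illustrative, not part of train.py):

# Pull one batch and inspect its shapes; expect (16, 224, 224, 3) and (16,).
batch_x, batch_y = next(generator(trainX, trainY, batch_size, train_action=True))
print(batch_x.shape, batch_y.shape)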

Step 5: save the best model and schedule the learning rate

ModelCheckpoint: used to save the best-performing model.

The syntax is as follows:

keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1)

This callback saves the model to filepath after every epoch.

filepath can be a formatted string whose placeholders are filled in with the epoch number and the keys of the logs dict passed to on_epoch_end.

For example, if filepath is weights.{epoch:02d}-{val_loss:.2f}.hdf5, multiple files will be generated, each named with the epoch number and the validation loss.

Parameters

  • filepath: string, path to save the model
  • monitor: quantity to monitor
  • verbose: verbosity mode, 0 or 1
  • save_best_only: when True, only the model that performs best on the validation set is saved
  • mode: one of 'auto', 'min', 'max'. When save_best_only=True, it decides the criterion for the best model; for example, for val_acc the mode should be max, and for val_loss it should be min. In auto mode the criterion is inferred from the name of the monitored quantity.
  • save_weights_only: if True, only the model weights are saved; otherwise the whole model (including the architecture, configuration, etc.) is saved
  • period: number of epochs between checkpoints
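
As a concrete example of the formatted filepath (a sketch; the filename pattern is illustrative):

# Saves e.g. weights.03-0.45.hdf5 after an epoch 3 whose val_loss is 0.45.
demo_checkpoint = ModelCheckpoint('weights.{epoch:02d}-{val_loss:.2f}.hdf5',
                                  monitor='val_loss', save_best_only=False)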

ReduceLROnPlateau: reduce the learning rate when a monitored metric has stopped improving. The syntax is as follows:

keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=0, mode='auto', epsilon=0.0001, cooldown=0, min_lr=0)

Models often benefit from reducing the learning rate by a factor of 2 or 10 once learning stagnates. This callback monitors a metric, and if no improvement is seen for patience epochs, the learning rate is reduced.

Parameters

  • monitor: quantity to be monitored
  • factor: factor by which the learning rate is reduced each time, as lr = lr * factor
  • patience: number of epochs with no improvement after which the learning rate is reduced
  • mode: one of 'auto', 'min', 'max'. In min mode the reduction is triggered when the monitored quantity stops decreasing; in max mode, when it stops increasing.
  • epsilon: threshold for determining whether the monitored quantity has entered a "plateau"
  • cooldown: after the learning rate is reduced, normal operation resumes after cooldown epochs
  • min_lr: the lower bound of the learning rate

The code for this example is as follows:

checkpointer = ModelCheckpoint(filepath='best_model.hdf5',
                               monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')

reduce = ReduceLROnPlateau(monitor='val_accuracy', patience=10,
                           verbose=1,
                           factor=0.5,
                           min_lr=1e-6)

Step 6: build the model and train it

model = Sequential()
model.add(MobileNetV2(include_top=False, pooling='avg', weights='imagenet'))
model.add(Dense(classnum, activation='softmax'))

optimizer = Adam(learning_rate=INIT_LR)
model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

history = model.fit(generator(trainX, trainY, batch_size, train_action=True),
                    steps_per_epoch=len(trainX) // batch_size,
                    validation_data=generator(valX, valY, batch_size, train_action=False),
                    epochs=EPOCHS,
                    validation_steps=len(valX) // batch_size,
                    callbacks=[checkpointer, reduce])
model.save('my_model.h5')
print(history)

If you want to specify classes when constructing the model, two conditions must hold: include_top=True and weights=None; otherwise classes cannot be specified.

That means specifying classes rules out pre-trained weights, so another approach is used instead:

model = Sequential()
model.add(MobileNetV2(include_top=False, pooling='avg', weights='imagenet'))
model.add(Dense(classnum, activation='softmax'))

This way we get both pre-trained weights and a configurable classnum.
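
For comparison, this is what the classes route would look like (an illustrative sketch: with include_top=True, pre-trained weights must be disabled via weights=None, so the classifier is trained from scratch):

model = MobileNetV2(include_top=True, weights=None, classes=classnum)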

In addition, in version 2.X fit supports generators directly (what previously required fit_generator), so we simply use fit.

Step 7: plot and save the training results

loss_trend_graph_path = r"WW_loss.jpg"
acc_trend_graph_path = r"WW_acc.jpg"
import matplotlib.pyplot as plt

print("Now,we start drawing the loss and acc trends graph...")
# summarize history for accuracy
fig = plt.figure(1)
plt.plot(history.history["accuracy"])
plt.plot(history.history["val_accuracy"])
plt.title("Model accuracy")
plt.ylabel("accuracy")
plt.xlabel("epoch")
plt.legend(["train", "test"], loc="upper left")
plt.savefig(acc_trend_graph_path)
plt.close(1)
# summarize history for loss
fig = plt.figure(2)
plt.plot(history.history["loss"])
plt.plot(history.history["val_loss"])
plt.title("Model loss")
plt.ylabel("loss")
plt.xlabel("epoch")
plt.legend(["train", "test"], loc="upper left")
plt.savefig(loss_trend_graph_path)
plt.close(2)
print("We are done, everything seems OK...")
# # Windows: shut down the machine after 10 seconds
# os.system("shutdown -s -t 10")

[Figures: accuracy and loss trend graphs]

Testing

Single Image Prediction

1. Import dependencies

import cv2
import numpy as np
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.models import load_model
import time
import os
import albumentations

2. Set global parameters

Note that the order of the dictionary here must match the order used during training (see the sketch after the code block).

norm_size=224
imagelist=[]
emotion_labels = {
    0: 'Black-grass',
    1: 'Charlock',
    2: 'Cleavers',
    3: 'Common Chickweed',
    4: 'Common wheat',
    5: 'Fat Hen',
    6: 'Loose Silky-bent',
    7: 'Maize',
    8: 'Scentless Mayweed',
    9: 'Shepherds Purse',
    10: 'Small-flowered Cranesbill',
    11: 'Sugar beet',
}
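
To keep this mapping in sync with training automatically, you could instead invert the training dict (an illustrative alternative, assuming dicClass from train.py is accessible):

# emotion_labels = {v: k for k, v in dicClass.items()}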

3. Set the image normalization parameters

The normalization parameters must be consistent with those used for validation during training:

val_transform = albumentations.Compose([
        albumentations.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, p=1.0)
    ])

4. Load the model

emotion_classifier=load_model("my_model.h5")

5. Process the image

The logic for processing images is similar to that of the training step:

  • Read the image.
  • Normalize the image.
  • Resize the image to norm_size×norm_size.
  • Convert the image to an array.
  • Append it to imagelist.
  • Convert the list to a numpy array.
image = cv2.imdecode(np.fromfile('data/test/0a64e3e6c.png', dtype=np.uint8), -1)
image = val_transform(image=image)['image']
image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
image = img_to_array(image)
imagelist.append(image)
imageList = np.array(imagelist, dtype="float")


6. Predict the category

Run the prediction and take the index of the highest-scoring class.

t1 = time.time()  # start timing the prediction
pre = np.argmax(emotion_classifier.predict(imageList))
emotion = emotion_labels[pre]
t2 = time.time()
print(emotion)
t3 = t2 - t1
print(t3)

Result:

[Figure: single-image prediction output]

Batch Prediction

Batch prediction differs from single-image prediction mainly in how the data is read and how the predicted categories are processed after prediction; everything else is unchanged.

Steps:

  • Load the model.
  • Define the test-set directory.
  • Get the images in the directory.
  • Loop over the images:
    • Read the image.
    • Normalize the image.
    • Resize the image.
    • Convert it to an array.
    • Append it to imageList.
  • Predict.
t1 = time.time()  # start timing the batch prediction
predict_dir = 'data/test'
test11 = os.listdir(predict_dir)
for file in test11:
    filepath = os.path.join(predict_dir, file)
    image = cv2.imdecode(np.fromfile(filepath, dtype=np.uint8), -1)
    image = val_transform(image=image)['image']
    image = cv2.resize(image, (norm_size, norm_size), interpolation=cv2.INTER_LANCZOS4)
    image = img_to_array(image)
    imagelist.append(image)
imageList = np.array(imagelist, dtype="float")
out = emotion_classifier.predict(imageList)
print(out)
pre = [np.argmax(i) for i in out]

class_name_list = [emotion_labels[i] for i in pre]
print(class_name_list)
t2 = time.time()
t3 = t2 - t1
print(t3)

Result:

[Figure: batch prediction output]

Full code: download.csdn.net/download/hh…

Origin: juejin.im/post/7085115355667890213