Deep learning practice (2): Implementing flower image classification with AlexNet

AlexNet is explained in detail in my previous blog post:
https://blog.csdn.net/muye_IT/article/details/123602605?spm=1001.2014.3001.5501
The code is available on GitHub (stars are appreciated!):
https://github.com/Jasper0420/Deep-Learning-Practice-AlexNet
More Ai information: Princess AiCharm

1. Dataset Introduction

  Download the flower classification dataset flower_data: http://download.tensorflow.org/example_images/flower_photos.tgz (copy the link into your browser)

  • flower_photos (unpacked dataset folder, 3670 samples)
  • train (generated training set, 3306 samples)
  • val (generated validation set, 364 samples)

    How do we divide the dataset into a training set and a validation set?
    The steps are as follows:
    (1) Create a new folder "flower_data" under the data_set folder.
    (2) Download the flower classification dataset from http://download.tensorflow.org/example_images/flower_photos.tgz (copy the link into your browser).
    (3) Unzip the dataset into the flower_data folder.
    (4) Run the "split_data.py" script to automatically divide the dataset into a training set (train) and a validation set (val).
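The sample counts listed above can be checked with a little arithmetic. The sketch below assumes the per-class image counts commonly reported for this archive (they are not stated in this post, so verify them against your own unpacked folder) and applies the same int(n * 0.1) rule that split_data.py uses for each class.

```python
# Per-class image counts are an assumption (commonly reported for flower_photos);
# check them against your own unpacked folder.
class_counts = {"daisy": 633, "dandelion": 898, "roses": 641,
                "sunflowers": 699, "tulips": 799}
split_rate = 0.1  # same ratio as split_data.py


def split_sizes(counts, rate):
    """Return (train_total, val_total) when int(n * rate) images per class go to val."""
    val_total = sum(int(n * rate) for n in counts.values())
    train_total = sum(counts.values()) - val_total
    return train_total, val_total


train_total, val_total = split_sizes(class_counts, split_rate)
print(train_total, val_total)
```

Because the rounding happens per class, the validation set is slightly smaller than 10% of the whole dataset.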

The code of split_data.py is as follows. When using your own data set, just modify the file path in the code.

import os
from shutil import copy
import random

def mkfile(file):
    if not os.path.exists(file):
        os.makedirs(file)
        
# Collect every folder name under flower_photos except .txt files (the 5 flower class names)
file_path = 'flower_data/flower_photos'
flower_class = [cla for cla in os.listdir(file_path) if ".txt" not in cla]

# Create the training folder train, with one subfolder per class
mkfile('flower_data/train')
for cla in flower_class:
    mkfile('flower_data/train/'+cla)

# Create the validation folder val, with one subfolder per class
mkfile('flower_data/val')
for cla in flower_class:
    mkfile('flower_data/val/'+cla)

# Split ratio, training set : validation set = 9 : 1
split_rate = 0.1

# Walk through all images of the 5 classes and split them by the ratio above
for cla in flower_class:
    cla_path = file_path + '/' + cla + '/'  # subfolder of one flower class
    images = os.listdir(cla_path)           # names of all images in that folder
    num = len(images)
    eval_index = random.sample(images, k=int(num*split_rate))  # randomly pick k image names
    for index, image in enumerate(images):
        # eval_index holds the image names chosen for the validation set val
        if image in eval_index:
            image_path = cla_path + image
            new_path = 'flower_data/val/' + cla
            copy(image_path, new_path)  # copy the selected image to the new path

        # all remaining images go to the training set train
        else:
            image_path = cla_path + image
            new_path = 'flower_data/train/' + cla
            copy(image_path, new_path)
        print("\r[{}] processing [{}/{}]".format(cla, index+1, num), end="")  # processing bar
    print()

print("processing done!")
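The script above draws a different validation set on every run because random.sample is not seeded. If a repeatable split is wanted, one possible variation is sketched below. The helper name split_names, the seed value, and the file names are hypothetical and not part of the original script; the sketch only shows the selection logic and copies no files.

```python
import random


def split_names(names, rate=0.1, seed=0):
    """Split file names into (train, val) reproducibly."""
    rng = random.Random(seed)            # local RNG, leaves the global state alone
    ordered = sorted(names)              # os.listdir order is not guaranteed
    val = set(rng.sample(ordered, k=int(len(ordered) * rate)))
    train = [n for n in ordered if n not in val]
    return train, sorted(val)


names = ["img_{:03d}.jpg".format(i) for i in range(50)]  # hypothetical file names
train, val = split_names(names)
print(len(train), len(val))
```

Using a set for the validation names also makes the membership test cheaper than searching a list for every image.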

2. AlexNet network introduction

AlexNet is explained in detail in my previous blog post: https://blog.csdn.net/muye_IT/article/details/123602605?spm=1001.2014.3001.5501
  AlexNet deepens the network structure of LeNet in order to learn richer, higher-dimensional image features. Features of AlexNet:

  1. It uses a convolutional neural network structure of stacked convolutional layers followed by fully connected layers.
  2. It was among the first large-scale networks to use the ReLU function as the activation function.
  3. It was an early adopter of Dropout regularization to control overfitting.
  4. It speeds up convergence by training with mini-batch gradient descent with momentum.
  5. It uses data augmentation to greatly reduce overfitting during training.
  6. It uses the parallel computing capability of the GPU to accelerate the training and inference of the network.

    The AlexNet network has in total 5 convolutional layers, 3 pooling layers, 2 local response normalization layers, and 3 fully connected layers.
    Layer statistics:
    AlexNet has 8 layers in total: 5 convolutional layers (CONV1-CONV5) and 3 fully connected layers (FC6-FC8).
  • When counting network layers, only the convolutional layers and the fully connected layers are counted.
  • The pooling layers and normalization layers post-process the feature maps output by the convolutional layers before them, and are not counted as separate layers.


3. model.py implementation

Note:
The original paper trained the network on two GPUs. My computer has only one GPU, so the code uses only half of the channels in each layer (for example, 48 instead of 96 in the first convolution), which is equivalent to using only the lower half of the network structure in the original paper. When I ran the full network as well, the accuracy obtained with half of the parameters was almost the same as with the full parameters.

import torch.nn as nn
import torch

class AlexNet(nn.Module):
    def __init__(self, num_classes=1000, init_weights=False):
        super(AlexNet, self).__init__()
        # Wrap the layers in nn.Sequential() to keep the code compact
        self.features = nn.Sequential(   # convolutional layers extract image features
            nn.Conv2d(3, 48, kernel_size=11, stride=4, padding=2),  # input[3, 224, 224]  output[48, 55, 55]
            nn.ReLU(inplace=True),                                  # modify the input in place to save memory
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[48, 27, 27]
            nn.Conv2d(48, 128, kernel_size=5, padding=2),           # output[128, 27, 27]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[128, 13, 13]
            nn.Conv2d(128, 192, kernel_size=3, padding=1),          # output[192, 13, 13]
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 192, kernel_size=3, padding=1),          # output[192, 13, 13]
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 128, kernel_size=3, padding=1),          # output[128, 13, 13]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[128, 6, 6]
        )
        self.classifier = nn.Sequential(   # fully connected layers classify the image
            nn.Dropout(p=0.5),             # Dropout randomly zeroes activations; the default rate is 0.5
            nn.Linear(128 * 6 * 6, 2048),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(2048, 2048),
            nn.ReLU(inplace=True),
            nn.Linear(2048, num_classes),
        )
        if init_weights:
            self._initialize_weights()
            
    # Forward pass
    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)  # flatten before the fully connected layers
        x = self.classifier(x)
        return x

    # Weight initialization; PyTorch already initializes weights by default when layers are built
    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):                            # convolutional layer
                nn.init.kaiming_normal_(m.weight, mode='fan_out',   # Kaiming (He) normal initialization
                                        nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)                    # set the bias to 0
            elif isinstance(m, nn.Linear):            # fully connected layer
                nn.init.normal_(m.weight, 0, 0.01)    # normal distribution, mean 0, std 0.01
                nn.init.constant_(m.bias, 0)          # set the bias to 0
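The output sizes written in the comments of self.features follow from the usual formula for convolution and pooling layers, floor((size + 2 * padding - kernel) / stride) + 1. The sketch below re-derives them without PyTorch; the helper name out_size and the layer labels are only illustrative.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv/pool layer (floor division, no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


# (label, kernel, stride, padding) for the conv and pooling layers in self.features
layers = [
    ("conv1", 11, 4, 2), ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),  ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),  ("conv4", 3, 1, 1), ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
]

size = 224  # input images are 224 x 224
sizes = {}
for name, k, s, p in layers:
    size = out_size(size, k, s, p)
    sizes[name] = size
    print(name, size)

flat_features = 128 * size * size  # 128 channels after the last pooling layer
print(flat_features)
```

The final value is the input width of the first nn.Linear layer, which is why it is written as 128 * 6 * 6 in the classifier.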

4. train.py implementation

train.py loads the dataset and trains the network: the loss is computed on the training set, the accuracy is computed on the validation set, and the best network parameters are saved.

4.1 Importing the required packages

import os
import sys
import json
import time
import torch
import torch.nn as nn
from torchvision import transforms, datasets, utils
import matplotlib.pyplot as plt
import numpy as np
import torch.optim as optim
from tqdm import tqdm
from model import AlexNet

4.2 Data preprocessing

    data_transform = {
        "train": transforms.Compose([transforms.RandomResizedCrop(224),  # random crop, then resize to 224x224
                                     transforms.RandomHorizontalFlip(0.5),  # flip horizontally with probability 0.5
                                     transforms.ToTensor(),
                                     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]),
        "val": transforms.Compose([transforms.Resize((224, 224)),  # must be (224, 224); Resize(224) only scales the shorter side
                                   transforms.ToTensor(),
                                   transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])}
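To make the effect of Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) concrete, here is the arithmetic it applies to each channel value, written in plain Python. The function names are illustrative only; the real transform works on whole tensors.

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-channel arithmetic applied by Normalize: (x - mean) / std."""
    return (x - mean) / std


def unnormalize(y, mean=0.5, std=0.5):
    """Inverse mapping, the same as y / 2 + 0.5 when mean = std = 0.5."""
    return y * std + mean


for pixel in (0.0, 0.25, 0.5, 1.0):   # ToTensor() scales pixel values to [0, 1]
    print(pixel, "->", normalize(pixel))
```

With these settings the range [0, 1] produced by ToTensor() is mapped to [-1, 1], and the inverse is what an image display helper needs to undo before plotting.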

4.3 Loading the training set

This time the flower classification dataset is not included in PyTorch's torchvision.datasets, so we cannot load it the way we loaded CIFAR10 with torchvision.datasets.CIFAR10 when building the LeNet network. Instead, we import it with datasets.ImageFolder(). ImageFolder() returns a dataset object in which each item is an (image, label) tuple, with the labels taken from the subfolder names. It supports indexing and iteration, and can be used as the input of torch.utils.data.DataLoader. For details, please refer to: pytorch ImageFolder and Dataloader load self-made image datasets

    # Get the path of the image dataset
    data_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))  # get data root path
    image_path = os.path.join(data_root, "data_set", "flower_data")  # flower data set path
    assert os.path.exists(image_path), "{} path does not exist.".format(image_path)
    # Load the training set and apply the preprocessing
    train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),
                                         transform=data_transform["train"])
    train_num = len(train_dataset)
    # Store the index-to-label mapping in a json file so that predict.py can read it
    # Dictionary of class: index, {'daisy':0, 'dandelion':1, 'roses':2, 'sunflowers':3, 'tulips':4}
    flower_list = train_dataset.class_to_idx
    # Swap the keys and values of flower_list
    cla_dict = dict((val, key) for key, val in flower_list.items())
    # Write cla_dict to a json file
    json_str = json.dumps(cla_dict, indent=4)
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    batch_size = 64
    nw = 0  # number of workers
    print('Using {} dataloader workers every process'.format(nw))
    # Load the training set in batches of batch_size
    train_loader = torch.utils.data.DataLoader(train_dataset,
                                               batch_size=batch_size, shuffle=True,
                                               num_workers=nw)
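One detail of the JSON step is easy to miss: JSON object keys are always strings, so the integer indices written to class_indices.json come back as "0", "1", and so on. The sketch below shows the round trip with the standard json module only; the mapping is typed in by hand to mirror class_to_idx for this dataset, and predicted_index is a made-up value.

```python
import json

# Same mapping as train_dataset.class_to_idx for this dataset (folder names sorted alphabetically)
class_to_idx = {"daisy": 0, "dandelion": 1, "roses": 2, "sunflowers": 3, "tulips": 4}
idx_to_class = {idx: name for name, idx in class_to_idx.items()}

restored = json.loads(json.dumps(idx_to_class, indent=4))  # round trip through JSON text
print(restored)

predicted_index = 4  # hypothetical model output
print(restored[str(predicted_index)])
```

This is why predict.py looks up the class name with str(...) around the predicted index.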

4.4 Loading the validation set

    validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"),
                                            transform=data_transform["val"])
    val_num = len(validate_dataset)
    validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                                  batch_size=4, shuffle=False,
                                                  num_workers=nw)
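The number of steps per epoch (train_steps in the training code) follows directly from the dataset size and the batch size. A small sketch of that arithmetic, assuming the 3306/364 split from section 1 and the default drop_last=False of DataLoader; the helper name num_batches is illustrative.

```python
import math


def num_batches(num_samples, batch_size, drop_last=False):
    """Batches per epoch, matching len(DataLoader) for a map-style dataset."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


print(num_batches(3306, 64))  # training set, batch_size=64
print(num_batches(364, 4))    # validation set, batch_size=4
```

The last training batch is smaller than 64 because 3306 is not a multiple of the batch size.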

4.5 Training and validating the network

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    net = AlexNet(num_classes=5, init_weights=True)  # instantiate the network (5 output classes, initialize weights)
    net.to(device)  # move the network to the chosen device (GPU/CPU)
    loss_function = nn.CrossEntropyLoss()  # cross-entropy loss
    # pata = list(net.parameters())
    optimizer = optim.Adam(net.parameters(), lr=0.0002)  # optimizer (parameters to train, learning rate)

    epochs = 10
    save_path = './AlexNet.pth'
    best_acc = 0.0
    train_steps = len(train_loader)
    # training
    for epoch in range(epochs):
        # train
        net.train()  # enable Dropout during training
        running_loss = 0.0  # reset running_loss at the start of every epoch
        time_start = time.perf_counter()  # time one training epoch
        train_bar = tqdm(train_loader, file=sys.stdout)  # progress bar over the training batches
        for step, data in enumerate(train_bar):  # iterate over the training set, step starts at 0
            images, labels = data  # images and labels of the current batch
            optimizer.zero_grad()  # clear the gradients from the previous step
            outputs = net(images.to(device))
            loss = loss_function(outputs, labels.to(device))
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
            # show the current loss in the progress bar
            train_bar.set_description("train epoch[{}/{}] loss:{:.3f}".format(epoch + 1, epochs, loss.item()))
        print('%f s' % (time.perf_counter()-time_start))  # time spent on this epoch

        # validation
        net.eval()  # disable Dropout during validation
        acc = 0.0  # accumulate accurate number / epoch
        with torch.no_grad():
            val_bar = tqdm(validate_loader, file=sys.stdout)
            for val_data in val_bar:
                val_images, val_labels = val_data
                outputs = net(val_images.to(device))
                predict_y = torch.max(outputs, dim=1)[1]  # index of the largest output is the predicted label
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()

        val_accurate = acc / val_num
        print('[epoch %d] train_loss: %.3f  val_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))
        # save the parameters of the epoch with the highest validation accuracy
        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)

    print('Finished Training')
if __name__ == '__main__':
    main()
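The validation accuracy in the loop above is the number of samples whose largest output matches the label, divided by the size of the validation set. The same bookkeeping can be sketched in plain Python on made-up numbers; the logits and labels below are hypothetical, and the helper names are not part of train.py.

```python
def argmax(values):
    """Index of the largest value, like torch.max(outputs, dim=1)[1] for one row."""
    return max(range(len(values)), key=lambda i: values[i])


def count_correct(batch_logits, batch_labels):
    """Number of rows whose argmax equals the label."""
    return sum(argmax(row) == label for row, label in zip(batch_logits, batch_labels))


# Two hypothetical validation batches of logits for 5 classes
batches = [
    ([[2.0, 0.1, 0.3, 0.0, 0.5], [0.2, 0.1, 0.3, 0.0, 1.5]], [0, 3]),
    ([[0.0, 3.0, 0.3, 0.0, 0.5], [0.2, 0.1, 2.3, 0.0, 0.5]], [1, 2]),
]
correct = sum(count_correct(logits, labels) for logits, labels in batches)
total = sum(len(labels) for _, labels in batches)
val_accurate = correct / total
print(val_accurate)
```

Accumulating the count of correct predictions and dividing once at the end gives the right result even when the last batch is smaller than the others.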

4.6 Complete code

import os
import sys
import json
import torch
import time
import torch.nn as nn
from torchvision import transforms, datasets, utils
import matplotlib.pyplot as plt
import numpy as np
import torch.optim as optim
from tqdm import tqdm
from model import AlexNet


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.".format(device))

    data_transform = {
        "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]),
        "val": transforms.Compose([transforms.Resize((224, 224)),  # must be (224, 224), not 224
                                   transforms.ToTensor(),
                                   transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])}

    data_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))  # get data root path
    image_path = os.path.join(data_root, "data_set", "flower_data")  # flower data set path
    assert os.path.exists(image_path), "{} path does not exist.".format(image_path)
    train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),
                                         transform=data_transform["train"])
    train_num = len(train_dataset)

    # {'daisy':0, 'dandelion':1, 'roses':2, 'sunflowers':3, 'tulips':4}
    flower_list = train_dataset.class_to_idx
    cla_dict = dict((val, key) for key, val in flower_list.items())
    # write dict into json file
    json_str = json.dumps(cla_dict, indent=4)
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    batch_size = 64
    nw = 0  # number of workers
    print('Using {} dataloader workers every process'.format(nw))

    train_loader = torch.utils.data.DataLoader(train_dataset,
                                               batch_size=batch_size, shuffle=True,
                                               num_workers=nw)

    validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"),
                                            transform=data_transform["val"])
    val_num = len(validate_dataset)
    validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                                  batch_size=4, shuffle=False,
                                                  num_workers=nw)

    print("using {} images for training, {} images for validation.".format(train_num,
                                                                           val_num))
    # test_data_iter = iter(validate_loader)
    # test_image, test_label = next(test_data_iter)
    #
    # def imshow(img):
    #     img = img / 2 + 0.5  # unnormalize
    #     npimg = img.numpy()
    #     plt.imshow(np.transpose(npimg, (1, 2, 0)))
    #     plt.show()
    #
    # print(' '.join('%5s' % cla_dict[test_label[j].item()] for j in range(4)))
    # imshow(utils.make_grid(test_image))

    net = AlexNet(num_classes=5, init_weights=True)  # instantiate the network (5 output classes, initialize weights)
    net.to(device)  # move the network to the chosen device (GPU/CPU)
    loss_function = nn.CrossEntropyLoss()  # cross-entropy loss
    # pata = list(net.parameters())
    optimizer = optim.Adam(net.parameters(), lr=0.0002)  # optimizer (parameters to train, learning rate)

    epochs = 10
    save_path = './AlexNet.pth'
    best_acc = 0.0
    train_steps = len(train_loader)
    # training
    for epoch in range(epochs):
        # train
        net.train()  # enable Dropout during training
        running_loss = 0.0  # reset running_loss at the start of every epoch
        time_start = time.perf_counter()  # time one training epoch
        train_bar = tqdm(train_loader, file=sys.stdout)  # progress bar over the training batches
        for step, data in enumerate(train_bar):  # iterate over the training set, step starts at 0
            images, labels = data  # images and labels of the current batch
            optimizer.zero_grad()  # clear the gradients from the previous step
            outputs = net(images.to(device))
            loss = loss_function(outputs, labels.to(device))
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
            # show the current loss in the progress bar
            train_bar.set_description("train epoch[{}/{}] loss:{:.3f}".format(epoch + 1, epochs, loss.item()))
        print('%f s' % (time.perf_counter()-time_start))  # time spent on this epoch

        # validate
        net.eval()  # disable Dropout during validation
        acc = 0.0  # accumulate accurate number / epoch
        with torch.no_grad():
            val_bar = tqdm(validate_loader, file=sys.stdout)
            for val_data in val_bar:
                val_images, val_labels = val_data
                outputs = net(val_images.to(device))
                predict_y = torch.max(outputs, dim=1)[1]  # index of the largest output is the predicted label
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()

        val_accurate = acc / val_num
        print('[epoch %d] train_loss: %.3f  val_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))
        # save the parameters of the epoch with the highest validation accuracy
        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)

    print('Finished Training')
    
if __name__ == '__main__':
    main()

4.7 Bugs resolved

During training, many people encounter this error:
OSError: [WinError 1455] The page file is too small to complete the operation.
Error loading "E:\Anaconda3\lib\site-packages\torch\lib\shm.dll" or one of its dependencies.
There are usually three ways to fix it:

  1. Restart PyCharm.
  2. Set num_workers to 0.
  3. Increase the size of the page file and change the batch_size.

I use the second method: because I am training on Windows, I set num_workers to 0.
If training on Linux, set num_workers to

 nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])
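The two cases above (0 on Windows, the min(...) rule elsewhere) can be folded into one helper so the same script runs on both systems. This is only a sketch: the function name pick_num_workers and its parameters are hypothetical, and the fallback for os.cpu_count() returning None is an added assumption.

```python
import os
import sys


def pick_num_workers(batch_size, platform=sys.platform, max_workers=8):
    """Return 0 on Windows, otherwise the same min(...) rule as the line above."""
    if platform.startswith("win"):
        return 0
    cpu_count = os.cpu_count() or 1   # os.cpu_count() may return None
    return min(cpu_count, batch_size if batch_size > 1 else 0, max_workers)


print(pick_num_workers(64))
```

The result can be passed as num_workers to both DataLoader calls in train.py.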

Finally, training runs normally.

5. predict.py implementation

import os
import json

import torch
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt

from model import AlexNet


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    data_transform = transforms.Compose(
        [transforms.Resize((224, 224)),
         transforms.ToTensor(),
         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

    # load image
    img_path = "../tulip.jpg"
    assert os.path.exists(img_path), "file: '{}' does not exist.".format(img_path)
    img = Image.open(img_path)

    plt.imshow(img)
    # [N, C, H, W]
    img = data_transform(img)
    # expand batch dimension
    img = torch.unsqueeze(img, dim=0)

    # read class_indict
    json_path = './class_indices.json'
    assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)

    with open(json_path, "r") as json_file:
        class_indict = json.load(json_file)

    # create model
    model = AlexNet(num_classes=5).to(device)

    # load model weights
    weights_path = "./AlexNet.pth"
    assert os.path.exists(weights_path), "file: '{}' does not exist.".format(weights_path)
    model.load_state_dict(torch.load(weights_path, map_location=device))  # map_location lets GPU-trained weights load on a CPU
    # disable Dropout
    model.eval()
    with torch.no_grad():
        # predict class
        output = torch.squeeze(model(img.to(device))).cpu()
        predict = torch.softmax(output, dim=0)
        predict_cla = torch.argmax(predict).numpy()

    print_res = "class: {}   prob: {:.3}".format(class_indict[str(predict_cla)],
                                                 predict[predict_cla].numpy())
    plt.title(print_res)
    for i in range(len(predict)):
        print("class: {:10}   prob: {:.3}".format(class_indict[str(i)],
                                                  predict[i].numpy()))
    plt.show()


if __name__ == '__main__':
    main()
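The last step of predict.py turns the five raw network outputs into probabilities with softmax and picks the largest one. The sketch below reproduces that step in plain Python; the logits are made-up values and class_indict is typed in by hand to mirror class_indices.json.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class_indict = {"0": "daisy", "1": "dandelion", "2": "roses", "3": "sunflowers", "4": "tulips"}
logits = [0.2, -1.0, 0.5, 0.1, 3.0]   # hypothetical network output for one image

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)
print("class: {}   prob: {:.3}".format(class_indict[str(best)], probs[best]))
```

Softmax does not change which class has the largest score; it only rescales the outputs so they sum to 1 and can be reported as probabilities.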

To test the model, download flower pictures from the Internet.
The network can also be trained with the free GPU on Google Colab.

Origin blog.csdn.net/muye_IT/article/details/123895360