Coggle 30 Days of ML Check-In Task 3: Apple Disease Model Training and Prediction

Task 3: Apple Disease Model Training and Prediction

  • Difficulty/Score: Medium/2

Check-in content:

  1. Contestant name: AppleDoctor
  2. Completion date: 2023.6.11
  3. Task completion status:
    • Programming languages used: Python, PyTorch
    • Implemented features:
      • Custom dataset loading
      • Custom CNN model
      • Model training and validation
      • Prediction on the test set

Background

This check-in task is the third task in Coggle 30 Days of ML. It requires contestants to build a model on the provided apple disease dataset, train it, and make predictions. Contestants may choose a suitable deep learning framework and model architecture, train on the training set, and then use the trained model to predict the apple leaf disease images in the test set.

Task name                                                              Difficulty/Score
Task 1: Data visualization for the two competition questions          Low/1
Task 2: Apple disease data loading and data augmentation              Medium/2
Task 3: Apple disease model training and prediction                   Medium/2
Task 4: Apple disease model optimization and multi-fold training      High/3
Task 5: Building detection data loading and data augmentation         High/2
Task 6: Building detection model training and prediction              Medium/2
Task 7: Building detection model optimization and multi-fold training High/3

Custom Dataset

First, we use PyTorch to create a custom dataset suited to our task. Here I define a class named AppleDataset.

# Imports used by the dataset class
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

# Dataset class for the apple disease images
class AppleDataset(Dataset):
    def __init__(self, img_path, transform=None):
        """
        Constructor: store the image paths and the augmentation pipeline.
        Args:
            - img_path: list of paths to the images in the dataset
            - transform: torchvision.transforms pipeline of augmentation operations
        """
        self.img_path = img_path
        self.transform = transform

    def __getitem__(self, index):
        """
        Fetch one sample.
        Args:
            - index: index into the dataset
        Returns:
            - img: Tensor, the transformed image
            - label: Tensor, the class label
        """
        img = Image.open(self.img_path[index])
        if self.transform is not None:
            img = self.transform(img)

        # Convert the class name (the parent folder) to an integer label
        class_name = self.img_path[index].split('/')[-2]
        if class_name in ['d1', 'd2', 'd3', 'd4', 'd5', 'd6', 'd7', 'd8', 'd9']:
            label = ['d1', 'd2', 'd3', 'd4', 'd5', 'd6', 'd7', 'd8', 'd9'].index(class_name)
        else:
            label = -1  # test images have no class folder, so they get a dummy label

        return img, torch.from_numpy(np.array(label))

    def __len__(self):
        """
        Return the size of the dataset.
        Returns:
            - size: int, the number of images
        """
        return len(self.img_path)

Next, we apply the data augmentation pipeline set up in the previous task to the dataset. The training transform uses the following operations (the full pipeline is in the code below):

  1. Resize: transforms.Resize((224, 224)) resizes the image to 224x224 pixels.
  2. Random horizontal flip: transforms.RandomHorizontalFlip() flips the image horizontally with probability 0.5.
  3. Random vertical flip: transforms.RandomVerticalFlip() flips the image vertically with probability 0.5.
  4. Random Gaussian blur and color jitter: transforms.RandomApply wraps transforms.GaussianBlur and transforms.ColorJitter (brightness, contrast, saturation) so that each is applied with the stated probability.
  5. Convert to tensor: transforms.ToTensor() converts the image to a tensor.
  6. Normalize: transforms.Normalize(mean, std) standardizes the image with the given per-channel mean and standard deviation.
from torchvision import transforms

# Per-channel normalization statistics for the dataset
image_mean = [0.4940, 0.4187, 0.3855]
image_std = [0.2048, 0.1941, 0.1932]

# Augmentation pipeline for the training set
transform_train = transforms.Compose([
    transforms.Resize((224, 224)), # resize the image to 224x224
    transforms.RandomHorizontalFlip(), # horizontal flip with probability 0.5
    transforms.RandomVerticalFlip(), # vertical flip with probability 0.5
    transforms.RandomApply([transforms.GaussianBlur(kernel_size=3)], p=0.1), # Gaussian blur (kernel size 3) with probability 0.1
    transforms.RandomApply([transforms.ColorJitter(brightness=0.9)], p=0.5), # brightness jitter in [0.1, 1.9] with probability 0.5
    transforms.RandomApply([transforms.ColorJitter(contrast=0.9)], p=0.5), # contrast jitter in [0.1, 1.9] with probability 0.5
    transforms.RandomApply([transforms.ColorJitter(saturation=0.9)], p=0.5), # saturation jitter in [0.1, 1.9] with probability 0.5
    transforms.ToTensor(), # convert the image to a tensor
    transforms.Normalize(image_mean, image_std), # standardize the image
])

# Preprocessing pipeline for the validation set (no random augmentation)
transform_valid = transforms.Compose([
    transforms.Resize((224, 224)), # resize the image to 224x224
    transforms.ToTensor(), # convert the image to a tensor
    transforms.Normalize(image_mean, image_std), # standardize the image
])
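
The image_mean and image_std values above were presumably estimated from the training images themselves; a minimal sketch of how such per-channel statistics can be computed (paths is a placeholder for the list of training image paths, not a name from the original code):

# Sketch: estimate per-channel mean/std over a list of image paths
import numpy as np
from PIL import Image

def channel_stats(paths, size=(224, 224)):
    total = np.zeros(3)     # per-channel sum of pixel values
    total_sq = np.zeros(3)  # per-channel sum of squared pixel values
    n_pixels = 0
    for p in paths:
        img = np.asarray(Image.open(p).convert('RGB').resize(size)) / 255.0
        total += img.reshape(-1, 3).sum(axis=0)
        total_sq += (img.reshape(-1, 3) ** 2).sum(axis=0)
        n_pixels += img.shape[0] * img.shape[1]
    mean = total / n_pixels
    std = np.sqrt(total_sq / n_pixels - mean ** 2)  # Var[x] = E[x^2] - E[x]^2
    return mean, std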

We can then use a custom dataset class to load the data.
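The path lists train_path and val_path were prepared in the earlier data-loading task. As one plausible, hypothetical reconstruction (assuming root points at the dataset directory and using scikit-learn, neither of which appears in the original code), they could be produced with a stratified 8:2 split:

# Hypothetical sketch of the 8:2 split; `root` is an assumed variable
import glob
from sklearn.model_selection import train_test_split

# Each training image lives under its class folder d1..d9
all_paths = glob.glob(f'{root}/train/*/*')
labels = [p.split('/')[-2] for p in all_paths]

# Stratify by class so both splits share the same label distribution
train_path, val_path = train_test_split(
    all_paths, test_size=0.2, stratify=labels, random_state=42)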

train_data = AppleDataset(train_path, transform_train)
valid_data = AppleDataset(val_path, transform_valid)

print("Classes: {}".format(classes_name))
print("Training set: {}".format(len(train_data)))
print("Validation set: {}".format(len(valid_data)))

As the output shows, the dataset contains roughly 10,000 images, split into a training set and a validation set at a ratio of 8:2; the held-out validation set lets us detect overfitting.

Classes: ['d1', 'd2', 'd3', 'd4', 'd5', 'd6', 'd7', 'd8', 'd9']
Training set: 8165
Validation set: 2046

Next, load the training and validation data using PyTorch's DataLoader. You can adjust the batch_size and num_workers parameters to your needs. (On Windows, num_workers may need to be set to 0; if you hit GPU memory problems, reduce batch_size.)

# Load the training set with PyTorch's DataLoader
train_loader = torch.utils.data.DataLoader(
        train_data, batch_size=64, shuffle=True, num_workers=8, pin_memory=True)

# Load the validation set (no shuffling needed)
val_loader = torch.utils.data.DataLoader(
        valid_data, batch_size=64, shuffle=False, num_workers=8, pin_memory=True)
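
Before training, it is worth pulling a single batch to confirm the tensor shapes and label range; a quick sanity check:

# Fetch one batch to verify shapes and labels
imgs, labels = next(iter(train_loader))
print(imgs.shape)    # expected: torch.Size([64, 3, 224, 224])
print(labels.shape)  # expected: torch.Size([64])
print(labels.min().item(), labels.max().item())  # labels should lie in 0..8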

Custom CNN Model

In this part, we use the PyTorch Image Models (timm) library to quickly build CNN models. timm is a deep learning library created by Ross Wightman that collects state-of-the-art computer vision models, layers, utilities, optimizers, schedulers, data loaders, augmentations, and training/validation scripts. It can reproduce ImageNet training results and makes it easy to build models and adjust parameters such as the number of classes and input channels.

Here is an example of building a resnet34 model using timm:

import timm

# Create a resnet34 model from the timm library
model = timm.create_model('resnet34')

# Generate a single 224x224 RGB image
x = torch.randn(1, 3, 224, 224)

# Run a forward pass and print the shape of the output
print(model(x).shape)

The output should be:

torch.Size([1, 1000])

The output shape is correct: a tensor of size [1, 1000], meaning the model has 1000 output classes. However, our dataset has only 9 classes, not ImageNet's 1000. timm provides a parameter to change the number of classes in the final fully connected layer, and it can also load pretrained weights for transfer learning. In the following example we switch to resnet18 and first make sure the whole pipeline runs:

model = timm.create_model('resnet18', num_classes=9, pretrained=True)
x     = torch.randn(1, 3, 224, 224)
model(x).shape

The output should be:

Downloading model.safetensors: 100% 46.8M/46.8M [00:03<00:00, 17.2MB/s]
torch.Size([1, 9])

In this way, we built a resnet18 model with the number of classes set to 9 and loaded pretrained weights, which helps with transfer learning.
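
Beyond simply loading pretrained weights, a common transfer-learning variant (not used in this check-in, shown only as a sketch) is to freeze the backbone and train just the classifier head:

# Optional sketch: freeze the backbone for linear probing
for param in model.parameters():
    param.requires_grad = False
# timm models expose the classification head via get_classifier()
for param in model.get_classifier().parameters():
    param.requires_grad = True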

To learn more about how to use timm, refer to its documentation and repository:

  • timm documentation: https://timm.fast.ai
  • timm repository: https://github.com/rwightman/pytorch-image-models

Model Training and Validation

Before training the model, we first need to define the loss function and optimizer. In this example, we use the cross-entropy loss function and the SGD optimizer.

import torch.nn as nn

# Flag for GPU availability (assumed to be set once, up front)
cuda = torch.cuda.is_available()

# Training hyperparameters
epochs = 10
optimizer = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)

# Initialize the loss function
if cuda:
    loss_fn = nn.CrossEntropyLoss().cuda()
else:
    loss_fn = nn.CrossEntropyLoss()

Next, for convenience during training, we define a function to compute the model's accuracy:

def get_acc(outputs, label):
    """
    Compute the accuracy for one batch.
    Args:
    - outputs: Tensor, the model outputs
    - label: Tensor, the ground-truth labels
    Returns:
    - acc: Tensor, the batch accuracy
    """
    total = outputs.shape[0]
    probs, pred_y = outputs.data.max(dim=1) # max probability and predicted class per sample
    correct = (pred_y == label).sum().data
    acc = correct / total
    return acc
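
A quick check of get_acc on a toy batch:

# Toy example: two samples, three classes; the second prediction is wrong
outputs = torch.tensor([[0.1, 0.8, 0.1],
                        [0.7, 0.2, 0.1]])
labels = torch.tensor([1, 2])
print(get_acc(outputs, labels))  # tensor(0.5000)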

Then we start training. After each epoch, we save the model state so it can later be used for inference or to resume training.

During training, we also run a validation step to evaluate the model on the validation set.

import os
from tqdm import tqdm

epoch_step = len(train_loader)
if epoch_step == 0:
    raise ValueError("Training set is too small to train on; add more data or reduce the batch size")

epoch_step_val = len(val_loader)

if cuda:
    model.cuda()

os.makedirs('checkpoint', exist_ok=True)
best_acc = 0
for epoch in range(epochs):
    model.train()
    train_loss = 0
    train_acc = 0
    print('Start Train')
    with tqdm(total=epoch_step, desc=f'Epoch {epoch + 1}/{epochs}', postfix=dict, mininterval=0.3) as pbar:
        for step, (im, label) in enumerate(train_loader, start=0):
            with torch.no_grad():
                if cuda:
                    im = im.cuda()
                    label = label.cuda()

            #----------------------#
            #   Zero the gradients
            #----------------------#
            optimizer.zero_grad()
            #----------------------#
            #   Forward pass
            #----------------------#
            outputs = model(im)
            #----------------------#
            #   Compute the loss
            #----------------------#
            loss = loss_fn(outputs, label)
            #----------------------#
            #   Backward pass
            #----------------------#
            loss.backward()
            # Update the parameters
            optimizer.step()

            train_loss += loss.data
            train_acc += get_acc(outputs, label)
            lr = optimizer.param_groups[0]['lr']
            # Update the progress bar with running statistics
            pbar.set_postfix(**{
                'Train Loss': "{:.6f}".format(train_loss.item() / (step + 1)),
                'Train Acc': "{:.6f}".format(train_acc.item() / (step + 1)),
                'Lr': "{:.6f}".format(lr),
                'Memory': "{:.3g}G".format(torch.cuda.memory_reserved() / 1E9)
            })
            pbar.update(1)

    train_loss = train_loss.item() / len(train_loader)
    train_acc = train_acc.item() * 100 / len(train_loader)
    acc = train_acc
    print("Epoch: {}, Train Loss: {:.6f}, Train Acc: {:.6f}".format(epoch+1, train_loss, train_acc))

    state = {
        'net': model.state_dict(),
        'acc': acc,
        'epoch': epoch+1,
        'optimizer': optimizer.state_dict(),
    }
    torch.save(state, './checkpoint/last_resnet34_ckpt.pth')  # path for the latest checkpoint

    if epoch_step_val != 0:
        model.eval()
        val_loss = 0
        val_acc = 0
        print('Start Val')
        #--------------------------------
        #   Same procedure as training, but without gradient updates
        #--------------------------------
        with tqdm(total=epoch_step_val, desc=f'Epoch {epoch + 1}/{epochs}', postfix=dict, mininterval=0.3) as pbar2:
            for step, (im, label) in enumerate(val_loader, start=0):
                with torch.no_grad():
                    if cuda:
                        im = im.cuda()
                        label = label.cuda()

                    #----------------------#
                    #   Forward pass
                    #----------------------#
                    outputs = model(im)

                    loss = loss_fn(outputs, label)
                    val_loss += loss.data
                    val_acc += get_acc(outputs, label)
                    pbar2.set_postfix(**{
                        'Val Loss': "{:.6f}".format(val_loss.item() / (step + 1)),
                        'Val Acc': "{:.6f}".format(val_acc.item() / (step + 1)),
                        'Memory': "{:.3g}G".format(torch.cuda.memory_reserved() / 1E9)
                    })
                    pbar2.update(1)

        lr = optimizer.param_groups[0]['lr']
        val_acc = val_acc.item() * 100 / len(val_loader)
        val_loss = val_loss.item() / len(val_loader)
        acc = val_acc
        print("Epoch: {}, Val Loss: {:.6f}, Val Acc: {:.6f}".format(epoch+1, val_loss, val_acc))

    # Save a checkpoint whenever the accuracy improves
    if acc > best_acc:
        print('Saving Best Model...')
        state = {
            'net': model.state_dict(),
            'acc': acc,
            'epoch': epoch+1,
            'optimizer': optimizer.state_dict(),
        }
        torch.save(state, './checkpoint/best_resnet34_ckpt.pth')
        best_acc = acc
Start Train
Epoch 1/10: 100%|██████████| 128/128 [00:18<00:00,  7.06it/s, Lr=0.050000, Memory=1.93G, Train Acc=0.549712, Train Loss=1.246984]
Epoch: 1, Train Loss: 1.246984, Train Acc: 54.971230
Start Val
Epoch 1/10: 100%|██████████| 32/32 [00:05<00:00,  5.92it/s, Memory=1.93G, Val Acc=0.765074, Val Loss=0.703293]
Epoch: 1, Val Loss: 0.703293, Val Acc: 76.507372
Saving Best Model...
Start Train
Epoch 2/10: 100%|██████████| 128/128 [00:17<00:00,  7.37it/s, Lr=0.050000, Memory=1.93G, Train Acc=0.745348, Train Loss=0.759731]
Epoch: 2, Train Loss: 0.759731, Train Acc: 74.534816
Start Val
Epoch 2/10: 100%|██████████| 32/32 [00:04<00:00,  7.71it/s, Memory=1.93G, Val Acc=0.839828, Val Loss=0.468506]
Epoch: 2, Val Loss: 0.468506, Val Acc: 83.982801
Saving Best Model...
Start Train
Epoch 3/10: 100%|██████████| 128/128 [00:17<00:00,  7.48it/s, Lr=0.050000, Memory=1.93G, Train Acc=0.809956, Train Loss=0.566048]
Epoch: 3, Train Loss: 0.566048, Train Acc: 80.995631
Start Val
Epoch 3/10: 100%|██████████| 32/32 [00:04<00:00,  6.73it/s, Memory=1.93G, Val Acc=0.873488, Val Loss=0.382040]
Epoch: 3, Val Loss: 0.382040, Val Acc: 87.348789
Saving Best Model...
Start Train
Epoch 4/10: 100%|██████████| 128/128 [00:19<00:00,  6.66it/s, Lr=0.050000, Memory=1.93G, Train Acc=0.839854, Train Loss=0.477816]
Epoch: 4, Train Loss: 0.477816, Train Acc: 83.985364
Start Val
Epoch 4/10: 100%|██████████| 32/32 [00:05<00:00,  6.09it/s, Memory=1.93G, Val Acc=0.887664, Val Loss=0.329272]
Epoch: 4, Val Loss: 0.329272, Val Acc: 88.766378
Saving Best Model...
Start Train
Epoch 5/10: 100%|██████████| 128/128 [00:17<00:00,  7.33it/s, Lr=0.050000, Memory=1.93G, Train Acc=0.862014, Train Loss=0.408341]
Epoch: 5, Train Loss: 0.408341, Train Acc: 86.201435
Start Val
Epoch 5/10: 100%|██████████| 32/32 [00:04<00:00,  6.64it/s, Memory=1.93G, Val Acc=0.924757, Val Loss=0.251357]
Epoch: 5, Val Loss: 0.251357, Val Acc: 92.475742
Saving Best Model...
Start Train
Epoch 6/10: 100%|██████████| 128/128 [00:17<00:00,  7.43it/s, Lr=0.050000, Memory=1.93G, Train Acc=0.876175, Train Loss=0.372916]
Epoch: 6, Train Loss: 0.372916, Train Acc: 87.617451
Start Val
Epoch 6/10: 100%|██████████| 32/32 [00:04<00:00,  6.86it/s, Memory=1.93G, Val Acc=0.908628, Val Loss=0.268046]
Epoch: 6, Val Loss: 0.268046, Val Acc: 90.862840
Start Train
Epoch 7/10: 100%|██████████| 128/128 [00:19<00:00,  6.54it/s, Lr=0.050000, Memory=1.93G, Train Acc=0.881424, Train Loss=0.353227]
Epoch: 7, Train Loss: 0.353227, Train Acc: 88.142353
Start Val
Epoch 7/10: 100%|██████████| 32/32 [00:05<00:00,  6.40it/s, Memory=1.93G, Val Acc=0.936980, Val Loss=0.213115]
Epoch: 7, Val Loss: 0.213115, Val Acc: 93.698019
Saving Best Model...
Start Train
Epoch 8/10: 100%|██████████| 128/128 [00:18<00:00,  6.91it/s, Lr=0.050000, Memory=1.93G, Train Acc=0.890658, Train Loss=0.313763]
Epoch: 8, Train Loss: 0.313763, Train Acc: 89.065802
Start Val
Epoch 8/10: 100%|██████████| 32/32 [00:05<00:00,  5.44it/s, Memory=1.93G, Val Acc=0.924301, Val Loss=0.221276]
Epoch: 8, Val Loss: 0.221276, Val Acc: 92.430067
Start Train
Epoch 9/10: 100%|██████████| 128/128 [00:17<00:00,  7.28it/s, Lr=0.050000, Memory=1.93G, Train Acc=0.900245, Train Loss=0.303409]
Epoch: 9, Train Loss: 0.303409, Train Acc: 90.024549
Start Val
Epoch 9/10: 100%|██████████| 32/32 [00:05<00:00,  6.01it/s, Memory=1.93G, Val Acc=0.926207, Val Loss=0.237639]
Epoch: 9, Val Loss: 0.237639, Val Acc: 92.620653
Start Train
Epoch 10/10: 100%|██████████| 128/128 [00:18<00:00,  6.80it/s, Lr=0.050000, Memory=1.93G, Train Acc=0.907415, Train Loss=0.280949]
Epoch: 10, Train Loss: 0.280949, Train Acc: 90.741462
Start Val
Epoch 10/10: 100%|██████████| 32/32 [00:06<00:00,  5.20it/s, Memory=1.93G, Val Acc=0.944808, Val Loss=0.167078]
Epoch: 10, Val Loss: 0.167078, Val Acc: 94.480848
Saving Best Model...

The final validation accuracy reaches about 94.5%, and there is still room for improvement.

Make predictions on the test set

Next, we use the trained model to make predictions on the test set and generate the final CSV file for submission.
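
Since the best weights were written to ./checkpoint/best_resnet34_ckpt.pth during training, it makes sense to reload them before inference; a minimal sketch, using the checkpoint keys saved above:

# Reload the best checkpoint saved during training
checkpoint = torch.load('./checkpoint/best_resnet34_ckpt.pth')
model.load_state_dict(checkpoint['net'])
print("Loaded epoch {} with acc {:.2f}".format(checkpoint['epoch'], checkpoint['acc']))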

Next, define a function to perform model inference.

from tqdm import tqdm

def predict(test_loader, model):
    """
    Run model inference over the test set.
    Args:
    - test_loader: DataLoader for the test set
    - model: the trained model
    Returns:
    - test_pred: numpy array of model outputs
    """
    # Switch the model to evaluation mode
    model.eval()

    test_pred = []
    with torch.no_grad():
        # Iterate over the test set
        for inputs, _ in tqdm(test_loader):
            inputs = inputs.cuda()

            # Forward pass
            output = model(inputs)
            test_pred.append(output.data.cpu().numpy())

    # Stack the per-batch outputs into one numpy array
    return np.vstack(test_pred)

Next, load the test set data and run inference 10 times, summing the outputs of each pass, which amounts to ensembling the model with itself. Note, however, that transform_valid is deterministic, so the ten passes produce identical results here; see the sketch after the following code block for how to turn this into genuine test-time augmentation.

import glob

# Collect the test image paths
test_path = glob.glob(f'{root}/test/*')

# Build the test dataset
test_data = AppleDataset(test_path, transform_valid)

# Load the test set with PyTorch's DataLoader
test_loader = torch.utils.data.DataLoader(
        test_data, batch_size=1, shuffle=False, num_workers=0, pin_memory=True)

pred = None
print("-----------------Repeat 10 times-----------------")

# Run inference 10 times and sum the outputs
for _ in range(10):
    if pred is None:
        pred = predict(test_loader, model)
    else:
        pred += predict(test_loader, model)
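
As noted above, transform_valid is deterministic, so the ten passes are identical. For the repetition to behave as real test-time augmentation, the test transform would need some randomness; a minimal sketch (transform_tta is a hypothetical name, not from the original code):

# A randomized test-time transform so repeated inference passes actually differ
transform_tta = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(image_mean, image_std),
])

# test_data would then be built with transform_tta instead of transform_valid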

Finally, save the inference results in a CSV file for submission to the competition platform.

import datetime
import pandas as pd

# Build a DataFrame with the predicted labels
submit = pd.DataFrame(
    {
        'uuid': [x.split('/')[-1] for x in test_path],
        'label': [['d1', 'd2', 'd3', 'd4', 'd5', 'd6', 'd7', 'd8', 'd9'][x] for x in pred.argmax(1)]
    }
)

# Timestamp for the CSV filename
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")

# Make sure the output directory exists, then save the DataFrame
os.makedirs('submit', exist_ok=True)
label_csv_file = f'submit/predictions_{timestamp}.csv'
submit = submit.sort_values(by='uuid')
submit.to_csv(label_csv_file, index=None)

# Print the save path
print("Predictions saved to: {}".format(label_csv_file))

Summary

In this part, I completed the training and prediction tasks for the apple disease model. I used Python and PyTorch, defined a custom dataset class named AppleDataset, and loaded the training and validation data with PyTorch's DataLoader. For the model itself, I used the PyTorch Image Models (timm) library to quickly build a CNN, then trained and validated it with the cross-entropy loss function and the SGD optimizer.

Going forward, I will further optimize and improve the model: for example, by trying more complex architectures, tuning hyperparameters, experimenting with different optimization algorithms, or adding other data augmentation methods to improve generalization.

Origin blog.csdn.net/weixin_45508265/article/details/131183652