[Deep Learning Experiment] Feedforward Neural Network (final): Customize the iris classification feedforward neural network model and conduct training and evaluation

Table of contents

1. Experiment introduction

2. Experimental environment

1. Configure the virtual environment

2. Library version introduction

3. Experimental content

0. Import necessary toolkits

1. Build a data set (IrisDataset)

2. Build model (FeedForward)

a. __init__(initialization)

b. forward (forward propagation)

3. Integrate training, evaluation, and prediction processes (Runner)

4. Model Evaluation (Accuracy)

5. __main__

6. Code integration


1. Experiment introduction

        Iris classification (predicting the species of an iris from its feature measurements) is a common machine learning problem. This experiment uses PyTorch to build a simple feedforward neural network for iris classification, then trains and evaluates it, in order to understand the basic usage of PyTorch and the model training process.

2. Experimental environment

    This series of experiments uses the PyTorch deep learning framework. The relevant operations are as follows:

1. Configure the virtual environment

conda create -n DL python=3.7 
conda activate DL
pip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
conda install matplotlib
conda install scikit-learn

2. Library version introduction

Package          Version in this experiment    Latest version (at time of writing)
matplotlib       3.5.3                         3.8.0
numpy            1.21.6                        1.26.0
python           3.7.16
scikit-learn     0.22.1                        1.3.0
torch            1.8.1+cu102                   2.0.1
torchaudio       0.8.1                         2.0.2
torchvision      0.9.1+cu102                   0.15.2
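
To verify that the installed packages match the versions above, a quick check can be run inside the DL environment:

import matplotlib
import numpy
import sklearn
import torch
import torchaudio
import torchvision

# Print the installed version of each package used in this experiment
for pkg in (matplotlib, numpy, sklearn, torch, torchaudio, torchvision):
    print(pkg.__name__, pkg.__version__)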

3. Experimental content

        The following overview of feedforward neural networks was generated with ChatGPT:

        A feedforward neural network, also known as a multilayer perceptron (MLP), is a common artificial neural network model. It is based on forward propagation and is mainly used to solve classification and regression problems.

        A feedforward neural network consists of multiple layers: an input layer, one or more hidden layers, and an output layer. The name "feedforward" comes from the fact that signals flow only forward through the network, from the input layer through the hidden layers to the output layer, with no feedback connections.

Here's how feedforward neural networks generally work:

  1. Input layer: receives the raw data or feature vector as the network's input, with each input dimension represented by one neuron. Each neuron weights its inputs and transforms them through an activation function to produce an output signal.

  2. Hidden layers: a feedforward neural network can contain one or more hidden layers, each consisting of multiple neurons. The neurons in a hidden layer receive the outputs of the previous layer, compute a weighted sum, transform it with an activation function, and pass the result on to the next layer.

  3. Output layer: The output of the last hidden layer is passed to the output layer, which usually consists of one or more neurons. The neurons in the output layer use appropriate activation functions (such as Sigmoid, Softmax, etc.) according to the type of problem to be solved (classification or regression) to output the final result.

  4. Forward propagation: the process of passing signals from the input layer through the hidden layers to the output layer is called forward propagation. During forward propagation, each neuron multiplies the outputs of the previous layer by the corresponding weights and passes the result to the next layer, layer by layer, until the final output is produced (a minimal numeric sketch follows this list).

  5. Loss function and training: training a feedforward neural network usually involves defining a loss function that measures the difference between the model's predicted output and the true label. Common loss functions include Mean Squared Error and Cross-Entropy. Using backpropagation and an optimization algorithm (such as gradient descent), the network adjusts its parameters along the gradient so as to minimize the loss.
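
To make the forward pass in steps 1-4 concrete, here is a minimal hand-computed example in PyTorch; the layer sizes, weights, and input values are arbitrary and chosen purely for illustration:

import torch

# Input layer: one sample with 2 features
x = torch.tensor([1.0, 2.0])
# Hidden layer: 2 neurons, each computing a weighted sum followed by a sigmoid
W1 = torch.tensor([[0.5, -0.3],
                   [0.1, 0.8]])
b1 = torch.tensor([0.0, 0.1])
h = torch.sigmoid(W1 @ x + b1)
# Output layer: 1 neuron, left linear here (as for a regression output)
W2 = torch.tensor([[1.0, -1.0]])
b2 = torch.tensor([0.2])
y = W2 @ h + b2
print(h, y)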

        The advantages of feedforward neural networks include the ability to model complex nonlinear relationships, suitability for a variety of problem types, and the ability to learn feature representations automatically through training. They also have some weaknesses, such as susceptibility to overfitting and difficulty handling large-scale, high-dimensional data. To cope with these challenges, improved network structures and training techniques have been proposed, such as Convolutional Neural Networks and Recurrent Neural Networks.

This series focuses on the experiments and does not explain the theory in detail.

(Ahem, I actually don’t have time to sort it out. I’ll come back and fill in the gaps when I have the opportunity)


0. Import necessary toolkits

import torch
from torch import nn
import torch.nn.functional as F
# Toolkit used for plotting
import matplotlib.pyplot as plt
# Import the iris dataset
from sklearn.datasets import load_iris
# Build our own dataset by inheriting from the Dataset class
from torch.utils.data import Dataset, DataLoader

1. Build a data set (IrisDataset)

  • This experiment uses the iris dataset, which contains 150 samples; each sample has 4 features and 1 label.
  • The load_data function (built on sklearn's load_iris) loads the dataset and applies min-max normalization (see the sketch below);
  • The custom IrisDataset class handles data loading and builds the training, validation, and test sets.
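
The normalization mentioned above is plain per-feature min-max scaling; a self-contained sketch of just that step (it mirrors the load_data function in section 6 below):

import torch
from sklearn.datasets import load_iris

x = torch.tensor(load_iris().data)
# Scale each feature to [0, 1] using its per-column minimum and maximum
x_min = torch.min(x, dim=0).values
x_max = torch.max(x, dim=0).values
x = (x - x_min) / (x_max - x_min)
print(x.min().item(), x.max().item())  # 0.0 1.0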

[Deep Learning Experiment] Feedforward Neural Network (7): Loading data in batches (loading data directly → defining class encapsulation data) - QomolangmaH's blog (CSDN): https://blog.csdn.net/m0_63834988/article/details/133181882?spm=1001.2014.3001.5501

2. Build the model (FeedForward)

        This experiment constructs a simple two-layer feedforward neural network. The biggest difference from the MLP class implemented previously is that the earlier class used an activation function we wrote ourselves, which could not update parameters through backpropagation; the deep learning framework's built-in activations already handle this for us. (In fact, with simple changes, our own activation function can also propagate gradients backward; see the sketch below.)

[Deep Learning Experiment] Feedforward Neural Network (3): Customized multi-layer perceptron (activation function logistic, linear layer calculation Linear) - QomolangmaH's blog (CSDN): https://blog.csdn.net/m0_63834988/article/details/133097102?spm=1001.2014.3001.5501
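
As an illustration of that remark, an activation implemented as a torch.autograd.Function does propagate gradients. The MySigmoid class below is not part of this experiment; it is a minimal sketch of a hand-written sigmoid with a backward pass:

import torch

class MySigmoid(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        y = 1 / (1 + torch.exp(-x))
        ctx.save_for_backward(y)
        return y

    @staticmethod
    def backward(ctx, grad_output):
        # d(sigmoid)/dx = y * (1 - y)
        (y,) = ctx.saved_tensors
        return grad_output * y * (1 - y)

x = torch.randn(4, requires_grad=True)
MySigmoid.apply(x).sum().backward()
print(x.grad)  # matches torch.sigmoid(x) * (1 - torch.sigmoid(x))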

class FeedForward(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(FeedForward,self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, output_size)
        self.act = nn.Sigmoid()
    
    def forward(self, inputs):
        outputs = self.fc1(inputs)
        outputs = self.act(outputs)
        outputs = self.fc2(outputs)
        return outputs

a. __init__(initialization)

  • Three parameters:
    • input_size (input size)
    • hidden_size (hidden layer size)
    • output_size (output size)
  • Call the parent class nn.Module's initialization method super(FeedForward, self).__init__() to ensure the class is correctly initialized as an nn.Module.
  • Two linear layers self.fc1 and self.fc2:
    • The input size of self.fc1 is input_size and the output size is hidden_size;
    • The input size of self.fc2 is hidden_size and the output size is output_size.
  • An activation function self.act; here nn.Sigmoid(), the Sigmoid activation function, is used.

b. forward (forward propagation)

  • Accepts an input tensor inputs. During the forward propagation process,
    • The input passes through the self.fc1 linear layer,
    • Then perform nonlinear transformation through the self.act activation function,
    • Then go through the self.fc2 linear layer to get the final output tensor outputs
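
As a quick shape check (using the FeedForward class defined above, with the sizes used later for the iris task):

import torch

model = FeedForward(input_size=4, hidden_size=6, output_size=3)
print(model.fc1.weight.shape)  # torch.Size([6, 4])
x = torch.randn(8, 4)          # a dummy batch of 8 samples
logits = model(x)
print(logits.shape)            # torch.Size([8, 3]) -- raw scores, no softmax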

3. Integrate training, evaluation, and prediction processes (Runner)

        The Runner class encapsulates the process of model training and evaluation.

  • The initialization function receives the model, optimizer, loss function, and evaluation metric, and defines member variables that record the loss and metric changes during training.
  • The train function performs model training
  • The evaluate function evaluates the model
  • The predict function performs model predictions
  • The save_model and load_model functions are used to save and load model parameters

[Deep Learning Experiment] Feedforward Neural Network (9): Integrated Training, Evaluation, and Prediction Process (Runner) - QomolangmaH's blog (CSDN): https://blog.csdn.net/m0_63834988/article/details/133219448?spm=1001.2014.3001.5501

4. Model Evaluation (Accuracy)

[Deep Learning Experiment] Feedforward Neural Network (8): Model Evaluation (Customize the Accuracy class that supports batch evaluation) - QomolangmaH's blog (CSDN): https://blog.csdn.net/m0_63834988/article/details/133186305?spm=1001.2014.3001.5501
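
The core of the multi-class branch of the Accuracy class is an argmax over the class dimension; a self-contained illustration of that logic:

import torch

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 1.5, 0.3]])  # two samples, three classes
labels = torch.tensor([0, 1])
# Take the index of the largest logit as the predicted class
preds = torch.argmax(logits, dim=1)
acc = (preds == labels).float().mean()
print(acc.item())  # 1.0 -- both predictions are correct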

5. __main__

if __name__ == '__main__':
    batch_size = 16

    # Build the training, validation, and test sets
    train_dataset = IrisDataset(mode='train')
    dev_dataset = IrisDataset(mode='dev')
    test_dataset = IrisDataset(mode='test')

    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    dev_loader = DataLoader(dev_dataset, batch_size=batch_size)
    test_loader = DataLoader(test_dataset, batch_size=1, shuffle=True)

    input_size = 4
    output_size = 3
    hidden_size = 6
    # Define the model
    model = FeedForward(input_size, hidden_size, output_size)
    # Define the loss function
    loss_fn = F.cross_entropy
    # Define the optimizer
    optimizer = torch.optim.SGD(model.parameters(), lr=0.2)
    # Define the evaluation metric
    metric = Accuracy(is_logist=True)
    # Instantiate the auxiliary Runner class
    runner = Runner(model, optimizer, loss_fn, metric)
    # Train the model
    runner.train(train_loader, dev_loader, num_epochs=50, log_steps=10, eval_steps=5)
    # After training, the network parameters are saved automatically to .pth files in the same directory as the training script
    model_path = 'model_25.pth'
    # First load the parameters of the trained network
    runner.load_model(model_path)
    x, label = next(iter(test_loader))
    print(runner.predict(x.float()))
    print(label)

  • batch_size = 16 sets the batch size for the data loaders;

  • Construct the dataset objects for the training, validation, and test sets;

  • Create the data loaders;

  • Set the model’s input size, output size, and hidden layer size:

    • input_size = 4 The input size is 4, corresponding to the number of features of the iris dataset.
    • output_size = 3 The output size is 3, corresponding to the number of categories of the iris dataset.
    • hidden_size = 6 The hidden layer size is 6, which is the number of units in the hidden layer of the feedforward neural network model.
  • Define the model, loss function, optimizer and evaluation metrics:

    • Feedforward neural network model: the FeedForward class, with the input, hidden-layer, and output sizes set above.
    • Loss function: the cross-entropy loss function F.cross_entropy.
    • Optimizer: stochastic gradient descent (SGD) with a learning rate of 0.2.
    • Evaluation metric: the Accuracy class.
  • Create an auxiliary Runner object for training and evaluating the model.

  • Perform model training:

    • Set the training data loader to train_loader and the validation data loader to dev_loader, train for 50 epochs, print a log line every 10 steps, and evaluate every 5 epochs.
  • After training, load the saved model parameters by calling runner.load_model(model_path).

  • Fetch a test sample x and its label label, call runner.predict(x.float()) to predict the sample, and print the prediction followed by the true label label.
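
One detail worth noting: F.cross_entropy combines log-softmax and negative log-likelihood internally, which is why the model's forward pass returns raw logits without a softmax. A minimal illustration:

import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0]])  # raw scores for 3 classes
label = torch.tensor([0])
# Equivalent to F.nll_loss(F.log_softmax(logits, dim=1), label)
print(F.cross_entropy(logits, label))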

6. Code integration

# Import necessary toolkits
import torch
from torch import nn
import torch.nn.functional as F
# Toolkit used for plotting
import matplotlib.pyplot as plt
# Import the iris dataset
from sklearn.datasets import load_iris
# Build our own dataset by inheriting from the Dataset class
from torch.utils.data import Dataset, DataLoader


# This function loads the iris dataset
def load_data(shuffle=True):
    x = torch.tensor(load_iris().data)
    y = torch.tensor(load_iris().target)

    # Min-max normalize each feature to [0, 1]
    x_min = torch.min(x, dim=0).values
    x_max = torch.max(x, dim=0).values
    x = (x - x_min) / (x_max - x_min)

    if shuffle:
        idx = torch.randperm(x.shape[0])
        x = x[idx]
        y = y[idx]
    return x, y


class IrisDataset(Dataset):
    def __init__(self, mode='train', num_train=120, num_dev=15):
        super(IrisDataset, self).__init__()
        x, y = load_data(shuffle=True)
        if mode == 'train':
            self.x, self.y = x[:num_train], y[:num_train]
        elif mode == 'dev':
            self.x, self.y = x[num_train:num_train + num_dev], y[num_train:num_train + num_dev]
        else:
            self.x, self.y = x[num_train + num_dev:], y[num_train + num_dev:]

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

    def __len__(self):
        return len(self.x)


class FeedForward(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(FeedForward, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, output_size)
        self.act = nn.Sigmoid()

    def forward(self, inputs):
        outputs = self.fc1(inputs)
        outputs = self.act(outputs)
        outputs = self.fc2(outputs)
        return outputs


# Accuracy class that supports batched model evaluation
class Accuracy:
    def __init__(self, is_logist=True):
        # Number of correctly predicted samples
        self.num_correct = 0
        # Total number of samples
        self.num_count = 0
        self.is_logist = is_logist

    def update(self, outputs, labels):
        # Check whether this is a binary classification task
        if outputs.shape[1] == 1:
            outputs = outputs.squeeze(-1)
            # Check whether the predictions are logits
            if self.is_logist:
                preds = (outputs >= 0).long()
            else:
                preds = (outputs >= 0.5).long()
        else:
            # For multi-class tasks, take the index of the largest element as the predicted class
            preds = torch.argmax(outputs, dim=1).long()

        # Count the correctly predicted samples in this batch
        labels = labels.squeeze(-1)
        batch_correct = (preds == labels).float().sum()
        batch_count = len(labels)
        # Update the running totals
        self.num_correct += batch_correct
        self.num_count += batch_count

    def accumulate(self):
        # Compute the overall metric from the accumulated counts
        if self.num_count == 0:
            return 0
        return self.num_correct / self.num_count

    def reset(self):
        self.num_correct = 0
        self.num_count = 0


class Runner(object):
    def __init__(self, model, optimizer, loss_fn, metric, **kwargs):
        self.model = model
        self.optimizer = optimizer
        self.loss_fn = loss_fn
        # Used to compute the evaluation metric
        self.metric = metric

        # Record metric changes during training
        self.dev_scores = []
        # Record loss changes during training
        self.train_epoch_losses = []
        self.dev_losses = []
        # Record the best metric seen so far
        self.best_score = 0

    # Model training phase
    def train(self, train_loader, dev_loader=None, **kwargs):
        # Set the model to training mode; in this mode its parameters are updated
        self.model.train()
        num_epochs = kwargs.get('num_epochs', 0)
        log_steps = kwargs.get('log_steps', 100)
        save_path = kwargs.get('save_path', 'best_model.pth')
        eval_steps = kwargs.get('eval_steps', 0)
        # Number of steps run; not equal to the number of epochs
        global_step = 0

        if eval_steps:
            if dev_loader is None:
                raise RuntimeError('Error: dev_loader can not be None!')
            if self.metric is None:
                raise RuntimeError('Error: Metric can not be None')

        # Iterate over the training epochs
        for epoch in range(num_epochs):
            total_loss = 0
            # Iterate over the dataset
            for step, data in enumerate(train_loader):
                x, y = data
                logits = self.model(x.float())
                loss = self.loss_fn(logits, y.long())
                # Accumulate the scalar loss; item() avoids keeping the autograd graph alive
                total_loss += loss.item()
                if log_steps and global_step % log_steps == 0:
                    print(f'loss:{loss.item():.5f}')

                loss.backward()
                self.optimizer.step()
                self.optimizer.zero_grad()
                # Count optimization steps (not epochs)
                global_step += 1
            # Validate every eval_steps epochs; other validation criteria could also be used
            if eval_steps and (epoch + 1) % eval_steps == 0:

                dev_score, dev_loss = self.evaluate(dev_loader, global_step=global_step)
                print(f'[Evaluate] dev score:{dev_score:.5f}, dev loss:{dev_loss:.5f}')

                if dev_score > self.best_score:
                    self.save_model(f'model_{epoch + 1}.pth')

                    print(
                        f'[Evaluate] best accuracy performance has been updated: {self.best_score:.5f}-->{dev_score:.5f}')
                    self.best_score = dev_score

                # After validation, remember to switch the model back to training mode
                self.model.train()

            # Record the average training loss for this epoch
            train_loss = total_loss / len(train_loader)
            self.train_epoch_losses.append((global_step, train_loss))

        print('[Train] Train done')

    # Model evaluation phase
    def evaluate(self, dev_loader, **kwargs):
        assert self.metric is not None
        # Set the model to evaluation mode; in this mode its parameters are not updated
        self.model.eval()
        global_step = kwargs.get('global_step', -1)
        total_loss = 0
        self.metric.reset()

        for batch_id, data in enumerate(dev_loader):
            x, y = data
            logits = self.model(x.float())
            loss = self.loss_fn(logits, y.long()).item()
            total_loss += loss
            self.metric.update(logits, y)

        dev_loss = (total_loss / len(dev_loader))
        self.dev_losses.append((global_step, dev_loss))
        dev_score = self.metric.accumulate()
        self.dev_scores.append(dev_score)
        return dev_score, dev_loss

    # Model prediction phase
    def predict(self, x, **kwargs):
        self.model.eval()
        logits = self.model(x)
        return logits

    # Save the model parameters
    def save_model(self, save_path):
        torch.save(self.model.state_dict(), save_path)

    # Load the model parameters
    def load_model(self, model_path):
        self.model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))


if __name__ == '__main__':
    batch_size = 16

    # Build the training, validation, and test sets
    train_dataset = IrisDataset(mode='train')
    dev_dataset = IrisDataset(mode='dev')
    test_dataset = IrisDataset(mode='test')

    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    dev_loader = DataLoader(dev_dataset, batch_size=batch_size)
    test_loader = DataLoader(test_dataset, batch_size=1, shuffle=True)

    input_size = 4
    output_size = 3
    hidden_size = 6
    # Define the model
    model = FeedForward(input_size, hidden_size, output_size)
    # Define the loss function
    loss_fn = F.cross_entropy
    # Define the optimizer
    optimizer = torch.optim.SGD(model.parameters(), lr=0.2)
    # Define the evaluation metric
    metric = Accuracy(is_logist=True)
    # Instantiate the auxiliary Runner class
    runner = Runner(model, optimizer, loss_fn, metric)
    # Train the model
    runner.train(train_loader, dev_loader, num_epochs=50, log_steps=10, eval_steps=5)
    # After training, the network parameters are saved automatically to .pth files in the same directory as the training script
    model_path = 'model_25.pth'
    # First load the parameters of the trained network
    runner.load_model(model_path)
    x, label = next(iter(test_loader))
    print(runner.predict(x.float()))
    print(label)
