PyTorch tutorial (code explained line by line)

0. Environment setup

1. Import the required packages

import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

torch is the package name for PyTorch.
DataLoader (from torch.utils.data) wraps a dataset in an iterable used to read data in batches.
torchvision is PyTorch's computer-vision package; datasets provides vision-related data sets.
transforms provides image transformations such as ToTensor.

2. Download the data set (prepare the data set)

# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

datasets.FashionMNIST is a data set used for clothing recognition. FashionMNIST is a very popular image classification dataset, containing 70,000 28x28 grayscale images in 10 categories.
Of course, torchvision ships many other built-in datasets; see the torchvision.datasets documentation for the full list.
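
For illustration, here is a minimal sketch (not from the original post; CIFAR-10 is chosen only as an example) of loading a different built-in dataset with the same pattern:

# A minimal sketch: CIFAR-10 contains 60,000 32x32 color images in 10 classes.
cifar_train = datasets.CIFAR10(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)
print(len(cifar_train))  # 50000 training images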

3. Load the data set

batch_size = 64

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {
      
      X.shape}")
    print(f"Shape of y: {
      
      y.shape} {
      
      y.dtype}")
    break

DataLoader is a very useful module in PyTorch. It is mainly used to load data in batches; especially when the data set is very large, DataLoader can greatly improve data loading speed and reduce memory usage.
The main functions of DataLoader include:
Batch processing: DataLoader divides the data into multiple batches, each containing a fixed number of samples, and processes one batch at a time, which greatly reduces memory usage.
Data shuffling: by setting shuffle=True, DataLoader randomly shuffles the order of the data set at the beginning of each epoch, which improves the generalization ability of the model (see the sketch after this list).
batch_size is the number of samples read at a time; here it is set to read 64 images per batch.
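
For example, a minimal sketch (an addition for illustration) of a training loader with shuffling enabled:

# A minimal sketch: shuffle=True re-randomizes the sample order at the
# start of every epoch.
shuffled_train_dataloader = DataLoader(training_data, batch_size=batch_size, shuffle=True)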

4. Create the model

# Get cpu, gpu or mps device for training.
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {
      
      device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

super().__init__() calls the __init__() method of the parent class (nn.Module).
self.flatten = nn.Flatten() creates a layer whose main function is to convert multidimensional input (such as 2-D images) into one dimension per sample. This operation is often called "flattening".
In this example, each batch from the data loader has shape [N, 1, 28, 28]. The nn.Flatten() layer keeps the batch dimension and flattens everything else into [N, 784] (1*28*28 = 784 feature values per image), so that the subsequent linear layers (nn.Linear) can operate on it.
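
For illustration, a minimal sketch (not part of the original post) of what nn.Flatten() does to the shapes:

# A batch of 64 single-channel 28x28 images, as produced by the data loader
example = torch.rand(64, 1, 28, 28)
flat = nn.Flatten()(example)  # keeps dim 0 (the batch), flattens the rest
print(flat.shape)             # torch.Size([64, 784])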

nn.Sequential is a module in PyTorch for creating sequential neural network models. It is an ordered container that can hold any number of other modules; when you feed data into an nn.Sequential model, the data passes through each module in the order defined in the container.
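
A minimal sketch of this ordering (an illustration, not from the original post); nn.Sequential also supports indexing as seq[i], which lets us apply each module by hand:

# nn.Sequential applies its modules in order, so both lines below compute
# the same result for an input x of shape [64, 784].
seq = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 10))
x = torch.rand(64, 784)
out_a = seq(x)                       # via the container
out_b = seq[2](seq[1](seq[0](x)))    # applying each module by hand
print(torch.allclose(out_a, out_b))  # True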

5. Set the optimizer and loss function

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

There are many types of loss functions; nn.CrossEntropyLoss is a standard choice for multi-class classification. See the PyTorch documentation for others.
There are also many types of optimizers, such as ASGD, Adam, etc. See the PyTorch documentation for the full list.
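
For example, a minimal sketch of swapping in a different optimizer (Adam is chosen here purely for illustration; the learning rate 1e-3 is kept from the tutorial):

# Swapping optimizers is a one-line change.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)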

6. Model training

Define the training process

def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {
      
      loss:>7f}  [{
      
      current:>5d}/{
      
      size:>5d}]")

Each training step takes one batch of images and labels from the data loader, computes the prediction loss, and then performs backpropagation and an optimizer step to update the model's parameters.
item(): .item() is a method used to extract a scalar value from a tensor. If the tensor contains exactly one element, that element is returned as a plain Python number; if it contains more than one element, an error is thrown.
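
A minimal sketch of this behavior (an illustration, not from the original post):

t = torch.tensor([3.5])
print(t.item())  # 3.5 -- returned as a plain Python float
# torch.tensor([1.0, 2.0]).item() would raise a RuntimeError (two elements)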

def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {
      
      (100*correct):>0.1f}%, Avg loss: {
      
      test_loss:>8f} \n")

correct += (pred.argmax(1) == y).type(torch.float).sum().item(): Explanation:
(pred.argmax(1) == y): First, argmax(1) obtains the predicted category of each sample (the index of the largest logit along dimension 1). The predicted classes are then compared with the true classes (==). This returns a Boolean tensor indicating whether the prediction for each sample was correct.
(pred.argmax(1) == y).type(torch.float): Next, .type(torch.float) explicitly converts the Boolean tensor to floating point (True becomes 1.0, False becomes 0.0) so that it can be summed.
(pred.argmax(1) == y).type(torch.float).sum(): Then sum() counts the total number of correct predictions in the batch by adding up all elements of the tensor.
correct += …: Finally, the batch's count of correct predictions is accumulated into the variable correct; += adds the result of the expression on the right to the variable on the left.
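
A minimal worked example with made-up numbers (for illustration only):

pred = torch.tensor([[0.1, 0.9], [0.8, 0.2]])  # logits for 2 samples, 2 classes
y = torch.tensor([1, 1])                       # true classes
matches = pred.argmax(1) == y                  # tensor([True, False])
print(matches.type(torch.float).sum().item())  # 1.0 -- one correct prediction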

7. Set the number of training epochs

epochs = 5
for t in range(epochs):
    print(f"Epoch {
      
      t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

8. Save the model

torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

model.state_dict(): Explanation:
model.state_dict() returns a dictionary containing all parameters of the model, and torch.save() saves this dictionary to a file on disk.
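
A minimal sketch (for illustration) of inspecting what the saved dictionary contains:

for name, param in model.state_dict().items():
    print(name, tuple(param.shape))
# linear_relu_stack.0.weight (512, 784)
# linear_relu_stack.0.bias (512,)
# ... and so on for each linear layer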

9. Load the model

model = NeuralNetwork().to(device)
model.load_state_dict(torch.load("model.pth"))
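
A variant sketch (not in the original post): map_location is a standard torch.load argument that remaps tensors saved from a GPU onto the target device, which is useful when loading on a CPU-only machine:

model.load_state_dict(torch.load("model.pth", map_location=device))
model.eval()  # switch to evaluation mode before running inference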

10. Model testing

classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    x = x.to(device)
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')

The complete code:

import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

batch_size = 64

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {
      
      X.shape}")
    print(f"Shape of y: {
      
      y.shape} {
      
      y.dtype}")
    break

# Get cpu, gpu or mps device for training.
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {
      
      device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {
      
      loss:>7f}  [{
      
      current:>5d}/{
      
      size:>5d}]")

def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {
      
      (100*correct):>0.1f}%, Avg loss: {
      
      test_loss:>8f} \n")

epochs = 5
for t in range(epochs):
    print(f"Epoch {
      
      t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")
