0. [PyTorch Programming] Getting started with PyTorch: basic concepts and a first experience
- References
- Environment configuration
- View the PyTorch version used in the current conda environment
- A simple first experience with PyTorch
- Import related packages
- Download the FashionMNIST dataset
- View dataset dimensions
- Define a neural network
- Calculate the total model parameters and trainable parameters
- Define loss function and optimizer
- Define training process and testing process
- Perform model training
- Save the model
- Load the model
- Test the model
This post is part of my blog column, the PyTorch programming series. For Python environment configuration, refer to "[Python Learning] Windows 10 Start Your Anaconda Installation and Python Environment Management" or "[Python Learning] Pure Terminal Commands to Start Your Anaconda Installation and Python Environment Management".
Author: Chen Yirong
Code environment: Python 3.6, PyTorch 1.4.0, Jupyter Notebook
References
- Pytorch official website: https://pytorch.org/
- PyTorch official tutorial Chinese version: https://www.pytorch123.com/
Environment configuration
Before configuring the PyTorch environment, you need to determine your own development conditions:
- Operating system: Linux, macOS, Windows
- Package management tools: Conda, Pip, …
- Language: Python, C++, Java
- Computing resources: whether a graphics card is available, the graphics card model, the NVIDIA driver version, and the CUDA version
These conditions differ from person to person.
In my case, I use a server running Ubuntu 18.04 with Conda as the package management tool. The graphics card is a GeForce RTX 2080 Ti, the NVIDIA driver version is 495.46, and the highest supported CUDA version is 11.5.
Generally speaking, the operating system and the graphics card model are fixed. The graphics card model determines which NVIDIA drivers are supported, and the NVIDIA driver version determines which CUDA versions are supported. You can search for a suitable NVIDIA driver at https://www.nvidia.cn/geforce/drivers/. For driver configuration, refer to "[Python Learning] Install CUDA and cuDNN from scratch on Ubuntu 18.04".
Once the driver environment is configured, you can install PyTorch with the package management tool. The installation commands look like this:
- Install PyTorch 1.4.0 on Ubuntu 18.04 with CUDA 10.1
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
- Install PyTorch 1.6.0 on Ubuntu 18.04 with CUDA 10.2
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch
These commands can be found on the PyTorch official website. For details, see https://pytorch.org/get-started/previous-versions/
View the PyTorch version used in the current conda environment
import torch
print(torch.__version__)  # Note the double underscores
1.4.0
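Besides the version string, it can also be useful to check whether the installed build matches the CUDA setup described above. The following is a minimal sketch; the helper name describe_torch_env is my own and not part of PyTorch, and the printed values depend entirely on your machine.

```python
import torch


def describe_torch_env():
    """Collect basic facts about the installed PyTorch build (hypothetical helper)."""
    return {
        "torch_version": torch.__version__,           # installed PyTorch version
        "built_with_cuda": torch.version.cuda,        # CUDA version of the build, None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),  # True only if a usable GPU and driver are found
    }


env = describe_torch_env()
for key, value in env.items():
    print(f"{key}: {value}")
```

If cuda_available is False on a machine that has a GPU, the usual suspects are the driver version or a CPU-only PyTorch build.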
A simple first experience with PyTorch
Import related packages
torch is the top-level package. Commonly used packages include:
- torch.nn: provides the basic building blocks for neural networks; see https://pytorch.org/docs/1.4.0/nn.html for details
- torch.utils.data: provides the classes needed for loading data; see https://pytorch.org/docs/1.4.0/data.html for details
- torchvision: provides popular datasets, models, and related tools for computer vision; see https://pytorch.org/docs/1.4.0/torchvision/index.html for details
- torchtext: provides popular datasets, models, and related tools for natural language processing; see https://pytorch.org/text/stable/index.html for details
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
Download the FashionMNIST dataset
This dataset is documented at https://pytorch.org/docs/1.4.0/torchvision/datasets.html#fashion-mnist
Running the following code creates a directory named data in the directory where this file is located, and then creates a FashionMNIST directory inside it, so the dataset is stored in:
./data/FashionMNIST
# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",  # Root directory where the downloaded dataset is stored
    train=True,  # Use the training split
    download=True,  # If true, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again.
    transform=ToTensor(),
)
# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",  # Root directory where the downloaded dataset is stored
    train=False,  # Use the test split
    download=True,
    transform=ToTensor(),
)
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz
Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw
Processing...
Done!
View dataset dimensions
torch.utils.data.DataLoader is the data loading class. Its signature is:
DataLoader(dataset, batch_size=1, shuffle=False, sampler=None,
           batch_sampler=None, num_workers=0, collate_fn=None,
           pin_memory=False, drop_last=False, timeout=0,
           worker_init_fn=None)
Here, dataset is an instance of the torch.utils.data.Dataset class or one of its subclasses, and batch_size is the number of samples read in each batch.
Before training a neural network, you need to create data loading objects. As shown below, two are initialized: train_dataloader and test_dataloader.
batch_size = 64
# Create the data loading objects for the training set and the test set
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)
for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break
Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
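To see how batch_size and drop_last affect the number of batches, here is a small sketch on synthetic data. The 1000 zero-filled images and the names demo_dataset, demo_loader, and demo_loader_dropped are assumptions made for illustration only; they are not FashionMNIST.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 1000 fake grayscale images with integer labels.
demo_images = torch.zeros(1000, 1, 28, 28)
demo_labels = torch.zeros(1000, dtype=torch.int64)
demo_dataset = TensorDataset(demo_images, demo_labels)

demo_loader = DataLoader(demo_dataset, batch_size=64)
batch_sizes = [len(labels) for _, labels in demo_loader]
print(len(demo_loader))   # 16 batches: 15 full ones plus one partial batch
print(batch_sizes[-1])    # 40 samples in the last batch (1000 - 15 * 64)

demo_loader_dropped = DataLoader(demo_dataset, batch_size=64, drop_last=True)
print(len(demo_loader_dropped))  # 15, the partial batch is discarded
```

The same arithmetic applies to the real data loaders: the last batch is smaller whenever the dataset size is not a multiple of batch_size.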
Define a neural network
Defining a neural network is not complicated. In essence, you create a class that inherits from the parent class torch.nn.Module, which is why torch.nn.Module is also called the base class of all neural network modules. The definition follows this format:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        # Initialize the model's modules here
    def forward(self, x):
        # Forward pass of the model: compute logits from x here
        return logits
# Use torch.cuda.is_available() to check whether a GPU is available and choose the device accordingly
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")
# Define the neural network class, which must inherit from the parent class nn.Module
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()  # Flattens each sample to one dimension; equivalent to torch.nn.Flatten(start_dim=1, end_dim=-1)
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),  # Linear layer: input [batch_size, 28*28], output [batch_size, 512]
            nn.ReLU(),  # Activation function
            nn.Linear(512, 512),  # Linear layer: input [batch_size, 512], output [batch_size, 512]
            nn.ReLU(),  # Activation function
            nn.Linear(512, 10)  # Linear layer: input [batch_size, 512], output [batch_size, 10]
        )
    def forward(self, x):
        x = self.flatten(x)  # Flatten x starting from its second dimension: [64, 1, 28, 28] ---> [64, 1*28*28]
        logits = self.linear_relu_stack(x)
        return logits
model = NeuralNetwork().to(device)
print(model)
Using cuda device
NeuralNetwork(
  (flatten): Flatten()
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
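To make the shape changes in the forward pass concrete, the sketch below pushes a random batch through the same kinds of layers one step at a time. The tensor fake_batch is random data with the same shape as a FashionMNIST batch, and the layers here are freshly created for the illustration, so they are not the trained model.

```python
import torch
from torch import nn

# Random stand-in for one batch of images: [N, C, H, W].
fake_batch = torch.rand(64, 1, 28, 28)

flat = nn.Flatten()(fake_batch)                       # keeps dim 0, flattens the rest
hidden = nn.Linear(28 * 28, 512)(flat)                # first linear layer
demo_logits = nn.Linear(512, 10)(torch.relu(hidden))  # one score per class

print(flat.shape)         # torch.Size([64, 784])
print(hidden.shape)       # torch.Size([64, 512])
print(demo_logits.shape)  # torch.Size([64, 10])
```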
Calculate the total model parameters and trainable parameters
def count_trainable_parameters(model):
    '''Return the number of parameters that require training.
    Usage: print(f'The model has {count_trainable_parameters(model):,} trainable parameters')
    '''
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
def count_total_parameters(model):
    '''Return the total number of parameters in the model.
    Usage: print(f'The model has {count_total_parameters(model):,} total parameters')
    '''
    return sum(p.numel() for p in model.parameters())
total_params = count_total_parameters(model)
print(f'{total_params:,} total parameters.')
total_trainable_params = count_trainable_parameters(model)
print(f'{total_trainable_params:,} total trainable parameters.')
669,706 total parameters.
669,706 total trainable parameters.
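The figure 669,706 can be checked by hand: each nn.Linear layer holds in_features * out_features weights plus out_features biases, while nn.Flatten and nn.ReLU have no parameters. A small sketch of that arithmetic, where linear_param_count is a helper name I made up for this check:

```python
def linear_param_count(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


# (in_features, out_features) of the three nn.Linear layers in the model above.
layer_sizes = [(28 * 28, 512), (512, 512), (512, 10)]
per_layer = [linear_param_count(i, o) for i, o in layer_sizes]

print(per_layer)       # [401920, 262656, 5130]
print(sum(per_layer))  # 669706
```

The total and trainable counts are equal here because no parameter has requires_grad set to False.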
Define loss function and optimizer
loss_fn = nn.CrossEntropyLoss()  # Combines nn.LogSoftmax() and nn.NLLLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # Use the torch.optim.SGD optimizer with a fixed learning rate of 0.001
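The relationship between nn.CrossEntropyLoss and the pair nn.LogSoftmax plus nn.NLLLoss can be verified numerically. The sketch below uses random logits and made-up targets (sample_logits and sample_targets are illustrative names, not values from the model above).

```python
import torch
from torch import nn

torch.manual_seed(0)
sample_logits = torch.randn(4, 10)           # 4 samples, 10 classes
sample_targets = torch.tensor([0, 3, 9, 1])  # made-up class indices

ce = nn.CrossEntropyLoss()(sample_logits, sample_targets)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(sample_logits), sample_targets)

print(torch.allclose(ce, nll))  # True
```

This is also why the model's forward pass returns raw logits and does not apply a softmax itself.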
Define training process and testing process
Training requires a training data loader, a model, a loss function, and an optimizer. The process can be summarized as:
- Load and read a batch of data
- Call the model to compute predictions
- Use the loss function to calculate the loss value
- Reset the gradients to zero, run a backward pass on the loss, and update all parameters
The testing process is similar to the training process, but it skips the last step: no gradients are computed and no parameters are updated.
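The last step of the training process can be examined in isolation. Below is a minimal sketch on a tiny linear model with random data; tiny_model, tiny_optimizer, and the tensor names are assumptions made for this illustration. It checks that plain SGD (no momentum, no weight decay) updates a parameter as parameter - lr * gradient.

```python
import torch
from torch import nn

torch.manual_seed(0)
tiny_model = nn.Linear(4, 3)
tiny_loss_fn = nn.CrossEntropyLoss()
lr = 1e-3
tiny_optimizer = torch.optim.SGD(tiny_model.parameters(), lr=lr)

inputs = torch.randn(8, 4)            # random stand-in batch
targets = torch.randint(0, 3, (8,))   # random class indices

loss = tiny_loss_fn(tiny_model(inputs), targets)

tiny_optimizer.zero_grad()   # clear gradients left over from earlier steps
loss.backward()              # compute gradients of the loss
weight_before = tiny_model.weight.detach().clone()
grad = tiny_model.weight.grad.detach().clone()
tiny_optimizer.step()        # apply the update

expected = weight_before - lr * grad
print(torch.allclose(tiny_model.weight.detach(), expected))  # True
```

Calling zero_grad() matters because backward() accumulates gradients into .grad and does not overwrite them.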
# Training process
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)
        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)
        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
# Testing process
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
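The accuracy line in test() packs several operations into one expression. Here is a small sketch of what it does, using made-up logits for 3 samples and 4 classes (demo_pred and demo_y are illustrative values, not model output).

```python
import torch

# Made-up logits: one row per sample, one column per class.
demo_pred = torch.tensor([[0.1, 2.0, 0.3, 0.0],
                          [1.5, 0.2, 0.1, 0.0],
                          [0.0, 0.1, 0.2, 3.0]])
demo_y = torch.tensor([1, 2, 3])  # made-up true labels

predicted_classes = demo_pred.argmax(1)  # index of the largest logit in each row
num_correct = (predicted_classes == demo_y).type(torch.float).sum().item()

print(predicted_classes.tolist())  # [1, 0, 3]
print(num_correct)                 # 2.0
```

Dividing num_correct by the number of samples gives the accuracy, which is what correct /= size does over the whole test set.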
Perform model training
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")
Epoch 1
-------------------------------
loss: 2.312360 [ 0/60000]
loss: 2.291506 [ 6400/60000]
loss: 2.271516 [12800/60000]
loss: 2.252658 [19200/60000]
loss: 2.244356 [25600/60000]
loss: 2.222664 [32000/60000]
loss: 2.228115 [38400/60000]
loss: 2.203855 [44800/60000]
loss: 2.192679 [51200/60000]
loss: 2.153655 [57600/60000]
Test Error:
Accuracy: 38.2%, Avg loss: 2.149532
Epoch 2
-------------------------------
loss: 2.169748 [ 0/60000]
loss: 2.151293 [ 6400/60000]
loss: 2.094591 [12800/60000]
loss: 2.099757 [19200/60000]
loss: 2.060622 [25600/60000]
loss: 2.005261 [32000/60000]
loss: 2.027278 [38400/60000]
loss: 1.956725 [44800/60000]
loss: 1.939770 [51200/60000]
loss: 1.872965 [57600/60000]
Test Error:
Accuracy: 57.6%, Avg loss: 1.870182
Epoch 3
-------------------------------
loss: 1.906271 [ 0/60000]
loss: 1.869101 [ 6400/60000]
loss: 1.755461 [12800/60000]
loss: 1.784045 [19200/60000]
loss: 1.698693 [25600/60000]
loss: 1.647050 [32000/60000]
loss: 1.658152 [38400/60000]
loss: 1.569280 [44800/60000]
loss: 1.572960 [51200/60000]
loss: 1.469435 [57600/60000]
Test Error:
Accuracy: 63.3%, Avg loss: 1.492781
Epoch 4
-------------------------------
loss: 1.561221 [ 0/60000]
loss: 1.521496 [ 6400/60000]
loss: 1.376429 [12800/60000]
loss: 1.439268 [19200/60000]
loss: 1.347573 [25600/60000]
loss: 1.329434 [32000/60000]
loss: 1.343370 [38400/60000]
loss: 1.273368 [44800/60000]
loss: 1.297012 [51200/60000]
loss: 1.199990 [57600/60000]
Test Error:
Accuracy: 64.4%, Avg loss: 1.230259
Epoch 5
-------------------------------
loss: 1.306296 [ 0/60000]
loss: 1.285495 [ 6400/60000]
loss: 1.123326 [12800/60000]
loss: 1.221825 [19200/60000]
loss: 1.122804 [25600/60000]
loss: 1.131946 [32000/60000]
loss: 1.158367 [38400/60000]
loss: 1.095908 [44800/60000]
loss: 1.125929 [51200/60000]
loss: 1.049075 [57600/60000]
Test Error:
Accuracy: 65.0%, Avg loss: 1.070740
Done!
Save the model
import os
SAVE_PATH = "./about_pytorch_model"
if not os.path.exists(SAVE_PATH):
    os.makedirs(SAVE_PATH)
torch.save(model.state_dict(), os.path.join(SAVE_PATH, "model.pth"))
print("Saved PyTorch Model State to model.pth")
Saved PyTorch Model State to model.pth
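What gets written to model.pth is the model's state_dict, a mapping from parameter names to tensors. As a rough illustration, the sketch below lists the entries for a much smaller network; small_model is a hypothetical example, not the NeuralNetwork defined above.

```python
from torch import nn

# Hypothetical small network, used only to show the structure of a state_dict.
small_model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

state_keys = []
for name, tensor in small_model.state_dict().items():
    state_keys.append(name)
    print(name, tuple(tensor.shape))
# 0.weight (3, 4)
# 0.bias (3,)
# 2.weight (2, 3)
# 2.bias (2,)
```

The ReLU at index 1 has no parameters, so it does not appear in the state_dict.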
Load the model
# Recreate the model structure first, then load the saved parameters into it
model = NeuralNetwork()
model.load_state_dict(torch.load(os.path.join(SAVE_PATH,"model.pth")))
<All keys matched successfully>
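If the parameters were saved from a GPU and are later loaded on a machine without one, torch.load accepts a map_location argument to remap the tensors. The following is a minimal round-trip sketch under that assumption; source_layer, restored_layer, and the temporary file demo.pth are illustrative names, not part of the code above.

```python
import os
import tempfile

import torch
from torch import nn

source_layer = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.pth")
    torch.save(source_layer.state_dict(), demo_path)
    # map_location="cpu" places the loaded tensors on the CPU regardless of where they were saved.
    state = torch.load(demo_path, map_location="cpu")

restored_layer = nn.Linear(4, 2)
restored_layer.load_state_dict(state)

print(torch.equal(source_layer.weight, restored_layer.weight))  # True
print(torch.equal(source_layer.bias, restored_layer.bias))      # True
```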
Test the model
classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]
model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')
Predicted: "Ankle boot", Actual: "Ankle boot"
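The model outputs raw logits, so argmax is enough to pick a class. If you also want a confidence value, a softmax turns the logits into probabilities. A small sketch with made-up logits follows; example_logits is invented for illustration and is not the model's real output for this sample.

```python
import torch

class_names = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

# Made-up logits for a single sample, shaped [1, 10].
example_logits = torch.tensor([[0.1, 0.0, 0.2, 0.1, 0.0, 1.0, 0.1, 1.5, 0.3, 3.0]])

probs = torch.softmax(example_logits, dim=1)  # probabilities over the 10 classes
top_prob, top_idx = probs.max(dim=1)

print(class_names[top_idx.item()])  # Ankle boot
print(f"{top_prob.item():.3f}")     # probability assigned to the predicted class
```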