Reading and storing models
In summary, this section covers two pairs of functions:
- torch.load()/torch.save()
Serialization and deserialization go through Python's pickle module, which converts objects between memory and disk.
- Module.state_dict()/Module.load_state_dict()
state_dict() gets the model parameters; load_state_dict() loads them into a model.
Reading and writing Tensors
We can use the save and load functions directly to store and read a Tensor. save uses Python's pickle utility to serialize an object and then writes the serialized object to disk. save can store many kinds of objects, including models, tensors, and dictionaries. load uses pickle's unpickling facilities to deserialize a pickled object file back into memory.
The following example creates a Tensor variable x and saves it in a file named x.pt.
import torch
from torch import nn
x = torch.ones(3)
torch.save(x, 'x.pt')
We then read the data back from the stored file into memory.
x2 = torch.load('x.pt')
x2
Output:
tensor([1., 1., 1.])
We can also store a list of Tensors and read it back into memory.
y = torch.zeros(4)
torch.save([x, y], 'xy.pt')
xy_list = torch.load('xy.pt')
xy_list
Output:
[tensor([1., 1., 1.]), tensor([0., 0., 0., 0.])]
We can also store and read a dictionary that maps strings to Tensors.
torch.save({'x': x, 'y': y}, 'xy_dict.pt')
xy = torch.load('xy_dict.pt')
xy
Output:
{'x': tensor([1., 1., 1.]), 'y': tensor([0., 0., 0., 0.])}
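Since save and load work on serialized bytes, they also accept file-like objects, not only paths. The sketch below is a minimal illustration that round-trips a Tensor through an in-memory buffer; the variable names buffer and x_restored are chosen here for illustration only.

```python
import io

import torch

x = torch.ones(3)

# Serialize the tensor into an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
torch.save(x, buffer)

# Rewind to the start of the buffer before reading the bytes back.
buffer.seek(0)
x_restored = torch.load(buffer)

# torch.equal checks that shapes and all values match.
print(torch.equal(x, x_restored))
```

This can be handy when the serialized bytes are sent somewhere other than the local file system.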
state_dict
In PyTorch, the learnable parameters of a Module (i.e., weights and biases) are contained in the model's parameters (accessed via model.parameters()). A state_dict is a dictionary object that maps each parameter name to its parameter Tensor.
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.hidden = nn.Linear(3, 2)
        self.act = nn.ReLU()
        self.output = nn.Linear(2, 1)

    def forward(self, x):
        a = self.act(self.hidden(x))
        return self.output(a)

net = MLP()
net.state_dict()
Output:
OrderedDict([('hidden.weight', tensor([[ 0.2448, 0.1856, -0.5678],
[ 0.2030, -0.2073, -0.0104]])),
('hidden.bias', tensor([-0.3117, -0.4232])),
('output.weight', tensor([[-0.4556, 0.4084]])),
('output.bias', tensor([-0.3573]))])
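To see at a glance what a state_dict holds, it can help to print only the names and shapes of its entries. The sketch below assumes an nn.Sequential stand-in with the same layer sizes as the MLP above (it is not the MLP class itself), so the keys are positional indices rather than attribute names.

```python
import torch
from torch import nn

# Hypothetical stand-in with the same layer sizes as the MLP above.
net = nn.Sequential(nn.Linear(3, 2), nn.ReLU(), nn.Linear(2, 1))

# Each entry maps a parameter name to its Tensor; print name and shape only.
for name, tensor in net.state_dict().items():
    print(name, tuple(tensor.shape))

keys = list(net.state_dict().keys())
```

The keys here come from indices 0 and 2 only; index 1, the ReLU, contributes no entry.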
Note that only layers with learnable parameters (convolutional layers, linear layers, etc.) have entries in the state_dict. The optimizer (optim) also has a state_dict, which contains information about the optimizer's state and the hyperparameters it uses.
optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
optimizer.state_dict()
Output:
{'state': {}, 'param_groups': [{'lr': 0.001, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'params': [139952370292992, 139952370293784, 139952370294144, 139952370293496]}]}
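Because both the model and the optimizer expose a state_dict, one common pattern is to store the two together so that training can be resumed later. The following is a minimal sketch under stated assumptions: the nn.Linear model, the file name checkpoint.pt, and the dictionary keys "model" and "optimizer" are all hypothetical choices made for illustration.

```python
import os
import tempfile

import torch
from torch import nn

net = nn.Linear(3, 1)  # hypothetical small model
optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Take one training step so the optimizer holds momentum buffers.
loss = net(torch.randn(4, 3)).sum()
loss.backward()
optimizer.step()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "checkpoint.pt")
    # Store both state_dicts in one dictionary (key names are arbitrary).
    torch.save({"model": net.state_dict(),
                "optimizer": optimizer.state_dict()}, path)
    checkpoint = torch.load(path)

# Rebuild the objects, then restore their saved state.
net2 = nn.Linear(3, 1)
optimizer2 = torch.optim.SGD(net2.parameters(), lr=0.001, momentum=0.9)
net2.load_state_dict(checkpoint["model"])
optimizer2.load_state_dict(checkpoint["optimizer"])
```

After a step has been taken, the optimizer's 'state' entry is no longer empty, which is why it is worth saving alongside the model parameters.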
Save and load model
PyTorch offers two common ways to save and load a trained model:
- Save and load only the model parameters (state_dict)
- Save and load the entire model
Save and load state_dict (the recommended way)
Save:
torch.save(model.state_dict(), PATH) # the recommended file extension is pt or pth
Load:
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
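Saving and loading the entire model
Save:
torch.save(model, PATH)
Load:
# Note: since PyTorch 2.6, torch.load defaults to weights_only=True, so loading a
# whole pickled model needs weights_only=False (only for files you trust).
model = torch.load(PATH)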
Let's try out the recommended method:
X = torch.randn(2, 3)
Y = net(X)
PATH = "./net.pt"
torch.save(net.state_dict(), PATH)
net2 = MLP()
net2.load_state_dict(torch.load(PATH))
Y2 = net2(X)
Y2 == Y
Output:
tensor([[1],
[1]], dtype=torch.uint8)
Because net and net2 have the same model parameters, they compute the same result for the same input X. The output above confirms this.
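An elementwise == comparison returns a Tensor with one entry per element. When a single yes/no answer is more convenient, torch.equal can be used instead. The sketch below repeats the round trip under the assumption of an nn.Sequential stand-in (via a hypothetical make_net helper) and a temporary file, so it runs on its own.

```python
import os
import tempfile

import torch
from torch import nn


def make_net():
    # Hypothetical helper: same layer sizes as the MLP used in this section.
    return nn.Sequential(nn.Linear(3, 2), nn.ReLU(), nn.Linear(2, 1))


net = make_net()
X = torch.randn(2, 3)
Y = net(X)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "net.pt")
    torch.save(net.state_dict(), path)
    # A freshly built network gets new random weights until we load the saved ones.
    net2 = make_net()
    net2.load_state_dict(torch.load(path))

Y2 = net2(X)

# A single Python bool: True only if shapes and all values match.
same = torch.equal(Y, Y2)
print(same)
```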
There are also other usage scenarios, such as saving and loading models across GPU and CPU, or saving models trained on multiple GPUs. Consult the official documentation when you need them.
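As one illustration of the cross-device case, torch.load takes a map_location argument that controls where the stored Tensors are placed when they are read. The sketch below loads a state_dict onto the CPU regardless of the device it was saved from; the nn.Linear model and the file name net.pt are hypothetical, and the example itself runs on CPU only.

```python
import os
import tempfile

import torch
from torch import nn

net = nn.Linear(3, 1)  # hypothetical small model

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "net.pt")
    torch.save(net.state_dict(), path)
    # Remap every stored tensor to the CPU while loading.
    state = torch.load(path, map_location=torch.device("cpu"))

net_cpu = nn.Linear(3, 1)
net_cpu.load_state_dict(state)

devices = {t.device.type for t in state.values()}
print(devices)
```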