"Hands-on Deep Learning Pytorch Edition" 5.3 Delayed initialization

import torch
from torch import nn
from d2l import torch as d2l

The input dimension of the multilayer perceptron instantiated below is unknown, so the framework has not yet initialized any of its parameters; they are shown as <UninitializedParameter>.

net = nn.Sequential(nn.LazyLinear(256), nn.ReLU(), nn.LazyLinear(10))

net[0].weight
c:\Software\Miniconda3\envs\d2l\lib\site-packages\torch\nn\modules\lazy.py:178: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment.
  warnings.warn('Lazy modules are a new feature under heavy development '

<UninitializedParameter>
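
Before any data has flowed through the network, the weight of a lazy layer is an UninitializedParameter. A quick check (a minimal sketch, not part of the original notebook) confirms this:

# Sketch: both lazy layers are still uninitialized at this point
print(isinstance(net[0].weight, nn.parameter.UninitializedParameter))  # True
print(isinstance(net[2].weight, nn.parameter.UninitializedParameter))  # True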

Once input data is passed through the network, the framework can infer the input dimension of each layer and initialize its parameters layer by layer.

X = torch.rand(2, 20)
net(X)

net[0].weight.shape
torch.Size([256, 20])
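
After this single forward pass, every parameter in the network has a concrete shape. One way to verify that all layers were materialized (a small sketch, not from the original text) is to iterate over the named parameters:

# Sketch: all parameters now report concrete shapes
for name, param in net.named_parameters():
    print(name, param.shape)
# 0.weight torch.Size([256, 20])
# 0.bias torch.Size([256])
# 2.weight torch.Size([10, 256])
# 2.bias torch.Size([10])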

Exercises

(1) What happens if you specify the input dimensions for the first layer but not for subsequent layers? Is the first layer initialized immediately?

net = nn.Sequential(
    nn.Linear(20, 256), nn.ReLU(),
    nn.LazyLinear(128), nn.ReLU(),
    nn.LazyLinear(10)
)
net[0].weight, net[2].weight, net[4].weight
c:\Software\Miniconda3\envs\d2l\lib\site-packages\torch\nn\modules\lazy.py:178: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment.
  warnings.warn('Lazy modules are a new feature under heavy development '

(Parameter containing:
 tensor([[ 0.1332,  0.1372, -0.0939,  ..., -0.0579, -0.0911, -0.1820],
         [-0.1570, -0.0993, -0.0685,  ..., -0.0469, -0.0208,  0.0665],
         [ 0.0861,  0.1135,  0.1631,  ..., -0.1407,  0.1088, -0.2052],
         ...,
         [-0.1454, -0.0283, -0.1074,  ..., -0.2164, -0.2169,  0.1913],
         [-0.1617,  0.1206, -0.2119,  ..., -0.1862, -0.0951,  0.1535],
         [-0.0229, -0.2133, -0.1027,  ...,  0.1973,  0.1314,  0.1283]],
        requires_grad=True),
 <UninitializedParameter>,
 <UninitializedParameter>)
net(X)  # deferred initialization: this forward pass materializes the remaining lazy layers
net[0].weight.shape, net[2].weight.shape, net[4].weight.shape
(torch.Size([256, 20]), torch.Size([128, 256]), torch.Size([10, 128]))

(2) What happens if mismatched input dimensions are passed in?

X = torch.rand(2, 10)
net(X)  # raises an error: net[0] was already initialized with in_features=20, but this X has only 10 features
---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

Cell In[14], line 2
      1 X = torch.rand(2, 10)
----> 2 net(X)


File c:\Software\Miniconda3\envs\d2l\lib\site-packages\torch\nn\modules\module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []


File c:\Software\Miniconda3\envs\d2l\lib\site-packages\torch\nn\modules\container.py:139, in Sequential.forward(self, input)
    137 def forward(self, input):
    138     for module in self:
--> 139         input = module(input)
    140     return input


File c:\Software\Miniconda3\envs\d2l\lib\site-packages\torch\nn\modules\module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []


File c:\Software\Miniconda3\envs\d2l\lib\site-packages\torch\nn\modules\linear.py:114, in Linear.forward(self, input)
    113 def forward(self, input: Tensor) -> Tensor:
--> 114     return F.linear(input, self.weight, self.bias)


RuntimeError: mat1 and mat2 shapes cannot be multiplied (2x10 and 20x256)

(3) What do you need to do if the inputs have varying dimensionality?

Adjust the inputs to the dimensionality the network expects, either by padding (filling) them or by reducing their dimensionality, as sketched below.
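
For example (a minimal sketch under the assumption that the first layer expects 20 features, as in exercise (1); the adapter layer below is purely illustrative), an input with only 10 features can be zero-padded to 20 features, or projected to 20 features with an extra linear layer:

from torch.nn import functional as F

X = torch.rand(2, 10)                      # only 10 features, but net[0] expects 20

# Option 1: pad (fill) the feature dimension with zeros up to 20
X_padded = F.pad(X, (0, 20 - X.shape[-1]))
print(net(X_padded).shape)                 # torch.Size([2, 10])

# Option 2: map the input to 20 features with a learned projection (illustrative adapter)
adapter = nn.Linear(10, 20)
print(net(adapter(X)).shape)               # torch.Size([2, 10])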

Origin: blog.csdn.net/qq_43941037/article/details/132914685