PyTorch: freezing parameters during model pre-training and fine-tuning

Looking through many blogs and forums, I found that freezing parameters generally involves two steps:

  1. Set the parameter's requires_grad attribute to False, i.e. requires_grad=False
  2. When constructing the optimizer, filter out the parameters that should not be updated, typically like this:
     optimizer = torch.optim.SGD(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3)

I won't go into more detail on the above; most of the results on Baidu describe the same recipe.
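For concreteness, here is a minimal, runnable sketch of that standard recipe (the two-layer model is made up for illustration):

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(10, 10), nn.Linear(10, 10))

    # Step 1: freeze the part that should not be trained.
    for param in model[1].parameters():
        param.requires_grad = False

    # Step 2: hand the optimizer only the still-trainable parameters.
    optimizer = torch.optim.SGD(
        filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3
    )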


Let me talk about my task first:

I have a model consisting of an encoder and a decoder. During pre-training, the decoder's parameters are fixed and only the encoder is trained. During fine-tuning, all parameters are trained.
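A toy stand-in for such a model (the layer sizes are invented, just to make the later snippets runnable):

    import torch
    import torch.nn as nn

    class Seq2Seq(nn.Module):
        def __init__(self):
            super().__init__()
            self.encoder = nn.Linear(16, 8)
            self.decoder = nn.Linear(8, 16)

        def forward(self, x):
            return self.decoder(self.encoder(x))

    model = Seq2Seq()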

Problem:

With the method above, reloading the model raises an error about mismatched lengths:

ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group

After debugging for a long time, I found that the saved state only covers the encoder's parameters (the decoder's were filtered out of the optimizer), while the new run builds its parameter group over both the encoder and the decoder. The pre-trained state therefore cannot be loaded into the new run.
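A sketch of how the mismatch arises with the toy model above (the file name is made up):

    # Pre-training: the decoder is frozen and filtered out, so the
    # optimizer's parameter group holds only the encoder's parameters.
    for param in model.decoder.parameters():
        param.requires_grad = False
    optimizer = torch.optim.SGD(
        filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3
    )
    torch.save(optimizer.state_dict(), "pretrain_optim.pt")

    # Fine-tuning: a fresh optimizer over *all* parameters.
    new_optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    # Raises the ValueError above: the saved group is smaller than
    # the new optimizer's group.
    new_optimizer.load_state_dict(torch.load("pretrain_optim.pt"))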

Solution:

Only set the parameters' requires_grad attribute to True/False, and do not filter anything out when constructing the optimizer; the group lengths then stay consistent.

Moreover, the frozen parameters really are not updated during pre-training, and all parameters are updated during fine-tuning, which is exactly what we need.
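This works because the built-in torch.optim optimizers skip parameters whose .grad is None, and a parameter with requires_grad=False never receives a gradient from backward(). A sketch with the toy model:

    # The optimizer is built over ALL parameters, frozen ones included.
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    for param in model.decoder.parameters():
        param.requires_grad = False  # the decoder will get no gradients

    loss = model(torch.randn(4, 16)).sum()
    loss.backward()
    optimizer.step()  # decoder params have .grad is None and are skipped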


Here is my adjusted procedure:

  • Pre-training: only modify the attributes; do not filter the parameters passed to the optimizer:
    # Freeze everything, then re-enable gradients for the encoder only.
    for param in model.parameters():
        param.requires_grad = False
    for param in model.encoder.parameters():
        param.requires_grad = True

Print the parameters before and after an update step and you will see that only the encoder's parameters change; the decoder's stay fixed.
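One way to run this check (a verification snippet of my own, assuming the toy model and an optimizer built over all parameters):

    import copy

    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # no filtering
    before = copy.deepcopy(model.state_dict())

    loss = model(torch.randn(4, 16)).sum()
    loss.backward()
    optimizer.step()

    for name, param in model.named_parameters():
        changed = not torch.equal(before[name], param.data)
        print(name, "updated" if changed else "unchanged")
    # encoder.* lines print "updated"; decoder.* lines print "unchanged"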

  • Fine-tuning: unfreeze everything:
    # All parameters are trainable again.
    for param in model.parameters():
        param.requires_grad = True

Printing the parameters before and after an update again, you can see that the decoder's parameters are now updated as well. Done!
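Because the optimizer was always built over the full parameter list, the pre-training checkpoint now loads cleanly for fine-tuning (file name again made up):

    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, "pretrain.pt")

    ckpt = torch.load("pretrain.pt")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])  # group sizes now match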

Original post: blog.csdn.net/Answer3664/article/details/104874243