[PyTorch] Loading a Pre-trained Model and Fine-tuning a Modified Network Structure

In practice, to save time and improve efficiency, deep learning training usually starts from an open-source model that has already been trained (typically on the ImageNet dataset). However, the model we design ourselves often differs in many places from the one that was trained. Does that mean we cannot use it? Of course not; there are several ways to achieve what we want.


Truth be told, I am not really studying right now; I am just waiting for the Lakers to play the Clippers.


Pre-trained

There are currently three ways to load a pre-trained model:

  • The first is to modify the final fully connected output layer;
  • The second is to selectively load only some of the network's layers;
  • The third is to transplant a trained model's parameters directly into our own network model.
# import the required modules
from torch import nn
import torch
from torchvision import models
from torch import optim

Method One

# change the number of classes in the final output layer
transfer_model = models.resnet18(pretrained=True)

dim_in = transfer_model.fc.in_features
transfer_model.fc = nn.Linear(dim_in, 10)  # number of image classes = 10
#print(transfer_model)
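
A quick sanity check (a minimal sketch; the batch size and input resolution below are arbitrary assumptions) confirms that the replaced head now produces 10 logits:

# feed a dummy batch through the modified model
dummy = torch.randn(4, 3, 224, 224)
out = transfer_model(dummy)
print(out.shape)  # expected: torch.Size([4, 10])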

Method Two

# freeze all parameters first, then replace the classifier head so that
# only the new fully connected layer keeps requires_grad=True
for param in transfer_model.parameters():
    param.requires_grad = False
transfer_model.fc = nn.Linear(dim_in, 10)

# to speed things up, the optimizer only updates the fully connected part
optimizer = optim.SGD(transfer_model.fc.parameters(), lr=1e-3)
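
An equivalent pattern (a sketch, not from the original post) keeps the existing head and instead filters the parameters handed to the optimizer, which scales better when several layers are left trainable:

# alternative: unfreeze just the head, then let a filter pick out
# the trainable parameters automatically
for param in transfer_model.fc.parameters():
    param.requires_grad = True
optimizer = optim.SGD(
    filter(lambda p: p.requires_grad, transfer_model.parameters()), lr=1e-3)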

Method Three

from torchvision.models.resnet import Bottleneck  # building block reused by the custom network

resnet50 = models.resnet50(pretrained=True)  # load the pre-trained model
cnn = CNN(Bottleneck, [3, 4, 6, 3])  # custom network; a hypothetical definition of CNN is sketched below

# read the parameters of both models
pretrained_dict = resnet50.state_dict()
model_dict = cnn.state_dict()

# drop the keys in pretrained_dict that are absent from model_dict
# (checking shapes too, so mismatched layers such as the new classifier are skipped)
pretrained_dict = {k: v for k, v in pretrained_dict.items()
                   if k in model_dict and v.shape == model_dict[k].shape}

# update the existing model_dict with the pre-trained weights
model_dict.update(pretrained_dict)

# load the state_dict we actually need
cnn.load_state_dict(model_dict)

# print(resnet50)
print(cnn)
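
The original post does not define the CNN class used above; the following is a hypothetical minimal sketch. Reusing torchvision's ResNet implementation keeps the parameter names identical to resnet50's state_dict keys, which is exactly what lets the key matching above copy most of the weights:

# a hypothetical CNN: a ResNet-50-shaped backbone with a 10-class head
from torch import nn
from torchvision.models.resnet import ResNet

class CNN(ResNet):
    def __init__(self, block, layers, num_classes=10):
        super().__init__(block, layers)
        # swap the 1000-class ImageNet head for our own; its shape no
        # longer matches resnet50's, so the shape check above skips it
        self.fc = nn.Linear(self.fc.in_features, num_classes)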

Fine-tuning

When and How to Fine-tune

Many factors determine how to apply transfer learning; the two most important are the size of the new dataset and how similar the new data are to the original dataset. One thing to remember: the earlier layers of a network learn general-purpose features, while the later layers learn features specific to the original task. Four common scenarios follow:

1. The new dataset is small and similar to the original dataset. Because the new dataset is small, fine-tuning may overfit; and because the new and old data are similar, we expect their high-level features to be similar as well. We can therefore use the pre-trained network as a fixed feature extractor and train a linear classifier on the extracted features (see the sketch after this list).

2. The new dataset is large and similar to the original dataset. Because the new dataset is large enough, you can fine-tune the entire network.

3. The new dataset is small and not similar to the original dataset. Since the new dataset is small, it is best not to fine-tune; and since it is not similar to the original data, it is best not to use the high-level features. Instead, use features from the earlier layers to train an SVM classifier.

4. The new dataset is large and not similar to the original dataset. Because the new dataset is large enough, you can train from scratch, but in practice initializing from a pre-trained model is still useful. With enough new data, the entire network can be fine-tuned.
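
As a concrete illustration of scenario 1 (a minimal sketch; the class count, layer cut, and training-step interface are assumptions, not the original author's code), the pre-trained network serves as a frozen feature extractor and only a linear classifier is trained:

import torch
from torch import nn, optim
from torchvision import models

# scenario 1: frozen backbone + trainable linear classifier
backbone = models.resnet18(pretrained=True)
backbone.fc = nn.Identity()  # keep the 512-d features, drop the old head
backbone.eval()              # also freezes BatchNorm statistics
for p in backbone.parameters():
    p.requires_grad = False

classifier = nn.Linear(512, 10)  # 10 classes is an assumption
criterion = nn.CrossEntropyLoss()
clf_optimizer = optim.SGD(classifier.parameters(), lr=1e-3)

def train_step(images, labels):
    with torch.no_grad():        # no gradients through the frozen backbone
        feats = backbone(images)
    loss = criterion(classifier(feats), labels)
    clf_optimizer.zero_grad()
    loss.backward()
    clf_optimizer.step()
    return loss.item()

# scenario 3 would cut earlier, e.g. nn.Sequential(*list(backbone.children())[:6]),
# and feed those lower-level features to an SVM; scenarios 2 and 4 would instead
# fine-tune the whole network, typically with a small learning rate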


Origin: blog.csdn.net/Jeremy_lf/article/details/104744809