Transfer learning: applying the xlm-roberta-base model to a classification task

Download model

Download the model files from https://huggingface.co/xlm-roberta-base.

Load model

The loading method recommended on the official model page is not used here. For reference, that method is:

from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")

Instead, load the model with its dedicated classes (see https://huggingface.co/docs/transformers/model_doc/xlm-roberta#transformers.XLMRobertaTokenizer):

import torch
import torch.nn as nn
from transformers import XLMRobertaTokenizer, XLMRobertaModel, XLMRobertaConfig
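As a quick check, the tokenizer and encoder can be loaded and run directly. With return_dict=False the model returns a (sequence_output, pooled_output) tuple, which is the form the classifier below relies on (a minimal sketch; the sample sentence is arbitrary):

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
encoder = XLMRobertaModel.from_pretrained("xlm-roberta-base")

enc = tokenizer("Transfer learning is useful.", return_tensors="pt")
seq_out, pooled = encoder(enc["input_ids"], attention_mask=enc["attention_mask"], return_dict=False)
print(seq_out.shape)  # torch.Size([1, seq_len, 768])
print(pooled.shape)   # torch.Size([1, 768])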

Define the Config class

class Config(object):
    def __init__(self, dataset):
        # Many other task-specific settings go here; they are omitted
        # because they differ from project to project.
        self.num_classes = n  # n = the number of classes for your task
        self.hidden_size = 768
        self.model_path = "/***/xlm-roberta-base"  # local directory with the downloaded model
        self.tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
        self.bert_config = XLMRobertaConfig.from_pretrained(self.model_path + '/config.json')
        

Define the classification model

class Model(nn.Module):
    def __init__(self, config):
        super(Model, self).__init__()
        self.bert = XLMRobertaModel.from_pretrained(config.model_path, config=config.bert_config)
        self.fc = nn.Linear(config.hidden_size, config.num_classes)

    def forward(self, x, mask):
        context = x  # input token ids
        # return_dict=False makes the model return a (sequence_output, pooled_output)
        # tuple (required in transformers v4+, where return_dict defaults to True).
        _, pooled = self.bert(context, attention_mask=mask, return_dict=False)
        out = self.fc(pooled)
        return out
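Putting the pieces together, a forward pass looks like this (a sketch: "my_dataset", the sample sentences, and max_length=128 are placeholders, and num_classes in Config must first be set to a concrete value):

config = Config("my_dataset")  # hypothetical dataset argument
clf = Model(config)
batch = config.tokenizer(
    ["first example sentence", "second example sentence"],
    padding=True, truncation=True, max_length=128, return_tensors="pt",
)
logits = clf(batch["input_ids"], batch["attention_mask"])  # shape: (batch_size, num_classes)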

The remaining training code is straightforward; write it to fit your own task. A sketch is given below.
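For completeness, here is a minimal training-loop sketch. The train_loader yielding (input_ids, mask, labels) batches, the AdamW optimizer with lr=2e-5, and the epoch count are all assumptions, not part of the original post:

import torch.optim as optim

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
clf = Model(config).to(device)
optimizer = optim.AdamW(clf.parameters(), lr=2e-5)
criterion = nn.CrossEntropyLoss()

clf.train()
for epoch in range(3):
    for input_ids, mask, labels in train_loader:  # hypothetical DataLoader
        input_ids, mask, labels = input_ids.to(device), mask.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = clf(input_ids, mask)  # (batch_size, num_classes)
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()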

Origin: blog.csdn.net/weixin_46398647/article/details/124476171