[Deep Learning | Python] Introduction to AutoModel and AutoProcessor

from transformers import AutoModel, AutoProcessor

transformers is a natural language processing (NLP) library developed and maintained by Hugging Face. It provides a wide range of pre-trained models that can be used for text classification, information extraction, natural language generation, and other tasks.

The library includes many commonly used NLP models, such as BERT, GPT, RoBERTa, and T5.

AutoModel and AutoProcessor are two classes in the transformers library for loading pre-trained models and their processors.

AutoModel is used to load a pre-trained model: given a model name, it automatically selects the matching model architecture and returns a model object that can be used directly.
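
As a small sketch of this behavior (the checkpoint names below are standard Hugging Face Hub IDs), AutoModel reads the checkpoint's configuration and instantiates the matching architecture-specific class:

from transformers import AutoModel

# AutoModel inspects the checkpoint's config and returns the concrete class
model = AutoModel.from_pretrained('bert-base-uncased')
print(type(model).__name__)  # BertModel

model = AutoModel.from_pretrained('roberta-base')
print(type(model).__name__)  # RobertaModel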

AutoProcessor is used to load the processor in the same way: given a model name, it automatically selects the matching processor and returns an object that can be used directly. When performing downstream tasks, the raw text must first be converted into model inputs, and this is exactly what the processor does.
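
As a rough illustration of what the processor produces (assuming a recent transformers version, where AutoProcessor falls back to the model's tokenizer for a text-only checkpoint such as bert-base-uncased):

from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained('bert-base-uncased')

# The processor turns raw text into the tensors the model expects
inputs = processor("Hello, world!", return_tensors='pt')
print(inputs.keys())        # input_ids, token_type_ids, attention_mask
print(inputs['input_ids'])  # e.g. tensor([[ 101, 7592, 1010, 2088,  999,  102]])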

The following example uses AutoModel and AutoProcessor to load the pre-trained model and processor respectively:

from transformers import AutoModel, AutoProcessor

# Load the model and the processor
model_name = 'bert-base-uncased'
model = AutoModel.from_pretrained(model_name)
processor = AutoProcessor.from_pretrained(model_name)

# Process the text input and convert it into model inputs
text = "Hello, world!"
tokens = processor(text, return_tensors='pt', padding=True, truncation=True)

# Pass the model inputs to the model
outputs = model(**tokens)

# Print the model output
print(outputs)

The output is:

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
BaseModelOutputWithPoolingAndCrossAttentions(last_hidden_state=tensor([[[-0.0781,  0.1587,  0.0400,  ..., -0.2805,  0.0248,  0.4081],
         [-0.2016,  0.1781,  0.4184,  ..., -0.2522,  0.3630, -0.0979],
         [-0.7156,  0.6751,  0.6017,  ..., -1.1032,  0.0797,  0.0567],
         [ 0.0527, -0.1483,  1.3609,  ..., -0.4513,  0.1274,  0.2655],
         [-0.7122, -0.4815, -0.1438,  ...,  0.5602, -0.1062, -0.1301],
         [ 0.9955,  0.1328, -0.0621,  ...,  0.2460, -0.6502, -0.3296]]],
       grad_fn=<NativeLayerNormBackward0>), pooler_output=tensor([[-0.8130, -0.2470, -0.7289,  0.5582,  0.3357, -0.0758,  0.7851,  0.1526,
         -0.5705, -0.9997, -0.3183,  0.7643,  0.9550,  0.5801,  0.9046, -0.6037,
         -0.3113, -0.5445,  0.3740, -0.4197,  0.5471,  0.9996,  0.0560,  0.2710,
          0.3869,  0.9316, -0.7260,  0.8900,  0.9311,  0.5901, -0.5208,  0.0532,
         -0.9711, -0.1791, -0.8414, -0.9663,  0.2318, -0.6239,  0.0885,  0.1203,
         -0.8333,  0.1662,  0.9993,  0.1384,  0.1207, -0.3476, -1.0000,  0.2947,
         -0.7443,  0.7037,  0.6978,  0.5853,  0.0875,  0.4013,  0.3722,  0.1009,
         -0.1470,  0.1421, -0.2055, -0.4406, -0.6010,  0.2476, -0.7887, -0.8612,
          0.8639,  0.7504, -0.0738, -0.2541,  0.0941, -0.1272,  0.7828,  0.1683,
          0.0685, -0.8279,  0.4741,  0.2687, -0.6123,  1.0000, -0.3837, -0.9341,
          0.5166,  0.5990,  0.5714, -0.2885,  0.4897, -1.0000,  0.2800, -0.1625,
         -0.9728,  0.2292,  0.3729, -0.1447,  0.2490,  0.5224, -0.5050, -0.3634,
         -0.2048, -0.7688, -0.2677, -0.1745, -0.0355, -0.2574, -0.1838, -0.3517,
          0.2785, -0.3823, -0.3204,  0.4208, -0.0671,  0.6005,  0.3758, -0.3386,
          0.4421, -0.9251,  0.5425, -0.2365, -0.9684, -0.5510, -0.9714,  0.4726,
         -0.2355, -0.3178,  0.8958,  0.1285,  0.2222,  0.0103, -0.5784, -1.0000,
         -0.5691, -0.5153, -0.0901, -0.1982, -0.9424, -0.9055,  0.4781,  0.9141,
          0.0904,  0.9976, -0.2006,  0.8990, -0.3713, -0.6045,  0.5630, -0.3681,
          0.7174,  0.1177, -0.4574,  0.1722, -0.0565,  0.2068, -0.5352, -0.1658,
...
         -0.3599, -1.0000,  0.3665, -0.2367,  0.6221, -0.5721,  0.3542, -0.5887,
         -0.9486, -0.2115,  0.1483,  0.6009, -0.4153, -0.6647,  0.4821, -0.1477,
          0.8825,  0.7133, -0.2224,  0.2536,  0.5956, -0.6499, -0.6185,  0.8514]],
       grad_fn=<TanhBackward0>), hidden_states=None, past_key_values=None, attentions=None, cross_attentions=None)

Here the BERT model and its processor are loaded first. The processor then converts the text into model inputs, which are passed to the model to compute the output shown above.
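
For downstream use, the fields of this output object can be read directly; a short sketch continuing the example above:

# last_hidden_state holds one 768-dimensional vector per input token;
# pooler_output holds a single sentence-level vector derived from [CLS]
last_hidden_state = outputs.last_hidden_state
pooler_output = outputs.pooler_output

print(last_hidden_state.shape)  # torch.Size([1, 6, 768]) for "Hello, world!"
print(pooler_output.shape)      # torch.Size([1, 768])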

Originally published at blog.csdn.net/wzk4869/article/details/130649159