Background
Transformers provides a convenient API for fine-tuning large models. Let's walk through the steps of fine-tuning a model with the Trainer class.
Step 1: Load the pre-trained large model
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
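A note on the classification head: distilbert-base-uncased is a base checkpoint without a classification head, so Transformers attaches a new, randomly initialized one (and logs a warning saying so). The head defaults to two labels, which happens to match the dataset used below; for a task with a different number of classes you would pass num_labels explicitly. A hedged sketch:
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=3,  # hypothetical: a three-class task
)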
Step 2: Set training hyperparameters
from transformers import TrainingArguments
training_args = TrainingArguments(
    output_dir="path/to/save/folder/",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=2,
)
For example, num_train_epochs=2 trains for two epochs, and both the training and evaluation batch sizes are set to 8 per device.
Step 3: Get the tokenizer
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
Step 4: Load the dataset
from datasets import load_dataset
dataset = load_dataset("rotten_tomatoes")
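rotten_tomatoes is a small binary sentiment dataset that ships with ready-made splits. A quick way to confirm what load_dataset returned (the comment describes the expected structure, not exact output):
print(dataset)
# a DatasetDict with "train", "validation", and "test" splits,
# each containing "text" and "label" columns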
Step 5: Create a tokenization function and specify which field of the dataset to tokenize:
def tokenize_dataset(dataset):
    return tokenizer(dataset["text"])
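To see what this function produces, you can tokenize a single sentence by hand; the tokenizer returns integer token IDs plus an attention mask (a minimal illustration, not one of the original steps):
encoding = tokenizer("This movie was great!")
print(encoding["input_ids"])       # token ids, including the [CLS]/[SEP] markers
print(encoding["attention_mask"])  # 1 for every real token
For corpora with long documents you would typically also pass truncation=True so inputs fit within the model's maximum length; the movie reviews here are short enough that it is not strictly needed.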
Step 6: Call map() to apply the tokenization function to the entire dataset
dataset = dataset.map(tokenize_dataset, batched=True)
Step 7: Use DataCollatorWithPadding to pad each batch dynamically; padding per batch instead of padding the whole dataset to one global length up front is faster:
from transformers import DataCollatorWithPadding
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
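To illustrate the dynamic part: given two examples of different lengths, the collator pads both to the length of the longer one, not to a global maximum. A small sketch, assuming PyTorch is installed since the collator returns tensors by default:
features = [tokenizer("short"), tokenizer("a noticeably longer movie review")]
batch = data_collator(features)
print(batch["input_ids"].shape)  # (2, length of the longer example)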
Step 8: Initialize Trainer
from transformers import Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
)
Step 9: Start training
trainer.train()
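Once training finishes, two Trainer methods you will usually reach for next are evaluate() and save_model(); a minimal sketch (the output directory is the one set in TrainingArguments):
metrics = trainer.evaluate()  # runs on eval_dataset; returns eval_loss here,
print(metrics)                # since no compute_metrics function was supplied
trainer.save_model()          # writes the fine-tuned model to output_dir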
Summary:
With the API provided by Trainer, you can fine-tune a large model in just nine simple steps and a dozen or so lines of code. Want to give it a try?