Some details about the inputs to pre-trained models in the Transformers library

Do I need to add special tokens such as the start token [bos] or the end token [eos] to the input?

Conclusion:

The start token [bos] does not need to be added manually because the model adds it automatically, but the end token [eos] must be added.

The reasons are explained below.

1. Do I need to add special tokens such as the start token [bos] or the end token [eos] to the input?

In the RNN era, Seq2Seq models required us to add the start token [bos] and the end token [eos] during data processing. The purpose is to support autoregressive decoding: the decoder receives [bos] as its first input, so it never sees the first real word before predicting it, and it learns to output [eos] to signal that the sentence is finished.
For example, suppose the input sentence is: Who are you?
We first process it into the following format: [bos] Who are you? [eos]
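The manual preprocessing described above can be sketched in a few lines of plain Python. This is only an illustration: the helper names add_special_tokens and make_training_pair are made up for this example, and whitespace splitting stands in for a real tokenizer.

```python
BOS = "[bos]"  # start token, as written in the text
EOS = "[eos]"  # end token, as written in the text


def add_special_tokens(tokens):
    """Wrap a token list with the start and end tokens."""
    return [BOS] + list(tokens) + [EOS]


def make_training_pair(tokens):
    """Build (decoder input, target) for teacher forcing.

    The decoder input drops the last token and the target drops the first,
    so at every step the decoder predicts the token it has not seen yet.
    """
    wrapped = add_special_tokens(tokens)
    return wrapped[:-1], wrapped[1:]


# Whitespace splitting is an assumption standing in for a real tokenizer.
tokens = "Who are you ?".split()
decoder_input, target = make_training_pair(tokens)
```

Here decoder_input starts with [bos] and target ends with [eos], which is exactly why both tokens were needed in hand-written Seq2Seq pipelines.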
This habit has stayed with us to this day.
But do the pre-trained models provided by the Transformers library require you to perform these operations manually?

2. Taking the BART model as an example

At the beginning of the forward method in the source code, we find that when labels are provided and decoder_input_ids is not, the model builds decoder_input_ids from the labels.
Following that call one level further shows how it is done: the labels are shifted one position to the right and a start token is placed at the front.
So there is no need to pass in the decoder_input_ids parameter; the model generates it automatically. For this model the start token has id 2.

We can test this behavior manually.
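A minimal sketch of such a check, written in plain Python lists rather than against the library itself. The function name shift_right is made up for this example, the label ids 11, 12, 13 and 99 are arbitrary, and the padding id 1 and the -100 ignore value are assumptions; only the start id 2 comes from the text.

```python
START_ID = 2    # start token id mentioned in the text
PAD_ID = 1      # assumed padding id, for illustration only
IGNORE = -100   # assumed marker for label positions excluded from the loss


def shift_right(labels, start_id=START_ID, pad_id=PAD_ID):
    """Shift labels one position to the right and put start_id in front.

    The last label falls off the end, and ignored positions are replaced
    by the padding id so the decoder never receives IGNORE as an input.
    """
    shifted = [start_id] + list(labels[:-1])
    return [pad_id if token == IGNORE else token for token in shifted]


# Labels that end with a hypothetical end token id 99.
decoder_input_ids = shift_right([11, 12, 13, 99])

# Labels where a start id was already added by hand: it ends up duplicated.
doubled = shift_right([START_ID, 11, 12, 13])

# Labels padded with the ignore value.
padded = shift_right([11, 12, IGNORE, IGNORE])
```

In this sketch decoder_input_ids is [2, 11, 12, 13], so the start id appears without any manual work, while doubled begins with two start ids in a row.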
Sure enough, the model automatically added the start token for us.
Therefore, when processing data we should not add the start token manually; doing so duplicates the start token and affects the model's predictions.
Note, however, that we do need to add the end token [eos]. The model will not add it automatically, and without it the model does not know when to stop during decoding.
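To see why the end token matters at decoding time, here is a toy greedy decoding loop in plain Python. Everything in it is hypothetical: the ids are arbitrary values chosen for this example (not the ids of any real tokenizer), and the two next-token functions are hand-written stand-ins for a trained model.

```python
START_ID = 0  # arbitrary start id for this toy example
EOS_ID = 1    # arbitrary end id for this toy example


def greedy_decode(next_token, start_id=START_ID, max_len=10):
    """Append tokens until next_token returns EOS_ID or max_len is reached."""
    output = [start_id]
    while len(output) < max_len:
        token = next_token(output)
        output.append(token)
        if token == EOS_ID:
            break
    return output


def trained_with_eos(prefix):
    """Stand-in for a model whose targets ended with the end token."""
    script = [11, 12, 13, EOS_ID]
    return script[len(prefix) - 1]


def trained_without_eos(prefix):
    """Stand-in for a model that never saw an end token and keeps cycling."""
    return 11 + (len(prefix) - 1) % 3


stopped = greedy_decode(trained_with_eos)
runaway = greedy_decode(trained_without_eos)
```

The first run halts as soon as the end token is produced, while the second only stops because it hits max_len, which is the failure mode the text warns about.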


Origin blog.csdn.net/q506610466/article/details/124163980