【Paper notes】DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation


Conference: ACL 2020 System Demonstrations

Original paper: DialoGPT-acl2020demos

Source code: project address

Abstract

The authors collected 147M comment exchanges from Reddit spanning 2005 to 2017 for generative pre-training, and released the resulting model, DialoGPT. DialoGPT serves as a general-purpose dialogue responder, generating replies that are more relevant, contentful, and consistent with the context.

Model

Model Architecture

DialoGPT uses GPT-2 as its base architecture: a multi-turn dialogue is modeled as one long text, and dialogue generation is framed as plain language modeling. Concretely, all turns of a dialogue session are concatenated into a single long sequence, terminated by an end-of-text token. The source (dialogue history) is denoted S = x_1, ..., x_m, and the target (response to generate) is denoted T = x_{m+1}, ..., x_N. Every turn from the second to the last is treated as a response, and the model is optimized over all resulting source-target pairs.
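The formatting described above can be sketched in a few lines. This is a minimal illustration of the concatenation scheme, not the paper's actual preprocessing code; the `<|endoftext|>` string is GPT-2's end-of-text token, but the helper functions are assumptions for this sketch.

```python
# Sketch of DialoGPT-style input formatting: a multi-turn dialogue is
# joined into one long string, each turn terminated by the end-of-text
# token, and treated as plain language-modeling data.
EOS = "<|endoftext|>"

def build_training_text(turns):
    """Concatenate all turns of one dialogue session into a single
    string, with each turn terminated by the EOS token."""
    return EOS.join(turns) + EOS

def source_target_pairs(turns):
    """Every turn from the second onward is a target; everything
    before it is the source (dialogue history)."""
    pairs = []
    for i in range(1, len(turns)):
        source = EOS.join(turns[:i]) + EOS
        target = turns[i] + EOS
        pairs.append((source, target))
    return pairs

turns = ["How are you?", "Fine, thanks.", "Great to hear!"]
print(build_training_text(turns))
for s, t in source_target_pairs(turns):
    print(repr(s), "->", repr(t))
```

A three-turn session thus yields two source-target pairs, matching the "second sentence to the last sentence" rule above.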

Mutual Information Maximization

  • To counter the tendency of open-domain text generation to produce bland, uninformative replies, the paper implements a maximum mutual information (MMI) scoring function.

  • MMI employs a pre-trained backward model to predict the source sentence from a given reply, i.e., it models P(Source | Target). A set of hypotheses is first generated via top-K sampling, and then P(Source | Hypothesis) is used to rerank those hypotheses.

  • Concretely, the dialogue history is copied N times (N = 16 in the paper) and fed into the forward model, which produces N different replies via top-K sampling (K = 10 in the paper). Each reply is then concatenated with the reversed dialogue history into one long text and fed into the backward model, and the reply with the smallest backward loss is taken as the model's output. Both the forward and backward models are 345M-parameter models fine-tuned from GPT-2 medium. Experiments show that MMI generates more diverse responses, reflected in relatively high NIST, METEOR, Entropy, and Dist scores.

  • Intuition: maximizing the backward-model likelihood penalizes bland hypotheses, because frequent, generic hypotheses are compatible with many possible queries (contexts), which yields a low probability for any particular query. (In other words, many sources map to a generic reply, so from a generic reply it is unlikely that the specific source can be recovered; penalizing this improves reply quality.)
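The reranking procedure in the bullets above can be sketched as follows. This is a toy sketch, not the paper's implementation: `forward_sample` and `backward_loss` are hypothetical stand-ins for the actual 345M forward and backward GPT-2 models, with the backward loss crudely mimicking the intuition that short, generic replies poorly predict any specific source.

```python
import random

def forward_sample(history, top_k=10):
    """Stand-in for one top-K sample from the forward model."""
    candidates = ["I don't know.", "Yes.", "The concert starts at 8pm on Friday."]
    return random.choice(candidates)

def backward_loss(source, hypothesis):
    """Stand-in for the backward model's loss -log P(source | hypothesis).
    Shorter (more generic) replies get a higher loss here, mimicking the
    idea that generic replies are compatible with many sources."""
    return 1.0 / (1.0 + len(hypothesis))

def mmi_rerank(history, hypotheses):
    """Return the hypothesis with the smallest backward loss."""
    return min(hypotheses, key=lambda h: backward_loss(history, h))

history = "When does the concert start?"
hypotheses = [forward_sample(history) for _ in range(16)]  # N = 16
print(mmi_rerank(history, hypotheses))
```

With a real backward model, `backward_loss` would be the cross-entropy of the source tokens given the hypothesis; the rerank step itself is just an argmin over N candidates.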

Experiment

  • The authors found that the model surpasses human performance on some metrics, but argue that this does not mean the generated results are more realistic than human ones; rather, it likely stems from the one-to-many nature of dialogue. As shown in the figure, multiple human responses (R1-R4) can all serve as good answers. Without loss of generality, assume R1-R3 are the test-set references and R4 is the answer given by a human. In semantic space, the response Rg generated by a well-trained model tends to lie near the geometric center of all plausible responses, since the training objective seeks the most probable response. Such a result may be close to the geometric mean of all training instances, effectively "averaging" them. The generated response Rg may therefore have a smaller "semantic distance" to the references (reflected in higher automatic scores such as BLEU) than the target human response R4.

[Figure: multiple plausible human responses (R1-R4) and a generated response Rg in semantic space]
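The "averaging" argument above can be made concrete with a toy calculation (not from the paper): the centroid of a set of points minimizes the mean squared distance to them, so a centroid-like reply Rg is closer on average to the references R1-R3 than a single, equally valid human reply R4 is. The 2-D "embeddings" below are invented for illustration.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))

def mean_sq_dist(x, points):
    """Mean squared Euclidean distance from x to each point."""
    return sum(sum((a - b) ** 2 for a, b in zip(x, p)) for p in points) / len(points)

# pretend 2-D "semantic embeddings" of four plausible human responses
R1, R2, R3, R4 = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)
refs = [R1, R2, R3]   # test-set references
Rg = centroid(refs)   # a model reply that "averages" the references

print(mean_sq_dist(Rg, refs), mean_sq_dist(R4, refs))
```

Here Rg scores a smaller mean squared distance to the references than R4 does, even though R4 is a perfectly good human answer; reference-overlap metrics like BLEU behave analogously.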

  • The authors compared directly fine-tuning GPT-2 against re-pretraining DialoGPT on the dialogue data. The experiments show that for small models, directly fine-tuning GPT-2 works better, while for large models the two perform comparably.

Origin blog.csdn.net/m0_47779101/article/details/129969072