Generating Diverse Corrections with Local Beam Search for Grammatical Error Correction

Summary

In this work, we propose a beam search method for obtaining diverse outputs in local sequence transduction tasks, such as grammatical error correction (GEC), where most tokens in the source and target sentences overlap. In GEC, the ideal is to rewrite only the spans that must be corrected while keeping the already-correct spans unchanged. However, existing methods for obtaining diverse outputs focus on rewriting all tokens of a sentence. As a result, they may either force changes across the whole sentence and generate ungrammatical sentences, or weaken their constraints to avoid ungrammatical output and generate sentences that are not diverse. Considering these issues, we propose a method that applies diverse rewrites only to the parts of a sentence that need correction, rather than to all tokens. Our beam search method adjusts the search for tokens within each beam according to the predicted probability of copying from the source sentence. Experimental results show that the proposed method generates more diverse corrections than existing methods without degrading accuracy on the GEC task.

1 Introduction

  Grammatical error correction (GEC) is the task of correcting grammatical errors in input text. For a given input, there are often multiple valid ways to correct it. For example, for the same ungrammatical text, ten annotators may produce ten different valid corrections. If a GEC model presents multiple correction candidates, it can help users decide whether to accept a correction, letting them choose the expression they prefer from among the candidates.
  However, existing GEC models do not consider generating multiple correction candidates. Typically, the way to obtain multiple corrections in GEC is to generate the top-n best candidates with ordinary beam search. However, it has been shown that ordinary beam search does not provide sufficiently good candidates and tends to produce lists of nearly identical sequences. Therefore, without any control for diversity, the top-n candidates generated by beam search provide little useful additional information. To address this problem, several beam search methods have been proposed for generating diverse candidates. These methods encourage diversity by globally rewriting all tokens in a sentence; we refer to them as diverse global beam search methods. In contrast, for a local sequence transduction task such as GEC, where most tokens in the source and target sentences overlap, it is undesirable to rewrite the input sentence too aggressively, because unnecessary rewrites damage the grammatically correct parts of the input sentence, and encouraging unnecessary corrections degrades GEC performance itself. We therefore hypothesize that neither ordinary beam search nor diverse global beam search is well suited to the GEC task: the GEC model must correct the grammatical errors of the input sentence in diverse ways while preserving its correct parts.
  In this work, we propose a diverse local beam search method that considers, during beam search, whether each token should be corrected, in order to obtain diverse outputs. Note that our method can be applied to any local sequence transduction task. Figure 1 shows a comparison between the existing methods and the proposed method.
  The beam search methods compared in Figure 1 behave as follows:
  (a) In ordinary beam search, corrections concentrate on a specific path. This method therefore generates sentences with similar token combinations and a small number of word types.
  (b) Diverse global beam search methods explore many different paths. Unlike ordinary beam search, they generate sentences with varied token combinations and a large number of word types; however, they also generate correction candidates for tokens that do not need to be corrected.
  (c) The proposed diverse local beam search expands different paths only for tokens that need to be corrected. The sentences it generates are therefore more varied at the positions that need modification than those generated by ordinary beam search.
  Note that all of the above methods keep the same number of paths, n, but the contents of the paths differ. Experimental results show that, compared with existing methods, our diverse local beam search generates more diverse and accurate top-n candidates, with almost no performance degradation on the GEC evaluation datasets.
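As a reference point for behavior (a) above, ordinary beam search can be sketched in a few lines. This is a toy version under simplifying assumptions: it scores token sequences against fixed per-step distributions instead of a real model's prefix-conditioned distributions, and the function name `standard_beam_search` is illustrative, not from the paper.

```python
import numpy as np

def standard_beam_search(step_log_probs, beam_width=3):
    """Minimal ordinary (non-diverse) beam search over a fixed score table.

    step_log_probs: list of (vocab,) arrays of token log-probabilities,
    one array per decoding step (a toy stand-in for a real model).
    Returns the beam_width highest-scoring (sequence, score) pairs.
    """
    beams = [([], 0.0)]  # each beam is (token sequence, cumulative log-prob)
    for log_probs in step_log_probs:
        # Expand every beam with every vocabulary token.
        candidates = [
            (seq + [tok], score + lp)
            for seq, score in beams
            for tok, lp in enumerate(log_probs)
        ]
        # Keep only the top beam_width hypotheses by score.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams
```

Because the highest-probability token dominates at every step, the surviving hypotheses share most of their prefix, which is exactly the lack of diversity that motivates the proposed method.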

2. Related work

Recently, several studies have proposed beam search variants for obtaining diverse outputs. Li et al. (2016) modified standard beam search to penalize the scores of hypotheses sharing the same parent node, so that their algorithm favors hypotheses from different parent nodes. Vijayakumar et al. (2018) proposed a method that divides the beam into several groups and performs beam search within each group, adding a constraint that makes it difficult to select tokens already selected by other groups at the same time step. Kulikov et al. (2019) proposed an iterative beam search that produces a more diverse set of candidate responses in neural dialogue modeling. However, none of these studies distinguishes the parts of a sentence that should be left unchanged from the parts that need to be corrected.

3. Diverse local beam search

Diverse local beam search encourages candidates with varied corrections in the parts of the input sentence that must be corrected, and discourages candidates from rewriting the parts that are already correct. It can therefore generate diverse candidates with relatively little computation. For this purpose, at each time step $t$, a penalty score $s_{b,t}$ is assigned to each beam $b$ to indicate whether a correction should be made. Although the penalty score could be computed in different ways, in this study we use the copy probability of the copy-augmented model as the penalty score $s_{b,t}$; we explain the copy-augmented model in detail in Section 4.1. Using the penalty score, the beam search score $k_{b,t}$ is penalized as follows:
$$k_{b,t}=(\lambda s_{b,t}+\beta)\log p_{b,t}\tag{1}$$
where $p$ is the output distribution of the GEC model, and $\beta$ and $\lambda$ are hyperparameters: $\beta$ prevents the penalty from falling to zero, and $\lambda$ determines the strength of the penalty.
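The penalty in Equation 1 can be sketched as a per-beam reweighting of token log-probabilities. This is a minimal illustration, not the paper's code; the function name `penalized_scores` and the array shapes are assumptions.

```python
import numpy as np

def penalized_scores(log_probs, s, lam=4.0, beta=1.0):
    """Apply the Eq. (1) penalty k = (lambda * s + beta) * log p per beam.

    log_probs: (num_beams, vocab) array of log p_{b,t} from the model.
    s:         (num_beams,) penalty scores s_{b,t}, one per beam.
    Because log-probabilities are negative, a larger weight
    (lambda * s + beta) makes the score more negative, so beams
    with a high penalty score are expanded less during the search.
    """
    weight = lam * s + beta              # (lambda * s_{b,t} + beta)
    return weight[:, None] * log_probs   # broadcast weight over the vocab
```

Note that with $\beta > 0$ the weight never reaches zero, so even heavily penalized beams retain a ranking among their own tokens.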

4. Experiment

4.1 Model

We use the copy-augmented model as the GEC model. The model controls the balance between the copy distribution $p^{copy}$ and the generation distribution $p^{gen}$ through a balance factor $\alpha^{copy}$. Here, $p^{copy}$ is the probability distribution over tokens copied from the source sentence, and $p^{gen}$ is the probability distribution over generated output tokens. The final output distribution $p_{b,t}$ is computed as follows:
$$p_{b,t}=(1-\alpha^{copy}_{b,t})p^{gen}_{b,t}+\alpha^{copy}_{b,t}p^{copy}_{b,t}\tag{2}$$
Copying a token from the source sentence versus generating a new one can be regarded as a choice between leaving the token unchanged and correcting it. Therefore, we use $\alpha^{copy}_{b,t}$ from Equation 2 as the penalty score $s_{b,t}$ in Equation 1. For the diverse local beam search, we set $\beta=1.0$ and $\lambda=4.0$.
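Putting Equations 1 and 2 together, one expansion step of the search can be sketched as follows. This is a simplified illustration under assumed array shapes, with $\alpha^{copy}_{b,t}$ reused as the penalty score; the function name `diverse_local_step` is hypothetical, not from the paper.

```python
import numpy as np

def diverse_local_step(p_gen, p_copy, alpha_copy, lam=4.0, beta=1.0, beam_width=2):
    """One expansion step of diverse local beam search (sketch).

    p_gen, p_copy: (num_beams, vocab) generation and copy distributions.
    alpha_copy:    (num_beams,) copy balance factors, used as s_{b,t}.
    Returns the beam_width (beam, token) pairs with the best penalized scores.
    """
    # Eq. (2): mix the copy and generation distributions per beam.
    p = (1 - alpha_copy)[:, None] * p_gen + alpha_copy[:, None] * p_copy
    # Eq. (1): weight the log-probabilities by the per-beam penalty.
    k = (lam * alpha_copy + beta)[:, None] * np.log(p)
    # Select the top-scoring continuations across all beams jointly.
    flat = np.argsort(k, axis=None)[::-1][:beam_width]
    return [divmod(i, p.shape[1]) for i in flat]
```

In this sketch, a beam with a high copy probability (a token that should be left as-is) receives a heavy weight on its negative log-probabilities and is crowded out, so the search spends its width on positions where a correction is likely needed.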

Origin blog.csdn.net/qq_28385535/article/details/113174224