Sequence to Sequence Learning with Neural Networks: Paper Notes

0. Description

Learning to model sequence-to-sequence mappings with deep neural networks.

https://ai.deepshare.net/detail/p_5d54e025bab7d_EUVqzfFX/6

One of 30 papers selected for intensive reading; the goal is to understand why the LSTM architecture can model language and speech.

1. Abstract

Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayer Long Short-Term Memory (LSTM) network to map the input sequence to a vector of fixed dimensionality, and then another deep LSTM to decode the target sequence from that vector. Our main result is that on the English-to-French translation task from the WMT'14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score is penalized on out-of-vocabulary words. In addition, the LSTM has no difficulty with long sentences. For comparison, a phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we use the LSTM to rerank the 1000 hypotheses produced by that SMT system, its BLEU score increases to 36.5, which is close to the previous best result on this task. The LSTM also learns sensible phrase and sentence representations that are sensitive to word order and relatively invariant to active versus passive voice. Finally, we find that reversing the order of the words in all source sentences (but not target sentences) improves the LSTM's performance markedly, because doing so introduces many short-term dependencies between the source and target sentences, which makes the optimization problem easier.
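
To make the encoder-decoder picture concrete, here is a minimal PyTorch sketch. It is not the authors' code: the embedding size, hidden size, layer count, and vocabulary sizes below are illustrative assumptions (the paper itself uses deep 4-layer LSTMs with 1000 cells per layer and 1000-dimensional word embeddings). The sketch shows the two LSTMs, the fixed-dimensional vector passed between them, and the source-reversal trick mentioned at the end of the abstract.

```python
# Minimal encoder-decoder sketch, assuming toy hyperparameters
# (not the paper's exact configuration).
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hidden_dim=512, num_layers=2):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # Encoder LSTM: reads the (reversed) source sentence and compresses it
        # into its final hidden/cell states -- the fixed-dimensional vector.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, num_layers, batch_first=True)
        # Decoder LSTM: initialized with the encoder's final states, then
        # unrolled over the target sentence.
        self.decoder = nn.LSTM(emb_dim, hidden_dim, num_layers, batch_first=True)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Reverse the source sequence, as reported in the abstract: it creates
        # short-term dependencies between the start of the source and the
        # start of the target, easing optimization.
        src_rev = torch.flip(src_ids, dims=[1])
        _, (h, c) = self.encoder(self.src_emb(src_rev))
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), (h, c))
        return self.out(dec_out)  # logits over the target vocabulary

# Usage with toy sizes (teacher forcing during training):
model = Seq2Seq(src_vocab=10000, tgt_vocab=10000)
src = torch.randint(0, 10000, (8, 15))  # batch of 8 source sentences, length 15
tgt = torch.randint(0, 10000, (8, 12))  # corresponding target sentences, length 12
logits = model(src, tgt)                # shape: (8, 12, 10000)
```

At test time the decoder would instead be run one step at a time, feeding back its own predictions (the paper uses beam search for this); the snippet above only illustrates the training-time data flow.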
