NLP Text Summarization: Dataset Introduction and Preprocessing [CNN/DailyMail]

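The CNN/DailyMail summarization dataset was first proposed in the paper "Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond".
On average, source documents in the training set contain 766 words across 29.74 sentences, while summaries contain 53 words and 3.72 sentences. [Original: "The source documents in the training set have 766 words spanning 29.74 sentences on an average while the summaries consist of 53 words and 3.72 sentences"]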
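As a rough illustration of how averages like these could be computed, here is a minimal sketch over a list of (document, summary) pairs. The names `count_words`, `count_sentences`, `corpus_stats`, and `toy_pairs` are hypothetical, and the whitespace tokenizer and punctuation-based sentence splitter are simplifying assumptions, so the numbers would not exactly match those reported in the paper.

```python
import re


def count_words(text):
    # Assumption: whitespace tokenization; the paper's tokenizer may differ.
    return len(text.split())


def count_sentences(text):
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


def corpus_stats(pairs):
    # pairs: list of (document, summary) strings.
    if not pairs:
        raise ValueError("pairs must not be empty")
    totals = {"doc_words": 0, "doc_sents": 0, "sum_words": 0, "sum_sents": 0}
    for document, summary in pairs:
        totals["doc_words"] += count_words(document)
        totals["doc_sents"] += count_sentences(document)
        totals["sum_words"] += count_words(summary)
        totals["sum_sents"] += count_sentences(summary)
    # Average each total over the number of pairs.
    return {key: value / len(pairs) for key, value in totals.items()}


# Toy data for illustration only, not taken from CNN/DailyMail.
toy_pairs = [
    ("The cat sat on the mat. It was warm.", "A cat sat."),
    ("Rain fell all day. Streets flooded. Schools closed early.",
     "Rain flooded streets. Schools closed."),
]
stats = corpus_stats(toy_pairs)
print(stats)
```

Running the same function over the real training split would give the corpus-level averages, subject to the tokenization assumptions above.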

Origin blog.csdn.net/u013250861/article/details/121033651