Introduction and Prospects of Automatic Question Answering Technology: Rhino-Bird Research Camp Notes

These are my notes on "Introduction and Prospects of Automatic Question Answering Technology", a talk given by Li Peng in the fifth lesson of Tencent's 2020 Rhino-Bird summer research camp.

 

IBM's DeepQA system (Watson) defeated two of the strongest human champions on Jeopardy! in 2011.

It was a complex system built on non-deep-learning methods.

 

Elements of deep learning success:

1. Massive data.

CNN/DailyMail data set.

SQuAD dataset. Keywords: 1. Manually annotated 2. Large scale, with 100k+ questions 3. Involves a variety of reasoning types

SQuAD 2.0 dataset.

CoQA dataset.

2. Pre-trained models: BERT (the most important), OpenAI GPT, and ELMo.
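To make the SQuAD-style datasets listed above more concrete, here is a minimal sketch of what such an example looks like: the answer is a span of the passage, located by a character offset, and SQuAD 2.0 additionally contains questions that cannot be answered from the passage. The field names follow the public SQuAD JSON layout as I understand it (`context`, `question`, `answers`, `answer_start`, `is_impossible`); the passage, the questions, and the helper `answer_span_is_valid` are invented for illustration.

```python
from typing import List, TypedDict


class Answer(TypedDict):
    text: str
    answer_start: int  # character offset of the answer inside the context


class Example(TypedDict):
    context: str
    question: str
    answers: List[Answer]
    is_impossible: bool  # SQuAD 2.0 style flag for unanswerable questions


# Invented passage and questions, not taken from any real dataset.
CONTEXT = "The reading comprehension workshop was held in Shenzhen in 2020."

answerable: Example = {
    "context": CONTEXT,
    "question": "Where was the workshop held?",
    "answers": [{"text": "Shenzhen", "answer_start": CONTEXT.index("Shenzhen")}],
    "is_impossible": False,
}

unanswerable: Example = {
    "context": CONTEXT,
    "question": "Who sponsored the workshop?",
    "answers": [],
    "is_impossible": True,
}


def answer_span_is_valid(example: Example) -> bool:
    """Check that every answer text really occurs at its stated offset."""
    if example["is_impossible"]:
        # Unanswerable questions must not carry any answer span.
        return len(example["answers"]) == 0
    if not example["answers"]:
        return False
    context = example["context"]
    return all(
        context[a["answer_start"]:a["answer_start"] + len(a["text"])] == a["text"]
        for a in example["answers"]
    )
```

A check like this is useful mainly as a sanity test when loading data; it says nothing about how a model predicts the span.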

 

 

Challenges:

1. Multi-step reasoning (HotpotQA).

2. Discrete/symbolic reasoning (DROP).

3. Commonsense reasoning (CommonsenseQA).

4. Open-domain question answering.

5. Computational efficiency (as models grow, computation becomes more expensive).
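As a rough illustration of why discrete reasoning (the DROP-style challenge above) goes beyond span extraction: the answer to a "how many more" question may not appear anywhere in the passage and has to be computed from numbers in it. The passage, the question, and both functions below are invented for this sketch; the regular expression is a stand-in for whatever a real system would use to read numbers, not a description of any actual model.

```python
import re
from typing import List

# Invented passage and question, for illustration only.
PASSAGE = "The home team scored 31 points, while the visitors scored 17 points."
QUESTION = "How many more points did the home team score than the visitors?"


def numbers_in(text: str) -> List[int]:
    """Extract all integers that appear in the text."""
    return [int(m) for m in re.findall(r"\d+", text)]


def answer_by_subtraction(passage: str) -> int:
    """Toy symbolic step: difference between the largest and smallest number."""
    values = numbers_in(passage)
    return max(values) - min(values)


answer = answer_by_subtraction(PASSAGE)

# The computed answer is not a substring of the passage, so a model that can
# only point at a span of text could not produce it.
answer_is_a_span = str(answer) in PASSAGE
```

Here the computed difference never occurs as text in the passage, which is the point: a pure span-extraction model has nothing to select.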

 

The development of question answering technology is inseparable from well-designed datasets.

Which problems remain unresolved? Is the design of a dataset sufficient to reflect the problem it targets? To some extent, pre-trained models can serve as a touchstone for these questions.

 

 

 

 

 

 


Origin blog.csdn.net/zjy997/article/details/107470511