Evaluating Theory of Mind in QA

Reasoning about beliefs

  1. Possessing a capacity similar to human reasoning has been argued to be necessary for the success of artificial intelligence systems.

  2. One well-studied domain that requires reasoning is question answering, where simply memorizing and looking up information is often not enough to correctly answer a question.

  3. The Facebook bAbi dataset provides a set of simple reasoning tasks for question answering.

  4. People reason not just about their own observations and beliefs, but also about others' mental states (such as beliefs and intentions). The capacity to recognize that others can have mental states different from one's own, known as theory of mind (ToM), marks an important milestone in the development of children and has been extensively studied by psychologists. ToM review.

  5. The Sally-Anne task combined with a bAbi-style dataset is a first step in designing benchmarks to evaluate the mental-state reasoning capacity of question-answering models, but it is still limited in the types of reasoning it probes. Evaluation relies on only one question per story. A single correct answer does not guarantee that a model understands the state of the world; in fact, even in developmental theory-of-mind experiments, children are asked several questions to ensure that a correct answer reflects their understanding and is not simply due to chance.

  6. In this paper, we address these shortcomings by designing a new dataset that enables us to evaluate a model's capacity to reason about different types of beliefs, as well as whether it maintains a correct understanding of the world.
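
To make the point about asking several questions concrete, here is a minimal sketch of a Sally-Anne style story in Python. It is not the paper's dataset format or generation code; the `World` class, the event methods, and the question names are all hypothetical and chosen only for illustration. The sketch assumes an agent updates its belief about an object only when it is present while the object is moved.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    """Hypothetical toy tracker for a Sally-Anne style story."""
    location: dict = field(default_factory=dict)        # object -> true container
    first_location: dict = field(default_factory=dict)  # object -> first container
    present: set = field(default_factory=set)           # agents currently in the room
    beliefs: dict = field(default_factory=dict)         # agent -> {object: container}

    def enter(self, agent):
        self.present.add(agent)
        self.beliefs.setdefault(agent, {})

    def exit(self, agent):
        self.present.discard(agent)

    def move(self, obj, container):
        # The true state always changes.
        self.location[obj] = container
        self.first_location.setdefault(obj, container)
        # Only agents who witness the move update their belief (assumption).
        for witness in self.present:
            self.beliefs[witness][obj] = container


def run_sally_anne():
    w = World()
    w.enter("Sally")
    w.enter("Anne")
    w.move("marble", "basket")  # Sally puts the marble in the basket
    w.exit("Sally")             # Sally leaves and misses the next event
    w.move("marble", "box")     # Anne moves the marble to the box
    # Several questions per story, not just one.
    return {
        "sally_belief": w.beliefs["Sally"]["marble"],  # where will Sally look?
        "anne_belief": w.beliefs["Anne"]["marble"],    # where will Anne look?
        "reality": w.location["marble"],               # where is the marble really?
        "memory": w.first_location["marble"],          # where was it at the beginning?
    }


answers = run_sally_anne()
for question, answer in answers.items():
    print(question, "->", answer)
```

A model that always answers with the object's last location would get the reality question right and the question about Sally's belief wrong, while a model that always answers with the first location would do the opposite. Scoring all questions for the same story is what separates these shortcuts from actual belief tracking.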

Reposted from www.cnblogs.com/lifengfan/p/10012046.html