KNOWLEDGE SOLVER: TEACHING LLMS TO SEARCH FOR DOMAIN KNOWLEDGE FROM KNOWLEDGE GRAPHS

This post is part of a series of LLM articles; it is a translation of the paper "KNOWLEDGE SOLVER: TEACHING LLMS TO SEARCH FOR DOMAIN KNOWLEDGE FROM KNOWLEDGE GRAPHS".

Abstract

Large language models (LLMs), such as ChatGPT and GPT-4, are versatile and can solve different tasks due to their emergent abilities and generalizability. However, LLMs sometimes lack the domain-specific knowledge needed to perform a task, which also causes hallucinations during reasoning. Some previous works trained add-on modules, such as graph neural networks (GNNs), on knowledge retrieved from external knowledge bases to alleviate this lack of domain-specific knowledge. However, incorporating additional modules has two drawbacks: 1) the modules need to be retrained whenever a new domain is encountered; 2) they become a bottleneck because the strong abilities of LLMs are not fully utilized for retrieval. In this paper, we propose a paradigm called Knowledge Solver (KSL), which teaches LLMs to search for essential knowledge from external knowledge bases by harnessing their own strong generalizability. Specifically, we design a simple yet effective prompt that transforms retrieval into a multi-hop decision sequence, enabling LLMs to search for knowledge in a zero-shot manner. In addition, KSL provides the complete retrieval path, which improves the explainability of the LLMs' reasoning process. We conduct experiments on three datasets: CommonsenseQA, OpenbookQA, and MedQA-USMLE, and find that our approach improves LLM baseline performance by a relatively large margin.
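To make the "multi-hop decision sequence" idea concrete, here is a minimal sketch of how such a prompt loop could look. It assumes a toy knowledge graph stored as adjacency lists and a generic `call_llm` stand-in for any chat-completion API; the graph, entity names, and prompt wording are all hypothetical illustrations, not the paper's actual interface or prompt.

```python
import random

# Hypothetical sketch of KSL-style multi-hop knowledge search: at each hop
# the LLM sees the current entity and its outgoing (relation, neighbor)
# edges, and picks the edge most likely to lead toward the answer.

TOY_KG = {
    "fountain": [("at_location", "park"), ("used_for", "drinking_water")],
    "park": [("has_part", "playground"), ("at_location", "city")],
    "drinking_water": [("used_for", "quenching_thirst")],
}

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text reply."""
    raise NotImplementedError

def pick_start_entity(question_entities):
    """The paper currently picks the initial question entity at random."""
    return random.choice(question_entities)

def search_knowledge(question: str, start_entity: str, max_hops: int = 3):
    """Walk the KG hop by hop, letting the LLM choose each edge zero-shot."""
    path = [start_entity]
    entity = start_entity
    for _ in range(max_hops):
        edges = TOY_KG.get(entity, [])
        if not edges:
            break
        # Number the candidate edges so the LLM can answer with one index.
        options = "\n".join(
            f"{i}. {entity} --{rel}--> {nbr}"
            for i, (rel, nbr) in enumerate(edges)
        )
        prompt = (
            f"Question: {question}\n"
            f"Current entity: {entity}\n"
            f"Pick the edge most helpful for answering (or reply STOP):\n"
            f"{options}"
        )
        reply = call_llm(prompt).strip()
        if not reply or reply.upper().startswith("STOP") or not reply[0].isdigit():
            break
        relation, entity = edges[int(reply[0]) % len(edges)]
        path += [relation, entity]  # the path doubles as the retrieval trace
    return path
```

The returned `path` lists every entity and relation visited, which is what makes the retrieval process inspectable, matching the abstract's claim that KSL provides a complete retrieval path.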

1 Introduction

2 Related work

3 Problem definition

4 Methods

5 Experiments

6 Conclusion

In this paper, we propose Knowledge Solver (KSL), which helps LLMs better perform tasks that demand domain-specific knowledge, in both zero-shot and fine-tuning settings. Given an external knowledge base, LLMs can leverage their own strong abilities to search for the knowledge and information needed to perform the relevant task, without additional training or add-on modules. Our interactive reasoning method not only explicitly injects knowledge into LLMs but also guides them in solving tasks. We also show that the performance improvements come primarily from our specially designed reasoning procedure (for zero-shot) and training tasks (for fine-tuning), rather than from instruction tuning. Currently, the initial question entity for our interactive reasoning approach is selected at random; we leave the study of how to better select the initial entity to future work.
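Tying this back to the sketch after the abstract, a hypothetical invocation would look as follows; the random draw of the start entity mirrors the current initialization the conclusion identifies as future work.

```python
# Hypothetical usage of the earlier sketch; entity names are illustrative.
question = "Where can you find a fountain that people walk past?"
start = pick_start_entity(["fountain", "park"])  # random initial entity
print(search_knowledge(question, start))
# e.g. ['fountain', 'at_location', 'park', ...]
```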

Source: blog.csdn.net/c_cpp_csharp/article/details/132887798