The latest roundup! When the large language model (LLM) meets the knowledge graph: the two technologies complement each other

Author | Du Wei

Source | Heart of the Machine


A richly illustrated review that clarifies the current state of research; this 29-page paper is worth reading.

Large language models (LLMs) are already strong, but can be even stronger. By combining them with knowledge graphs, LLMs are expected to overcome many of their problems, such as lack of factual knowledge, hallucination, and poor interpretability; in turn, LLMs can help knowledge graphs with their powerful text and language understanding capabilities. And if the two can be fully integrated, we may also get a more versatile artificial intelligence.

Today we introduce a paper that surveys research combining LLMs and knowledge graphs, covering progress on using knowledge graphs to enhance LLMs, results on using LLMs to enhance knowledge graphs, and recent work on LLM-knowledge graph synergy. The general frameworks presented in the paper make it a convenient reference for readers.


Paper link:

https://arxiv.org/abs/2306.08302

Large language models (LLMs) pre-trained on large-scale corpora, such as BERT, RoBERTa, and T5, can handle a wide variety of natural language processing (NLP) tasks very well, such as question answering, machine translation, and text generation. Recently, with the dramatic growth in model size, LLMs have further acquired emergent capabilities, opening a path toward using LLMs for artificial general intelligence (AGI). State-of-the-art LLMs such as ChatGPT and PaLM2 have tens to hundreds of billions of parameters, and they have the potential to solve many complex real-world tasks in areas such as education, code generation, and recommendation.

Although LLMs have had many successful applications, they have been criticized for their lack of factual knowledge. Specifically, LLMs memorize the facts and knowledge contained in their training corpora. However, further research has shown that LLMs cannot reliably recall facts and often suffer from hallucination, generating statements that contain false facts. For example, if you ask an LLM, "When did Einstein discover gravity?" it might answer, "Einstein discovered gravity in 1687." But in fact, the person who proposed the theory of gravity was Isaac Newton. This kind of problem seriously damages the credibility of LLMs.

LLMs have also been criticized as black-box models that lack interpretability. LLMs represent knowledge implicitly through their parameters, so it is difficult to interpret and verify the knowledge they have acquired. Furthermore, LLMs perform reasoning through probabilistic models, which is a non-deterministic process; it is difficult for humans to directly obtain details of, and explanations for, the specific patterns and functions LLMs use to arrive at predictions and decisions.

Although some LLMs can explain their own predictions by producing a chain of thought, the explanations they generate still suffer from hallucination. This can seriously affect the application of LLMs in high-stakes scenarios such as medical diagnosis and legal judgment. For example, in a medical diagnosis scenario, an LLM may misdiagnose and provide explanations that contradict common medical knowledge. This leads to another problem: LLMs trained on general corpora may not generalize well to domain-specific or new knowledge, due to a lack of domain-specific knowledge or of new training data.

To solve the above problems, a potential solution is to integrate knowledge graphs (KGs) into LLMs. Knowledge graphs can store huge amounts of facts in the form of triples, namely (head entity, relation, tail entity), so knowledge graphs are a structured and deterministic form of knowledge representation; examples include Wikidata, YAGO, and NELL.
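
To make the triple format concrete, here is a minimal sketch (not from the paper) of a toy knowledge graph stored as (head entity, relation, tail entity) triples and queried with a simple lookup:

```python
# A minimal sketch of how a knowledge graph stores facts as
# (head entity, relation, tail entity) triples and answers a simple lookup.

triples = [
    ("Isaac Newton", "proposed", "law of universal gravitation"),
    ("Albert Einstein", "proposed", "theory of relativity"),
    ("Albert Einstein", "born_in", "Ulm"),
]

def query(head, relation, kg=triples):
    """Return all tail entities matching (head, relation, ?)."""
    return [t for h, r, t in kg if h == head and r == relation]

print(query("Isaac Newton", "proposed"))
# ['law of universal gravitation']
```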

Knowledge graphs are critical for a variety of applications because they provide accurate and unambiguous knowledge. They are also known for strong symbolic reasoning capabilities, which can produce interpretable results. Knowledge graphs can evolve actively as new knowledge is continuously added. In addition, experts can build domain-specific knowledge graphs that provide accurate and reliable domain-specific knowledge.

However, knowledge graphs are difficult to construct, and because real-world knowledge graphs are often incomplete and dynamically changing, current knowledge graph methods struggle to cope: they cannot effectively model unseen entities or represent new knowledge. Moreover, the rich textual information in knowledge graphs is often overlooked. On top of that, existing knowledge graph methods are often customized for specific knowledge graphs or tasks and generalize poorly. It is therefore natural to use LLMs to address the challenges faced by knowledge graphs. Figure 1 summarizes the advantages and disadvantages of LLMs and knowledge graphs.


▲ Figure 1: Summary of advantages and disadvantages of LLM and knowledge graph

As shown, the advantages of LLMs are general knowledge, language processing, and generalizability. The disadvantages of LLMs are implicit knowledge, hallucination, indecisiveness, black-box behavior, and a lack of domain-specific and new knowledge. The advantages of knowledge graphs are structured knowledge, accuracy, decisiveness, interpretability, domain-specific knowledge, and evolving knowledge. The disadvantages of knowledge graphs are incompleteness, lack of language understanding, and unseen facts.

Recently, the possibility of combining LLMs and knowledge graphs has attracted increasing attention from researchers and practitioners. LLMs and knowledge graphs are intrinsically interconnected and mutually reinforcing. When knowledge graphs are used to enhance LLMs, they can not only be integrated into the pre-training and inference stages of LLMs to provide external knowledge, but can also be used to analyze LLMs and provide interpretability.

In terms of using LLMs to enhance knowledge graphs, LLMs have been applied to a variety of knowledge graph tasks, such as knowledge graph embedding, knowledge graph completion, knowledge graph construction, knowledge graph-to-text generation, and knowledge graph question answering. LLMs can improve the performance of knowledge graphs and facilitate their application. In research on LLM and knowledge graph collaboration, researchers combine the advantages of the two so that their capabilities in knowledge representation and reasoning are mutually enhanced.

This paper provides a forward-looking roadmap for combining LLMs and knowledge graphs, helping readers understand how to leverage their respective advantages and overcome their respective limitations for different downstream tasks. It contains detailed taxonomies and comprehensive summaries, and points to emerging directions in these rapidly evolving fields. The main contributions of the paper include:

1. Roadmap: The paper provides a forward-looking roadmap for LLM and knowledge graph integration. This roadmap contains three general frameworks for combining LLMs and knowledge graphs: enhancing LLMs with knowledge graphs, enhancing knowledge graphs with LLMs, and synergy between LLMs and knowledge graphs. It provides guidelines for combining these two distinct but complementary technologies.

2. Taxonomy and summary assessment: For each integration framework in this roadmap, a detailed and novel taxonomy is provided. For each category, relevant research is summarized and evaluated from the perspectives of different integration strategies and tasks, which provides more insight into each framework.

3. Covers new developments: The paper covers advanced techniques for LLM and knowledge graphs. It discusses the current state-of-the-art LLMs such as ChatGPT and GPT-4, as well as new knowledge graph technologies such as multimodal knowledge graphs.

4. Challenges and future directions: The paper also discusses the challenges of current research and some potential future research directions.


LLM and Knowledge Graph Basics

Large Language Models (LLM)

LLMs pre-trained on large-scale corpora can solve a variety of NLP tasks and have great potential. As shown in Figure 3, most LLMs derive from the Transformer design, which contains encoder and decoder modules and employs self-attention. Based on their architecture, LLMs can be divided into three major categories: encoder-only LLMs, encoder-decoder LLMs, and decoder-only LLMs. Figure 2 summarizes some representative LLMs, covering different architectures, model sizes, and whether they are open-source.


▲ Figure 2: Representative LLMs in recent years. Solid boxes represent open-source models, while open boxes represent closed-source models.


▲ Figure 3: Schematic diagram of LLM based on Transformer and using self-attention mechanism

Prompt Engineering

Prompt engineering is a new field concerned with creating and optimizing prompts so that LLMs can be most effective across a variety of applications and research domains. As shown in Figure 4, a prompt is a natural language input sequence given to the LLM, created for a specific task such as sentiment classification. A prompt can contain several elements: an instruction, background information, and the input text. The instruction is a short sentence that tells the model to perform a specific task. Background information provides context for the input text or few-shot examples. The input text is the text the model needs to process.


▲ Figure 4: An example of a sentiment classification prompt

The goal of prompt engineering is to improve the ability of LLMs to deal with diverse and complex tasks, such as question answering, sentiment classification, and commonsense reasoning. Chain-of-thought (CoT) prompts enable complex reasoning through intermediate reasoning steps. Another approach is to design better knowledge-augmented prompts by incorporating external knowledge. Automatic prompt engineer (APE) is an automatic prompt generation method that can improve LLM performance. Prompts make it possible to exploit the potential of an LLM without fine-tuning it. Mastering prompt engineering leads to a better understanding of the strengths and weaknesses of LLMs.
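
As a concrete illustration of the prompt elements described above, here is a minimal sketch, assuming a generic text-completion interface (the llm.generate call is a placeholder, not a specific API):

```python
# Assemble the three prompt elements: instruction, background information
# (few-shot examples), and input text for a sentiment classification task.

instruction = "Classify the sentiment of the input text as positive or negative."
background = (
    "Text: The movie was a delight from start to finish. Sentiment: positive\n"
    "Text: The plot dragged and the acting was flat. Sentiment: negative\n"
)
input_text = "Text: The soundtrack alone makes this film worth watching. Sentiment:"

prompt = f"{instruction}\n\n{background}{input_text}"
print(prompt)
# The assembled string would then be sent to an LLM, e.g.:
# response = llm.generate(prompt)   # hypothetical call
```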

Knowledge Graph (KG)

A knowledge graph stores structured knowledge in the form of (entity, relation, entity) triples. Based on the kind of information stored, existing knowledge graphs can be divided into four categories: encyclopedic knowledge graphs, commonsense knowledge graphs, domain-specific knowledge graphs, and multimodal knowledge graphs. Figure 5 shows examples of the different categories of knowledge graphs.


▲ Figure 5: Examples of knowledge graphs of different categories

Applications

Both LLM and knowledge graphs have a wide range of applications. Table 1 summarizes some representative applications of LLM and knowledge graph.


▲ Table 1: Representative applications of LLM and knowledge graph


Roadmap and Classification

Below, a roadmap is first given showing the frameworks for combining LLMs and knowledge graphs, and related research is then classified.

Roadmap

Figure 6 shows the roadmap for combining LLM and knowledge graph. This roadmap contains three frameworks for combining LLM and knowledge graphs: enhancing LLM with knowledge graphs, enhancing knowledge graphs with LLM, and synergy between LLM and knowledge graphs.


▲ Figure 6: General roadmap for joint knowledge graph and LLM


▲ Figure 7: The general framework of LLM and knowledge graph collaboration, which contains four layers: data, collaborative model, technique, and application

Classification

In order to better understand the research on joint LLM and knowledge graph, the paper further provides a fine-grained classification of each framework. Specifically, here we focus on different approaches to integrate LLM and knowledge graphs, namely: enhancing LLM with knowledge graphs, enhancing knowledge graphs with LLM, and synergizing LLM and knowledge graphs. Figure 8 shows the classification of relevant studies in a fine-grained manner.


▲ Figure 8: Classification of related research on joint LLM and knowledge graph


Enhancing LLM with Knowledge Graphs

Large language models perform well on many natural language processing tasks. However, LLMs have also been criticized for lacking factual knowledge and often generating factual errors during reasoning. One way to solve this problem is to augment LLMs with knowledge graphs.

There are several specific approaches. The first is to use knowledge graphs to enhance LLM pre-training, with the aim of injecting knowledge into the LLM during the pre-training stage. The second is to use knowledge graphs to augment LLM inference, which allows the LLM to take up-to-date knowledge into account when generating sentences. The third is to use knowledge graphs to enhance LLM interpretability, so that we can better understand the behavior of LLMs. Table 2 summarizes typical methods for enhancing LLMs with knowledge graphs.


▲ Table 2: Methods for enhancing LLM with knowledge graphs

Enhancing LLM pre-training with knowledge graphs

Existing LLMs mainly rely on performing unsupervised training on large-scale corpora. Although these models excel on downstream tasks, they lack practical knowledge relevant to the real world. In terms of integrating knowledge graphs into LLM, previous studies can be divided into three categories: integrating knowledge graphs into training objectives, integrating knowledge graphs into the input of LLM, and integrating knowledge graphs into additional fusion modules.


▲ Figure 9: Injecting knowledge graph information into the training objective of the LLM through a text-knowledge alignment loss, where h represents the hidden representation generated by the LLM.
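
The text-knowledge alignment idea in Figure 9 can be sketched as follows; this is a hedged PyTorch illustration rather than the exact formulation of any method in the survey:

```python
# A sketch of a text-knowledge alignment loss: the hidden representation h of
# an entity mention produced by the LLM is pulled toward the embedding of the
# linked knowledge graph entity.

import torch
import torch.nn.functional as F

def alignment_loss(h_mentions, kg_entity_embs):
    """h_mentions:     (batch, dim) LLM hidden states at entity-mention positions
       kg_entity_embs: (batch, dim) embeddings of the linked KG entities"""
    h = F.normalize(h_mentions, dim=-1)
    e = F.normalize(kg_entity_embs, dim=-1)
    return (1.0 - (h * e).sum(dim=-1)).mean()  # 1 - cosine similarity

h = torch.randn(8, 768)   # illustrative LLM hidden states
e = torch.randn(8, 768)   # illustrative KG entity embeddings
print(alignment_loss(h, e))
# During pre-training this term would be added to the language-modeling loss:
# loss = lm_loss + lambda_align * alignment_loss(h_mentions, kg_entity_embs)
```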


▲ Figure 10: Using the graph structure to inject knowledge graph information into the input of the LLM


▲ Figure 11: Integrating knowledge graph into LLM through additional fusion module

Enhancing LLM Reasoning with Knowledge Graphs

The above methods can effectively fuse knowledge into the textual representations of LLMs. However, real-world knowledge changes, and a limitation of these methods is that they do not allow the integrated knowledge to be updated without retraining the model. As a result, they may not generalize well to unseen knowledge at inference time.

For this reason, some research keeps the knowledge space and text space separate and injects knowledge at inference time. These methods mainly focus on question answering (QA) tasks, since QA requires the model to capture both textual semantics and up-to-date real-world knowledge.
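
A minimal sketch of inference-time knowledge injection, in the spirit of the retrieval-based approach in Figure 13 (only the retrieval and prompt construction are shown; the final LLM call is omitted):

```python
# Retrieve triples whose head entity appears in the question, verbalize them,
# and prepend them to the QA prompt so the LLM sees up-to-date facts.

triples = [
    ("Isaac Newton", "proposed", "law of universal gravitation"),
    ("Isaac Newton", "born_in", "1643"),
]

def retrieve(question, kg=triples):
    return [t for t in kg if t[0].lower() in question.lower()]

def build_prompt(question):
    facts = ". ".join(f"{h} {r.replace('_', ' ')} {t}" for h, r, t in retrieve(question))
    return f"Facts: {facts}.\nQuestion: {question}\nAnswer:"

print(build_prompt("When was Isaac Newton born?"))
# The prompt is then passed to the LLM for generation.
```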


▲ Figure 12: Dynamic knowledge graph fusion for LLM inference


▲ Figure 13: Enhancing LLM generation by retrieving external knowledge

Enhancing LLM Interpretability with Knowledge Graphs

Although LLMs perform well on many NLP tasks, they are still criticized for their lack of interpretability. LLM interpretability refers to understanding and explaining the inner workings and decision-making processes of large language models. Improving it increases the credibility of LLMs and facilitates their application in high-stakes scenarios such as medical diagnosis and legal judgment. Since knowledge graphs represent knowledge in a structured way, they can provide excellent interpretability for reasoning results. It is therefore natural for researchers to use knowledge graphs to improve the interpretability of LLMs; related research can be roughly divided into two categories: knowledge graphs for language model probing, and knowledge graphs for language model analysis.
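
One common probing recipe (LAMA-style cloze probing, an instance of the first category) can be sketched as follows; the masked_lm.fill call is a hypothetical placeholder:

```python
# Turn a KG triple into a cloze statement and compare the LLM's prediction
# for the masked slot against the ground-truth tail entity.

templates = {
    "capital_of": "The capital of [X] is [Y].",
    "born_in":    "[X] was born in [Y].",
}

def to_cloze(head, relation, mask_token="[MASK]"):
    return templates[relation].replace("[X]", head).replace("[Y]", mask_token)

probe = to_cloze("France", "capital_of")
print(probe)                              # "The capital of France is [MASK]."
# predicted = masked_lm.fill(probe)       # hypothetical masked-LM call
# correct = (predicted == "Paris")        # compare against the KG tail entity
```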


▲ Figure 14: General framework for probing language models with knowledge graphs


▲ Figure 15: General framework for language model analysis using knowledge graphs


Enhancing Knowledge Graphs with LLMs

The notable feature of knowledge graphs is their structured knowledge representation. They are suitable for many downstream tasks, such as question answering, recommendation, and web search. However, traditional knowledge graphs are often incomplete, and existing methods often do not consider textual information.

To address these problems, researchers have considered using LLMs to enhance knowledge graphs so that textual information can be taken into account, thereby improving performance on downstream tasks. Table 3 summarizes representative research efforts. This involves using LLMs to enhance knowledge graphs in different ways, including knowledge graph embedding, knowledge graph completion, knowledge graph construction, knowledge graph-to-text generation, and knowledge graph question answering.


▲ Table 3: Representative methods for enhancing knowledge graphs with LLM

Enhancing Knowledge Graph Embedding with LLM

The goal of Knowledge Graph Embedding (KGE) is to map each entity and relation to a low-dimensional vector (embedding) space. These embeddings contain the semantic and structural information of the knowledge graph and can be used in many tasks, such as question answering, reasoning, and recommendation. Traditional knowledge graph embedding methods mainly rely on the structural information of the knowledge graph to optimize a scoring function defined on the embeddings (such as TransE and DistMult). However, due to limited structural connectivity, these methods struggle to represent unseen entities and long-tail relations.
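
For reference, the TransE scoring function mentioned above can be written in a few lines; this is a generic sketch, not code from any specific KGE library:

```python
# TransE: a triple (h, r, t) is plausible when the head embedding translated
# by the relation embedding lands close to the tail embedding, i.e. h + r ≈ t.

import numpy as np

def transe_score(h_emb, r_emb, t_emb):
    """Lower distance = more plausible triple."""
    return np.linalg.norm(h_emb + r_emb - t_emb)

dim = 4
h, r, t = np.random.rand(dim), np.random.rand(dim), np.random.rand(dim)
print(transe_score(h, r, t))
# During training the distance is minimized for observed triples and pushed
# above a margin for corrupted (negative) triples.
```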

Figure 16 shows a recent line of research that addresses this issue by using LLMs to encode the textual descriptions of entities and relations, thereby enriching the representations of the knowledge graph.


▲ Figure 16: Using LLM as a text encoder for knowledge graph embedding


▲ Figure 17: LLMs for joint text and knowledge graph embedding

Enhancing Knowledge Graph Completion with LLM

The goal of Knowledge Graph Completion (KGC) task is to infer missing facts in a given knowledge graph. Similar to KGE, traditional KGC methods mainly focus on the structure of knowledge graphs without considering extensive textual information.

However, recent studies have integrated LLM into KGC methods to encode text or generate facts, achieving better KGC performance. Depending on how they are used, these methods fall into two categories: using LLMs as encoders (PaE) and using LLMs as generators (PaG).
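
A hedged sketch of the PaE idea using Hugging Face Transformers is shown below; the model choice (bert-base-uncased) and the scoring head are illustrative assumptions, and the head would need to be trained on KGC data before its scores are meaningful:

```python
# Verbalize a candidate triple as text, encode it with a pre-trained encoder,
# and score its plausibility with a small classification head.

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
score_head = torch.nn.Linear(encoder.config.hidden_size, 1)  # to be trained on KGC data

def score_triple(head, relation, tail):
    text = f"{head} [SEP] {relation} [SEP] {tail}"
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        cls = encoder(**inputs).last_hidden_state[:, 0]  # [CLS] representation
    return score_head(cls).item()  # higher = more plausible (after training)

print(score_triple("Albert Einstein", "field of work", "physics"))
```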


▲ Figure 18: A general framework for using LLM as an encoder (PaE) to complete a knowledge graph


▲ Figure 19: A general framework for using LLM as a generator (PaG) to complete a knowledge graph. En. and De. represent encoders and decoders, respectively.


▲ Figure 20: The framework of prompt-based PaG for knowledge graph completion

Enhancing Knowledge Graph Construction with LLM

Knowledge graph construction involves creating a structured representation of knowledge within a specific domain, including identifying entities and the relationships between them. The construction process usually involves multiple stages, including entity discovery, coreference resolution, and relation extraction. Figure 21 shows the general framework for using LLMs in the various stages of knowledge graph construction. Recent studies also explore end-to-end knowledge graph construction (building a complete knowledge graph in one step) and distilling knowledge graphs directly from LLMs.
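
For example, the relation-extraction stage can be driven by few-shot prompting, as in this minimal sketch (llm_complete is a hypothetical helper; a canned output is substituted so the snippet runs standalone):

```python
# Prompt-based relation extraction: show the LLM one worked example, then ask
# it to emit (head, relation, tail) triples for a new sentence.

def extract_triples(sentence):
    prompt = (
        "Extract (head entity, relation, tail entity) triples from the sentence.\n"
        "Sentence: Marie Curie was born in Warsaw and won the Nobel Prize.\n"
        "Triples: (Marie Curie, born_in, Warsaw); (Marie Curie, won, Nobel Prize)\n"
        f"Sentence: {sentence}\n"
        "Triples:"
    )
    # raw = llm_complete(prompt)  # hypothetical LLM call
    raw = "(Isaac Newton, proposed, law of universal gravitation)"  # stand-in output
    return [tuple(part.strip() for part in t.strip("() ").split(","))
            for t in raw.split(";")]

print(extract_triples("Isaac Newton proposed the law of universal gravitation."))
```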


▲ Figure 21: General framework of LLM-based knowledge graph construction


▲ Figure 22: General framework for distilling knowledge graphs from LLM

Enhancing Knowledge Graph to Text Generation with LLM

The goal of knowledge graph-to-text (KG-to-text) generation is to generate high-quality text that accurately and consistently describes the input knowledge graph information. Knowledge graph to text generation connects knowledge graph and text, which can significantly improve the usability of knowledge graph in more realistic natural language generation scenarios, including story creation and knowledge-based dialogue. However, collecting large amounts of knowledge graph-text parallel data is difficult and costly, which leads to insufficient training and poor generation quality.

Many studies have therefore been devoted to two questions: how to leverage the knowledge in LLMs, and how to build large-scale, weakly supervised knowledge graph-text corpora to address the data shortage.
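
A common way to leverage LLM knowledge here is to linearize the input triples into a prompt, as in this hedged sketch (only the prompt construction is shown; llm_complete is a hypothetical call):

```python
# Linearize a small KG subgraph and ask an LLM to verbalize it as fluent text.

def kg_to_text(triples):
    linearized = " | ".join(f"{h} : {r} : {t}" for h, r, t in triples)
    prompt = (
        "Write one fluent sentence that accurately describes all of these facts.\n"
        f"Facts: {linearized}\n"
        "Sentence:"
    )
    return prompt  # in practice: return llm_complete(prompt)

print(kg_to_text([
    ("Allen Forrest", "genre", "acoustic music"),
    ("Allen Forrest", "birth place", "Fort Campbell"),
]))
```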


▲ Figure 23: General framework for knowledge graph-to-text generation

Enhancing Knowledge Graph Question Answering with LLM

Knowledge Graph Question Answering (KGQA) aims to find answers to natural language questions based on the structured facts stored in a knowledge graph. An unavoidable challenge in KGQA is to retrieve relevant facts and extend the reasoning advantages of knowledge graphs to the question answering task. Therefore, recent research adopts LLMs to bridge the gap between natural language questions and structured knowledge graphs.

Figure 24 presents a general framework for using LLMs for KGQA, where the LLM can serve as an entity/relation extractor and as an answer reasoner.
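
The two LLM roles in Figure 24 can be sketched as a simple pipeline; llm_complete is a hypothetical helper, and the extraction step is stubbed so the snippet runs on its own:

```python
# KGQA pipeline sketch: (1) LLM as entity/relation extractor, (2) KG lookup,
# (3) LLM as answer reasoner over the retrieved facts.

kg = [("Paris", "capital_of", "France"), ("Berlin", "capital_of", "Germany")]

def kgqa(question):
    # Step 1: LLM maps the question onto the KG schema (stubbed output shown).
    # entity, relation = llm_complete(f"Extract (entity, relation) from: {question}")
    entity, relation = "France", "capital_of"
    # Step 2: retrieve matching facts from the KG.
    facts = [(h, r, t) for h, r, t in kg if t == entity and r == relation]
    # Step 3: LLM reasons over the retrieved facts to produce the answer.
    # return llm_complete(f"Facts: {facts}\nQuestion: {question}\nAnswer:")
    return facts[0][0] if facts else None

print(kgqa("What is the capital of France?"))  # -> "Paris"
```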


▲ Figure 24: General framework for using LLM for knowledge graph question answering


Collaboration between LLM and knowledge graph

The collaboration of LLMs and knowledge graphs has gained a lot of attention in recent years. This approach combines the advantages of LLMs and knowledge graphs to better handle various downstream tasks. For example, the LLM can be used to understand natural language, while the knowledge graph serves as a knowledge base providing factual knowledge. Combining LLMs and knowledge graphs can lead to powerful models for knowledge representation and reasoning.

Here we focus on the collaboration between LLM and knowledge graphs from two aspects: knowledge representation and reasoning. Table 4 summarizes representative research efforts.


▲ Table 4: Summary of methods for LLM and knowledge graph collaboration

Knowledge Representation

Both text corpora and knowledge graphs contain a large amount of knowledge. However, knowledge in a text corpus is usually implicit and unstructured, while knowledge in a knowledge graph is explicit and structured. Therefore, to represent this knowledge in a unified way, it is necessary to align the knowledge in the text corpus and the knowledge graph. Figure 25 presents a general framework for unifying LLM and knowledge graph for knowledge representation tasks.


▲ Figure 25: A general framework for unifying LLM and knowledge graphs for knowledge representation tasks

KEPLER is a unified model for knowledge embedding and pretrained language representation. KEPLER uses LLM to encode textual entity descriptions into their embeddings, and then jointly optimizes the knowledge embedding and language modeling objectives. JointGT proposes a knowledge graph-text joint representation learning model, in which three pre-training tasks are proposed to align knowledge graph and text representations.

DRAGON presents a self-supervised method for pre-training a joint language-knowledge foundation model from text and knowledge graphs. It takes text snippets and associated knowledge graph subgraphs as input and bidirectionally fuses information from the two modalities. DRAGON then uses two self-supervised reasoning tasks (masked language modeling and knowledge graph link prediction) to optimize the model parameters. HKLM introduces a joint LLM that integrates knowledge graphs to learn representations of domain-specific knowledge.
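
The KEPLER-style joint objective described above can be sketched roughly as follows; this is an assumption-laden illustration (TransE-style distance, negative sampling omitted, encode and mlm_loss as stand-ins), not KEPLER's exact loss:

```python
# Entity embeddings come from encoding entity descriptions with the LLM; a
# knowledge-embedding loss is then added to the masked language modeling loss.

import torch
import torch.nn.functional as F

def ke_loss(h_desc_emb, r_emb, t_desc_emb, margin=1.0):
    """Margin loss over description-derived entity embeddings
    (negative sampling omitted for brevity)."""
    dist = torch.norm(h_desc_emb + r_emb - t_desc_emb, dim=-1)
    return F.relu(dist - margin).mean()

# Joint objective per training step (lambda_ke is a weighting hyperparameter):
# loss = mlm_loss(batch_text) + lambda_ke * ke_loss(encode(h_desc), r_emb, encode(t_desc))
```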

Reasoning

To take advantage of the strengths of both LLMs and knowledge graphs, researchers also use their collaboration for reasoning tasks in various applications. In question answering, QA-GNN first uses the LLM to process the text question and then guides the reasoning steps over the knowledge graph. This builds a bridge between text and structured information, which can provide explanations for the reasoning process.

For knowledge graph reasoning tasks, LARK proposes an LLM-guided logical reasoning method: it first converts conventional logical rules into language sequences and then asks the LLM to reason over them to obtain the final output. Furthermore, Siyuan et al. unify structured reasoning and language model pre-training in a single framework. Given a text input, they use the LLM to generate logical queries, which are executed over the knowledge graph to obtain structured contextual information. Finally, this structured context is fused with the textual information to generate the final output.

RecInDial combines knowledge graph and LLM to provide personalized recommendation in dialogue system. KnowledgeDA proposes a unified domain language model development pipeline that augments the task-specific training process with domain knowledge graphs.


Future Directions

There are still many challenges to be solved in combining knowledge graphs and large language models. Some future research directions in this research area are briefly given below:

  • Using knowledge graphs to detect hallucinations in LLMs;

  • Using knowledge graphs to edit knowledge in LLMs;

  • Using knowledge graphs for knowledge injection into black-box LLMs;

  • Using multimodal LLMs for knowledge graphs;

  • Using LLMs to understand knowledge graph structure;

  • Synergizing LLMs and knowledge graphs for bidirectional reasoning.


