[NeurIPS && Graph Q&A] Cone Embedding Method for Knowledge Graph (KG) Multi-Hop Reasoning (Chinese Academy of Sciences, source code included)

Source: AINLPer WeChat public account (high-quality papers shared daily!!)
Editor: ShuYini
Proofreading: ShuYini
Time: 2022-09-29

Introduction

Knowledge graphs (KGs) store data in a graph database in the form of triples. The core of knowledge graph question answering is converting natural language into a query language over the graph data, that is, mapping natural language questions to structured queries. When some links are missing from a KG, it becomes difficult to identify the correct answers. If you are trying to solve this problem, this article may help you. Links to the paper and source code are at the end.


Background introduction

Multi-hop reasoning over knowledge graphs (KGs), which aims to find answer entities for a given query using knowledge in the KG, has received extensive attention from academia and industry in recent years. In general, it involves answering first-order logic (FOL) queries on KGs using operators including existential quantification (∃), conjunction (∧), disjunction (∨), and negation (¬). A common approach to multi-hop reasoning is to first convert a FOL query into a corresponding computation graph (where each node represents a set of entities and each edge represents a logical operation), and then traverse the KG according to the computation graph to identify the answer set. However, this approach faces two major challenges. First, it is difficult to identify the correct answers when some links are missing from the KG. Second, it needs to process all intermediate entities on the reasoning path, which may lead to exponential computational cost.
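
To make the computation-graph traversal concrete, here is a minimal Python sketch of answering a two-hop query over a toy KG; the triples, entity names, and relation names are all hypothetical, and this is not the paper's implementation.

```python
# Minimal sketch: answer a 2-hop FOL query by traversing a toy KG
# along its computation graph. All names here are hypothetical.
from collections import defaultdict

triples = [
    ("Turing", "born_in", "London"),
    ("London", "located_in", "England"),
    ("Hinton", "born_in", "London"),
]

# Index: (head, relation) -> set of tail entities
adj = defaultdict(set)
for h, r, t in triples:
    adj[(h, r)].add(t)

def project(entities, relation):
    """One 'hop': map a set of entities to their relation neighbors."""
    out = set()
    for e in entities:
        out |= adj[(e, relation)]
    return out

# Query: "Where is the place Turing was born located?"
# Computation graph: {Turing} --born_in--> ? --located_in--> answers
answers = project(project({"Turing"}, "born_in"), "located_in")
print(answers)  # {'England'}
```

Note how the intermediate node of the computation graph materializes a full entity set; on a real KG, these intermediate sets are what cause the exponential blow-up.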

To address these issues, researchers have paid increasing attention to query embedding (QE) techniques, which embed entities and FOL queries into a low-dimensional space. A QE model associates each logical operation in the computation graph with a logical operator in the embedding space. Given a query, the QE model generates a query embedding by following the corresponding computation graph, and then decides whether an entity is a correct answer based on the similarity between the query embedding and the entity embedding.
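
The generic QE answering step can be sketched as follows: score every entity by its similarity to the query embedding and rank. The embedding dimension, scoring function, and entity names below are illustrative assumptions, not taken from any particular model.

```python
# Hedged sketch of the generic QE answering step: score each entity by
# similarity to the query embedding, then rank. Embeddings are random
# placeholders standing in for learned parameters.
import numpy as np

rng = np.random.default_rng(0)
entity_emb = {name: rng.standard_normal(8) for name in ["e1", "e2", "e3"]}
query_emb = rng.standard_normal(8)  # produced by the QE model's operators

def score(entity_vec, query_vec):
    # Negative Euclidean distance: larger = more likely an answer.
    return -np.linalg.norm(entity_vec - query_vec)

ranked = sorted(entity_emb, key=lambda e: score(entity_emb[e], query_emb),
                reverse=True)
print(ranked)  # entities ordered from most to least plausible answer
```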

Among existing QE models, geometry-based models, which embed entities and queries as geometric shapes, have shown promising performance. They typically represent entity sets as "regions" (e.g., points and boxes) in Euclidean space and then perform set operations on them. For example, Query2Box represents entities as points and queries as boxes: if a point lies inside a box, the corresponding entity is an answer to the query. Compared with non-geometric methods, geometric shapes provide a natural and easily interpretable way of representing sets and their logical relationships.
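
A minimal sketch of the Query2Box-style containment test, assuming a box parameterized by a center and per-dimension half-widths (offsets); the numeric values are made up.

```python
# Sketch of point-in-box containment: a query is an axis-aligned box
# (center, offset) and an entity is a point; the entity answers the
# query iff the point lies inside the box.
import numpy as np

def in_box(point, center, offset):
    # offset holds the box's (non-negative) half-widths per dimension
    return bool(np.all(np.abs(point - center) <= offset))

center = np.array([0.0, 0.0])
offset = np.array([1.0, 0.5])
print(in_box(np.array([0.5, 0.2]), center, offset))  # True -> answer
print(in_box(np.array([1.5, 0.0]), center, offset))  # False -> not an answer
```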

However, existing geometry-based models struggle to handle queries with negation, which greatly limits their applicability. For example, GQE and Query2Box, which embed queries as points and boxes respectively, cannot handle queries with negation, because the complement of a point/box is no longer a point/box. To address this issue, Ren & Leskovec proposed BetaE, a probabilistic query embedding model based on the Beta distribution. However, it gives up some of the advantages of geometric models: with Beta distributions there is no simple criterion, analogous to checking whether a point lies inside a box, for deciding whether an entity answers a query. It therefore remains challenging to design a geometric QE model capable of modeling all FOL queries.

ConE model introduction

In this paper, we propose a novel query embedding model, Cone Embeddings (ConE), to answer multi-hop first-order logic (FOL) queries over knowledge graphs. We represent entity sets as Cartesian products of sector-cones and design the corresponding logical operators.

Cone Embeddings

Given a query, we represent the reasoning process as a computation graph (Figure a above), where nodes represent entity sets and edges represent logical operations on entity sets. Figure b above shows several examples of (sector-)cones. One may notice some similarity between sector-cones and the boxes defined in Query2Box, which also represent sets as regions. However, sector-cones are more expressive than boxes, as the complement operator below will show. A minimal sketch of cone membership follows.
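
The sketch below models a single 2D sector-cone, assuming (as in the paper's parameterization) an axis angle and an aperture; the membership test is my own illustrative construction. In ConE an entity set is a Cartesian product of d such cones, so a full check would apply this test per dimension.

```python
# Sketch of a 2D sector-cone parameterized by an axis angle and an
# aperture; an entity is a point on the unit circle (an angle).
import math

def ang_diff(a, b):
    """Smallest absolute angular difference between two angles (radians)."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def in_cone(entity_angle, axis, aperture):
    # The entity lies in the sector-cone iff its angle is within half
    # the aperture of the cone's axis.
    return ang_diff(entity_angle, axis) <= aperture / 2

print(in_cone(0.3, axis=0.0, aperture=math.pi))  # True
print(in_cone(2.0, axis=0.0, aperture=math.pi))  # False
```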

Cone logical operations

Projection operator P: As shown in Figure a above, the goal of P is to find the entities adjacent to a given entity set via a given relation; that is, it maps one entity set to another.
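
A hypothetical sketch of what projection could look like on cone parameters: a relation-specific map from one (axis, aperture) pair to another. The real model learns this transformation (e.g., with neural networks); the toy rule and the relation table below are placeholders.

```python
# Hypothetical sketch of the projection operator P on cone parameters.
# A simple per-relation (shift, scale) rule stands in for the learned map.
import math

relation_params = {
    # relation -> (axis shift, aperture scale); made-up values
    "born_in": (0.7, 0.9),
}

def project_cone(axis, aperture, relation):
    shift, scale = relation_params[relation]
    # Shift the axis and keep it in [-pi, pi); rescale the aperture.
    new_axis = (axis + shift + math.pi) % (2 * math.pi) - math.pi
    new_aperture = min(2 * math.pi, aperture * scale)
    return new_axis, new_aperture

print(project_cone(0.0, math.pi, "born_in"))  # (0.7, ~2.83)
```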

Intersection operator I: As shown in Figure b above, given a conjunctive query q whose subqueries have entity sets $[q_1], [q_2], \dots, [q_n]$, the purpose of operator I is to compute $[q] = \bigcap_{i=1}^{n} [q_i]$.
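
A toy sketch of intersecting cones in the spirit of the paper's design, where the new axis is a weighted combination of the input axes and the new aperture can only shrink (an intersection is never larger than any operand). The paper learns these components; here uniform weights and a hard minimum stand in for them.

```python
# Sketch of the intersection operator I on cone parameters.
import math

def intersect_cones(cones):
    """cones: list of (axis, aperture) pairs; returns one (axis, aperture)."""
    # Average the axes via unit vectors to avoid angle wrap-around issues.
    x = sum(math.cos(ax) for ax, _ in cones) / len(cones)
    y = sum(math.sin(ax) for ax, _ in cones) / len(cones)
    new_axis = math.atan2(y, x)
    # The intersection can be no larger than the smallest operand.
    new_aperture = min(ap for _, ap in cones)
    return new_axis, new_aperture

print(intersect_cones([(0.0, math.pi), (0.5, math.pi / 2)]))
```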

Union operator U: As shown in Figure c above, given a disjunctive query q whose subqueries have entity sets $[q_1], [q_2], [q_3]$, the purpose of operator U is to combine these scattered sets, namely $[q] = \bigcup_{i=1}^{n} [q_i]$.
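
A sketch of one common way to handle disjunction (the approach Query2Box popularized): keep the union as a list of regions and test membership against each. The cones below are illustrative.

```python
# Sketch of the union operator U: a disjunction kept as a list of cones;
# an entity answers the union iff it lies inside any member cone.
import math

def ang_diff(a, b):
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def in_union(entity_angle, cones):
    return any(ang_diff(entity_angle, ax) <= ap / 2 for ax, ap in cones)

cones = [(0.0, math.pi / 2), (math.pi, math.pi / 2)]  # two disjoint sectors
print(in_union(0.1, cones))          # True (inside the first sector)
print(in_union(math.pi / 2, cones))  # False (between the sectors)
```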

Complement operator C: As shown in Figure d above, given a query q and its corresponding entity set [q], the purpose of operator C is to produce [¬q], the complement of [q].
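
This is where cones pay off: unlike a point or a box, the complement of a sector-cone is again a sector-cone, with the axis rotated by π and the aperture replaced by 2π − ψ. A minimal sketch:

```python
# Sketch of the complement operator C: the complement of a sector-cone
# (axis, aperture) is the sector-cone covering everything else.
import math

def complement_cone(axis, aperture):
    raw = axis + math.pi  # point the axis the opposite way
    new_axis = math.atan2(math.sin(raw), math.cos(raw))  # normalize to (-pi, pi]
    new_aperture = 2 * math.pi - aperture                # the remaining arc
    return new_axis, new_aperture

print(complement_cone(0.0, math.pi / 2))  # approx (pi, 3*pi/2)
```

The closure under complement is exactly what point and box embeddings lack, and it is what lets a cone model handle negation natively.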

Distance function: Defines a distance function for queries. Inspired by Ren et al., we divide the distance d into two parts, an outside distance $d_o$ and an inside distance $d_i$, combined as $d = d_o + \lambda d_i$ with $0 < \lambda < 1$. Figure e above illustrates the distance function d.
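
A sketch of a distance in this two-part style, assuming the paper's combination $d = d_o + \lambda d_i$ with $0 < \lambda < 1$; the exact functional forms of $d_o$ and $d_i$ below are simplified stand-ins, not the paper's definitions.

```python
# Sketch of a two-part cone distance: the outside part is zero when the
# entity is inside the cone; the inside part keeps answers from all
# collapsing onto the cone's axis.
import math

def ang_diff(a, b):
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def cone_distance(entity_angle, axis, aperture, lam=0.5):
    gap = ang_diff(entity_angle, axis)
    d_outside = max(0.0, gap - aperture / 2)  # 0 if the entity is inside
    d_inside = min(gap, aperture / 2)         # distance to axis, capped at boundary
    return d_outside + lam * d_inside

print(cone_distance(0.2, axis=0.0, aperture=math.pi))  # inside: small distance
print(cone_distance(2.5, axis=0.0, aperture=math.pi))  # outside: larger distance
```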

Experiment snapshot

In this section, we show experimentally that: 1) ConE is a powerful model for multi-hop reasoning over knowledge graphs; 2) ConE's cone embeddings can effectively model the cardinality (i.e., the number of elements) of an answer set.

1. The following table shows the experimental results on queries without negation, i.e., existential positive first-order (EPFO) queries, where AVG denotes average performance. Overall, ConE significantly outperforms the baseline models.

2. The following table shows the results of ConE and BetaE on FOL queries with negation. Since GQE and Q2B cannot handle the negation operator, their results are not included. Overall, ConE performs significantly better than BetaE.

Recommended reading

[1] Must see!! [AINLPer] Natural Language Processing (NLP) Domain Knowledge && Data Sharing

[2] [NeurIPS paper download over the years] This article will take you to understand the NeurIPS International Conference (including NeurIPS2022)

[3] [NLP Paper Sharing && Language Representation] A graph recurrent neural network (GNN) expected to challenge the Transformer

[4] [NeurIPS && Graph Q&A] Cone Embedding Method for Knowledge Graph (KG) Multi-Hop Reasoning (Chinese Academy of Sciences, source code included)

[5] [NLP paper sharing && QA question and answer] Dynamic association GNN establishes direct association and optimizes multi-hop reasoning (including source code)

[6] [IJCAI papers over the years && paper express] Data-free adversarial distillation, vertical federated learning, and the pre-train/fine-tune paradigm for graph neural networks (GNN)

[7] [NLP paper sharing && including source code] Automatic label sequence generation based on Prompting Seq2Seq (Tsinghua AI Research Institute)

[8] [NLP paper sharing && PLM source code] Pre-training model BERT plays Twitter (7 billion data pairs, more than 100 languages)

[9] [Paper express && IJCAI paper downloads] Graph neural networks (GNN): multi-behavior recommendation, multi-modal recipe representation learning, homogeneous graph representation learning

[10] [NLP Paper Sharing && Chinese Named Entity Recognition] How to build an excellent gazetteer (Zhejiang University, source code included)

[11] Understand linear regression in one article [very detailed] (source code included)

[12] Understand logistic regression in one article [very detailed] (source code included)

Paper && source code

Title: ConE: Cone Embeddings for Multi-Hop Reasoning over Knowledge Graphs
Authors: MIRA Lab, University of Science and Technology of China (USTC)
Paper: https://arxiv.org/pdf/2110.13715.pdf
Code: https://github.com/MIRALab-USTC/QE-ConE

Last but not least

Follow the AINLPer WeChat public account (the latest papers recommended to you every day!!)

Origin: blog.csdn.net/yinizhilianlove/article/details/127115471