[Reading Notes] End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion

      To address the incompleteness of knowledge graph (KG) triples, this paper combines a weighted graph convolutional network (WGCN) and a Conv-TransE module to propose the SACN (Structure-Aware Convolutional Network) model. SACN uses WGCN to model the entities and relations of the KG and extract entity features, then feeds these entity embeddings into the Conv-TransE module so that the learned embeddings satisfy the triple constraints of the KG. Experiments show that SACN outperforms the previous best models on the FB15k-237 and WN18RR datasets by about 10%, making it the current SOTA model.

      Published at AAAI 2019; the paper's information is as follows:


 

  • Introduction

      First, a brief introduction to the knowledge base (Knowledge Base, KB). Knowledge in a knowledge base exists in many forms, such as ontological knowledge, factual knowledge, rule knowledge, and case knowledge. Compared with the notion of a knowledge base, a knowledge graph (Knowledge Graph, KG) focuses more on modeling the connections between pieces of knowledge. The main approach is to represent knowledge as (s, r, o) triples. For example, "Beijing is the capital of China" can be represented by the triple (s = Beijing, r = IsCapitalOf, o = China), where s is called the head entity, r the relation, and o the tail entity. The knowledge base completion task in this paper refers to completing the knowledge graph's triples.

      With triples defined, the task of this paper can be stated as: given a KG, learn embedded representations of entities and relations so as to complete partial triples (s, r, ?), i.e., given a head entity and a relation, find the most likely tail entity.
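As a concrete sketch of the task framing (toy data; the entity names and the trivial scoring function are illustrative placeholders, not the paper's method):

```python
# Toy knowledge graph as a set of (head, relation, tail) triples.
kg = {
    ("Beijing", "IsCapitalOf", "China"),
    ("Paris", "IsCapitalOf", "France"),
}

entities = {s for s, _, _ in kg} | {o for _, _, o in kg}

def complete(s, r, score):
    """Return all candidate tail entities ranked by score(s, r, o)."""
    return sorted(entities, key=lambda o: score(s, r, o), reverse=True)

# A trivial stand-in score: 1 if the triple is already in the KG, else 0.
# SACN replaces this with the learned Conv-TransE similarity.
ranked = complete("Beijing", "IsCapitalOf", lambda s, r, o: (s, r, o) in kg)
print(ranked[0])  # -> China
```

In other words, completion is a ranking problem over candidate tails, and the whole point of the model is learning a good scoring function.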

  • Method

      In the overall framework, the WGCN module learns the entity embeddings; these embeddings are then fed as input to the Conv-TransE module, and a loss enforcing the triple constraints trains the whole network end to end. The overall framework is shown in Figure 1.

      Below, the WGCN module, the Conv-TransE module, and the forward propagation are introduced in turn.

      1. WGCN

      As the name suggests, WGCN stands for Weighted Graph Convolutional Network. Its main idea is to define a different weight for each relation, thereby converting a single multi-relational graph into multiple single-relational graphs whose edges have different strengths. Entities form the graph's nodes, and the relations between entities form its edges.

      Recall the classic definition of GCN. The main idea: treating each node as a central node, each layer aggregates the current-layer features of the central node's neighbors into the central node's next-layer representation, i.e.,

h_i^{l+1} = σ( Σ_{j ∈ N_i} g(h_j^l, h_i^l) )

where N_i denotes the set of neighbors of node i (including i itself), h_i^l is the layer-l vector of node i, and g(·, ·) is the information-transfer function, whose basic definition is

g(h_j^l, h_i^l) = h_j^l W^l

Through the summation, a linear transformation of each neighbor's vector is aggregated into the central node, realizing one layer of GCN aggregation. Stacking GCN layers layer by layer implements the forward pass. It is worth noting that every GCN layer aggregates neighbor information: the first layer aggregates first-order neighbors, and by the time the second layer aggregates a node's neighbors, those neighbors already carry information from their own neighbors, so the second layer effectively aggregates second-order information. The more GCN layers are stacked, the wider the neighborhood aggregated into the central node. See [1] for a more detailed description.
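The aggregation above can be sketched in a few lines of numpy (a minimal illustration, not a full GCN: the adjacency, features, and weights are random placeholders, and no normalization is applied):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: h_i^{l+1} = relu(sum_{j in N_i} h_j^l W^l).
    A is a 0-1 adjacency matrix with self-loops (so i is in N_i)."""
    return np.maximum(A @ H @ W, 0.0)  # ReLU activation

rng = np.random.default_rng(0)
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=float)   # 3 nodes, self-loops on the diagonal
H = rng.standard_normal((3, 4))          # layer-l node features
W = rng.standard_normal((4, 4))          # layer-l weight matrix
H2 = gcn_layer(A, gcn_layer(A, H, W), W) # two stacked layers -> 2-hop aggregation
```

Stacking the call twice shows how the receptive field grows to second-order neighbors, as described above.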

      Returning to the main thread: the key difference between WGCN and the classic GCN is that WGCN models each relation in the knowledge graph separately, aggregating over different relations with different weights. Define the weights α_t, 1 ≤ t ≤ T, where T is the total number of relations and each α_t is a learnable parameter. The recursion can then be written as

h_i^{l+1} = σ( Σ_{j ∈ N_i} α_t^l g(h_j^l, h_i^l) )

where g(·, ·) is the information-transfer function, defined the same as before:

g(h_j^l, h_i^l) = h_j^l W^l

Separating the central node from its neighbors in this way, the update can be written as

h_i^{l+1} = σ( Σ_{j ∈ N_i} α_t^l h_j^l W^l + h_i^l W^l )

Writing it in matrix form, we have

H^{l+1} = σ( A^l H^l W^l ),  where A^l = Σ_{t=1}^{T} α_t^l A_t + I

where A_t denotes the 0-1 adjacency matrix formed by the t-th relation: 0 means no edge between two nodes, 1 means an edge. Formally, this brings us back to the original GCN recursion [1].

In this way, a multi-relational graph is transformed into multiple single-relational graphs with edges of different strengths, as shown in Figure 2; the right side of the figure corresponds to the matrices A, H, and W respectively. This is the clever part of WGCN.
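A minimal numpy sketch of one WGCN layer under the matrix form above (the two relations, the α values, and all dimensions are illustrative assumptions, not values from the paper):

```python
import numpy as np

def wgcn_layer(A_list, alphas, H, W):
    """One WGCN layer: H' = relu(A_l H W), where
    A_l = sum_t alpha_t * A_t + I weights each relation's 0-1 adjacency."""
    A = sum(a * At for a, At in zip(alphas, A_list)) + np.eye(H.shape[0])
    return np.maximum(A @ H @ W, 0.0)  # ReLU activation

rng = np.random.default_rng(0)
n = 4
# Two relations, each a 0-1 adjacency matrix over the same 4 nodes.
A1 = np.zeros((n, n)); A1[0, 1] = A1[1, 0] = 1.0
A2 = np.zeros((n, n)); A2[2, 3] = A2[3, 2] = 1.0
alphas = [0.8, 0.3]                 # learnable per-relation weights alpha_t
H = rng.standard_normal((n, 8))     # entity features
W = rng.standard_normal((8, 8))     # layer weight matrix
H_next = wgcn_layer([A1, A2], alphas, H, W)
```

The only change from plain GCN is that the single adjacency matrix is replaced by a weighted sum of per-relation adjacencies plus the identity for self-loops.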

      Regarding how the KG is turned into a graph for WGCN: entities serve as the graph's nodes and relations as its edges. Notably, the paper also uses entity attributes as graph nodes, e.g., the attribute triple (Tom, gender, male). The purpose is to let attributes act as "bridge" nodes, so that entities sharing an attribute can share information. To avoid creating too many attribute nodes, the authors merge them: gender itself becomes a node in the graph rather than two separate nodes male and female, on the grounds that having a gender attribute already indicates the entity is a person, without needing to further distinguish which gender.

      Thus, WGCN uses both the structural information of the triples and the attribute information of the entities, which is exactly the "structure-aware" in the title.

      2. Conv-TransE

      The Conv-TransE part is similar to ConvE [2]: it predicts whether a relation holds by scoring the similarity between (s, r) and o. The main difference from ConvE is that ConvE's reshape operation is removed; see [2] for details. The classic scoring functions used in KG representation learning are listed here.

      Now look at the model structure, shown in the figure.

      Concretely, the embedding of entity s produced by WGCN is concatenated with the trained embedding of relation r into a 2×n matrix. A convolution is applied to this matrix with several kernels of the same size to obtain feature maps. The feature maps are then flattened into a vector and reduced in dimension by a fully connected layer. This vector, which fuses s and r, is dot-multiplied with every entity vector generated by WGCN to compute the similarity between (s, r) and each candidate o. The similarities are scaled into the range 0–1 by a sigmoid, and the entity with the highest similarity is taken as the predicted o.

      The scoring function (similarity function) captures this forward pass; its form is

ψ(e_s, e_o) = f( vec( M(e_s, e_r) ) W ) · e_o

where M(·, ·) denotes the convolution operation, vec(·) the flattening operation, and f(·) the activation function.
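The forward pass can be sketched roughly in numpy (a simplified illustration, not the authors' implementation: the kernel count, kernel width, dimensions, padding scheme, and ReLU activation are all placeholder assumptions):

```python
import numpy as np

def conv_transe_score(e_s, e_r, kernels, W_fc, E_all):
    """Sketch of the Conv-TransE scoring function
    psi(e_s, e_o) = f(vec(M(e_s, e_r)) W) . e_o for every candidate o.
    kernels: (C, 2, k) filters convolved over the stacked (2, n) input."""
    X = np.stack([e_s, e_r])               # (2, n): concat without ConvE's reshape
    C, _, k = kernels.shape
    n = X.shape[1]
    # 'Same'-width 1-D convolution across the embedding dimension.
    Xp = np.pad(X, ((0, 0), (k // 2, k - 1 - k // 2)))
    maps = np.array([[np.sum(Xp[:, i:i + k] * kernels[c])
                      for i in range(n)] for c in range(C)])   # (C, n) feature maps
    hidden = np.maximum(maps.reshape(-1) @ W_fc, 0.0)          # vec -> FC -> ReLU
    return E_all @ hidden                  # dot product with every entity vector

rng = np.random.default_rng(0)
n, C, k, num_entities = 8, 3, 3, 5
E_all = rng.standard_normal((num_entities, n))   # WGCN entity embeddings
e_r = rng.standard_normal(n)                     # relation embedding
scores = conv_transe_score(E_all[0], e_r,
                           rng.standard_normal((C, 2, k)),
                           rng.standard_normal((C * n, n)), E_all)
```

Each entry of `scores` is the similarity between the fused (s, r) vector and one candidate tail entity.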

      Passing the score through a sigmoid gives the probability that (s, r) together with a candidate o forms a valid triple, i.e.,

p(s, r, o) = σ( ψ(e_s, e_o) )

      The loss of the whole network can then be defined as the binary cross-entropy over whether (s, r) and each candidate o form a valid triple, i.e.,

L = − (1/N) Σ_i ( t_i · log p_i + (1 − t_i) · log(1 − p_i) )

where t_i = 1 if the i-th candidate triple holds and t_i = 0 otherwise.
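The sigmoid plus binary cross-entropy can be sketched as follows (the scores and targets are made-up numbers for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(scores, targets):
    """Binary cross-entropy over candidate tails: targets t_i are 1 for
    valid (s, r, o) triples and 0 otherwise."""
    p = sigmoid(scores)
    eps = 1e-12  # numerical safety for log(0)
    return -np.mean(targets * np.log(p + eps)
                    + (1 - targets) * np.log(1 - p + eps))

scores = np.array([4.0, -3.0, -2.5])   # psi(e_s, e_o) for 3 candidate tails
targets = np.array([1.0, 0.0, 0.0])    # only the first candidate is the true tail
loss = bce_loss(scores, targets)
```

Training pushes the score of the true tail up and the scores of all other candidates down.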

  • Datasets

      Three datasets are used in the paper:

      1) FB15k-237: Freebase triples
      2) WN18RR: WordNet triples
      3) FB15k-237-Attr: entity attributes extracted by the authors from FB24k

      Their statistics are as follows:

  • Experiments

      The paper runs link-prediction experiments; the results are shown in Table 3. SACN with attribute nodes achieves state-of-the-art performance.

 

      In the parameter-sensitivity study, the paper tries convolution kernels of different lengths; the optimal kernel size differs across datasets.

 

      The paper also compares performance on nodes of different degrees. For low-degree nodes SACN outperforms Conv-TransE, because neighbor nodes can share more information; for high-degree nodes, however, it does worse than Conv-TransE alone. The paper's explanation is that with too many neighbors, the information of the more important neighbors gets overly "smoothed" away, so SACN cannot beat plain Conv-TransE there.


  • Closing remarks

      The authors use WGCN to capture features of entities linked by relations so that neighbor information is shared; the entity representations learned this way are better than those learned by ConvE in isolation. In essence it is a GCN+ConvE framework, and this cascaded design is also instructive for similar tasks. Another takeaway for deep-learning research: read more papers and summarize more. With this blog post done, I believe things can always get finished, one at a time!

 

References

[1] Kipf, T. N., & Welling, M. (2016). Semi-Supervised Classification with Graph Convolutional Networks. arXiv:1609.02907.

[2] Dettmers, T., Minervini, P., Stenetorp, P., & Riedel, S. (2017). Convolutional 2D Knowledge Graph Embeddings. AAAI.

 


Origin www.cnblogs.com/jws-2018/p/11519383.html