[Reading Notes] Reasoning With Neural Tensor Networks for Knowledge Base Completion

Disclaimer: the author's knowledge is limited, and this blog inevitably contains many flaws and even serious mistakes; corrections are welcome. My biggest goal in writing is to exchange ideas and learn, not to gain attention or page views. It's a long road; let's encourage each other. https://blog.csdn.net/yexiaohhjk/article/details/86374735

Foreword

Paper link
Poster

Abstract

In this paper, the authors describe a neural network suited to reasoning about the relationship between two entities: the Neural Tensor Network. Previous work either represented entities as discrete atomic units or as single entity vectors; the paper's experiments show that representing each entity as the average of its word vectors improves performance. Finally, the paper shows that initializing these word vectors with vectors learned by unsupervised training on a large corpus significantly improves the ability to predict whether two entities in the knowledge base are related. In short, the model outperforms previous models, judging unseen relations in WordNet and FreeBase with accuracies of 86.2% and 90.0% respectively.

Introduction

Knowledge bases such as WordNet, Yago, and the Google Knowledge Graph provide users with rich, structured resources of entities and relations for information retrieval, but they remain incomplete and face difficulties reasoning about missing relations.

... (part of the introduction omitted)

The authors propose a model that can accurately predict additional true facts for an existing knowledge base. The model represents each entity in the knowledge base as a vector that captures information about the entity and its relationships to other entities, and it expresses each relation through the parameters of a novel neural tensor network.

In summary, the paper's first contribution is a new network, the Neural Tensor Network (NTN), which generalizes several previously proposed neural network models and is more powerful than a standard neural network layer at relating two entity vectors.

The second contribution is a new way to represent entities in the knowledge base. Previous work ([8], [9], [10]) represented each entity as a single atomic vector, so entities whose names share words cannot share statistical strength.

The third contribution is to incorporate word vectors trained on large amounts of unlabeled text.

Related Work

Too lazy to summarize this part; just read the paper directly!

Neural Models for Reasoning over Relations

Neural Network Architecture

The network is structured as a bilinear model:

$$g(e_1, R, e_2) = u_R^T \, f\!\left(e_1^T W_R^{[1:k]} e_2 + V_R \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} + b_R\right) \qquad (1)$$
where g is the output of the network, a score for how plausible the relation R is between entities e_1 and e_2. The feature vectors e_1, e_2 ∈ R^d of the two entities may be initialized randomly or with vectors trained by third-party tools, and are continually adjusted during training. f = tanh is the hidden-layer activation function, applied element-wise.
V_R ∈ R^{k×2d} and the bias b_R ∈ R^k are the weights of the standard first layer, u_R ∈ R^k holds the second-layer weights, and W_R^{[1:k]} ∈ R^{d×d×k} is the tensor: each slice W_R^{[i]} contributes one bilinear term e_1^T W_R^{[i]} e_2.

The corresponding illustration from the paper:
(figure: Neural Tensor Network architecture)
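To make equation (1) concrete, here is a minimal NumPy sketch of the scoring function. The shapes follow the paper's notation (d-dimensional entity vectors, k tensor slices), but the variable names and toy dimensions are my own, not from any released implementation:

```python
import numpy as np

def ntn_score(e1, e2, W, V, b, u):
    """Score g(e1, R, e2) from equation (1) for a single relation R.

    e1, e2 : (d,)      entity vectors
    W      : (k, d, d) tensor, one d x d slice per hidden unit
    V      : (k, 2d)   standard-layer weights
    b      : (k,)      bias
    u      : (k,)      second-layer weights
    """
    # Bilinear tensor product: component i is e1^T W[i] e2.
    bilinear = np.einsum('a,iab,b->i', e1, W, e2)
    # Standard feed-forward term on the concatenated entity vectors.
    standard = V @ np.concatenate([e1, e2]) + b
    return u @ np.tanh(bilinear + standard)

# Tiny usage example with random parameters.
d, k = 4, 3
rng = np.random.default_rng(0)
e1, e2 = rng.normal(size=d), rng.normal(size=d)
W, V = rng.normal(size=(k, d, d)), rng.normal(size=(k, 2 * d))
b, u = rng.normal(size=k), rng.normal(size=k)
print(ntn_score(e1, e2, W, V, b, u))
```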

Loss function
With this network the model can learn to reason over the knowledge base. For each given relation triple T^(i) = (e_i, R_k, e_j), negative samples are constructed by replacing e_1 or e_2 with another random entity. Training pushes each constructed negative sample to score lower than its positive sample by a margin of at least 1. This gives the max-margin objective below:
$$J(\Omega) = \sum_{i=1}^{N} \sum_{c=1}^{C} \max\!\left(0,\; 1 - g\!\left(T^{(i)}\right) + g\!\left(T_c^{(i)}\right)\right) + \lambda \lVert \Omega \rVert_2^2 \qquad (2)$$
We minimize this objective, where N is the number of positive samples and C negative samples are randomly constructed for each positive sample. Ω = {u, W, V, b, E} is the set of all parameters: u, V, and b are ordinary feed-forward network weights, E holds the entity feature vectors used as input, and W is the tensor. T_c^(i) denotes the c-th negative sample constructed for the i-th positive example.

The paper then uses gradient descent or L-BFGS to find the parameters minimizing the loss; each relation gets its own trained set of parameters.
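As a sketch of equation (2) and the negative sampling described above, the following computes the max-margin loss for one relation. It reuses ntn_score from the previous sketch; corrupting only the second entity is a simplification of my own (the paper replaces either entity):

```python
def ntn_loss(triples, E, params, num_corrupt=10, lam=1e-4, seed=0):
    """Max-margin loss of equation (2) for one relation.

    triples : list of (i1, i2) entity-index pairs that hold under relation R
    E       : (n_entities, d) matrix of entity vectors
    params  : (W, V, b, u) for this relation
    """
    rng = np.random.default_rng(seed)
    loss = 0.0
    for i1, i2 in triples:
        pos = ntn_score(E[i1], E[i2], *params)
        for _ in range(num_corrupt):
            # Corrupt the triple: swap in a random entity for e2.
            neg = ntn_score(E[i1], E[rng.integers(len(E))], *params)
            loss += max(0.0, 1.0 - pos + neg)
    # L2 regularization over all parameters (entity vectors are part of Omega).
    loss += lam * sum(np.sum(p ** 2) for p in params)
    return loss
```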

Revisiting entity word vectors

The authors describe two methods that improve model accuracy over randomly initialized entity vectors:

  • An entity composed of several words is represented by the average of the vectors of its constituent words (Word Vector, WV); see the sketch after this list.

    The authors also tried learning the composition of an entity's word vectors with an RNN, but in practice it did not work well, falling short of simply averaging the word vectors.

  • Initialize the entity's word vectors with vectors pre-trained by unsupervised learning (WV-init).

    See the paper "Word representations: a simple and general method for semi-supervised learning".
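A minimal sketch of the WV/WV-init idea: an entity's vector is the mean of its word vectors, so entities that share a word also share statistical strength. The "homo sapiens" / "homo erectus" pair here is my own toy example:

```python
import numpy as np

def entity_vector(words, word_vecs):
    """Represent a multi-word entity as the mean of its word vectors (WV)."""
    return np.mean([word_vecs[w] for w in words], axis=0)

# WV: random initialization. With WV-init, word_vecs would instead be loaded
# from unsupervised pre-training (e.g. Turian et al.'s word representations).
rng = np.random.default_rng(0)
word_vecs = {w: rng.normal(size=4) for w in ('homo', 'sapiens', 'erectus')}
v1 = entity_vector(('homo', 'sapiens'), word_vecs)
v2 = entity_vector(('homo', 'erectus'), word_vecs)  # shares 'homo' with v1
```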

Experiments

The paper uses two datasets, WordNet and FreeBase, to predict new relations.

In WordNet, 112,581 relation triples (e_1, R, e_2) are used for training, drawn from 38,696 distinct entities and 11 relation types. Unlike previous work, the authors filter out some trivial triples, for example cases where the relation between the same pair of entities is merely repeated elsewhere in WordNet.

Judging triple relations (Relation Triplets Classification)

The paper generates negative samples by replacing an entity in a positive sample, and decides whether a relation holds using a relation-specific threshold T_R:
$$g(e_1, R, e_2) \geq T_R$$
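A straightforward way such a threshold could be chosen is to pick the T_R that maximizes accuracy on a development set of positive and corrupted triples; this grid search over observed scores is my own assumption, not code from the paper:

```python
import numpy as np

def fit_threshold(dev_scores, dev_labels):
    """Pick the T_R that maximizes accuracy on the dev set.

    dev_scores : (n,) NTN scores g(e1, R, e2) on dev triples
    dev_labels : (n,) True for positive triples, False for corrupted ones
    """
    best_t, best_acc = 0.0, -1.0
    for t in np.sort(dev_scores):
        acc = np.mean((dev_scores >= t) == dev_labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t  # classify (e1, R, e2) as true when its score >= best_t
```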

On this task, comparing the accuracy of five different models at judging whether a test triple's relation holds, the paper's NTN model is found to be significantly better than the others.

The authors also compare accuracy across the different relations within WordNet and Freebase; as the figure shows, accuracy varies from relation to relation:


The authors further found that the way entity vectors are initialized has a significant impact on accuracy, comparing three initialization schemes:

  • EV (Entity Vector): each entity is represented as a single whole vector
  • WV (Word Vector): word vectors are randomly initialized, and the entity vector is the average of its word vectors
  • WV-init: like WV, but the word vectors are initialized from unsupervised pre-training

Examples of Reasoning

The triplet classification task above has already shown that the model achieves high accuracy at predicting whether a relation triple holds.

This section gives a subjective impression of the NTN's reasoning through two experiments:

  • Select an entity and a relation, then rank all other entities by their score under that relation in descending order, as in the following table (see the code sketch after this section)

Looking at this table subjectively, the majority of the inferred relations are plausible.

  • Use the trained triples in the knowledge base to infer unknown relations between entities, for example:

As shown, black lines denote existing relations and red lines denote inferred unknown relations. Because entity vectors are built from word vectors, latent semantic relationships between entities that share words are also preserved.
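A sketch of the first experiment, reusing ntn_score and an entity matrix E from the earlier sketches: fix e_1 and a relation R, score every candidate e_2, and list the highest-scoring entities:

```python
def top_entities(i1, E, params, top_n=5):
    """Rank all entities e2 by g(e1, R, e2) for a fixed e1 and relation R."""
    scores = np.array([ntn_score(E[i1], e2, *params) for e2 in E])
    order = np.argsort(-scores)  # entity indices sorted by descending score
    return [(int(j), float(scores[j])) for j in order[:top_n]]
```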

Summary:

The main innovation of this paper, compared with previous work on predicting relations between knowledge-base entities, is a bilinear three-layer neural tensor network (NTN) trained with a max-margin loss, combined with initializing entity vectors as the average of word vectors pre-trained by unsupervised learning; together these greatly improve the system's accuracy.

Questions after reading:

  • When the trained network is used for triple-relation classification, how exactly is the threshold T_R determined?
