"ROTATE: KNOWLEDGE GRAPH EMBEDDING BY RELATIONAL ROTATION IN COMPLEX SPACE" paper reading

Published at ICLR 2019.

abstract

We study the task of learning representations of entities and relations in knowledge graphs with the goal of predicting missing links. The success of this task depends heavily on the ability to model and infer relation patterns. In this paper, we propose RotatE, a novel approach to knowledge graph embedding that can model and infer various relation patterns: symmetry/antisymmetry, inversion, and composition. RotatE models each relation as a rotation from the source entity to the target entity in complex vector space. In addition, we propose a new self-adversarial negative sampling technique for efficiently and effectively training the RotatE model. Experiments show that RotatE is not only scalable but can also model and infer a variety of relation patterns, and that it surpasses existing state-of-the-art methods on link prediction benchmarks.

1.introduction

Finding ways to model and infer relation patterns (e.g., symmetry/antisymmetry, inversion, composition) from observed facts is important for predicting missing links. However, no existing method can model all of the above patterns, so we look for a way to model and infer all three types of relation patterns.

The key motivation comes from Euler's identity $e^{i\theta} = \cos\theta + i\sin\theta$, which shows that a unit-modulus complex number can be viewed as a rotation in the complex plane. Given a triplet $(h, r, t)$, we expect $t = h \circ r$, where $h, r, t \in \mathbb{C}^k$ are the embeddings, $|r_i| = 1$, and $\circ$ denotes the Hadamard (element-wise) product. For each dimension in the complex plane, we expect:

$$t_i = h_i r_i, \quad h_i, r_i, t_i \in \mathbb{C}, \ |r_i| = 1.$$

Such a simple operation can effectively model the three relation patterns mentioned above: symmetry/antisymmetry, inversion, and composition. For example, a relation $r$ is symmetric iff each component of its embedding satisfies $r_i = e^{0}$ or $e^{i\pi}$, i.e. $r_i = \pm 1$. Two relations $r_1 = e^{i\theta_1}$ and $r_2 = e^{i\theta_2}$ are inverse iff their embeddings are conjugates, $r_2 = \bar{r}_1$. A relation $r_3 = e^{i\theta_3}$ is the composition of $r_1 = e^{i\theta_1}$ and $r_2 = e^{i\theta_2}$ iff $r_3 = r_1 \circ r_2$, i.e. $\theta_3 = \theta_1 + \theta_2$. Furthermore, RotatE scales to large knowledge graphs, since it remains linear in both time and memory.
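As a quick sanity check (mine, not from the paper), a few lines of NumPy show how unit-modulus complex numbers realize these three patterns:

```python
import numpy as np

k = 4  # embedding dimension (illustrative)
rng = np.random.default_rng(0)

# Relations are unit-modulus complex vectors, i.e. pure rotations e^{i*theta}.
theta1, theta2 = rng.uniform(0, 2 * np.pi, (2, k))
r1, r2 = np.exp(1j * theta1), np.exp(1j * theta2)

h = rng.normal(size=k) + 1j * rng.normal(size=k)  # an arbitrary entity embedding

# Symmetry: if every component of r is +1 or -1, applying r twice is the identity.
r_sym = np.where(rng.random(k) < 0.5, 1.0, -1.0).astype(complex)
assert np.allclose(h * r_sym * r_sym, h)

# Inversion: the conjugate rotation undoes r1, so conj(r1) models the inverse relation.
assert np.allclose(h * r1 * np.conj(r1), h)

# Composition: rotating by r1 and then r2 equals one rotation whose phases add.
r3 = np.exp(1j * (theta1 + theta2))
assert np.allclose(h * r1 * r2, h * r3)
```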

In order to optimize RotatE effectively, we further propose a self-adversarial negative sampling technique that generates negative samples according to the current entity and relation embeddings. This approach is very general and can be applied to many existing knowledge graph embedding methods. Experiments show that RotatE achieves state-of-the-art results.

2.related work

3.RotatE: relational rotation in complex vector space

3.1 modeling and inferring relation patterns

There are three important relation patterns in knowledge graphs: symmetry/antisymmetry, inversion, and composition. Their definitions are given below.

Definition 1: A relation $r$ is symmetric (antisymmetric) if for $\forall x, y$:

$$r(x, y) \Rightarrow r(y, x) \quad \big(r(x, y) \Rightarrow \neg\, r(y, x)\big)$$

Definition 2: A relation $r_1$ is the inverse of relation $r_2$ if for $\forall x, y$:

$$r_2(x, y) \Rightarrow r_1(y, x)$$

Definition 3: A relation $r_1$ is composed of relations $r_2$ and $r_3$ if for $\forall x, y, z$:

$$r_2(x, y) \wedge r_3(y, z) \Rightarrow r_1(x, z)$$


3.2 modeling relations as rotations in complex vector space

Given a triplet $(h, r, t)$, we expect:

$$t = h \circ r, \quad \text{where } |r_i| = 1.$$

For each element of the embeddings, we have $t_i = h_i r_i$, where $h, t \in \mathbb{C}^k$, $r_i \in \mathbb{C}$, and $|r_i| = 1$. This constraint forces $r_i$ to take the form $e^{i\theta_{r,i}}$, which corresponds to a counterclockwise rotation by $\theta_{r,i}$ about the origin of the complex plane and only affects the phases of the entity embeddings in the complex vector space. We call the proposed model RotatE due to this rotational nature. According to the above definition, for each triplet $(h, r, t)$ we define the distance function of RotatE as follows:

$$d_r(h, t) = \| h \circ r - t \|$$

By defining each relation as a rotation in a complex vector space, RotatE is the only model that can model and infer the above three relation patterns.
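As an illustration, here is a minimal PyTorch sketch of this distance function. The function and variable names are my own, not from the paper's released code, and the L2 norm is an assumption, since the formula above leaves the norm unspecified.

```python
import math
import torch

def rotate_distance(head, rel_phase, tail):
    """RotatE distance d_r(h, t) = ||h ∘ r - t|| with r_i = e^{i * theta_{r,i}}.

    head, tail: complex entity embeddings, shape (batch, k), dtype torch.cfloat.
    rel_phase:  real-valued relation phases theta_{r,i}, shape (batch, k).
    """
    r = torch.polar(torch.ones_like(rel_phase), rel_phase)  # unit-modulus complex numbers e^{i*theta}
    diff = head * r - tail                                  # element-wise rotation, then difference
    return torch.linalg.vector_norm(diff, dim=-1)           # L2 norm over the embedding dimension

# Toy usage: a batch of 3 triplets with embedding dimension k = 8.
k = 8
h = torch.randn(3, k, dtype=torch.cfloat)
t = torch.randn(3, k, dtype=torch.cfloat)
theta = torch.rand(3, k) * 2 * math.pi
print(rotate_distance(h, theta, t))  # one distance per triplet
```

Parameterizing each relation by its phase vector automatically enforces $|r_i| = 1$, so no explicit normalization step is needed during training.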

3.3 optimization

We use a loss function similar to the negative sampling loss to efficiently optimize distance-based models:

$$L = -\log \sigma\big(\gamma - d_r(h, t)\big) - \sum_{i=1}^{n} \frac{1}{k} \log \sigma\big(d_r(h_i', t_i') - \gamma\big),$$

where $\gamma$ is a fixed margin, $\sigma$ is the sigmoid function, and $(h_i', r, t_i')$ is the $i$-th negative triplet.
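A sketch of how this loss could be computed in PyTorch, assuming the distances of the positive triplets and of their $n$ negatives are already available, and reading the $\frac{1}{k}$ weight as a uniform average over the negatives:

```python
import torch
import torch.nn.functional as F

def negative_sampling_loss(pos_dist, neg_dist, gamma):
    """Uniform negative-sampling loss (sketch).

    pos_dist: d_r(h, t) for the positive triplets, shape (batch,).
    neg_dist: d_r(h'_i, t'_i) for the sampled negatives, shape (batch, n).
    gamma:    fixed margin.
    """
    pos_term = -F.logsigmoid(gamma - pos_dist)               # -log sigma(gamma - d_r(h, t))
    neg_term = -F.logsigmoid(neg_dist - gamma).mean(dim=-1)  # uniform 1/k weight as an average over negatives
    return (pos_term + neg_term).mean()
```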

We also propose a new method for generating negative samples. The negative sampling loss above samples negative triplets uniformly. Such uniform negative sampling is inefficient: as training progresses, many sampled triplets are obviously false and no longer provide meaningful information. We therefore propose self-adversarial negative sampling, which samples negative triplets according to the current embedding model. Specifically, we sample negative triplets from the following distribution:
$$p(h_j', r, t_j' \mid \{(h_i, r_i, t_i)\}) = \frac{\exp \alpha f_r(h_j', t_j')}{\sum_i \exp \alpha f_r(h_i', t_i')}$$

where $\alpha$ is the temperature of sampling and $f_r$ is the score function. In practice, these probabilities are treated as weights of the negative samples rather than used for actual sampling.
The final form of the negative sampling loss with self-adversarial training is:
$$L = -\log \sigma\big(\gamma - d_r(h, t)\big) - \sum_{i=1}^{n} p(h_i', r, t_i') \log \sigma\big(d_r(h_i', t_i') - \gamma\big)$$
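Combining the two pieces, a possible PyTorch sketch of the self-adversarial loss. The softmax weights are detached here, an assumption that follows the common implementation choice of treating the probabilities as fixed weights of the negatives rather than as part of the computation graph:

```python
import torch
import torch.nn.functional as F

def self_adversarial_loss(pos_dist, neg_dist, gamma, alpha):
    """Self-adversarial negative-sampling loss (sketch).

    Negatives are re-weighted by a softmax over their current scores at
    temperature alpha; a smaller distance means a harder negative and thus
    a larger weight.
    """
    # softmax(-alpha * d) equals softmax(alpha * (gamma - d)): the constant margin cancels.
    weights = F.softmax(-alpha * neg_dist, dim=-1).detach()  # p(h'_i, r, t'_i), no gradient
    pos_term = -F.logsigmoid(gamma - pos_dist)
    neg_term = -(weights * F.logsigmoid(neg_dist - gamma)).sum(dim=-1)
    return (pos_term + neg_term).mean()
```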

4.experiments


Origin blog.csdn.net/ptxx_p/article/details/120946263