Graph representation learning and heterogeneous information networks


Selected from
[1] Li Qing, Wang Yichen, Du Chenglie. Review of research on graph representation learning methods [J/OL]. Application Research of Computers: 1-16 [2023-05-11]. DOI: 10.19734/j.issn.1001-3695.2022.09.0504.
[2] Shi Chuan, Wang Ruijia, Wang Xiao. A survey of heterogeneous information network analysis and applications [J]. Journal of Software, 2022, 33(02): 598-621. DOI: 10.13328/j.cnki.jos.006357.

Graph representation learning

Graph representation learning refers to the process of mapping the nodes, edges, or whole (sub)graphs of a graph into a low-dimensional vector space. Its main purpose is to make the geometric relationships in the low-dimensional vector space reflect the structural information of the original graph. The embedding vectors learned in this optimized low-dimensional space can then be used as feature input for downstream tasks. Depending on the object being embedded, graph representation learning can be divided into the following three types:

  • Node embedding : Nodes in graph data often represent important entities, and these nodes carry a large number of attributes as their features. By capturing node attributes and performing representation learning on them, lower-dimensional node embeddings are obtained that support more effective analysis of the real-world entities they represent.
  • Edge embedding : The close or distant relationships between entities can be represented by the edges of the graph. Different dependencies represented by edges differ in importance, which is why some edges carry weight attributes. Therefore, when performing representation learning on edges, both the attribute features of the edge itself and the features of the associated nodes can be used to enrich the representation.
  • Subgraph embedding : Subgraph embedding learns a representation of a subgraph as a whole to obtain a low-dimensional embedding vector. An appropriate subgraph must first be constructed according to certain rules, after which the attributes and structural characteristics of the nodes within the subgraph are captured. When the amount of data being processed is large, subgraphs are sometimes used in place of the entire graph, which can greatly reduce computational cost.

Basic concepts

A graph is denoted $G = \langle V, E, T, X \rangle$, where $V$ is the node set, $E$ is the edge set, $T$ is the set of node and edge types, and $X$ is the attribute matrix. The functions $\varphi: V \rightarrow T_V$ and $\phi: E \rightarrow T_E$ map nodes and edges to their types, where $T_V$ is the set of node types and $T_E$ is the set of edge types. If $|T_V| = |T_E| = 1$, then $G$ is a homogeneous graph; if $|T_V| + |T_E| > 2$, then $G$ is a heterogeneous graph.

Adjacency matrix $A$: $A[i][j] = 1$ means there is an edge between $v_i$ and $v_j$; otherwise there is no edge.

Degree matrix $D$: a diagonal matrix, where $D[i][i] = \sum_{j=1}^{|V|} A[i][j]$ gives the degree of $v_i$.

The graph Laplacian matrix is $L = D - A$.

[Figure: an example graph and its Laplacian matrix]

Explanation : The values on the diagonal of the Laplacian matrix are the node degrees; if nodes $v_i$ and $v_j$ are adjacent, the corresponding off-diagonal value is -1; the Laplacian matrix is a symmetric matrix. Spectral graph theory is a product of the combination of graph theory and linear algebra: it studies the properties of a graph by analyzing the eigenvalues and eigenvectors of certain matrices associated with the graph. The Laplacian matrix is the core and foundational concept of spectral graph theory and has important applications in machine learning and deep learning.
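As a minimal sketch (in Python with numpy, on a hypothetical 4-node graph), the degree matrix, the Laplacian, and its spectrum can be computed directly from the adjacency matrix:

```python
import numpy as np

# Adjacency matrix of a small undirected graph:
# edges (v0,v1), (v0,v2), (v1,v2), (v2,v3)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Degree matrix D: diagonal, D[i][i] = sum of row i of A
D = np.diag(A.sum(axis=1))

# Graph Laplacian L = D - A
L = D - A
print(L)
# Diagonal entries are node degrees; L[i][j] = -1 where v_i and v_j
# are adjacent; L is symmetric for an undirected graph.

# Spectral graph theory studies the eigenvalues/eigenvectors of L:
eigvals, eigvecs = np.linalg.eigh(L)
print(eigvals)  # the smallest eigenvalue of L is always 0
```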

The purpose of graph representation learning is to find a mapping $f: v_i \rightarrow y_i \in \mathbb{R}^d$, where $y_i$ is the embedding vector of $v_i$ and the embedding dimension $d \ll |V|$.

First-order similarity : if $(v_i, v_j) \in E$, the first-order similarity between $v_i$ and $v_j$ is determined by the weight of the edge between them; otherwise their first-order similarity is 0. First-order similarity captures direct neighbor relationships between nodes.

Second-order similarity and higher-order similarity : second-order similarity captures the two-step relationship between each pair of nodes. For each node pair $(v_i, v_j)$, the second-order similarity is determined by the number of neighbor nodes shared by the two vertices; it can also be measured as the two-step transition probability from $v_i$ to $v_j$. Furthermore, higher-order similarity captures the k-step relationship (k ≥ 3) between each pair of nodes, which better preserves the global structure of the graph.
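A toy illustration of both notions, assuming an unweighted graph so that first-order similarity is simply the adjacency entry:

```python
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# First-order similarity: the (weighted) adjacency entry itself.
s1 = A[0, 1]   # direct relationship between v0 and v1

# Second-order similarity: number of neighbours shared by a node pair;
# entry (i, j) of A @ A counts two-step paths from v_i to v_j.
shared = (A @ A)[0, 3]   # common neighbours of v0 and v3

# Equivalently, the two-step transition probability uses the
# row-normalized (random-walk) matrix P = D^{-1} A.
P = A / A.sum(axis=1, keepdims=True)
two_step = (P @ P)[0, 3]
```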

Related technologies

Methods based on dimensionality reduction analysis

The graph representation learning method based on dimensionality reduction analysis reduces high-dimensional graph-structured data to a low-dimensional representation while retaining the desired properties of the original data. Its key advantage is that a deep dimensionality reduction architecture can learn more hierarchical data features without designing handcrafted features for specific graph data forms, which significantly improves the effect of graph representation learning. Specifically, it can be divided into linear and nonlinear dimensionality reduction. Linear methods include principal component analysis, linear discriminant analysis, and multidimensional scaling; nonlinear methods mainly include isometric mapping (Isomap), locally linear embedding, kernel methods, etc.
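As a sketch of the linear case, PCA can be applied to the rows of the adjacency matrix (one of several possible input matrices; the graph here is hypothetical) to obtain low-dimensional node coordinates:

```python
import numpy as np

def pca_embed(X, d=2):
    """Project the rows of X onto the top-d principal components."""
    Xc = X - X.mean(axis=0)                          # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T                             # coordinates in R^d

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Z = pca_embed(A, d=2)   # each row: a 2-dimensional node embedding
```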

Methods based on matrix factorization

The graph representation learning method based on matrix factorization, also known as the graph factorization method, usually represents the characteristics of the graph with a matrix and obtains node embeddings by factorizing that matrix. It was the first family of graph representation learning methods to achieve O(|E|) time complexity. It is mainly divided into two types: factorization of the graph Laplacian matrix, and factorization based on node proximity.
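A minimal sketch of the node-proximity variant, here simply factorizing the adjacency matrix with a truncated SVD (actual methods in this family factorize more elaborate proximity matrices):

```python
import numpy as np

def svd_embed(M, d=2):
    """Node embeddings from a truncated SVD: M ≈ U_d Σ_d V_d^T,
    using U_d √Σ_d as the embedding matrix."""
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :d] * np.sqrt(S[:d])

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Z = svd_embed(A, d=2)   # one 2-dimensional embedding per node
```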

Methods based on random walks

The graph representation method based on random walks is suitable for graphs containing a large number of path relationships, where the paths carry the topological information of the graph. First, the randomness of the walks lets the model efficiently sample graph node information. Second, the method captures graph topology by traversing adjacent nodes. Finally, a probabilistic model is trained on the randomly sampled paths to improve the node feature representations.
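A minimal DeepWalk-style sketch on a hypothetical adjacency list: sample uniform random walks from each node; the resulting node sequences would then be fed to a skip-gram model (e.g., gensim's Word2Vec) as if they were sentences of words:

```python
import random

# Toy graph as an adjacency list (node -> neighbours).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def random_walk(graph, start, length):
    """Uniform random walk of a fixed length starting from `start`."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(graph[walk[-1]]))
    return walk

# Several walks per node; a skip-gram model trained on these sequences
# yields node embeddings in which co-visited nodes end up close together.
walks = [random_walk(graph, v, length=6) for v in graph for _ in range(10)]
```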

Deep learning based methods

Methods based on convolutional neural networks

The graph convolutional network model (GCN) is a layer-wise propagation model designed by researchers inspired by graph convolution theory. It is mainly used for semi-supervised learning problems and is an efficient variant of convolutional neural networks operating on graphs. GCN uses a local first-order approximation of spectral graph convolution to operate directly on the graph and learn hidden-layer representations, capturing the local topological information of the graph while learning node feature information.
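A sketch of a single GCN layer following the propagation rule $H^{(l+1)} = \sigma(\hat{D}^{-1/2}\hat{A}\hat{D}^{-1/2}H^{(l)}W^{(l)})$ of Kipf and Welling, implemented in numpy with random (untrained) weights and a hypothetical graph:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H' = ReLU(D̂^{-1/2} Â D̂^{-1/2} H W),
    where Â = A + I adds self-loops."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)   # ReLU nonlinearity

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.random.randn(4, 8)     # initial node features
W = np.random.randn(8, 4)     # learnable weight matrix (random here)
H1 = gcn_layer(A, H, W)       # hidden-layer node representations
```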

Methods based on attention neural networks

The graph attention network model (GAT) is a bold attempt to apply the attention mechanism to graph representation learning. It addresses the limitation that existing convolution-based graph representation methods can only capture the local topological information of the graph. GAT does not require the graph to be constructed in advance, which also avoids some problems of spectrum-based graph neural networks. The graph attention layer is the core design of GAT. First, GAT performs self-attention over the input node vectors in the graph attention layer. Second, GAT uses a masking mechanism to avoid losing graph topology information. Finally, to reduce computational complexity, GAT strictly restricts the attention mechanism to the neighborhood of each node.
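A sketch of the masked attention of one GAT head, assuming a single head and a LeakyReLU slope of 0.2 as in the original paper; it is deliberately unoptimized for clarity:

```python
import numpy as np

def gat_attention(A, H, W, a):
    """Masked self-attention of a single GAT head:
    e_ij = LeakyReLU(a^T [W h_i || W h_j]), restricted to neighbours."""
    Wh = H @ W                                   # transformed features (N, d')
    N = A.shape[0]
    # raw attention logits for every node pair
    e = np.array([[np.concatenate([Wh[i], Wh[j]]) @ a for j in range(N)]
                  for i in range(N)])
    e = np.where(e > 0, e, 0.2 * e)              # LeakyReLU, slope 0.2
    # mask: attend only over the neighbourhood (plus a self-loop)
    mask = (A + np.eye(N)) > 0
    e = np.where(mask, e, -np.inf)
    alpha = np.exp(e) / np.exp(e).sum(axis=1, keepdims=True)  # row softmax
    return alpha @ Wh                            # aggregated node features

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.random.randn(4, 8)
W = np.random.randn(8, 4)
a = np.random.randn(8)        # attention vector over [Wh_i || Wh_j]
H1 = gat_attention(A, H, W, a)
```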

Methods based on generative adversarial networks

Compared with the training of other graph representation models, the idea of generative adversarial networks avoids the use of Markov chains and greatly reduces computational cost. Second, in theory any differentiable function can be used to build the generator and discriminator, ensuring that the generative adversarial framework can be applied to graph representation learning without following any particular factorization when designing the model. Finally, the adversarial training of generative adversarial networks greatly improves the robustness of the model.

Methods based on contrastive learning networks

Most existing graph representation learning methods are designed for supervised learning, where the required high-quality labels are difficult to obtain. Graph representation learning based on contrastive learning addresses this problem well: it focuses on self-supervised learning, distinguishing different node information by learning high-level abstract features, and thereby encodes the attributes and structural features of nodes.
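A sketch of a typical contrastive objective: an InfoNCE-style loss over two augmented views of the same nodes, where row i of each view forms the positive pair and all other rows act as negatives (the temperature value is an arbitrary choice here):

```python
import numpy as np

def info_nce(z1, z2, tau=0.5):
    """InfoNCE-style loss for two embedding views z1, z2 of the same
    nodes; (z1[i], z2[i]) is the positive pair, other rows are negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                       # scaled cosine similarities
    # negative log-softmax of the positive (diagonal) entry on each row
    log_prob = sim.diagonal() - np.log(np.exp(sim).sum(axis=1))
    return -log_prob.mean()

z1 = np.random.randn(4, 8)   # embeddings from augmented view 1
z2 = np.random.randn(4, 8)   # embeddings from augmented view 2
loss = info_nce(z1, z2)
```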

Methods based on spatiotemporal dynamic networks

Graph neural network models have been widely used for modeling and representation learning of graph-structured data, but existing methods are limited to static network data and ignore the dynamic nature of networks. Many real-world networks exhibit dynamic behavior, including topological evolution, feature evolution, and diffusion. Graph representation learning methods based on spatiotemporal dynamic networks emerged to address this problem. Accordingly, in dynamic graph representation learning, reasonably describing the dynamic changes of nodes and edges is the focus of research.

Hypergraph

The relationships between node pairs in a social network are far more complex than what simple graph edges can express, so a simple graph struggles to fully represent the information of a social network. The emergence of hypergraphs solves this problem. Hypergraph structures are often used to model high-order correlations among data and can better capture community structure in network data. The hypergraph neural network model applies the idea of graph convolution to the hypergraph domain and efficiently captures the nonlinear high-order correlations of nodes.
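A sketch of how a hypergraph is encoded by an incidence matrix, plus one weight-free HGNN-style propagation step (the hyperedges and features here are hypothetical, and the learnable weight matrix is omitted):

```python
import numpy as np

# Incidence matrix H: H[v][e] = 1 if node v belongs to hyperedge e.
# Hyperedge e0 = {v0, v1, v2} and e1 = {v2, v3}; unlike an ordinary
# edge, a hyperedge may join any number of nodes.
H = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]], dtype=float)

Dv = H.sum(axis=1)               # node degrees
De = np.diag(H.sum(axis=0))      # hyperedge degrees

# HGNN-style convolution propagates features as
# X' = Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X  (weights omitted here).
Dv_is = np.diag(1.0 / np.sqrt(Dv))
X = np.random.randn(4, 8)        # node features
X_new = Dv_is @ H @ np.linalg.inv(De) @ H.T @ Dv_is @ X
```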

Heterogeneous graph

Heterogeneous graph representation aims to learn representations of the different types of nodes in a graph in a low-dimensional space while preserving heterogeneous structure and semantics for downstream tasks. Compared with hypergraph representation, heterogeneous graph representation is stronger at describing and mining nonlinear high-order correlations between data samples, while hypergraph representation pays more attention to modeling multivariate relationships between nodes. In addition, compared with homogeneous graph representation, both heterogeneous graph and hypergraph representations can learn more comprehensive information and richer semantics from the graph, are more flexible when processing multi-modal data, are more convenient for multi-modal fusion and extension, and are widely used in fields such as knowledge graphs.

Heterogeneous information networks

Most works model information networks as homogeneous information networks, i.e., networks containing only one type of object and link, such as author collaboration networks and friendship networks. Homogeneous network modeling often extracts only part of the information of the actual interacting system, or fails to distinguish the heterogeneity of objects and of the relationships between them, resulting in irreversible information loss.

In recent years, more researchers have modeled multi-typed, interconnected networked data as heterogeneous information networks, achieving a more complete and natural abstraction of the real world. For example, bibliographic data contains different types of objects such as authors, papers, and conferences, with many types of relationships between them: the writing/written-by relationship between authors and papers, the publishing/published-in relationship between conferences and papers, etc. Modeling such richly interactive data with heterogeneous networks retains more comprehensive semantic and structural information.

Compared with homogeneous networks, heterogeneous network modeling brings two benefits:

  1. Heterogeneous networks are an effective tool for information fusion. They can naturally fuse not only different types of objects and their interactions , but also information from heterogeneous data sources . In particular, with the advent of the "big data" era, many different types of objects are interconnected; it is difficult to model these interacting objects as homogeneous networks, whereas heterogeneous network modeling applies naturally. At the same time, the heterogeneous multi-source "big data" generated by different platforms may each capture only partial and biased characteristics, and heterogeneous networks can naturally fuse information from these heterogeneous data sources to characterize users comprehensively. Heterogeneous network modeling has therefore become not only a powerful tool for addressing the variety of big data, but also a main approach to breadth learning;
  2. Multiple types of objects and relationships coexist in heterogeneous networks, containing rich structural and semantic information , thus providing an accurate and interpretable new way to discover hidden patterns. For example, the heterogeneous network of a recommendation system no longer contains only the two kinds of objects, users and products, but also more comprehensive content such as stores and brands; relationships are no longer limited to purchases, but include finer-grained interactions such as collecting and favoriting.

Basic concepts

Information network : an information network is a directed graph $G = \langle V, E, \varphi, \phi \rangle$ with an object type mapping function $\varphi: V \rightarrow A$ and a link type mapping function $\phi: E \rightarrow R$, where each object $v \in V$ belongs to one particular object type in the object type set $A$, and each link $e \in E$ belongs to one particular relation type in the relation type set $R$.

Heterogeneous/homogeneous network : if the number of object types $|A| > 1$ or the number of relation types $|R| > 1$ in the information network, it is called a heterogeneous network; otherwise, it is called a homogeneous network.

Network schema : the network schema, denoted $T_G = (A, R)$, is the meta template of an information network $G = \langle V, E, \varphi, \phi \rangle$ with object type mapping $\varphi$ and relation type mapping $\phi$.

Figure 1 shows an information network built from bibliographic data. Panel (b) illustrates the network schema describing the object types of the heterogeneous bibliographic network and the relation types between them, while (a) is a network instance of (b). This example contains 3 types of objects: paper (P), author (A), and conference (C). Links connect objects of different types, and the type of a link is defined by the relationship between the two object types; for example, a link between an author and a paper represents the writing/written-by relationship, while a link between a conference and a paper represents the publishing/published-in relationship.

[Figure 1: an information network built from bibliographic data — (a) network instance, (b) network schema]

Semantic exploration methods

[Figure 2: a movie recommendation heterogeneous network; panel (c) shows a meta-structure/meta-graph]

Meta-path

A meta-path $P$ is a path defined on the network schema $T_G = (A, R)$, denoted $A_1 \xrightarrow{R_1} A_2 \xrightarrow{R_2} \cdots \xrightarrow{R_l} A_{l+1}$. It defines a composite relation $R = R_1 \circ R_2 \circ \cdots \circ R_l$ between the object types $A_1, A_2, \ldots, A_{l+1}$, where $\circ$ denotes the composition operator on relations.

Take the movie recommendation heterogeneous network shown in Figure 2 as an example. Users can be connected through meta-paths such as $U \xrightarrow{rate} M \xrightarrow{rate^{-1}} U$ (the UMU path) and $U \xrightarrow{rate} M \xrightarrow{direct^{-1}} D \xrightarrow{direct} M \xrightarrow{rate^{-1}} U$ (the UMDMU path). These paths carry different semantics: the UMU path means that two users rated the same movie (a co-rating relationship), while the UMDMU path indicates that two users rated movies by the same director.
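Meta-path semantics can be made concrete with commuting matrices: multiplying the relation matrices along a meta-path counts its path instances. A toy sketch with hypothetical rating and directing matrices:

```python
import numpy as np

# Bipartite relation matrices of the movie network (hypothetical sizes):
# R_UM[u][m] = 1 if user u rated movie m;
# R_MD[m][d] = 1 if movie m was directed by director d.
R_UM = np.array([[1, 1, 0],
                 [0, 1, 1]], dtype=float)   # 2 users x 3 movies
R_MD = np.array([[1, 0],
                 [1, 0],
                 [0, 1]], dtype=float)      # 3 movies x 2 directors

# Commuting matrix of the meta-path UMU: entry (i, j) counts the
# movies co-rated by users i and j.
M_UMU = R_UM @ R_UM.T

# Commuting matrix of UMDMU: entry (i, j) counts pairs of rated
# movies that share a director.
M_UMDMU = R_UM @ R_MD @ R_MD.T @ R_UM.T
```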

A meta-path essentially extracts a substructure of the heterogeneous network and embodies the rich semantic information contained in paths, and has thus become the basic semantic capture method in heterogeneous network analysis. However, because of its simple structure, a meta-path cannot capture more precise or complex semantics and is therefore often limited.

Constrained meta-path

The UMU path cannot accurately describe the co-rating relationship for a particular category of movies. The constrained meta-path therefore came into being.

A constrained meta-path is a meta-path subject to a certain constraint, expressed as $CP = P|C$, where $P = (A_1, A_2, \ldots, A_l)$ is a meta-path and $C$ is the constraint imposed on the objects in $P$. For example, the constrained meta-path $UMU \,|\, M.T = \text{"Comedy"}$ constrains the movie with the "Comedy" label, so that the path represents users' co-rating of comedy movies .
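The constraint can be implemented by masking the intermediate objects before forming the commuting matrix; a toy sketch continuing the previous example (the genre labels are hypothetical):

```python
import numpy as np

R_UM = np.array([[1, 1, 0],
                 [0, 1, 1]], dtype=float)        # 2 users x 3 movies
genre = np.array(["Comedy", "Drama", "Comedy"])  # label of each movie

# Constrained meta-path UMU | M.T = "Comedy": zero out non-comedy
# movies before forming the commuting matrix.
mask = np.diag((genre == "Comedy").astype(float))
M_comedy_UMU = R_UM @ mask @ R_UM.T   # co-ratings of comedy movies only
```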

Weighted meta-path

The meta-path does not consider attributes on links, such as users' rating scores for movies, so path instances whose link attributes differ can carry very different semantics. The concept of the weighted meta-path was therefore proposed to further constrain the link attribute information.

A weighted meta-path is an extended meta-path that imposes constraints on the values of relation attributes. It can be expressed as $A_1 \xrightarrow{\delta(R_1)} A_2 \xrightarrow{\delta(R_2)} \cdots \xrightarrow{\delta(R_l)} A_{l+1} \,|\, C$. The attribute value of the rating relationship between user $U$ and movie $M$ ranges from 1 to 5. The weighted meta-path $U \xrightarrow{1} M$ (i.e., U(1)M) indicates that the user gave the movie a rating of 1, meaning the user dislikes it; the weighted meta-path $U \xrightarrow{1,2} M \xrightarrow{1,2} U$ means that two users both dislike the same movie.

Metastructure/Metagraph

A meta-path is a path defined on the network schema $T_G = (A, R)$, whereas the meta-structure/meta-graph $M$ can be viewed as a directed acyclic graph composed of multiple meta-paths with common nodes.

The meta-paths UMDMU and UMAMU can each describe only one relation: two users rating movies by the same director, or two users rating movies featuring the same actor. They cannot simultaneously express the conjunction contained in the two meta-paths: two users rated movies that have the same director and feature the same actor. Such semantics can be described with a meta-structure/meta-graph, as shown in Figure 2(c). As can be seen there, the meta-structure/meta-graph M is a directed acyclic graph defined on the network schema.

Heterogeneous network representation learning

By analogy with the graph representation learning methods above, heterogeneous network representation learning methods can be organized as follows:

[Figure: taxonomy of heterogeneous network representation learning methods]


Source: https://blog.csdn.net/qq_43570515/article/details/130623999