What can graph neural networks do?

Conceptually, we can divide basic research on graph neural networks into five directions:

(1) Graph neural network methods;
(2) Theoretical understanding of graph neural networks;
(3) Scalability of graph neural networks;
(4) Interpretability of graph neural networks;
(5) Robustness of graph neural networks.

Graph neural network methods. Graph neural networks are neural network architectures designed specifically to operate on graph-structured data. Their goal is to iteratively update each node's representation by aggregating the representations of its neighbors together with its own representation from the previous iteration. A variety of graph neural networks have been proposed (Kipf and Welling, 2017a; Veličković et al, 2018; Hamilton et al, 2017b; Gilmer et al, 2017; Xu et al, 2019d; Veličković et al, 2019; Kipf and Welling, 2016), and they can be further divided into supervised and unsupervised GNNs. Once node representations are learned, one of the basic tasks of a GNN is node classification, that is, assigning nodes to predefined categories. Although various GNNs have achieved great success, training deep graph neural networks still faces a serious problem, the oversmoothing problem (Li et al, 2018b), in which all nodes converge to similar representations. Many recent studies have proposed different remedies for oversmoothing.
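
To make the aggregate-and-update idea concrete, here is a minimal NumPy sketch of one generic message-passing iteration. The function name, the mean aggregator, and the tanh update are illustrative choices for this sketch, not a specific published architecture.

```python
import numpy as np

def aggregate_step(adj, h, w_self, w_neigh):
    """One generic message-passing iteration: every node averages its
    neighbors' representations and combines the result with its own
    representation from the previous iteration."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)     # guard isolated nodes
    neigh_mean = (adj @ h) / deg                         # aggregate neighbors
    return np.tanh(h @ w_self + neigh_mean @ w_neigh)    # update

# Stacking k such steps lets each node see its k-hop neighborhood.
```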

Theoretical understanding of graph neural networks. The rapid development of GNN algorithms has sparked great interest in the theoretical analysis of GNNs. In particular, efforts have been made to characterize how expressive GNNs are compared with traditional graph algorithms, such as graph kernel-based methods, and to build more powerful GNNs that overcome some of these limitations. Specifically, Xu et al (2019d) showed that current GNN methods can achieve the expressiveness of the one-dimensional Weisfeiler-Lehman test (Weisfeiler and Leman, 1968), which is widely used in traditional graph kernels (Shervashidze et al, 2011b).

Many recent studies have proposed design strategies that push expressive power beyond the one-dimensional Weisfeiler-Lehman test, including attaching random attributes or distance attributes to nodes and exploiting higher-order structures.
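
The Weisfeiler-Lehman test itself is short enough to state in code. The following is a small sketch of 1-WL color refinement, the procedure whose expressiveness bounds standard message-passing GNNs; the function and variable names are our own.

```python
def wl_refine(adj_list, labels, iterations=3):
    """1-dimensional Weisfeiler-Lehman color refinement: repeatedly
    relabel each node by its own color plus the multiset of its
    neighbors' colors, then compress each signature to a new color."""
    colors = dict(labels)
    for _ in range(iterations):
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj_list[v])))
            for v in adj_list
        }
        palette, colors = {}, {}
        for v, sig in signatures.items():
            colors[v] = palette.setdefault(sig, len(palette))
    return colors

# Two disjoint triangles and a 6-cycle are both 2-regular, so 1-WL gives
# them identical color histograms: the classic pair of graphs that
# standard message-passing GNNs cannot distinguish.
```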

Scalability of graph neural networks. With the increasing popularity of graph neural networks, many practitioners apply GNN methods to real-world applications, where graphs can have on the order of 100 million nodes and 1 billion edges. Unfortunately, most GNN methods cannot be applied directly to such large-scale graph-structured data because of the memory they require (Hu et al, 2020b). Specifically, most GNNs need to keep the entire adjacency matrix and the feature matrices of intermediate layers in memory, which poses a huge challenge in memory consumption and computational cost. To address these issues, many recent studies have proposed sampling strategies, such as node sampling (Hamilton et al, 2017b; Chen et al, 2018d), layer sampling (Chen and Bansal, 2018; Huang, 2018), and graph sampling (Chiang et al, 2019; Zeng et al, 2020a).
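
As a concrete illustration of node sampling, here is a rough GraphSAGE-style sketch (in the spirit of Hamilton et al, 2017b) that caps the number of neighbors expanded per node per layer; the fanout values and helper names are illustrative, not from any specific implementation.

```python
import random

def sample_neighborhood(adj_list, batch, fanouts):
    """GraphSAGE-style neighbor sampling: keep at most `fanout` random
    neighbors per node per layer, so the memory footprint grows with
    the mini-batch rather than with the whole graph."""
    layers = [set(batch)]
    for fanout in fanouts:                  # one fanout per GNN layer
        frontier = set(layers[-1])          # keep nodes already needed
        for v in layers[-1]:
            k = min(fanout, len(adj_list[v]))
            frontier.update(random.sample(adj_list[v], k))
        layers.append(frontier)
    return layers                           # nodes required at each hop

# e.g. sample_neighborhood(adj_list, batch=[0, 7], fanouts=[10, 5])
# for a 2-layer GNN: at most 10 neighbors at hop 1, then 5 at hop 2.
```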

Interpretability of graph neural networks. To make the machine learning process comprehensible to humans, explainable artificial intelligence has become increasingly popular, especially given the black-box nature of deep learning techniques. There is therefore also great interest in improving the interpretability of GNNs. Generally speaking, a GNN explanation identifies important nodes, important edges, or important features of nodes or edges. Technically, white-box methods (Baldassarre and Azizpour, 2019; Sanchez-Lengeling et al, 2020) exploit information inside the model, such as gradients, intermediate features, and model parameters, to provide explanations. In contrast, black-box methods (Huang et al, 2020c; Zhang et al, 2020a; Vu and Thai, 2020) avoid using the internal information of complex models and instead fit inherently interpretable simple models, such as linear regression and decision trees, to the complex model. However, most existing works are time-consuming, which makes large-scale graphs a bottleneck. To this end, many recent efforts aim to develop more efficient methods without compromising the accuracy of the explanations.
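
For instance, a basic white-box explanation can be computed from input gradients alone. The sketch below assumes a hypothetical trained PyTorch model whose call model(x, edge_index) returns per-node class logits; it is a generic gradient-saliency illustration, not a reimplementation of any cited method.

```python
import torch

def feature_saliency(model, x, edge_index, node_id, target_class):
    """Generic white-box explanation: the gradient of one node's
    predicted class score with respect to the input feature matrix
    indicates which input features that prediction relied on."""
    x = x.clone().detach().requires_grad_(True)
    logits = model(x, edge_index)       # assumed: per-node class logits
    logits[node_id, target_class].backward()
    return x.grad.abs()                 # (num_nodes, num_features) scores
```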

Adversarial robustness of graph neural networks. Trustworthy machine learning has attracted a great deal of attention lately, because existing research shows that deep learning models can be deliberately fooled, evaded, misled, and stolen (Goodfellow et al, 2015). A series of works has therefore extensively studied model robustness in fields such as computer vision and natural language processing, which has inspired similar studies on the robustness of GNNs. Technically, the standard way to study GNN robustness (via adversarial examples) is to construct a small change in the input graph data and then observe whether it leads to a large change in the prediction results (such as node classification accuracy). At present, more and more researchers are studying adversarial attacks (Dai et al, 2018a; Wang and Gong, 2019; Wu et al, 2019b; Zügner et al, 2018; Zügner et al, 2020) and adversarial training (Xu et al, 2019c; Feng et al, 2019b; Chen et al, 2020i; Jin and Zhang, 2019). Many recent efforts focus on providing theoretical guarantees and developing new algorithms for adversarial training and certified robustness.
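
A minimal version of this robustness probe, deleting a single edge and measuring how far the predictions move, might look as follows. The sketch assumes edge_index is the usual 2 x E tensor of edge endpoints and model is a placeholder GNN; it is not taken from any of the attack papers cited above.

```python
import torch

def prediction_shift(model, x, edge_index, u, v):
    """Probe robustness: apply a small change (delete edge u -> v) and
    measure how much the model's predictions change."""
    with torch.no_grad():
        before = model(x, edge_index).softmax(dim=-1)
        keep = ~((edge_index[0] == u) & (edge_index[1] == v))
        after = model(x, edge_index[:, keep]).softmax(dim=-1)
    # For an undirected graph, the reverse edge v -> u should be dropped too.
    return (before - after).abs().max().item()
```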

Graph Neural Network Applications

Since graph neural networks can model various kinds of data with complex structure, they have been widely used in many applications and fields, such as modern recommender systems, computer vision (CV), natural language processing (NLP), program analysis, software mining, bioinformatics, anomaly detection, and smart cities. Although GNNs solve different tasks in different applications, they all involve two important steps: graph construction and graph representation learning. Graph construction aims to transform or represent the input data as structured data; on top of the constructed graph, graph representation learning then uses a GNN to learn node embeddings or graph embeddings for the downstream task. Next, we briefly introduce the techniques involved in these two steps for different applications.

1 Graph Construction

Graph construction is important for capturing the dependencies between objects in the input data. Given the different formats of input data, different applications require different graph construction techniques; some tasks need predefined semantics for nodes and edges in order to fully express the structural information of the input data.

Input data with an explicit graph structure. Some applications naturally carry a graph structure inside the data, with no need to predefine the nodes or the edges and relationships between them. For example, in recommender systems, user-item interactions naturally form a graph, where the preferences of users for items are treated as edges between user nodes and item nodes. In drug development, molecules are naturally represented as graphs, where each node represents an atom and each edge represents a bond connecting two atoms. In tasks involving protein function and interaction, a graph likewise fits proteins directly, where each node represents an amino acid and each edge represents an interaction between amino acids.
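
Constructing such an explicit graph often amounts to little more than collecting the observed interactions as an edge list, as in this toy recommender-system sketch (the node names are made up):

```python
# "User u interacted with item i" records already define the graph:
# users and items are nodes, and each interaction is an edge.
interactions = [("u1", "i1"), ("u1", "i2"), ("u2", "i2"), ("u3", "i1")]

nodes = sorted({name for edge in interactions for name in edge})
index = {name: k for k, name in enumerate(nodes)}        # node-id lookup
edges = [(index[u], index[i]) for u, i in interactions]  # numeric edge list
```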

Some graphs are constructed with attributes on nodes and edges. For example, in smart-city traffic applications, a traffic network can be formalized as an undirected graph to predict the traffic state. Specifically, nodes are traffic sensing locations, such as sensor stations and road segments, and edges are the intersections connecting these locations. Some urban transportation networks are instead modeled as directed graphs for predicting traffic speed, where nodes are road segments and edges are intersections. The width, length, and direction of a road segment are represented as node attributes, and the type of intersection, along with whether it has traffic lights or toll booths, is represented as edge attributes.
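
A hypothetical attributed road network along these lines could be assembled with networkx, for example (all names and values below are invented for illustration):

```python
import networkx as nx

# Road segments are nodes (with width, length, direction attributes);
# intersections are edges (with signal and toll information).
G = nx.DiGraph()
G.add_node("segment_a", width_m=7.5, length_m=420, direction="NS")
G.add_node("segment_b", width_m=11.0, length_m=300, direction="EW")
G.add_edge("segment_a", "segment_b", intersection_type="T-junction",
           traffic_light=True, toll_booth=False)
```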

Input data with an implicit graph structure. Graph construction becomes challenging for the many tasks in which structured data does not naturally exist. Choosing the best representation, so that all important information is captured by nodes and edges, is crucial. For example, computer vision tasks use three graph construction strategies. The first divides an image or video frame into regular grids, each of which serves as a vertex of the visual graph. The second obtains a preprocessed structure first and then directly borrows its vertex representation, as in scene graph generation. The third uses semantic information to define visual vertices, for example by assigning pixels with similar features to the same vertex. Edges in a visual graph capture two kinds of information. The first is spatial information: in static methods, such as scene graph generation (Xu et al, 2017a) and human skeletons (Jain et al, 2016a), edges between nodes in the visual graph naturally represent positional connections. The second is temporal information: to represent video, a model must not only establish spatial relationships within frames but also capture temporal connections between adjacent frames.
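
The first construction strategy, turning a regular grid into a graph, is simple enough to sketch directly; the 4-neighborhood connectivity used here is one common choice among several.

```python
import numpy as np

def grid_graph(height, width):
    """Regular-grid construction: each cell is a vertex; edges connect
    spatially adjacent cells (4-neighborhood)."""
    def vid(r, c):                       # cell (row, col) -> vertex id
        return r * width + c
    edges = []
    for r in range(height):
        for c in range(width):
            if c + 1 < width:
                edges.append((vid(r, c), vid(r, c + 1)))   # horizontal
            if r + 1 < height:
                edges.append((vid(r, c), vid(r + 1, c)))   # vertical
    return np.array(edges).T             # shape (2, num_edges)
```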

In natural language processing tasks, graphs constructed from text data fall into five categories: text graphs, syntactic graphs, semantic graphs, knowledge graphs, and hybrid graphs. Four of these are described below. Text graphs typically treat words, sentences, paragraphs, or documents as nodes and build edges from word co-occurrence, position, or textual similarity. Syntactic graphs (or trees) emphasize the grammatical dependencies between words in a sentence, as in dependency graphs and constituency graphs. Knowledge graphs are data graphs designed to accumulate and convey real-world knowledge. Hybrid graphs contain multiple types of nodes and edges to integrate heterogeneous information. In program analysis, graph representations of programs include syntax trees, control flow, data flow, program dependencies, and call graphs, each providing a different view of the program. At a higher level, a program can be regarded as a heterogeneous set of entities related through various relationships; this view maps the program directly to a heterogeneous directed graph, where each entity is a node and each type of relationship is an edge.
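
As one example of text-graph construction, the following sketch builds a word co-occurrence graph with a sliding window; the window size and count-based weighting are illustrative choices.

```python
from collections import Counter

def cooccurrence_graph(sentences, window=2):
    """Text-graph construction: words become nodes, and an edge links
    two words that co-occur within a sliding window, weighted by count."""
    weights = Counter()
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                if w != tokens[j]:
                    weights[tuple(sorted((w, tokens[j])))] += 1
    return weights   # {(word_a, word_b): co-occurrence weight}

edges = cooccurrence_graph([["graph", "neural", "networks",
                             "learn", "node", "embeddings"]])
```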

2 Graph Representation Learning

After obtaining a graph representation of the input data, the next step is to apply a GNN to learn the graph representation. Some studies directly use typical GNNs, such as GCN (Kipf and Welling, 2017a), GAT (Veličković et al, 2018), GGNN (Li et al, 2016a), and GraphSage (Hamilton et al, 2017b), which extend to different application tasks. However, some tasks require additional design of the GNN architecture to better handle their specific issues. For example, for tasks in recommender systems, PinSage (Ying et al, 2018a) takes the top-k ranked nodes around a node as its receptive field and performs weighted aggregation; PinSage can scale to web-scale recommender systems with millions of users and items. KGCN (Wang et al, 2019d) improves item representations by aggregating the corresponding entity neighborhoods in a knowledge graph. The idea of KGAT (Wang et al, 2019j) is essentially similar to that of KGCN, except that it adds an auxiliary loss for reconstructing the knowledge graph. As another example, in the NLP task of knowledge-base alignment, Xu et al (2019f) formulated the problem as graph matching and proposed a graph attention-based method: first match all entities in the two knowledge graphs, then jointly model the local matching information to obtain graph-level matching vectors. We introduce the GNN techniques for various applications in detail in the following content.
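
For reference, the propagation rule of a typical GNN such as GCN (Kipf and Welling, 2017a) can be written in a few lines of PyTorch. This dense-matrix sketch is for illustration only and would not scale to the web-scale graphs discussed above.

```python
import torch

def gcn_layer(adj, h, weight):
    """One GCN propagation step, H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W),
    following Kipf and Welling (2017a). `adj` is a dense (n, n)
    adjacency matrix and `h` an (n, d_in) feature matrix."""
    a_hat = adj + torch.eye(adj.size(0))           # add self-loops
    d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)        # D^{-1/2} as a vector
    a_norm = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
    return torch.relu(a_norm @ h @ weight)         # propagate and transform
```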

This article is excerpted from "Graph Neural Networks: Foundations, Frontiers, and Applications".

Graph Neural Networks: Foundations, Frontiers, and Applications

Frontier: Graph neural networks are an emerging direction in machine learning, data science, and data mining. Known as deep learning on graphs, they are expected to propel the development of third-generation artificial intelligence.

Rich: Summarizes the basic theory, mature algorithms, research frontiers, and broad, emerging application scenarios of graph neural networks.

In-depth: Rather than simply introducing concepts and frameworks, the book analyzes in depth the current state of graph neural networks and the challenges and opportunities ahead, helping professionals and beginners understand not only the what but also the why.

This book introduces the basic concepts and algorithms, research frontiers, and broad, emerging applications of graph neural networks, covering the field from basics to frontiers, from methods to applications, and from methodology to application scenarios. The book is divided into four parts: the first introduces the basic concepts of graph neural networks; the second discusses mature graph neural network methods; the third introduces typical frontier areas of graph neural networks; and the fourth describes advances in methods and applications that are important and promising for future research on graph neural networks.

This book is suitable as reading and reference material for advanced undergraduate and graduate students, postdoctoral researchers, lecturers, and industry practitioners.


Source: blog.csdn.net/epubit17/article/details/130313580