Graph Neural Networks (2): A Detailed Walkthrough of the GCN Code in PyTorch

Preface
I use graph neural networks in my postgraduate work, so I regularly read papers and code on the topic. I am writing this series to consolidate my own understanding of the basic idea and workflow of each algorithm, and it would be great if it helps others along the way. I am still learning myself, so if you spot any mistakes, please point them out!

  • github: https://github.com/OuYangg/GNNs

1 A brief introduction to GCN

  • Paper: Semi-Supervised Classification with Graph Convolutional Networks
  • Authors: Thomas N. Kipf, Max Welling

GCN is a spectral-domain graph convolutional network. In spectral models, the feature values on the nodes are treated as a signal on the graph. Before the convolution, the signal is mapped to the spectral domain by a graph Fourier transform, which multiplies it by the transposed eigenvector matrix of the normalized Laplacian; after the convolution, it is mapped back by multiplying by the eigenvector matrix itself. The transform and its inverse are:
$$F(x) = U^T x$$
$$F^{-1}(x) = U x$$
where $U$ is the matrix of eigenvectors of the normalized Laplacian $L = I_N - D^{-1/2} A D^{-1/2}$. By the convolution theorem, the convolution of a filter $g$ with a signal $x$ is defined as
$$g * x = F^{-1}(F(g) \odot F(x)) = U(U^T g \odot U^T x),$$
where $U^T g$ is the filter in the spectral domain and $\odot$ is the element-wise product. If $U^T g$ is reduced to a learnable diagonal matrix $g_w$, this becomes
$$g_w * x = U g_w U^T x.$$
A well-known spectral model is ChebNet, whose idea is to approximate the filter with Chebyshev polynomials, which gives
$$g_w * x = \sum_{k=0}^{K} w_k T_k(\widetilde{L}) x,$$
where $T_k(x) = 2x T_{k-1}(x) - T_{k-2}(x)$ with $T_0(x) = 1$ and $T_1(x) = x$, $\widetilde{L} = \frac{2}{\lambda_{max}} L - I_N$, and $\lambda_{max}$ is the largest eigenvalue of $L$.
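As a small numerical check of the formulas above, here is a minimal NumPy sketch. It builds the normalized Laplacian of a toy graph, applies the graph Fourier transform and its inverse, and compares a Chebyshev filter evaluated through the recurrence with the same filter applied in the spectral domain. The 4-node path graph, the signal `x`, and the weights `w` are made up for illustration and do not come from the paper or from pyGCN.

```python
import numpy as np

# Hypothetical toy graph: an undirected path 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
N = A.shape[0]

# Normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
L = np.eye(N) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

# L is symmetric, so eigh returns real eigenvalues and orthonormal eigenvectors U.
lam, U = np.linalg.eigh(L)

x = np.array([1.0, -2.0, 0.5, 3.0])   # a signal with one value per node
x_hat = U.T @ x                        # graph Fourier transform: U^T x
x_back = U @ x_hat                     # inverse transform: U x_hat

# Chebyshev filter of order K = 2 with made-up weights w_k.
w = np.array([0.5, -0.3, 0.2])
lam_max = lam.max()
L_tilde = (2.0 / lam_max) * L - np.eye(N)

# Spatial evaluation via the recurrence T_k = 2 L~ T_{k-1} - T_{k-2}.
T_prev, T_curr = x, L_tilde @ x        # T_0(L~) x and T_1(L~) x
y_cheb = w[0] * T_prev + w[1] * T_curr
for k in range(2, len(w)):
    T_prev, T_curr = T_curr, 2.0 * (L_tilde @ T_curr) - T_prev
    y_cheb = y_cheb + w[k] * T_curr

# Spectral evaluation: U g(Lambda~) U^T x, with g applied to the rescaled eigenvalues.
lam_tilde = (2.0 / lam_max) * lam - 1.0
g = w[0] + w[1] * lam_tilde + w[2] * (2.0 * lam_tilde**2 - 1.0)
y_spec = U @ (g * x_hat)

print(np.allclose(x_back, x), np.allclose(y_cheb, y_spec))
```

The second comparison is the reason ChebNet is practical: the recurrence only needs matrix-vector products with $\widetilde{L}$, so the filter can be applied without computing the eigendecomposition.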

GCN builds on ChebNet by setting $K = 1$ and $\lambda_{max} \approx 2$, which gives
$$g_w * x = w_0 x + w_1 \widetilde{L} x,$$
where $\widetilde{L}$ simplifies to $L - I_N = -D^{-1/2} A D^{-1/2}$. Setting $w = w_0 = -w_1$ to reduce the number of parameters gives
$$g_w * x = w (I_N + D^{-1/2} A D^{-1/2}) x.$$
For numerical stability, $I_N + D^{-1/2} A D^{-1/2}$ is then replaced by $\widetilde{D}^{-1/2} \widetilde{A} \widetilde{D}^{-1/2}$, where $\widetilde{A} = A + I_N$ and $\widetilde{D}_{ii} = \sum_j \widetilde{A}_{ij}$ (the renormalization trick from the paper), which gives
$$H = \sigma(\widetilde{D}^{-1/2} \widetilde{A} \widetilde{D}^{-1/2} X W),$$
where $X \in R^{N \times F}$ is the input (the node feature matrix), $W \in R^{F \times F'}$ is the learnable weight matrix, $F'$ is the output size of the first layer, and $\sigma$ is the ReLU activation function. This is the forward propagation formula of GCN.
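The forward propagation formula can be sketched in a few lines of dense NumPy. This is only an illustration under stated assumptions: the function name `gcn_layer`, the 3-node graph, and the values of `X` and `W` are hypothetical, and a practical implementation would use sparse matrices as the PyTorch code in section 2 does.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN layer: ReLU(D~^{-1/2} (A + I) D~^{-1/2} X W), dense NumPy version."""
    A_tilde = A + np.eye(A.shape[0])            # add self-loops
    d_tilde = A_tilde.sum(axis=1)               # degrees including self-loops (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(d_tilde)
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return np.maximum(A_hat @ X @ W, 0.0)       # ReLU

# Hypothetical toy inputs: 3 nodes on a path 0-1-2, F = 2 input features, F' = 2 outputs.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
W = np.array([[1.0, -1.0],
              [0.5,  2.0]])

H = gcn_layer(A, X, W)
print(H.shape)
print(H)
```

Each output row is a degree-weighted average of the transformed features of a node and its neighbours, which is why stacking layers lets information travel further across the graph.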

Don't worry if the formulas above don't all make sense!
All you really need to know is that GCN is a relatively simple and easy-to-use graph neural network model. To see how well it works, look at the figure below, which shows the classification produced by a three-layer GCN that has not been trained at all. Surprising, isn't it?
[Figure: classification result of an untrained three-layer GCN]

2 Code analysis

  • Reference implementation: pyGCN [2]
  • Import the required libraries
import math
import time
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import scipy.sparse as sp
import argparse

2.1 Import data

def encode_onehot(labels):
    """Encode the labels as one-hot vectors."""
    # Sort the classes so the label-to-index mapping is the same on every run.
    classes = sorted(set(labels))
    classes_dict = {c: np.identity(len(classes))[i, :]
                    for i, c in enumerate(classes)}
    labels_onehot = np.array(list(map(classes_dict.get, labels)),
                             dtype=np.int32)
    return labels_onehot
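To make the encoding concrete, here is a compact, self-contained variant of the same idea together with the step that `load_data` later uses to turn the one-hot rows back into integer class ids. The label strings are hypothetical examples in the style of the last column of `cora.content`, and sorting the classes is an assumption made here so that the output is reproducible.

```python
import numpy as np

def encode_onehot(labels):
    """Map each distinct label to a row of the identity matrix."""
    classes = sorted(set(labels))                 # sorted for a reproducible class order
    index = {c: i for i, c in enumerate(classes)}
    eye = np.identity(len(classes), dtype=np.int32)
    return np.array([eye[index[label]] for label in labels], dtype=np.int32)

# Hypothetical labels for four papers.
labels = ["Theory", "Neural_Networks", "Theory", "Rule_Learning"]
onehot = encode_onehot(labels)

# np.where returns (row indices, column indices) of the nonzero entries;
# the column index of the single 1 in each row is the class id.
class_ids = np.where(onehot)[1]
print(onehot)
print(class_ids)
```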

def normalize(mx):
    """Row-normalize a sparse matrix (each row is divided by its sum)."""
    rowsum = np.array(mx.sum(1))
    r_inv = np.power(rowsum, -1).flatten()
    r_inv[np.isinf(r_inv)] = 0.  # rows that sum to zero stay all-zero
    r_mat_inv = sp.diags(r_inv)
    mx = r_mat_inv.dot(mx)
    return mx
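Note that `normalize` computes the row normalization $D^{-1} A$, whereas the formula in section 1 uses the symmetric normalization $D^{-1/2} A D^{-1/2}$. The dense NumPy sketch below compares the two on a hypothetical 3-node graph with self-loops already added; it is an illustration only and is not part of the original code.

```python
import numpy as np

# Hypothetical adjacency with self-loops already added (A + I) for a 3-node path.
A_tilde = np.array([[1, 1, 0],
                    [1, 1, 1],
                    [0, 1, 1]], dtype=float)
deg = A_tilde.sum(axis=1)

# Row normalization, as in normalize(): D^{-1} A. Every row sums to 1.
A_row = A_tilde / deg[:, None]

# Symmetric normalization from the paper: D^{-1/2} A D^{-1/2}.
d_inv_sqrt = 1.0 / np.sqrt(deg)
A_sym = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

print(A_row.sum(axis=1))            # row sums of the row-normalized matrix
print(np.allclose(A_sym, A_sym.T))  # the symmetric version stays symmetric
print(np.allclose(A_row, A_row.T))  # the row-normalized version does not when degrees differ
```

Row normalization averages over the neighbours of a node, while the symmetric form also scales by the degree of each neighbour. Both keep the propagated features on a comparable scale.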

def sparse_mx_to_torch_sparse_tensor(sparse_mx):
    """Convert a scipy sparse matrix to a torch sparse tensor."""
    sparse_mx = sparse_mx.tocoo().astype(np.float32)
    indices = torch.from_numpy(
        np.vstack((sparse_mx.row, sparse_mx.col)).astype(np.int64))
    values = torch.from_numpy(sparse_mx.data)
    shape = torch.Size(sparse_mx.shape)
    # torch.sparse.FloatTensor is deprecated; sparse_coo_tensor is the current constructor.
    return torch.sparse_coo_tensor(indices, values, shape)

def load_data(path="./cora/", dataset="cora"):
    """Load the Cora citation network dataset."""
    print('Loading {} dataset...'.format(dataset))
    idx_features_labels = np.genfromtxt("{}{}.content".format(path, dataset),
                                        dtype=np.dtype(str)) # read the .content file with NumPy
    features = sp.csr_matrix(idx_features_labels[:, 1:-1], dtype=np.float32) # node feature matrix
    labels = encode_onehot(idx_features_labels[:, -1]) # one-hot labels

    # build graph
    idx = np.array(idx_features_labels[:, 0], dtype=np.int32)
    idx_map = {j: i for i, j in enumerate(idx)}
    edges_unordered = np.genfromtxt("{}{}.cites".format(path, dataset),
                                    dtype=np.int32)
    edges = np.array(list(map(idx_map.get, edges_unordered.flatten())),
                     dtype=np.int32).reshape(edges_unordered.shape)
    adj = sp.coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])),
                        shape=(labels.shape[0], labels.shape[0]),
                        dtype=np.float32)

    # build symmetric adjacency matrix
    adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj)

    features = normalize(features)
    adj = normalize(adj + sp.eye(adj.shape[0]))

    idx_train = range(140)
    idx_val = range(200, 500)
    idx_test = range(500, 1500)

    features = torch.FloatTensor(np.array(features.todense()))
    labels = torch.LongTensor(np.where(labels)[1])
    adj = sparse_mx_to_torch_sparse_tensor(adj)

    idx_train = torch.LongTensor(idx_train)
    idx_val = torch.LongTensor(idx_val)
    idx_test = torch.LongTensor(idx_test)

    return adj, features, labels, idx_train, idx_val, idx_test
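The line that builds the symmetric adjacency matrix in `load_data` is terse. It keeps, for every pair of nodes, the larger of `adj[i, j]` and `adj[j, i]`, so a directed citation becomes an undirected edge without reciprocal citations being counted twice. The sketch below reproduces the expression with dense NumPy arrays (using `*` in place of the sparse `.multiply`) on three made-up edges.

```python
import numpy as np

# Hypothetical directed citation edges: 0 -> 1, 1 -> 2, and 2 -> 1 (a reciprocal pair).
adj = np.zeros((3, 3), dtype=np.float32)
for src, dst in [(0, 1), (1, 2), (2, 1)]:
    adj[src, dst] = 1.0

# Dense analogue of the scipy.sparse expression used in load_data.
mask = adj.T > adj                       # entries where only the reverse edge exists
sym = adj + adj.T * mask - adj * mask

print(sym)
print(np.array_equal(sym, np.maximum(adj, adj.T)))
print(np.array_equal(sym, sym.T))
```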

2.2 GCN model framework

class GCNLayer(nn.Module):
    """A single GCN layer: output = adj @ (x @ weights) (+ bias)."""
    def __init__(self,input_features,output_features,bias=False):
        super(GCNLayer,self).__init__()
        self.input_features = input_features
        self.output_features = output_features
        self.weights = nn.Parameter(torch.FloatTensor(input_features,output_features))
        if bias:
            self.bias = nn.Parameter(torch.FloatTensor(output_features))
        else:
            self.register_parameter('bias',None)
        self.reset_parameters()

    def reset_parameters(self):
        """Initialize the parameters uniformly in [-std, std]."""
        std = 1./math.sqrt(self.weights.size(1))
        self.weights.data.uniform_(-std,std)
        if self.bias is not None:
            self.bias.data.uniform_(-std,std)

    def forward(self,adj,x):
        support = torch.mm(x,self.weights)
        output = torch.spmm(adj,support)
        if self.bias is not None:
            return output+self.bias
        return output
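`GCNLayer.forward` multiplies the features by the weights first and only then by the adjacency matrix. The result is the same either way because matrix multiplication is associative, but the cost differs. The NumPy sketch below checks this on random dense matrices; the sizes are hypothetical and the operation counts assume dense matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, F_in, F_out = 5, 8, 2                      # hypothetical sizes
adj = rng.random((N, N))
x = rng.random((N, F_in))
W = rng.random((F_in, F_out))

out_layer = adj @ (x @ W)                     # order used in GCNLayer.forward
out_other = (adj @ x) @ W                     # same result, different cost

# Multiply-add counts for dense matrices of these shapes.
cost_layer = N * F_in * F_out + N * N * F_out
cost_other = N * N * F_in + N * F_in * F_out

print(np.allclose(out_layer, out_other), cost_layer, cost_other)
```

When the output dimension is smaller than the input dimension, as in the first layer on Cora where many input features are reduced to 16 hidden units, projecting first means the adjacency multiplication works on a much narrower matrix. With a sparse adjacency matrix the `N * N` factor becomes the number of nonzero entries, but the same argument applies.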

class GCN(nn.Module):
    """Two-layer GCN model."""
    def __init__(self,input_size,hidden_size,num_class,dropout,bias=False):
        super(GCN,self).__init__()
        self.input_size=input_size
        self.hidden_size=hidden_size
        self.num_class = num_class
        self.gcn1 = GCNLayer(input_size,hidden_size,bias=bias)
        self.gcn2 = GCNLayer(hidden_size,num_class,bias=bias)
        self.dropout = dropout
    def forward(self,adj,x):
        x = F.relu(self.gcn1(adj,x))
        x = F.dropout(x,self.dropout,training=self.training)
        x = self.gcn2(adj,x)
        return F.log_softmax(x,dim=1)
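The model returns `log_softmax` outputs, which the training code pairs with `F.nll_loss`. Together the two amount to the usual cross-entropy loss. The following NumPy sketch illustrates the relationship with hand-written versions of both functions; the logits and labels are made up, and the helper functions are simplified stand-ins rather than the PyTorch implementations.

```python
import numpy as np

def log_softmax(z):
    """Row-wise log-softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def nll_loss(log_probs, targets):
    """Mean negative log-likelihood of the target class."""
    return -log_probs[np.arange(len(targets)), targets].mean()

# Hypothetical logits for 3 nodes and 3 classes, with their true labels.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2,  0.3],
                   [-1.0, 3.0, 0.0]])
labels = np.array([0, 2, 1])

log_probs = log_softmax(logits)
loss = nll_loss(log_probs, labels)

# Cross-entropy computed directly from softmax probabilities gives the same value.
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
cross_entropy = -np.log(probs[np.arange(3), labels]).mean()
print(np.isclose(loss, cross_entropy))
```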

2.3 Evaluation and training

def accuracy(output, labels):
    preds = output.max(1)[1].type_as(labels)
    correct = preds.eq(labels).double()
    correct = correct.sum()
    return correct / len(labels)
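In this semi-supervised setting the model produces an output for every node, and the index tensors (`idx_train`, `idx_val`, `idx_test`) select which rows enter the loss and the accuracy. The NumPy sketch below mirrors that pattern on five made-up nodes; the probabilities, labels, and index sets are hypothetical.

```python
import numpy as np

# Hypothetical log-probabilities for 5 nodes and 3 classes.
output = np.log(np.array([[0.7, 0.2, 0.1],
                          [0.1, 0.8, 0.1],
                          [0.3, 0.3, 0.4],
                          [0.6, 0.3, 0.1],
                          [0.2, 0.5, 0.3]]))
labels = np.array([0, 1, 2, 1, 1])
idx_train = np.array([0, 1, 2])      # labelled nodes used for the loss
idx_val = np.array([3, 4])           # held-out nodes

def accuracy(output, labels):
    preds = output.argmax(axis=1)    # counterpart of output.max(1)[1] in PyTorch
    return (preds == labels).mean()

acc_train = accuracy(output[idx_train], labels[idx_train])
acc_val = accuracy(output[idx_val], labels[idx_val])
print(acc_train, acc_val)
```

Unlabelled nodes still take part in the forward pass through the adjacency matrix. They are only excluded when the loss and metrics are computed.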

def train_gcn(epoch):
    t = time.time()
    model.train()
    optimizer.zero_grad()
    output = model(adj,features)
    loss = F.nll_loss(output[idx_train],labels[idx_train])
    acc = accuracy(output[idx_train],labels[idx_train])
    loss.backward()
    optimizer.step()
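    if not args.fastmode:
        # Re-run the forward pass in eval mode so dropout is disabled for validation.
        model.eval()
        output = model(adj,features)
    loss_val = F.nll_loss(output[idx_val],labels[idx_val])
    acc_val = accuracy(output[idx_val], labels[idx_val])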
    print('Epoch: {:04d}'.format(epoch+1),
          'loss_train: {:.4f}'.format(loss.item()),
          'acc_train: {:.4f}'.format(acc.item()),
          'loss_val: {:.4f}'.format(loss_val.item()),
          'acc_val: {:.4f}'.format(acc_val.item()),
          'time: {:.4f}s'.format(time.time() - t))


def test():
    model.eval()
    output = model(adj,features)
    loss_test = F.nll_loss(output[idx_test], labels[idx_test])
    acc_test = accuracy(output[idx_test], labels[idx_test])
    print("Test set results:",
          "loss= {:.4f}".format(loss_test.item()),
          "accuracy= {:.4f}".format(acc_test.item()))

if __name__ == '__main__':
    # Training settings
    parser = argparse.ArgumentParser()
    parser.add_argument('--no-cuda', action='store_true', default=False,
                        help='Disables CUDA training.')
    parser.add_argument('--fastmode', action='store_true', default=False,
                        help='Validate during training pass.')
    parser.add_argument('--seed', type=int, default=42, help='Random seed.')
    parser.add_argument('--epochs', type=int, default=200,
                        help='Number of epochs to train.')
    parser.add_argument('--lr', type=float, default=0.01,
                        help='Initial learning rate.')
    parser.add_argument('--weight_decay', type=float, default=5e-4,
                        help='Weight decay (L2 loss on parameters).')
    parser.add_argument('--hidden', type=int, default=16,
                        help='Number of hidden units.')
    parser.add_argument('--dropout', type=float, default=0.5,
                        help='Dropout rate (1 - keep probability).')

    args = parser.parse_args()
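    np.random.seed(args.seed)
    torch.manual_seed(args.seed)  # also seed PyTorch so weight initialization and dropout are reproducible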
    adj, features, labels, idx_train, idx_val, idx_test = load_data()
    model = GCN(features.shape[1],args.hidden,labels.max().item() + 1,dropout=args.dropout)
    optimizer = optim.Adam(model.parameters(),lr=args.lr,weight_decay=args.weight_decay)
    for epoch in range(args.epochs):
        train_gcn(epoch)
    test()  # report loss and accuracy on the held-out test nodes

The results are as follows:
[Figure: output of the run]

References

[1] Hamilton W L, Ying R, Leskovec J. Inductive representation learning on large graphs[J]. arXiv preprint arXiv:1706.02216, 2017.
[2] https://github.com/tkipf/pygcn

Origin blog.csdn.net/weixin_44027006/article/details/124100199