[PyTorch Neural Network in Practice] 41 Paper classification on the Cora dataset: implementing a graph convolutional network model with Multi-sample Dropout

Multi-sample Dropout is a variant of Dropout. According to the paper that introduced it (arXiv:1905.09788, 2019), it generalizes better than ordinary Dropout, shortens the training time of the model, and lowers the error rate and loss on both the training set and the validation set.

1 Example description

This example uses the Multi-sample Dropout method to shorten the training time of the graph convolution model.

1.1 Multi-sample Dropout method/Multi-sample joint Dropout

The method changes the step in which Dropout randomly selects nodes to discard: instead of one randomly selected group of nodes, Dropout selects several groups. The output and the loss are computed for each group, and the per-group losses are averaged to obtain the final loss, which is backpropagated to update the network, as shown in Figure 9-19.

Here Multi-sample Dropout uses two different masks in the Dropout layer to select two groups of nodes for training. In effect, the layers before Dropout process the sample only once, yet the network produces several outputs and is trained on each of them. This can greatly reduce the number of training iterations.
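The idea can be sketched in a few lines of NumPy. This is only an illustration under assumed names and toy sizes (`multi_sample_dropout_loss`, 4 samples, 6 hidden features, 3 classes), not the model built later in this post: the shared features are computed once, each mask produces its own loss, and the losses are averaged.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels given unnormalized logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def multi_sample_dropout_loss(hidden, weight, labels, num_samples=2, p=0.5):
    """Average the loss over several independent dropout masks on the same features."""
    losses = []
    for _ in range(num_samples):
        keep = rng.random(hidden.shape) >= p       # a fresh random mask per sample
        dropped = hidden * keep / (1.0 - p)        # inverted-dropout scaling
        losses.append(softmax_cross_entropy(dropped @ weight, labels))
    return float(np.mean(losses)), losses

# Toy data with hypothetical sizes.
hidden = rng.normal(size=(4, 6))   # stands in for the output of the shared layers
weight = rng.normal(size=(6, 3))   # stands in for the output layer
labels = np.array([0, 2, 1, 0])

loss, per_mask = multi_sample_dropout_loss(hidden, weight, labels)
print(per_mask, loss)
```

Note that `hidden` is computed once and reused for every mask; only the dropout and the output layer are repeated.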

1.1.2 Features

In a deep neural network, most of the computation happens in the convolutional layers that precede the Dropout layer. Multi-sample Dropout does not repeat these calculations, so it adds little to the computational cost of each iteration while substantially speeding up training.
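A rough operation count makes the cost argument concrete. The sketch below is an estimate under stated assumptions, not a measurement: it counts only the multiply-adds of dense matrix products, ignores activations and the dropout itself, and assumes 2708 nodes for Cora together with the layer sizes used later in this post. The helper `gcn_layer_cost` is hypothetical.

```python
def gcn_layer_cost(n_nodes, f_in, f_out):
    """Multiply-adds of one dense graph convolution: X @ W, then A @ (X @ W)."""
    return n_nodes * f_in * f_out + n_nodes * n_nodes * f_out

# Assumed sizes: 2708 nodes, 1433 input features, hidden sizes [16, 32, 16],
# 7 classes, 8 dropout branches.
n_nodes, dims, n_classes, branches = 2708, [1433, 16, 32, 16], 7, 8

shared = sum(gcn_layer_cost(n_nodes, a, b) for a, b in zip(dims[:-1], dims[1:]))
per_branch = gcn_layer_cost(n_nodes, dims[-1], n_classes)

single = shared + per_branch             # ordinary Dropout: one branch
multi = shared + branches * per_branch   # Multi-sample Dropout: shared layers run once
naive = branches * single                # rerunning the whole network for every mask

print(f"multi / single = {multi / single:.2f}")
print(f"naive / single = {naive / single:.2f}")
```

The extra cost depends on how large the repeated output layer is relative to the shared layers, so the deeper the shared part, the smaller the overhead.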

2 Code implementation

This example builds on the code from the previous post, Pytorch Neural Network Practical Study Notes_40 [Actual Combat] Graph Convolutional Neural Network for Paper Classification (https://blog.csdn.net/qq_39237205/article/details/123863327). Only section 2.7, which builds the multi-layer graph convolution model, and the training part are modified.

 2 Code writing

2.1 Code implementation: import the basic modules and set up the runtime environment----Cora_GNN.py (Part 1)

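from pathlib import Path  # Path objects keep file paths portable across platforms
# Libraries for matrix operations
import numpy as np
import pandas as pd
from scipy.sparse import coo_matrix, csr_matrix, diags, eye
# Deep learning framework
import torch
from torch import nn
import torch.nn.functional as F
# Plotting library
import matplotlib.pyplot as plt
import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"  # let the program continue if more than one OpenMP runtime is loaded

# 1.1 Import the basic modules and set up the runtime environment
# Select the compute device
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
print(device)  # prints cuda when a GPU is available, otherwise cpu

# Path to the dataset
path = Path('./data/cora')
print(path)  # prints the dataset path, data/cora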

2.2 Code implementation: reading and parsing paper data----Cora_GNN.py (Part 2)

# 1.2 Read and parse the paper data
# Read the paper content file into an array of strings
paper_features_label = np.genfromtxt(path/'cora.content', dtype=np.str_)  # path/'cora.content' builds the path data/cora/cora.content from the Path object
print(paper_features_label, np.shape(paper_features_label))  # print the dataset contents and their shape

# First column: the paper IDs
papers = paper_features_label[:, 0].astype(np.int32)
print("Paper IDs:", papers)  # print all paper IDs
# Renumber the papers: map each paper ID to a consecutive index
paper2idx = {k: v for v, k in enumerate(papers)}

# Middle columns: the word features, stored as a sparse matrix
features = csr_matrix(paper_features_label[:, 1:-1].astype(np.float32))
print("Shape of the word feature matrix:", np.shape(features))

# Last column: the paper category, converted to a class index
labels = paper_features_label[:, -1]
lbl2idx = {k: v for v, k in enumerate(sorted(np.unique(labels)))}
labels = [lbl2idx[e] for e in labels]
print("Class index of each category:", lbl2idx, labels[:5])

2.3 Reading and parsing the paper relationship data

Load the paper relationship data and convert the relationships, which are expressed with paper IDs, to the renumbered indices. Each paper is treated as a vertex and each citation between papers as an edge, so the relationship data can be represented as a graph structure.

Then compute the adjacency matrix of this graph and convert it to the adjacency matrix of an undirected graph.
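The two steps can be illustrated on a tiny graph. This is a dense NumPy sketch with made-up paper IDs, not the Cora data, and it symmetrizes with an element-wise maximum, a common alternative to the sparse-matrix expression used in the listing of section 2.3.1.

```python
import numpy as np

# Hypothetical citation pairs written with raw paper IDs (not taken from Cora).
raw_edges = np.array([[35, 1033], [1033, 887], [887, 35], [35, 887]])
paper_ids = np.array([35, 887, 1033])

# Renumber: raw paper ID -> consecutive index.
paper2idx = {pid: i for i, pid in enumerate(paper_ids.tolist())}
edges = np.array([[paper2idx[a], paper2idx[b]] for a, b in raw_edges.tolist()])

# Directed adjacency matrix: entry [i, j] is 1 for every listed pair (i, j).
n = len(paper_ids)
adj = np.zeros((n, n), dtype=np.float32)
adj[edges[:, 0], edges[:, 1]] = 1.0

# Undirected version: two papers are connected if either one lists the other.
adj_undirected = np.maximum(adj, adj.T)
print(adj_undirected)
```

With the element-wise maximum, a pair that appears in both directions (35 and 887 here) stays connected with weight 1.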

2.3.1 Code Implementation: Transformation Matrix----Cora_GNN.py (Part 3)

# 1.3 Read and parse the paper relationship data
# Read the citation pairs as an integer array
edges = np.genfromtxt(path/'cora.cites', dtype=np.int32)
print(edges, np.shape(edges))
# Convert the paper IDs in each pair to the renumbered indices
edges = np.asarray([paper2idx[e] for e in edges.flatten()], np.int32).reshape(edges.shape)
print("Edges between renumbered nodes:", edges, edges.shape)
# Build the adjacency matrix of the citation graph; rows and columns both index the papers
adj = coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])), shape=(len(labels), len(labels)), dtype=np.float32)
# Symmetrize: convert the directed adjacency matrix to an undirected one.
# Tip: for classification, what matters is whether two papers are related, not the direction of the citation, so an undirected graph is enough.
adj_long = adj.multiply(adj.T < adj)  # keep the entries (i, j) that have no reverse entry (j, i)
adj = adj_long + adj_long.T

2.4 Processing the matrix data of the graph structure

The matrix data of the graph structure is processed so that it better reflects the characteristics of the graph and can take part in the neural network computation.

2.4.1 Procedure for processing the matrix data of the graph structure

1. Normalize the feature data of each node.
2. Add 1 to the diagonal of the adjacency matrix. In the classification task, the adjacency matrix helps classify a node through its associations with other papers. A diagonal entry represents the relationship of a node with itself, so setting the diagonal to 1 (a graph with self-loops) lets each node's own features contribute to the classification as well.
3. Normalize the adjacency matrix after adding 1 to its diagonal.
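The three steps above can be sketched on a toy graph. The matrices `X` and `A` are hypothetical and dense NumPy arrays are used for readability, whereas the listing below works on SciPy sparse matrices.

```python
import numpy as np

# Hypothetical features for 3 nodes and an undirected graph with edges 0-1 and 1-2.
X = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, 2.0],
              [4.0, 0.0, 0.0]])
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

# Step 1: row-normalize the features so that every row sums to 1.
X_norm = X / X.sum(axis=1, keepdims=True)

# Step 2: add 1 to the diagonal (self-loops).
A_hat = A + np.eye(A.shape[0])

# Step 3: row-normalize, i.e. D^-1 (A + I), where D holds the row sums of A + I.
degree = A_hat.sum(axis=1)
A_norm = A_hat / degree[:, None]

print(X_norm)
print(A_norm)
```

After step 3 every row of `A_norm` sums to 1, so multiplying by it averages a node's features with those of its neighbours.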

2.4.2 Code implementation: processing the matrix data of the graph structure----Cora_GNN.py (Part 4)

# 1.4 Process the matrix data of the graph structure
def normalize(mx):  # row-normalize a sparse matrix
    rowsum = np.array(mx.sum(1))  # sum of each row; for the adjacency matrix this is the degree of each node, i.e. the diagonal of the degree matrix D
    r_inv = (rowsum ** -1).flatten()  # reciprocal of each row sum, i.e. the diagonal of D^-1
    r_inv[np.isinf(r_inv)] = 0.0  # rows that sum to 0 give inf; set those entries to 0
    r_mat_inv = diags(r_inv)  # place the reciprocals on a diagonal matrix, giving D^-1
    mx = r_mat_inv.dot(mx)  # left-multiply by D^-1, which divides every row by its sum
    return mx
# Normalize the feature matrix so that each row sums to 1
features = normalize(features)
# Add 1 to the diagonal of the adjacency matrix (self-loops), then normalize it: D^-1 (A + I)
adj = normalize(adj + eye(adj.shape[0]))

2.5 Convert data into tensors and allocate computing resources

Convert the processed graph structure matrix data to a tensor type supported by PyTorch, and divide it into 3 parts for training, testing and validation.
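The split works on node indices only, since the whole graph is still passed to the model. A minimal sketch with hypothetical sizes follows; it uses `np.random.default_rng` rather than the `np.random.seed` call in the listing, so the shuffled order differs from the one the listing produces.

```python
import numpy as np

n_nodes, n_train, n_val = 10, 3, 3        # hypothetical sizes
rng = np.random.default_rng(34)
idxs = rng.permutation(n_nodes)           # shuffled node indices

idx_train = idxs[:n_train]
idx_val = idxs[n_train:n_train + n_val]
idx_test = idxs[n_train + n_val:]         # everything that is left

print(idx_train, idx_val, idx_test)
```

Because the three index arrays are slices of one permutation, they never overlap and together cover every node.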

2.5.1 Code implementation: convert data into tensors and allocate computing resources----Cora_GNN.py (Part 5)

# 1.5 Convert the data to tensors and assign them to the compute device
adj = torch.FloatTensor(adj.todense())  # relationships between nodes; todense() converts the sparse matrix back to a dense one
features = torch.FloatTensor(features.todense())  # features of each node
labels = torch.LongTensor(labels)  # class label of each node

# Split the dataset
n_train = 200  # size of the training set
n_val = 300  # size of the validation set
n_test = len(features) - n_train - n_val  # size of the test set
np.random.seed(34)
idxs = np.random.permutation(len(features))  # shuffle the node indices

# Compute the indices of each subset
idx_train = torch.LongTensor(idxs[:n_train])  # indices of the training set
idx_val = torch.LongTensor(idxs[n_train:n_train+n_val])  # indices of the validation set
idx_test = torch.LongTensor(idxs[n_train+n_val:])  # indices of the test set

# Move everything to the compute device
adj = adj.to(device)
features = features.to(device)
labels = labels.to(device)
idx_train = idx_train.to(device)
idx_val = idx_val.to(device)
idx_test = idx_test.to(device)

2.6 Graph Convolution

The essence of graph convolution is a dimension transformation: the feature data of each node is transformed from in dimensions to out dimensions.

The graph convolution operation multiplies the input node features, the weight parameters, and the processed adjacency matrix together.

The weight parameter is a matrix of size in×out, where in is the feature dimension of the input nodes and out is the feature dimension of the output. In the dimension transformation, the weights play the same role as the weights of a fully connected network. Compared with a fully connected network, graph convolution additionally multiplies by the node relationship information.

The figure above lists the fully connected network and the graph convolutional network side by side, with the bias ignored. It shows clearly that the graph convolutional network adds node relationship information on top of the fully connected network.
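The comparison can also be written out numerically. The sketch below uses made-up sizes and random values, with `A_norm` standing in for a processed (self-looped, row-normalized) adjacency matrix; bias and activation are ignored, as in the figure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, f_in, f_out = 3, 4, 2            # hypothetical sizes

X = rng.normal(size=(n_nodes, f_in))      # node features, one row per node
W = rng.normal(size=(f_in, f_out))        # weight matrix of size in x out
A_norm = np.array([[0.5, 0.5, 0.0],       # processed adjacency matrix
                   [1/3, 1/3, 1/3],
                   [0.0, 0.5, 0.5]])

fc_out = X @ W                            # fully connected layer
gcn_out = A_norm @ (X @ W)                # graph convolution: mix rows by adjacency

# With an identity adjacency matrix (no edges, self-loops only) the two coincide.
gcn_identity = np.eye(n_nodes) @ (X @ W)

print(fc_out.shape, gcn_out.shape)
```

Each output row of `gcn_out` is a weighted average of the fully connected outputs of a node and its neighbours, which is exactly the extra relationship information described above.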

2.6.1 Code implementation: define Mish activation function and graph convolution operation class----Cora_GNN.py (Part 6)

Add a bias to the algorithm shown in the figure above and define the GraphConvolution class.

# 1.6 Define the Mish activation function and the graph convolution class
def mish(x):  # Mish activation, reported to perform better than ReLU
    return x * (torch.tanh(F.softplus(x)))
# Graph convolution class
class GraphConvolution(nn.Module):
    def __init__(self, f_in, f_out, use_bias=True, activation=mish):
        super().__init__()
        self.f_in = f_in
        self.f_out = f_out
        self.use_bias = use_bias
        self.activation = activation
        self.weight = nn.Parameter(torch.empty(f_in, f_out))  # filled in by initialize_weights()
        self.bias = nn.Parameter(torch.empty(f_out)) if use_bias else None
        self.initialize_weights()

    def initialize_weights(self):  # initialize the parameters
        if self.activation is None:  # initialize the weights
            nn.init.xavier_uniform_(self.weight)
        else:
            nn.init.kaiming_uniform_(self.weight, nonlinearity='leaky_relu')
        if self.use_bias:
            nn.init.zeros_(self.bias)

    def forward(self, input, adj):  # forward pass of the layer
        support = torch.mm(input, self.weight)  # node features times weights; torch.mm() only supports 2-D matrices, use torch.matmul() for higher dimensions
        output = torch.mm(adj, support)  # multiply by the processed adjacency matrix
        if self.use_bias:
            output = output + self.bias  # add the bias
        if self.activation is not None:
            output = self.activation(output)  # apply the activation function
        return output

2.7 Building a multi-layer graph convolutional network model with Multi_Sample Dropout---Cora_GNN_MUti-sample-Dropout.py (modified part 1)

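# 1.7 Build the multi-layer graph convolutional network model with Multi-sample Dropout, based on the GCN model
class GCNTD(nn.Module):
    def __init__(self, f_in, n_classes, hidden=[16], dropout_num=8, dropout_p=0.5):  # 8 Dropout branches by default, each with a drop rate of 0.5
        super().__init__()
        layer = []
        for f_in, f_out in zip([f_in]+hidden[:-1], hidden):
            layer += [GraphConvolution(f_in, f_out)]
        self.layers = nn.Sequential(*layer)
        # 8 Dropout branches by default
        self.dropouts = nn.ModuleList([nn.Dropout(dropout_p, inplace=False) for _ in range(dropout_num)])
        self.out_layer = GraphConvolution(f_out, n_classes, activation=None)
    def forward(self, x, adj):
        # Forward pass of the Multi-sample Dropout structure (8 Dropout branches by default):
        # 1. The input passes once through the stacked graph convolution layers and reaches the Dropout layer.
        # 2. Each Dropout branch applies its own random mask, with the given drop rate, to that result.
        # 3. The output of each branch is passed to the output layer, giving one result per branch.
        # 4. All results are added together to produce the final result.
        for layer in self.layers:  # iterate over the layers only; zipping with self.dropouts would skip layers when there are fewer branches than layers
            x = layer(x, adj)
        if len(self.dropouts) == 0:
            return self.out_layer(x, adj)
        else:
            for i, dropout in enumerate(self.dropouts):  # accumulate the outputs of all branches
                if i == 0:
                    out = dropout(x)
                    out = self.out_layer(out, adj)
                else:
                    temp_out = dropout(x)
                    out = out + self.out_layer(temp_out, adj)
            return out  # return the summed result

n_labels = labels.max().item() + 1  # number of classes: 7
n_features = features.shape[1]  # node feature dimension: 1433
print(n_labels, n_features)  # prints 7 and 1433

def accuracy(output, y):  # compute the accuracy
    return (output.argmax(1) == y).type(torch.float32).mean().item()

# Training procedure. Unlike ordinary deep learning tasks, graph convolution needs the relationship data between samples during training.
# That relationship data is a square matrix whose size equals the number of nodes, so all nodes must be passed in as well.
# When computing the loss, the training-set results are selected from the full output by index.
# In graph convolution tasks the full graph matrix must be supplied for both training and prediction.
def step():  # one training step
    model.train()
    optimizer.zero_grad()
    output = model(features, adj)  # feed all the data to the model, but compute the loss on the training nodes only
    loss = F.cross_entropy(output[idx_train], labels[idx_train])
    acc = accuracy(output[idx_train], labels[idx_train])  # training accuracy
    loss.backward()
    optimizer.step()
    return loss.item(), acc

def evaluate(idx):  # evaluate the model on the nodes selected by idx
    model.eval()
    with torch.no_grad():  # no gradients are needed for evaluation
        output = model(features, adj)  # feed all the data to the model
        loss = F.cross_entropy(output[idx], labels[idx]).item()  # loss on the selected nodes
    return loss, accuracy(output[idx], labels[idx])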

2.8 Code implementation: training and visualization --- Cora_GNN_MUti-sample-Dropout.py (modified part 2)

model = GCNTD(n_features, n_labels, hidden=[16,32,16]).to(device)
from ranger import *  # Ranger optimizer; a ranger module that provides the Ranger class must be importable
from functools import partial  # partial() presets the Ranger arguments
opt_func = partial(Ranger, betas=(0.9,0.99), eps=1e-6)
optimizer = opt_func(model.parameters())

from tqdm import tqdm
# Train the model
epochs = 400
print_steps = 50
train_loss, train_acc = [], []
val_loss, val_acc = [], []
for i in tqdm(range(epochs)):
    tl, ta = step()
    train_loss.append(tl)
    train_acc.append(ta)
    if (i+1) % print_steps == 0 or i == 0:
        tl, ta = evaluate(idx_train)
        vl, va = evaluate(idx_val)
        val_loss.append(vl)
        val_acc.append(va)
        print(f'{i + 1:6d}/{epochs}: train_loss={tl:.4f}, train_acc={ta:.4f}' + f', val_loss={vl:.4f}, val_acc={va:.4f}')

# Print the final results
final_train, final_val, final_test = evaluate(idx_train), evaluate(idx_val), evaluate(idx_test)
print(f'Train     : loss={final_train[0]:.4f}, accuracy={final_train[1]:.4f}')
print(f'Validation: loss={final_val[0]:.4f}, accuracy={final_val[1]:.4f}')
print(f'Test      : loss={final_test[0]:.4f}, accuracy={final_test[1]:.4f}')

# Visualize the training process
fig, axes = plt.subplots(1, 2, figsize=(15,5))
axes[0].plot(train_loss[::print_steps] + [train_loss[-1]], label='Train')
axes[0].plot(val_loss, label='Validation')
axes[1].plot(train_acc[::print_steps] + [train_acc[-1]], label='Train')
axes[1].plot(val_acc, label='Validation')
for ax, t in zip(axes, ['Loss', 'Accuracy']):
    ax.legend()
    ax.set_title(t, size=15)
plt.show()

# Output the model predictions
model.eval()
with torch.no_grad():
    output = model(features, adj)
samples = 10
idx_sample = idx_test[torch.randperm(len(idx_test))[:samples]]
# Compare the true labels with the predictions
idx2lbl = {v: k for k, v in lbl2idx.items()}
df = pd.DataFrame({'Real': [idx2lbl[e] for e in labels[idx_sample].tolist()], 'Pred': [idx2lbl[e] for e in output[idx_sample].argmax(1).tolist()]})
print(df)

Better results are obtained after only 400 epochs of training.


Origin blog.csdn.net/qq_39237205/article/details/123869255