[Paper Reading] ST-GDN: Traffic Flow Forecasting with Graph Neural Networks (Xiyue Zhang et al., arXiv, 2021)

I. Background

Paper title: "Traffic Flow Forecasting with Spatial-Temporal Graph Diffusion Network"

Paper download: https://arxiv.org/pdf/2110.04038.pdf

Citation: Xiyue Zhang, Chao Huang, Yong Xu, Lianghao Xia, Peng Dai, Liefeng Bo, Junbo Zhang, Yu Zheng. "Traffic Flow Forecasting with Spatial-Temporal Graph Diffusion Network". arXiv preprint, arXiv:2110.04038, 2021.

Project repository: https://github.com/jill001/ST-GDN

II. Abstract

Accurate forecasting of citywide traffic flow has been playing critical role in a variety of spatial-temporal mining applications, such as intelligent traffic control and public risk assessment. While previous work has made significant efforts to learn traffic temporal dynamics and spatial dependencies, two key limitations exist in current models. First, only the neighboring spatial correlations among adjacent regions are considered in most existing methods, and the global interregion dependency is ignored. Additionally, these methods fail to encode the complex traffic transition regularities exhibited with time-dependent and multi-resolution in nature. To tackle these challenges, we develop a new traffic prediction framework–Spatial-Temporal Graph Diffusion Network (ST-GDN). In particular, ST-GDN is a hierarchically structured graph neural architecture which learns not only the local region-wise geographical dependencies, but also the spatial semantics from a global perspective. Furthermore, a multi-scale attention network is developed to empower ST-GDN with the capability of capturing multi-level temporal dynamics. Experiments on several real-life traffic datasets demonstrate that ST-GDN outperforms different types of state-of-the-art baselines. Source codes of implementations are available at https://github.com/jill001/ST-GDN.

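In short: accurate traffic flow forecasting matters for applications such as intelligent traffic control and public risk assessment. Much prior work has focused on the temporal dynamics and spatial dependencies of traffic, but these methods share two key limitations. First, most of them consider only the spatial correlations between adjacent regions and ignore global inter-region dependencies. Second, they cannot encode complex traffic transition patterns that are time-dependent and multi-resolution. To address these problems, the paper proposes a new traffic prediction framework, ST-GDN: a hierarchically structured graph neural network that learns both local geographical dependencies and global spatial semantics. In addition, a multi-scale attention network gives ST-GDN the ability to capture multi-level temporal dynamics.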

III. Overview of the Paper

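Traffic flow forecasting usually handles time series data with RNNs. RNNs, however, are suited only to short-term, smooth dynamics and struggle to make predictions on high-dimensional, higher-order data.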

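In practice, traffic prediction has to account for global relations between regions as well as local geographical correlations, which motivates the ST-GDN model. The authors introduce a multi-scale self-attention network to capture temporal dynamics at different time resolutions, and an aggregation layer to handle the dependencies across these multi-level temporal dynamics. A hierarchical GNN, built on an attentive graph diffusion paradigm, then distills spatial semantics all the way from local neighboring relations to global traffic pattern representations.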

The main contributions of the paper are:

• We highlight the critical importance of explicitly exploring the multi-resolution traffic transitional information and local-global cross-region dependencies, in studying the traffic prediction problem. (In short: the paper stresses the importance of multi-resolution traffic transition information and local-global cross-region dependencies.)

• We propose a new traffic prediction framework (ST-GDN) which explicitly embeds multi-level temporal contextual signals into granularity-aware latent representations, with the cooperation of the designed multi-scale self-attention network and temporal hierarchy aggregation layer. (In short: the paper proposes ST-GDN, which embeds multi-level temporal context signals into granularity-aware latent representations.)

• ST-GDN preserves both local and global region-wise dependencies, via a hierarchically structured graph neural architecture which is consisted of a graph attention network and convolution-based graph diffusion mechanism. (In short: ST-GDN preserves both local and global region-wise dependencies.)

• Our extensive experiments on three real-world datasets demonstrate that ST-GDN outperforms baselines of different types in yielding better forecasting performance. Furthermore, model efficiency study is conducted for ST-GDN in the traffic prediction process. (In short: ST-GDN outperforms the baselines on three real-world datasets.)

1. Problem Definition

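The city is first partitioned into an I × J grid of regions. The traffic flow tensor records the traffic volume of every grid cell over the past T time slots. The forecasting problem can then be stated as follows: given the inflow and outflow of all regions over the past T time slots, learn a function that predicts the traffic flow of future time slots.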
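A minimal sketch of how such a flow tensor could be assembled, assuming trip records have already been mapped to grid cells and time slots. The function name `build_flow_tensor`, the trip tuple layout, and the simplification that a trip departs and arrives within the same time slot are illustrative assumptions, not details from the paper.

```python
import numpy as np

def build_flow_tensor(trips, n_rows, n_cols, n_slots):
    """Aggregate trips into a tensor of shape (n_rows, n_cols, n_slots, 2).

    Each trip is (origin_row, origin_col, dest_row, dest_col, time_slot).
    Channel 0 counts inflow (arrivals), channel 1 counts outflow (departures).
    """
    flow = np.zeros((n_rows, n_cols, n_slots, 2), dtype=np.int64)
    for o_r, o_c, d_r, d_c, t in trips:
        flow[d_r, d_c, t, 0] += 1  # the trip arrives in the destination cell
        flow[o_r, o_c, t, 1] += 1  # the trip leaves the origin cell
    return flow

# Toy example: a 2 x 2 grid observed over 2 time slots.
trips = [(0, 0, 1, 1, 0), (0, 0, 0, 1, 0), (1, 1, 0, 0, 1)]
X = build_flow_tensor(trips, n_rows=2, n_cols=2, n_slots=2)
print(X.shape)
```

A forecasting model would then take the T most recent slices along the time axis as input and predict the inflow and outflow of the following slot.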

2. Method

The overall model structure is shown in the architecture figure of the paper.

(1) Temporal Hierarchy Modeling

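The authors first propose a multi-scale self-attention network that maps multi-level temporal signals into latent representations in order to capture complex traffic patterns. They introduce a temporal resolution parameter p, which takes the values hour, day, and week.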
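The idea can be sketched with plain scaled dot-product self-attention applied separately at each resolution. This is only an illustration under stated assumptions: time slots are taken to be hourly, the strides in `STRIDES`, the helper names, and the random projection matrices (stand-ins for learned parameters) are all hypothetical, and the paper's exact formulation may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sample_at_resolution(series, stride, n_steps):
    """Return the latest n_steps observations spaced `stride` slots apart."""
    return series[::-1][::stride][:n_steps][::-1]

def self_attention(seq, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a (steps, d_in) sequence."""
    q, k, v = seq @ w_q, seq @ w_k, seq @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v

# Hypothetical strides for p in {hour, day, week}, assuming hourly slots.
STRIDES = {"hour": 1, "day": 24, "week": 24 * 7}

def multi_scale_encode(series, d_model, n_steps, rng):
    """Encode one region's series once per temporal resolution."""
    encodings = {}
    for name, stride in STRIDES.items():
        seq = sample_at_resolution(series, stride, n_steps)[:, None]
        # Random matrices stand in for the learned Q/K/V projections.
        w_q, w_k, w_v = (rng.normal(size=(1, d_model)) for _ in range(3))
        encodings[name] = self_attention(seq, w_q, w_k, w_v)
    return encodings

rng = np.random.default_rng(0)
series = rng.poisson(lam=20, size=24 * 7 * 4).astype(float)  # 4 weeks, hourly
H = multi_scale_encode(series, d_model=8, n_steps=4, rng=rng)
print({name: enc.shape for name, enc in H.items()})
```

Each resolution gets its own sequence and its own attention weights, so hourly, daily, and weekly patterns are encoded independently before any aggregation step combines them.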

(2) Traffic Dependency Learning with Global Context

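To model relations between regions, the authors use a GNN over a graph G = (R, E), where R is the set of all regions and E is the set of relations between regions. To obtain the dependencies between regions, they use an attentive aggregation mechanism that captures both local and global dependencies. The full procedure and the definitions of the related parameters are given in the paper.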
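As a rough illustration of attentive aggregation over G = (R, E), the sketch below implements one generic GAT-style layer in NumPy. It is not the paper's exact formulation: the function name, the LeakyReLU slope of 0.2, the random parameters `w` and `a`, and the toy adjacency matrix are assumptions made for the example.

```python
import numpy as np

def graph_attention(h, adj, w, a):
    """One GAT-style layer.

    h:   (n_regions, d_in) region features
    adj: (n_regions, n_regions) 0/1 adjacency that includes self-loops
    w:   (d_in, d_out) projection, a: (2 * d_out,) attention vector
    """
    z = h @ w
    d = z.shape[1]
    # e[i, j] = LeakyReLU(a_src . z_i + a_dst . z_j)
    e = (z @ a[:d])[:, None] + (z @ a[d:])[None, :]
    e = np.where(e > 0, e, 0.2 * e)
    e = np.where(adj > 0, e, -np.inf)      # mask out pairs without an edge
    e = e - e.max(axis=1, keepdims=True)   # stabilise the softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z, alpha

rng = np.random.default_rng(0)
n_regions, d_in, d_out = 4, 5, 3
adj = np.array([[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 1, 1, 1],
                [0, 0, 1, 1]])
h = rng.normal(size=(n_regions, d_in))
w = rng.normal(size=(d_in, d_out))
a = rng.normal(size=2 * d_out)
out, alpha = graph_attention(h, adj, w, a)
print(out.shape, alpha.sum(axis=1))
```

With a fully connected `adj`, the same layer would let every region attend to every other region, which is one simple way to bring in global context.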

(3) Region-wise Relation Learning with Graph Diffusion Paradigm

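The authors also integrate the spatial relations between regions into the prediction network. They propose a graph-structured diffusion network that recalibrates the resolution-aware region representations learned by the preceding graph attention module, and then combine the result with a gating mechanism.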
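One generic way to realise this is a K-step diffusion convolution over a random-walk transition matrix, followed by a sigmoid gate that mixes the diffused and the original representations. The sketch below shows that generic pattern only; the function names, the gate form, and the random weights are assumptions, and the paper's own equations should be consulted for the actual design.

```python
import numpy as np

def transition_matrix(adj):
    """Row-normalise an adjacency matrix into random-walk probabilities."""
    deg = adj.sum(axis=1, keepdims=True)
    return adj / np.maximum(deg, 1)

def diffusion_with_gate(h, adj, weights, w_gate):
    """Diffuse region features over the graph, then gate the result.

    h:       (n_regions, d) region representations
    weights: list of (d, d) matrices, one per diffusion step k = 0..K
    w_gate:  (d, d) matrix producing the gate from h
    """
    p = transition_matrix(adj)
    p_k = np.eye(adj.shape[0])             # P^0
    diffused = np.zeros((h.shape[0], weights[0].shape[1]))
    for w_k in weights:
        diffused += p_k @ h @ w_k          # add the k-step diffusion term
        p_k = p_k @ p                      # advance to P^(k+1)
    gate = 1.0 / (1.0 + np.exp(-(h @ w_gate)))
    # The gate decides, per feature, how much recalibrated signal to keep.
    return gate * diffused + (1.0 - gate) * h

rng = np.random.default_rng(0)
n_regions, d = 4, 3
adj = np.array([[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 1, 1, 1],
                [0, 0, 1, 1]], dtype=float)
h = rng.normal(size=(n_regions, d))
weights = [rng.normal(size=(d, d)) * 0.1 for _ in range(3)]  # K = 2
w_gate = rng.normal(size=(d, d))
out = diffusion_with_gate(h, adj, weights, w_gate)
print(out.shape)
```

If only the zero-step term is used with an identity weight, the output reduces to the input, which makes the gate's role as an interpolation between original and diffused features easy to see.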

(4) Traffic Prediction Phase

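For prediction, the authors also take external factors into account: weather conditions, temperature, and wind speed. These features are mapped into a single vector, which is then projected by an MLP. Finally, the projected vector is concatenated with the resolution-aware traffic representation from the previous step, and the resulting embedding is fed into the prediction layer to forecast traffic flow.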
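The fusion step can be sketched as follows. The one-hot encoding of weather, the two-layer ReLU MLP, the layer sizes, and the linear prediction head with two outputs (inflow and outflow) are illustrative assumptions rather than details confirmed by the paper.

```python
import numpy as np

def encode_external(weather_id, n_weather, temperature, wind_speed):
    """Map external factors to one vector: one-hot weather plus two scalars."""
    one_hot = np.zeros(n_weather)
    one_hot[weather_id] = 1.0
    return np.concatenate([one_hot, [temperature, wind_speed]])

def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron with a ReLU in between."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2

def predict_flow(region_repr, ext_vec, params):
    """Concatenate traffic and external embeddings, then predict the flow."""
    ext = mlp(ext_vec, *params["ext"])
    # The same external embedding is shared by every region.
    ext_tiled = np.broadcast_to(ext, (region_repr.shape[0], ext.shape[0]))
    fused = np.concatenate([region_repr, ext_tiled], axis=1)
    return fused @ params["w_out"] + params["b_out"]

rng = np.random.default_rng(0)
n_regions, d_repr, n_weather, d_hid, d_ext = 6, 8, 3, 16, 4
params = {
    "ext": (rng.normal(size=(n_weather + 2, d_hid)), np.zeros(d_hid),
            rng.normal(size=(d_hid, d_ext)), np.zeros(d_ext)),
    "w_out": rng.normal(size=(d_repr + d_ext, 2)),
    "b_out": np.zeros(2),
}
region_repr = rng.normal(size=(n_regions, d_repr))
ext_vec = encode_external(weather_id=1, n_weather=n_weather,
                          temperature=25.0, wind_speed=4.0)
y_hat = predict_flow(region_repr, ext_vec, params)
print(y_hat.shape)
```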

The paper gives the definition of the loss function.

The paper also provides a complexity analysis of the model.

3. Evaluation

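The authors consider the following questions: ① how ST-GDN performs compared with the baselines; ② how much each sub-module contributes to the model; ③ how the model works; ④ how the hyperparameters affect performance; ⑤ how efficient ST-GDN is.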

The evaluation metrics are RMSE (Root Mean Squared Error) and MAPE (Mean Absolute Percentage Error).
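Both metrics have standard definitions, sketched below. Skipping entries whose ground truth is zero when computing MAPE is a common convention assumed here, since the percentage error is undefined at zero; the paper may handle such entries differently.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error over entries with non-zero truth."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = y_true != 0  # assumption: zero-flow entries are excluded
    return float(np.mean(np.abs((y_true[mask] - y_pred[mask]) / y_true[mask])))

y_true = [10, 20, 0, 40]
y_pred = [12, 18, 3, 40]
print(rmse(y_true, y_pred), mape(y_true, y_pred))
```

RMSE penalises large absolute errors, so it is dominated by busy regions, while MAPE measures relative error and is more sensitive to regions with low traffic volume.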

The compared models are:

• ARIMA (Pan, Demiryurek et al. 2012): it is a representative method for forecasting time series data.

• Support Vector Regression (SVR) (Chang and Lin 2011): another traditional time series analysis model via learning feature mapping functions.

• Fuzzy+NN (Srinivasan, Chan, and Balaji 2009): it integrates the feed-forward neural layers with the fuzzy input filter to model the traffic patterns.

• RNN (Liu et al. 2016): it leverages the recurrent neural networks for capturing both the spatial and temporal effects for making sequential data prediction.

• LSTM (Yu et al. 2017): it jointly models the normal and abnormal traffic variations based on stacked long short-term memory networks.

• DeepST (Zhang et al. 2016): it utilizes the convolution neural network to encode the spatial correlations between regions over a citywide grid map.

• ST-ResNet (Zhang, Zheng, and Qi 2017): the residual connection technique is employed to alleviate overfitting issue for spatial-temporal prediction.

• DMVST-Net (Yao et al. 2018): it integrates the graph embedding method with the joint convolutional recurrent networks to capture spatial-temporal signals.

• DCRNN (Li et al. 2018): it is a data-driven forecasting framework with diffusion recurrent neural network to capture the spatial-temporal dependencies.

• STDN (Yao et al. 2019): it designs a periodically shifted attention for learning transition regularities of traffic.

• ST-GCN (Yu, Yin, and Zhu 2018): it is an integrative framework of graph convolution network and convolutional sequence modeling layer for modeling spatial and temporal dependencies.

• ST-MGCN (Geng et al. 2019): it develops a multi-modal graph convolutional network to capture region-wise non-Euclidean pair-wise correlations.

• GMAN (Zheng et al. 2020): it is an encoder-decoder traffic prediction method based on the graph multi-attention.

• UrbanFM (Liang et al. 2019): it is a deep fusion network to model traffic flow distributions.

• ST-MetaNet (Pan et al. 2019): it is a meta-learning approach to perform knowledge transfer across series with a recurrent graph attentive network.

The datasets are BJ-Taxi, NYC-Taxi, and NYC-Bike; their statistics are reported in the paper.

For question ①, the paper reports a comparison of the different models.

It also provides visualizations of the prediction errors.

For question ②, the authors design ablation experiments over the different modules:

• ST-GDN-s: ST-GDN without the multi-scale self-attention network to capture multi-level traffic dynamics.

• ST-GDN-g: ST-GDN without the graph attention module to model the global region-wise traffic dependencies.

• ST-GDN-d: ST-GDN without the graph diffusion network to integrate spatial context with cross-region traffic pattern correlations for representation recalibration.

• ST-GDN-n: ST-GDN without the incorporation of neighborhood spatial context into the graph diffusion.

• ST-GDN-e: ST-GDN without the external factor fusion.

The ablation results are shown in the corresponding figure of the paper.

For question ③, the experimental results are likewise shown in a figure in the paper.

For question ④, the authors study the effect of different hyperparameter settings.

Finally, for question ⑤, the paper reports the running-time efficiency of the model.