1. Preface

I came across an interesting paper through a WeChat official account post; the links follow.

PDF: https://arxiv.org/pdf/1807.05560.pdf

Code: https://github.com/xptree/DeepInf

2. Reading Notes

DeepInf: Social Influence Prediction with Deep Learning

2.1 Introducing the Research Problem

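As the abstract states, the paper studies "social influence prediction".

Conventional social influence prediction methods rely on hand-crafted rules to extract user and network features, so their effectiveness is limited by the designer's domain knowledge.
The authors therefore design a deep neural network, "DeepInf", which learns latent feature representations of users and uses them to predict social influence.
Concretely, the paper proposes several strategies for incorporating network structure and user-specific features.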

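Which methods, then, does the paper treat as conventional social influence prediction? A good starting point is the set of baselines it compares against: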

Logistic Regression (LR).
Support Vector Machine (SVM).
PSCN. In the authors' words: "As we model social influence locality prediction as a graph classification problem, we compare our framework with the state-of-the-art graph classification models, PSCN[34]."

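Why do these three baselines count as the "conventional" social influence prediction methods the paper describes?
The authors split them into two groups. LR and SVM both use three kinds of features (Vertex, Embedding, Ego), listed in the paper's feature table:

[Figure: table of the Vertex, Embedding, and Ego feature groups]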

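Vertex and Embedding are straightforward: hand-crafted node features and a 64-dimensional DeepWalk embedding. But what does "active neighbors" in the Ego group mean? The feature table cites the following reference: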

Group formation in large social networks: membership, growth, and evolution.

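That paper says: "Of those communities which had at least 1 post, we selected the 700 most active communities along with 300 at random from the others with at least 1 post."
There, "active" simply means a predefined number of the most active communities (those with posts) plus a random sample.

Read this way, "active" in the Ego features is a notion the authors define themselves. In DeepInf the natural reading is a neighbor that has already performed the action, which matches the paper's stated aim of using the action statuses of a user's near neighbors.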

Back to the main thread: LR and SVM take the three feature groups (Vertex, Embedding, Ego) as training input and are trained as classifiers.
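To make the three feature groups concrete, here is a minimal sketch of how one flat feature vector per ego user might be assembled before it is handed to LR or SVM. The function names, the single vertex feature, and the two Ego features (count and ratio of active neighbors) are illustrative assumptions, not the paper's exact feature list.

```python
def ego_features(neighbors, active):
    """Count and ratio of active nodes among an ego user's sampled neighbors."""
    n_active = sum(1 for u in neighbors if active.get(u, 0) == 1)
    ratio = n_active / len(neighbors) if neighbors else 0.0
    return [float(n_active), ratio]


def build_feature_vector(vertex_feats, embedding, neighbors, active):
    """Concatenate the Vertex, Embedding, and Ego groups into one flat vector."""
    return list(vertex_feats) + list(embedding) + ego_features(neighbors, active)


# Toy example: an ego user with 4 sampled neighbors, 2 of them active.
neighbors = ["a", "b", "c", "d"]
active = {"a": 1, "b": 0, "c": 1, "d": 0}
vertex_feats = [3.0]            # hypothetical hand-crafted node feature
embedding = [0.1, -0.2, 0.05]   # stand-in for the 64-dimensional DeepWalk embedding
x = build_feature_vector(vertex_feats, embedding, neighbors, active)
```

Each such vector, paired with a 0/1 label, would be one training row for the baseline classifiers.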

PSCN comes from the paper "Learning convolutional neural networks for graphs" (ICML 2016), which also performs classification with a neural network on graphs.

2.2 Related Work

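The previous subsection suggests that this paper (DeepInf: Social Influence Prediction with Deep Learning) is essentially a graph classification study whose tool is a deep neural network on graphs. How, then, does that relate to "Social Influence Prediction"? With this question in mind, I turned to the introduction.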

Social influence: "refers to the phenomenon that a person’s emotions, opinions, or behaviors are affected by others."
"there is little doubt that social influence has become a prevalent, yet complex force that drives our social decisions, making a clear need for methodologies to characterize, understand, and quantify the underlying mechanisms and dynamics of social influence."

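Put simply, social influence is the effect that others have on a person's opinions, emotions, and decisions; the paper cites [26, 32, 42, 43] as prior studies of it.
The authors' goal: "We aim to predict the action status of a user given the action statuses of her near neighbors and her local structural information."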

The procedure, in the paper's words: "DeepInf, to represent both influence dynamics and network structures into a latent space. To predict the action status of a user v, we first sample her local neighbors through random walks with restart. After obtaining a local network as shown in Figure 1, we leverage both graph convolution and attention techniques to learn latent predictive signals."

Random walks are used to capture a user's local structure; social influence itself is formalized in Section 2 of the paper.

2.3 Social Influence

2.3.1 r-neighbors

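The r-neighbors of a node v are the nodes whose shortest-path distance to v is at most r:

$$\Gamma^{r}_v = \{u : d(u, v) \le r\}$$

where $d(u, v)$ is the shortest-path distance (in hops) between u and v.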

The subgraph induced by this set of nodes is called the r-ego network of v, denoted $G^{r}_v$.
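The two definitions above can be sketched with a breadth-first search that stops at depth r. This is only an illustration on an unweighted adjacency dict; the function names and the toy graph are assumptions of mine, not code from the DeepInf repository.

```python
from collections import deque


def r_neighbors(adj, v, r):
    """Return all nodes whose hop distance from v is at most r (v included)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not expand beyond radius r
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)


def r_ego_network(adj, v, r):
    """Subgraph induced by the r-neighbors of v, as an adjacency dict."""
    nodes = r_neighbors(adj, v, r)
    return {u: sorted(w for w in adj.get(u, ()) if w in nodes) for u in nodes}


# Toy graph: path 0 - 1 - 2 - 3 plus a side edge 1 - 4.
adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
ego_net = r_ego_network(adj, 0, 2)
```

In this toy graph node 3 is three hops from node 0, so it falls outside the 2-ego network and the edge 2 - 3 is dropped.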

2.3.2 Social Action

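The paper's formulation here is quite polished, so I note it down (paraphrased). Each user u has a binary action status at timestamp t, $s^{t}_u \in \{0, 1\}$: $s^{t}_u = 1$ means u has performed the action (for example, a retweet) at or before t, and $s^{t}_u = 0$ means u has not performed it yet.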

2.3.3 Social Influence Locality

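Building on the two definitions above, the paper introduces "Social Influence Locality". Because "Social Action" describes actions such as retweets over time, this notion is time-indexed as well.
The probability that v is activated at the next time step, given her r-ego network and the action statuses of her neighbors, is:

$$P\left(s^{t+\Delta t}_v \mid G^{r}_v, S^{t}_v\right)$$

where $S^{t}_v$ is the set of action statuses at time t of v's r-neighbors (excluding v herself).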

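Given N instances, where instance i is a user $v_i$ at timestamp $t_i$, the overall objective of social influence prediction is to minimize the negative log-likelihood with respect to the model parameters $\Theta$:

$$\mathcal{L}(\Theta) = -\sum_{i=1}^{N} \log P_{\Theta}\left(s^{t_i+\Delta t}_{v_i} \mid G^{r}_{v_i}, S^{t_i}_{v_i}\right)$$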
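As a small worked example, assuming the objective is the usual negative log-likelihood of the observed action statuses, the loss over N instances can be computed as below. The function name and the clamping constant `eps` are my own choices for illustration.

```python
import math


def negative_log_likelihood(probs, labels, eps=1e-12):
    """Sum of -log P(observed label) over N instances.

    probs[i]  : predicted probability that instance i becomes active
    labels[i] : observed status, 1 (active) or 0 (inactive)
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total


# Three instances: two predicted well, one predicted at chance level.
loss = negative_log_likelihood([0.9, 0.2, 0.5], [1, 0, 1])
```

Confident correct predictions contribute almost nothing to the sum, while a confident wrong prediction contributes a large term, which is what pushes the model toward calibrated activation probabilities.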

2.4 DeepInf

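Step 1: neighbor sampling. BFS could extract the r-ego network of node v, $G^{r}_v$, but because of the small-world property this network can be very large. To keep its size under control, the paper samples a fixed-size subnetwork instead, which can be seen as a second round of sampling on the r-ego network. The sampling is a random walk in which the next node is chosen in proportion to edge weight (so it is biased), and at every step the walk returns to the starting node with some probability, i.e. a random walk with restart.
Step 2: the neural network model. Its purpose is to combine each node's structural attributes with its action status and, in the end, to predict the user's retweet status (0 or 1).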
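Step 1 can be sketched as a weighted random walk with restart that stops once enough distinct nodes have been collected. This is a simplified illustration, not the repository's implementation: the walk always starts from the ego user, and `restart_prob`, `max_steps`, and the function name are assumed, illustrative parameters.

```python
import random


def rwr_sample(adj, start, size, restart_prob=0.8, max_steps=10000, seed=None):
    """Sample up to `size` nodes around `start` by random walk with restart.

    adj maps node -> {neighbor: edge_weight}. The next hop is drawn in
    proportion to edge weight; with probability `restart_prob` the walk
    jumps back to `start` instead.
    """
    rng = random.Random(seed)
    sampled = {start}
    current = start
    for _ in range(max_steps):
        if len(sampled) >= size:
            break
        nbrs = adj.get(current, {})
        if not nbrs or rng.random() < restart_prob:
            current = start
            if not adj.get(start):
                break  # isolated start node: nothing to explore
            continue
        nodes = list(nbrs)
        weights = [nbrs[n] for n in nodes]
        current = rng.choices(nodes, weights=weights, k=1)[0]
        sampled.add(current)
    return sampled


# Toy weighted graph around ego user "v".
graph = {
    "v": {"a": 1.0, "b": 3.0},
    "a": {"v": 1.0, "c": 1.0},
    "b": {"v": 3.0},
    "c": {"a": 1.0},
}
subnetwork_nodes = rwr_sample(graph, "v", size=3, seed=0)
```

A high restart probability keeps the sample close to the ego user, which is the point of controlling the size of the local network.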

[Figure: the DeepInf model framework]

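Note that the features in (d) have two parts: two binary flags (whether the node is active, whether it is the ego user) and the node's network embedding.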
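A minimal sketch of that input construction, assuming the flags are simply appended to each node's embedding vector. The function names and the toy two-dimensional embeddings are hypothetical.

```python
def node_input_features(embedding, is_active, is_ego):
    """Append two binary flags to a node's embedding vector."""
    return list(embedding) + [float(is_active), float(is_ego)]


def build_input_matrix(nodes, embeddings, active, ego):
    """One row per sampled node, in the order given by `nodes`."""
    return [
        node_input_features(embeddings[u], active.get(u, 0) == 1, u == ego)
        for u in nodes
    ]


# Toy sampled subnetwork: ego user "v" (not yet active) and active neighbor "a".
embeddings = {"v": [0.1, 0.2], "a": [0.3, 0.4]}
X = build_input_matrix(["v", "a"], embeddings, active={"a": 1}, ego="v")
```

Each row of `X` would then be the input feature of one node in the sampled subnetwork, which the graph layers consume together with the adjacency structure.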

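The middle stages of the figure above are fairly standard, so I skip them here and focus on the final comparison:

[Figure: performance comparison between DeepInf and the baselines]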

The task has become a plain classification problem: will the user retweet or not.

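The final comparison experiments also use logistic regression, support vector machines, and so on as classifiers, and compare how well each one classifies.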

3. Summary

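From the reading above, the problem that "DeepInf: Social Influence Prediction with Deep Learning" solves is to embed the factors that may influence a user and then predict whether the user will retweet a message.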

Personally, I find the title phrase "Social Influence Prediction" rather grand.
