Paper Source

Rakesh V, Ding W, Ahuja A, et al. A Sparse Topic Model for Extracting Aspect-Specific Summaries from Online Reviews[C]//Proceedings of the 2018 World Wide Web Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2018: 1573-1582.
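The paper comes from the 2018 WWW conference. My main reason for studying this model was to look at its sparsity prior. If you have read enough papers, you will notice that the authors' model is pieced together from parts of models in other papers; the sampling steps themselves are largely unchanged.
I did not read the authors' descriptive sections very closely, but I did focus on the model and inference sections.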

Introduction to the Paper

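Online reviews are an important part of consumers' decision making: whether a consumer buys depends not only on the product's overall rating but also on the aspects the reviews describe. E-commerce sites encourage consumers to write high-quality reviews and to summarize the different aspects of a product. For example, a gamer may be interested in buying a monitor with a fast refresh rate and support for G-Sync and FreeSync, whereas a photographer may care about aspects such as color depth and accuracy. The authors propose a model for extracting the aspects described in reviews.
The authors' model is essentially assembled from the following papers. Even so, it is well worth borrowing from, especially for researchers in management.
[1] Wang S, Chen Z, Fei G, et al. Targeted topic modeling for focused analysis[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016: 1235-1244. (KDD 2016)
[2] Zuo Y, Wu J, Zhang H, et al. Topic modeling of short texts: A pseudo-document view[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016: 2105-2114. (KDD 2016)
[3] Petterson J, Buntine W, Narayanamurthy S M, et al. Word features for latent dirichlet allocation[C]//Advances in Neural Information Processing Systems. 2010: 1921-1929. (NIPS 2010)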

Model and Inference

About the Source Code

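The authors provide Python source code at https://github.com/VRM1/WWW18. However, the released code is incomplete, so it cannot be run directly. For me, the only takeaway from this paper is that the method from "Word features for latent dirichlet allocation" can be used to constrain words; the sampling for that part can be learned from this source code. For how the auxiliary variables of the spike-and-slab prior are learned, see "Topic modeling of short texts: A pseudo-document view", which provides Java source code.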
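To make the spike-and-slab idea mentioned above more concrete, here is a minimal sketch of the generative side only. It is not taken from the authors' repository or from the Java code of the pseudo-document paper. The function name `sample_sparse_topic_mixture` and the parameters `pi`, `alpha`, and `alpha_bar` are my own illustrative assumptions: Bernoulli selectors (the "spike") switch topics on or off, and a Dirichlet (the "slab") puts a regular prior on the selected topics plus a much smaller smoothing prior on all topics.

```python
import random


def sample_sparse_topic_mixture(num_topics, pi, alpha, alpha_bar, rng):
    """Draw (selectors, theta) from a spike-and-slab style prior.

    Illustrative assumption, not the paper's exact formulation:
      b_k   ~ Bernoulli(pi)
      theta ~ Dirichlet(alpha * b + alpha_bar)
    """
    # Spike: each topic is independently switched on with probability pi.
    selectors = [1 if rng.random() < pi else 0 for _ in range(num_topics)]

    # Slab: selected topics get the regular prior alpha; every topic also
    # gets a weak smoothing prior alpha_bar so the Dirichlet is well defined
    # even when no topic is selected.
    concentration = [alpha * b + alpha_bar for b in selectors]

    # A Dirichlet draw is a vector of Gamma draws normalized to sum to 1.
    gammas = [rng.gammavariate(c, 1.0) for c in concentration]
    total = sum(gammas)
    if total == 0.0:
        # Guard against numerical underflow when all concentrations are tiny.
        return selectors, [1.0 / num_topics] * num_topics
    theta = [g / total for g in gammas]
    return selectors, theta


if __name__ == "__main__":
    rng = random.Random(0)
    selectors, theta = sample_sparse_topic_mixture(
        num_topics=10, pi=0.3, alpha=1.0, alpha_bar=0.01, rng=rng
    )
    print("selected topics:", selectors)
    print("topic mixture:  ", [round(t, 3) for t in theta])
```

With `alpha_bar` much smaller than `alpha`, most of the probability mass falls on the selected topics, which is the sparsity effect such a prior is meant to induce. Inference (for example, sampling the selectors inside a Gibbs sampler) is a separate matter and is not shown here.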

2018 A Sparse Topic Model for Extracting Aspect-Specific Summaries from Online Reviews: Study Notes on a Sparse Topic Model
