(1) A PhD from the Institute of Automation, Chinese Academy of Sciences, working on natural language processing with neural networks: http://licstar.net
(2) jieba, a Chinese word segmentation project: https://github.com/fxsjy/jieba
(3) Open-source Chinese NLP projects (word segmentation and more) from Tsinghua University: https://github.com/thunlp
(4) jcseg, a lightweight open-source word segmentation tool: https://github.com/lionsoul2014/jcseg
(5) Some notes on information retrieval: http://www.cnblogs.com/jcli/category/315064.html
(6) A roundup of word2vec resources: http://blog.csdn.net/itplus/article/details/37969519
(7) Deep Learning in practice: word2vec: http://techblog.youdao.com/?p=915#LinkTarget_699
(8) word2vec experiments on Chinese and English Wikipedia corpora:
http://www.52nlp.cn/%E4%B8%AD%E8%8B%B1%E6%96%87%E7%BB%B4%E5%9F%BA%E7%99%BE%E7%A7%91%E8%AF%AD%E6%96%99%E4%B8%8A%E7%9A%84word2vec%E5%AE%9E%E9%AA%8C
(9) Many original Chinese-language NLP articles: https://liweinlp.com/?p=342
(10) Text feature extraction: http://blog.csdn.net/qll125596718/article/details/8306767
(11) A brief introduction to text classification on CSDN: http://blog.csdn.net/yangliuy/article/details/7316494
(12) Homepage of the first author of LDA (Latent Dirichlet Allocation): http://www.cs.columbia.edu/~blei/
(13) A detailed introduction to LDA: http://blog.csdn.net/v_july_v/article/details/41209515
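For a quick hands-on feel for the LDA model covered in the two links above, scikit-learn's `LatentDirichletAllocation` works on a small bag-of-words matrix (an assumption: the linked introductions are not tied to this toolkit, and real topic modeling needs a far larger corpus).

```python
# Toy LDA sketch with scikit-learn (assumes scikit-learn is installed).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "cat dog pet animal",
    "dog pet animal fur",
    "stock market trading price",
    "market price stock invest",
]
X = CountVectorizer().fit_transform(docs)  # document-term count matrix
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# components_ holds one (unnormalized) word distribution per topic.
print(lda.components_.shape)   # (2, vocabulary_size)
# transform() gives each document's topic mixture, rows summing to 1.
print(lda.transform(X))
```

On this tiny corpus the two topics typically separate the pet words from the finance words, which is the intuition behind LDA: documents are mixtures of topics, topics are distributions over words.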
(14) CNNs applied to natural language processing: http://blog.csdn.net/zhdgk19871218/article/details/51387197
(15) An Elasticsearch expert: http://log.medcl.net/
(16) References on document similarity: (a) https://www.zhihu.com/question/29094227
(b) http://www.52nlp.cn/
(17) word2vec vs. doc2vec: http://weixin.niurenqushi.com/article/2016-06-15/4322378.html