word2vec study notes (to be continued...)

Training data for word2vec (Chinese Wikipedia dumps): https://dumps.wikimedia.org/zhwiki/

"In TF's word2vec implementation, the more frequent a word is, the smaller its class ID and the higher its probability of being sampled."
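The quoted behavior matches a log-uniform (Zipfian) candidate sampler, which is what TensorFlow documents for `tf.random.log_uniform_candidate_sampler`: P(class) = (log(class + 2) - log(class + 1)) / log(range_max + 1). The sketch below only reproduces that formula in plain Python to show why small IDs are favored. It assumes word IDs were assigned in descending order of frequency, and `vocab_size` is a made-up value for illustration.

```python
import math


def log_uniform_prob(class_id: int, range_max: int) -> float:
    """Probability of drawing class_id from a log-uniform sampler over [0, range_max)."""
    return (math.log(class_id + 2) - math.log(class_id + 1)) / math.log(range_max + 1)


vocab_size = 10000  # hypothetical vocabulary size, not from the original notes

# IDs are assumed to be sorted by descending word frequency (0 = most frequent).
probs = [log_uniform_prob(k, vocab_size) for k in range(vocab_size)]

# Smaller IDs (more frequent words) get larger sampling probabilities.
for k in (0, 1, 10, 100, 1000):
    print(f"id={k:5d}  p={probs[k]:.6f}")
```

The probabilities telescope to a total of 1 over the whole vocabulary, so this is a proper distribution that decays as the ID grows.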

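Implementing a word2vec model with TF (TensorFlow)
Text preprocessing (remove stop words and punctuation)
Word segmentation of each sentence
Training
Testing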
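The preprocessing steps listed above might look roughly like the following. This is a minimal sketch using only the standard library: the stop-word set and the toy corpus are invented for illustration, and tokens are split on whitespace. Chinese text such as the Wikipedia dump would first need a word segmenter (for example jieba) in place of `split()`. The vocabulary assigns smaller IDs to more frequent words.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "a", "of"}  # hypothetical stop-word list


def preprocess(line: str) -> list[str]:
    """Lowercase, replace punctuation with spaces, split, and drop stop words."""
    line = re.sub(r"[^\w\s]", " ", line.lower())
    return [tok for tok in line.split() if tok not in STOP_WORDS]


def build_vocab(sentences: list[list[str]]) -> dict[str, int]:
    """Map each word to an ID; more frequent words get smaller IDs."""
    counts = Counter(tok for sent in sentences for tok in sent)
    return {word: idx for idx, (word, _) in enumerate(counts.most_common())}


corpus = ["The man hit his son.", "The son of the man!"]  # toy corpus
sentences = [preprocess(line) for line in corpus]
word_to_id = build_vocab(sentences)
encoded = [[word_to_id[tok] for tok in sent] for sent in sentences]

print(sentences)
print(word_to_id)
print(encoded)
```

The integer-encoded sentences are what a training step would consume. Training and testing themselves are left out here because the notes do not describe them.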
---------------------------------------------------------------------------------------------------------------------------------------------------------

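hit (center word)
Window size: 2
Model the probability of generating the context words inside the window: the, man, his, son
Using the center word and its context words, find the vector representation of hit by maximum likelihood estimation, i.e. choose the vectors that maximize the likelihood of the context words given hit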
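To make the skip-gram setup above concrete, here is a small sketch under stated assumptions. It assumes the example sentence is "the man hit his son" and uses the standard full-softmax form P(o | c) = exp(u_o · v_c) / Σ_w exp(u_w · v_c), with randomly initialized vectors. The names `skipgram_pairs` and `context_prob`, and the dimension 4, are illustrative choices and do not come from the original notes.

```python
import math
import random


def skipgram_pairs(tokens: list[str], window: int = 2) -> list[tuple[str, str]]:
    """Return (center, context) pairs for every position in the sentence."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def context_prob(center: str, context: str, v: dict, u: dict) -> float:
    """Softmax probability of a context word given the center word."""
    scores = {w: math.exp(dot(u[w], v[center])) for w in u}
    return scores[context] / sum(scores.values())


tokens = ["the", "man", "hit", "his", "son"]
pairs = skipgram_pairs(tokens, window=2)
hit_context = [ctx for center, ctx in pairs if center == "hit"]
print(hit_context)  # context words of "hit" within the window

rng = random.Random(0)
dim = 4  # hypothetical embedding size
v = {w: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for w in tokens}  # center vectors
u = {w: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for w in tokens}  # context vectors

# Log-likelihood of the window around "hit"; training would adjust v and u to raise it.
log_likelihood = sum(math.log(context_prob("hit", ctx, v, u)) for ctx in hit_context)
print(log_likelihood)
```

Maximum likelihood estimation then amounts to increasing this log-likelihood, summed over all center words in the corpus, by gradient ascent on the vectors. In practice the full softmax is usually replaced by negative sampling or a similar approximation.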

https://github.com/mli/gluon-tutorials-zh

https://www.cnblogs.com/en-heng/p/6899820.html

https://blog.csdn.net/du_qi/article/details/51564303

https://www.cnblogs.com/Haichao-Zhang/p/5220974.html

https://www.baidu.com/s?ie=UTF-8&wd=%E5%88%A9%E7%94%A8Glove%E8%AE%AD%E7%BB%83%E5%A5%BD%E7%9A%84%E8%AF%8D%E5%90%91%E9%87%8F%E5%A6%82%E4%BD%95%E5%B0%86%E7%8E%B0%E6%9C%89%E6%95%B0%E6%8D%AE%E5%90%91%E9%87%8F%E5%8C%96

https://www.jianshu.com/p/795a5e2cd10c

http://alwa.info/2016/09/26/Keras-%E5%AE%9E%E7%8E%B0-LSTM/

https://www.cnblogs.com/gcczhongduan/p/5198169.html

------------------------------------------------------------------------------------------------------------------------------------------------------------------

Reposted from blog.csdn.net/weixin_31866177/article/details/81217849