[NLP] word2vec

Introduction to word2vec

Function: convert the words of natural language into dense vectors that a computer can work with.
Before word2vec, the one-hot representation was used for words, for example:

杭州 (Hangzhou) [0,0,0,0,0,0,0,1,0,…,0,0,0,0,0,0,0]
上海 (Shanghai) [0,0,0,0,1,0,0,0,0,…,0,0,0,0,0,0,0]
宁波 (Ningbo)   [0,0,0,1,0,0,0,0,0,…,0,0,0,0,0,0,0]
北京 (Beijing)  [0,0,0,0,0,0,0,0,0,…,1,0,0,0,0,0,0]

But one-hot has the following problems: (1) any two one-hot vectors are orthogonal to each other, so they carry no information about how similar two words are; (2) the dimension of each vector equals the vocabulary size, so the vectors are extremely high-dimensional and sparse, which can lead to the curse of dimensionality.
word2vec addresses these problems by mapping each one-hot vector to a low-dimensional dense vector. Essentially, the solution it uses amounts to a matrix factorization.
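For concreteness, here is a minimal sketch (assuming NumPy and an arbitrary toy vocabulary of my own choosing) of how one-hot vectors are built, and why any two of them are orthogonal:

```python
import numpy as np

# Tiny illustrative vocabulary; the index assignment is arbitrary.
vocab = ["Hangzhou", "Shanghai", "Ningbo", "Beijing"]
word_to_index = {w: i for i, w in enumerate(vocab)}

def one_hot(word, vocab_size=len(vocab)):
    """Return the one-hot vector for a word: all zeros except a 1 at its index."""
    vec = np.zeros(vocab_size)
    vec[word_to_index[word]] = 1.0
    return vec

# Any two different one-hot vectors have dot product 0 (they are orthogonal),
# so the representation says nothing about how similar the words are.
print(one_hot("Hangzhou"))                        # [1. 0. 0. 0.]
print(one_hot("Hangzhou") @ one_hot("Shanghai"))  # 0.0
```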
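As an illustration of training such dense vectors in practice, here is a minimal sketch using the gensim library (assuming gensim 4.x and a made-up toy corpus; the parameters are illustrative, not tuned):

```python
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens (hypothetical example data).
sentences = [
    ["hangzhou", "is", "a", "city", "in", "china"],
    ["shanghai", "is", "a", "city", "in", "china"],
    ["beijing", "is", "the", "capital", "of", "china"],
]

# Train a small skip-gram model (sg=1); vector_size is the embedding dimension.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

vec = model.wv["hangzhou"]                            # dense 50-dimensional vector
similar = model.wv.most_similar("hangzhou", topn=2)   # nearest words by cosine similarity
print(vec.shape, similar)
```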


Understanding related concepts

Word vector: also known as a word embedding. word2vec is one common way to obtain word vectors; besides word2vec, another well-known method is GloVe.
LDA: Latent Dirichlet Allocation, a method for computing topic models.
Language model: in statistical natural language processing, a language model is a probabilistic model that assigns a probability to a sentence.
Neural Probabilistic Language Model: a language model in which words are represented as vectors that carry semantic meaning. The vectors of two semantically similar words are themselves similar, as reflected in the angle between them or in their distance (see the sketch below).
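To make the "angle or distance" point concrete, here is a minimal sketch of cosine similarity between word vectors (the three-dimensional embeddings below are hypothetical values chosen only to show the computation):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; values close to 1 indicate similar words."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Hypothetical low-dimensional embeddings, for illustration only.
v_shanghai = np.array([0.8, 0.1, 0.3])
v_beijing  = np.array([0.7, 0.2, 0.4])
v_banana   = np.array([-0.5, 0.9, 0.0])

print(cosine_similarity(v_shanghai, v_beijing))  # high: semantically close words
print(cosine_similarity(v_shanghai, v_banana))   # low: unrelated words
```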


Reference:
A plain-language explanation of what word vectors are and what word2vec is doing; what is the relationship between LDA and word2vec?
word2vec principle derivation and code analysis (unread)


To be continued...
