Deep Learning (Natural Language Processing): Word Embedding

Foreword:

Due to recent work on the named-entity problem for knowledge graphs, our experiments used word2vec to map words to low-dimensional vectors. It is therefore necessary to understand the basics of this tool.

10.1 Word Embedding (word2vec)

10.1.1 Why not use one-hot vectors?

A one-hot vector can represent a word (a single character also counts as a word). Suppose the index of a word in the vocabulary is $i$. To obtain the word's one-hot representation, we create a vector of length $N$ (the vocabulary size) whose entries are all 0 and set entry $i$ to 1. However, one-hot word vectors cannot accurately express the similarity between different words, such as the cosine similarity we often use. For vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^d$, their cosine similarity is

$$\frac{\mathbf{x}^\top \mathbf{y}}{\|\mathbf{x}\| \, \|\mathbf{y}\|} \in [-1, 1].$$

Since the one-hot vectors of any two different words are orthogonal, their cosine similarity is always 0, so one-hot vectors cannot express word similarity.
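As a quick check, here is a minimal NumPy sketch (the vocabulary size 5 and the word indices are made-up values for illustration):

```python
import numpy as np

def one_hot(index, vocab_size):
    """Return the one-hot vector for a word index."""
    vec = np.zeros(vocab_size)
    vec[index] = 1.0
    return vec

def cosine_similarity(x, y):
    """Cosine similarity: x.y / (|x| * |y|)."""
    return x @ y / (np.linalg.norm(x) * np.linalg.norm(y))

a = one_hot(1, 5)  # e.g. "man"
b = one_hot(3, 5)  # e.g. "his"
print(cosine_similarity(a, b))  # 0.0 for any two different words
```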

The word2vec tool addresses this by representing each word as a fixed-length vector, such that these vectors better express the similarity and analogy relationships between different words. word2vec contains two models: the skip-gram model [2] and the continuous bag-of-words (CBOW) model.

10.1.2 Skip-Gram Model

Assume the text sequence is "the", "man", "loves", "his", "son". Take "loves" as the center word and set the context window size to 2.
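The skip-gram model is then concerned with the probability of generating the context words "the", "man", "his", "son", all within two words of the center word "loves". Assuming the context words are generated independently given the center word, this probability factorizes as

$$P(\textrm{"the"}, \textrm{"man"}, \textrm{"his"}, \textrm{"son"} \mid \textrm{"loves"}) = P(\textrm{"the"} \mid \textrm{"loves"}) \cdot P(\textrm{"man"} \mid \textrm{"loves"}) \cdot P(\textrm{"his"} \mid \textrm{"loves"}) \cdot P(\textrm{"son"} \mid \textrm{"loves"}).$$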

On softmax: https://blog.csdn.net/lz_peter/article/details/84574716
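In the skip-gram model, each word $i$ has two vectors: $\mathbf{v}_i$ when it acts as the center word and $\mathbf{u}_i$ when it acts as a context word. Following the notation of the d2l chapter listed in the references, the conditional probability of generating a context word $w_o$ given a center word $w_c$ is defined by a softmax over the vocabulary $\mathcal{V}$:

$$P(w_o \mid w_c) = \frac{\exp(\mathbf{u}_o^\top \mathbf{v}_c)}{\sum_{i \in \mathcal{V}} \exp(\mathbf{u}_i^\top \mathbf{v}_c)}.$$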

More generally, assume a text sequence of length $T$ in which the word at time step $t$ is $w^{(t)}$. Assuming that the context words are generated independently of each other given the center word, when the context window size is $m$, the likelihood function of the skip-gram model is the probability of generating all context words given each center word:
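$$\prod_{t=1}^{T} \prod_{-m \le j \le m,\ j \neq 0} P\!\left(w^{(t+j)} \mid w^{(t)}\right),$$

where time steps smaller than 1 or greater than $T$ are omitted.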

10.1.2.1 Training the Skip-Gram Model
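The parameters (the center and context vectors) are learned by maximizing the likelihood above, which is equivalent to minimizing the negative log-likelihood, typically with stochastic gradient descent. Below is a minimal NumPy sketch of one SGD step using the full softmax; the vocabulary size, embedding dimension, and word indices are made-up toy values, and real word2vec implementations use negative sampling or hierarchical softmax to avoid summing over the whole vocabulary:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 10, 4                             # toy vocabulary size and embedding dimension
v = rng.normal(scale=0.1, size=(V, d))   # center-word vectors
u = rng.normal(scale=0.1, size=(V, d))   # context-word vectors

def sgd_step(v, u, center, context, lr=0.1):
    """One SGD step on -log P(context | center) with a full softmax."""
    scores = u @ v[center]               # u_i . v_c for every word i
    p = np.exp(scores - scores.max())
    p /= p.sum()                         # softmax over the whole vocabulary
    # Gradients of the negative log-likelihood
    grad_v = u.T @ p - u[context]        # gradient w.r.t. v_c
    grad_u = np.outer(p, v[center])      # gradient w.r.t. every u_i
    grad_u[context] -= v[center]         # extra term for the true context word
    v[center] -= lr * grad_v             # in-place parameter updates
    u -= lr * grad_u

sgd_step(v, u, center=2, context=5)      # e.g. center "loves", context "his"
```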

10.1.3 Continuous Bag-of-Words Model

The continuous bag-of-words model assumes that the center word is generated based on the context words before and after it in the text sequence.
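Concretely (again following the d2l chapter's notation, where $\mathbf{v}_i$ now denotes a word's context vector and $\mathbf{u}_i$ its center vector), given context words $w_{o_1}, \ldots, w_{o_{2m}}$, CBOW averages their vectors and applies a softmax to generate the center word $w_c$:

$$P(w_c \mid w_{o_1}, \ldots, w_{o_{2m}}) = \frac{\exp\!\left(\frac{1}{2m}\, \mathbf{u}_c^\top (\mathbf{v}_{o_1} + \cdots + \mathbf{v}_{o_{2m}})\right)}{\sum_{i \in \mathcal{V}} \exp\!\left(\frac{1}{2m}\, \mathbf{u}_i^\top (\mathbf{v}_{o_1} + \cdots + \mathbf{v}_{o_{2m}})\right)}.$$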

To be continued...

References:

Original chapter: https://zh.d2l.ai/chapter_natural-language-processing/word2vec.html

Maximum likelihood estimation: http://fangs.in/post/thinkstats/likelihood/

Softmax function: https://blog.csdn.net/lz_peter/article/details/84574716

Conditional random field (CRF)
