Unsupervised Learning: Word Embedding

Word Embedding is an application of Dimension Reduction.

To represent a word as a vector, there are several approaches:

  • 1-of-N Encoding:

Each word corresponds to one dimension of the vector: a word's vector is 1 in its own dimension and 0 everywhere else.

Encoded this way, the vectors cannot reflect any relationship between words, so they cannot express semantics (see the sketch after this list).

  • Word Class:

This is essentially clustering; its drawbacks are the same as those of Clustering and are not repeated here. The main problem is that it cannot express the relationships between classes.

  • Word Embedding:

Word Embedding maps each word into a continuous vector space (this space is still fairly high-dimensional, but its dimension is much smaller than that of the 1-of-N vector).
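A small illustrative sketch of the contrast between the two representations (the words and numbers below are made up for illustration, they are not from the lecture):

```python
import numpy as np

vocab = ["apple", "bag", "cat", "dog", "elephant"]

# 1-of-N encoding: one dimension per word. Any two different words are
# orthogonal, so the vectors carry no semantic relationship at all.
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}
print(one_hot["cat"])                    # [0. 0. 1. 0. 0.]
print(one_hot["cat"] @ one_hot["dog"])   # 0.0 -> no similarity information

# Word embedding: a dense, lower-dimensional vector learned from context;
# related words (e.g. "cat" and "dog") can end up close to each other.
embedding = {"cat": np.array([0.9, 0.1]), "dog": np.array([0.8, 0.2])}
print(embedding["cat"] @ embedding["dog"])  # large -> similar words
```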

How is Word Embedding learned? It is an unsupervised method: how can a machine obtain word embeddings just by reading a large number of articles?

A natural idea is that the meaning of a word can be inferred from its context. How do we exploit the context? There are two general approaches:

1. Count-based

The idea is to make the inner product of V(wi) and V(wj) as close as possible to the number of times wi and wj appear together in the same article.

A representative count-based method is LSA (latent semantic analysis), which is solved with SVD.
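A minimal count-based sketch of this idea (the toy corpus, window choice, and embedding dimension are illustrative assumptions, not the lecture's code): build a word-word co-occurrence matrix, then factor it with SVD so that the inner product of two word vectors approximates their co-occurrence count.

```python
import numpy as np

corpus = [
    "the cat ran after the dog",
    "the dog ran after the cat",
    "the cat sat on the mat",
]
tokens = [doc.split() for doc in corpus]
vocab = sorted({w for doc in tokens for w in doc})
index = {w: i for i, w in enumerate(vocab)}

# Count co-occurrences within the same document (the "same article" criterion).
N = np.zeros((len(vocab), len(vocab)))
for doc in tokens:
    for wi in doc:
        for wj in doc:
            if wi != wj:
                N[index[wi], index[wj]] += 1

# Factor N so that V(wi) . V(wj) approximates the co-occurrence count N[i, j].
U, S, Vt = np.linalg.svd(N)
dim = 2                               # embedding dimension (illustrative)
V = U[:, :dim] * np.sqrt(S[:dim])     # each row is one word's vector

print(vocab)
print(np.round(V, 2))
```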

2. Prediction-based

The input is the (i-1)-th word w(i-1) (in 1-of-N encoding), and the output is the probability of each word in the vocabulary being the next word w(i).

We take the input of the first hidden layer, z1, z2, ..., as the representation of the word; this vector z is the word's embedding.

The input can also be extended to several preceding words (because the information carried by only one preceding word is relatively weak):

Note that the input words share the same weight parameters, for two reasons:

1. The same word appearing at different input positions should be mapped to the same embedding.

2. It reduces the number of parameters.

How do we keep W1 = W2 during gradient descent? Give them the same initialization, and then update each of them with the sum of both gradients, so they remain equal after every update.
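A minimal sketch of this tied-weight update (illustrative matrix sizes and made-up gradients, not the lecture's code):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, lr = 10, 4, 0.1

W1 = rng.normal(size=(embed_dim, vocab_size))   # weights for one input word
W2 = W1.copy()                                  # same initialization -> W1 = W2

def tied_update(W1, W2, grad_W1, grad_W2, lr):
    """Apply the same combined gradient to both matrices so they stay equal."""
    combined = grad_W1 + grad_W2
    return W1 - lr * combined, W2 - lr * combined

# One illustrative step with made-up gradients; afterwards W1 still equals W2.
g1 = rng.normal(size=W1.shape)
g2 = rng.normal(size=W2.shape)
W1, W2 = tied_update(W1, W2, g1, g2, lr)
print(np.allclose(W1, W2))  # True
```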

 

 

Various Architectures:
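The slide here presumably shows the common prediction-based variants, CBOW and Skip-gram. Below is a rough sketch of the two prediction directions (layer sizes, names, and the softmax output are illustrative assumptions, not taken from the lecture):

```python
import numpy as np

vocab_size, embed_dim = 10, 4
rng = np.random.default_rng(1)
W_in = rng.normal(size=(embed_dim, vocab_size))    # shared input embedding
W_out = rng.normal(size=(vocab_size, embed_dim))   # output projection

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cbow(context_ids):
    """CBOW: average the context-word embeddings, predict the center word."""
    z = W_in[:, context_ids].mean(axis=1)
    return softmax(W_out @ z)

def skipgram(center_id):
    """Skip-gram: embed the center word, predict each context word."""
    z = W_in[:, center_id]
    return softmax(W_out @ z)

print(cbow([2, 4]).shape, skipgram(3).shape)  # both (vocab_size,)
```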

  

From: https://www.youtube.com/watch?v=X7PH3NuYW0Q&list=PLJV_el3uVTsPy9oCRY30oBPNLCo89yu49&index=23

 

 
