An in-depth understanding of the nature of the Embedding layer

Following the earlier post https://blog.csdn.net/weixin_42078618/article/details/82999906, which discussed how the embedding layer performs dimensionality reduction, a month later I'd like to share the huge role the embedding layer plays in the field of NLP.

 

This post assumes you already understand how text is turned into vectors (for example, one-hot encoding).

 

First, suppose we have the sentence "公主很漂亮" ("The princess is very beautiful"), made up of the five characters 公, 主, 很, 漂, 亮. If we one-hot encode each character, we might get something like this:

公 [0 0 0 0 1]
主 [0 0 0 1 0]
很 [0 0 1 0 0]
漂 [0 1 0 0 0]
亮 [1 0 0 0 0]
At first glance nothing seems wrong, and indeed there isn't anything wrong yet. But now suppose our bag of words is a bit larger:

公 [0 0 0 0 1 0 0 0 0 0]
主 [0 0 0 1 0 0 0 0 0 0]
很 [0 0 1 0 0 0 0 0 0 0]
漂 [0 1 0 0 0 0 0 0 0 0]
亮 [1 0 0 0 0 0 0 0 0 0]
Here we assume the bag of words contains 10 characters in total, and the sentence is encoded as shown above.
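To make the encoding concrete, here is a minimal numpy sketch. The vocabulary ordering and the three "?" placeholder slots are my own assumptions, chosen only so that the printed vectors match the listing above.

```python
import numpy as np

# Assumed 10-character vocabulary, ordered so the printed vectors match the
# listing above; the three unused slots are arbitrary placeholders.
vocab = ["亮", "漂", "很", "主", "公", "?", "?", "?", "妃", "王"]

def one_hot(ch):
    vec = np.zeros(len(vocab), dtype=int)
    vec[vocab.index(ch)] = 1
    return vec

for ch in "公主很漂亮":          # "The princess is very beautiful"
    print(ch, one_hot(ch))
# 公 [0 0 0 0 1 0 0 0 0 0]
# 主 [0 0 0 1 0 0 0 0 0 0]
# ...
```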

The biggest advantage of this encoding is that any word, no matter which, can be represented by a one-dimensional array of 0s and 1s. Different words get completely different codes, so there is no overlap at all, and each word is uniquely identified.

However, precisely because every code is completely independent, the weakness shows itself: the ability to express relationships between words is almost zero!!!

Let me give you another example. Suppose we also have the sentence "王妃很漂亮" ("The wangfei, a royal consort, is very beautiful").

On the same basis, we can encode this sentence as:

王 [0 0 0 0 0 0 0 0 0 1]
妃 [0 0 0 0 0 0 0 0 1 0]
很 [0 0 1 0 0 0 0 0 0 0]
漂 [0 1 0 0 0 0 0 0 0 0]
亮 [1 0 0 0 0 0 0 0 0 0]
Looking at the Chinese, we intuitively feel that 公主 (princess) and 王妃 (wangfei) are in fact closely related. For example: the princess is the emperor's daughter and the wangfei is related to the emperor by marriage, so the two can be associated through the word "emperor"; the princess lives in the palace and so does the wangfei, so they can be associated through the word "palace"; the princess is a woman and so is the wangfei, so they can be associated through the word "female".

But once we use one-hot encoding, 公主 and 王妃 become this:

公 [0 0 0 0 1 0 0 0 0 0]
主 [0 0 0 1 0 0 0 0 0 0]
王 [0 0 0 0 0 0 0 0 0 1]
妃 [0 0 0 0 0 0 0 0 1 0]
Now tell me: without the Chinese annotations in front of them, could you tell from these four vectors alone what they have to do with one another? You couldn't. So what do we do?
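To see the problem numerically, here is a quick numpy check, with the vectors copied from the listing above:

```python
import numpy as np

gong = np.array([0, 0, 0, 0, 1, 0, 0, 0, 0, 0])  # 公
zhu  = np.array([0, 0, 0, 1, 0, 0, 0, 0, 0, 0])  # 主
wang = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1])  # 王
fei  = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 0])  # 妃

# The dot product between any two distinct one-hot vectors is 0, so the
# encoding treats 公/主 as exactly as unrelated as 公/王: no similarity
# information survives.
print(gong @ zhu, gong @ wang, gong @ fei, wang @ fei)  # 0 0 0 0
```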

Now, relying only on the associations we just hypothesized, let's pick the three words "emperor", "palace", and "female" and try to define 公主 and 王妃 in terms of them.

A princess must be the emperor's daughter, so let's set her similarity to "emperor" at 1.0. A princess lives in the palace from birth until she marries into a noble family at 20; assuming a lifespan of 80 years, let's set her similarity to "palace" at 0.25 (20/80). A princess must be a woman, so her similarity to "female" is 1.0.

A wangfei is not the emperor's blood relative, but she is still connected to him by marriage, so let's say her similarity to "emperor" is 0.6. A wangfei moves into the palace at 20 and lives there until 80, so let's set her similarity to "palace" at 0.75 (60/80). A wangfei must also be a woman, so her similarity to "female" is 1.0.

Then we can represent the words 公主 and 王妃 like this:

                 emperor  palace  female
公主 (princess)  [1.0     0.25    1.0]
王妃 (wangfei)   [0.6     0.75    1.0]
In this way, we have associated the two words 公主 and 王妃 with the words (features) "emperor", "palace", and "female". We can think of it as:

公主 = 1.0 × emperor + 0.25 × palace + 1.0 × female

王妃 = 0.6 × emperor + 0.75 × palace + 1.0 × female
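As an illustration that goes slightly beyond the post, the cosine similarity between these two dense vectors is now far from zero:

```python
import numpy as np

# Dense feature vectors from the table: [emperor, palace, female]
gongzhu = np.array([1.0, 0.25, 1.0])   # 公主 (princess)
wangfei = np.array([0.6, 0.75, 1.0])   # 王妃 (wangfei)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Unlike the one-hot vectors, whose similarity was exactly 0, the dense
# vectors now report that the two words are closely related.
print(round(cosine(gongzhu, wangfei), 2))  # 0.9
```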

Or, going a step further, let's assume that the features of each word are split evenly between its characters (note: this is only an assumption, for the convenience of explanation):

     emperor  palace  female
公  [0.5      0.125   0.5]
主  [0.5      0.125   0.5]
王  [0.3      0.375   0.5]
妃  [0.3      0.375   0.5]
In this way, we have characterized individual characters, and even whole words, with three features. If we call "emperor" feature (1), "palace" feature (2), and "female" feature (3), we arrive at an implicit relationship between 公主 and 王妃 in terms of these features:

王妃's features = 公主's feature (1) × 0.6 + 公主's feature (2) × 3 + 公主's feature (3) × 1
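Reading that relation feature by feature (my interpretation, since the post writes it as a single expression), a small numpy check confirms the numbers:

```python
import numpy as np

gongzhu = np.array([1.0, 0.25, 1.0])  # 公主
wangfei = np.array([0.6, 0.75, 1.0])  # 王妃

# The assumption in the text: each character gets half of its word's features.
gong = zhu = gongzhu / 2   # [0.5, 0.125, 0.5]
wang = fei = wangfei / 2   # [0.3, 0.375, 0.5]

# Feature by feature, 王妃's features are 公主's scaled by 0.6, 3 and 1.
print(np.allclose(wangfei, gongzhu * np.array([0.6, 3.0, 1.0])))  # True
```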

And with that, we have turned the text from the sparse one-hot state into a dense state, and the vectors that used to be independent of one another now carry intrinsic relationships.

 

So what does the embedding layer actually do? It takes our sparse matrix and, through some linear transformation (in a CNN this is done with a fully connected layer; the operation can also be described as a table lookup), turns it into a dense matrix. That dense matrix uses N features (N = 3 in the example) to characterize all of the text. On the surface the dense matrix is just a one-to-one mapping to individual characters, but in fact it also encodes a great many intrinsic relationships between characters, between words, and even between sentences (like the relationship between 公主 and 王妃 that we derived above). These relationships are represented by the parameters that the embedding layer learns. The process of going from the sparse matrix to the dense matrix is called embedding; many people also call it a lookup table, because the mapping between the two is one-to-one.
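Here is a minimal sketch of the "linear transformation = table lookup" equivalence, assuming the same 10-character vocabulary and 3 features as in the running example; PyTorch's nn.Embedding appears only at the end to show the lookup form of the same idea:

```python
import numpy as np
import torch
import torch.nn as nn

vocab_size, n_features = 10, 3   # same sizes as the running example

# A stand-in for the learned (vocab_size x n_features) weight matrix.
W = np.random.randn(vocab_size, n_features)

one_hot_gong = np.zeros(vocab_size)
one_hot_gong[4] = 1              # 公 sits at index 4 in our assumed vocabulary

# Multiplying a one-hot vector by W simply selects row 4 of W, which is why
# "linear transformation of a sparse matrix" and "table lookup" describe
# the same operation.
assert np.allclose(one_hot_gong @ W, W[4])

# torch.nn.Embedding stores such a weight matrix as a trainable parameter and
# performs the lookup directly from indices, skipping the one-hot step.
emb = nn.Embedding(vocab_size, n_features)
print(emb(torch.tensor([4])))    # a dense 3-dimensional vector for index 4
```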

More importantly, these relationships are updated continuously during backpropagation. After enough epochs, the relationships mature, meaning they correctly express the semantics of the whole corpus and the relationships between individual sentences. These mature relationships are exactly the weight parameters of the embedding layer.
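A toy PyTorch sketch (my own, with a made-up objective) showing that the embedding weights are ordinary trainable parameters that get nudged every epoch:

```python
import torch
import torch.nn as nn

# The embedding weights start out random and are refined by backpropagation,
# just like any other layer's parameters. The target here is a dummy tensor
# standing in for a real training signal.
emb = nn.Embedding(num_embeddings=10, embedding_dim=3)
optimizer = torch.optim.SGD(emb.parameters(), lr=0.1)

indices = torch.tensor([4, 9])   # e.g. 公 and 王
target = torch.randn(2, 3)       # placeholder objective

for epoch in range(5):
    optimizer.zero_grad()
    loss = ((emb(indices) - target) ** 2).mean()
    loss.backward()              # gradients flow into emb.weight
    optimizer.step()             # each epoch nudges the learned relationships
```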

Embedding is one of the most important inventions in the NLP field: it links up vectors that used to be completely independent. What is it equivalent to? It's as if you are your father's son, your father is A's colleague, and B is A's son; B seems to have nothing whatsoever to do with you, yet when you finally meet B, it turns out he is your deskmate. The embedding layer is the weapon for discovering this kind of hidden connection.
---------------------
Author: Luo Dahei
Source: CSDN
Original: https://blog.csdn.net/weixin_42078618/article/details/84553940
Copyright notice: This is the blogger's original article; if you reproduce it, please include a link to the post!
