2019-09-09 15:36:13
Problem Description: word2vec and GloVe are both algorithms that generate word embeddings. What is the difference between them?
Problem Solving:
GloVe (Global Vectors for Word Representation) and word2vec are two models that encode words into vectors based on co-occurrence information in the vocabulary (co-occurrence meaning how often words appear together in the corpus).
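To make "co-occurrence" concrete, here is a minimal sketch that counts, for a toy corpus, how often each pair of words appears within a small context window of each other. The function name, window size, and corpus are illustrative assumptions, not part of either algorithm's reference implementation.

```python
from collections import defaultdict

def cooccurrence_counts(tokens, window=2):
    """Count how often each ordered pair of words appears within
    `window` positions of each other (symmetric counts)."""
    counts = defaultdict(float)
    for i, w in enumerate(tokens):
        # look back up to `window` tokens and record both directions
        for j in range(max(0, i - window), i):
            counts[(w, tokens[j])] += 1.0
            counts[(tokens[j], w)] += 1.0
    return counts

corpus = "the cat sat on the mat the cat ate".split()
X = cooccurrence_counts(corpus, window=2)
```

These pair counts are exactly the statistics both models draw on: word2vec consumes them implicitly by sliding a window over the corpus, while GloVe aggregates them explicitly into a matrix first.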
The most intuitive difference between the two is that word2vec is a "predictive" model, while GloVe is a "count-based" model.
From an algorithmic perspective, the difference between word2vec and GloVe is that they optimize different loss functions.
For native word2vec, the loss is a cross-entropy loss.
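As a sketch of what that cross-entropy loss looks like, here is skip-gram with a full softmax: given a center word, the model predicts a distribution over the vocabulary, and the loss is the negative log-probability of the observed context word. The vocabulary size, embedding dimension, and random initialization below are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 5, 3                      # toy vocabulary size and embedding dim (assumed)
W_in = rng.normal(size=(V, d))   # input (center-word) embeddings
W_out = rng.normal(size=(V, d))  # output (context-word) embeddings

def skipgram_loss(center, context):
    """Cross-entropy loss of skip-gram with a full softmax:
    -log p(context | center)."""
    scores = W_out @ W_in[center]            # one score per vocabulary word
    scores -= scores.max()                   # shift for numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return -np.log(probs[context])

loss = skipgram_loss(center=0, context=3)
```

In practice word2vec replaces the full softmax with negative sampling or hierarchical softmax for speed, but the objective it approximates is this cross-entropy.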
For GloVe, a co-occurrence matrix X must first be built, where X_ij is the number of times words i and j co-occur. The loss is then the weighted least-squares objective

J = Σ_{i,j} f(X_ij) (w_i · w̃_j + b_i + b̃_j − log X_ij)²

where f(x) is a weighting function: f(X_ij) = 0 when X_ij = 0, and f(X_ij) caps at 1 when X_ij is too large, so very frequent pairs do not dominate the loss.
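The weighted objective and the weighting function f can be sketched as follows. The cap x_max = 100 and exponent α = 0.75 match the values reported for GloVe, but the toy matrix, random embeddings, and zero-initialized biases are assumptions for illustration only.

```python
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    """GloVe weighting function f: 0 at x = 0, capped at 1 for x >= x_max."""
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

def glove_loss(X, W, W_tilde, b, b_tilde):
    """Weighted least-squares GloVe objective over the nonzero entries of X."""
    i, j = np.nonzero(X)                     # log X_ij is only defined for X_ij > 0
    diff = (W[i] * W_tilde[j]).sum(axis=1) + b[i] + b_tilde[j] - np.log(X[i, j])
    return (glove_weight(X[i, j]) * diff ** 2).sum()

rng = np.random.default_rng(0)
V, d = 4, 3                                  # toy vocabulary size and dim (assumed)
X = np.array([[0, 2, 1, 0],
              [2, 0, 3, 1],
              [1, 3, 0, 0],
              [0, 1, 0, 0]], dtype=float)    # toy symmetric co-occurrence counts
W, W_tilde = rng.normal(size=(V, d)), rng.normal(size=(V, d))
b, b_tilde = np.zeros(V), np.zeros(V)
loss = glove_loss(X, W, W_tilde, b, b_tilde)
```

Because the sum runs only over nonzero X_ij, zero co-occurrence pairs contribute nothing, which is exactly the f(0) = 0 behavior described above.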