Natural Language Processing - Word2Vec in Practice: From Word2Vec to FastText

From Word2Vec to FastText

Application of Word2Vec in Deep Learning

  • Text generation (Word2Vec + RNN/LSTM)
  • Text classification (Word2Vec + CNN)
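Both applications start from pre-trained word vectors. As a warm-up, here is a minimal sketch of training Word2Vec with gensim (assuming gensim 4.x; the toy corpus and hyperparameters are only illustrative):

```python
# Minimal Word2Vec training sketch (assumes gensim 4.x; toy corpus for illustration)
from gensim.models import Word2Vec

# Each sentence is a list of tokens; a real corpus would be far larger.
sentences = [
    ["i", "love", "natural", "language", "processing"],
    ["word2vec", "learns", "dense", "word", "vectors"],
    ["fasttext", "extends", "word2vec", "with", "subword", "information"],
]

model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)  # sg=1: skip-gram

vec = model.wv["word2vec"]                         # 100-dimensional vector for one word
print(vec.shape)                                   # (100,)
print(model.wv.most_similar("word2vec", topn=3))   # nearest neighbours in vector space
```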

Text Generation

A neural network is essentially a nonlinear regression model built from a stack of formulas.

A common (feed-forward) neural network


A neural network with memory

A plain feed-forward pass is therefore not enough; we want the classifier to remember contextual relationships:

The purpose of an RNN is to take information with sequential relationships into account.

What is a sequential relationship? It is the context of information over time.

RNN


The state $S$ at each time step (the short-term memory):

$$S_t = f(U x_t + W S_{t-1})$$

The final output of this neuron, based on the latest $S$:

$$O_t = \mathrm{softmax}(V S_t)$$
Simply put, for t = 5 this is equivalent to unrolling one neuron into five copies across time.

In other words, S is what we call the memory (it records the information from t = 1 through 5).
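A minimal NumPy sketch of a single RNN time step, directly following the two formulas above (the tanh nonlinearity and the toy dimensions are assumptions for illustration):

```python
import numpy as np

def rnn_step(x_t, s_prev, U, W, V):
    """One RNN time step: S_t = f(U x_t + W S_{t-1}),  O_t = softmax(V S_t)."""
    s_t = np.tanh(U @ x_t + W @ s_prev)   # new short-term memory (f = tanh here)
    logits = V @ s_t
    o_t = np.exp(logits - logits.max())
    o_t /= o_t.sum()                      # softmax over the vocabulary
    return s_t, o_t

# Toy dimensions: 50-d word vectors, 32-d hidden state, vocabulary of 1000 words
rng = np.random.default_rng(0)
U = rng.normal(size=(32, 50))
W = rng.normal(size=(32, 32))
V = rng.normal(size=(1000, 32))
s = np.zeros(32)
for x_t in rng.normal(size=(5, 50)):      # a sequence of 5 word vectors (t = 1..5)
    s, o = rnn_step(x_t, s, U, W, V)      # s carries the memory forward through time
```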

LSTM


LSTM: Long Short-Term Memory


The most important component of an LSTM is the cell state, which runs all the way along the timeline and acts as the thread of memory.

It is modified by elementwise multiplication (⊗) and addition (⊕) operations to update the memory.

How much information is added or removed is controlled by these valves: the gates.


1 means: keep all of this information.

0 means: this information can be forgotten.

  1. Forget gate

Decides what information we should forget.

It looks at the previous state $h_{t-1}$ and the current input $x_t$, and outputs a value between 0 and 1 through the gate (much like an activation function).

1 represents: Remember it for me! 0 means: forget it!

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$

  2. Memory gate (input gate)

Decides what to remember.

This gate is more complex and works in two steps:

Step 1: a sigmoid layer decides which information needs to be updated.

Step 2: a tanh layer creates the candidate cell state $\tilde{C}_t$ (the proposed update).


$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$$

  3. Update gate (cell state update)

Updates the old cell state into the new cell state.

The cell state is updated using the elementwise multiplication and addition operations:


$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$

  4. Output gate

Uses the memory to decide what value to output.

Now that the cell state has been updated, we use this memory thread to determine the output:

(The $O_t$ here plays a role similar to the per-step output of the plain RNN above.)

$$O_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = O_t * \tanh(C_t)$$
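Putting the four gates together, here is a minimal NumPy sketch of one LSTM time step following the gate equations above (the weight shapes, initialization, and toy dimensions are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o):
    """One LSTM time step following the four gate equations above."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)             # forget gate
    i_t = sigmoid(W_i @ z + b_i)             # memory (input) gate
    c_tilde = np.tanh(W_C @ z + b_C)         # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde       # updated cell state
    o_t = sigmoid(W_o @ z + b_o)             # output gate
    h_t = o_t * np.tanh(c_t)                 # new hidden state
    return h_t, c_t

# Toy dimensions: 50-d input vectors, 32-d hidden/cell state
rng = np.random.default_rng(0)
shape = (32, 32 + 50)
W_f, W_i, W_C, W_o = (rng.normal(size=shape) for _ in range(4))
b_f = b_i = b_C = b_o = np.zeros(32)
h, c = np.zeros(32), np.zeros(32)
h, c = lstm_step(rng.normal(size=50), h, c, W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o)
```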

Text Classification


Baseline: BoW + SVM
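As a reference point, a minimal scikit-learn sketch of this BoW + SVM baseline (the toy data is made up for illustration):

```python
# Bag-of-words + linear SVM baseline (scikit-learn; toy data for illustration)
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

texts  = ["great movie, loved it", "terrible plot and acting",
          "what a wonderful film", "boring and way too long"]
labels = [1, 0, 1, 0]                      # 1 = positive, 0 = negative

clf = make_pipeline(CountVectorizer(ngram_range=(1, 1)), LinearSVC())
clf.fit(texts, labels)
print(clf.predict(["wonderful acting"]))   # expected: [1]
```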

Deep Learning: CNN for Text

CNN4Text


Blur + sharpen: convolution kernels on an image can blur or sharpen the picture.


How do we carry this over to text?

$$C_i = f(W^T X_{i:i+h-1} + b)$$

  1. Convert the text into an "image" (stack the word vectors into a matrix)


  2. Make the CNN one-dimensional (1-D convolution over the word sequence; see the sketch below)

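A minimal NumPy sketch of the convolution formula $C_i = f(W^T X_{i:i+h-1} + b)$ above: stack the sentence's word vectors into a matrix and slide a filter spanning h words over it (using ReLU as f and toy dimensions; both are illustrative choices):

```python
import numpy as np

def conv1d_text(X, W, b, f=lambda z: np.maximum(z, 0)):
    """X: (n_words, d) matrix of word vectors; W: (h, d) filter; returns the feature map c."""
    h = X.shape[0] - W.shape[0] + 1 and W.shape[0]          # filter height (words spanned)
    n = X.shape[0] - h + 1                                  # narrow convolution: no padding
    return np.array([f(np.sum(W * X[i:i + h]) + b) for i in range(n)])

rng = np.random.default_rng(0)
X = rng.normal(size=(7, 50))                 # a 7-word sentence, 50-d word vectors
W, b = rng.normal(size=(3, 50)), 0.0         # one filter spanning h = 3 consecutive words
c = conv1d_text(X, W, b)                     # feature map of length 7 - 3 + 1 = 5
print(c.shape)                               # (5,)
```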

RNN for text generation → strict logic, order is preserved

CNN for text classification → tolerant of local errors

Boundary handling:

Narrow vs. wide convolution (no padding vs. zero padding)


Stride size: how many positions the filter moves at each step


FastText

FastText's model architecture looks very similar to Word2Vec's CBOW model; the main differences are:

  1. BoW → Bi-Gram
  2. Hashing Trick
  3. Hierarchical Softmax

The focus is text classification, with space compression (the hashing trick) and faster training (hierarchical softmax); see the sketch below.
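A minimal sketch of the bag-of-bigrams plus hashing-trick idea (the bucket count and the use of Python's built-in hash are illustrative; fastText itself hashes n-grams with an FNV-style hash into a fixed number of buckets):

```python
# Bigram features + hashing trick sketch (bucket count and hash choice are illustrative)
NUM_BUCKETS = 2_000_000

def ngram_bucket(ngram, num_buckets=NUM_BUCKETS):
    # Python's built-in hash is used only for illustration.
    return hash(ngram) % num_buckets

def featurize(tokens):
    """Return feature ids: unigrams plus hashed bigrams (word order partially preserved)."""
    ids = [ngram_bucket(tok) for tok in tokens]                             # unigram features
    ids += [ngram_bucket(a + " " + b) for a, b in zip(tokens, tokens[1:])]  # bigram features
    return ids

tokens = "fasttext is fast and simple".split()
print(featurize(tokens))
# The document is represented by averaging the embeddings of these feature ids,
# and a linear classifier with hierarchical softmax predicts the label.
```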


Origin blog.csdn.net/weixin_46489969/article/details/125070263