CA-RNN paper reading notes

*** CA-RNN: Using Context-Aligned Recurrent Neural Networks for Modeling Sentence Similarity ***
**Paper reading:**
## 1. Summary:
 Most RNNs model hidden states based only on the current sentence; contextual information from the other sentence has not been well studied during hidden-state generation. This paper proposes a context-aligned **RNN (CA-RNN)** model, which incorporates the contextual information of **aligned words** across the two sentences when generating internal hidden states. Specifically, a word alignment detector is first run to identify the aligned words in the two sentences. Then a **context-aligned gating mechanism** is proposed and embedded into the model to automatically absorb the aligned words' context for hidden state updates.
## 2. Main contributions:
 
 1. A new context-aligned RNN model is presented, in which the aligned words of the two sentences and their contexts are well exploited to produce better hidden states;
 2. A context-aligned gating mechanism is presented and well fitted to the model; it can automatically absorb relevant context and reduce noise when generating a specific hidden state;
 3. A detailed analysis of the experimental results on two sentence-similarity tasks is given, to better understand the effectiveness of the model relative to related work.
 
 
## 3. Model Decomposition:
 - Neural network:
  The model is as follows:
  
  Input feature items x1, x2, x3 go in, and h(x) is the final output.
Layer 2 is the hidden layer, whose values we do not observe; every layer between the input layer and the output layer is a hidden layer.
Each neuron above is computed as follows:

Each input value arriving at a node carries a different weight, and the output is computed according to those weights.
This involves matrix multiplication, where g is the sigmoid function; a minimal sketch follows:
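A minimal NumPy sketch of this forward pass (the layer sizes, random weights, and inputs here are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def sigmoid(z):
    """g(z) = 1 / (1 + exp(-z)), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: 3 input features (x1, x2, x3), 4 hidden units, 1 output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # input layer -> hidden layer
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # hidden layer -> output layer

def forward(x):
    a1 = sigmoid(W1 @ x + b1)     # hidden layer: weighted sum, then g
    return sigmoid(W2 @ a1 + b2)  # final output h(x)

x = np.array([0.5, -1.0, 2.0])    # x1, x2, x3
print(forward(x))
```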

 - Loss function and backpropagation:
  Loss function:
  Backpropagation (specific formulas omitted): the result of forward propagation is compared with the actual value to obtain the error; the derivative of the cost function with respect to that error is then pushed from the output layer of the neural network back one layer at a time, all the way to the input layer; the weights are then adjusted to correct the error, trying to make it smaller. A sketch of one such step follows.
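Here is one forward/backward step for the tiny network above, reusing `sigmoid`, `W1`, `b1`, `W2`, `b2`, and `x` from the previous block, and assuming a squared-error loss (the loss choice and learning rate are illustrative):

```python
# Reuses sigmoid, W1, b1, W2, b2, and x from the forward-pass sketch above.
def backward_step(x, y, W1, b1, W2, b2, lr=0.1):
    """One forward pass, then errors pushed back layer by layer (toy sketch)."""
    a1 = sigmoid(W1 @ x + b1)            # forward, keeping intermediates
    hx = sigmoid(W2 @ a1 + b2)
    err = hx - y                         # compare prediction with actual value
    d2 = err * hx * (1 - hx)             # error through the output sigmoid
    d1 = (W2.T @ d2) * a1 * (1 - a1)     # ...pushed back through the hidden layer
    W2 = W2 - lr * np.outer(d2, a1); b2 = b2 - lr * d2   # adjust the weights
    W1 = W1 - lr * np.outer(d1, x);  b1 = b1 - lr * d1   # to shrink the error
    return W1, b1, W2, b2

W1, b1, W2, b2 = backward_step(x, np.array([1.0]), W1, b1, W2, b2)
```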
 - RNN (Recurrent Neural Network), whose model is as follows:

When we deal with problems involving characters and the like, the output at one time step is fed in as input data for processing at the next time step.
For example: given a sentence, we tokenize it to get t items of data and pass each word in as x0, x1, ..., xt. When x0 is passed in, we get a result h0; after processing, that result is handed to the next time step, where the next item x1 is passed in together with the data processed at the previous step, the two are integrated in the computation, and the result keeps being passed down until the end.
An RNN is essentially a BP network with a loop: compared with a plain BP network it has just one extra link, which lets it feed in the data processed at the previous point in time. A sketch of this recurrence follows.
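A minimal NumPy sketch of that loop (the tanh activation, dimensions, and random weights are illustrative assumptions):

```python
import numpy as np

def rnn_forward(xs, Wx, Wh, b, h0):
    """Vanilla RNN: each step combines the current input x_t with the
    previous step's result h_{t-1} and passes the new result onward."""
    h, hs = h0, []
    for x in xs:                          # x0, x1, ..., xt in order
        h = np.tanh(Wx @ x + Wh @ h + b)  # integrate current input and previous state
        hs.append(h)                      # h0, h1, ... handed to the next step
    return hs

# Illustrative sizes: word vectors of dim 3, hidden state of dim 4.
rng = np.random.default_rng(1)
Wx, Wh, b = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4)
xs = [rng.normal(size=3) for _ in range(5)]      # a 5-word "sentence"
hs = rnn_forward(xs, Wx, Wh, b, h0=np.zeros(4))
```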
 - LSTM (Long Short-Term Memory network)

The figure has three gates: the input gate, the output gate, and the forget gate.
1. Input gate: determines whether the input passes through, via input * g; if g is 0 the input becomes 0, so it decides whether the input signal enters.
2. Forget gate: determines how much this signal is attenuated, e.g. the signal might be attenuated by 50%.
3. Output gate: determines whether, and how much, is output, e.g. 50% of the output. (A sketch of one LSTM step follows.)
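A compact sketch of one LSTM step showing the three gates (the fused weight layout is a common convention, assumed here for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W maps [x; h_prev] to the four stacked gate pre-activations."""
    z = W @ np.concatenate([x, h_prev]) + b
    H = h_prev.shape[0]
    i = sigmoid(z[0:H])        # input gate: whether/how much new input to let in
    f = sigmoid(z[H:2*H])      # forget gate: how much of the old signal to keep
    o = sigmoid(z[2*H:3*H])    # output gate: whether/how much to output
    g = np.tanh(z[3*H:4*H])    # candidate content for the cell
    c = f * c_prev + i * g     # e.g. f = 0.5 keeps 50% of the old signal
    h = o * np.tanh(c)         # e.g. o = 0.5 emits 50% of the cell content
    return h, c
```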
With the above background on basic neural networks, RNNs, and LSTMs, we can now read this paper.
Following the paper's components:
**Word alignment detection:**
 - 1. Based on word overlap
 Overlapping words are the same word appearing in both sentences; contextual information is obtained from the identical word (see the sketch after this list).
 - 2. Based on semantic similarity
 Based on semantics, i.e. similar words, for example "dad" and "father"; aligned words are selected by the similarity of the words or their contexts. (Based on the Stanford CoreNLP tools and a monolingual word aligner algorithm, a semantic algorithm.)
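A naive sketch of the overlap-based detector (the tokenization and lowercasing choices are my assumptions; the semantic variant would instead come from the word aligner tool):

```python
def overlap_align(sent_x, sent_y):
    """Pair up positions whose (lowercased) tokens are identical
    in the two sentences. A naive illustration of overlap-based alignment."""
    pairs = []
    for i, wx in enumerate(sent_x):
        for j, wy in enumerate(sent_y):
            if wx.lower() == wy.lower():
                pairs.append((i, j))
    return pairs

print(overlap_align("A man plays the guitar".split(),
                    "The man is playing a guitar".split()))
# -> [(0, 4), (1, 1), (3, 0), (4, 5)]
```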
**Context-aligned gating mechanism:**
 - 1. Relevance measurement
 - 2. Context absorption
1. Relevance measurement: measure the relevance between the current word's hidden state (hy_j) and the hidden state of its aligned word in the other sentence (hx_i); this is the criterion that decides how much contextual information from the aligned word in the other sentence should be absorbed. (That is, it is evaluated as a probability by an activation function, Eq. (2).)
2. Context absorption: the original hidden state hy_j obtained by the RNN directly absorbs the contextual information of the aligned word in the other sentence (hx_i), according to the measured relevance. As a result a new hidden state is generated by the formula: h̃y_j = g ⊙ hx_i + (1 − g) ⊙ hy_j  (3), where g is the interpolation parameter obtained from Eq. (2), ⊙ denotes element-wise multiplication, and h̃y_j is the newly generated hidden state.
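A sketch of the gate in NumPy, assuming Eq. (2) is a sigmoid over the concatenated hidden states (the paper defines the exact parameterization; `Wg` and `bg` here are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def context_aligned_gate(hx_i, hy_j, Wg, bg):
    """Absorb the aligned word's context hx_i into the current hidden
    state hy_j, weighted by the measured relevance g."""
    g = sigmoid(Wg @ np.concatenate([hx_i, hy_j]) + bg)  # relevance, Eq. (2) (assumed form)
    return g * hx_i + (1.0 - g) * hy_j                   # new hidden state, Eq. (3)

# Illustrative usage with hidden size 4.
rng = np.random.default_rng(2)
H = 4
Wg, bg = rng.normal(size=(H, 2 * H)), np.zeros(H)
new_h = context_aligned_gate(rng.normal(size=H), rng.normal(size=H), Wg, bg)
```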

In summary, the above is the overall process of CA-RNN.
