A Summary of Getting Started with Recurrent Neural Networks

1. One of the most commonly used neural network structures is the RNN, or recurrent neural network.

Suppose the network is unrolled into n cells, starting from the first.

The initial state (state 0) together with the input at time 0 enters the first cell. The cell processes them, and the result is duplicated into two identical copies: one becomes the output of this step, and the other is passed on to the next cell. What each copy is used for after the split is up to the design of the network.

For the second cell, the inputs are the first cell's output and the input at the current time step; in general, the previous step's output always serves as part of the next step's input.

What does a deep recurrent network mean? Each cell produces two outputs: one feeds the next cell in time, and the other can be fed upward into further recurrent layers stacked on top of this cell. In natural language processing, this kind of depth is typically used to process the word vectors of words.
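
To make the recurrence concrete, here is a minimal NumPy sketch of this unrolling (the names inputs, W_xh, W_hh, and b are illustrative assumptions, not from the original):

import numpy as np

def rnn_forward(inputs, h0, W_xh, W_hh, b):
    # inputs: a list of input vectors, one per time step
    h = h0
    outputs = []
    for x in inputs:
        # The new state combines the current input with the previous state.
        h = np.tanh(W_xh @ x + W_hh @ h + b)
        # The same vector is duplicated: one copy is this step's output,
        # the other is carried forward as the next step's state.
        outputs.append(h)
    return outputs, h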

2. In recurrent neural networks, there is a structure called the long short-term memory network (LSTM)

Why do we need a long short-term memory network? When a recurrent network is unrolled over many steps it becomes very long, yet much of what was seen early on is no longer useful, so for some cells we want the ability to forget. Hence the long short-term memory network. Its overall structure is very similar to the recurrent neural network above, but it has an additional forget gate: based on the current data, it selectively deletes parts of the previously input data.
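
To see what the forget gate does, here is a minimal NumPy sketch of a single LSTM step (the single fused weight matrix W and bias b are illustrative assumptions; real implementations differ in layout):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    # Stack the previous hidden state with the current input.
    z = np.concatenate([h_prev, x])
    # One affine map yields the four gate pre-activations.
    f, i, o, g = np.split(W @ z + b, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)  # gate values in (0, 1)
    g = np.tanh(g)                                # candidate new memory
    # The forget gate f decides how much of the old cell state to delete,
    # given the current data; the input gate i admits the new candidate.
    c = f * c_prev + i * g
    h = o * np.tanh(c)  # the output gate exposes part of the cell state
    return h, c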

# Define the LSTM structure.
lstm = rnn_cell.BasicLSTMCell(lstm_hidden_size)
# Initialize the state to an all-zero tensor.
state = lstm.zero_state(batch_size, tf.float32)
# Initialize the loss.
loss = 0.0
# Truncate the unrolling to num_steps steps: unrolling further would be too
# burdensome for the computer and is unnecessary. The for loop performs the
# recurrence; after the first step, tf.get_variable_scope() is told to reuse
# variables so every step shares the same LSTM weights. Each step's output is
# turned into a prediction and its loss is accumulated.
for i in range(num_steps):
    if i > 0: tf.get_variable_scope().reuse_variables()
    lstm_output, state = lstm(current_input, state)
    final_output = fully_connected(lstm_output)
    loss += calc_loss(final_output, expected_output)
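
In this snippet, current_input, fully_connected, calc_loss, and expected_output are placeholders for problem-specific pieces. Inside the loop, one hypothetical way to expand them in TensorFlow 1.x (the dense layer and mean-squared-error loss here are assumptions, not the original's choices) is:

current_input = inputs[:, i, :]  # batch of input features at time step i
# A dense layer plays the role of fully_connected.
final_output = tf.layers.dense(lstm_output, output_size)
# Mean squared error plays the role of calc_loss for a regression task.
loss += tf.losses.mean_squared_error(expected_output[:, i, :], final_output)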

 

3. The code for a deep recurrent neural network

lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Compared with the ordinary recurrent network, the deep version takes the
# original lstm cell, uses the list [lstm] * number_of_layers as the argument,
# and feeds it into a new function: MultiRNNCell.
stacked_lstm = rnn_cell.MultiRNNCell([lstm] * number_of_layers)

state = stacked_lstm.zero_state(batch_size, tf.float32)

for i in range(num_steps):
    if i > 0: tf.get_variable_scope().reuse_variables()
    stacked_lstm_output, state = stacked_lstm(current_input, state)
    final_output = fully_connected(stacked_lstm_output)
    loss += calc_loss(final_output, expected_output)
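
One caveat: in later releases of TensorFlow 1.x, passing the same cell object repeatedly, as [lstm] * number_of_layers does, makes every layer try to share one set of variables and typically fails with a shape error once the layers' input sizes differ. A safer sketch (same API, one fresh cell per layer) is:

stacked_lstm = rnn_cell.MultiRNNCell(
    [rnn_cell.BasicLSTMCell(lstm_size) for _ in range(number_of_layers)])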

4. Dropout in Recurrent Neural Networks

lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Dropout is added by wrapping the original lstm cell in DropoutWrapper,
# which produces a dropout-enabled lstm.
dropout_lstm = tf.nn.rnn_cell.DropoutWrapper(lstm, output_keep_prob=0.5)
# As in the deep network above, the wrapped cell is repeated
# number_of_layers times and fed into MultiRNNCell.
stacked_lstm = rnn_cell.MultiRNNCell([dropout_lstm] * number_of_layers)

state = stacked_lstm.zero_state(batch_size, tf.float32)

for i in range(num_steps):
    if i > 0: tf.get_variable_scope().reuse_variables()
    stacked_lstm_output, state = stacked_lstm(current_input, state)
    final_output = fully_connected(stacked_lstm_output)
    loss += calc_loss(final_output, expected_output)
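
By default, DropoutWrapper with output_keep_prob only drops a cell's outputs (the connections between stacked layers and toward the prediction), not the state passed between time steps. Combining it with the one-fresh-cell-per-layer pattern from section 3, a sketch would be (make_cell is a helper name introduced here for illustration):

def make_cell():
    cell = rnn_cell.BasicLSTMCell(lstm_size)
    return tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=0.5)

stacked_lstm = rnn_cell.MultiRNNCell(
    [make_cell() for _ in range(number_of_layers)])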

 
