Fit LSTM Network
Next, we need to fit an LSTM network model to the training data.
This first requires that the training dataset be transformed from a 2D array [samples, features] to a 3D array [samples, timesteps, features]. We will fix time steps at 1, so this change is straightforward.
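To make the 2D-to-3D conversion concrete, here is a small NumPy sketch (the array values are invented for illustration); with time steps fixed at 1, the reshape simply inserts a middle axis of length 1:

```python
import numpy as np

# a toy training array: 3 samples, 2 features
X = np.array([[0.1, 0.2],
              [0.3, 0.4],
              [0.5, 0.6]])

# reshape [samples, features] -> [samples, timesteps, features] with 1 time step
X3d = X.reshape(X.shape[0], 1, X.shape[1])
print(X3d.shape)  # (3, 1, 2)
```

No values are moved or changed by the reshape; each sample simply becomes a one-step sequence.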
Next, we need to design an LSTM network. We will use a simple structure with 1 hidden layer with 1 LSTM unit, then an output layer with linear activation and 3 output values. The network will use a mean squared error loss function and the efficient ADAM optimization algorithm.
The LSTM is stateful; this means that we have to manually reset the state of the network at the end of each training epoch. The network will be fit for 1500 epochs.
The same batch size must be used for training and prediction, and we require a prediction to be made at each time step of the test dataset. This means that a batch size of 1 must be used. A batch size of 1 is also called online learning, as the network weights will be updated after each training pattern during training (as opposed to mini-batch or batch updates).
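To make the online-learning idea concrete, here is a toy sketch (not the tutorial's code; a one-weight linear model with a made-up learning rate) contrasting one weight update per training pattern against a single averaged full-batch update:

```python
import numpy as np

# toy data: y = 2 * x, four training patterns
X = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * X

def online_sgd(w, lr=0.01):
    # batch size 1: one weight update per training pattern
    updates = 0
    for xi, yi in zip(X, y):
        grad = 2 * (w * xi - yi) * xi  # d/dw of squared error
        w -= lr * grad
        updates += 1
    return w, updates

def batch_gd(w, lr=0.01):
    # full batch: gradients averaged, one update per pass
    grad = np.mean(2 * (w * X - y) * X)
    w -= lr * grad
    return w, 1

w_online, n_online = online_sgd(0.0)
w_batch, n_batch_updates = batch_gd(0.0)
print(n_online, n_batch_updates)  # 4 1
```

Both variants move the weight toward the true value of 2; the difference is only in how often the update is applied.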
We can put all of this together in a function called fit_lstm(). The function takes a number of key parameters that can be used to tune the network later and the function returns a fit LSTM model ready for forecasting.
from keras.models import Sequential
from keras.layers import Dense, LSTM

# fit an LSTM network to training data
def fit_lstm(train, n_lag, n_seq, n_batch, nb_epoch, n_neurons):
    # reshape training into [samples, timesteps, features]
    X, y = train[:, 0:n_lag], train[:, n_lag:]
    X = X.reshape(X.shape[0], 1, X.shape[1])
    # design network
    model = Sequential()
    model.add(LSTM(n_neurons, batch_input_shape=(n_batch, X.shape[1], X.shape[2]), stateful=True))
    model.add(Dense(y.shape[1]))
    model.compile(loss='mean_squared_error', optimizer='adam')
    # fit network
    for i in range(nb_epoch):
        model.fit(X, y, epochs=1, batch_size=n_batch, verbose=0, shuffle=False)
        model.reset_states()
    return model
The function can be called as follows:
# fit model
model = fit_lstm(train, 1, 3, 1, 1500, 1)
The configuration of the network was not tuned; try different parameters if you like.
Report your findings in the comments below. I’d love to see what you can get.
Machine Learning: A Full Analysis of Keras's Stateful LSTM, with Example Tests
The stateful LSTM in Keras could be called every learner's nightmare: a confusing mechanism, documentation that explains too little, and a scarcity of Chinese-language material. Note that "state" here means the c and h of the original paper's equations, i.e. the LSTM-specific memory parameters, not the weights w.

In stateless mode, being a long short-term memory network does not mean your LSTM will remember the contents of previous batches. In the default stateless mode, Keras resets the memory-state parameters of the LSTM network (c and h, not the weights w) at the start of training on each small sequence (= sample), i.e. it calls model.reset_states().

Why does a stateless LSTM reinitialize the memory parameters for every sequence? Because Keras shuffles the samples by default during training, any dependency between sequences disappears: there is no temporal relationship between one sample and the next, and the order is scrambled, so passing the memory parameters across batches and sequences would be meaningless. Keras therefore reinitializes them.

Whether stateful or stateless, once the model receives a batch it computes the output for each sequence, averages their gradients, and back-propagates to update all the parameters.
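To illustrate that the state really is c and h rather than the weights, here is a hand-rolled single LSTM cell in NumPy with fixed random weights (a sketch, not Keras internals): carrying (h, c) over from a previous sequence changes the output, while resetting the state to zeros reproduces the stateless behaviour.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_units = 2, 3

# one fixed set of random weights for the four LSTM gates (i, f, o, g);
# these never change below -- only the state (h, c) does
W = rng.normal(size=(4, n_units, n_in))
U = rng.normal(size=(4, n_units, n_units))
b = rng.normal(size=(4, n_units))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c):
    # standard LSTM cell equations; (h, c) is the memory state
    i = sigmoid(W[0] @ x + U[0] @ h + b[0])
    f = sigmoid(W[1] @ x + U[1] @ h + b[1])
    o = sigmoid(W[2] @ x + U[2] @ h + b[2])
    g = np.tanh(W[3] @ x + U[3] @ h + b[3])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

x1 = np.array([0.5, -0.3])
x2 = np.array([0.1, 0.7])
zeros = np.zeros(n_units)

# "stateful": (h, c) from the first sequence feeds the second
h, c = lstm_step(x1, zeros, zeros)
h_stateful, _ = lstm_step(x2, h, c)

# "stateless": state is reset to zeros before the second sequence
h_stateless, _ = lstm_step(x2, zeros, zeros)

print(np.allclose(h_stateful, h_stateless))  # False
```

The weights are identical in both runs; the outputs differ only because the carried-over (h, c) differ, which is exactly what stateful=True preserves between batches in Keras.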