Keras Quick Start Study Notes (5): Training a Sentiment Analysis Model with a Recurrent Neural Network

1. Importing Modules

Most of the imports are the same as in the earlier notes; the one new import needed here is:

from keras.layers import LSTM
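
For reference, the model code below also uses Sequential, Embedding, Dropout, and Dense. The exact import block lives in the earlier notes, so this is an assumed reconstruction:

from keras.models import Sequential
from keras.layers import Dense, Dropout, Embedding, LSTM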

2. Preprocessing the Data

See the preprocessing steps in the earlier notes; a minimal sketch is reproduced below.
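
This sketch is not from the original post, but it is consistent with the shapes that appear later: maxword = 400, and an Embedding with 5,669,568 = 88,587 × 64 parameters, i.e. vocab_size = 88,587. The 10,000/1,000 train/test subsetting visible in the training logs is an assumption about how the earlier notes sliced the IMDB data:

from keras.datasets import imdb
from keras.preprocessing import sequence

maxword = 400  # pad or truncate every review to 400 word indices

(X_train, Y_train), (X_test, Y_test) = imdb.load_data()
# The Embedding layer needs input_dim = largest word index + 1.
vocab_size = max(max(max(x) for x in X_train), max(max(x) for x in X_test)) + 1
X_train = sequence.pad_sequences(X_train, maxlen=maxword)
X_test = sequence.pad_sequences(X_test, maxlen=maxword)
# The logs below show 10,000 training and 1,000 validation samples, so the
# full 25,000/25,000 IMDB split was presumably subset along these lines:
X_train, Y_train = X_train[:10000], Y_train[:10000]
X_test, Y_test = X_test[:1000], Y_test[:1000]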

3. Building the Model

The output shape of an LSTM layer depends on return_sequences:
If return_sequences=True, it returns a 3D tensor of shape (samples, timesteps, output_dim).
Otherwise, it returns a 2D tensor of shape (samples, output_dim) holding only the last timestep's output.
The Keras documentation states:
to stack recurrent layers, you must use return_sequences=True
So every LSTM that feeds another LSTM sets return_sequences=True. The final LSTM feeds the fully connected layer, which expects 2D input, so there return_sequences is left at its default of False.

model = Sequential()
# Map each word index to a 64-dimensional vector; inputs are padded to maxword steps.
model.add(Embedding(vocab_size, 64, input_length=maxword))

# Stacked LSTMs: each layer that feeds another recurrent layer must return full sequences.
model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.2))

model.add(LSTM(64, return_sequences=True))
model.add(Dropout(0.2))

# The last LSTM returns only its final output, giving the 2D tensor the Dense layer expects.
model.add(LSTM(32))
model.add(Dropout(0.2))

# Binary sentiment classification: a single sigmoid unit.
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
print(model.summary())

Output:

Layer (type)                 Output Shape              Param #   
=================================================================
embedding_4 (Embedding)      (None, 400, 64)           5669568   
_________________________________________________________________
lstm_4 (LSTM)                (None, 400, 128)          98816     
_________________________________________________________________
dropout_6 (Dropout)          (None, 400, 128)          0         
_________________________________________________________________
lstm_5 (LSTM)                (None, 400, 64)           49408     
_________________________________________________________________
dropout_7 (Dropout)          (None, 400, 64)           0         
_________________________________________________________________
lstm_6 (LSTM)                (None, 32)                12416     
_________________________________________________________________
dropout_8 (Dropout)          (None, 32)                0         
_________________________________________________________________
dense_10 (Dense)             (None, 1)                 33        
=================================================================
Total params: 5,830,241
Trainable params: 5,830,241
Non-trainable params: 0
_________________________________________________________________

The total parameter count is about the same as the convolutional network's from the previous notes: the Embedding layer alone accounts for 5,669,568 of the 5,830,241 parameters, so the stacked LSTMs add relatively little on top.
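
As a sanity check, the LSTM rows in the summary can be reproduced by hand. A standard Keras LSTM has four gates, and each gate has a kernel over the input, a recurrent kernel over the previous state, and a bias:

# params = 4 * ((input_dim + units + 1) * units) for a standard Keras LSTM
def lstm_params(input_dim, units):
    return 4 * ((input_dim + units + 1) * units)

print(lstm_params(64, 128))  # 98816, matches lstm_4
print(lstm_params(128, 64))  # 49408, matches lstm_5
print(lstm_params(64, 32))   # 12416, matches lstm_6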

4. Training the Model

model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=5, batch_size=100)

Output:

Train on 10000 samples, validate on 1000 samples
Epoch 1/5
10000/10000 [==============================] - 265s 26ms/step - loss: 0.4772 - acc: 0.7861 - val_loss: 0.4103 - val_acc: 0.8300
Epoch 2/5
10000/10000 [==============================] - 275s 28ms/step - loss: 0.2904 - acc: 0.8900 - val_loss: 0.3819 - val_acc: 0.8480
Epoch 3/5
10000/10000 [==============================] - 270s 27ms/step - loss: 0.1899 - acc: 0.9345 - val_loss: 0.3689 - val_acc: 0.8480
Epoch 4/5
10000/10000 [==============================] - 270s 27ms/step - loss: 0.1305 - acc: 0.9580 - val_loss: 0.4750 - val_acc: 0.8570
Epoch 5/5
10000/10000 [==============================] - 265s 26ms/step - loss: 0.0855 - acc: 0.9718 - val_loss: 0.6072 - val_acc: 0.8110

Training accuracy reaches 97% while validation accuracy is only 81%, so the model is clearly overfitting.
Each epoch is slow in wall-clock time, but the loss drops quickly, so few epochs are needed.
Let's try raising the dropout rate to 0.5:

# Same architecture as above; only the dropout rate changes from 0.2 to 0.5.
model = Sequential()
model.add(Embedding(vocab_size, 64, input_length=maxword))

model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.5))

model.add(LSTM(64, return_sequences=True))
model.add(Dropout(0.5))

model.add(LSTM(32))
model.add(Dropout(0.5))

model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
print(model.summary())
# The logs below come from training this model with the same fit call as before:
model.fit(X_train, Y_train, validation_data=(X_test, Y_test), epochs=5, batch_size=100)

Output:

Train on 10000 samples, validate on 1000 samples
Epoch 1/5
10000/10000 [==============================] - 270s 27ms/step - loss: 0.6315 - acc: 0.6381 - val_loss: 0.4503 - val_acc: 0.8000
Epoch 2/5
10000/10000 [==============================] - 267s 27ms/step - loss: 0.4039 - acc: 0.8414 - val_loss: 0.4800 - val_acc: 0.8010
Epoch 3/5
10000/10000 [==============================] - 265s 27ms/step - loss: 0.2697 - acc: 0.9068 - val_loss: 0.4014 - val_acc: 0.8330
Epoch 4/5
10000/10000 [==============================] - 264s 26ms/step - loss: 0.1843 - acc: 0.9403 - val_loss: 0.4198 - val_acc: 0.8550
Epoch 5/5
10000/10000 [==============================] - 266s 27ms/step - loss: 0.1322 - acc: 0.9581 - val_loss: 0.5904 - val_acc: 0.8300

Overfitting is somewhat reduced, and the final validation accuracy improves by about 2% (0.8110 to 0.8300). Adjusting the size of the test set (only 1,000 samples here) should also have some effect, if only by making the accuracy estimate less noisy.
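
The original post stops at fit; as a follow-up sketch (assuming the X_test/Y_test arrays from the preprocessing step above), the trained model could be scored and used for prediction like this:

# Final loss and accuracy on the held-out data.
loss, acc = model.evaluate(X_test, Y_test, verbose=0)
print('test loss %.4f, test accuracy %.4f' % (loss, acc))

# Per-review sentiment probabilities (values near 1.0 mean positive).
probs = model.predict(X_test[:5])
print(probs.ravel())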

Summary:

This chapter introduced several kinds of neural networks: the multilayer perceptron (MLP), the convolutional neural network (CNN), and the long short-term memory model (LSTM). What they have in common is a large number of parameters, updated via backpropagation. CNNs and LSTMs, as different families of neural network models, need comparatively few parameters, which reflects a property they share: parameter sharing. This matches classical machine learning principles: the more constraints placed on the parameters or the model, the fewer degrees of freedom it has and the less prone it is to overfitting; conversely, the more parameters a model has, the more flexible it is and the more easily it fits noise, which hurts prediction. In practice, the best hyperparameters (for example, the number of layers, the number of units per layer, or the dropout probability) are chosen by cross-validation. Finally, note that sentiment analysis is at heart a classification problem, i.e. a form of supervised learning.
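
As an illustration of that selection step, here is a minimal sketch, not from the original post: build_model is a hypothetical helper wrapping the architecture above, used to compare a few dropout rates on the validation split.

# Hypothetical helper: rebuild the network above for a given dropout rate.
def build_model(dropout):
    model = Sequential()
    model.add(Embedding(vocab_size, 64, input_length=maxword))
    model.add(LSTM(128, return_sequences=True))
    model.add(Dropout(dropout))
    model.add(LSTM(64, return_sequences=True))
    model.add(Dropout(dropout))
    model.add(LSTM(32))
    model.add(Dropout(dropout))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
    return model

# Pick the dropout rate with the best validation accuracy.
for p in [0.2, 0.35, 0.5]:
    history = build_model(p).fit(X_train, Y_train, validation_data=(X_test, Y_test),
                                 epochs=5, batch_size=100, verbose=0)
    print('dropout=%.2f  best val_acc=%.4f' % (p, max(history.history['val_acc'])))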

Reposted from blog.csdn.net/m0_38106113/article/details/81478807