[Paper reading notes] A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification

     In this paper, the authors run extensive hyperparameter-tuning experiments on TextCNN (introduced in the original paper Convolutional Neural Networks for Sentence Classification) and distill them into a number of concrete recommendations for applying TextCNN to text classification. Many blog posts analyzing TextCNN already exist online, so only a brief summary is given here.

Figure: TextCNN network structure.

Network description: the input is a sentence in which each word has been mapped to a vector via one-hot encoding, word2vec, or GloVe, so a sentence forms a matrix of shape (number of words × embedding dimension), much like an image. Convolutional filters of several sizes (region_size × embedding dimension in the paper) slide over this matrix; each filter outputs a ((sentence length − region_size + 1) × 1) feature vector. Max-pooling then reduces each filter's output to a single value, and the pooled values of all filters are concatenated for the final classification.
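The shape bookkeeping above can be sketched in plain Python (a toy illustration, not the paper's implementation: the filter weights here are fixed at 1.0 rather than learned, and a single filter is used per region size):

```python
def conv1d_valid(matrix, region_size):
    """'Valid' convolution of one filter over a (sentence_len x embed_dim)
    matrix: each step sums a (region_size x embed_dim) window weighted by
    the filter (all-ones here), giving sentence_len - region_size + 1 values."""
    sentence_len = len(matrix)
    embed_dim = len(matrix[0])
    out = []
    for i in range(sentence_len - region_size + 1):
        window_sum = sum(matrix[i + r][j]
                         for r in range(region_size)
                         for j in range(embed_dim))
        out.append(window_sum)
    return out

def textcnn_features(matrix, region_sizes):
    """One filter per region size; global max-pool each feature vector to a
    single value and concatenate the pooled values for the classifier."""
    return [max(conv1d_valid(matrix, rs)) for rs in region_sizes]

# Toy sentence: 5 words, embedding dimension 3.
sentence = [[0.1 * (i + j) for j in range(3)] for i in range(5)]
feats = textcnn_features(sentence, region_sizes=[2, 3, 4])
```

With a 5-word sentence, a region_size of 3 yields a feature vector of length 5 − 3 + 1 = 3, and the three filters produce a 3-dimensional concatenated representation.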

Tunable network parameters: the word-vector representation of the input, the filter region_size, the number of feature maps, the type of activation function, the pooling strategy, and the regularization method.

Experimental results:

1. Using pre-trained word vectors (word2vec or GloVe) improves classification performance significantly; if there is enough task data, training the vectors from scratch on the task can do even better. The paper also tries combining word2vec and GloVe vectors, but the gain is not obvious.

2. The filter region_size has a large effect on performance and is the parameter to focus tuning on. The paper's advice: first run a line search with a single filter to find the best region_size, usually in the range 1–10. Once the best size is found, combining several filters whose region_size equals, or is close to, the best size can yield a clear further improvement. When combining, do not let the region_sizes deviate too far from the best size.

3. For each region_size, the number of feature maps is worth exploring, but beyond about 600 the improvement is marginal and performance may even degrade.

4. Compared with global/local average pooling, k-max pooling, and local max-pooling, global max-pooling works best.

5. Regularization contributes little to performance; the paper conjectures this is because the word vectors themselves already help mitigate overfitting.

6. ReLU and tanh activations work better than the other choices.

7. Use cross-validation wherever possible to account for the variance across individual training runs.
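The line-search-then-combine procedure from point 2 can be sketched as follows; `evaluate` is a hypothetical callback standing in for training and validating a single-filter model, and the toy score curve is made up purely for illustration:

```python
def line_search_region_size(evaluate, candidates=range(1, 11)):
    """Line search over region sizes 1-10: score a single-filter model for
    each candidate region_size and keep the best, then propose a combination
    of sizes at and around the best one, as the paper recommends."""
    scores = {rs: evaluate(rs) for rs in candidates}
    best = max(scores, key=scores.get)
    combined = [rs for rs in (best - 1, best, best + 1) if rs in scores]
    return best, combined

# Toy validation-accuracy curve peaking at region_size 4
# (a stand-in for actually training and evaluating the model).
best, combined = line_search_region_size(lambda rs: 1.0 - abs(rs - 4) * 0.05)
```

In a real run, `evaluate` would train TextCNN with one filter of the given size and return held-out accuracy; the combination step then reuses sizes close to the winner.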
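The pooling strategies compared in point 4 differ only in how a filter's feature map is reduced; a minimal plain-Python sketch (the toy feature map `fm` is made up):

```python
def global_max_pool(feature_map):
    """Keep only the single largest activation (the paper's best strategy)."""
    return max(feature_map)

def global_average_pool(feature_map):
    """Average all activations instead of taking the maximum."""
    return sum(feature_map) / len(feature_map)

def k_max_pool(feature_map, k):
    """Keep the k largest activations, preserving their original order."""
    top = sorted(range(len(feature_map)),
                 key=lambda i: feature_map[i], reverse=True)[:k]
    return [feature_map[i] for i in sorted(top)]

# Toy feature map produced by one filter over one sentence.
fm = [0.2, 1.5, -0.3, 0.9, 1.1]
```

Local variants apply the same reductions over windows of the feature map rather than the whole of it.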

Origin blog.csdn.net/cskywit/article/details/93169329