[Huawei Cloud Sharing] Processing methods for sequence features, part 2: a convolutional neural network-based method

Abstract: This article introduces the second method for processing sequence features, the convolutional neural network-based method, and analyzes why convolutional neural networks are good at extracting local features.

Foreword

The previous article introduced attention-based methods for processing sequence features. This one introduces a convolutional neural network-based approach, namely the TextCNN method. For a detailed introduction to sequence features and their application background, see part one. As a brief review of the definition: when using an app or website, a user acts on items, for example by clicking items of interest, or by bookmarking or purchasing them. These actions usually indicate that the user is interested in those items. Arranged along a timeline, the items the user interacted with form the user's sequence of items of interest. Data with this kind of temporal relationship are the sequence features we deal with, as shown in Figure 1, which takes the user's sequence of items of interest as the example.

1580213390841414.png

▲ Figure 1. A user's sequence of items of interest
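As a rough illustration of how such a sequence can be assembled, the sketch below groups a small interaction log by user and orders it by time. The log format, the field names, the `build_sequences` helper, and the `max_len` truncation are all assumptions made for this example; they are not taken from the original article.

```python
from collections import defaultdict

# Hypothetical interaction log: (user_id, item_id, timestamp, action).
events = [
    ("u1", "mask", 3, "purchase"),
    ("u1", "novel", 1, "click"),
    ("u2", "phone", 2, "click"),
    ("u1", "alcohol", 4, "purchase"),
    ("u1", "tea", 2, "favorite"),
]

def build_sequences(events, max_len=3):
    """Group events by user, order them by time, keep the latest max_len items."""
    by_user = defaultdict(list)
    for user, item, ts, _action in events:
        by_user[user].append((ts, item))
    return {
        user: [item for _ts, item in sorted(pairs)][-max_len:]
        for user, pairs in by_user.items()
    }

sequences = build_sequences(events)
print(sequences["u1"])  # ['tea', 'mask', 'alcohol']
```

Each item in the resulting list would then be mapped to an embedding vector before any further processing.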

A user's behavior history may contain stretches of locally continuous behavior. For example, during the recent severe novel coronavirus epidemic, a user may have bought masks, alcohol, and other disinfection and protective supplies several times in a row over the last few days. A recommender can use this local information to recommend related protective and disinfection products. Because a shallow convolutional neural network has a relatively small receptive field, it is good at capturing local information, so the local behavior in a sequence feature can be modeled with a shallow convolutional neural network. TextCNN is a convolutional neural network that models sentence sequences and also operates on an embedding matrix, so we choose TextCNN to process the user's sequence of items of interest.

TextCNN principle

A schematic of TextCNN modeling and classifying a sentence sequence is shown in Figure 2:

1580213462869169.png

▲ Figure 2. TextCNN schematic [1]

1. Embedding: each word, including punctuation, is mapped to a 5-dimensional embedding vector. The sentence length is 7, so this step yields a 7 × 5 matrix, as shown in the first column of Figure 2.

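2. Convolution: the matrix passes through one-dimensional convolution layers with kernel_sizes of 2, 3, and 4. Each kernel_size has 2 kernels, so the convolution produces 6 outputs. The kernels are shown in the second column of Figure 2 and the convolution results in the third column. Note that, as Figure 2 shows, the kernel heights are 2, 3, and 4, while the kernel width always equals the dimension of the embedding vector. This is because each vector represents one word, and since the word is the smallest unit of text, its information should be kept intact during feature extraction.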

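3. MaxPooling: MaxPooling is applied to each of the 6 convolution results and the outputs are concatenated, giving a 6-dimensional feature vector, as shown in the fourth column of Figure 2.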

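4. Fully Connect and Softmax: a fully connected layer with 2 neurons is added after the 6-dimensional feature vector, followed by softmax normalization to obtain the class probabilities, as shown in the fifth column of Figure 2.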
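The four steps can be traced with a minimal sketch in plain Python, using the same sizes as Figure 2. It shows only the forward pass and the resulting shapes. The random weights, the ReLU activation, the missing bias terms, and the helper names (`rand_matrix`, `conv1d`, `softmax`) are assumptions for illustration; a real model would learn its weights in a deep learning framework.

```python
import math
import random

random.seed(0)

SEQ_LEN, EMB_DIM = 7, 5      # 7 tokens, 5-dimensional embeddings (as in Figure 2)
KERNEL_SIZES = (2, 3, 4)     # kernel heights
KERNELS_PER_SIZE = 2         # two kernels per height, 6 feature maps in total
NUM_CLASSES = 2

def rand_matrix(rows, cols):
    """Stand-in for learned parameters: uniform random values."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def conv1d(seq, kernel):
    """Slide a (k x EMB_DIM) kernel down the sequence; one value per window."""
    k = len(kernel)
    out = []
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        total = sum(w * x
                    for krow, xrow in zip(kernel, window)
                    for w, x in zip(krow, xrow))
        out.append(max(0.0, total))  # ReLU (assumed activation)
    return out

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Step 1: embedding matrix, 7 x 5.
embeddings = rand_matrix(SEQ_LEN, EMB_DIM)
# Step 2: 6 kernels, each as wide as the embedding vector.
kernels = [rand_matrix(k, EMB_DIM)
           for k in KERNEL_SIZES for _ in range(KERNELS_PER_SIZE)]
feature_maps = [conv1d(embeddings, kern) for kern in kernels]
# Step 3: max-pool each feature map, then concatenate into 6 values.
features = [max(fm) for fm in feature_maps]
# Step 4: fully connected layer with 2 neurons, then softmax.
fc_weights = rand_matrix(NUM_CLASSES, len(features))
logits = [sum(w * f for w, f in zip(row, features)) for row in fc_weights]
probs = softmax(logits)

print([len(fm) for fm in feature_maps])  # [6, 6, 5, 5, 4, 4]
print(len(features), len(probs))         # 6 2
```

A kernel of height k yields 7 - k + 1 window positions, which is why the feature maps have different lengths before pooling and a single value each after it.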

Application to sequence features

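What we mainly borrow is TextCNN's feature-extraction method described above, that is, the way it turns the embedding matrix into the final 6-dimensional feature vector, as shown in Figure 3: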

1580213514843457.png

▲ Figure 3. Sequence feature processing with TextCNN

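When TextCNN is applied to sequence features, the kernel_sizes of the one-dimensional convolution are generally set to 2 or 3, or both are used together, and the number of kernels for each kernel size is generally 1. Kernel sizes of 2 and 3 extract local information over different ranges, which keeps the features diverse.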
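To make the effect of the two kernel sizes concrete, here is a small hand-checkable sketch with one kernel of size 2 and one of size 3. The toy 2-dimensional item embeddings, the all-ones kernels, and the `conv_max` helper are assumptions chosen so the arithmetic is easy to follow; no activation or bias is applied.

```python
def conv_max(seq, kernel):
    """Apply one (k x dim) kernel at every window position, then max-pool."""
    k = len(kernel)
    responses = [
        sum(w * x
            for krow, xrow in zip(kernel, seq[i:i + k])
            for w, x in zip(krow, xrow))
        for i in range(len(seq) - k + 1)
    ]
    return max(responses), responses

# Toy embedding matrix for a sequence of 4 items (hypothetical values).
seq = [[1, 0], [0, 1], [1, 1], [0, 0]]

kernel_2 = [[1, 1]] * 2   # looks at 2 adjacent items
kernel_3 = [[1, 1]] * 3   # looks at 3 adjacent items

feat_2, resp_2 = conv_max(seq, kernel_2)
feat_3, resp_3 = conv_max(seq, kernel_3)

print(resp_2, resp_3)     # [2, 3, 2] [4, 3]
print([feat_2, feat_3])   # [3, 4]
```

With one kernel per size, the sequence feature is reduced to a 2-dimensional vector, one value per local range.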

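In addition, to make up for the weakness in extracting global information, TextCNN can be combined with max/mean/sum pooling to extract global features, so that the extracted features contain both global and local information [2], as shown in Figure 4. The global pooling and TextCNN share the embedding matrix of the sequence feature.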

1580213547705885.png

▲ Figure 4. Sequence feature processing combining TextCNN and global pooling
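The combination in Figure 4 can be sketched as follows: one embedding lookup feeds both the convolution branch and the global pooling branch, and the two outputs are concatenated. The embedding table, the all-ones kernels, and the function names (`conv_max`, `global_pool`, `sequence_features`) are assumptions for illustration only.

```python
# Hypothetical shared embedding table (item -> 2-dimensional vector).
EMBEDDINGS = {
    "tea": [1.0, 0.0],
    "mask": [0.0, 2.0],
    "alcohol": [2.0, 2.0],
}

def conv_max(matrix, kernel):
    """Local branch: one kernel over every window, then max pooling."""
    k = len(kernel)
    return max(
        sum(w * x
            for krow, xrow in zip(kernel, matrix[i:i + k])
            for w, x in zip(krow, xrow))
        for i in range(len(matrix) - k + 1)
    )

def global_pool(matrix, mode="mean"):
    """Global branch: pool each embedding dimension over the whole sequence."""
    columns = list(zip(*matrix))
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "sum":
        return [sum(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")

def sequence_features(items, kernels, mode="mean"):
    matrix = [EMBEDDINGS[item] for item in items]  # shared by both branches
    local = [conv_max(matrix, kern) for kern in kernels]
    return local + global_pool(matrix, mode)

kernels = [[[1.0, 1.0]] * 2, [[1.0, 1.0]] * 3]  # kernel sizes 2 and 3
features = sequence_features(["tea", "mask", "alcohol"], kernels, mode="sum")
print(features)  # [6.0, 7.0, 3.0, 4.0]
```

The first two values are the local features from the kernels of size 2 and 3, and the last two are the sum-pooled global features.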

Summary

Because a convolutional neural network computes by sliding convolution kernels over its input, it has a natural advantage in extracting local features, which makes it a preferred method for modeling local information. It can also be combined with global pooling to extract global features, which compensates for its weakness in global feature extraction and increases feature diversity. The multi-valued categorical features introduced in an earlier article can likewise be processed with TextCNN to extract local features.

References

[1] Convolutional Neural Networks for Sentence Classification

[2] Convolutional Sequence Embedding Recommendation Model

Author: wanderist

Origin blog.csdn.net/devcloud/article/details/104206741