[Paper Reading] An Interactive Multi-Task Learning Network for End-to-End Aspect-Based Sentiment Analysis

An Interactive Multi-Task Learning Network for End-to-End Aspect-Based Sentiment Analysis

Simply put

This paper integrates aspect extraction and aspect sentiment classification into a single end-to-end task, and jointly trains it with a document-level corpus through partially shared parameters (multi-task learning) to alleviate the problem of small aspect-level datasets.

Summary

This paper presents an interactive multi-task learning network (IMN) that can jointly learn multiple related tasks at the same time, at both the token level and the document level. Unlike traditional multi-task learning methods that rely only on features shared across tasks, IMN introduces a message passing architecture in which a set of shared latent variables is used to iteratively pass information between the different tasks.
The experimental results are good.

Introduction

The conventional treatment of the ABSA problem splits it into two subtasks and runs them as a pipeline. There are also joint-training methods, but most of them merely tie the two tasks together through a unified tagging scheme, without making the connection between them explicit. In addition, they learn only from aspect-level corpora, which are small, and make no use of other readily available information, such as document-level sentiment analysis corpora, which contain linguistic and sentiment-related knowledge and are much easier to obtain.

This paper describes IMN, which addresses AE and AS simultaneously and makes better use of the connection between the two tasks. The aspect-level tasks are also trained jointly with document-level tasks to exploit information from larger corpora. In addition, a relatively new message passing mechanism is introduced: potentially useful information from the different tasks is sent back to the shared representations, and this information is then combined with the shared representations for subsequent rounds of processing. The process is iterative; as the number of iterations increases, the information can be refined and propagated across multiple links. Compared with most multi-task learning approaches, IMN not only allows features to be shared but also explicitly models the interactions between tasks through the message passing mechanism.

In addition, IMN integrates two document-level classification tasks, document-level sentiment classification (DS) and document-level domain classification (DD), and trains them jointly with AE and AS.

Related work

Multi-task learning

Conventional multi-task learning uses one shared network and two task-specific networks to obtain a shared feature space and two task-specific feature spaces. By learning related tasks in parallel with a shared semantic representation, multi-task learning can in some cases capture the correlation between tasks and improve model generalization.
However, traditional multi-task learning does not explicitly model the correlation between tasks: the two tasks interact only through back-propagated errors on the learned features, and this interaction is not controllable.
IMN not only shares representations but also explicitly models the relationships between tasks through the message passing mechanism.

Message passing mechanisms

Both CV and NLP have studied message passing graphical-model inference algorithms with neural network representations. Modeling these message passing algorithms with recurrent neural networks leads to recurrent network architectures. This work similarly studies iteratively updating and propagating information in a network, but the structure is designed to solve multi-task learning problems. The algorithm can roughly be viewed as an RNN-like structure, because each iteration updates the latent representations with the same network.

Proposed method

The structure is shown in Figure 1. The input is a sequence of tokens, from which the shared feature extractor $f_{\theta_s}$ first extracts features.
[Figure 1: IMN architecture]
On top of the word embeddings, several CNN layers further extract features.
The output of this component is a set of token-level representations shared across all tasks. These shared representations will later be updated with information propagated back from the different tasks.
They are in turn fed into the different task components. Each task has its own output component. For AE, the output is a tag sequence indicating whether each word is (part of) an aspect term or an opinion term; for AS, each token is tagged with a sentiment. For the document-level classification tasks, the input is a document and the output is a single label: DS outputs a sentiment and DD outputs a domain. In each iteration, information from the task outputs is sent back to update the set of shared representations. Information can also be transmitted between components; in particular, task information is passed from AE to AS. After T iterations of message passing, the output variables are used to make the final predictions.
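To make the data flow concrete, here is a minimal PyTorch-style sketch of this iterative architecture. The module layout, dimensions, and pooling choices are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class IMNSketch(nn.Module):
    """Illustrative skeleton of the IMN data flow (not the authors' code)."""
    def __init__(self, emb_dim=300, hidden=300, n_ae_tags=5, n_as_tags=3,
                 n_sentiments=3, n_domains=2, T=2):
        super().__init__()
        self.T = T
        # Shared feature extractor f_theta_s: CNN layers on top of word embeddings.
        self.shared_cnn = nn.Sequential(
            nn.Conv1d(emb_dim, hidden, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2), nn.ReLU(),
        )
        # Task-specific output components (simplified to single linear layers).
        self.ae_out = nn.Linear(hidden, n_ae_tags)      # aspect/opinion tagging
        self.as_out = nn.Linear(hidden, n_as_tags)      # token-level sentiment
        self.ds_out = nn.Linear(hidden, n_sentiments)   # document-level sentiment
        self.dd_out = nn.Linear(hidden, n_domains)      # document-level domain
        # Re-encoder that fuses the shared features with the previous predictions.
        fused = hidden + n_ae_tags + n_as_tags + n_sentiments + n_domains
        self.update = nn.Sequential(nn.Linear(fused, hidden), nn.ReLU())

    def forward(self, emb):                              # emb: (batch, seq, emb_dim)
        h = self.shared_cnn(emb.transpose(1, 2)).transpose(1, 2)   # shared reps
        for _ in range(self.T):                          # T rounds of message passing
            p_ae = self.ae_out(h).softmax(-1)
            p_as = self.as_out(h).softmax(-1)
            p_ds = self.ds_out(h.mean(1)).softmax(-1)    # mean pooling as a stand-in
            p_dd = self.dd_out(h.mean(1)).softmax(-1)
            doc = torch.cat([p_ds, p_dd], -1).unsqueeze(1).expand(-1, h.size(1), -1)
            # Send the predictions back and re-encode the shared representations.
            h = self.update(torch.cat([h, p_ae, p_as, doc], -1))
        # Final predictions from the representations after T iterations.
        return (self.ae_out(h).softmax(-1), self.as_out(h).softmax(-1),
                self.ds_out(h.mean(1)).softmax(-1), self.dd_out(h.mean(1)).softmax(-1))
```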

Aspect-level tasks

AE aims to extract all aspect terms and opinion terms, treated as sequence labeling with five tags {BA, IA, BP, IP, O} (beginning/inside of an aspect term, beginning/inside of an opinion term, and outside). AS is likewise a sequence labeling problem with three tags {pos, neg, neu}, where only aspect words are assigned sentiment labels. An example:

[Figure: example sentence labeled with AE and AS tags]
The AS component has an additional self-attention layer that can see the output of the AE layer, so that the AS task can make use of the opinion terms extracted by AE. The attention weights are computed as follows:
[Equation 1: AS self-attention weights]
In Equation 1, the score for a token pair is the product of three factors: the first measures the semantic relevance between the two tokens; the second is a distance term, which is larger when the two tokens are closer; and the third is the probability that token j is an opinion word, obtained by summing AE's predicted probabilities for the BP and IP tags. In this way, AS is directly influenced by AE.
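Putting the three factors together, a plausible shape for the score (my own notation; the paper's exact formula may differ) is:

$e_{ij} \propto \mathrm{sim}(h_i, h_j) \cdot \frac{1}{|i-j|} \cdot \big(P(y_j{=}\mathrm{BP}) + P(y_j{=}\mathrm{IP})\big)$, with $a_{ij}$ obtained by normalizing $e_{ij}$ over $j$.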
In AE, the word embedding, the initial shared representation, and the task-specific representation are concatenated to form the final representation of the i-th token.
In AS, the shared representation and the output of the self-attention are concatenated to form the final representation. Each task then computes label probabilities with a fully connected layer followed by a softmax.

Document-level tasks

To address the shortage of aspect-level training data, IMN introduces document-level corpora that carry sentiment information. Two document-level classification tasks are added to the joint training: document-level sentiment classification and document-level domain classification.
Both consist of a multi-layer CNN, an attention layer, and a decoding layer: the shared representations pass through the CNN, attention weights are computed with a softmax, and the resulting weighted-sum representation goes through a fully connected layer and a softmax.
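A minimal sketch of one such document-level head, assuming the attention score is a learned scalar per token (names and dimensions are illustrative):

```python
import torch
import torch.nn as nn

class DocHead(nn.Module):
    """Sketch of a document-level head: CNN -> softmax attention -> FC + softmax."""
    def __init__(self, hidden=300, n_classes=3):
        super().__init__()
        self.cnn = nn.Conv1d(hidden, hidden, kernel_size=3, padding=1)
        self.att_score = nn.Linear(hidden, 1)        # scalar attention score per token
        self.decode = nn.Linear(hidden, n_classes)

    def forward(self, shared):                       # shared: (batch, seq, hidden)
        h = torch.relu(self.cnn(shared.transpose(1, 2))).transpose(1, 2)
        a = self.att_score(h).softmax(dim=1)         # softmax over the tokens
        doc = (a * h).sum(dim=1)                     # attention-weighted sum
        return self.decode(doc).softmax(dim=-1)      # class probabilities (DS or DD)
```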

Message passing mechanism

To exploit the interactions between the tasks, the message passing mechanism aggregates the predictions made by the different tasks in the previous iteration and uses this information to update the shared representations in the current iteration.
[Equation: shared-representation update]
In the update, the shared representation is concatenated with the task predictions, and f is a fully connected layer with ReLU activation. a denotes the attention output of the document-level tasks; words that are more relevant to the sentiment or the domain are more likely to be opinion words or aspect words.
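Written out, a plausible form of this update (my notation; assuming the concatenation collects the previous shared representation, the AE/AS label distributions, and the document-level attention and prediction signals) is:

$h_i^{(t)} = f\big(\big[\, h_i^{(t-1)} : \hat{y}_i^{ae} : \hat{y}_i^{as} : a_i^{ds} : \hat{y}^{ds} : a_i^{dd} : \hat{y}^{dd} \,\big]\big)$, where $[:]$ denotes concatenation and $f$ is the ReLU-activated fully connected layer.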

Learning

Training alternates between the aspect level and the document level. At the aspect level:
[Equation: aspect-level loss]
T is the maximum number of message passing iterations, $N_a$ is the total number of aspect-level training instances, $n_i$ is the number of tokens in the i-th training instance, y is the one-hot encoding of the corresponding label, and l is the cross-entropy loss. For AS, only aspect terms are considered.
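With these symbols, one plausible reading of the aspect-level objective (a sketch, not necessarily the paper's exact formulation) is a token-level cross-entropy summed over both tagging tasks, computed on the outputs after the T message passing iterations:

$\mathcal{L}_a = \sum_{i=1}^{N_a} \sum_{j=1}^{n_i} \big[\, l(\hat{y}_{ij}^{ae}, y_{ij}^{ae}) + l(\hat{y}_{ij}^{as}, y_{ij}^{as}) \,\big]$, where the AS term is counted only for tokens belonging to an aspect term.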
For the document-level loss:
[Equation: document-level loss]
The message passing mechanism is not needed when training the document-level tasks.
The network is first pre-trained on the document-level data for a few epochs, and then trained alternately on the aspect-level and document-level data with ratio r. The algorithm is as follows (a code sketch is given after the figure):
D denotes the training data.
[Algorithm 1: training procedure]
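A minimal sketch of this training schedule, reusing the IMNSketch model above; the reading of the ratio r (here, r document-level steps per aspect-level step) and all function names are assumptions:

```python
import itertools
import torch.nn.functional as F

def train_imn(model, optimizer, doc_loader, aspect_loader,
              r=2, pretrain_epochs=3, epochs=10):
    """Sketch: pre-train on document-level data, then alternate between
    aspect-level and document-level updates with ratio r. Illustrative only."""
    doc_stream = itertools.cycle(doc_loader)          # endless document-level batches

    def doc_step(batch):                              # DS/DD update (no message passing needed)
        emb, y_ds, y_dd = batch
        _, _, p_ds, p_dd = model(emb)
        loss = F.nll_loss(p_ds.log(), y_ds) + F.nll_loss(p_dd.log(), y_dd)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    def aspect_step(batch):                           # AE + AS update with message passing
        emb, y_ae, y_as = batch                       # y_as uses -1 for non-aspect tokens
        p_ae, p_as, _, _ = model(emb)
        loss = (F.nll_loss(p_ae.log().flatten(0, 1), y_ae.flatten())
                + F.nll_loss(p_as.log().flatten(0, 1), y_as.flatten(), ignore_index=-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    for _ in range(pretrain_epochs):                  # document-level pre-training
        for batch in doc_loader:
            doc_step(batch)
    for _ in range(epochs):                           # alternating training with ratio r
        for batch in aspect_loader:
            aspect_step(batch)
            for _ in range(r):
                doc_step(next(doc_stream))
```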

Experiments

Data: the SemEval-2014 and SemEval-2015 datasets, plus two document-level datasets.
This part lists the experimental settings and hyperparameters.

Result analysis

Table 3 shows the results.
[Table 3: main results]
The results of the case study:
[Case study examples]
The comparison shows that introducing document-level information is useful.

Conclusion

The paper proposes an interactive multi-task learning network that jointly trains aspect term and opinion term extraction together with aspect-level sentiment classification. IMN also introduces a new message passing mechanism that makes better use of the links between the tasks. In addition, it draws valuable information for the aspect-level tasks from document-level training data. The model can also be applied to a number of other related tasks.
