Joint Learning of Entity Recognition and Relation Extraction Based on Neural Networks

Reprinted from: http://www.cnblogs.com/DjangoBlog/p/6782872.html

   The term "joint learning" is not new. In natural language processing, researchers have long used traditional machine-learning-based joint models for closely related NLP tasks, such as joint learning of entity recognition and entity normalization, or of word segmentation and part-of-speech tagging. Recently, researchers have carried out joint learning of entity recognition and relation extraction using neural network methods. I have read some of the related work and share my notes here. (This article cites the slide presentation of Suncong Zheng, an author of several of the papers discussed.)

Introduction

   The task discussed here is to extract entities and the relations between them (entity 1 - relation - entity 2 triples) from unstructured text, where the relations come from a predefined set of relation types, as in the example in the figure below.

   At present there are two families of methods. The first is the pipelined method (Pipelined Method): given an input sentence, first perform named entity recognition, then combine the recognized entities in pairs, run relation classification on each pair, and finally output the triples whose entity pairs hold a relation. The pipeline method has three shortcomings: 1) error propagation — mistakes made by the entity recognition module degrade the downstream relation classification; 2) the interaction between the two subtasks is ignored — for example, as in the figure, if a Country-President relation holds, we know the first entity must be of type Location and the second of type Person, but the pipeline cannot exploit such information; 3) unnecessary redundancy — since the recognized entities are paired exhaustively before relation classification, the many entity pairs that hold no relation introduce redundant information and raise the error rate.
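The pipeline's exhaustive pairing can be sketched in a few lines. This is a minimal illustration, not any paper's implementation: `pipeline_extract` and `toy_classifier` are hypothetical names, and the classifier here is a stand-in for a trained relation classifier. Note that for n recognized entities, n·(n−1) ordered pairs are classified, most of which hold no relation — the redundancy described above.

```python
from itertools import permutations

def pipeline_extract(entities, classify_relation):
    """Sketch of the pipelined method: pair every two recognized
    entities and run relation classification on each ordered pair."""
    triples = []
    for e1, e2 in permutations(entities, 2):   # n*(n-1) candidate pairs
        rel = classify_relation(e1, e2)        # hypothetical classifier
        if rel is not None:                    # most pairs are unrelated
            triples.append((e1, rel, e2))
    return triples

# Toy stand-in classifier: only one of the six candidate pairs is related.
def toy_classifier(e1, e2):
    if (e1, e2) == ("United States", "Trump"):
        return "Country-President"
    return None

entities = ["United States", "Trump", "Apple Inc"]
print(pipeline_extract(entities, toy_classifier))
```

With three entities, six classifier calls are made to recover a single triple; the ratio worsens quadratically as sentences grow.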

   Ideal joint learning works as follows: input a sentence and, through a joint model of entity recognition and relation extraction, directly obtain the relational entity triples. This overcomes the shortcomings of the pipeline method, but may require a more complex model structure.

Joint Learning

   Here I focus on joint learning based on neural network methods, and divide the current work into two categories: 1) Parameter Sharing and 2) Tagging Scheme. The following related work is covered.

2.1  Parameter sharing

   In the paper "Joint Entity and Relation Extraction Based on A Hybrid Neural Network", Zheng et al. perform joint learning through a shared underlying neural representation. Specifically, the input sentence is encoded by a shared word embedding layer followed by a bidirectional LSTM layer. On top of this shared encoding, an LSTM decodes for named entity recognition (NER) and a CNN performs relation classification (RC). Compared with the mainstream BiLSTM-CRF model for NER, here the previously predicted label is embedded and fed into the current decoding step, replacing the CRF layer as the way to handle label dependencies in NER. For relation classification, entities are paired according to the NER predictions, and a CNN classifies the text between each entity pair. The two subtasks thus interact mainly through the shared underlying parameters: during training, both tasks update the shared parameters via backpropagation, realizing the dependency between them.
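The mechanism by which parameter sharing couples the two subtasks can be shown with a deliberately tiny scalar model — this is my own toy illustration, not the paper's architecture. Here `ws` plays the role of the shared encoder, and `wn`/`wr` the NER and RC heads; both squared losses send gradients back into the shared weight.

```python
def joint_step(ws, wn, wr, x, y_ner, y_rc, lr=0.01):
    """Toy scalar model of parameter sharing: h = ws*x is the shared
    representation; the NER head predicts wn*h, the RC head wr*h.
    Both losses backpropagate into the shared weight ws."""
    h = ws * x
    e_ner = wn * h - y_ner                            # NER head error
    e_rc = wr * h - y_rc                              # RC head error
    g_ws = 2 * e_ner * wn * x + 2 * e_rc * wr * x     # summed shared gradient
    g_wn = 2 * e_ner * h                              # NER head gradient
    g_wr = 2 * e_rc * h                               # RC head gradient
    return ws - lr * g_ws, wn - lr * g_wn, wr - lr * g_wr

ws, wn, wr = joint_step(0.5, 0.5, 0.5, x=1.0, y_ner=1.0, y_rc=2.0)
print(ws, wn, wr)  # the shared weight moved under both tasks' gradients
```

The point of the sketch is only that `g_ws` sums contributions from both task losses — this is the sole channel through which the subtasks influence each other in the parameter-sharing approach.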

   The paper "End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures" follows the same idea of joint learning through parameter sharing; it differs only in the decoding models used for NER and RC. In this paper, Miwa and Bansal also share parameters: NER is decoded with a neural network, while RC incorporates dependency information, using a BiLSTM over the shortest path in the dependency tree for relation classification.

   According to the experiments in these two papers, joint learning through parameter sharing outperforms the pipelined method, improving the F1 score by about 1% on their tasks; it is a simple and general approach. The paper "A Neural Joint Model for Entity and Relation Extraction from Biomedical Text" applies the same idea to entity and relation extraction from biomedical text.

2.2  Tagging scheme

   As we have seen, however, the parameter-sharing method still consists of two subtasks that interact only through the shared parameters. Moreover, during training it is still necessary to run NER first and then pair the predicted entities for relation classification, so redundant information is still generated for entity pairs that hold no relation. Motivated by this, Zheng et al. proposed a novel tagging scheme for relation extraction in the paper "Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme", published at ACL 2017 and selected as an Outstanding Paper.

   With this new tagging scheme, they turned relation extraction, which originally involved both sequence labeling and classification, into a single sequence labeling problem. Relational entity triples are then obtained directly through an end-to-end neural network model.

   Each tag in their scheme is composed of the three parts shown in the figure below: 1) the word's position in the entity: {B (entity begin), I (entity inside), E (entity end), S (single-word entity)}; 2) the relation type, encoded according to the predefined relation types; 3) the entity role: {1 (entity 1), 2 (entity 2)}. Note that any word not belonging to an entity-relation triple is tagged "O".
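A small sketch makes the three-part tags concrete. This is my own illustrative encoder, not the paper's code; the relation abbreviation "CP" (for Country-President) and the function names are assumptions for the example.

```python
def position_tags(n):
    """B/I/E positions for an n-word entity, or S for a single word."""
    if n == 1:
        return ["S"]
    return ["B"] + ["I"] * (n - 2) + ["E"]

def tag_sentence(tokens, triples):
    """Encode (entity1, relation, entity2) triples as per-token tags of
    the form Position-Relation-Role, e.g. 'B-CP-1'; all other tokens 'O'."""
    tags = ["O"] * len(tokens)
    for ent1, rel, ent2 in triples:
        for entity, role in ((ent1, 1), (ent2, 2)):
            words = entity.split()
            # naive search for the entity's span in the sentence
            for i in range(len(tokens) - len(words) + 1):
                if tokens[i:i + len(words)] == words:
                    for j, pos in enumerate(position_tags(len(words))):
                        tags[i + j] = f"{pos}-{rel}-{role}"
                    break
    return tags

tokens = "The United States president Trump visited Apple".split()
print(tag_sentence(tokens, [("United States", "CP", "Trump")]))
# "United States" gets B-CP-1 / E-CP-1, "Trump" gets S-CP-2, the rest "O"
```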

   According to the tag sequence, entities with the same relation type are combined into triples as the final result. If a sentence contains more than one relation of the same type, entities are paired by the nearest-neighbor principle. The current tag set does not support overlapping entity relations.
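The reverse direction — recovering triples from a tag sequence — can be sketched as follows. Again this is an illustrative decoder of my own, assuming tags of the form `Position-Relation-Role` as above; the paper does not publish this exact routine, and real decoders must handle malformed tag sequences more carefully.

```python
def decode_triples(tokens, tags):
    """Collect tagged entity spans, group them by relation type, then
    pair each role-1 entity with the nearest role-2 entity (the
    nearest-neighbor principle for multiple same-type relations)."""
    spans = []  # (relation, role, text, start_position)
    i = 0
    while i < len(tags):
        if tags[i] == "O":
            i += 1
            continue
        pos, rel, role = tags[i].split("-")
        j = i
        if pos == "B":  # extend the span through I tags to the E tag
            while j + 1 < len(tags) and tags[j + 1].split("-")[0] in ("I", "E"):
                j += 1
                if tags[j].split("-")[0] == "E":
                    break
        spans.append((rel, role, " ".join(tokens[i:j + 1]), i))
        i = j + 1
    triples = []
    ones = [s for s in spans if s[1] == "1"]
    twos = [s for s in spans if s[1] == "2"]
    for rel, _, e1, p1 in ones:
        cands = [(abs(p2 - p1), e2) for r, _, e2, p2 in twos if r == rel]
        if cands:  # pick the closest same-relation role-2 entity
            triples.append((e1, rel, min(cands)[1]))
    return triples

tokens = "The United States president Trump visited Apple".split()
tags = ["O", "B-CP-1", "E-CP-1", "O", "S-CP-2", "O", "O"]
print(decode_triples(tokens, tags))
```

Because pairing is done greedily by distance within each relation type, a sentence where one entity participates in two triples (overlapping relations) cannot be represented — the limitation noted above.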

   The task thus becomes a sequence labeling problem, and the overall model is as shown below: a BiLSTM first encodes the sentence, and the LSTM decoder mentioned in the parameter-sharing section is used for decoding.

   The difference from classical models is the biased objective function. When the gold tag is "O", the normal objective is used; when the tag is not "O", i.e. a relational entity tag is involved, its influence on the loss is scaled up by a factor α. Experimental results show that this biased objective predicts entity-relation pairs more accurately.
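The biased objective amounts to an α-weighted negative log-likelihood. The sketch below is a simplified scalar version of that idea (per-token log-probabilities as plain dicts, α = 10 as an arbitrary example value), not the paper's implementation:

```python
import math

def biased_loss(log_probs, gold_tags, alpha=10.0):
    """Negative log-likelihood where non-'O' (relational) tags are
    weighted by alpha, so errors on entity/relation tags cost more
    than errors on the abundant 'O' tags."""
    loss = 0.0
    for lp, tag in zip(log_probs, gold_tags):
        weight = 1.0 if tag == "O" else alpha
        loss -= weight * lp[tag]
    return loss

# Two-token example: log-probabilities assigned to each candidate tag.
lp = [{"O": math.log(0.9), "S-CP-1": math.log(0.1)},
      {"O": math.log(0.2), "S-CP-1": math.log(0.8)}]
gold = ["O", "S-CP-1"]
print(biased_loss(lp, gold, alpha=1.0))   # plain NLL
print(biased_loss(lp, gold, alpha=10.0))  # relational tag dominates the loss
```

With α = 1 the objective reduces to the ordinary sequence-labeling loss; raising α pushes the model to spend capacity on the rare relational tags instead of the dominant "O" class.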

Summary

   Neural-network-based joint learning of entity recognition and relation extraction mainly falls into the two categories above. The parameter-sharing method is simple, easy to implement, and widely applicable in multi-task learning. The novel tagging scheme proposed by Zheng et al., although it still has limitations (such as the inability to handle overlapping entity relations), offers a new idea that truly merges the two subtasks into a single sequence labeling problem. Further improvements and extensions of this tagging scheme could advance the end-to-end relation extraction task.

 

References

[1] S. Zheng, Y. Hao, D. Lu, H. Bao, J. Xu, H. Hao, et al., Joint Entity and Relation Extraction Based on A Hybrid Neural Network, Neurocomputing (2017) 1–8.

[2] M. Miwa, M. Bansal, End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures, ACL (2016).

[3] F. Li, M. Zhang, G. Fu, D. Ji, A Neural Joint Model for Entity and Relation Extraction from Biomedical Text, BMC Bioinformatics 18 (2017).

[4] S. Zheng, F. Wang, H. Bao, Y. Hao, P. Zhou, B. Xu, Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme, ACL (2017).
