"Natural Language Processing (NLP) paper reading" Chinese named entity recognition (Lattice-LSTM optimization model)

Source: AINLPer WeChat public account
Editor: ShuYini
Proofreader: ShuYini
Time: 2019-8-27

Introduction

    Today I'd like to share with you an article on Chinese NER. The paper analyzes the Lattice-LSTM model and, to address its shortcomings, proposes merging lexicon (word) information directly into the character vector representations, which improves both the model's performance and its computational efficiency.

First Blood

TITLE: Simplify the Usage of Lexicon in Chinese NER
Contributor: Fudan University
Paper: https://arxiv.org/pdf/1908.05969v1.pdf
Code: https://github.com/v-mipeng/LexiconAugmentedNER

Abstract

    For Chinese named entity recognition, and with practical applications in mind, this paper addresses the shortcomings of the Lattice-LSTM model (a complex model structure and low computational efficiency) by proposing a simple and effective method: incorporating lexicon information directly into the character vector representations. The method thereby avoids introducing a complicated sequence-modeling architecture to model the lexicon information; instead, only the character representation layer of the neural sequence-labeling model needs a slight adjustment. Evaluated on four benchmark Chinese NER datasets, the method achieves faster inference speed and better performance than Lattice-LSTM and its variants.

The core idea of the model

    The core objective of this paper is to find a simpler way to realize the idea behind Lattice-LSTM, namely, to incorporate all the words in a sentence that match the lexicon into a character-based NER model. The first design principle is fast inference speed. To this end, the paper proposes encoding the representations of the matched words, obtained from the lexicon, into the character representations. Compared with Lattice-LSTM, this method is simpler and easier to implement.

Models described in this paper

Lattice-LSTM Model

    Advantages: First, it preserves all possible matching words for each character. This avoids the error propagation caused by heuristically selecting a single matching result for the NER system. Second, it can introduce pre-trained word embeddings into the system, which is very helpful for the final performance.
    Disadvantages: the Lattice-LSTM model converts the input form of a sentence from a chain sequence into a graph (a lattice). This greatly increases the computational cost of modeling a sentence.

Proposed Model

    Based on an analysis of Lattice-LSTM's design, the model proposed in this paper should keep the chained (sequential) input form of a sentence while retaining the two advantages of the Lattice-LSTM model listed above.

    The paper first examines ExSoftword, but the analysis finds that ExSoftword cannot fully inherit the two advantages of Lattice-LSTM. First, it cannot introduce pre-trained word embeddings. Second, although it tries to keep all the matched results through multi-valued segmentation labels, it still loses a great deal of information. For this reason, this paper retains not only the possible segmentation labels of each character but also their corresponding matched words. Specifically, in the improved method, each character c of a sentence s corresponds to four word sets, marked by the four segmentation labels "BMES". The word set B(c) consists of all lexicon-matched words on s that begin with c. Similarly, M(c) consists of all matched words in the lexicon in whose middle c occurs, E(c) consists of all matched words that end with c, and S(c) is the single-character word consisting of c itself. If a word set is empty, a special word "None" is added to it to indicate this situation (a minimal sketch of this construction follows).
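    Below is a minimal sketch, in plain Python, of how these four word sets could be built for each character. The lexicon, the `max_word_len` cutoff, and all helper names are illustrative assumptions, not the authors' actual implementation.

```python
# A minimal sketch of the BMES word-set construction described above.
def build_bmes_sets(sentence, lexicon, max_word_len=4):
    """For each character, collect matched lexicon words into B/M/E/S sets."""
    n = len(sentence)
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, min(n, i + max_word_len) + 1):
            word = sentence[i:j]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[i]["S"].add(word)        # single-character word
            else:
                sets[i]["B"].add(word)        # word begins at character i
                sets[j - 1]["E"].add(word)    # word ends at character j-1
                for k in range(i + 1, j - 1):
                    sets[k]["M"].add(word)    # character k is inside the word
    for char_sets in sets:                    # pad empty sets with "None"
        for label in "BMES":
            if not char_sets[label]:
                char_sets[label].add("None")
    return sets

lexicon = {"中国", "中国人", "国人", "人"}
print(build_bmes_sets("中国人", lexicon))
```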

    The four word sets of each character are then compressed into fixed-dimensional vectors. In order to retain as much information as possible, the representations of the four word sets are concatenated into a whole, which is then appended to the character representation.
    In addition, the weight of each word is smoothed to increase the weights of infrequent words (see the pooling sketch below).
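    One plausible reading of this pooling step, sketched below, is a frequency-weighted sum per set, normalized over all four sets; the embedding table `word_emb` and the frequency dictionary `word_freq` are illustrative placeholders, not the authors' actual data structures.

```python
import numpy as np

def pool_word_sets(bmes_sets, word_emb, word_freq, dim):
    """Compress the four B/M/E/S word sets into one concatenated vector.

    Each set is pooled as a frequency-weighted sum of word embeddings,
    normalized by the total frequency Z over all four sets.
    """
    z_total = sum(word_freq.get(w, 1)
                  for label in "BMES" for w in bmes_sets[label])
    pooled = []
    for label in "BMES":
        v = np.zeros(dim)
        for w in bmes_sets[label]:
            v += word_freq.get(w, 1) * word_emb[w]   # weight by frequency
        pooled.append(4.0 / z_total * v)
    return np.concatenate(pooled)                    # 4*dim vector

def augment_char_repr(char_vec, bmes_sets, word_emb, word_freq, dim):
    """Append the pooled word-set vector to the character vector."""
    return np.concatenate(
        [char_vec, pool_word_sets(bmes_sets, word_emb, word_freq, dim)])
```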

    Finally, on top of the augmented character representations, any suitable neural sequence-labeling model can be used for sequence tagging, for example an LSTM-based sequence-modeling layer with a CRF inference layer for the labels.
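    As one example, here is a minimal PyTorch sketch of such a sequence-labeling layer: a BiLSTM encoder with a per-character label projection. The CRF inference layer the paper pairs it with is omitted for brevity, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """BiLSTM over augmented character vectors, projected to label scores."""
    def __init__(self, input_dim, hidden_dim, num_labels):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.proj = nn.Linear(2 * hidden_dim, num_labels)

    def forward(self, x):        # x: (batch, seq_len, input_dim)
        h, _ = self.lstm(x)      # (batch, seq_len, 2*hidden_dim)
        return self.proj(h)     # per-character label scores
```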

Experimental results

    [Figures omitted due to broken image links. Captions: (1) F1 scores of the proposed method under different choices of v^s and v^p; (2) F1 vs. the number of training iterations on OntoNotes, with and without bichar; (3) inference speed under different sequence-modeling layers compared with Lattice-LSTM and LR-CNN (sentences per second; higher is better); (4) performance on OntoNotes; (5) performance on MSRA.]

ACED

Attention

For more natural language processing knowledge, please follow the **AINLPer** WeChat public account, where the best content is delivered to you promptly.



Origin: blog.csdn.net/yinizhilianlove/article/details/104033082