Source: AINLPer WeChat public account
Editor: ShuYini
Proofreading: ShuYini
Date: 2019-8-27
Introduction
Today I'd like to share a paper on Chinese NER. It analyzes the Lattice-LSTM model and, to address its shortcomings, proposes merging lexicon information directly into the character vector representations, improving both the model's performance and its computational efficiency.
First Blood
Title: Simplify the Usage of Lexicon in Chinese NER
Contributor: Fudan University
Paper: https://arxiv.org/pdf/1908.05969v1.pdf
Code: https://github.com/v-mipeng/LexiconAugmentedNER
Abstract
For Chinese named entity recognition, and with practical applications in mind, this paper addresses the shortcomings of the Lattice-LSTM model (a complex structure and low computational efficiency) and proposes a simple and effective alternative: encoding lexicon information into the character vector representations. The method thus avoids introducing a complicated sequence-modeling architecture to model lexicon information; it only requires a fine adjustment to the character-representation layer of the neural sequence model. Evaluated on four Chinese NER benchmark datasets, the method achieves faster inference speed and better performance than LSTM-based variants.
The core idea of the model
The core objective of this paper is to find a simpler way to realize the idea behind Lattice-LSTM: incorporating all the words of a sentence that match a lexicon into a character-based NER model. The first design principle is fast inference speed. The authors therefore propose encoding the representations of the matched words obtained from the lexicon directly into the character representations. Compared with Lattice-LSTM, this method is simpler and easier to implement.
Models described in this article
Lattice-LSTM Model
Advantages: First, it preserves all possible matching words for each character. This avoids the error propagation caused by heuristically selecting a single matching result for the NER system. Second, it can introduce pre-trained word embeddings into the system, which is very helpful for final performance.
Disadvantages: The drawback of Lattice-LSTM is that it converts the input form of a sentence from a chain into a graph, which greatly increases the computational cost of modeling a sentence.
Proposed Model
Based on these considerations, the design in this paper keeps the chained input form of the sentence while retaining the two advantages of the Lattice-LSTM model.
The paper first analyzes the ExSoftword method and finds that it cannot fully inherit the two advantages of Lattice-LSTM. First, it cannot introduce pre-trained word embeddings. Second, although it tries to preserve the matching results through segmentation labels, it still loses a great deal of information. For this reason, the paper retains not only each character's possible segmentation labels but also the corresponding matched words. Specifically, each character c of a sentence s is associated with four word sets marked by the four segmentation labels "BMES". The word set B(c) consists of all lexicon-matched words in sentence s that begin with c. Similarly, M(c) consists of all matched words in which c appears in the middle, E(c) consists of all matched words that end with c, and S(c) is the single-character word comprised of c itself. If a word set is empty, a special word "NONE" is added to indicate this fact.
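The construction of the four "BMES" word sets can be sketched as follows. This is a minimal illustration, not the authors' released code; the brute-force substring matching and the `max_word_len` cutoff are simplifying assumptions (a trie would be used in practice).

```python
def build_word_sets(sentence, lexicon, max_word_len=4):
    """For each character position, collect matched lexicon words into
    B (word begins here), M (character is in the middle), E (word ends
    here), and S (single-character word)."""
    n = len(sentence)
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, min(n, i + max_word_len) + 1):
            word = sentence[i:j]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[i]["S"].add(word)
            else:
                sets[i]["B"].add(word)
                sets[j - 1]["E"].add(word)
                for k in range(i + 1, j - 1):
                    sets[k]["M"].add(word)
    # An empty set receives the special word "NONE", as in the paper.
    for char_sets in sets:
        for label in "BMES":
            if not char_sets[label]:
                char_sets[label].add("NONE")
    return sets
```

For example, with the sentence "中山西路" and a lexicon containing 中山, 山西, 西路, and 中山西路, the character 山 receives B = {山西}, M = {中山西路}, E = {中山}, and S = {NONE}.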
The four word sets of each character are then compressed into fixed-dimensional vectors. To retain as much information as possible, the representations of the four word sets are concatenated into a single vector, which is added to the character representation.
In addition, the authors apply smoothing to the word weights to increase the weights of infrequent words.
Finally, on top of the augmented character representations, any suitable neural sequence-labeling model can be used, such as an LSTM-based sequence-modeling layer with a CRF inference layer.
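The pooling and concatenation steps above can be sketched as follows. This is an assumed illustration rather than the official implementation: the embedding table, frequency counts, and the plain frequency-proportional weighting (the paper's additional smoothing of infrequent-word weights is omitted) are all simplifications.

```python
import numpy as np

def soft_lexicon_features(char_sets, word_emb, word_freq, char_vec):
    """Compress one character's four word sets (B/M/E/S) into vectors via
    frequency-weighted pooling, concatenate them, and append the result
    to the character's own embedding (a sketch, not the authors' code)."""
    pooled = []
    for label in "BMES":
        words = sorted(char_sets[label])
        # Weights proportional to word frequency (unseen words default to 1);
        # the paper further smooths these to boost infrequent words.
        freqs = np.array([word_freq.get(w, 1) for w in words], dtype=float)
        weights = freqs / freqs.sum()
        vecs = np.stack([word_emb[w] for w in words])
        pooled.append(weights @ vecs)  # weighted sum over the word set
    # Concatenate the four pooled set vectors after the character vector.
    return np.concatenate([char_vec] + pooled)
```

The resulting vector has the character-embedding dimension plus four times the word-embedding dimension, and feeds directly into the sequence-modeling layer.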
Experimental results
Figures (original images failed to load): F1 scores of the proposed method with and without bichar features; F1 versus the number of training iterations on OntoNotes; inference speed compared with Lattice-LSTM and LR-CNN under different sequence-modeling layers (sentences per second, higher is better); performance on OntoNotes; performance on MSRA.
ACED
Attention
For more natural language processing knowledge, please follow the **AINLPer** public account, where the best content is delivered instantly.