Method summary of the paper "A Method of Extracting Financial Event Information Based on Lexical-Semantic Models"

1. Basic methods of information extraction:

1. The rule-based approach

1. Advantages: little annotated data is required, and the rules are interpretable and easy to adjust.

2. Disadvantages: poor flexibility, low recall, and poor portability.

3. Main methods: regular expressions, semi-structured trees, lexical-syntactic models, and lexical-semantic models.
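
As a rough illustration of the regular expression method and of its low recall, the sketch below matches one acquisition pattern. The pattern, the trigger verbs, and the example sentences are invented for this note and do not come from the paper.

```python
import re

# Hypothetical rule: <acquirer> <trigger verb> <target> for <amount>.
# Company names are approximated as runs of capitalized words.
ACQUISITION = re.compile(
    r"(?P<acquirer>[A-Z][\w&]*(?: [A-Z][\w&]*)*) "
    r"(?:acquired|bought|purchased) "
    r"(?P<target>[A-Z][\w&]*(?: [A-Z][\w&]*)*) "
    r"for (?P<amount>\$\d+(?:\.\d+)? (?:million|billion))"
)


def extract_acquisition(sentence):
    """Return the matched slots as a dict, or None if the rule does not fire."""
    match = ACQUISITION.search(sentence)
    return match.groupdict() if match else None


# The rule fires on the phrasing it was written for ...
hit = extract_acquisition("Acme Corp acquired Beta Labs for $2.5 billion on Monday.")
# ... but misses a paraphrase of the same event, which is the low-recall problem.
miss = extract_acquisition("Acme Corp took over Beta Labs in a $2.5 billion deal.")
```

The rule is easy to read and adjust, but every new phrasing needs another hand-written pattern.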

2. The machine-learning-based approach

1. Main difficulties: the effectiveness of the learned model depends mainly on the size of the training corpus and the quality of its annotation. Running time also grows linearly with the number of symbol categories in the corpus, which reduces efficiency.

2. Main work of the paper:

1. Use the Java Annotation Pattern Engine (JAPE) language, part of a general-purpose natural language text processing framework, to develop the lexical-semantic rule model and implement semantic annotation.

2. Use a hierarchical lexical-semantic rule model to extract a large amount of semantic element information related to financial events from financial news text, such as event category, participants, time, location, and transaction amount.
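
One way to picture the extracted semantic elements is as slots of an event record. The sketch below is only an assumption about how such a record could look: the class name `FinancialEvent`, the annotation type names, and the sample annotations are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FinancialEvent:
    """Hypothetical container for the semantic elements listed above."""
    category: Optional[str] = None
    participants: List[str] = field(default_factory=list)
    time: Optional[str] = None
    location: Optional[str] = None
    amount: Optional[str] = None


# Assumed mapping from annotation type to single-valued slot name.
SINGLE_SLOTS = {
    "EventCategory": "category",
    "Time": "time",
    "Location": "location",
    "Amount": "amount",
}


def assemble_event(annotations: List[Tuple[str, str]]) -> FinancialEvent:
    """Fill an event record from (annotation_type, text) pairs."""
    event = FinancialEvent()
    for ann_type, text in annotations:
        if ann_type == "Participant":
            event.participants.append(text)
        elif ann_type in SINGLE_SLOTS:
            setattr(event, SINGLE_SLOTS[ann_type], text)
    return event


event = assemble_event([
    ("Participant", "Acme Corp"),
    ("EventCategory", "acquisition"),
    ("Participant", "Beta Labs"),
    ("Amount", "$2.5 billion"),
    ("Time", "Monday"),
])
```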

Innovations:

1. A new method based on deep learning (Word2Vec) automatically generates the concept thesaurus, avoiding the time-consuming and laborious manual compilation that traditional methods require.

2. A hierarchical lexical-semantic rule annotation model driven by a finite state machine achieves layer-by-layer extraction and abstraction of event semantic annotation information.
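
The idea of generating a concept thesaurus from word vectors can be sketched as a nearest-neighbor expansion around seed words. This summary does not describe the paper's actual procedure, so everything below is an assumption: the three-dimensional vectors are made up (real ones would come from a Word2Vec model trained on financial news), and the concept names and the 0.95 threshold are arbitrary.

```python
import math

# Toy vectors standing in for trained word embeddings (invented values).
VECTORS = {
    "acquire":  (0.90, 0.10, 0.05),
    "purchase": (0.85, 0.15, 0.10),
    "buy":      (0.80, 0.20, 0.05),
    "merge":    (0.60, 0.60, 0.10),
    "dividend": (0.05, 0.10, 0.95),
    "payout":   (0.10, 0.05, 0.90),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def expand_concept(seed, vectors, threshold=0.95):
    """Collect the words whose vector lies close to the seed word's vector."""
    seed_vec = vectors[seed]
    return sorted(
        word for word, vec in vectors.items()
        if word != seed and cosine(seed_vec, vec) >= threshold
    )


# Hypothetical concept names, each grown from a single seed word.
thesaurus = {
    "ACQUISITION": expand_concept("acquire", VECTORS),
    "DIVIDEND": expand_concept("dividend", VECTORS),
}
```

With these toy vectors, "merge" stays out of the ACQUISITION concept because its similarity to "acquire" falls below the threshold. In practice the threshold would need tuning, and the generated entries would likely still need review.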

Advantage:

With a hierarchical annotation structure, users can remove or insert an annotation layer in the lexical-semantic rule file as needed, so a rule can be written without considering how a particular annotation type affects the rule grammar. This greatly simplifies rule writing.
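
The layered design can be sketched in a few lines of Python. This is not the paper's JAPE implementation. It is a minimal stand-in in which each layer is a function over (annotation type, text) pairs and the upper layer is driven by a small finite state machine. The lexicon, the type names, and the transition table are hypothetical.

```python
# Hypothetical concept lexicon: surface word -> concept type.
CONCEPTS = {
    "Acme": "Company", "Beta": "Company",
    "acquired": "AcquireVerb", "bought": "AcquireVerb",
}

# Finite state machine over annotation types: state -> {type: next state}.
# It accepts the sequence Company, AcquireVerb, Company.
TRANSITIONS = {0: {"Company": 1}, 1: {"AcquireVerb": 2}, 2: {"Company": 3}}
ACCEPT = 3


def lookup_layer(annotations):
    """Lower layer: relabel tokens with their concept type when known."""
    return [(CONCEPTS.get(text, kind), text) for kind, text in annotations]


def match_at(annotations, start):
    """Run the machine from `start`; return the end index if it accepts, else None."""
    state, i = 0, start
    while state != ACCEPT and i < len(annotations):
        next_state = TRANSITIONS[state].get(annotations[i][0])
        if next_state is None:
            return None
        state, i = next_state, i + 1
    return i if state == ACCEPT else None


def event_layer(annotations):
    """Upper layer: collapse each accepted span into one AcquisitionEvent."""
    out, i = [], 0
    while i < len(annotations):
        end = match_at(annotations, i)
        if end is None:
            out.append(annotations[i])
            i += 1
        else:
            span_text = " ".join(text for _, text in annotations[i:end])
            out.append(("AcquisitionEvent", span_text))
            i = end
    return out


def run_pipeline(text, layers):
    """Tokenize on whitespace, then apply the layers in order."""
    annotations = [("Token", word) for word in text.split()]
    for layer in layers:
        annotations = layer(annotations)
    return annotations


sentence = "Yesterday Acme acquired Beta"
with_lookup = run_pipeline(sentence, [lookup_layer, event_layer])
without_lookup = run_pipeline(sentence, [event_layer])
```

Because the layers are just entries in a list, one can be inserted or removed without rewriting the others. The event rule only refers to annotation types, never to surface words, so without the lookup layer it simply finds nothing to match.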

3. Evaluation:

The method effectively addresses the shortcomings of traditional approaches based on lexical or lexical-syntactic rules: excessive dependence on the results of syntactic analysis, inability to accurately describe the relations among synonyms, antonyms, and hypernyms, and inability to conceptualize and abstract vocabulary according to the business needs of a specialized domain.

Origin blog.csdn.net/qq_43631037/article/details/112986687