Automated knowledge graph construction techniques

Recently I looked through the technical solutions from the CCKS 2020 task on ontology-based automated construction of financial knowledge graphs.

First, some reference links:

Fifth place method summary

I haven't found write-ups of the other solutions yet. If you come across any, please leave a comment, thanks~~

 

This material is fairly fragmented, so there is no systematic introduction here.

 

Technical details

Here I want to sort out the techniques used in the design of these solutions.

1. Multi-instance learning

Multiple Instance Learning

Multi-Instance Learning

The training set is divided into multiple bags, each carrying a classification label and containing several instances. Multi-instance learning trains a classifier that can classify bags by learning from the instances inside them, and then applies that classifier to predict the labels of unseen bags.
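
To make the bag idea concrete, here is a minimal sketch of how labelled sentences might be grouped into bags. Keying each bag by its (head, tail) entity pair is the usual convention in relation extraction; the function name `build_bags`, the tuple layout, and the toy data are my own assumptions for illustration, not taken from any of the solutions.

```python
from collections import defaultdict


def build_bags(sentences):
    """Group (head, tail, relation, text) tuples into bags keyed by entity pair."""
    bags = defaultdict(lambda: {"label": None, "instances": []})
    for head, tail, relation, text in sentences:
        bag = bags[(head, tail)]
        bag["label"] = relation          # the label belongs to the bag, not to a sentence
        bag["instances"].append(text)    # individual sentences may or may not express it
    return dict(bags)


# Hypothetical toy data; entity and relation names are placeholders.
sentences = [
    ("CompanyA", "PersonB", "ceo_of", "PersonB is the chief executive of CompanyA."),
    ("CompanyA", "PersonB", "ceo_of", "PersonB visited a CompanyA branch last week."),
    ("CompanyC", "CityD", "headquartered_in", "CompanyC is headquartered in CityD."),
]
bags = build_bags(sentences)
for pair, bag in bags.items():
    print(pair, bag["label"], len(bag["instances"]))
```

Note that the second sentence in the first bag does not actually express the relation; this is exactly the kind of noisy instance that the bag-level label is meant to tolerate.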

 

During multi-instance training, there are three main ideas for how to select positive instances from a bag for relation classification:

① Based on the "at least one" assumption: assume that at least one sentence instance in the bag expresses the relation between the entity pair. The goal is then to train a classifier that takes the sentence in the bag most likely to express the relation as input and classifies the relation from it. This is the approach taken by the PCNN-One model.

② Based on the attention mechanism: compute the similarity between a vector representing the relation between the entities and each sentence instance in the bag to obtain a weight, then assign different weights to different instances and take the attention-weighted sum, which reduces the influence of noisy data. This is the approach taken by the PCNN-ATT model.

③ Use reinforcement learning to denoise the bag and screen out positive instances for relation classification.
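
The first two ideas can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the sentence vectors are hypothetical 2-d encodings, a plain dot product stands in for the model's score (the real models score instances with the trained network), and the function names are mine.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_at_least_one(instance_vectors, relation_vector):
    """'At least one': keep only the instance that scores highest for the relation."""
    scores = [dot(x, relation_vector) for x in instance_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, instance_vectors[best]


def attention_bag_vector(instance_vectors, relation_vector):
    """Attention: weight every instance by its similarity to the relation, then sum."""
    weights = softmax([dot(x, relation_vector) for x in instance_vectors])
    dim = len(instance_vectors[0])
    bag = [sum(w * x[d] for w, x in zip(weights, instance_vectors)) for d in range(dim)]
    return weights, bag


# Hypothetical sentence encodings for one bag; the second one plays the noisy instance.
bag_instances = [[2.0, 0.1], [-0.5, 1.0], [1.5, 0.3]]
relation_query = [1.0, 0.0]

best_index, _ = select_at_least_one(bag_instances, relation_query)
weights, bag_vector = attention_bag_vector(bag_instances, relation_query)
print(best_index, [round(w, 3) for w in weights])
```

The difference in design is visible here: the "at least one" variant throws away every instance but one, while the attention variant keeps all of them and merely gives the noisy instance a small weight.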

2. PCNN

PCNN relation extraction paper reading notes: showing as much detail as possible

PCNN for knowledge graph relation extraction: TensorFlow implementation

It appears to be a relation extraction model trained with multi-instance learning. I haven't looked at it closely; I still need to read the paper to really understand it~
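
For reference, the "P" in PCNN stands for piecewise: instead of max-pooling a convolution filter's output over the whole sentence, the output is split into three segments at the two entity positions and each segment is pooled separately. Below is a rough sketch of that one step for a single filter. Exactly where the segment boundaries fall and the 0.0 fallback for an empty segment are my assumptions; actual implementations may differ.

```python
def piecewise_max_pool(feature_map, head_pos, tail_pos):
    """Max-pool one filter's outputs over three segments split at the entity positions."""
    first, second = sorted((head_pos, tail_pos))
    segments = [
        feature_map[: first + 1],             # start of sentence up to the first entity
        feature_map[first + 1 : second + 1],  # between the entities, up to the second one
        feature_map[second + 1 :],            # everything after the second entity
    ]
    # Assumption: an empty segment (entity at the sentence end) pools to 0.0.
    return [max(seg) if seg else 0.0 for seg in segments]


# Hypothetical outputs of one convolution filter over a 7-token sentence.
feature_map = [0.2, 0.9, 0.1, 0.4, 0.7, 0.3, 0.8]
pooled = piecewise_max_pool(feature_map, head_pos=1, tail_pos=4)
print(pooled)
```

Each filter therefore contributes three values instead of one, which preserves some information about where a feature occurs relative to the two entities.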

3. Snowball

Neural Snowball for Few-Shot Relation Learning
A 2019 few-shot learning paper from Tsinghua and Tencent, applied to relation extraction.

 

4. Document-level relation extraction

Summary of document-level relation extraction methods

This involves a lot of graph-based methods.

5. Lexicon enhancement in NER

Lexicon enhancement methods in NER (LatticeLSTM, CGN, FLAT, Simple-Lexicon)

Simple-Lexicon and FLAT are recent papers from 2020 with relatively good results; Simple-Lexicon is relatively simple to implement.

Note that this is lexicon (vocabulary) enhancement, not data augmentation. The idea is that adding word-level information, such as word segmentation, to the model input helps the model achieve better NER results.
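
As an illustration of what "adding word information to the input" can look like, here is a minimal sketch in the spirit of Simple-Lexicon: for every character, collect the lexicon words in which it appears at the Beginning, in the Middle, at the End, or as a Single-character word. The function name, the toy lexicon, and the example sentence are my own; a real model would then turn these word sets into vectors and combine them with the character representation, which is not shown here.

```python
def bmes_word_sets(sentence, lexicon):
    """For each character, collect matched lexicon words by the character's position in them."""
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        for end in range(start + 1, n + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[start]["S"].add(word)
                continue
            sets[start]["B"].add(word)             # character begins the word
            sets[end - 1]["E"].add(word)           # character ends the word
            for mid in range(start + 1, end - 1):
                sets[mid]["M"].add(word)           # character is strictly inside the word
    return sets


# Toy lexicon and sentence, chosen only for illustration.
sentence = "南京市长江大桥"
lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥", "江"}
features = bmes_word_sets(sentence, lexicon)
for char, word_sets in zip(sentence, features):
    print(char, {k: sorted(v) for k, v in word_sets.items() if v})
```

The character 长 ends up both as the end of 市长 and as the beginning of 长江 and 长江大桥, so the model sees all candidate segmentations instead of committing to a single one.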

In fact, I have tried CRF before, and the character-level model worked better than the word-level model. On top of the single characters, the word segmentation result was added as a feature. Many models also encode part-of-speech tags and add them to the input; relatively speaking, this makes little difference for tasks such as classification but has a larger effect on NER. Although the character-level model performs well, in practical applications it still makes some errors when extracting entity boundaries.
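
A minimal sketch of the "characters plus segmentation feature" setup described above: the segmenter's output is converted into one B/M/E/S boundary tag per character, and that tag is attached to each character's feature dictionary. The feature names and the example sentence are hypothetical, and the segmentation is assumed to come from some external word segmenter.

```python
def segmentation_tags(words):
    """Turn a segmented sentence into one B/M/E/S boundary tag per character."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def char_features(words):
    """Per-character feature dicts: the character, its neighbours, and its segmentation tag."""
    chars = [c for word in words for c in word]
    tags = segmentation_tags(words)
    features = []
    for i, (char, tag) in enumerate(zip(chars, tags)):
        features.append({
            "char": char,
            "prev_char": chars[i - 1] if i > 0 else "<BOS>",
            "next_char": chars[i + 1] if i < len(chars) - 1 else "<EOS>",
            "seg_tag": tag,   # word boundary hint supplied by the segmenter
        })
    return features


# Hypothetical segmenter output for one sentence.
words = ["招商银行", "发布", "了", "年报"]
features = char_features(words)
print([f["seg_tag"] for f in features])
```

A list of per-character dictionaries like this is the input shape that dictionary-based CRF toolkits typically expect, and the `seg_tag` feature is one way to give a character-level model some help with entity boundaries.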

 

 

 


Origin blog.csdn.net/katrina1rani/article/details/112528704