Industry NER

1. Background

In industry settings such as search, named entity recognition (NER) has to do more than reach a high F1 score on every label: it also has to account for cost, efficiency, and how downstream applications use the extracted entities. NER on search queries (query NER) generally has the following characteristics:

  • Many new entities, growing fast: the business evolves quickly, and new stores, products, and service categories appear constantly. User queries also mix in non-standard expressions, abbreviations, and trending words (such as "careful" and "sucking cats"), which makes it hard to achieve NER with both high accuracy and high coverage.
  • Strong domain dependence: entity recognition in search is tightly coupled to business supply. Beyond general semantics, business knowledge is needed to make the call. For example, "I cut my hair" reads in general usage as a generic description, but in search it is a business entity.
  • High performance requirements: very little time passes between the user submitting a search and seeing the results. As a basic module of DQU, NER has to finish within milliseconds. Recent research and practice based on deep networks has significantly improved NER quality, but these models are computationally heavy and slow at prediction. Optimizing the model so that it meets NER's latency budget is a major challenge in practice.

2. Method

The overall framework combines "entity dictionary matching + model prediction".

  1. Train a CRF scorer to score the entity dictionary's matching result, and fall back to model prediction when the dictionary has no match or the matching result scores low.
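The dictionary-first routing described above can be sketched as follows. This is a minimal illustration, not the original system: `DictMatch`, `recognize`, the `min_score` threshold of 0.8, and the toy dictionary and model are all hypothetical names and values, and the fixed scores stand in for what a trained CRF scorer would produce.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# (start, end, label) with an exclusive end offset; an assumed representation.
Entity = Tuple[int, int, str]


@dataclass
class DictMatch:
    entities: List[Entity]
    score: float  # confidence from the scorer, assumed to lie in [0, 1]


def recognize(
    query: str,
    match_dictionary: Callable[[str], Optional[DictMatch]],
    predict_with_model: Callable[[str], List[Entity]],
    min_score: float = 0.8,  # hypothetical threshold, to be tuned per business
) -> Tuple[List[Entity], str]:
    """Return the entities plus the name of the path that produced them."""
    match = match_dictionary(query)
    # Trust the dictionary only when it matched and the scorer is confident.
    if match is not None and match.entities and match.score >= min_score:
        return match.entities, "dictionary"
    # No match, or a low-scoring match: fall back to the slower model.
    return predict_with_model(query), "model"


# Toy stand-ins so the sketch runs end to end; the scores are made up.
TOY_LEXICON: Dict[str, Tuple[str, float]] = {
    "starbucks": ("BRAND", 0.95),
    "latte": ("PRODUCT", 0.40),
}


def toy_dictionary(query: str) -> Optional[DictMatch]:
    hit = TOY_LEXICON.get(query.lower())
    if hit is None:
        return None
    label, score = hit
    return DictMatch([(0, len(query), label)], score)


def toy_model(query: str) -> List[Entity]:
    # Placeholder for the distilled model's prediction.
    return [(0, len(query), "UNKNOWN")]
```

With these stubs, `recognize("starbucks", toy_dictionary, toy_model)` is served by the dictionary, while "latte" (low score) and any unseen query go to the model. Keeping the routing decision in one function makes the threshold easy to tune against latency and accuracy.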

  2. Entity dictionary construction

    1. Obtained from structured information, such as a business's SPU name, brand, and category name
    2. Mined from unstructured text, such as product details and merchant introductions
    3. New word discovery
      1. Unsupervised: filter candidates by cohesion ("tightness") and degree-of-freedom metrics
      2. Supervised: experts design grammars, and rules are mined
      3. Distant supervision: few-shot learning
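To make the unsupervised screening in new word discovery concrete, here is a small sketch. It assumes "tightness" is measured as pointwise mutual information (PMI) and "degree of freedom" as the smaller of the left and right neighbor entropies; these are common choices, not ones the source confirms. The function name `score_bigrams` and the restriction to two-token candidates are also illustrative.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def entropy(counts: Counter) -> float:
    """Shannon entropy (natural log) of a neighbor-frequency table."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log(c / total) for c in counts.values())


def score_bigrams(
    corpus: Iterable[List[str]],
) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Map each adjacent token pair to (pmi, freedom)."""
    unigrams: Counter = Counter()
    bigrams: Counter = Counter()
    left = defaultdict(Counter)   # tokens seen just before each pair
    right = defaultdict(Counter)  # tokens seen just after each pair
    for tokens in corpus:
        unigrams.update(tokens)
        for i in range(len(tokens) - 1):
            pair = (tokens[i], tokens[i + 1])
            bigrams[pair] += 1
            if i > 0:
                left[pair][tokens[i - 1]] += 1
            if i + 2 < len(tokens):
                right[pair][tokens[i + 2]] += 1

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for pair, count in bigrams.items():
        p_pair = count / n_bi
        p_a = unigrams[pair[0]] / n_uni
        p_b = unigrams[pair[1]] / n_uni
        # Cohesion: how much more often the pair co-occurs than chance predicts.
        pmi = math.log(p_pair / (p_a * p_b))
        # Freedom: a real word should appear in varied contexts on both sides.
        freedom = min(entropy(left[pair]), entropy(right[pair]))
        scores[pair] = (pmi, freedom)
    return scores
```

Candidates that score high on both measures are kept as new-word candidates; the cutoffs would have to be chosen empirically. For Chinese queries the "tokens" would typically be characters rather than words.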
  3. Online dictionary matching strategy

    1. Bidirectional maximum matching

      This strategy is relatively simple, but it demands extremely high dictionary accuracy and coverage.

    2. CRF word segmentation preprocessing

    3. Pattern-based regex fixes
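Bidirectional maximum matching, the first strategy listed above, can be sketched like this. The tie-breaking rule used here (fewer tokens, then fewer single-character tokens, then the backward result) is a common textbook heuristic and an assumption, not something the source specifies; the function names are illustrative.

```python
from typing import List, Set


def forward_max_match(text: str, lexicon: Set[str], max_len: int) -> List[str]:
    """Greedy longest match, scanning left to right."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text: str, lexicon: Set[str], max_len: int) -> List[str]:
    """Greedy longest match, scanning right to left."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()
    return tokens


def single_chars(tokens: List[str]) -> int:
    return sum(len(t) == 1 for t in tokens)


def bidirectional_max_match(text: str, lexicon: Set[str]) -> List[str]:
    max_len = max((len(w) for w in lexicon), default=1)
    fwd = forward_max_match(text, lexicon, max_len)
    bwd = backward_max_match(text, lexicon, max_len)
    # Prefer the segmentation with fewer tokens.
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd
    # Then prefer fewer single-character leftovers; ties go to backward.
    return fwd if single_chars(fwd) < single_chars(bwd) else bwd
```

For the unspaced string "thetabledownthere" with the lexicon {"the", "theta", "table", "bled", "down", "own", "there"}, the forward pass yields theta / bled / own / there while the backward pass yields the / table / down / there, and the tie goes to the backward result. The example also shows why the method is so sensitive to dictionary quality: a few spurious entries are enough to derail one direction.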

  4. Model prediction

    1. BERT distillation

      Depending on the amount of unlabeled data available, distillation can approximate the teacher's logits, its output distribution, or its predicted values.

    2. Online model prediction acceleration

      1. Mixed precision
      2. Batching
      3. Operator fusion
    3. Knowledge enhancement

      1. Fuse word-level features by combining Lattice and FLAT
    4. Two-stage NER attempt

    5. Weakly supervised NER
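For the BERT distillation step listed under model prediction, the logits and distribution objectives can be written down in a few lines. The sketch below uses plain Python for a single token position; in practice these would be tensor operations in a deep learning framework. The function names and the temperature of 2.0 are illustrative assumptions, and the formulas are the standard ones for knowledge distillation rather than anything the source specifies.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax; subtracting the max keeps exp() stable."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def logit_mse(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Logits approximation: mean squared error between raw label scores."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(teacher)


def soft_label_kl(
    student: Sequence[float],
    teacher: Sequence[float],
    temperature: float = 2.0,  # assumed value; higher means softer targets
) -> float:
    """Distribution approximation: KL(teacher || student) on softened outputs."""
    p = softmax(teacher, temperature)
    q = softmax(student, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

Both losses are zero when the student reproduces the teacher's logits exactly and grow as the two diverge. A training loop would average one of them (or a weighted mix) over all token positions in the unlabeled data.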

Origin blog.csdn.net/be_humble/article/details/130490766