NLP tasks and related concepts

Named entity recognition

Named entity recognition (NER), also known as "proper name recognition", is the task of identifying entities that carry a specific meaning in text, including person names, place names, organization names, and other proper nouns.

NER is an important basic tool for application areas such as information extraction, question answering, syntactic parsing, machine translation, and Semantic Web metadata annotation, and it plays an important role in bringing natural language processing technology into practical use. In general, the NER task is to identify the named entities in the text that fall into three major categories (entity, time, and number) and seven subcategories (person name, organization name, place name, time, date, currency, and percentage).
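The category scheme above can be written down as a small lookup table. This is only a sketch: the label strings and the helper name `category_of` are illustrative assumptions, and the grouping of the seven subcategories under the three categories follows the order in which they are listed (three entity types, two time types, two number types).

```python
# Illustrative mapping of the three categories to the seven subcategories.
# The label strings are assumptions chosen for this sketch, not a standard tag set.
NER_TAXONOMY = {
    "entity": ["person", "organization", "location"],
    "time": ["time", "date"],
    "number": ["currency", "percentage"],
}


def category_of(subcategory: str) -> str:
    """Return the top-level category that a subcategory belongs to."""
    for category, subcategories in NER_TAXONOMY.items():
        if subcategory in subcategories:
            return category
    raise KeyError(f"unknown subcategory: {subcategory}")
```

For example, `category_of("date")` returns `"time"`, and an unknown label raises `KeyError`.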

NER generally consists of two parts: (1) entity boundary recognition; (2) entity category determination (person name, place name, organization name, or other). English named entities have an obvious formal marker (the first letter of each word in the entity is capitalized), so identifying the entity boundary is relatively easy and the key task is to determine the entity category. Compared with English, Chinese named entity recognition is more complicated, and within it the entity boundary identification subtask is harder than the entity category labeling subtask.
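To make the capitalization cue concrete, here is a minimal sketch of boundary detection for English that treats every run of consecutive capitalized words as a candidate entity span. The function name `candidate_entities` and the regular expression are assumptions made for illustration. It covers only step (1); assigning a category to each span is left out.

```python
import re

# A run of one or more capitalized words, e.g. "Berlin" or "Angela Merkel".
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def candidate_entities(text: str) -> list[tuple[int, int, str]]:
    """Return (start, end, surface form) for each run of capitalized words."""
    return [(m.start(), m.end(), m.group()) for m in CAPITALIZED_RUN.finditer(text)]


sentence = "The president met Angela Merkel in Berlin."
spans = candidate_entities(sentence)
# Surface forms found: "The", "Angela Merkel", "Berlin"
```

The sentence-initial "The" is a false positive, which shows that capitalization makes boundaries easy to propose in English but does not by itself decide whether a span is an entity or what category it has.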

Difficulties of Chinese named entity recognition

(1) Unlike English text, Chinese text has no explicit boundary markers such as spaces between words, so the first step of named entity recognition is to determine word boundaries, i.e., word segmentation.
(2) Chinese word segmentation and named entity recognition affect each other.
(3) In addition to the entity types defined for English, transliterated foreign person names and transliterated foreign place names are two special entity types that exist in Chinese.
(4) Modern Chinese text, especially online Chinese text, is often interleaved with English, so the Chinese NER task also includes identifying the English named entities within it.
(5) Different kinds of named entities have different internal characteristics, so no single unified model can describe the internal features of all entities.
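Points (1) and (2) can be illustrated with forward maximum matching, a classical dictionary-based segmentation baseline. It is used here only as an example and is not a method described in this article. The function name, the `max_len` parameter, and the toy vocabulary are all assumptions made for the sketch.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Greedy left-to-right segmentation: always take the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary hit;
        # a single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Toy vocabulary, invented for this example.
TOY_VOCABULARY = {"北京", "大学", "北京大学", "学生", "大学生"}

unambiguous = forward_max_match("北京大学学生", TOY_VOCABULARY)
ambiguous = forward_max_match("北京大学生", TOY_VOCABULARY)
```

The first call yields `["北京大学", "学生"]` ("Peking University" + "student"). The second yields `["北京大学", "生"]`, although the sentence can also be read as 北京 + 大学生 ("Beijing" + "university student"). Whether 北京大学 is an organization name or 北京 is a place name depends on the segmentation, and the right segmentation depends on knowing which entity is present, which is the mutual dependence noted in point (2).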

Origin www.cnblogs.com/qccz123456/p/11627395.html