Fundamental NLP Techniques: Named Entity Recognition in Practice

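Notice: if you repost this article, please credit the source. Thank you: https://blog.csdn.net/m0_37306360/article/details/84592596
For more of my regularly updated study notes, follow:
Zhihu: https://www.zhihu.com/people/yuquanle/columns
WeChat official account: StudyForAI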


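Named entity recognition with Stanford CoreNLP

Install: pip install stanfordcorenlp

Install from a mirror in China: pip install stanfordcorenlp -i https://pypi.tuna.tsinghua.edu.cn/simple

Using stanfordcorenlp for named entity recognition

First download the CoreNLP package from https://nlp.stanford.edu/software/corenlp-backup-download.html. The examples here assume it is unpacked into stanford-corenlp-full-2018-02-27 in the working directory. For Chinese text, the separate Chinese models jar from the same page also needs to be placed in that directory.

Entity recognition on Chinese text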

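from stanfordcorenlp import StanfordCoreNLP
# Point the wrapper at the unpacked CoreNLP directory and select the Chinese models
zh_model = StanfordCoreNLP(r'stanford-corenlp-full-2018-02-27', lang='zh')
s_zh = '我爱自然语言处理技术!'
ner_zh = zh_model.ner(s_zh)  # list of (token, entity tag) pairs
s_zh1 = '我爱北京天安门!'
ner_zh1 = zh_model.ner(s_zh1)
print(ner_zh)
print(ner_zh1)
zh_model.close()  # shut down the background CoreNLP server

Output (the tag 'O' marks tokens outside any entity):

[('我爱', 'O'), ('自然', 'O'), ('语言', 'O'), ('处理', 'O'), ('技术', 'O'), ('!', 'O')]
[('我爱', 'O'), ('北京', 'STATE_OR_PROVINCE'), ('天安门', 'FACILITY'), ('!', 'O')]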

Entity recognition on English text

# Without a lang argument the wrapper uses the default English models
eng_model = StanfordCoreNLP(r'stanford-corenlp-full-2018-02-27')
s_eng = 'I love natural language processing technology!'
ner_eng = eng_model.ner(s_eng)  # list of (token, entity tag) pairs
s_eng1 = 'I love Beijing Tiananmen!'
ner_eng1 = eng_model.ner(s_eng1)
print(ner_eng)
print(ner_eng1)
eng_model.close()  # shut down the background CoreNLP server

Output:

[('I', 'O'), ('love', 'O'), ('natural', 'O'), ('language', 'O'), ('processing', 'O'), ('technology', 'O'), ('!', 'O')]
[('I', 'O'), ('love', 'O'), ('Beijing', 'CITY'), ('Tiananmen', 'LOCATION'), ('!', 'O')]
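
The ner() results above are flat lists of (token, tag) pairs, so an entity that spans several tokens would show up as several consecutive pairs. Below is a minimal sketch of merging such runs into entity spans. The name group_entities and its parameters are hypothetical, chosen for illustration, and are not part of stanfordcorenlp. The sketch assumes that adjacent tokens carrying the same non-O tag belong to one entity.

```python
def group_entities(tagged_tokens, outside_tag="O", joiner=""):
    """Merge runs of adjacent tokens that share a non-O tag into entity spans."""
    entities = []
    current_tokens, current_tag = [], None
    for token, tag in tagged_tokens:
        # Same entity tag as the previous token: extend the current span.
        if tag == current_tag and tag != outside_tag:
            current_tokens.append(token)
            continue
        # The tag changed: flush the span collected so far, if any.
        if current_tokens:
            entities.append((joiner.join(current_tokens), current_tag))
        # Start a new span for an entity tag, or reset on an outside token.
        if tag != outside_tag:
            current_tokens, current_tag = [token], tag
        else:
            current_tokens, current_tag = [], None
    # Flush a span that runs to the end of the sentence.
    if current_tokens:
        entities.append((joiner.join(current_tokens), current_tag))
    return entities


# Pairs copied from the English output above; joiner=" " suits English tokens.
ner_eng1 = [('I', 'O'), ('love', 'O'), ('Beijing', 'CITY'),
            ('Tiananmen', 'LOCATION'), ('!', 'O')]
print(group_entities(ner_eng1, joiner=" "))
# [('Beijing', 'CITY'), ('Tiananmen', 'LOCATION')]
```

For Chinese tokens the default joiner="" concatenates the pieces without spaces. One limitation of this approach: two distinct entities of the same type that sit next to each other would be merged into one span.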

Named entity recognition with HanLP

Install: pip install pyhanlp

Install from a mirror in China: pip install pyhanlp -i https://pypi.tuna.tsinghua.edu.cn/simple

Recognizing entities with the CRF algorithm

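from pyhanlp import *
# Build a CRF-based segmenter; the part-of-speech tag on each term also marks named entities
CRFnewSegment = HanLP.newSegment("crf")
term_list = CRFnewSegment.seg("我爱北京天安门!")
print(term_list)

Output (ns is HanLP's place-name tag):

[我/r, 爱/v, 北京/ns, 天安门/ns, !/w]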
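
HanLP reports entities through part-of-speech tags rather than a separate entity label: in its tag set nr marks person names, ns place names and nt organization names. The sketch below filters a list of (word, tag) pairs down to those three families. The function pick_entities and the PERSON/LOCATION/ORGANIZATION labels are hypothetical names introduced here for illustration, and the sketch assumes the pairs were first built from the segmenter output, for example from each term's word and nature fields.

```python
# Map the two-letter tag prefix to an illustrative entity label.
# The label strings are this sketch's own choice, not HanLP output.
ENTITY_PREFIXES = {"nr": "PERSON", "ns": "LOCATION", "nt": "ORGANIZATION"}


def pick_entities(word_nature_pairs):
    """Keep only the words whose tag starts with nr, ns or nt."""
    entities = []
    for word, nature in word_nature_pairs:
        # Sub-tags such as nrf or nsf share the two-letter prefix.
        label = ENTITY_PREFIXES.get(str(nature)[:2])
        if label is not None:
            entities.append((word, label))
    return entities


# Pairs written out by hand from the output above.
# With pyhanlp they could be built along the lines of (an assumption):
#   pairs = [(term.word, str(term.nature)) for term in term_list]
pairs = [("我", "r"), ("爱", "v"), ("北京", "ns"), ("天安门", "ns"), ("!", "w")]
print(pick_entities(pairs))
# [('北京', 'LOCATION'), ('天安门', 'LOCATION')]
```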

Named entity recognition with NLTK

Install: pip install nltk

Install from a mirror in China: pip install nltk -i https://pypi.tuna.tsinghua.edu.cn/simple

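import nltk
# word_tokenize, pos_tag and ne_chunk each need NLTK data packages,
# downloaded once with nltk.download(); the package names depend on the NLTK version
s = 'I love natural language processing technology!'
s_token = nltk.word_tokenize(s)        # split the sentence into tokens
s_tagged = nltk.pos_tag(s_token)       # part-of-speech tags, required by the chunker
s_ner = nltk.chunk.ne_chunk(s_tagged)  # tree whose labelled subtrees are entities
print(s_ner)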
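
ne_chunk returns a tree in which ordinary tokens are (word, POS) tuples and each recognized entity is a subtree whose label is the entity type. A minimal sketch of pulling the entities out of such a tree follows. The helper name extract_entities is hypothetical, and it relies only on the label() and leaves() methods of the subtrees, so it does not import nltk itself.

```python
def extract_entities(tree):
    """Collect (entity_text, entity_type) pairs from an ne_chunk-style tree."""
    entities = []
    for node in tree:
        # Plain tokens are (word, pos) tuples and have no label() method;
        # entity chunks are subtrees that expose label() and leaves().
        if hasattr(node, "label"):
            text = " ".join(word for word, _pos in node.leaves())
            entities.append((text, node.label()))
    return entities


# Intended use with the tree produced by nltk.chunk.ne_chunk:
#   print(extract_entities(s_ner))
```

This only walks the top level of the tree, which matches the flat structure ne_chunk produces by default.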

Named entity recognition with spaCy

Install: pip install spacy

Install from a mirror in China: pip install spacy -i https://pypi.tuna.tsinghua.edu.cn/simple

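import spacy
# 'en' is the shortcut name for the English model in spaCy 2.x (python -m spacy download en);
# spaCy 3 and later need the full package name, e.g. 'en_core_web_sm'
eng_model = spacy.load('en')
s = 'I want to Beijing learning natural language processing technology!'

# Named entity recognition
s_ent = eng_model(s)
for ent in s_ent.ents:
    # ent.label_ is the label string, ent.label its integer ID
    print(ent, ent.label_, ent.label)

Output:

Beijing GPE 382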
