Entity extraction (NER) with GlobalPointer

References: https://www.kexue.fm/archives/8373
https://github.com/bojone/CLUE-bert4keras (cluener.py)

bert4keras>=0.10.8

Dataset download:
https://github.com/CLUEbenchmark/CLUENER2020/files/6371700/cluener_public.zip
See: https://github.com/CLUEbenchmark/CLUENER2020/issues/47
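For orientation, each line of the CLUENER data files is a JSON object with a `text` field and, for labeled data, a `label` field that maps an entity type to `{mention: [[start, end], ...]}` with inclusive character offsets. Below is a minimal loading sketch under that assumption. It is modeled on what cluener.py does but is not copied from it; the function name `load_cluener` and the shortened sample record are illustrative only.

```python
import json


def load_cluener(lines):
    """Parse CLUENER JSON lines into [text, (start, end, label), ...] records."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        item = json.loads(line)
        record = [item["text"]]
        # Lines without a "label" key (e.g. unlabeled data) yield no entities.
        for label, mentions in item.get("label", {}).items():
            for spans in mentions.values():
                for start, end in spans:
                    record.append((start, end, label))
        records.append(record)
    return records


# Shortened sample record for illustration.
sample = (
    '{"text": "浙商银行企业信贷部叶老桂博士", '
    '"label": {"name": {"叶老桂": [[9, 11]]}, "company": {"浙商银行": [[0, 3]]}}}'
)
records = load_cluener([sample])
text, *entities = records[0]
for start, end, label in entities:
    # Offsets are inclusive, so the slice needs end + 1.
    print(label, text[start:end + 1])
```

With a real file, the same function can be fed a file handle, for example `with open(path, encoding='utf-8') as f: records = load_cluener(f)`.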

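Brief description of the model structure:
BERT + GlobalPointer: the output of BERT's last Transformer layer is fed into a GlobalPointer layer with one head per entity class.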


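from bert4keras.layers import GlobalPointer
from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer

# config_path, checkpoint_path, dict_path and num_classes are defined elsewhere
# (the paths are set in snippets.py, see below).

# Build the tokenizer
tokenizer = Tokenizer(dict_path, do_lower_case=True)

# Load the pretrained model
base = build_transformer_model(
    config_path, checkpoint_path, application='unilm', return_keras_model=False
)

# Name of the last Transformer layer, whose output feeds the task head
last_layer = 'Transformer-%s-FeedForward-Norm' % (base.num_hidden_layers - 1)

# Build the model: one GlobalPointer head per entity class
output = base.model.get_layer(last_layer).output
output = GlobalPointer(
    heads=num_classes,
    head_size=base.attention_head_size,
    use_bias=False,
    kernel_initializer=base.initializer
)(output)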
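
For each sentence, the GlobalPointer layer produces a score tensor of shape (num_classes, seq_len, seq_len), where entry (c, i, j) scores the token span i..j as an entity of class c. Following the referenced post, spans with a score above 0 are taken as predictions. The following NumPy-only sketch illustrates that decoding step. The helper `decode_spans` and the toy scores are illustrative assumptions, not code from cluener.py, and a real pipeline would also need to map token positions back to character offsets.

```python
import numpy as np


def decode_spans(scores, threshold=0.0):
    """Return (class_id, start, end) for every span scoring above threshold."""
    scores = np.array(scores, dtype=float)  # copy, so the input is not modified
    seq_len = scores.shape[-1]
    # A span's start must not exceed its end: mask the strictly lower triangle.
    lower = np.tril(np.ones((seq_len, seq_len), dtype=bool), k=-1)
    scores[:, lower] = -np.inf
    hits = zip(*np.where(scores > threshold))
    return [(int(c), int(s), int(e)) for c, s, e in hits]


# Toy scores: 2 classes, 4 tokens, everything negative except a few entries.
toy = np.full((2, 4, 4), -1.0)
toy[0, 1, 2] = 3.2   # class 0, tokens 1..2
toy[1, 3, 3] = 0.5   # class 1, single token 3
toy[1, 2, 0] = 9.0   # start > end, so it is masked out
print(decode_spans(toy))
```
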

Notes on running cluener.py:

snippets.py

1. Change the paths to the pretrained BERT model

# Model paths
config_path = r'D:\***ert\chinese_L-12_H-768_A-12\bert_config.json'
checkpoint_path = r'D:\***\chinese_L-12_H-768_A-12\bert_model.ckpt'
dict_path = r'D:\****t\chinese_L-12_H-768_A-12\vocab.txt'


2. Change the path to the data
# General parameters
data_path = r'D:\clue_bert4keras\\'
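
Since the three model files sit in the same directory, one option is to define that directory once and derive the individual paths from it. The sketch below uses `pathlib.PureWindowsPath` so that it behaves the same on any operating system. The value of `model_dir` is a hypothetical placeholder, not the author's actual path.

```python
from pathlib import PureWindowsPath

# Hypothetical location of the unpacked BERT checkpoint; adjust to your setup.
model_dir = PureWindowsPath(r'D:\models\chinese_L-12_H-768_A-12')

config_path = str(model_dir / 'bert_config.json')
# bert_model.ckpt is a TensorFlow checkpoint prefix, not a single file on disk.
checkpoint_path = str(model_dir / 'bert_model.ckpt')
dict_path = str(model_dir / 'vocab.txt')

print(config_path)
print(checkpoint_path)
print(dict_path)
```
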

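cluener.py

When predicting, comment out the code you do not need, and add encoding='utf-8' to the open() calls; otherwise reading the files raises an error (on Windows, open() defaults to the system locale encoding rather than UTF-8).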

Origin blog.csdn.net/weixin_42357472/article/details/121261294