Chinese Pretrained Model Generalization Challenge (Part 1): Baseline
Contents
- Chinese Pretrained Model Generalization Challenge (Part 1): Baseline
  - I. Using Docker
  - II. Implementing the Baseline
  - III. Problems Encountered
    - 1. The Windows 10 edition does not meet the requirements
    - 2. After installing Docker, rebooting shows "WSL 2 installation is incomplete." and the whale icon is red
    - 3. ImportError: Keras requires TensorFlow 2.2 or higher. Install TensorFlow via `pip install tensorflow`
    - 4. ImportError: You need to first `import keras` in order to use `keras_applications`
    - 5. AttributeError: module 'keras_applications' has no attribute 'set_keras_submodules'
    - 6. AssertionError: Torch not compiled with CUDA enabled
  - IV. Core Code
  - References
I. Using Docker
1. Installing Docker on Windows
Docker is not a general-purpose container tool: it depends on an existing, running Linux kernel.
Docker essentially creates an isolated filesystem environment on top of a running Linux system, so it executes almost as efficiently as the Linux host it is deployed on.
Docker must therefore run on a system with a Linux kernel; any other operating system that wants to run Docker must first provide a virtual Linux environment.
On Windows, this means installing a virtual machine and running Docker inside the Linux system installed in that VM.
2. Windows 10
Docker Desktop is the official way to install Docker on Windows 10 and macOS. This approach still amounts to installing Linux in a virtual machine first and then installing Docker inside it.
Official Docker Desktop download: https://hub.docker.com/editions/community/docker-ce-desktop-windows
Note: this method only works on the Professional, Enterprise, and Education editions of Windows 10, plus some Home editions!
3. Installing Hyper-V
Hyper-V is Microsoft's virtual machine technology, similar to VMware or VirtualBox, and available only on Windows 10. It is the virtual machine that Docker Desktop for Windows uses.
However, once Hyper-V is enabled, QEMU, VirtualBox, and VMware Workstation 15 and below can no longer run! If you must use another virtual machine on the same computer (for example, the emulator required for Android development), do not enable Hyper-V!
4. Enabling Hyper-V
In Control Panel, open Programs and Features → Turn Windows features on or off, and check Hyper-V.
You can also enable Hyper-V from the command line: right-click the Start menu, run PowerShell as administrator, and execute:
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All
5. Installing Docker Desktop for Windows
Click Get started with Docker Desktop and download the Windows version; if you are not logged in yet, you will be asked to register and log in.
6. Running the installer
Double-click the downloaded Docker for Windows Installer file, click Next through the wizard, and click Finish to complete the installation.
Once installation finishes, Docker starts automatically and a small whale icon appears in the notification area, indicating that Docker is running.
Three icons will also appear on the desktop.
You can run docker version on the command line to check the version number, and docker run hello-world to pull and run a test image.
If Docker did not start, search for Docker in Windows and launch it.
Once it has started, the whale icon shows up in the notification area again.
If startup fails with a WSL 2 related error, install WSL 2 (see problem 2 below).
After installation, open PowerShell and run the following command to check that everything works:
docker run hello-world
On success you should see the "Hello from Docker!" greeting.
II. Implementing the Baseline
1. Workflow
1) Download the BERT weights: from https://huggingface.co/bert-base-chinese/tree/main download config.json, vocab.txt, and pytorch_model.bin, and put these three files in the tianchi-multi-task-nlp/bert_pretrain_model folder. A quick loading check is sketched below.
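As a quick sanity check (not part of the baseline scripts), you can try loading the downloaded weights with transformers from the repository root; the folder is the one created above:
from transformers import BertModel, BertTokenizer
# both calls read config.json / vocab.txt / pytorch_model.bin from the local folder
tokenizer = BertTokenizer.from_pretrained('./bert_pretrain_model')
model = BertModel.from_pretrained('./bert_pretrain_model')
print(model.config.hidden_size)  # 768 for bert-base-chinese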
2) Download the competition datasets and put the three datasets under tianchi-multi-task-nlp/tianchi_datasets/<dataset name>/:
- OCEMOTION/total.csv: http://tianchi-competition.oss-cn-hangzhou.aliyuncs.com/531841/OCEMOTION_train1128.csv
- OCEMOTION/test.csv: http://tianchi-competition.oss-cn-hangzhou.aliyuncs.com/531841/b/ocemotion_test_B.csv
- TNEWS/total.csv: http://tianchi-competition.oss-cn-hangzhou.aliyuncs.com/531841/TNEWS_train1128.csv
- TNEWS/test.csv: http://tianchi-competition.oss-cn-hangzhou.aliyuncs.com/531841/b/tnews_test_B.csv
- OCNLI/total.csv: http://tianchi-competition.oss-cn-hangzhou.aliyuncs.com/531841/OCNLI_train1128.csv
- OCNLI/test.csv: http://tianchi-competition.oss-cn-hangzhou.aliyuncs.com/531841/b/ocnli_test_B.csv
Example directory layout:
tianchi-multi-task-nlp/tianchi_datasets/OCNLI/total.csv
tianchi-multi-task-nlp/tianchi_datasets/OCNLI/test.csv
3) Split off training and validation sets. By default each validation set contains 3,000 examples; the parameter can be changed:
python ./generate_data.py
4) Train the model for one epoch:
python ./train.py
The model with the highest average F1 score on the validation set is saved to ./saved_best.pt.
5) Generate predictions with the trained model ./saved_best.pt:
python ./inference.py
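Because train.py saves the whole model object with torch.save (see the core code in section IV), inference can restore it in one call. A minimal sketch, assuming it is run from the repository root so that the Net class used when saving is importable:
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# torch.load rebuilds the full module saved by torch.save(my_net, file_path);
# the classes it references (Net, etc.) must be importable
my_net = torch.load('./saved_best.pt', map_location=device)
my_net.eval()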
6) Package the prediction results:
zip -r ./result.zip ./*.json
7) Build the Docker image and submit it (reference: https://tianchi.aliyun.com/competition/entrance/231759/tab/174):
- Create a cloud image repository at https://cr.console.aliyun.com/
- Create a namespace and an image repository;
- then switch to the submission folder and run the commands below.
# The login username is your full Aliyun account name; the password is the one set when the service was enabled.
sudo docker login --username=xxx@mail.com registry.cn-hangzhou.aliyuncs.com
# Build with the local Dockerfile, using the [public endpoint] of the repository you created,
# e.g. docker build -t registry.cn-shenzhen.aliyuncs.com/test_for_tianchi/test_for_tianchi_submit:1.0 .
docker build -t registry.cn-shenzhen.aliyuncs.com/test_for_tianchi/test_for_tianchi_submit:1.0 .
Build output:
Sending build context to Docker daemon 18.94kB
Step 1/4 : FROM registry.cn-shanghai.aliyuncs.com/tcc-public/python:3
---> a4cc999cf2aa
Step 2/4 : ADD . /
---> Using cache
---> b18fbb4425ef
Step 3/4 : WORKDIR /
---> Using cache
---> f5fcc4ca5eca
Step 4/4 : CMD ["sh", "run.sh"]
---> Using cache
---> ed0c4b0e545f
Successfully built ed0c4b0e545f
# ed0c4b0e545f is the image id, i.e. the last line of the build output above
sudo docker tag ed0c4b0e545f registry.cn-shenzhen.aliyuncs.com/test_for_tianchi/test_for_tianchi_submit:1.0
# Push the image to the cloud
docker push registry.cn-shenzhen.aliyuncs.com/test_for_tianchi/test_for_tianchi_submit:1.0
8) On the competition submission page, fill in the image path plus version tag, along with the username and password, to complete the submission.
2. Ideas for improving on the baseline
1) Modify calculate_loss.py to change how the loss is computed; good starting points are balancing the difficulty of the subtasks and the class imbalance within each subtask (see the sketch after this list);
2) Modify net.py to change the model structure, e.g. by adding an attention layer or other layers;
3) Clean the training text with tools such as cleanlab;
4) Apply text data augmentation, or pretrain on other datasets first;
5) Fine-tune the trained model for one more epoch on the full dataset (training plus validation sets) with a small learning rate;
6) Adjust batchSize and a_step to change the degree of gradient accumulation; currently batchSize=16 and a_step=16;
7) Use chinese-roberta-wwm-ext as the pretrained model.
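For idea 1, one common starting point is per-class weights in the cross-entropy loss, for example inverse class frequency. A minimal sketch; the class counts below are made-up placeholders, not the competition's real label statistics:
import torch
import torch.nn as nn
# hypothetical class counts for one subtask; substitute the real label statistics
class_counts = torch.tensor([5000.0, 1200.0, 300.0])
weights = class_counts.sum() / (len(class_counts) * class_counts)  # inverse-frequency weights
criterion = nn.CrossEntropyLoss(weight=weights)
logits = torch.randn(4, 3)         # (batch, num_classes)
gold = torch.tensor([0, 2, 1, 2])  # gold labels
loss = criterion(logits, gold)     # rare classes now contribute more to the loss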
III. Problems Encountered
1. The Windows 10 edition does not meet the requirements
Docker Desktop for Windows supports 64-bit Windows 10 Pro with Hyper-V enabled (on v1903 and above, Hyper-V is not required), or 64-bit Windows 10 Home v1903 and above.
Solution:
Still looking for a way to enable Hyper-V on Windows 10 Home? If the reason is Docker for Windows, I suggest upgrading to the Professional edition instead.
To add Hyper-V on Windows 10 Home:
Copy the following into an editor or Notepad:
pushd "%~dp0"
dir /b %SystemRoot%\servicing\Packages\*Hyper-V*.mum >hyper-v.txt
for /f %%i in ('findstr /i . hyper-v.txt 2^>nul') do dism /online /norestart /add-package:"%SystemRoot%\servicing\Packages\%%i"
del hyper-v.txt
Dism /online /enable-feature /featurename:Microsoft-Hyper-V-All /LimitAccess /ALL
Save it as Hyper-V.cmd.
On the desktop, right-click the Hyper-V.cmd file icon and choose Run as administrator.
When the User Account Control dialog for the Windows Command Processor pops up, click Yes.
Windows then processes the commands; wait for processing to finish.
At the end, type Y; the computer restarts automatically and applies the configuration update. Note: do not power off the computer.
After the restart, go to Control Panel → All Control Panel Items → Programs and Features, click Turn Windows features on or off, and you will find the Hyper-V feature is now present.
2. After installing Docker, rebooting shows "WSL 2 installation is incomplete." and the whale icon is red
Solution:
Open https://aka.ms/wsl2kernel and check the requirements for running WSL 2.
To update to WSL 2, you must be running Windows 10:
• For x64 systems: version 1903 or later, with build 18362 or later.
• For ARM64 systems: version 2004 or later, with build 19041 or later.
• Builds below 18362 do not support WSL 2; use the Windows Update Assistant to update Windows.
To check your Windows version and build number, press the Windows logo key + R, type winver, and select OK (or run the ver command in a Windows command prompt). Update to the latest Windows version from the Settings menu.
PS: if you are running Windows 10 version 1903 or 1909, open Settings from the Windows menu, navigate to Update & Security, and select Check for Updates. The build number must be 18362.1049+ or 18363.1049+, i.e. the minor build number must be above .1049. For details, see "WSL 2 support is coming to Windows 10 versions 1903 and 1909" and the troubleshooting notes.
My machine turned out to be on Windows 10 version 1909, so I downloaded the Linux kernel update package directly:
1) Download the latest package:
WSL2 Linux kernel update package for x64 machines
2) Run the update package downloaded in the previous step. (Double-click to run; when prompted for elevated permissions, select Yes to approve the installation.)
3. ImportError: Keras requires TensorFlow 2.2 or higher. Install TensorFlow via pip install tensorflow
Solution:
• Check the installed versions with pip list.
• keras==2.4.3 was installed, which is too new for this setup; downgrade to 2.2:
pip install keras==2.2 -i https://pypi.douban.com/simple
4. ImportError: You need to first import keras in order to use keras_applications. For instance, you can do:
import keras
from keras_applications import vgg16
Or, preferably, this equivalent formulation:
from keras import applications
Solution:
Check the TensorFlow/Keras version compatibility table and make sure the two installed versions actually match.
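A quick way to see what is actually installed (a sanity check, not a fix by itself):
import tensorflow as tf
import keras
# compare these against a TensorFlow/Keras compatibility table
print('tensorflow:', tf.__version__)
print('keras:', keras.__version__)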
5. AttributeError: module 'keras_applications' has no attribute 'set_keras_submodules'
Solution:
pip install keras-models
6. AssertionError: Torch not compiled with CUDA enabled
Solution:
This error occurs because the installed torch build has no CUDA support, so the program fails at runtime as soon as it touches the GPU. Based on what I found, add the following at the very beginning of the program:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Then replace every .cuda() call elsewhere in the code with .to(device), and the code runs in a GPU-less environment.
Also change cuda:0 to cpu:0 where it appears.
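A minimal before/after sketch of the change (the model and tensor here are illustrative, not taken from the baseline):
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# before (fails on a torch build without CUDA):
#   model = model.cuda()
#   batch = batch.cuda()
# after (runs on CPU or GPU):
model = torch.nn.Linear(4, 2).to(device)
batch = torch.randn(8, 4).to(device)
out = model(batch)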
IV. Core Code
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import torch
from transformers import BertModel, BertTokenizer
import json
from utils import get_f1, print_result, load_pretrained_model, load_tokenizer
from net import Net
from data_generator import Data_generator
from calculate_loss import Calculate_loss
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
def train(epochs=20, batchSize=64, lr=0.0001, device='cpu:0', accumulate=True, a_step=16, load_saved=False, file_path='./saved_best.pt', use_dtp=False, pretrained_model='./bert_pretrain_model', tokenizer_model='bert-base-chinese', weighted_loss=False):
    device = torch.device(device)  # accept a device string such as 'cpu:0' or 'cuda:0'
tokenizer = load_tokenizer(tokenizer_model)
my_net = torch.load(file_path) if load_saved else Net(load_pretrained_model(pretrained_model))
my_net.to(device, non_blocking=True)
    # each of these files stores a single JSON object on its first line
    def load_first_json(path):
        with open(path) as f:
            return json.loads(f.readline())
    label_dict = load_first_json('./tianchi_datasets/label.json')
    label_weights_dict = load_first_json('./tianchi_datasets/label_weights.json')
    ocnli_train = load_first_json('./tianchi_datasets/OCNLI/train.json')
    ocnli_dev = load_first_json('./tianchi_datasets/OCNLI/dev.json')
    ocemotion_train = load_first_json('./tianchi_datasets/OCEMOTION/train.json')
    ocemotion_dev = load_first_json('./tianchi_datasets/OCEMOTION/dev.json')
    tnews_train = load_first_json('./tianchi_datasets/TNEWS/train.json')
    tnews_dev = load_first_json('./tianchi_datasets/TNEWS/dev.json')
train_data_generator = Data_generator(ocnli_train, ocemotion_train, tnews_train, label_dict, device, tokenizer)
dev_data_generator = Data_generator(ocnli_dev, ocemotion_dev, tnews_dev, label_dict, device, tokenizer)
tnews_weights = torch.tensor(label_weights_dict['TNEWS']).to(device, non_blocking=True)
ocnli_weights = torch.tensor(label_weights_dict['OCNLI']).to(device, non_blocking=True)
ocemotion_weights = torch.tensor(label_weights_dict['OCEMOTION']).to(device, non_blocking=True)
loss_object = Calculate_loss(label_dict, weighted=weighted_loss, tnews_weights=tnews_weights, ocnli_weights=ocnli_weights, ocemotion_weights=ocemotion_weights)
    optimizer = torch.optim.Adam(my_net.parameters(), lr=lr)
best_dev_f1 = 0.0
best_epoch = -1
for epoch in range(epochs):
my_net.train()
train_loss = 0.0
train_total = 0
train_correct = 0
train_ocnli_correct = 0
train_ocemotion_correct = 0
train_tnews_correct = 0
train_ocnli_pred_list = []
train_ocnli_gold_list = []
train_ocemotion_pred_list = []
train_ocemotion_gold_list = []
train_tnews_pred_list = []
train_tnews_gold_list = []
cnt_train = 0
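        # drain the mixed-task generator; each batch can contain examples from all three tasks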
while True:
raw_data = train_data_generator.get_next_batch(batchSize)
            if raw_data is None:
break
data = dict()
data['input_ids'] = raw_data['input_ids']
data['token_type_ids'] = raw_data['token_type_ids']
data['attention_mask'] = raw_data['attention_mask']
data['ocnli_ids'] = raw_data['ocnli_ids']
data['ocemotion_ids'] = raw_data['ocemotion_ids']
data['tnews_ids'] = raw_data['tnews_ids']
tnews_gold = raw_data['tnews_gold']
ocnli_gold = raw_data['ocnli_gold']
ocemotion_gold = raw_data['ocemotion_gold']
if not accumulate:
optimizer.zero_grad()
ocnli_pred, ocemotion_pred, tnews_pred = my_net(**data)
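            # dynamic task prioritization: use each task's running accuracy (kpi) to weight its loss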
if use_dtp:
tnews_kpi = 0.1 if len(train_tnews_pred_list) == 0 else train_tnews_correct / len(train_tnews_pred_list)
ocnli_kpi = 0.1 if len(train_ocnli_pred_list) == 0 else train_ocnli_correct / len(train_ocnli_pred_list)
ocemotion_kpi = 0.1 if len(train_ocemotion_pred_list) == 0 else train_ocemotion_correct / len(train_ocemotion_pred_list)
current_loss = loss_object.compute_dtp(tnews_pred, ocnli_pred, ocemotion_pred, tnews_gold, ocnli_gold,
ocemotion_gold, tnews_kpi, ocnli_kpi, ocemotion_kpi)
else:
current_loss = loss_object.compute(tnews_pred, ocnli_pred, ocemotion_pred, tnews_gold, ocnli_gold, ocemotion_gold)
train_loss += current_loss.item()
current_loss.backward()
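            # gradient accumulation: only step the optimizer every a_step batches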
if accumulate and (cnt_train + 1) % a_step == 0:
optimizer.step()
optimizer.zero_grad()
if not accumulate:
optimizer.step()
if use_dtp:
good_tnews_nb, good_ocnli_nb, good_ocemotion_nb, total_tnews_nb, total_ocnli_nb, total_ocemotion_nb = loss_object.correct_cnt_each(tnews_pred, ocnli_pred, ocemotion_pred, tnews_gold, ocnli_gold, ocemotion_gold)
tmp_good = sum([good_tnews_nb, good_ocnli_nb, good_ocemotion_nb])
tmp_total = sum([total_tnews_nb, total_ocnli_nb, total_ocemotion_nb])
train_ocemotion_correct += good_ocemotion_nb
train_ocnli_correct += good_ocnli_nb
train_tnews_correct += good_tnews_nb
else:
tmp_good, tmp_total = loss_object.correct_cnt(tnews_pred, ocnli_pred, ocemotion_pred, tnews_gold, ocnli_gold, ocemotion_gold)
train_correct += tmp_good
train_total += tmp_total
p, g = loss_object.collect_pred_and_gold(ocnli_pred, ocnli_gold)
train_ocnli_pred_list += p
train_ocnli_gold_list += g
p, g = loss_object.collect_pred_and_gold(ocemotion_pred, ocemotion_gold)
train_ocemotion_pred_list += p
train_ocemotion_gold_list += g
p, g = loss_object.collect_pred_and_gold(tnews_pred, tnews_gold)
train_tnews_pred_list += p
train_tnews_gold_list += g
cnt_train += 1
#torch.cuda.empty_cache()
if (cnt_train + 1) % 1000 == 0:
print('[', cnt_train + 1, '- th batch : train acc is:', train_correct / train_total, '; train loss is:', train_loss / cnt_train, ']')
if accumulate:
optimizer.step()
optimizer.zero_grad()
train_ocnli_f1 = get_f1(train_ocnli_gold_list, train_ocnli_pred_list)
train_ocemotion_f1 = get_f1(train_ocemotion_gold_list, train_ocemotion_pred_list)
train_tnews_f1 = get_f1(train_tnews_gold_list, train_tnews_pred_list)
train_avg_f1 = (train_ocnli_f1 + train_ocemotion_f1 + train_tnews_f1) / 3
print(epoch, 'th epoch train average f1 is:', train_avg_f1)
print(epoch, 'th epoch train ocnli is below:')
print_result(train_ocnli_gold_list, train_ocnli_pred_list)
print(epoch, 'th epoch train ocemotion is below:')
print_result(train_ocemotion_gold_list, train_ocemotion_pred_list)
print(epoch, 'th epoch train tnews is below:')
print_result(train_tnews_gold_list, train_tnews_pred_list)
train_data_generator.reset()
my_net.eval()
dev_loss = 0.0
dev_total = 0
dev_correct = 0
dev_ocnli_correct = 0
dev_ocemotion_correct = 0
dev_tnews_correct = 0
dev_ocnli_pred_list = []
dev_ocnli_gold_list = []
dev_ocemotion_pred_list = []
dev_ocemotion_gold_list = []
dev_tnews_pred_list = []
dev_tnews_gold_list = []
cnt_dev = 0
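        # evaluation pass over the dev set, with gradients disabled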
with torch.no_grad():
while True:
raw_data = dev_data_generator.get_next_batch(batchSize)
                if raw_data is None:
break
data = dict()
data['input_ids'] = raw_data['input_ids']
data['token_type_ids'] = raw_data['token_type_ids']
data['attention_mask'] = raw_data['attention_mask']
data['ocnli_ids'] = raw_data['ocnli_ids']
data['ocemotion_ids'] = raw_data['ocemotion_ids']
data['tnews_ids'] = raw_data['tnews_ids']
tnews_gold = raw_data['tnews_gold']
ocnli_gold = raw_data['ocnli_gold']
ocemotion_gold = raw_data['ocemotion_gold']
ocnli_pred, ocemotion_pred, tnews_pred = my_net(**data)
if use_dtp:
tnews_kpi = 0.1 if len(dev_tnews_pred_list) == 0 else dev_tnews_correct / len(
dev_tnews_pred_list)
ocnli_kpi = 0.1 if len(dev_ocnli_pred_list) == 0 else dev_ocnli_correct / len(
dev_ocnli_pred_list)
ocemotion_kpi = 0.1 if len(dev_ocemotion_pred_list) == 0 else dev_ocemotion_correct / len(
dev_ocemotion_pred_list)
current_loss = loss_object.compute_dtp(tnews_pred, ocnli_pred, ocemotion_pred, tnews_gold,
ocnli_gold,
ocemotion_gold, tnews_kpi, ocnli_kpi, ocemotion_kpi)
else:
current_loss = loss_object.compute(tnews_pred, ocnli_pred, ocemotion_pred, tnews_gold, ocnli_gold, ocemotion_gold)
dev_loss += current_loss.item()
if use_dtp:
good_tnews_nb, good_ocnli_nb, good_ocemotion_nb, total_tnews_nb, total_ocnli_nb, total_ocemotion_nb = loss_object.correct_cnt_each(
tnews_pred, ocnli_pred, ocemotion_pred, tnews_gold, ocnli_gold, ocemotion_gold)
                    # per-batch counts: plain assignment, matching the train loop above
                    tmp_good = sum([good_tnews_nb, good_ocnli_nb, good_ocemotion_nb])
                    tmp_total = sum([total_tnews_nb, total_ocnli_nb, total_ocemotion_nb])
dev_ocemotion_correct += good_ocemotion_nb
dev_ocnli_correct += good_ocnli_nb
dev_tnews_correct += good_tnews_nb
else:
tmp_good, tmp_total = loss_object.correct_cnt(tnews_pred, ocnli_pred, ocemotion_pred, tnews_gold, ocnli_gold, ocemotion_gold)
dev_correct += tmp_good
dev_total += tmp_total
p, g = loss_object.collect_pred_and_gold(ocnli_pred, ocnli_gold)
dev_ocnli_pred_list += p
dev_ocnli_gold_list += g
p, g = loss_object.collect_pred_and_gold(ocemotion_pred, ocemotion_gold)
dev_ocemotion_pred_list += p
dev_ocemotion_gold_list += g
p, g = loss_object.collect_pred_and_gold(tnews_pred, tnews_gold)
dev_tnews_pred_list += p
dev_tnews_gold_list += g
cnt_dev += 1
#torch.cuda.empty_cache()
#if (cnt_dev + 1) % 1000 == 0:
# print('[', cnt_dev + 1, '- th batch : dev acc is:', dev_correct / dev_total, '; dev loss is:', dev_loss / cnt_dev, ']')
dev_ocnli_f1 = get_f1(dev_ocnli_gold_list, dev_ocnli_pred_list)
dev_ocemotion_f1 = get_f1(dev_ocemotion_gold_list, dev_ocemotion_pred_list)
dev_tnews_f1 = get_f1(dev_tnews_gold_list, dev_tnews_pred_list)
dev_avg_f1 = (dev_ocnli_f1 + dev_ocemotion_f1 + dev_tnews_f1) / 3
print(epoch, 'th epoch dev average f1 is:', dev_avg_f1)
print(epoch, 'th epoch dev ocnli is below:')
print_result(dev_ocnli_gold_list, dev_ocnli_pred_list)
print(epoch, 'th epoch dev ocemotion is below:')
print_result(dev_ocemotion_gold_list, dev_ocemotion_pred_list)
print(epoch, 'th epoch dev tnews is below:')
print_result(dev_tnews_gold_list, dev_tnews_pred_list)
dev_data_generator.reset()
if dev_avg_f1 > best_dev_f1:
best_dev_f1 = dev_avg_f1
best_epoch = epoch
torch.save(my_net, file_path)
print('best epoch is:', best_epoch, '; with best f1 is:', best_dev_f1)
if __name__ == '__main__':
print('---------------------start training-----------------------')
pretrained_model = './bert_pretrain_model'
tokenizer_model = './bert_pretrain_model'
train(batchSize=16, device='cpu:0', lr=0.0001, use_dtp=True, pretrained_model=pretrained_model, tokenizer_model=tokenizer_model, weighted_loss=True)
PS: I have not gotten the results out yet; my computer cannot keep up.
References
1. https://www.runoob.com/docker/windows-docker-install.html
2. https://www.jb51.net/article/182013.htm
3. https://github.com/datawhalechina/team-learning-nlp/tree/master/PretrainModelsGeneralization