Author | huggingface
Compiled | VK
Source | GitHub
Note: This is the very beginning of our experiments with TorchScript, and we are still exploring its capabilities for models with variable input sizes. It is a focus of interest for us, and we will deepen our analysis in upcoming releases with more code examples, a more flexible implementation, and benchmarks comparing Python-based code with compiled TorchScript.
According to the PyTorch documentation: "TorchScript is a way to create serializable and optimizable models from PyTorch code." PyTorch's two modules, JIT and TRACE, allow developers to export their models so that they can be reused in other programs, such as efficiency-oriented C++ programs.
We provide an interface that allows transformers models to be exported to TorchScript so that they can be reused in an environment other than a PyTorch-based Python program. Here we explain how to use our models so that they can be exported, and what to be mindful of when using these models with TorchScript.
Exporting a model requires two things:
- Dummy inputs to execute a forward pass of the model.
- The model must be instantiated with the torchscript flag.
These requirements imply several things that developers should be careful about. They are detailed below.
Implications
TorchScript flag and tied weights
This flag is necessary because most of the language models in this repository have tied weights between their Embedding layer and their Decoding layer. TorchScript does not allow the export of models that have tied weights, so the weights must be untied beforehand.
This implies that models instantiated with the torchscript flag have their Embedding layer and Decoding layer separated, which means that they should not be trained afterwards. Training would desynchronize the two layers, leading to unexpected results.
This is not the case for models that do not have a Language Model head, as those models do not have tied weights. These models can be safely exported without the torchscript flag.
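To make the notion of tied weights concrete, here is a minimal sketch on a toy model. It is not the transformers implementation: the class TinyLM, its sizes and the tie_weights argument are hypothetical names introduced for illustration only. The sketch shows that tied layers share a single parameter, whereas untied (cloned) layers drift apart as soon as only one of them is updated.

```python
import torch
from torch import nn


class TinyLM(nn.Module):
    """Toy language model (hypothetical): an embedding and a decoding layer."""

    def __init__(self, vocab_size=10, hidden_size=4, tie_weights=True):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.decoder = nn.Linear(hidden_size, vocab_size, bias=False)
        if tie_weights:
            # Tied: both layers point at the same Parameter object.
            self.decoder.weight = self.embedding.weight
        else:
            # Untied: the decoder starts from a copy of the embedding weights.
            self.decoder.weight = nn.Parameter(self.embedding.weight.clone())

    def forward(self, input_ids):
        return self.decoder(self.embedding(input_ids))


tied = TinyLM(tie_weights=True)
untied = TinyLM(tie_weights=False)

# Compare the memory addresses of the two weight tensors.
tied_shares_storage = tied.decoder.weight.data_ptr() == tied.embedding.weight.data_ptr()
untied_shares_storage = untied.decoder.weight.data_ptr() == untied.embedding.weight.data_ptr()

# Simulate a training update that touches only the embedding layer.
with torch.no_grad():
    tied.embedding.weight.add_(1.0)
    untied.embedding.weight.add_(1.0)

tied_in_sync = torch.equal(tied.decoder.weight, tied.embedding.weight)
untied_in_sync = torch.equal(untied.decoder.weight, untied.embedding.weight)
print(tied_shares_storage, untied_shares_storage, tied_in_sync, untied_in_sync)
```

In the untied case the decoder no longer follows the embedding after the update, which is the desynchronization that makes further training of an untied model risky.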
Dummy inputs and standard lengths
The dummy inputs are used to run a forward pass of the model. While the input values propagate through the layers, PyTorch keeps track of the different operations executed on each tensor. These recorded operations are then used to create the "trace" of the model.
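As a minimal sketch of this process, the snippet below traces a toy module with a dummy input and checks that the trace reproduces the eager output for that same input. The toy module stands in for a real transformers model; TinyEncoder and its sizes are hypothetical.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Toy stand-in for a real model (hypothetical)."""

    def __init__(self, vocab_size=50, hidden_size=8):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.dense = nn.Linear(hidden_size, hidden_size)

    def forward(self, input_ids):
        return torch.tanh(self.dense(self.embedding(input_ids)))


model = TinyEncoder()
model.eval()

# Dummy input: batch size 1, sequence length 7.
dummy_input = torch.randint(0, 50, (1, 7))

# The forward pass on the dummy input is recorded to build the trace.
traced_model = torch.jit.trace(model, dummy_input)

with torch.no_grad():
    eager_output = model(dummy_input)
    traced_output = traced_model(dummy_input)

# The recorded operations can be inspected as Python-like source.
print(traced_model.code)
outputs_match = torch.allclose(eager_output, traced_output)
print(outputs_match)
```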
The trace is created relative to the dimensions of the inputs. It is therefore constrained by the dimensions of the dummy input, and will not work for any other sequence length or batch size. When trying with a different size, an error such as the following is raised:
The expanded size of the tensor (3) must match the existing size (7) at non-singleton dimension 2
It is therefore recommended to trace the model with a dummy input at least as large as the largest input that will be fed to the model during inference. Shorter inputs can be padded to fill the missing values. Because the model is traced with a large input size, however, the dimensions of the different matrices will be large as well, resulting in more computation.
It is recommended to be careful about the total number of operations performed on each input, and to follow performance closely when exporting models with varying sequence lengths.
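The padding step mentioned above could look like the following sketch. The helper pad_to_traced_length, the constant MAX_LENGTH and the example token ids are assumptions made for illustration. The padding id 0 matches the [PAD] token of bert-base-uncased, but should be checked against the tokenizer actually used.

```python
MAX_LENGTH = 16  # assumed sequence length of the dummy input used for tracing
PAD_VALUE = 0    # assumed padding id; 0 is the [PAD] id in bert-base-uncased


def pad_to_traced_length(values, max_length=MAX_LENGTH, pad_value=PAD_VALUE):
    """Return (padded_values, mask); mask is 1 for real entries, 0 for padding."""
    if len(values) > max_length:
        raise ValueError(
            f"Input of length {len(values)} exceeds the traced length {max_length}"
        )
    padding = max_length - len(values)
    padded = list(values) + [pad_value] * padding
    mask = [1] * len(values) + [0] * padding
    return padded, mask


# Example ids for illustration only.
token_ids = [101, 2040, 2001, 102]
padded_ids, attention_mask = pad_to_traced_length(token_ids)
print(len(padded_ids), sum(attention_mask))
```

The padded list can be wrapped as torch.tensor([padded_ids]) before being passed to the traced model. The mask is only useful if the model was traced with an attention mask among its inputs; otherwise the padding positions are processed like real tokens.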
Using TorchScript in Python
Below are examples of using Python to save and load models, and of how to use the trace for inference.
Saving a model
This snippet shows how to use TorchScript to export a BertModel. Here the BertModel is instantiated according to a BertConfig class and then saved to disk under the filename traced_bert.pt.
from transformers import BertModel, BertTokenizer, BertConfig
import torch
enc = BertTokenizer.from_pretrained("bert-base-uncased")
# Tokenize the input text
text = "[CLS] Who was Jim Henson ? [SEP] Jim Henson was a puppeteer [SEP]"
tokenized_text = enc.tokenize(text)
# Mask one of the input tokens
masked_index = 8
tokenized_text[masked_index] = '[MASK]'
indexed_tokens = enc.convert_tokens_to_ids(tokenized_text)
segments_ids = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
# Create the dummy input
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])
dummy_input = [tokens_tensor, segments_tensors]
# Initialize the model with the torchscript flag.
# The flag is set to True even though it is not necessary, as this model does not have an LM head.
config = BertConfig(vocab_size=32000, hidden_size=768,
    num_hidden_layers=12, num_attention_heads=12, intermediate_size=3072, torchscript=True)
# Instantiate the model
model = BertModel(config)
# The model needs to be in evaluation mode
model.eval()
# If you instantiate the model with from_pretrained, you can also set the torchscript flag
model = BertModel.from_pretrained("bert-base-uncased", torchscript=True)
# Create the trace
traced_model = torch.jit.trace(model, [tokens_tensor, segments_tensors])
torch.jit.save(traced_model, "traced_bert.pt")
Loading a model
This snippet shows how to load the BertModel that was previously saved to disk under the name traced_bert.pt.
We reuse the previously initialized dummy_input.
loaded_model = torch.jit.load("traced_bert.pt")
loaded_model.eval()
# dummy_input is a list of two tensors, so it is unpacked into positional arguments
all_encoder_layers, pooled_output = loaded_model(*dummy_input)
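The save and load round trip can also be checked in isolation. The sketch below uses a small toy module in place of BertModel and a temporary file named traced_toy.pt (both hypothetical), so it runs without downloading any weights.

```python
import os
import tempfile

import torch
from torch import nn

# Toy stand-in for a real model (hypothetical sizes).
model = nn.Sequential(nn.Embedding(50, 8), nn.Linear(8, 8))
model.eval()

dummy_input = torch.randint(0, 50, (1, 7))
traced_model = torch.jit.trace(model, dummy_input)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "traced_toy.pt")
    # Serialize the trace to disk, then read it back.
    torch.jit.save(traced_model, path)
    loaded_model = torch.jit.load(path)

loaded_model.eval()
with torch.no_grad():
    original_output = traced_model(dummy_input)
    reloaded_output = loaded_model(dummy_input)

round_trip_ok = torch.allclose(original_output, reloaded_output)
print(round_trip_ok)
```

The loaded module no longer needs the Python class definition of the original model, which is what makes the saved file reusable from other programs.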
Using a traced model for inference
Using the traced model for inference is as simple as using its __call__ method:
traced_model(tokens_tensor, segments_tensors)
Original link: https://huggingface.co/transformers/torchscript.html