Using TorchScript with Transformers | Part 4

Author | huggingface
Compiled | VK
Source | GitHub

Note: This is the very beginning of our experiments with TorchScript, and we are still exploring its capabilities with variable-input-size models. It is a focus of our attention, and we will deepen our analysis in upcoming releases, with more code examples, a more flexible implementation, and benchmarks comparing Python-based code with compiled TorchScript.

According to the PyTorch documentation: "TorchScript is a way to create serializable and optimizable models from PyTorch code." PyTorch's two modules, JIT and TRACE, allow developers to export their models to be reused in other programs, such as efficiency-oriented C++ programs.

We provide an interface that allows transformers models to be exported to TorchScript, so that they can be reused in environments other than PyTorch-based Python programs. Here, we explain how to export and use our models, and the caveats to keep in mind when using these models with TorchScript.

Exporting a model requires two things:

  • Dummy inputs used to execute a forward pass of the model.
  • The model must be instantiated with the torchscript flag.

These requirements imply several things developers should be careful about, detailed below.

Implications

The TorchScript flag and tied weights

This flag is necessary because most of the language models in this repository have weights tied between their Embedding layer and their Decoding layer. TorchScript does not allow the export of models with tied weights, so the weights must be untied beforehand.

This means that models instantiated with the torchscript flag have their Embedding layer and Decoding layer separated, and therefore they should not be trained afterwards: training would desynchronize the two layers, leading to unexpected results.

This is not the case for models that do not have a Language Model head, as those models do not have tied weights. Such models can be safely exported without the torchscript flag.
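
The effect of the flag can be observed by checking whether the input and output embeddings share the same weight tensor. Here is a minimal sketch, not part of the original documentation, assuming a model with an LM head such as BertForMaskedLM:

from transformers import BertForMaskedLM

# Default instantiation: the Embedding layer and the LM head decoder are tied
tied_model = BertForMaskedLM.from_pretrained("bert-base-uncased")
print(tied_model.get_input_embeddings().weight is tied_model.get_output_embeddings().weight)  # True

# With torchscript=True, the weights are cloned instead of tied
untied_model = BertForMaskedLM.from_pretrained("bert-base-uncased", torchscript=True)
print(untied_model.get_input_embeddings().weight is untied_model.get_output_embeddings().weight)  # False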

Dummy inputs and standard lengths

The dummy inputs are used for a forward pass of the model. While the input values are propagated through the layers, PyTorch keeps track of the different operations executed on each tensor. These recorded operations are then used to create the "trace" of the model.

The trace is created relative to the input dimensions. It is therefore constrained by the dimensions of the dummy input, and will not work for any other sequence length or batch size. When trying a different size, an error such as the following is raised:

The expanded size of the tensor (3) must match the existing size (7) at non-singleton dimension 2

It is therefore recommended to trace the model with a dummy input size at least as large as the largest input that will be fed to the model during inference. Padding can be used to fill in the missing values. However, since the model will have been traced with a large input size, the dimensions of the different matrices will also be large, resulting in more computation.
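
For illustration, here is a minimal padding sketch; the traced length and the sample token ids are hypothetical, and 0 is the [PAD] token id for bert-base-uncased:

import torch

TRACED_SEQ_LEN = 14                          # hypothetical: the length the trace was created with
short_tokens = [101, 2040, 2001, 3958, 102]  # hypothetical: a shorter tokenized input
pad_id = 0                                   # [PAD] token id for bert-base-uncased

# Pad the input up to the traced length before feeding it to the traced model
padded_tokens = short_tokens + [pad_id] * (TRACED_SEQ_LEN - len(short_tokens))
padded_tensor = torch.tensor([padded_tokens])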

Note that it is recommended to be careful of the total number of operations performed on each input, and to follow performance closely when exporting models with varying sequence lengths.

Using TorchScript in Python

Below is an example showing how to save and load models in Python, as well as how to use the trace for inference.

Saving a model

This snippet shows how to use TorchScript to export a BertModel. Here, the BertModel is instantiated according to a BertConfig class and then saved to disk under the filename traced_bert.pt.

from transformers import BertModel, BertTokenizer, BertConfig
import torch

enc = BertTokenizer.from_pretrained("bert-base-uncased")

# Tokenizing input text
text = "[CLS] Who was Jim Henson ? [SEP] Jim Henson was a puppeteer [SEP]"
tokenized_text = enc.tokenize(text)

# Masking one of the input tokens
masked_index = 8
tokenized_text[masked_index] = '[MASK]'
indexed_tokens = enc.convert_tokens_to_ids(tokenized_text)
segments_ids = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

# Creating the dummy input
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])
dummy_input = [tokens_tensor, segments_tensors]

# Initializing the model with the torchscript flag
# The flag is set to True even though it is not necessary, as this model does not have an LM Head.
config = BertConfig(vocab_size_or_config_json_file=32000, hidden_size=768,
    num_hidden_layers=12, num_attention_heads=12, intermediate_size=3072, torchscript=True)

# Instantiating the model
model = BertModel(config)

# The model needs to be in evaluation mode
model.eval()

# If you are instantiating the model with from_pretrained, you can also set the TorchScript flag there
model = BertModel.from_pretrained("bert-base-uncased", torchscript=True)

# Creating the trace
traced_model = torch.jit.trace(model, [tokens_tensor, segments_tensors])
torch.jit.save(traced_model, "traced_bert.pt")

Loading a model

This snippet shows how to load a BertModel that was previously saved to disk under the name traced_bert.pt. We reuse the previously initialized dummy_input.

loaded_model = torch.jit.load("traced_bert.pt")
loaded_model.eval()

all_encoder_layers, pooled_output = loaded_model(*dummy_input)

Using a traced model for inference

Using the traced model for inference is as simple as using its __call__ method:

traced_model(tokens_tensor, segments_tensors)
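
As a quick sanity check (a sketch, not part of the original example), the traced model should produce the same outputs as the original model on the same inputs:

import torch

with torch.no_grad():
    original_output = model(tokens_tensor, segments_tensors)
    traced_output = traced_model(tokens_tensor, segments_tensors)

# Both calls return (sequence_output, pooled_output); compare the sequence outputs
print(torch.allclose(original_output[0], traced_output[0], atol=1e-5))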

Original link: https://huggingface.co/transformers/torchscript.html
