Training models on AMD graphics cards under Windows: running Transformers with pytorch_directml

This post trains a Transformer model on an AMD graphics card under Windows. For the installation steps, see the earlier post: "Training with an AMD graphics card under Windows: pytorch-directml major upgrade, now a PyTorch plug-in with better compatibility" (znsoft's blog, CSDN).
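Before running the script below, it helps to confirm that the DirectML backend actually sees the GPU. A minimal sketch, assuming torch-directml is already installed as described in the linked post (device_count and device_name are helpers in the torch_directml package; if your version lacks them, a small tensor operation on the device is enough):

import torch
import torch_directml

# List the DirectML-visible adapters and create a device handle.
print("DirectML adapters:", torch_directml.device_count())
print("Default adapter:", torch_directml.device_name(0))

dml = torch_directml.device()        # defaults to the first adapter
x = torch.ones(2, 2, device=dml)     # tiny tensor op to confirm the device works
print((x + x).cpu())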

import os
import importlib.util

# Use the DirectML plug-in (torch_directml) when it is installed; otherwise fall back to the CPU.
found_directml = importlib.util.find_spec("torch_directml") is not None
if found_directml:
    import torch_directml

import torch
from transformers import RobertaTokenizer, RobertaConfig, RobertaModel, RobertaForMaskedLM, pipeline

DIR="E:/transformers"
MODEL_NAME="microsoft/codebert-base"
from transformers import AutoTokenizer, AutoModel

if found_directml:
    device=torch_directml.device()
else:
    device=torch.device("cpu")

# Example (commented out): encode a natural-language query and a code snippet with CodeBERT.
# tokenizer = AutoTokenizer.from_pretrained(DIR+os.sep+MODEL_NAME)
# model = AutoModel.from_pretrained(DIR+os.sep+MODEL_NAME).to(device)
# nl_tokens = tokenizer.tokenize("return maximum value")
# code_tokens = tokenizer.tokenize("def max(a,b): if a>b: return a else return b")
# tokens = [tokenizer.cls_token] + nl_tokens + [tokenizer.sep_token] + code_tokens + [tokenizer.eos_token]
# tokens_ids = tokenizer.convert_tokens_to_ids(tokens)
# tokens_ids = torch.tensor(tokens_ids)[None, :]
# tokens_ids = tokens_ids.to(device)         # .to() returns a new tensor; reassign it
# context_embeddings = model(tokens_ids)[0]  # pass the ids to the model
# print(context_embeddings)



MODEL_NAME="microsoft/codebert-base-mlm"
model = RobertaForMaskedLM.from_pretrained(DIR+os.sep+MODEL_NAME)
tokenizer = RobertaTokenizer.from_pretrained(DIR+os.sep+MODEL_NAME)
model.to(device)
CODE = "if (x is not None) <mask> (x>1)"
code=tokenizer(CODE)
#.to(device)
input_ids=torch.tensor([code["input_ids"]]).to(device)
attention_mask=torch.tensor([code["attention_mask"]]).to(device)
for i in range(1000):
    out=model(input_ids=input_ids,attention_mask=attention_mask)
print(out)

Note that using the pipeline API directly may cause problems, most likely because the pipeline is not yet compatible with the DirectML backend. Write the inference code yourself, as above, and avoid the pipeline; AMD GPU utilization then goes up as expected.
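Since the pipeline is avoided, the fill-mask decoding has to be done by hand. A minimal sketch, reusing the model, tokenizer, input_ids and attention_mask from the script above (the top-5 cutoff is an arbitrary choice):

# Decode the <mask> prediction manually instead of using pipeline("fill-mask").
with torch.no_grad():
    out = model(input_ids=input_ids, attention_mask=attention_mask)

logits = out.logits.cpu()                                  # move results off the DirectML device
mask_pos = (input_ids.cpu() == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = torch.topk(logits[0, mask_pos[0]], k=5).indices  # best candidates for <mask>
print([tokenizer.decode([i]) for i in top_ids.tolist()])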

Original post: blog.csdn.net/znsoft/article/details/129135679