Fixing "topk_cpu" not implemented for 'Half'

1. Problem description

As the title says, the error "topk_cpu" not implemented for 'Half' appeared while loading a model locally with the transformers library. The full traceback is as follows:

  File "/Users/guomiansheng/anaconda3/envs/ep1/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/guomiansheng/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 1028, in chat
    outputs = self.generate(**inputs, **gen_kwargs)
  File "/Users/guomiansheng/anaconda3/envs/ep1/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/guomiansheng/anaconda3/envs/ep1/lib/python3.8/site-packages/transformers/generation/utils.py", line 1485, in generate
    return self.sample(
  File "/Users/guomiansheng/anaconda3/envs/ep1/lib/python3.8/site-packages/transformers/generation/utils.py", line 2538, in sample
    next_token_scores = logits_warper(input_ids, next_token_scores)
  File "/Users/guomiansheng/anaconda3/envs/ep1/lib/python3.8/site-packages/transformers/generation/logits_process.py", line 92, in __call__
    scores = processor(input_ids, scores)
  File "/Users/guomiansheng/anaconda3/envs/ep1/lib/python3.8/site-packages/transformers/generation/logits_process.py", line 302, in __call__
    indices_to_remove = scores < torch.topk(scores, top_k)[0][..., -1, None]
RuntimeError: "topk_cpu" not implemented for 'Half'
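The root cause can be reproduced outside transformers: on the PyTorch build used here, torch.topk has no CPU kernel for fp16 ('Half') tensors. A minimal sketch (the tensor shape and k are arbitrary values chosen only for illustration):

import torch

scores = torch.randn(1, 100, dtype=torch.float16)  # fp16 logits on the CPU
torch.topk(scores, k=3)  # raises: RuntimeError: "topk_cpu" not implemented for 'Half'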

2. Solution

The model weights here are half precision (fp16); the fp16 chatglm2-6b model needs about 13 GB of memory, so on a 16 GB MacBook Pro it can run out of memory and become very slow. The error itself occurs because torch.topk does not support fp16 on the CPU, so the fix is to convert the model to fp32 with float() and then move it with to("mps"). Of course, unless there is no other option, running on CUDA is still the better choice.
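As a concrete sketch, assuming the weights are the local chatglm2-6b checkpoint loaded with trust_remote_code=True (adjust model_path to wherever the checkpoint actually lives), the fix is simply to call .float() before optionally moving the model to the mps device:

import torch
from transformers import AutoTokenizer, AutoModel

model_path = "chatglm2-6b"  # local checkpoint directory; adjust as needed
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Cast the fp16 weights to fp32 so CPU ops such as torch.topk have a kernel,
# then (optionally) move the model to Apple's Metal backend.
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).float()
if torch.backends.mps.is_available():
    model = model.to("mps")
model = model.eval()

response, history = model.chat(tokenizer, "你好", history=[])
print(response)

With the model in fp32, the logits passed to the top-k warper are fp32 as well, so the "topk_cpu" not implemented for 'Half' error no longer occurs.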
