RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

Error message

RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

The full traceback:

/home/anaconda3/envs/py385/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1188         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1189                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1190             return forward_call(*input, **kwargs)
   1191         # Do not call functions when jit is used
   1192         full_backward_hooks, non_full_backward_hooks = [], []

/home/anaconda3/envs/py385/lib/python3.8/site-packages/torch/nn/modules/normalization.py in forward(self, input)
    188 
    189     def forward(self, input: Tensor) -> Tensor:
--> 190         return F.layer_norm(
    191             input, self.normalized_shape, self.weight, self.bias, self.eps)
    192 

/home/anaconda3/envs/py385/lib/python3.8/site-packages/torch/nn/functional.py in layer_norm(input, normalized_shape, weight, bias, eps)
   2513             layer_norm, (input, weight, bias), input, normalized_shape, weight=weight, bias=bias, eps=eps
   2514         )
-> 2515     return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
   2516 
   2517 

RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

Solution

I ran into this error recently while working with a large model. The cause was simply that I had forgotten to move the model to the GPU before running inference, so the fix is very simple:

model.to('cuda:0')

Moving the model onto the GPU resolves the error.
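To make the fix concrete, here is a minimal sketch; the tiny Linear + LayerNorm model is a hypothetical stand-in for whatever fp16 model you are actually loading, and a CUDA device is assumed to be available:

import torch
import torch.nn as nn

# hypothetical stand-in for a model converted to fp16 with .half()
model = nn.Sequential(nn.Linear(16, 16), nn.LayerNorm(16)).half()
x = torch.randn(1, 16, dtype=torch.float16)

# out = model(x)  # on the CPU this raises: "LayerNormKernelImpl" not implemented for 'Half'

model.to('cuda:0')    # nn.Module.to() moves parameters in place
x = x.to('cuda:0')    # tensors are not moved in place, so reassign
with torch.no_grad():
    out = model(x)    # runs in fp16 on the GPU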

Reference: https://github.com/huggingface/transformers/issues/21989

Error reason

In more detail: the error occurs because the model is in half precision (fp16), meaning that somewhere earlier in the code you presumably executed this line:

model = model.half()

However, fp16 does not work on the CPU: the CPU LayerNorm kernel has no Half implementation, so if you run inference on the CPU you have to stick with fp32.
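If you must run on the CPU, convert the model back to fp32 before inference. A minimal sketch, using the same hypothetical model as above:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 16), nn.LayerNorm(16)).half()

model.float()              # convert parameters back to fp32 in place
x = torch.randn(1, 16)     # fp32 input
with torch.no_grad():
    out = model(x)         # now runs on the CPU without the error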

Addendum

If you hit this error while deploying Stable Diffusion, this Hugging Face discussion may help:
https://huggingface.co/CompVis/stable-diffusion-v1-4/discussions/64
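The usual pattern there is the same idea: keep fp16 weights on the GPU, or drop fp16 entirely on a CPU-only machine. A sketch using the diffusers library (the model ID and prompt are just examples):

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    'CompVis/stable-diffusion-v1-4', torch_dtype=torch.float16
)
pipe = pipe.to('cuda')  # without this, fp16 inference on the CPU hits the LayerNorm error
image = pipe('a photo of an astronaut riding a horse').images[0]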

Source: https://blog.csdn.net/weixin_44826203/article/details/130112858