TypeError: sum() received an invalid combination of arguments - got (axis=NoneType, out=NoneType, )

Some errors encountered while training deep-learning code (errors keep appearing, so this list keeps being updated...).
Of course, it would be best to have no errors at all...

1. TypeError: sum() received an invalid combination of arguments - got (axis=NoneType, out=NoneType, ), but expected one of:
(*, torch.dtype dtype)
didn't match because some of the keywords were incorrect: axis, out
(tuple of names dim, bool keepdim, *, torch.dtype dtype)
(tuple of ints dim, bool keepdim, *, torch.dtype dtype)

Possible solution:
The error message shows that np.sum was called on a PyTorch tensor. NumPy's functions do not support PyTorch tensors directly: np.sum forwards its axis and out keywords to the tensor's own sum() method, which (in the PyTorch version used here) does not accept them, hence the TypeError. Convert the PyTorch tensor to a NumPy array first using:

numpy_array = tensor.cpu().numpy()
# add .cpu() in front of every .numpy() call
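
A minimal sketch of the fix, assuming the failing call was something like np.sum(tensor) on a GPU tensor (the variable names below are just for illustration, not from the original code):

import numpy as np
import torch

# hypothetical GPU tensor produced during training
losses = torch.rand(10, device="cuda" if torch.cuda.is_available() else "cpu")

# np.sum(losses) forwards axis=None/out=None to losses.sum(), which triggers
# the TypeError above on the PyTorch version this post was written against
total = np.sum(losses.detach().cpu().numpy())

# staying in PyTorch avoids the conversion entirely
total_torch = losses.sum().item()
print(total, total_torch)

If the tensor requires gradients, call .detach() before .cpu().numpy(), as shown above.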

2. RuntimeError: Invalid device string: 'cuda: 0'
When referencing CUDA devices in PyTorch, you usually use 'cuda' or < /span> should resolve the issue. with and replacing all has a space in the middle. Changing the device string to The currently used device string , etc.). , 'cuda:0' (if you have multiple GPUs, use 'cuda:1''cuda:2'
'cuda: 0''cuda:0''cuda: 0''cuda:0'

disc_net.to(device='cuda: 0')

Should be rewritten as:

disc_net.to(device='cuda:0')
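
One way to avoid this kind of typo, sketched here with a placeholder model (the original disc_net is not shown, so torch.nn.Linear stands in for it), is to build a torch.device once and reuse it everywhere; torch.device() rejects malformed strings such as 'cuda: 0' immediately:

import torch

# build the device object once and reuse it
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

disc_net = torch.nn.Linear(8, 1)  # placeholder for the actual discriminator network
disc_net.to(device)

x = torch.randn(4, 8, device=device)  # create data directly on the same device
print(disc_net(x).shape)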

3. RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)

① Check the CUDA version first: make sure the CUDA version you are using is compatible with your PyTorch version. Recommended CUDA versions can be found on the PyTorch official website.
② Clear GPU memory: before each run of the code, try clearing your GPU memory. You can use the following call to free unused cached memory.

torch.cuda.empty_cache()

③ Check the code: make sure your code does not produce inconsistencies when running in parallel on multiple GPUs. Sometimes running the model on a single GPU, without splitting it across several, can solve the problem. You can use torch.device("cuda:0") to allocate your model and data to a specific GPU (see the sketch after this list).
④ Check system resources: if other programs are taking up a lot of GPU resources, they may affect how this code runs. Try closing those resource-intensive programs and see if that resolves the issue.
⑤ Restart: finally, if none of the above works, try restarting the machine. Sometimes this simple action solves many seemingly complex problems.
(I solved it using ②)
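
As a rough sketch of what ② and ③ look like together, assuming a single-GPU machine (the model and batch below are placeholders, not the original code):

import torch

# ② release cached but unused GPU memory and report what is still held
if torch.cuda.is_available():
    torch.cuda.empty_cache()
    print("allocated:", torch.cuda.memory_allocated(0),
          "reserved:", torch.cuda.memory_reserved(0))

# ③ pin the model and its data to one specific GPU instead of splitting across several
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.nn.Linear(16, 2).to(device)   # placeholder model
batch = torch.randn(32, 16, device=device)  # placeholder batch
print(model(batch).shape)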


Origin blog.csdn.net/change_xzt/article/details/134577664