question
I was recently allocated several H100 GPUs, but running my program produces the following warning:
NVIDIA H100 PCIe with CUDA capability sm_90 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70 sm_75 sm_80 sm_86.
If you want to use the NVIDIA H100 PCIe GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
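The installed CUDA version was 12.1 and the torch version was 2.0.1. As the message says, that PyTorch build contains no kernels for sm_90, the compute capability of the H100.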
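To confirm the mismatch before reinstalling, one can compare each GPU's compute capability with the architectures compiled into the installed wheel. The following is a minimal sketch, not PyTorch's own check. `build_supports` and `report` are hypothetical helper names. The sketch assumes a build is usable when it contains `sm_` kernels with the same major version as the device. The real values come from `torch.cuda.get_arch_list()` and `torch.cuda.get_device_capability()`.

```python
import re


def build_supports(arch_list, capability):
    """Hypothetical helper: True if any sm_XY entry shares the device's major version.

    Simplifying assumption: kernels compiled for one major compute capability
    run on devices of the same major version. PTX (compute_XY) entries are ignored.
    """
    device_major = capability[0]
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)[a-z]?", arch)
        if match and int(match.group(1)) // 10 == device_major:
            return True
    return False


def report():
    """Print what the installed build supports next to what each GPU needs."""
    import torch  # imported here so build_supports works without PyTorch

    arch_list = torch.cuda.get_arch_list()
    print("CUDA version of this build:", torch.version.cuda)
    print("Compiled architectures:", arch_list)
    for index in range(torch.cuda.device_count()):
        capability = torch.cuda.get_device_capability(index)
        name = torch.cuda.get_device_name(index)
        print(index, name, capability, "supported:", build_supports(arch_list, capability))


# Architecture list copied from the warning above: the old build lacks sm_90.
old_build = ["sm_37", "sm_50", "sm_60", "sm_70", "sm_75", "sm_80", "sm_86"]
print(build_supports(old_build, (9, 0)))              # False
print(build_supports(old_build + ["sm_90"], (9, 0)))  # True
```

Calling `report()` on the affected machine prints the same comparison for the GPUs that are actually installed.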
solution
Uninstall the previous installation, then install the CUDA 11.8 build of PyTorch together with the CUDA 11.8 toolkit:
pip install torch==2.0.0+cu118 torchaudio==2.0.0+cu118 torchvision==0.15.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
conda install cudnn
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install -c "nvidia/label/cuda-11.8.0" cuda-runtime
verify
import torch
print("PyTorch Version:", torch.__version__)
print("Is available:", torch.cuda.is_available())
print("Current Device:", torch.cuda.current_device())
print("Number of GPUs:", torch.cuda.device_count())
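`torch.cuda.is_available()` can return True even when the build has no kernels for the device, so it can help to run a small computation on every GPU as well. This is a minimal sketch. `gpu_smoke_test` is a hypothetical helper name, and returning None when PyTorch or a CUDA device is missing is a choice made for this example.

```python
def gpu_smoke_test():
    """Run a tiny matrix multiply on every visible GPU.

    Returns None if PyTorch or CUDA is unavailable, otherwise a dict that maps
    each device index to True when the GPU result matches the CPU result.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    results = {}
    a = torch.arange(16, dtype=torch.float32).reshape(4, 4)
    expected = a @ a  # reference result computed on the CPU
    for index in range(torch.cuda.device_count()):
        device = torch.device("cuda", index)
        # On an unsupported architecture, this is where a CUDA error would be expected.
        actual = (a.to(device) @ a.to(device)).cpu()
        results[index] = bool(torch.allclose(actual, expected))
    return results


print(gpu_smoke_test())
```

If every visible GPU works, each device index in the printed dict maps to True.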
result
import torch
print("PyTorch Version:", torch.__version__)
# PyTorch Version: 2.0.0+cu118
print("Is available:", torch.cuda.is_available())
# Is available: True
print("Current Device:", torch.cuda.current_device())
# Current Device: 0
print("Number of GPUs:", torch.cuda.device_count())
# Number of GPUs: 8