Foreword
Some projects need onnxruntime-gpu for inference. I assumed that, as on Windows, onnxruntime-gpu could be installed directly once CUDA was present, but it turned out to be surprisingly troublesome, so I am sharing this article to help those who come after.
Environment
A GPU cloud server with an NVIDIA V100.
When installing the system, I selected the highest available versions:
Ubuntu 20.04
CUDA 11.0.3
cuDNN 8.5.0
Run nvidia-smi on the command line:
You can see that CUDA has indeed been installed.
Error log
1.[W:onnxruntime:Default, onnxruntime_pybind_state.cc:541 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
This is a version-incompatibility problem caused by installing directly with the following command:
pip install onnxruntime-gpu
The specific version correspondence is shown in the table below.
Because the installed CUDA is 11.0.3, only onnxruntime-gpu 1.8 or 1.7 can be used.
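The lookup can be sketched as a small table. Only the pairing stated in this article (CUDA 11.0.3 with onnxruntime-gpu 1.7/1.8) is filled in; for other CUDA releases, consult the CUDA-ExecutionProvider requirements page linked in the error message:

```python
# Known-good CUDA (major.minor) -> onnxruntime-gpu versions.
# Only the row from this article is included; extend it from the
# official requirements table for other CUDA releases.
SUPPORTED = {
    "11.0": ["1.8.0", "1.7.0"],  # CUDA 11.0.3 works with ORT 1.7/1.8
}

def compatible_ort_versions(cuda_version):
    """Return onnxruntime-gpu versions known to match a CUDA release."""
    major_minor = ".".join(cuda_version.split(".")[:2])
    return SUPPORTED.get(major_minor, [])

print(compatible_ort_versions("11.0.3"))  # → ['1.8.0', '1.7.0']
```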
2.onnxruntime OSError: libcurand.so.10: cannot open shared object file: No such file or directory
This is because Tencent Cloud installs CUDA in an unusual location that is not found through the default environment variables, so it has to be added manually. Let me demonstrate how.
Solution
1. If you keep the default CUDA 11.0.3, run one of the following install commands:
pip install onnxruntime-gpu==1.8.0
or
pip install onnxruntime-gpu==1.7.0
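After installing, you can confirm the wheel actually exposes the CUDA provider via onnxruntime.get_available_providers(). A minimal check helper (my own sketch, not from the article — the sample list below stands in for what a working onnxruntime-gpu install returns):

```python
def cuda_ep_available(providers):
    """True if the CUDA execution provider appears in the list
    returned by onnxruntime.get_available_providers()."""
    return "CUDAExecutionProvider" in providers

# On the server you would run:
#   import onnxruntime
#   print(cuda_ep_available(onnxruntime.get_available_providers()))
# Here, a sample list as a working onnxruntime-gpu install reports it:
print(cuda_ep_available(["CUDAExecutionProvider", "CPUExecutionProvider"]))  # → True
```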
2. Find the CUDA library directory:
find / -name "libcudart.so.11.0"
-name is followed by the file name; replace it with whichever library the error reports as missing. On my machine the libraries are in the following directory:
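The same pattern, demonstrated against a scratch directory so the command shape is clear (the /tmp/cuda-demo path is only for illustration; on the server you search from / as above):

```shell
# Create a dummy library file, then locate it with find -name,
# exactly as you would locate the real libcurand.so.10 from /.
mkdir -p /tmp/cuda-demo
touch /tmp/cuda-demo/libcurand.so.10
find /tmp/cuda-demo -name "libcurand.so.10"
# → /tmp/cuda-demo/libcurand.so.10
```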
3. Configure environment variables:
First, open:
vi /etc/profile
Then press i to enter insert mode and append the following lines at the end of the file:
export LD_LIBRARY_PATH=/usr/local/lib/python3.8/dist-packages/nvidia/cudnn/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/lib/python3.8/dist-packages/nvidia/cublas/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/lib/python3.8/dist-packages/nvidia/cuda_runtime/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/lib/python3.8/dist-packages/nvidia/cuda_nvrtc/lib:$LD_LIBRARY_PATH
Adjust the directory locations to match your own system; the ones above are the defaults on mine. After the modification, it looks like this:
Press Esc to exit insert mode, then type :wq and press Enter to save and quit.
Then make it take effect:
source /etc/profile
Enter the following command to confirm the environment variable has taken effect:
echo $LD_LIBRARY_PATH
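A slightly safer variant of the exports above is a helper (my own sketch, not from the article) that only prepends a directory when it actually exists, so a wrong guess at the path does not leave dead entries in LD_LIBRARY_PATH:

```shell
add_lib_path() {
  # Prepend a directory to LD_LIBRARY_PATH only if it exists.
  if [ -d "$1" ]; then
    export LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
  fi
}

# Directories from this article; non-existent paths are skipped silently.
add_lib_path /usr/local/lib/python3.8/dist-packages/nvidia/cudnn/lib
add_lib_path /usr/local/lib/python3.8/dist-packages/nvidia/cublas/lib
echo "$LD_LIBRARY_PATH"
```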
If it is correct, running CUDA operators from your framework will no longer produce the error.