CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization

background:

The following warning appears when a test case is run with TensorRT:

CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading

reason:

The NVIDIA Linux driver beta and the open-source NVIDIA GPU kernel driver were released together with CUDA 11.7.
The CUDA 11.7 Toolkit is a feature update to NVIDIA's proprietary compute stack. It adds compatibility with the new NVIDIA Open GPU kernel module, and another important highlight is lazy loading support.
Lazy loading delays loading a kernel from the host to the GPU until that kernel is first called. Only the kernels that are actually used get loaded, which can save a significant amount of device memory. It also moves loading latency from application startup to the first call of each kernel: the overall binary loading latency is usually much lower, but part of it is shifted to later in the application.
To enable this feature, set the environment variable CUDA_MODULE_LOADING=LAZY before starting the process.
Note that this feature is only compatible with libraries compiled with CUDA version >= 11.7.
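As a minimal sketch of the step above, the following Python snippet launches a child process with CUDA_MODULE_LOADING=LAZY already in its environment. The helper names `lazy_loading_env` and `run_with_lazy_loading` are made up for illustration, and the child command here is only a stand-in that echoes the variable; in practice you would pass the command that runs your own TensorRT test case.

```python
import os
import subprocess
import sys


def lazy_loading_env(base_env=None):
    """Return a copy of the environment with CUDA lazy loading requested."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_MODULE_LOADING"] = "LAZY"
    return env


def run_with_lazy_loading(cmd):
    """Start cmd as a child process with CUDA_MODULE_LOADING=LAZY set."""
    return subprocess.run(
        cmd,
        env=lazy_loading_env(),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Stand-in command (assumption): it only prints the variable it sees.
    # Replace it with the command that runs your TensorRT test case.
    echo_cmd = [
        sys.executable,
        "-c",
        "import os; print(os.environ.get('CUDA_MODULE_LOADING'))",
    ]
    result = run_with_lazy_loading(echo_cmd)
    print(result.stdout.strip())
```

Setting the variable in the parent and passing it to the child keeps the "before starting the process" requirement explicit. Setting `os.environ["CUDA_MODULE_LOADING"]` at the very top of a script, before any CUDA-using library is imported, is likely to work as well, but treat that as an assumption to verify on your setup.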
After setting the variable and running the test case again, the warning no longer appears.

Reposted from: blog.csdn.net/s1_0_2_4/article/details/135026761