Compiling opencv-python with CMake on Windows to use GPU resources

  • Platform and Software

Windows 11

Visual Studio 2019: Visual Studio Community 2019

CMake: cmake-3.25.1-windows-x86_64.msi

OpenCV 4.5.2: opencv-4.5.2.tar.gz

OpenCV_contrib 4.5.2: opencv_contrib-4.5.2.tar.gz

  • Problem:

The OpenCV packages that pip or conda install for Python only support the CPU. To use the GPU, the OpenCV source code has to be compiled and installed with CMake and Visual Studio. The ultimate goal is to develop CUDA-enabled OpenCV C++ projects in Visual Studio and to use the GPU to accelerate opencv-python in Python. Getting there involves a fairly long series of steps.

1. Install Visual Studio 2019

Go to the official download page (you may need to log in), then select the Community edition to download.

Note that this only downloads the installer. The actual installation happens when you run that file: you choose the installation directory, and the installer downloads the package files online. The process is straightforward and is not described in detail here. An alternate download link for the installer is provided below.

Link: https://pan.baidu.com/s/1l7IZBIXw7qiGZC9upCylgg
extraction code: 62dn

2. Install CUDA and cuDNN

Go to the official website to download CUDA (the official download page for each version of the CUDA Toolkit) and the corresponding cuDNN (cuDNN Archive). Make sure the cuDNN version matches the CUDA version. Then open the driver download page and select the graphics driver that matches your graphics card model.

View the highest version of CUDA supported by the driver

  • Check the official website for the minimum driver version required by each CUDA version.

After the NVIDIA driver has been installed successfully, press Win+R, enter cmd to open a command prompt, and run nvidia-smi to view the NVIDIA driver version and the highest CUDA version it currently supports.
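As a minimal sketch of the check above, the header of the nvidia-smi output can also be read from Python. The function name parse_nvidia_smi_header is hypothetical, and the regular expression assumes the usual header line of the form "Driver Version: ... CUDA Version: ...".

```python
import re
import shutil
import subprocess


def parse_nvidia_smi_header(text):
    """Return (driver_version, highest_supported_cuda) or None if not found."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", text
    )
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    # Only call nvidia-smi when it is actually on PATH.
    if shutil.which("nvidia-smi"):
        output = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True
        ).stdout
        print(parse_nvidia_smi_header(output))
    else:
        print("nvidia-smi not found on PATH")
```

The second value is the highest CUDA version the driver supports, not necessarily the toolkit version that is installed.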

A reminder: you may have several versions of CUDA installed, and you can switch between them by setting environment variables. When installing additional versions of CUDA, install only the CUDA libraries and not the driver; otherwise the installation will fail.
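To see which toolkits the environment variables currently point to, something like the following sketch may help. It assumes the Windows CUDA installer has set CUDA_PATH for the active toolkit and one CUDA_PATH_V<major>_<minor> variable per installed version; the function name is hypothetical.

```python
import os
import re


def cuda_toolkits_from_env(env):
    """Map 'major.minor' to the install directory for each CUDA_PATH_V* variable."""
    found = {}
    for name, value in env.items():
        match = re.fullmatch(r"CUDA_PATH_V(\d+)_(\d+)", name.upper())
        if match:
            found[f"{match.group(1)}.{match.group(2)}"] = value
    return found


if __name__ == "__main__":
    # CUDA_PATH is the toolkit that tools such as CMake pick up by default.
    print("active CUDA_PATH:", os.environ.get("CUDA_PATH"))
    for version, path in sorted(cuda_toolkits_from_env(os.environ).items()):
        print(version, "->", path)
```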

During installation, the CUDA installer checks for the Visual Studio 2019 integration option. If Visual Studio is not present, this configuration cannot succeed, so be sure to install Visual Studio 2019 first.

Once the installation has succeeded, enter nvcc -V on the command line to display the CUDA version information. The version shown here is the one currently selected by the environment variables, which may differ from the one displayed by nvidia-smi.
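The toolkit version reported by nvcc can be extracted in the same spirit. This is only a sketch: parse_nvcc_release is a hypothetical helper, and it assumes nvcc prints a line such as "Cuda compilation tools, release 11.3, V11.3.109".

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Return the 'major.minor' release string from nvcc -V output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Only call nvcc when it is on PATH (that is, a toolkit is selected).
    if shutil.which("nvcc"):
        output = subprocess.run(
            ["nvcc", "-V"], capture_output=True, text=True
        ).stdout
        print("toolkit selected by the environment:", parse_nvcc_release(output))
    else:
        print("nvcc not found on PATH")
```

This value is the toolkit selected through the environment variables, so it should not be higher than the maximum CUDA version that nvidia-smi reports for the driver.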

Installing and configuring cuDNN is relatively simple. Download the archive, decompress it, rename it, and copy its files into the corresponding folders under C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v1x.x, replacing any existing files.
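A quick way to confirm that the copy worked is to look for the cuDNN files inside the toolkit directory. The sketch below assumes the usual Windows layout (bin\cudnn*.dll, include\cudnn*.h, lib\x64\cudnn*.lib) and reads the toolkit directory from CUDA_PATH; find_cudnn_files is a hypothetical name.

```python
import glob
import os


def find_cudnn_files(cuda_root):
    """List the cuDNN DLLs, headers and import libraries under a CUDA directory."""
    patterns = {
        "bin": os.path.join(cuda_root, "bin", "cudnn*.dll"),
        "include": os.path.join(cuda_root, "include", "cudnn*.h"),
        "lib": os.path.join(cuda_root, "lib", "x64", "cudnn*.lib"),
    }
    return {key: sorted(glob.glob(pattern)) for key, pattern in patterns.items()}


if __name__ == "__main__":
    cuda_root = os.environ.get("CUDA_PATH")
    if cuda_root:
        for key, files in find_cudnn_files(cuda_root).items():
            # An empty list means that part of cuDNN was not copied.
            print(key, "->", files if files else "MISSING")
    else:
        print("CUDA_PATH is not set")
```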

Note that when OpenCV is compiled later, CUDA and cuDNN are detected through the configuration options, so if several CUDA versions are installed, the matching cuDNN version must be installed for each of them. The resulting environment is as follows:

Set the path as follows:

3. Test Visual Studio

1. Open Visual Studio and choose to create a new project.

2. Find CUDA at the bottom of the list.

3. Choose a location for the project.

4. Open the sample file.

5. Run it.

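4. Download the OpenCV source code

Go to the OpenCV official website and choose the version you need, for example OpenCV-4.5.3.

Go to the official OpenCV GitHub repository and select the matching contrib version through the tags.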

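5. Install and configure CMake

Download CMake from the official CMake website; I used CMake 3.25.1. In the fields marked by the red box, set the OpenCV source directory and the output directory for the build, respectively. The two are sibling directories.

The CMake configuration is the key to a successful build, and there are many pitfalls here. Note that Configure has to be run at least twice.

The output directory, build, should preferably be empty, so that the central configuration area is blank when CMake is first opened. Click Configure, select Visual Studio 2019 with the x64 architecture, and click Finish to start the first pass, which generates the configuration entries.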

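1. The configuration process downloads various dependencies and will very likely stall because of network problems. See the other blog post on this: downloading the files directly into the .cache folder avoids the trouble.

2. (Optional) Create a virtual environment (Anaconda is recommended) and install numpy in it, since numpy is needed at build time. The purpose of this step is to install the CUDA build of OpenCV into the virtual environment; skip it if you only install into the host environment.

3. (Optional, but it requires the previous step) Change the following variables so that each points to the corresponding location in the virtual environment: PYTHON3_EXECUTABLE, PYTHON3_INCLUDE_DIR, PYTHON3_LIBRARY, PYTHON3_NUMPY_INCLUDE_DIRS, PYTHON3_PACKAGES_PATH.

One pitfall here: the native Python installation must be the same version as the Python in the virtual environment. I previously had Python 3.6 installed while the virtual environment used Python 3.7; the build succeeded, but opencv-python could never detect the GPU.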
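The values for these PYTHON3_* variables can be printed from inside the activated virtual environment rather than located by hand. The sketch below is one possible way to do it: python3_cmake_values is a hypothetical helper, and the PYTHON3_LIBRARY path assumes the usual Windows layout with a libs folder that contains pythonXY.lib.

```python
import os
import sys
import sysconfig


def python3_cmake_values():
    """Collect candidate values for OpenCV's PYTHON3_* CMake entries."""
    paths = sysconfig.get_paths()
    tag = f"{sys.version_info.major}{sys.version_info.minor}"
    values = {
        "PYTHON3_EXECUTABLE": sys.executable,
        "PYTHON3_INCLUDE_DIR": paths["include"],
        # Assumption: Windows layout, <install dir>\libs\python37.lib or similar.
        "PYTHON3_LIBRARY": os.path.join(sys.base_prefix, "libs", f"python{tag}.lib"),
        "PYTHON3_PACKAGES_PATH": paths["purelib"],
    }
    try:
        import numpy
        values["PYTHON3_NUMPY_INCLUDE_DIRS"] = numpy.get_include()
    except ImportError:
        # numpy is required at build time; None signals that it is missing.
        values["PYTHON3_NUMPY_INCLUDE_DIRS"] = None
    return values


if __name__ == "__main__":
    for name, value in python3_cmake_values().items():
        print(f"{name} = {value}")
```

Run it with the interpreter of the environment you want OpenCV installed into, then paste the printed paths into the matching entries in the CMake GUI.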

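4. After Configure finishes, type CUDA and fast into the Search box and check these three options, in order: WITH_CUDA, OPENCV_DNN_CUDA, ENABLE_FAST_MATH.

Check WITH_CUDA. If you want to use OpenCV's SIFT algorithm, also check OPENCV_ENABLE_NONFREE:

Pay special attention to checking and editing the entries in the green box. CUDA_TOOLKIT_ROOT_DIR is the directory of the CUDA version currently configured on this machine.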

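5. Search for MODULES and, in the OPENCV_EXTRA_MODULES_PATH entry, add the modules directory of opencv_contrib-4.5.2.

6. Search for world and check BUILD_opencv_world. This compiles all OpenCV libraries into one, so you do not have to add each small module individually.

7. Search for BUILD and check BUILD_opencv_python3. This step is important: it determines whether GPU acceleration can be used from Python.

8. Search for NON and check OPENCV_ENABLE_NONFREE.

9. Check BUILD_opencv_world.

10. Uncheck the python, test, and java entries you do not need to speed up the later build (leave BUILD_opencv_python3 from step 7 checked).

11. Click Configure a second time and wait for the log at the bottom to show Configuring done.

12. Type cuda into the search box and check CUDA_FAST_MATH. In CUDA_ARCH_BIN, change the compute capability list to that of your own graphics card (compute capability lookup). By default the list starts from the lowest value, 3.0, which not only slows down the configuration but also, because recent CUDA 11 no longer supports compute_30, produces the following error:

nvcc fatal : Unsupported gpu architecture 'compute_30'

The fix is to go back to CMake, find CUDA_ARCH_BIN, delete 3.0 from it, and then generate again.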
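The edit to CUDA_ARCH_BIN amounts to filtering a list of compute capabilities. The sketch below illustrates that on a plain string; drop_old_architectures is a hypothetical helper, the default minimum of 3.5 is an assumption, and entries are assumed to be plain major.minor values separated by spaces, commas or semicolons.

```python
import re


def drop_old_architectures(arch_bin, minimum="3.5"):
    """Remove compute capabilities below `minimum` from a CUDA_ARCH_BIN string."""

    def as_tuple(version):
        major, minor = version.split(".")
        return int(major), int(minor)

    entries = [e for e in re.split(r"[;,\s]+", arch_bin.strip()) if e]
    kept = [e for e in entries if as_tuple(e) >= as_tuple(minimum)]
    return " ".join(kept)


if __name__ == "__main__":
    # Example value only; use the list that CMake shows for your setup.
    print(drop_old_architectures("3.0 3.5 5.0 6.1 7.5"))
```

In practice it is usually enough to keep only the compute capability of the card that will run the code, which also shortens the build.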

13. Click Configure again. This time Configuring done finally appears without errors. Then click Generate to produce the files to be compiled; after a moment, Generating done appears. Do not close CMake yet.

Click Open Project to launch Visual Studio and build there.
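The options selected in the CMake GUI are ordinary cache entries, so the same configuration could in principle be written as one cmake command line. The sketch below only assembles that command as a list of arguments. The function name and all paths are hypothetical, and the option set mirrors the steps above rather than any official recipe.

```python
def opencv_cuda_cmake_args(source_dir, build_dir, contrib_modules, arch_bin):
    """Build a cmake command line that mirrors the GUI choices described above."""
    options = {
        "WITH_CUDA": "ON",
        "OPENCV_DNN_CUDA": "ON",
        "ENABLE_FAST_MATH": "ON",
        "CUDA_FAST_MATH": "ON",
        "OPENCV_ENABLE_NONFREE": "ON",
        "BUILD_opencv_world": "ON",
        "BUILD_opencv_python3": "ON",
        "OPENCV_EXTRA_MODULES_PATH": contrib_modules,
        "CUDA_ARCH_BIN": arch_bin,
    }
    args = [
        "cmake", "-S", source_dir, "-B", build_dir,
        "-G", "Visual Studio 16 2019", "-A", "x64",
    ]
    args += [f"-D{name}={value}" for name, value in options.items()]
    return args


if __name__ == "__main__":
    # Placeholder paths and compute capability; replace them with your own.
    command = opencv_cuda_cmake_args(
        "opencv-4.5.2", "build", "opencv_contrib-4.5.2/modules", "7.5"
    )
    print(" ".join(command))
```

Keeping the options in one place like this makes it easier to repeat the configuration later, for example after switching CUDA versions.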

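6. Build in Visual Studio

After opening the project as described above, you land in the Visual Studio window. The first thing to do is switch the configuration to Release.

Also make sure that opencv_python3 appears under the bindings folder before going on; otherwise CUDA acceleration will not be usable even if the build succeeds.

First choose ALL_BUILD -> Build, and then comes a long wait. On my machine the build took about an hour. I will not post the screenshot; in short, a build that finishes with no errors has succeeded.

After that build completes, choose INSTALL -> Build; this one is faster. Wait for it to finish. The result after completion is as follows: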
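The two Visual Studio build steps also have a command-line counterpart in cmake --build. The sketch below only constructs the two commands and does not run them. It assumes the build directory was generated for Visual Studio, where ALL_BUILD and INSTALL exist as targets; the function name is hypothetical.

```python
def vs_build_commands(build_dir, config="Release"):
    """Return the cmake commands equivalent to building ALL_BUILD, then INSTALL."""
    base = ["cmake", "--build", build_dir, "--config", config]
    return [base + ["--target", "ALL_BUILD"], base + ["--target", "INSTALL"]]


if __name__ == "__main__":
    # "build" is a placeholder for the CMake output directory used earlier.
    for command in vs_build_commands("build"):
        print(" ".join(command))
```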

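7. Verification

After the build completes, the file cv2.cp37-win_amd64.pyd can be found in the opencv_cuda\build\lib\python3\Release folder (the name varies slightly with the Python version).

In addition, in the virtual environment or the host environment, a cv2 folder appears under Lib\site-packages.

Verify the OpenCV environment

Open a Python session from the command line and run the following code to verify: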

c:\users\administrator> python
>>> import cv2
>>> cv2.cuda.getCudaEnabledDeviceCount()
1

The returned value is the number of GPU devices found. A result of 1 here means the GPU build of OpenCV was installed successfully.
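Besides the device count, the build summary returned by cv2.getBuildInformation() shows whether CUDA and cuDNN were compiled in. The sketch below filters that text for the relevant lines. cuda_related_lines is a hypothetical helper, and the import is guarded so the script still runs where cv2 is not installed.

```python
def cuda_related_lines(build_info):
    """Return the lines of an OpenCV build summary that mention CUDA or cuDNN."""
    keywords = ("cuda", "cudnn")
    return [
        line.strip()
        for line in build_info.splitlines()
        if any(keyword in line.lower() for keyword in keywords)
    ]


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 is not importable in this environment")
    else:
        for line in cuda_related_lines(cv2.getBuildInformation()):
            print(line)
        print("CUDA devices:", cv2.cuda.getCudaEnabledDeviceCount())
```

If the filter prints nothing, or the device count is 0, Python is most likely importing a CPU-only build.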

Finally, test whether CUDA can actually be used with the following code:

# Read the image
import cv2
frame = cv2.imread('drip.png')
if frame is None:
    raise FileNotFoundError('drip.png could not be read')
frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
cv2.imshow('before', frame)
cv2.waitKey(0)

# Upload to the GPU for processing
gpu_frame = cv2.cuda_GpuMat()
gpu_frame.upload(frame)
print(gpu_frame.cudaPtr())

# Convert the image from RGB to BGR (OpenCV format), then resize it
screenshot = cv2.cuda.cvtColor(gpu_frame, cv2.COLOR_RGB2BGR)
screenshot = cv2.cuda.resize(screenshot, (400, 400))
# Download the image from the GPU (cv2.cuda_GpuMat -> numpy.ndarray)
screenshot = screenshot.download()
cv2.imshow('after', screenshot)
cv2.waitKey(0)

Original image

After adjustment

Processing does feel faster, but further testing is still needed.
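One way to move from impression to measurement is a small timing comparison between cv2.resize and cv2.cuda.resize. The following is only a rough sketch: average_seconds is a hypothetical helper, the synthetic 3840x2160 image is an arbitrary choice, and the GPU timing includes the download back to host memory, which may dominate for small workloads.

```python
import time


def average_seconds(func, repeats=20):
    """Average wall-clock time of func() over `repeats` runs after one warm-up."""
    func()  # warm-up call, not timed
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    try:
        import cv2
        import numpy as np
        has_cuda = cv2.cuda.getCudaEnabledDeviceCount() > 0
    except (ImportError, AttributeError):
        has_cuda = False

    if has_cuda:
        # Synthetic test image; replace it with real data for a fair comparison.
        image = np.random.randint(0, 256, (2160, 3840, 3), dtype=np.uint8)
        gpu_image = cv2.cuda_GpuMat()
        gpu_image.upload(image)
        cpu_time = average_seconds(lambda: cv2.resize(image, (400, 400)))
        gpu_time = average_seconds(
            lambda: cv2.cuda.resize(gpu_image, (400, 400)).download()
        )
        print(f"CPU resize: {cpu_time * 1000:.3f} ms")
        print(f"GPU resize + download: {gpu_time * 1000:.3f} ms")
    else:
        print("No CUDA-enabled OpenCV build found; skipping the benchmark")
```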

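There is also a simpler approach: use a prebuilt OpenCV with GPU module support instead of compiling in Visual Studio. See "Configuring CUDA for Python OpenCV to support GPU acceleration".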

Other references

Downloads - James Bowley

Tutorial and practice: processing images with CUDA in OpenCV

OpenCV CUDA for Video Preprocessing

Origin blog.csdn.net/Cretheego/article/details/128993920