Building OpenCV from Source with CUDA + cuDNN (Unsuccessful)


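Preface:
When I set up the OpenCV environment in the previous post, my graphics card was too weak for GPU acceleration to help much, and the CUDA and cuDNN versions I had installed were old, so I uninstalled them all. Still, GPU acceleration is where things are heading and it is worth the effort, so this time I set up a CUDA environment for OpenCV.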

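Configuration used in this post:
The previous configuration + (CUDA 10.2 + cuDNN 7.6.5)
Everything else is the same as in the previous post; this post rebuilds OpenCV on top of that setup.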

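The final test did not succeed. I spent many days on this and suspect it is not an isolated case; if you have run into something similar, feel free to discuss it in the comments.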

Ⅰ. Installing CUDA and cuDNN

1-1. Installing CUDA

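Graphics card on this machine: GeForce 940MX, with the latest driver released on 03/02/2020. The control panel shows that this driver supports the latest CUDA version, 10.2, so I downloaded CUDA 10.2 from the official site. The local installer is the convenient choice.
[Figure: CUDA version supported by the driver]
The installation [1] is fairly simple: accept the defaults. Visual Studio must not be open while the installer runs.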
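Note that if another CUDA version was installed before, the Path system variable may not have enough room left, and the installer shows the following warning:
[Figure: installer warning that Path has insufficient space]
As the message suggests, after the installation finishes you can delete the old entries from the environment variable and add the ones that could not be added, or simply run the installer again.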
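The Path cleanup described above can be checked with a small helper. The following is only a sketch: the function names are hypothetical, the comparison is a plain case-sensitive string match, and the `\CUDA\v` marker assumes the default toolkit install location.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Split a Windows-style Path value on ';', skipping empty entries.
std::vector<std::string> splitPath(const std::string& path) {
    std::vector<std::string> entries;
    std::stringstream ss(path);
    std::string entry;
    while (std::getline(ss, entry, ';')) {
        if (!entry.empty()) entries.push_back(entry);
    }
    return entries;
}

// Keep the first occurrence of each entry, preserving order.
// Simplification: the comparison is exact, although Windows itself
// treats paths case-insensitively.
std::vector<std::string> dedupeEntries(const std::vector<std::string>& entries) {
    std::set<std::string> seen;
    std::vector<std::string> unique;
    for (const std::string& e : entries) {
        if (seen.insert(e).second) unique.push_back(e);
    }
    return unique;
}

// Return the entries that point into a CUDA toolkit directory, so that
// leftovers from an older version are easy to spot.
std::vector<std::string> cudaEntries(const std::vector<std::string>& entries) {
    std::vector<std::string> hits;
    for (const std::string& e : entries) {
        if (e.find("\\CUDA\\v") != std::string::npos) hits.push_back(e);
    }
    return hits;
}
```

Passing in the current value of Path (for example the result of `std::getenv("PATH")`) and printing `cudaEntries(dedupeEntries(splitPath(value)))` lists every CUDA directory still referenced, which makes stale versions visible before editing the variable by hand.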
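After installation, run the sample program in the
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.2\1_Utilities\deviceQuery directory: open deviceQuery_vs2019.sln in VS2019 and run it. The output is as follows:
[Figure: deviceQuery output confirming that CUDA was installed successfully]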

1-2. Installing cuDNN

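"Installing" cuDNN really just means adding the cuDNN header and library files to the CUDA folders.
Download cuDNN v7.6.5 for CUDA 10.2 (an NVIDIA account is required). Extracting the package gives three folders, bin, include and lib, which contain the three files cudnn64_7.dll, cudnn.h and cudnn.lib respectively. Copy each file into the matching directory under the CUDA installation directory, which defaults to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2, and the cuDNN installation is complete.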
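Since this step is nothing more than a file copy, it is easy to check whether all three files ended up in the right place. The sketch below assumes the usual layout of the cuDNN 7.x Windows package (in particular that cudnn.lib belongs in lib/x64); the function names are hypothetical.

```cpp
#include <fstream>
#include <string>
#include <vector>

// True if the file can be opened for reading.
bool fileExists(const std::string& path) {
    std::ifstream f(path.c_str(), std::ios::binary);
    return f.good();
}

// Return the cuDNN files (relative to cudaRoot) that are NOT present.
// The relative locations are an assumption based on the usual layout of
// the cuDNN 7.x Windows package; adjust them if your package differs.
std::vector<std::string> missingCudnnFiles(const std::string& cudaRoot) {
    const char* expected[] = {
        "bin/cudnn64_7.dll",
        "include/cudnn.h",
        "lib/x64/cudnn.lib",
    };
    std::vector<std::string> missing;
    for (const char* rel : expected) {
        if (!fileExists(cudaRoot + "/" + rel)) missing.push_back(rel);
    }
    return missing;
}
```

Calling `missingCudnnFiles("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.2")` should return an empty list once the copy has been done; any names it returns are the files that still need to be copied.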

Ⅱ. Rebuilding OpenCV

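So that the new environment variables take effect, open cmake-gui only after CUDA has been installed, and set a new build directory.
This time it is not enough to:
check BUILD_opencv_world
set OPENCV_EXTRA_MODULES_PATH to <install directory>/opencv_contrib/modules
You also need to check OPENCV_DNN_CUDA and WITH_CUDA.
Then click Configure.
If OpenCVGenSetupVars.cmake throws an error, uncheck OPENCV_GENERATE_SETUPVARS.
Next, set CUDA_ARCH_BIN according to the compute capability of your graphics card; on this machine it is set to 5.0.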
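To make the effect of CUDA_ARCH_BIN concrete: the configure log at the end of this post reports the nvcc target flags as `-gencode;arch=compute_50,code=sm_50` for the value 5.0. The sketch below reproduces that mapping from a compute capability string to a `-gencode` option. It is an illustration with hypothetical function names, not the logic OpenCV's CMake scripts actually use.

```cpp
#include <stdexcept>
#include <string>

// Turn a compute capability such as "5.0" into the architecture number
// used by nvcc ("50"). Throws std::invalid_argument on malformed input.
std::string archNumber(const std::string& capability) {
    const std::string::size_type dot = capability.find('.');
    if (dot == std::string::npos || dot == 0 || dot + 1 >= capability.size()) {
        throw std::invalid_argument("expected <major>.<minor>, e.g. 5.0");
    }
    const std::string digits =
        capability.substr(0, dot) + capability.substr(dot + 1);
    for (char c : digits) {
        if (c < '0' || c > '9') {
            throw std::invalid_argument("compute capability must be numeric");
        }
    }
    return digits;
}

// Build the nvcc -gencode option that generates device code for exactly
// that architecture.
std::string gencodeFlag(const std::string& capability) {
    const std::string arch = archNumber(capability);
    return "-gencode arch=compute_" + arch + ",code=sm_" + arch;
}
```

With the value used on this machine, `gencodeFlag("5.0")` gives `-gencode arch=compute_50,code=sm_50`. A binary built only for one architecture may not run on a GPU with a different compute capability, so the value should match the card that will execute the code.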
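More can be configured here as needed: components such as GStreamer, Eigen and OpenGL can be set up as you like. I tried GStreamer (following the article "GStreamer+win10+vs2015配置", i.e. GStreamer + Win10 + VS2015 configuration). When I then ran the sample program from the official documentation (below), there were still "Failed" messages in the output, but the program ran normally.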

#include "opencv2/imgproc.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/videoio.hpp"  // VideoCapture
using namespace cv;
int main(int, char**)
{
    VideoCapture cap(0);            // open the default camera
    if(!cap.isOpened()) return -1;  // camera could not be opened
    Mat frame, edges;
    namedWindow("edges", WINDOW_AUTOSIZE);
    for(;;)
    {
        cap >> frame;               // grab the next frame
        if(frame.empty()) break;    // stop if no frame was delivered
        cvtColor(frame, edges, COLOR_BGR2GRAY);
        GaussianBlur(edges, edges, Size(7,7), 1.5, 1.5);
        Canny(edges, edges, 0, 30, 3);
        imshow("edges", edges);
        if(waitKey(30) >= 0) break; // exit on any key press
    }
    return 0;
}

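I will not list every setting individually; the final cmake configuration output is given at the end of the post.
Click Configure again and make sure no settings are shown in red.
Once done, the output shows that CUDA is now supported:
[Figure: CUDA section of the configuration output]
The remaining steps are the same as before: click Generate, and once everything has been generated, open VS and build. This build takes longer, almost twice as long as last time.
When building the Debug configuration, the opencv_python target fails with an error that python37_d.lib cannot be opened. The cause is that there is no Debug build of Python. Anaconda users can fix it by following this article [2], namely:
Search for pyconfig.h under the Anaconda directory and make the following two changes to that file:
[Figure: first change to pyconfig.h]
[Figure: second change to pyconfig.h]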

Ⅲ. Testing the Installation

3-1. Adding the configuration

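In the previous post I configured the test project directly. With that approach every new project has to be configured again, which is tedious. This time I add a configuration file (a project property sheet) instead, so that new projects can simply import it.
The following uses Debug | x64 as an example:
[Figure: adding a project property sheet]
[Figure: Add]
[Figure: adding the include directories and library directories]
[Figure: adding the dependencies]
The configuration file is now applied to this project and can be exported for reuse.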

3-2. OpenCV CUDA test result

Test code:

#include <cstdlib>   // system
#include <iostream>
#include <opencv2/core/cuda.hpp>
using namespace std;

int main(int argc, char** argv)
{
	try
	{
		// Number of CUDA-enabled devices that OpenCV can see
		cout << cv::cuda::getCudaEnabledDeviceCount() << endl;
	}
	catch (const cv::Exception & ex)
	{
		cout << "Error:" << ex.what() << endl;
	}
	system("PAUSE");  // keep the console window open (Windows only)
	return 0;
}

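Output:
[Figure: program output]
Explanation in the OpenCV documentation:
[Figure: function definition]
This shows that something is wrong with the configuration and the setup did not succeed. The cause is unknown; if you have hit the same problem, please leave a comment to discuss.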
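For readers without the screenshots: according to the OpenCV documentation, `cv::cuda::getCudaEnabledDeviceCount()` returns 0 when OpenCV was compiled without CUDA support and -1 when the CUDA driver is not installed or is incompatible. The helper below is a sketch with a hypothetical name that turns the returned number into a readable diagnosis. It does not call OpenCV itself, so the value has to be passed in.

```cpp
#include <string>

// Interpret the value returned by cv::cuda::getCudaEnabledDeviceCount().
// The meanings of 0 and negative values follow the OpenCV documentation
// for that function.
std::string describeDeviceCount(int count) {
    if (count > 0) {
        return "CUDA is usable: " + std::to_string(count) + " device(s) found";
    }
    if (count == 0) {
        return "no CUDA-capable device found, or this OpenCV binary "
               "was built without CUDA support";
    }
    return "CUDA driver not installed or incompatible";
}
```

In the test program above this could be used as `describeDeviceCount(cv::cuda::getCudaEnabledDeviceCount())`. If the result is 0 even though the configure log reports CUDA as enabled, one thing worth checking is whether the project really links and loads the newly built library rather than an older build without CUDA; printing `cv::getBuildInformation()` shows how the loaded library was configured.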

Final cmake configuration output:

Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.18363.
Detected processor: AMD64
libjpeg-turbo: VERSION = 2.0.2, BUILD = opencv-4.2.0-dev-libjpeg-turbo
found Intel IPP (ICV version): 2019.0.0 [2019.0.0 Gold]
at: E:/code_env/opencv/build_test/3rdparty/ippicv/ippicv_win/icv
found Intel IPP Integration Wrappers sources: 2019.0.0
at: E:/code_env/opencv/build_test/3rdparty/ippicv/ippicv_win/iw
CUDA detected: 10.2
CUDA NVCC target flags: -gencode;arch=compute_50,code=sm_50;-D_FORCE_INLINES
Could not find OpenBLAS include. Turning OpenBLAS_FOUND off
Could not find OpenBLAS lib. Turning OpenBLAS_FOUND off
Could NOT find BLAS (missing: BLAS_LIBRARIES) 
LAPACK requires BLAS
A library with LAPACK API not found. Please specify library location.
Could NOT find JNI (missing: JAVA_AWT_LIBRARY JAVA_JVM_LIBRARY JAVA_INCLUDE_PATH JAVA_INCLUDE_PATH2 JAVA_AWT_INCLUDE_PATH) 
VTK is not found. Please set -DVTK_DIR in CMake to VTK build directory, or to VTK install subdirectory with VTKConfig.cmake file
OpenCV Python: during development append to PYTHONPATH: E:/code_env/opencv/build_test/python_loader
Caffe:   NO
Protobuf:   NO
Glog:   NO
freetype2:   NO
harfbuzz:    NO
Module opencv_ovis disabled because OGRE3D was not found
No preference for use of exported gflags CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported gflags CMake configuration if available.
Failed to find installed gflags CMake configuration, searching for gflags build directories exported with CMake.
Failed to find gflags - Failed to find an installed/exported CMake configuration for gflags, will perform search for installed gflags components.
Failed to find gflags - Could not find gflags include directory, set GFLAGS_INCLUDE_DIR to directory containing gflags/gflags.h
Failed to find glog - Could not find glog include directory, set GLOG_INCLUDE_DIR to directory containing glog/logging.h
Module opencv_sfm disabled because the following dependencies are not found: Eigen Glog/Gflags
Tesseract:   NO
Processing WORLD modules...
    module opencv_cudev...
    module opencv_core...
    module opencv_cudaarithm...
    module opencv_flann...
    module opencv_hdf...
    module opencv_imgproc...
    module opencv_intensity_transform...
    module opencv_ml...
    module opencv_phase_unwrapping...
    module opencv_plot...
    module opencv_quality...
    module opencv_reg...
    module opencv_surface_matching...
    module opencv_cudafilters...
    module opencv_cudaimgproc...
    module opencv_cudawarping...
    module opencv_dnn...
Registering hook 'INIT_MODULE_SOURCES_opencv_dnn': E:/code_env/opencv/modules/dnn/cmake/hooks/INIT_MODULE_SOURCES_opencv_dnn.cmake
    module opencv_dnn_superres...
    module opencv_features2d...
    module opencv_fuzzy...
    module opencv_gapi...
    module opencv_hfs...
    module opencv_imgcodecs...
    module opencv_line_descriptor...
    module opencv_photo...
    module opencv_saliency...
    module opencv_text...
    module opencv_videoio...
    module opencv_xphoto...
    module opencv_calib3d...
    module opencv_cudacodec...
    module opencv_cudafeatures2d...
    module opencv_cudastereo...
    module opencv_datasets...
    module opencv_highgui...
    module opencv_objdetect...
    module opencv_rapid...
    module opencv_rgbd...
    module opencv_shape...
    module opencv_structured_light...
    module opencv_video...
    module opencv_xfeatures2d...
    module opencv_ximgproc...
    module opencv_xobjdetect...
    module opencv_aruco...
    module opencv_bgsegm...
    module opencv_bioinspired...
    module opencv_ccalib...
    module opencv_cudabgsegm...
    module opencv_cudalegacy...
    module opencv_cudaobjdetect...
    module opencv_dnn_objdetect...
    module opencv_dpm...
    module opencv_face...
    module opencv_optflow...
    module opencv_stitching...
    module opencv_tracking...
    module opencv_cudaoptflow...
    module opencv_stereo...
    module opencv_superres...
    module opencv_videostab...
Processing WORLD modules... DONE

General configuration for OpenCV 4.2.0-dev =====================================
  Version control:               unknown

  Extra modules:
    Location (extra):            E:/code_env/opencv_contrib/modules
    Version control (extra):     unknown

  Platform:
    Timestamp:                   2020-02-26T04:19:40Z
    Host:                        Windows 10.0.18363 AMD64
    CMake:                       3.15.5
    CMake generator:             Visual Studio 16 2019
    CMake build tool:            C:/accelerate/VisualStudio/MSBuild/Current/Bin/MSBuild.exe
    MSVC:                        1924

  CPU/HW features:
    Baseline:                    SSE SSE2 SSE3
      requested:                 SSE3
    Dispatched code generation:  SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
      requested:                 SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
      SSE4_1 (16 files):         + SSSE3 SSE4_1
      SSE4_2 (2 files):          + SSSE3 SSE4_1 POPCNT SSE4_2
      FP16 (1 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
      AVX (5 files):             + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
      AVX2 (30 files):           + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
      AVX512_SKX (6 files):      + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX

  C/C++:
    Built as dynamic libs?:      YES
    C++ Compiler:                C:/accelerate/VisualStudio/VC/Tools/MSVC/14.24.28314/bin/Hostx64/x64/cl.exe  (ver 19.24.28316.0)
    C++ flags (Release):         /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP  /MD /O2 /Ob2 /DNDEBUG 
    C++ flags (Debug):           /DWIN32 /D_WINDOWS /W4 /GR  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP  /MDd /Zi /Ob0 /Od /RTC1 
    C Compiler:                  C:/accelerate/VisualStudio/VC/Tools/MSVC/14.24.28314/bin/Hostx64/x64/cl.exe
    C flags (Release):           /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /MP   /MD /O2 /Ob2 /DNDEBUG 
    C flags (Debug):             /DWIN32 /D_WINDOWS /W3  /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi  /fp:precise     /MP /MDd /Zi /Ob0 /Od /RTC1 
    Linker flags (Release):      /machine:x64  /INCREMENTAL:NO 
    Linker flags (Debug):        /machine:x64  /debug /INCREMENTAL 
    ccache:                      NO
    Precompiled headers:         NO
    Extra dependencies:          cudart_static.lib nppc.lib nppial.lib nppicc.lib nppicom.lib nppidei.lib nppif.lib nppig.lib nppim.lib nppist.lib nppisu.lib nppitc.lib npps.lib cublas.lib cudnn.lib cufft.lib -LIBPATH:C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.2/lib/x64
    3rdparty dependencies:

  OpenCV modules:
    To be built:                 aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann fuzzy gapi hdf hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab world xfeatures2d ximgproc xobjdetect xphoto
    Disabled:                    -
    Disabled by dependency:      -
    Unavailable:                 cnn_3dobj cvv freetype java js matlab ovis python2 python2 sfm viz
    Applications:                tests perf_tests apps
    Documentation:               NO
    Non-free algorithms:         YES

  Windows RT support:            NO

  GUI: 
    Win32 UI:                    YES
    VTK support:                 NO

  Media I/O: 
    ZLib:                        build (ver 1.2.11)
    JPEG:                        build-libjpeg-turbo (ver 2.0.2-62)
    WEBP:                        build (ver encoder: 0x020e)
    PNG:                         build (ver 1.6.37)
    TIFF:                        build (ver 42 - 4.0.10)
    JPEG 2000:                   build (ver 1.900.1)
    OpenEXR:                     build (ver 2.3.0)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:
    DC1394:                      NO
    FFMPEG:                      YES (prebuilt binaries)
      avcodec:                   YES (58.54.100)
      avformat:                  YES (58.29.100)
      avutil:                    YES (56.31.100)
      swscale:                   YES (5.5.100)
      avresample:                YES (4.0.0)
    GStreamer:                   YES (1.16.2)
    DirectShow:                  YES
    Media Foundation:            YES
      DXVA:                      YES

  Parallel framework:            Concurrency

  Trace:                         YES (with Intel ITT)

  Other third-party libraries:
    Intel IPP:                   2019.0.0 Gold [2019.0.0]
           at:                   E:/code_env/opencv/build_test/3rdparty/ippicv/ippicv_win/icv
    Intel IPP IW:                sources (2019.0.0)
              at:                E:/code_env/opencv/build_test/3rdparty/ippicv/ippicv_win/iw
    Lapack:                      NO
    Eigen:                       NO
    Custom HAL:                  NO
    Protobuf:                    build (3.5.1)

  NVIDIA CUDA:                   YES (ver 10.2, CUFFT CUBLAS FAST_MATH)
    NVIDIA GPU arch:             50
    NVIDIA PTX archs:

  cuDNN:                         YES (ver 7.6.5)

  OpenCL:                        YES (NVD3D11)
    Include path:                E:/code_env/opencv/3rdparty/include/opencl/1.2
    Link libraries:              Dynamic load

  Python 3:
    Interpreter:                 E:/code_env/Anaconda3/python.exe (ver 3.7)
    Libraries:                   E:/code_env/Anaconda3/libs/python37.lib (ver 3.7.0)
    numpy:                       E:/code_env/Anaconda3/lib/site-packages/numpy/core/include (ver 1.15.1)
    install path:                E:/code_env/Anaconda3/Lib/site-packages/cv2/python-3.7

  Python (for build):            E:/code_env/Anaconda3/python.exe

  Java:                          
    ant:                         NO
    JNI:                         NO
    Java wrappers:               NO
    Java tests:                  NO

  Install to:                    E:/code_env/opencv/build_test/install
-----------------------------------------------------------------

Configuring done
Generating done

References

  1. Installing CUDA and cuDNN on Windows 10

  2. OpenCV build error: cannot open python37_d.lib


Reposted from blog.csdn.net/m0_46318517/article/details/104432082