Caffe troubleshooting: changing USE_NCCL=1 has no effect when building multi-GPU Caffe with cmake

cmake ..

The following error appears:

-- CUDA detected: 9.0
-- Added CUDA NVCC flags for: sm_61
CMake Error at /usr/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:148 (message):
  Could NOT find NCCL (missing: NCCL_INCLUDE_DIR NCCL_LIBRARIES)
Call Stack (most recent call first):
  /usr/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:388 (_FPHSA_FAILURE_MESSAGE)
  cmake/Modules/FindNCCL.cmake:21 (find_package_handle_standard_args)
  cmake/Dependencies.cmake:89 (find_package)
  CMakeLists.txt:46 (include)

0. Optionally, set the CUDA_VISIBLE_DEVICES environment variable to specify which CUDA devices are visible

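Method 1: set the environment variable in /etc/profile or ~/.bashrc (/etc/profile affects all users; ~/.bashrc affects only the current user's bash shell)

Append the following line to the end of ~/.bashrc:

export CUDA_VISIBLE_DEVICES=0,1,2,3 ## only GPUs 0, 1, 2 and 3 are visible. List the available GPUs with nvidia-smi -L

Save and exit (:wq in vim)

Run source ~/.bashrc to apply the change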
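
The append step can be made repeatable, so that re-running a setup script does not pile up duplicate export lines. A minimal sketch, assuming a POSIX shell; the helper name append_once is hypothetical and not part of CUDA, NCCL, or Caffe. It adds a line to a file only when that exact line is not already there.

```shell
# append_once LINE FILE: add LINE to FILE unless an identical line exists.
# Hypothetical helper for illustration; not part of Caffe, CUDA, or NCCL.
append_once() {
    line=$1
    file=$2
    # Create the file if it is missing so grep has something to read.
    [ -e "$file" ] || : > "$file"
    # -q quiet, -x whole-line match, -F fixed string (no regex).
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

A possible call would be append_once 'export CUDA_VISIBLE_DEVICES=0,1,2,3' ~/.bashrc, followed by source ~/.bashrc.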

1. Installing NCCL as a server user (without root)

1. Download and build
https://github.com/NVIDIA/nccl

cd nccl
make CUDA_HOME=/usr/local/cuda   test # use your own CUDA path

2. Test and set environment variables


export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./build/lib
./build/test/single/all_reduce_test
./build/test/single/all_reduce_test 10000000
# Check how many GPUs are reported; if some are missing, set CUDA_VISIBLE_DEVICES (step 0) to specify the visible CUDA devices

Append the following line to the end of ~/.bashrc:

export LD_LIBRARY_PATH=/home/neu105/nccl/build/lib:$LD_LIBRARY_PATH

Save and exit (:wq in vim)

Run source ~/.bashrc to apply the change
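
To confirm that the export took effect in the current shell, the variable can be inspected directly. A small sketch, assuming a POSIX shell; the helper name path_has_dir is hypothetical, and the directory is the example location used above, so adjust it to your own build.

```shell
# path_has_dir DIR LIST: succeed if DIR is an entry of the colon-separated LIST.
# Hypothetical helper for illustration only.
path_has_dir() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed install location taken from the text above; adjust to your own.
nccl_lib=/home/neu105/nccl/build/lib
if path_has_dir "$nccl_lib" "${LD_LIBRARY_PATH:-}"; then
    echo "NCCL lib dir is on LD_LIBRARY_PATH"
else
    echo "NCCL lib dir is NOT on LD_LIBRARY_PATH"
fi
```

Wrapping both the list and the directory in colons avoids false matches on path prefixes such as /home/neu105/nccl/build/lib64.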

3. Configure caffe/Makefile.config

1.

Uncomment:
USE_NCCL := 1

2. The most important step

If you installed NCCL under your user directory, you need to edit "caffe-master/cmake/Modules/FindNCCL.cmake"

set(NCCL_INC_PATHS
    /usr/include
    /usr/local/include
    /home/neu105/nccl/build/include # added: NCCL include path
    $ENV{NCCL_DIR}/include
    )

set(NCCL_LIB_PATHS
    /lib
    /lib64
    /usr/lib
    /usr/lib64
    /usr/local/lib
    /usr/local/lib64
    /home/neu105/nccl/build/lib   # added: NCCL library path
    $ENV{NCCL_DIR}/lib
    )
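
Both path lists above already contain $ENV{NCCL_DIR}/include and $ENV{NCCL_DIR}/lib, so exporting NCCL_DIR before running cmake may be enough without editing FindNCCL.cmake at all. A minimal sketch, assuming the build layout used above (include/nccl.h and lib/libnccl.so* under the build directory); the helper name check_nccl_dir is hypothetical.

```shell
# check_nccl_dir DIR: succeed if DIR looks like an NCCL install prefix,
# i.e. it holds include/nccl.h and at least one lib/libnccl.so* file.
# Hypothetical helper for illustration only.
check_nccl_dir() {
    [ -f "$1/include/nccl.h" ] || return 1
    for lib in "$1"/lib/libnccl.so*; do
        [ -e "$lib" ] && return 0
    done
    return 1
}

# Assumed location from the text above; adjust to your own build.
candidate=/home/neu105/nccl/build
if check_nccl_dir "$candidate"; then
    export NCCL_DIR="$candidate"
    echo "NCCL_DIR=$NCCL_DIR"
else
    echo "no NCCL build found under $candidate" >&2
fi
```

The export has to be in effect in the same shell that later runs cmake, because $ENV{...} is read when cmake configures the project.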

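3. Changing USE_NCCL in Makefile.config does not change the setting in CMakeLists.txt (Makefile.config is read by the make build, not by cmake). Manually change OFF to ON in CMakeLists.txt, save, and then build caffe with cmake.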
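
As an alternative to editing CMakeLists.txt by hand, CMake options can generally be overridden on the command line, for example cmake -DUSE_NCCL=ON .. from the build directory; this assumes the option is named USE_NCCL in your Caffe checkout. Either way, the value cmake actually used is recorded in CMakeCache.txt in the build directory. A small sketch for reading it back; the helper name cached_use_nccl is hypothetical.

```shell
# cached_use_nccl [FILE]: print the USE_NCCL value stored in a CMake cache.
# Cache entries have the form NAME:TYPE=VALUE, e.g. USE_NCCL:BOOL=ON.
# Hypothetical helper for illustration only.
cached_use_nccl() {
    sed -n 's/^USE_NCCL:[A-Z]*=//p' "${1:-CMakeCache.txt}"
}

# Demo on a throwaway cache file so the snippet runs anywhere.
demo=$(mktemp)
printf '%s\n' 'CPU_ONLY:BOOL=OFF' 'USE_NCCL:BOOL=ON' > "$demo"
cached_use_nccl "$demo"   # prints ON
rm -f "$demo"
```

If the helper still prints OFF after a change, the build directory probably holds a stale cache; removing CMakeCache.txt and re-running cmake is one way to reset it.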

https://blog.csdn.net/u011394059/article/details/73732707

4. Rebuild and install caffe

References:

https://blog.csdn.net/u012235003/article/details/54576840

https://blog.csdn.net/u011394059/article/details/73732707

https://www.cnblogs.com/haiyang21/p/7183413.html

https://www.cnblogs.com/HugoLester/p/6489945.html


Reposted from blog.csdn.net/weixin_37251044/article/details/79859393