Speeding up MobileNet training in Caffe with multiple GPUs: fixing "Multi-GPU execution not available - rebuild with USE_NCCL"

1. Appending --gpu all or --gpu 0,1 to the training command below fails with an error:

./build/tools/caffe train --solver=examples/myExample/0227_gpu/solver_2acc.prototxt

 

Error message: Multi-GPU execution not available - rebuild with USE_NCCL
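For reference, the two multi-GPU invocations described in step 1 can be written out in full as below. This is only a sketch that assembles and prints the commands; the solver path is the one from the command above, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Solver path copied from the training command above.
SOLVER="examples/myExample/0227_gpu/solver_2acc.prototxt"

# Train on every GPU the machine exposes.
CMD_ALL="./build/tools/caffe train --solver=${SOLVER} --gpu all"

# Train on devices 0 and 1 only.
CMD_TWO="./build/tools/caffe train --solver=${SOLVER} --gpu 0,1"

# Print the commands; run one of them from the Caffe root directory.
echo "$CMD_ALL"
echo "$CMD_TWO"
```

On a Caffe build compiled without NCCL, either command stops with the error shown above.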

 

2. Solution:

(1) Check the CUDA and gcc versions

(Requirements: CUDA version > 6.0, gcc version >= 4.8)

CUDA version:
cat /usr/local/cuda/version.txt


gcc version:
gcc --version
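Both requirements can also be checked in one pass. The following is a minimal sketch, assuming GNU sort (for the -V version ordering) and a version.txt whose first line has the form "CUDA Version X.Y.Z"; the helper names version_ge and version_gt are made up for this example.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# version_gt A B: succeeds when version A is strictly newer than version B.
version_gt() {
    [ "$1" != "$2" ] && version_ge "$1" "$2"
}

# gcc must be at least 4.8.
if command -v gcc >/dev/null 2>&1; then
    GCC_VER="$(gcc -dumpversion)"
    if version_ge "$GCC_VER" "4.8"; then
        echo "gcc $GCC_VER: OK"
    else
        echo "gcc $GCC_VER: too old, need >= 4.8"
    fi
else
    echo "gcc not found"
fi

# CUDA must be newer than 6.0; the third field of the first line is the version.
CUDA_FILE="/usr/local/cuda/version.txt"
if [ -r "$CUDA_FILE" ]; then
    CUDA_VER="$(awk 'NR == 1 { print $3 }' "$CUDA_FILE")"
    if version_gt "$CUDA_VER" "6.0"; then
        echo "CUDA $CUDA_VER: OK"
    else
        echo "CUDA $CUDA_VER: too old, need > 6.0"
    fi
else
    echo "$CUDA_FILE not readable"
fi
```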

 

(2) Install NCCL 1

cd nccl-master

make CUDA_HOME=/usr/local/cuda-8.0/

sudo make install

sudo ldconfig
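Before moving on to the Caffe configuration, it may be worth confirming that the NCCL build produced the header and the shared library that Makefile.config will point at. The sketch below assumes the build output lives in build/include and build/lib under nccl-master (the same layout the Makefile.config paths in the next step rely on); check_nccl_build and NCCL_BUILD are hypothetical names for this example.

```shell
#!/usr/bin/env bash
# check_nccl_build DIR: succeeds when DIR holds include/nccl.h
# and at least one lib/libnccl.so* file.
check_nccl_build() {
    local build_dir="$1"
    [ -f "$build_dir/include/nccl.h" ] || return 1
    ls "$build_dir"/lib/libnccl.so* >/dev/null 2>&1
}

# Default to ./build, i.e. run this from inside nccl-master.
NCCL_BUILD="${NCCL_BUILD:-./build}"
if check_nccl_build "$NCCL_BUILD"; then
    echo "NCCL header and library found in $NCCL_BUILD"
else
    echo "NCCL build output missing in $NCCL_BUILD"
fi
```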

(3) Edit Caffe's Makefile.config

Add:

USE_NCCL := 1

INCLUDE_DIRS += /home/yanghuiyu/make_cmake/nccl-master/build/include

LIBRARY_DIRS += /home/yanghuiyu/make_cmake/nccl-master/build/lib
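A commented-out or mistyped USE_NCCL line is easy to miss, so a quick check of Makefile.config can save a full rebuild. This is a rough sketch; nccl_enabled and CAFFE_CONFIG are names invented for the example, and the pattern only recognizes the exact "USE_NCCL := 1" form shown above.

```shell
#!/usr/bin/env bash
# nccl_enabled FILE: succeeds when FILE has an uncommented "USE_NCCL := 1" line.
nccl_enabled() {
    grep -Eq '^[[:space:]]*USE_NCCL[[:space:]]*:=[[:space:]]*1[[:space:]]*(#.*)?$' "$1"
}

# Default to Makefile.config in the current directory (the Caffe root).
CAFFE_CONFIG="${CAFFE_CONFIG:-Makefile.config}"
if [ -f "$CAFFE_CONFIG" ] && nccl_enabled "$CAFFE_CONFIG"; then
    echo "USE_NCCL is enabled in $CAFFE_CONFIG"
else
    echo "USE_NCCL is not enabled in $CAFFE_CONFIG"
fi
```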

The Makefile should contain:

ifeq ($(USE_NCCL), 1)

    LIBRARIES += nccl

    COMMON_FLAGS += -DUSE_NCCL

endif

(4) Rebuild Caffe

sudo make clean 

sudo make all -j 
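After the rebuild, one way to confirm that NCCL was actually linked in is to look for libnccl among the binary's shared-library dependencies. The sketch below assumes NCCL was linked as a shared library (a static link would not show up in ldd output); links_nccl and CAFFE_BIN are illustrative names.

```shell
#!/usr/bin/env bash
# links_nccl: reads ldd output on stdin, succeeds when libnccl is listed.
links_nccl() {
    grep -q 'libnccl'
}

# Default binary location relative to the Caffe root.
CAFFE_BIN="${CAFFE_BIN:-./build/tools/caffe}"
if [ -x "$CAFFE_BIN" ]; then
    if ldd "$CAFFE_BIN" | links_nccl; then
        echo "$CAFFE_BIN is linked against NCCL"
    else
        echo "$CAFFE_BIN is not linked against NCCL"
    fi
else
    echo "$CAFFE_BIN not found; run this from the Caffe root after building"
fi
```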

3. Training speedup: roughly 10x

Throughput went from 2 batches per second to 19 batches per second.
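The "roughly 10x" figure follows directly from the two throughput numbers; a small sketch of the arithmetic (variable names are illustrative) gives a ratio of 9.5:

```shell
#!/usr/bin/env bash
BEFORE=2    # batches per second on a single GPU
AFTER=19    # batches per second with multi-GPU training

# Ratio of the two throughputs, rounded to one decimal place.
SPEEDUP="$(awk -v a="$AFTER" -v b="$BEFORE" 'BEGIN { printf "%.1f", a / b }')"
echo "speedup: ${SPEEDUP}x"
```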

References:

https://blog.csdn.net/jmu201521121021/article/details/78654037

https://blog.csdn.net/wudi_X/article/details/80012764#commentBox


Reposted from blog.csdn.net/weixin_41770169/article/details/88297266