Installing PyTorch 0.4.1 on the Jetson TX2

Note: this assumes the default Python version is 3.5 and CUDA 9.0.

1. Install the dependency packages

sudo apt install libopenblas-dev libatlas-dev liblapack-dev
sudo apt install liblapacke-dev checkinstall
sudo pip3 install numpy scipy -i http://pypi.mirrors.ustc.edu.cn/simple --trusted-host pypi.mirrors.ustc.edu.cn
sudo pip3 install pyyaml -i http://pypi.mirrors.ustc.edu.cn/simple --trusted-host pypi.mirrors.ustc.edu.cn
sudo pip3 install scikit-build -i http://pypi.mirrors.ustc.edu.cn/simple --trusted-host pypi.mirrors.ustc.edu.cn
sudo apt-get -y install cmake
sudo apt install libffi-dev
sudo pip3 install cffi -i http://pypi.mirrors.ustc.edu.cn/simple --trusted-host pypi.mirrors.ustc.edu.cn
sudo apt-get install alien
sudo apt-get install nano
sudo apt install ninja-build

2. Download pytorch-0.4.1 from GitHub ( https://github.com/pytorch/pytorch/tree/v0.4.1 ) and download the third-party libraries (the third_party/ submodules) by hand.
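Downloading the third-party libraries by hand can be avoided by cloning recursively. A minimal sketch, assuming network access to github.com (the v0.4.1 tag matches the release linked above):

```shell
# Fetch PyTorch at the v0.4.1 tag together with all of its
# third-party submodules in one step.
PYTORCH_TAG=v0.4.1
git clone --recursive --branch "$PYTORCH_TAG" \
    https://github.com/pytorch/pytorch.git || echo "clone failed; retry when online"
# For an existing non-recursive checkout, fetch the submodules afterwards:
# (cd pytorch && git submodule update --init --recursive)
```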

3. Build the third-party libraries: python setup.py build_ext
4. Build and install PyTorch: sudo python setup.py install (if the build dies with an out-of-memory error, reboot and run it again)
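The out-of-memory failures in step 4 can also be tamed by limiting build parallelism before rebooting. A sketch, assuming this PyTorch version's build scripts honor the MAX_JOBS environment variable (threshold values here are illustrative for the TX2's 8 GB of RAM):

```shell
# Cap the number of parallel compile jobs so the build does not
# exhaust memory; pick the cap from how much RAM is free right now.
avail_mb=$(awk '/MemAvailable/ {print int($2/1024)}' /proc/meminfo)
if [ "${avail_mb:-0}" -lt 4096 ]; then
    export MAX_JOBS=2
else
    export MAX_JOBS=4
fi
echo "building with MAX_JOBS=$MAX_JOBS"
# then: sudo -E python setup.py install
```

`sudo -E` preserves the exported MAX_JOBS across the privilege change.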

If the following error occurs at runtime: RuntimeError: cuda runtime error (7) : too many resources requested for launch at /home/nvidia/pytorch/torch/lib/THCUNN/generic/SpatialUpSamplingBilinear.cu:63
Fix: open /pytorch/torch/lib/THCUNN/generic/SpatialUpSamplingBilinear.cu, change it in two places, and rebuild PyTorch afterwards.

In both kernel launch sites, replace the maxThreadsPerBlock device query with a hard-coded 512:

const int num_threads = 512;     // around line 62
  // was: THCState_getCurrentDeviceProperties(state)->maxThreadsPerBlock;
const int num_threads = 512;     // around line 97
  // was: THCState_getCurrentDeviceProperties(state)->maxThreadsPerBlock;
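Rather than editing the file by hand, both replacements can be applied with a single sed substitution. A sketch, assuming the source tree sits at ~/pytorch as in the error message above (note this drops the original expression instead of keeping it as a comment):

```shell
# Swap the device query for a fixed 512-thread cap at every
# occurrence in the file; the edit is a no-op if the file is absent.
SRC=~/pytorch/torch/lib/THCUNN/generic/SpatialUpSamplingBilinear.cu
if [ -f "$SRC" ]; then
    sed -i 's/THCState_getCurrentDeviceProperties(state)->maxThreadsPerBlock/512/g' "$SRC"
fi
```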

A note on a problem from an earlier board: pip was broken at the time, so numpy never got installed, and anything touching numpy at runtime failed with: RuntimeError: PyTorch was compiled without NumPy support. Once pip was repaired and numpy installed, PyTorch was rebuilt (clear the old build with sudo python setup.py clean, then build and install again with sudo python setup.py install; about 1 hour in total). After the rebuild, the same models also ran noticeably faster.
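A quick way to confirm the rebuilt torch really has NumPy support: torch.from_numpy is exactly the call that raises the error above when support is missing. A sketch (the guard just prints a message if torch is not importable yet):

```shell
# Verify that the rebuilt torch can exchange data with NumPy.
python3 - <<'EOF' || echo "torch not importable yet; run after the rebuild"
import numpy as np
import torch
t = torch.from_numpy(np.arange(4, dtype=np.float32))
print("NumPy support OK:", t.sum().item())
EOF
```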

Reposted from blog.csdn.net/qq_33206394/article/details/87914399