Installing MXNet on Linux

Installing MXNet:

http://mxnet.io/get_started/setup.html

When troubleshooting, check the GitHub issues first.

If you are running Python on Amazon Linux or Ubuntu, you can use Git Bash scripts to quickly install the MXNet libraries and all dependencies. If you are using other languages or operating systems, skip to Standard Installation.

Quick installation on Ubuntu:

git clone https://github.com/dmlc/mxnet.git ~/MXNet/mxnet --recursive

cd ~/MXNet/mxnet/setup-utils

bash install-mxnet-ubuntu.sh

Standard Installation

Minimum Requirements

You must have the following:

  • A C++ compiler that supports C++11. The C++ compiler compiles and builds the MXNet source code. Supported compilers include the following:
  • A BLAS (Basic Linear Algebra Subprograms) library. BLAS libraries contain routines that provide the standard building blocks for basic vector and matrix operations. You need a BLAS library to perform basic linear algebraic operations. Supported BLAS libraries include the following:

Build MXNet on Ubuntu/Debian

On Ubuntu versions 13.10 or later, you need the following dependencies:

  • Git (to pull code from GitHub)
  • libatlas-base-dev (for linear algebraic operations)
  • libopencv-dev (for computer vision operations)

Install these dependencies using the following commands:

sudo apt-get update

sudo apt-get install -y build-essential git libatlas-base-dev libopencv-dev

After you have downloaded and installed the dependencies, use the following commands to pull the MXNet source code from Git and build MXNet:

git clone --recursive https://github.com/dmlc/mxnet

cd mxnet; make -j$(nproc)

The install commands show that the following packages are required:

libatlas-base-dev

libopencv-dev

On Ubuntu, if the installation above fails, run the steps one at a time:

sudo apt-get update

sudo apt-get install -y build-essential git libatlas-base-dev libopencv-dev

git clone --recursive https://github.com/dmlc/mxnet

cd mxnet

make -j4

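sudo apt-get install python-numpy # for debian

sudo apt-get install python-setuptools # for debian

cd python; sudo python setup.py install

// Using sudo here is not recommended.

// Note that sudo installs the package into the root user's Python environment. The pitfall: for the current non-root user, python resolves to the Anaconda Python environment, where the package was not installed, so import mxnet fails with "No module named mxnet". After switching to the root user (Python path: /usr/bin/python), import mxnet works.

After installation, the Python lib directory contains the directory ./python2.7/site-packages/mxnet-0.7.0-py2.7.egg/mxnet.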

Test whether the installation succeeded:

import mxnet as mx
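
Because the sudo pitfall above comes down to which Python interpreter is actually running, it can help to print the interpreter path together with the import check. The following is a minimal sketch using only the standard library; the helper name describe_module is made up for illustration and is not part of MXNet.

```python
import importlib.util
import sys


def describe_module(name):
    """Report the running interpreter and whether `name` is importable by it."""
    spec = importlib.util.find_spec(name)  # None when the module is not installed
    return {
        "interpreter": sys.executable,
        "module": name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    info = describe_module("mxnet")
    print("interpreter:", info["interpreter"])
    if info["found"]:
        print("mxnet found at:", info["location"])
    else:
        print("mxnet is not installed for this interpreter")
```

If the interpreter path points into an Anaconda directory while the package was installed with sudo under /usr/bin/python, the "not installed" branch is the expected result.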

MXNet depends on OpenCV, and installing OpenCV may in turn require many other libraries.

OpenCV dependency problems during installation

sudo apt-get install -y build-essential git libblas-dev libopencv-dev

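Reading package lists... Done

Building dependency tree

Reading state information... Done

build-essential is already the newest version.

build-essential set to manually installed.

Some packages could not be installed. This may mean that you have

requested an impossible situation or if you are using the unstable

distribution that some required packages have not yet been created

or been moved out of Incoming.

The following information may help to resolve the situation:

The following packages have unmet dependencies:

libopencv-dev : Depends: libopencv-objdetect-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-highgui-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-legacy-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-contrib-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-videostab-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-superres-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libopencv-ocl-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libcv-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libhighgui-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

Depends: libcvaux-dev (= 2.4.8+dfsg1-2ubuntu1) but it is not going to be installed

E: Unable to correct problems, you have held broken packages.

If you hit the error above, you need to install OpenCV manually; see below.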

Installing MXNet on CentOS

(This mainly covers CentOS 6.5; CentOS 7 is somewhat easier to install on):

Issues: https://github.com/dmlc/mxnet/issues/3324

https://github.com/dmlc/mxnet/issues/1303

https://github.com/dmlc/mxnet/issues/1125

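The official installation examples use apt-get, which works on Debian-family Linux but not on CentOS; install-mxnet-ubuntu.sh consists of apt-get commands.

Installing on CentOS is harder than on Ubuntu and documentation is sparse; see the issue: https://github.com/dmlc/mxnet/issues/1303

Install the rz command (if it is not already installed): yum -y install lrzsz

Problem: the first installation attempt runs into dependency problems. After running bash install-mxnet-ubuntu.sh: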

Setting up Install Process

No package build-essential available.

No package libatlas-base-dev available.

No package libopencv-dev available.

Error: Nothing to do

Cause:

Ubuntu and CentOS use different package managers (apt-get vs. yum). CentOS 6.5 or later is recommended.

First try installing OpenCV with yum:

sudo yum install atlas-devel opencv

sudo yum install opencv-devel

You can try the following procedure:

yum update 

yum install -y build-essential git libatlas-base-dev libopencv-dev 

yum install -y opencv opencv-devel atlas-devel 

yum install gcc gcc-c++

ldconfig /etc/ld.so.cache 

git clone --recursive https://github.com/dmlc/mxnet 

cd mxnet 

./prepare_mkl.sh 

cp make/config.mk . 

vim config.mk 

+31 ADD_LDFLAGS = -L/usr/lib64/atlas 

vim mshadow/make/mshadow.mk 

-68 MSHADOW_LDFLAGS += -lcblas 

+68 MSHADOW_LDFLAGS += -lsatlas 

yum info glib2 

yum upgrade glib2 

make -j4 

[root@xdataimg2 mxnet]# ll lib 

total 38920

-rw-r--r-- 1 root root 28637318 Nov 12 12:32 libmxnet.a

-rwxr-xr-x 1 root root 11214217 Nov 12 12:32 libmxnet.so

wget https://bootstrap.pypa.io/get-pip.py 

python get-pip.py 

pip install numpy -i http://pypi.mirrors.ustc.edu.cn/simple --trusted-host pypi.mirrors.ustc.edu.cn

pip install scipy -i http://pypi.mirrors.ustc.edu.cn/simple --trusted-host pypi.mirrors.ustc.edu.cn 

cd mxnet/python 

python setup.py install

Building OpenCV from source

Reference: http://blog.csdn.net/kuaile123/article/details/20870731

First install the OpenCV dependencies:

yum install cmake gcc gcc-c++ gtk+-devel gimp-devel gimp-devel-tools gimp-help-browser zlib-devel libtiff-devel libjpeg-devel libpng-devel gstreamer-devel libavc1394-devel libraw1394-devel libdc1394-devel jasper-devel jasper-utils swig python libtool nasm

sudo yum install opencv-devel

sudo yum install atlas-devel

// or sudo yum install atlas-devel opencv

yum install cmake

Download the version you need from the OpenCV site http://sourceforge.net/projects/opencvlibrary/files/ and extract it.

cd OpenCV-2.4.10

cmake CMakeLists.txt

make && make install

make may fail with an error like this:

Linking CXX executable ../../bin/opencv_perf_core

../../lib/libopencv_highgui.so.2.4.10: undefined reference to `png_set_longjmp_fn'

collect2: error: ld returned 1 exit status

G++ version:

Check the g++ version:

g++ --version (or g++ -v)

gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)

This g++ is clearly older than the required version, so it needs to be upgraded.
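
The version comparison can also be done programmatically. The sketch below parses the "gcc version X.Y.Z" string printed by g++ -v and compares it with a minimum version. The minimum of 4.8 is an assumption taken from the GCC 4.8.x release this post upgrades to, and the function names are illustrative only. Note that g++ --version prints a differently formatted first line that this pattern does not match.

```python
import re

# Assumption: 4.8 is treated as the minimum, based on the GCC 4.8.x upgrade in this post.
MIN_GCC = (4, 8)


def parse_gcc_version(text):
    """Extract (major, minor[, patch]) from a 'gcc version X.Y.Z ...' line."""
    match = re.search(r"gcc version (\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no 'gcc version X.Y.Z' string found")
    return tuple(int(part) for part in match.groups() if part is not None)


def is_new_enough(version, minimum=MIN_GCC):
    """Compare only as many components as the minimum specifies."""
    return version[:len(minimum)] >= minimum


line = "gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)"
version = parse_gcc_version(line)
print(version, is_new_enough(version))  # prints: (4, 4, 7) False
```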

Upgrading GCC/G++ (the two ship together):

Download location: http://ftp.tsukuba.wide.ad.jp/software/gcc/releases/gcc-4.8.5/

wget http://ftp.tsukuba.wide.ad.jp/software/gcc/releases/gcc-4.8.5/gcc-4.8.5.tar.gz

tar -zxvf gcc-4.8.5.tar.gz

cd gcc-4.8.5

./contrib/download_prerequisites

mkdir build

cd build

../configure --enable-checking=release --enable-languages=c,c++ --disable-multilib

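Error: configure: error: Building GCC requires GMP 4.2+, MPFR 2.4.0+ and MPC 0.8.0+.

Reference: http://blog.csdn.net/ivanlxf/article/details/19080681

Running the ./contrib/download_prerequisites script automatically downloads the three dependencies: gmp-4.3.2, mpfr-2.4.2, and mpc-0.8.1. They can also be downloaded from the addresses below and installed offline:

(1) Install gmp: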

wget ftp://ftp.gnu.org/gnu/gmp/gmp-4.3.2.tar.bz2

tar -jxf gmp-4.3.2.tar.bz2

cd gmp-4.3.2

mkdir build

cd build

../configure --prefix=/usr/local/gcc/gmp-4.3.2

make && make install

(2) Install mpfr:

wget http://www.mpfr.org/mpfr-2.4.2/mpfr-2.4.2.tar.bz2

tar -jxf mpfr-2.4.2.tar.bz2

cd mpfr-2.4.2

mkdir build

cd build

../configure --prefix=/usr/local/gcc/mpfr-2.4.2 --with-gmp=/usr/local/gcc/gmp-4.3.2

make && make install

(3) Install mpc:

wget http://www.multiprecision.org/mpc/download/mpc-0.8.1.tar.gz

tar zxvf mpc-0.8.1.tar.gz

cd mpc-0.8.1

mkdir build

cd build

../configure --prefix=/usr/local/gcc/mpc-0.8.1 --with-mpfr=/usr/local/gcc/mpfr-2.4.2 --with-gmp=/usr/local/gcc/gmp-4.3.2 

make && make install

(4) Add the shared library paths: su to root, edit the /etc/ld.so.conf file, and add the following lines:

/usr/local/gcc/gmp-4.3.2/lib

/usr/local/gcc/mpfr-2.4.2/lib

/usr/local/gcc/mpc-0.8.1/lib

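Save and exit, then run the ldconfig command.

Running GCC's configure again still reported the error above, so pass the paths of the three libraries explicitly: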

../configure --prefix=/usr/local/gcc --enable-threads=posix --disable-checking --enable-languages=c,c++ --disable-multilib --with-gmp=/usr/local/gcc/gmp-4.3.2 --with-mpfr=/usr/local/gcc/mpfr-2.4.2 --with-mpc=/usr/local/gcc/mpc-0.8.1
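
The ld.so.conf entries and the --with-* flags above are both derived from the same three install prefixes. As an illustration only, the sketch below generates both lists from one table so the paths cannot drift apart; the helper names are hypothetical, and the prefixes and versions are the ones used in this post.

```python
# Install root and dependency versions as used in the steps above.
PREFIX = "/usr/local/gcc"
DEPS = [("gmp", "4.3.2"), ("mpfr", "2.4.2"), ("mpc", "0.8.1")]


def dep_prefix(name, version, root=PREFIX):
    """Install prefix of one dependency, e.g. /usr/local/gcc/gmp-4.3.2."""
    return f"{root}/{name}-{version}"


def ld_conf_lines(deps=DEPS):
    """Lines to append to ld.so.conf: the lib directory of each dependency."""
    return [dep_prefix(name, version) + "/lib" for name, version in deps]


def with_flags(deps=DEPS):
    """--with-<name>=<prefix> flags for GCC's configure."""
    return [f"--with-{name}={dep_prefix(name, version)}" for name, version in deps]


print("\n".join(ld_conf_lines()))
print(" ".join(with_flags()))
```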

Once configure succeeds, run make && make install (this takes a long time).

(5) Remove the old compiler and set up the new one:

yum remove gcc

yum remove gcc-c++

updatedb

cd /usr/bin // directory containing gcc and g++; check with which g++

ln -s /usr/local/gcc/bin/gcc gcc

ln -s /usr/local/gcc/bin/g++ g++

Installing Clang

sudo yum install clang

Building MXNet from source:

git clone --recursive https://github.com/dmlc/mxnet

cd mxnet;

cp make/config.mk .

make -j4

sudo yum install python-numpy # for redhat

cd python; sudo python setup.py install

import mxnet as mx

make error:

/usr/local/include/c++/4.8.0/condition_variable:83:5: note: no known conversion for implicit ‘this’ parameter from ‘const std::condition_variable*’ to ‘std::condition_variable*’

According to https://github.com/dmlc/mxnet/issues/530, this is caused by a GCC version that is too old; upgrade GCC as described above.

Error:

/usr/bin/ld: cannot find -lcblas

collect2: error: ld returned 1 exit status

make: *** [lib/libmxnet.so] Error 1

Make sure cblas and atlas are installed.
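
One rough way to see which of these libraries the system can locate is Python's ctypes.util.find_library. This is only a sketch: find_library searches for shared libraries much as the runtime loader would, which is not identical to how ld resolves -lcblas at link time, so treat a hit as a hint rather than proof. The name satlas is included because of the mshadow.mk change shown earlier; the function name is illustrative.

```python
from ctypes.util import find_library


def locate_libraries(names):
    """Map each short library name (as passed to -l) to find_library's result, or None."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, found in locate_libraries(["cblas", "atlas", "satlas"]).items():
        print(name, "->", found if found else "not found")
```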

Related issue: https://github.com/dmlc/mxnet/issues/1442

Error:

checking whether the C compiler works... no

configure: error: in `/root/App/MXNet/mxnet/ps-lite/protobuf-2.5.0':

configure: error: C compiler cannot create executables

See `config.log' for more details

make[1]: *** [/root/App/MXNet/mxnet/deps/include/google/protobuf/message.h] Error 77

make[1]: Leaving directory `/root/App/MXNet/mxnet/ps-lite'

make: *** [PSLITE] Error 2

Run MXNet on YARN:

http://mxnet.io/how_to/cloud.html

Description from the official documentation:

Use YARN, MPI, SGE

While ssh can be simple for cases when we do not have a cluster scheduling framework. MXNet is designed to be able to port to various platforms. We also provide other scripts in tracker to run on other cluster frameworks, including Hadoop(YARN) and SGE. Your contribution is more than welcomed to provide examples to run MXNet on your favourite distributed platform.

mxnet on yarn:

dmlc-submit --mode <cluster-mode> [arguments] [command]

To be tested: dmlc-submit -h

--cluster string, {'mpi', 'yarn', 'local', 'sge'}, default to ${DMLC_SUBMIT_CLUSTER}

Job submission mode.

--num-workers integer, required

Number of workers in the job.

--num-servers integer, default=0

Number of servers in the job.

--worker-cores integer, default=1

Number of cores needed to be allocated for worker job.

--server-cores integer, default=1

Number of cores needed to be allocated for server job.

--worker-memory string, default='1g'

Memory needed for worker job.

--server-memory string, default='1g'

Memory needed for server job.

--jobname string, default=auto specify

Name of the job.

--queue string, default='default'

The submission queue we should submit the job to.

--log-level string, {INFO, DEBUG}

The logging level.

--log-file string, default='None'

Output log to the specific log file, the log is still printed on stderr.

tracker]# ./dmlc-submit --cluster=yarn --num-workers=2 --worker-cores=1 --num-servers=2 ../../example/image-classification/train_mnist.py
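
To make the option list concrete, the sketch below assembles the same command line in Python and checks two constraints stated in the option list: the cluster mode must be one of the listed values and the number of workers is required. It only builds an argument list and does not run dmlc-submit; the function name build_submit_args is hypothetical.

```python
# Cluster modes as listed under --cluster above.
CLUSTERS = {"mpi", "yarn", "local", "sge"}


def build_submit_args(cluster, num_workers, command,
                      num_servers=0, worker_cores=1,
                      submit_script="./dmlc-submit"):
    """Build a dmlc-submit argument list using only flags from the list above."""
    if cluster not in CLUSTERS:
        raise ValueError(f"unknown cluster mode: {cluster!r}")
    if num_workers < 1:
        raise ValueError("--num-workers is required and must be at least 1")
    return [
        submit_script,
        f"--cluster={cluster}",
        f"--num-workers={num_workers}",
        f"--worker-cores={worker_cores}",
        f"--num-servers={num_servers}",
        command,
    ]


args = build_submit_args(
    "yarn", 2, "../../example/image-classification/train_mnist.py", num_servers=2
)
print(" ".join(args))
```

The resulting list can be passed to subprocess.run if the command should be launched from a script.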

source activate ml2

hdfs dfs -put train-* /tmp/mnist

hdfs dfs -chmod -R 777 /tmp/mnist

tools/launch.py -n 2 --launcher yarn python train_mnist.py --data-dir hdfs:///tmp/mnist/

mxnet on multiple cpus:

http://mxnet.io/how_to/multi_devices.html

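To run on a single machine, run directly: python train_mnist.py --network lenet

Prerequisite: MXNet has been built and installed on every machine, and the machines can reach each other over ssh.

cd mxnet/example/image-classification

echo "192.168.177.77" >> hosts // the current machine is 192.168.177.78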

../../tools/launch.py -n2 --launcher ssh -H hosts python train_mnist.py --network lenet --kv-store dist_sync

note that:

use launch.py to submit the job

  • provide launcher, ssh if all machines are ssh-able, mpi if mpirun is available, sge for Sun Grid Engine, and yarn for Apache Yarn.
  • -n number of worker nodes to run
  • -H the host file which is required by ssh and mpi
  • --kv-store use either dist_sync or dist_async
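
The notes above can be sketched as a small helper that produces the hosts file content and the launch.py command line. This is an illustration under stated assumptions: the function names are made up, the relative path to launch.py matches the example directory used in this post, and only the flags listed above are used.

```python
import shlex


def hosts_file_text(addresses):
    """Content of the hosts file passed with -H: one address per line."""
    return "".join(address + "\n" for address in addresses)


def build_launch_command(num_workers, hosts_file, script_args,
                         launcher="ssh", kv_store="dist_sync",
                         launch_script="../../tools/launch.py"):
    """Build the launch.py command as an argument list."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    if kv_store not in ("dist_sync", "dist_async"):
        raise ValueError("kv_store must be dist_sync or dist_async")
    return ([launch_script, "-n", str(num_workers),
             "--launcher", launcher, "-H", hosts_file, "python"]
            + list(script_args) + ["--kv-store", kv_store])


command = build_launch_command(2, "hosts", ["train_mnist.py", "--network", "lenet"])
print(" ".join(shlex.quote(part) for part in command))
```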

Performance comparison:

Runs on the two machines, .77 and .78:

Single machine:

python train_mnist.py --network lenet

19:26:27-20:44:57, 78 minutes

Two machines:

../../tools/launch.py -n 2 --launcher ssh -H hosts python train_mnist.py --network lenet --kv-store dist_sync

18:43:51-19:22:53, 39 minutes
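
The speedup implied by these timestamps can be worked out directly. The short sketch below assumes both runs started and finished on the same day, which the timestamps suggest but the post does not state.

```python
from datetime import datetime


def elapsed_seconds(start, end, fmt="%H:%M:%S"):
    """Seconds between two same-day wall-clock timestamps."""
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


single = elapsed_seconds("19:26:27", "20:44:57")  # one machine
double = elapsed_seconds("18:43:51", "19:22:53")  # two machines
speedup = single / double

print(f"single machine: {single / 60:.1f} min")  # 78.5 min
print(f"two machines:   {double / 60:.1f} min")  # 39.0 min
print(f"speedup:        {speedup:.2f}x")         # 2.01x
```

With one run of each configuration, this is a single data point and not a benchmark, but it is consistent with near-linear scaling from one machine to two for this job.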


Reposted from blog.csdn.net/CVAIDL/article/details/88801274