Problems encountered building and installing MXNet for Python: a roundup (Part 2)

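Last time I covered building and installing MXNet. This time I cover building and installing the variant optimized for Intel CPUs (mxnet-mkl). This is also driven by work needs: at the moment MXNet model inference takes too long for us (nearly 10x slower than in development).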

First, download the source code. I used the latest release, mxnet-1.6.0. One detail worth noting: mxnet-1.6.0 is also the last release that supports Python 2; later releases no longer support it.

wget  https://github.com/apache/incubator-mxnet/archive/1.6.0.tar.gz

Step 1: Prepare the environment:

Step 2: Switch to the Python package directory and install:

cd python

pip install -e .

Step 3: Verify that the installation is correct

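import mxnet as mx
import numpy as np

# Batch of 1: x is 10x8 and w is 12x8, so x . w^T is 10x12
shape_x = (1, 10, 8)
shape_w = (1, 12, 8)

# Random float32 inputs (the executor below defaults to float32)
x_npy = np.random.normal(0, 1, shape_x).astype(np.float32)
w_npy = np.random.normal(0, 1, shape_w).astype(np.float32)

# Symbolic graph: y = batch_dot(x, w^T)
x = mx.sym.Variable('x')
w = mx.sym.Variable('w')
y = mx.sym.batch_dot(x, w, transpose_b=True)
exe = y.simple_bind(mx.cpu(), x=x_npy.shape, w=w_npy.shape)

# Copy the inputs into the executor and run inference
exe.forward(is_train=False, x=x_npy, w=w_npy)
o = exe.outputs[0]
t = o.asnumpy()  # shape (1, 10, 12)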
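Besides running a workload, the build flags can be queried directly. The following is a minimal sketch, assuming an MXNet build that ships the mxnet.runtime module (1.5 or later, as far as I know); the helper name mkldnn_enabled is my own and not part of MXNet.

```python
def mkldnn_enabled():
    """Return True/False for the MKLDNN build flag, or None if it cannot be queried.

    mkldnn_enabled is a hypothetical helper name used only for this sketch.
    """
    try:
        from mxnet.runtime import Features
    except (ImportError, OSError):
        # mxnet is not installed, failed to load, or is too old to expose mxnet.runtime
        return None
    try:
        return bool(Features().is_enabled("MKLDNN"))
    except RuntimeError:
        # This build does not list a feature called MKLDNN
        return None


print("MKLDNN enabled:", mkldnn_enabled())
```

If this prints False after the install in step 2, the library was most likely built without MKL-DNN support and the build flags are worth rechecking.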

For a more detailed check:

# Enable MKL verbose logging by setting this environment variable before running the script:
export MKL_VERBOSE=1

Output:

Numpy + Intel(R) MKL: THREADING LAYER: (null)
Numpy + Intel(R) MKL: setting Intel(R) MKL to use INTEL OpenMP runtime
Numpy + Intel(R) MKL: preloading libiomp5.so runtime
MKL_VERBOSE Intel(R) MKL 2019.0 Update 3 Product build 20190125 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors, Lnx 2.40GHz lp64 intel_thread NMICDev:0
MKL_VERBOSE SGEMM(T,N,12,10,8,0x7f7f927b1378,0x1bc2140,8,0x1ba8040,8,0x7f7f927b1380,0x7f7f7400a280,12) 8.93ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:40 WDiv:HOST:+0.000
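To relate the SGEMM line in the log above to the test script, here is a small sketch that pulls out the matrix sizes and timing. The regular expression is my own reading of that one log line, not an official description of the MKL_VERBOSE format, and parse_sgemm is a hypothetical helper name.

```python
import re

# Pattern inferred from the single SGEMM line in the log above (an assumption,
# not an official description of the MKL_VERBOSE output format).
SGEMM_RE = re.compile(
    r"MKL_VERBOSE SGEMM\((?P<transa>[NTC]),(?P<transb>[NTC]),"
    r"(?P<m>\d+),(?P<n>\d+),(?P<k>\d+),.*?\)\s+"
    r"(?P<time>[\d.]+)(?P<unit>ns|us|ms|s)\b.*?NThr:(?P<threads>\d+)"
)


def parse_sgemm(line):
    """Extract transpose flags, m/n/k, timing and thread count from one log line."""
    match = SGEMM_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("m", "n", "k", "threads"):
        info[key] = int(info[key])
    info["time"] = float(info["time"])
    return info


line = (
    "MKL_VERBOSE SGEMM(T,N,12,10,8,0x7f7f927b1378,0x1bc2140,8,0x1ba8040,8,"
    "0x7f7f927b1380,0x7f7f7400a280,12) 8.93ms CNR:OFF Dyn:1 FastMM:1 TID:0  "
    "NThr:40 WDiv:HOST:+0.000"
)
print(parse_sgemm(line))
```

Here m=12 and n=10 describe the output in BLAS column-major terms, which is the same memory as the row-major 10x12 result of batch_dot, and k=8 is the shared inner dimension of x and w. Seeing this line at all indicates that the matrix product was dispatched to MKL.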

Step 4: MKL-based inference optimization settings

export MXNET_SUBGRAPH_BACKEND=MKLDNN

For more on MKL-DNN graph optimization, see https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
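When the inference job is launched from Python rather than from a shell, the same variable can be passed to the child process. This is only a sketch: mkldnn_env is a hypothetical helper name, and the child command is a stand-in that echoes the variable back; in practice it would be your own inference script.

```python
import os
import subprocess
import sys


def mkldnn_env(verbose=False):
    """Copy the current environment and add the settings from step 4.

    mkldnn_env is a hypothetical helper name used only for this sketch.
    """
    env = dict(os.environ)
    env["MXNET_SUBGRAPH_BACKEND"] = "MKLDNN"
    if verbose:
        # Same flag as the detailed check in step 3
        env["MKL_VERBOSE"] = "1"
    return env


# Stand-in child process: it only echoes the variable to show that it is inherited.
child_code = "import os; print(os.environ['MXNET_SUBGRAPH_BACKEND'])"
output = subprocess.check_output([sys.executable, "-c", child_code], env=mkldnn_env())
print(output.decode().strip())
```

Passing the variable through the child's environment avoids depending on when the running process reads it, which I have not verified for setting it after MXNet has already been imported.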

Finally, for the full list of operators supported by MKL-DNN, see https://github.com/apache/incubator-mxnet/blob/v1.5.x/docs/tutorials/mkldnn/operator_list.md


Reposted from blog.csdn.net/jinhao_2008/article/details/104718114