Notes on an MXNet cuDNN error and how to fix it

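The fix is to move the system-installed cuDNN out of the way (with mv). Reportedly, this works because MXNet already ships with its own cuDNN. See https://discuss.mxnet.io/t/cudnn-status-success-8-vs-0-cudnn-cudnn-status-execution-failed/2424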
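Before moving anything, it can help to see which cuDNN copies are actually present on the system. The following is a minimal sketch, not part of the original post: the helper names `find_cudnn_libs` and `default_search_dirs` are hypothetical, and the directories `/usr/local/cuda/lib64` and `/usr/lib/x86_64-linux-gnu` are assumed common install locations, since the post does not say where the system cuDNN was installed. The script only lists files and moves nothing.

```python
import glob
import os


def find_cudnn_libs(search_dirs):
    """Return sorted paths of libcudnn shared objects found directly in search_dirs."""
    found = set()
    for directory in search_dirs:
        # Skip empty entries and directories that do not exist on this machine.
        if not directory or not os.path.isdir(directory):
            continue
        for path in glob.glob(os.path.join(directory, "libcudnn*.so*")):
            found.add(os.path.abspath(path))
    return sorted(found)


def default_search_dirs():
    """Directories from LD_LIBRARY_PATH plus two assumed common locations."""
    dirs = [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
    # Assumption: typical CUDA/cuDNN install locations; adjust for your system.
    dirs += ["/usr/local/cuda/lib64", "/usr/lib/x86_64-linux-gnu"]
    return dirs


if __name__ == "__main__":
    for lib in find_cudnn_libs(default_search_dirs()):
        print(lib)
```

Any paths it prints are candidates for the mv step described above. Moving them to a backup directory rather than deleting them keeps the change reversible.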

[17:14:43] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
Traceback (most recent call last):
  File "train_yolo.py", line 345, in <module>
    train(net, train_data, val_data, eval_metric, ctx, args)
  File "train_yolo.py", line 278, in train
    obj_metrics.update(0, obj_losses)
  File "/home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/metric.py", line 1636, in update
    loss = ndarray.sum(pred).asscalar()
  File "/home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/ndarray/ndarray.py", line 2014, in asscalar
    return self.asnumpy()[0]
  File "/home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/ndarray/ndarray.py", line 1996, in asnumpy
    ctypes.c_size_t(data.size)))
  File "/home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/base.py", line 253, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [17:14:45] src/operator/nn/./cudnn/cudnn_convolution-inl.h:287: Check failed: e == CUDNN_STATUS_SUCCESS (8 vs. 0) : cuDNN: CUDNN_STATUS_EXECUTION_FAILED
Stack trace:
  [bt] (0) /home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x4b03ab) [0x2afb4a9e93ab]
  [bt] (1) /home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x31b28c3) [0x2afb4d6eb8c3]
  [bt] (2) /home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x31dc375) [0x2afb4d715375]
  [bt] (3) /home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/libmxnet.so(mxnet::imperative::PushFCompute(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::engine::Var*, std::allocator<mxnet::engine::Var*> > const&, std::vector<mxnet::Resource, std::allocator<mxnet::Resource> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<unsigned int, std::allocator<unsigned int> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&)::{lambda(mxnet::RunContext)#1}::operator()(mxnet::RunContext) const+0x307) [0x2afb4cba1037]
  [bt] (4) /home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x25b50c0) [0x2afb4caee0c0]
  [bt] (5) /home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x25c1559) [0x2afb4cafa559]
  [bt] (6) /home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x25c49c0) [0x2afb4cafd9c0]
  [bt] (7) /home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x25c4c56) [0x2afb4cafdc56]
  [bt] (8) /home/zhangjianwei/.conda/envs/py27/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x25bfd64) [0x2afb4caf8d64]

Reposted from blog.csdn.net/northeastsqure/article/details/103480061