mxnet src/resource.cc:443: Check failed: e == CUDNN_STATUS_SUCCESS (8 vs. 0) : cuDNN: CUDNN_STATUS_

Error:

Traceback (most recent call last):
  File "train_0723.py", line 434, in <module>
    main()
  File "train_0723.py", line 430, in main
    train_net(args)
  File "train_0723.py", line 424, in train_net
    epoch_end_callback=epoch_cb)
  File "/home/user1/recognition/parall_module_local_v1_gluon_group.py", line 569, in fit
    self.forward_backward(data_batch, eval_metric)
  File "/home/user1/recognition/parall_module_local_v1_gluon_group.py", line 441, in forward_backward
    eval_metric.update(data_batch.label[0], preds, )
  File "/home/user1/miniconda3/lib/python3.7/site-packages/mxnet/metric.py", line 363, in update
    metric.update(labels, preds)
  File "/home/user1/miniconda3/lib/python3.7/site-packages/mxnet/metric.py", line 494, in update
    pred_label = pred_label.asnumpy().astype('int32')
  File "/home/user1/miniconda3/lib/python3.7/site-packages/mxnet/ndarray/ndarray.py", line 2535, in asnumpy
    ctypes.c_size_t(data.size)))
  File "/home/user1/miniconda3/lib/python3.7/site-packages/mxnet/base.py", line 255, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [17:41:29] src/resource.cc:443: Check failed: e == CUDNN_STATUS_SUCCESS (8 vs. 0) : cuDNN: CUDNN_STATUS_EXECUTION_FAILED
Stack trace:
  [bt] (0) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x6b41eb) [0x7f060fcc41eb]
  [bt] (1) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x40c62fa) [0x7f06136d62fa]
  [bt] (2) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x4634f7d) [0x7f0613c44f7d]
  [bt] (3) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x463fe26) [0x7f0613c4fe26]
  [bt] (4) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x464340c) [0x7f0613c5340c]
  [bt] (5) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37d6909) [0x7f0612de6909]
  [bt] (6) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37e33d5) [0x7f0612df33d5]
  [bt] (7) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37bf6d1) [0x7f0612dcf6d1]
  [bt] (8) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37c2c10) [0x7f0612dd2c10]

Solution: make batch_size a bit smaller.

https://discuss.gluon.ai/t/topic/6309/8
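As a rough illustration of "make batch_size smaller", the sketch below keeps halving the batch size until a training call stops failing. Everything in it is hypothetical and not from the original script: `run_with_smaller_batches`, `fake_train`, and the limit of 64 are made up for the example. With MXNet you would presumably pass `retry_on=(mxnet.base.MXNetError,)` and a function that wraps your own training call. A GPU context that has already failed may not be recoverable inside the same process, so in practice it may be simpler to restart the script with the smaller value.

```python
def run_with_smaller_batches(train_fn, batch_size, min_batch_size=1,
                             retry_on=(RuntimeError,)):
    """Call train_fn(batch_size), halving batch_size after each failure.

    Returns (batch_size_that_worked, result_of_train_fn).
    """
    while True:
        try:
            return batch_size, train_fn(batch_size)
        except retry_on:
            if batch_size <= min_batch_size:
                raise  # nothing smaller left to try
            batch_size = max(min_batch_size, batch_size // 2)


# Stand-in for a real training call: fails above a made-up limit of 64.
def fake_train(batch_size):
    if batch_size > 64:
        raise RuntimeError("cuDNN: CUDNN_STATUS_EXECUTION_FAILED")
    return "ok"


used, result = run_with_smaller_batches(fake_train, batch_size=512)
print(used, result)
```

Halving is only one possible schedule; trying a single smaller value by hand, as the fix above suggests, is often enough.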

Reposted from blog.csdn.net/qxqxqzzz/article/details/107643172