Loss.cu:97, RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered

Server: Ubuntu 18
Environment: Python 3.6, PyTorch
Error message:
/opt/conda/conda-bld/pytorch_1591914838379/work/aten/src/ATen/native/cuda/Loss.cu:97: operator(): block: [0,0,0], thread: [31,0,0] Assertion `input_val >= zero && input_val <= one` failed.
Traceback (most recent call last):
  File "trainning.py", line 126, in <module>
    loss, outputs = model(imgs, targets)
  File "/.conda/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/models.py", line 262, in forward
    x, layer_loss = module[0](x, targets, img_dim)
  File "/.conda/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/models.py", line 200, in forward
    loss_conf_noobj = self.bce_loss(pred_conf[noobj_mask], tconf[noobj_mask])
RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered

Output seen before the error:

loss: the loss had already become NaN.
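A NaN value fails both comparisons in the assertion `input_val >= zero && input_val <= one`, so a NaN prediction is enough to trigger the device-side assert. The following is a minimal sketch of a check that could be run on the predictions before calling the loss, so that the failure shows up as a readable Python error. The helper name `check_bce_input` and the sample tensors are hypothetical and are not part of the original code.

```python
import torch
import torch.nn as nn


def check_bce_input(pred: torch.Tensor, name: str = "pred") -> None:
    """Raise a readable error if `pred` is not a valid BCELoss input.

    Hypothetical helper: BCELoss expects every input value in [0, 1].
    """
    if torch.isnan(pred).any():
        raise ValueError(f"{name} contains NaN")
    if (pred < 0).any() or (pred > 1).any():
        raise ValueError(f"{name} has values outside [0, 1]")


bce_loss = nn.BCELoss()
target = torch.tensor([0.0, 1.0, 0.0])

# Valid probabilities: the check passes and the loss is finite.
good = torch.tensor([0.1, 0.9, 0.2])
check_bce_input(good, "pred_conf")
loss = bce_loss(good, target)

# A NaN prediction is caught before it reaches the loss.
bad = torch.tensor([0.1, float("nan"), 0.2])
try:
    check_bce_input(bad, "pred_conf")
    message = ""
except ValueError as err:
    message = str(err)
```

In the traceback above, such a check would sit just before the `self.bce_loss(pred_conf[noobj_mask], tconf[noobj_mask])` call.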

Solution:
1. Lower the learning rate.
2. Change the update so that
optimizer.step()
optimizer.zero_grad()
run once per step, that is, on every training iteration.
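The two changes can be sketched together as a small training loop. This is only an illustration under assumptions: the tiny model, the random data, and the learning rate `1e-4` are hypothetical stand-ins, not the original network or its settings.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real model and data.
model = nn.Sequential(nn.Linear(4, 1), nn.Sigmoid())
bce_loss = nn.BCELoss()
inputs = torch.rand(32, 4)
targets = (inputs.sum(dim=1, keepdim=True) > 2).float()

# Change 1: a lower learning rate (1e-4 is an assumed value).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

losses = []
for step in range(5):
    outputs = model(inputs)
    loss = bce_loss(outputs, targets)

    # Stop early with a clear message if the loss is NaN or infinite.
    if not torch.isfinite(loss):
        raise RuntimeError(f"non-finite loss at step {step}")

    loss.backward()

    # Change 2: update and reset the gradients on every iteration.
    optimizer.step()
    optimizer.zero_grad()

    losses.append(loss.item())
```

Calling `optimizer.zero_grad()` on every iteration means gradients from earlier batches are never summed into the next update, which keeps each update smaller.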


Reposted from blog.csdn.net/qq_42321818/article/details/107640387