Successfully solving RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Problem Description

This error appeared while training a network with PyTorch:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3008, 128]], which is output 0 of AsStridedBackward0, is at version 4; expected version 3 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Traceback (most recent call last):
  File "E:\Code\PyCharm\Contrast learning\main_real.py", line 57, in <module>
    main()
  File "E:\Code\PyCharm\Contrast learning\main_real.py", line 51, in main
    train_causal_real(dataset, model_func, args, file)
  File "E:\Code\PyCharm\Contrast learning\train_causal.py", line 128, in train_causal_real
    args)
  File "E:\Code\PyCharm\Contrast learning\train_causal.py", line 582, in train_causal_epoch
    loss.backward()
  File "D:\anaconda3\envs\graph\lib\site-packages\torch\_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "D:\anaconda3\envs\graph\lib\site-packages\torch\autograd\__init__.py", line 175, in backward
    allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3008, 128]], which is output 0 of AsStridedBackward0, is at version 4; expected version 3 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Cause Analysis:

This error usually occurs during automatic differentiation in PyTorch and is caused by an in-place operation, i.e. modifying the value of an existing variable directly instead of creating a new variable to hold the result. In PyTorch, an in-place operation can invalidate a tensor that autograd saved during the forward pass, making it impossible to compute the gradient.
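As a minimal sketch (not the code from this project), the same class of error can be reproduced with a single in-place update: exp() saves its output for the backward pass, so modifying that output in place bumps its version counter and backward() fails with the same kind of message.

import torch

a = torch.randn(5, requires_grad=True)
b = torch.exp(a)     # exp saves its output b to compute the gradient in backward
b += 1               # in-place add bumps b's version counter (version 0 -> 1)

# RuntimeError: one of the variables needed for gradient computation has been
# modified by an inplace operation ... is at version 1; expected version 0 instead.
b.sum().backward()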

In my case, the problem came from how I used a BatchNorm layer:

x = torch.tensor([1.0])   # toy input tensor (shape details omitted here)
x = bn(x)                 # the BatchNorm output overwrites x, the tensor the graph still needs

Solution:

The easiest solution is to create a new variable to receive the modified value:

x = torch.tensor([1.0])
y = bn(x)                 # store the BatchNorm output in a new variable; the original x is left unchanged

More generally, you can avoid in-place operations altogether, or perform them on a copy of the tensor instead of on the original, as in the sketch below.
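The error message itself also suggests how to locate the offending operation. A short sketch of both remedies, with illustrative variable names (this is not the project's code):

import torch

# Point backward() at the forward operation that caused the failure:
torch.autograd.set_detect_anomaly(True)

a = torch.randn(5, requires_grad=True)
b = torch.exp(a)

# Instead of modifying b in place (b += 1, b.add_(1), ...), either write the
# result to a new tensor, or perform the in-place update on a copy:
c = b + 1        # out-of-place: b stays untouched for the backward pass
d = b.clone()    # clone() gives a copy with its own storage and version counter
d += 1           # in-place on the copy is safe here

(c.sum() + d.sum()).backward()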

Origin blog.csdn.net/m0_47256162/article/details/130678460