Problem Description
Training a network in PyTorch raised the following error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3008, 128]], which is output 0 of AsStridedBackward0, is at version 4; expected version 3 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Traceback (most recent call last):
File "E:\Code\PyCharm\Contrast learning\main_real.py", line 57, in <module>
main()
File "E:\Code\PyCharm\Contrast learning\main_real.py", line 51, in main
train_causal_real(dataset, model_func, args, file)
File "E:\Code\PyCharm\Contrast learning\train_causal.py", line 128, in train_causal_real
args)
File "E:\Code\PyCharm\Contrast learning\train_causal.py", line 582, in train_causal_epoch
loss.backward()
File "D:\anaconda3\envs\graph\lib\site-packages\torch\_tensor.py", line 363, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "D:\anaconda3\envs\graph\lib\site-packages\torch\autograd\__init__.py", line 175, in backward
allow_unreachable=True, accumulate_grad=True) # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3008, 128]], which is output 0 of AsStridedBackward0, is at version 4; expected version 3 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
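As the error hint suggests, PyTorch's anomaly detection can pinpoint the forward-pass operation that produced the corrupted tensor. A minimal sketch of how to enable it (the call goes near the top of the training script, before the training loop):

```python
import torch

# With anomaly detection enabled, the RuntimeError raised in loss.backward()
# is accompanied by the forward-pass stack trace of the operation that
# created the tensor later modified in place.
torch.autograd.set_detect_anomaly(True)

# ... build model, run forward pass, then loss.backward() as usual ...

# It can also be scoped to a single region with a context manager:
# with torch.autograd.detect_anomaly():
#     loss.backward()
```

Anomaly detection slows training noticeably, so turn it off again once the offending operation has been found.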
Cause Analysis:
This error occurs during automatic differentiation in PyTorch. It is typically caused by an in-place operation, i.e. modifying a tensor's value directly instead of creating a new tensor to hold the result. Autograd saves certain tensors during the forward pass and checks their version counters during backward (hence "is at version 4; expected version 3" in the message); modifying a saved tensor in place corrupts the computation graph and makes the gradient impossible to compute.
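The version-counter mechanism behind the error can be reproduced in a few lines. This is an illustrative sketch, not the original training code: exp() saves its own output for the backward pass, so mutating that output in place bumps its version counter and backward fails with the same RuntimeError:

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x.exp()   # autograd saves y itself to compute the gradient of exp
y += 1        # in-place add bumps y's version counter (0 -> 1)

try:
    y.sum().backward()
except RuntimeError as e:
    # "one of the variables needed for gradient computation has been
    #  modified by an inplace operation ..."
    print(type(e).__name__, e)
```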
In my case, the mistake was passing a tensor through a BatchNorm layer and overwriting the input variable with the output:

bn = torch.nn.BatchNorm1d(128)
x = torch.randn(3008, 128)
x = bn(x)   # the output overwrites x
Solution:
The simplest fix is to assign the BatchNorm output to a new variable instead of overwriting the input:

bn = torch.nn.BatchNorm1d(128)
x = torch.randn(3008, 128)
y = bn(x)   # x is left unmodified

More generally, avoid in-place operations (e.g. +=, add_(), nn.ReLU(inplace=True)) on tensors that autograd needs for the backward pass, or apply them to a copy (x.clone()) instead of the original tensor.
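Both remedies can be sketched in a few lines (illustrative code, not the original training script): assigning to a new variable leaves the saved tensor intact, and clone() gives a safe copy when the in-place update itself is wanted:

```python
import torch

x = torch.ones(3, requires_grad=True)

# Fix 1: out-of-place. y is saved for backward and never mutated.
y = x.exp()
z = y + 1            # creates a new tensor instead of modifying y
z.sum().backward()
print(x.grad)        # gradient of exp, i.e. equal to x.exp()

# Fix 2: clone. The copy can be modified in place without touching
# the tensor autograd saved.
x.grad = None
y = x.exp()
w = y.clone()
w += 1               # in-place, but on the copy
w.sum().backward()
print(x.grad)
```

In both variants backward succeeds because the tensor saved by exp() is never modified after the forward pass.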