A Few PyTorch Pitfalls

  1. tensor.detach() creates a tensor that shares storage with tensor but does not require grad, removing it from the computation graph.
    tensor.clone() creates a copy of tensor that inherits the original tensor's requires_grad field. The copy stays part of the computation graph it came from, so gradients still flow back through it.
    tensor.data returns a new tensor that shares storage with tensor, but it always has requires_grad=False. (See the first sketch after this list.)
  2. The gradient can be understood as a first-order approximation: the gradient with respect to a variable tells you roughly how much the output changes per unit change in that variable, i.e. f(x + Δx) ≈ f(x) + ∇f(x)·Δx. (See the second sketch after this list.)
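
Below is a minimal sketch (assuming a recent PyTorch build; the variable names are illustrative) that makes the three behaviors in point 1 concrete:

```python
import torch

x = torch.ones(3, requires_grad=True)

# detach(): shares storage with x, requires_grad=False,
# and is cut out of the computation graph.
d = x.detach()
print(d.requires_grad)                # False
print(d.data_ptr() == x.data_ptr())   # True: same underlying storage

# clone(): a real copy that inherits requires_grad; gradients
# still flow through the clone back to x.
c = x.clone()
print(c.requires_grad)                # True
(c * 2).sum().backward()
print(x.grad)                         # tensor([2., 2., 2.])

# .data: shares storage like detach() and always has
# requires_grad=False, but in-place writes through it bypass
# autograd's correctness checks, so detach() is the safer choice.
d2 = x.data
print(d2.requires_grad)               # False
```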
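
And a quick numerical check of the first-order view in point 2: for a small step dx, the change in the output is approximately grad * dx. The function f and the step size here are arbitrary choices for illustration:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
f = x ** 3            # f(x) = x^3, so df/dx = 3x^2 = 12 at x = 2
f.backward()

dx = 1e-4
with torch.no_grad():
    predicted = x.grad * dx              # first-order prediction of the change
    actual = (x + dx) ** 3 - x ** 3      # true change in f
print(predicted.item(), actual.item())   # both ≈ 0.0012
```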

Reposted from www.cnblogs.com/lifengfan/p/10367911.html