In this PyTorch code:
```python
import torch

a = torch.tensor([2.], requires_grad=True)
y = torch.zeros((10))
gt = torch.zeros((10))

y[0] = a
y[1] = y[0] * 2
y.retain_grad()

loss = torch.sum((y - gt) ** 2)
loss.backward()
print(y.grad)
```
I want y[0]’s gradient to consist of two parts:

- the gradient flowing from the loss to y[0] itself;
- since y[0] is used to compute y[1], the part of y[1]’s gradient that flows back to y[0].

But when I run this code, y[0]’s gradient only contains part 1.
How can I make y[0]’s gradient include both parts?
Edit: the output is:

```
tensor([4., 8., 0., 0., 0., 0., 0., 0., 0., 0.])
```

but I expect:

```
tensor([20., 8., 0., 0., 0., 0., 0., 0., 0., 0.])
```
Answer
y[0] and y[1] are two different elements of y, so each gets its own entry in y.grad. The only thing that “binds” them is their shared dependence on a. If you inspect the grad of a, you’ll see:

```python
print(a.grad)
# tensor([20.])
```

That is, the two parts of the gradient are combined in a.grad: the direct term contributes 2 * y[0] = 4, and the path through y[1] contributes 2 * y[1] * 2 = 16, for a total of 20.
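
If you also want both contributions to show up on a single tensor’s .grad, one option is to keep the intermediate value as its own tensor and assemble y from it without in-place writes, so every use of that tensor accumulates into the same .grad. A minimal sketch under that assumption (the names y0/y1 and the torch.cat assembly are illustrative, not from the original code):

```python
import torch

a = torch.tensor([2.], requires_grad=True)

# Keep the intermediate as its own tensor so gradients from every use of it
# (the direct loss term and the path through y1) accumulate in one place.
y0 = a * 1            # illustrative stand-in for y[0]; any op that puts it in the graph works
y0.retain_grad()      # y0 is a non-leaf tensor, so ask autograd to keep its grad
y1 = y0 * 2           # stand-in for y[1]

y = torch.cat([y0, y1, torch.zeros(8)])  # build the 10-element vector without in-place writes
gt = torch.zeros(10)

loss = torch.sum((y - gt) ** 2)
loss.backward()

print(y0.grad)  # tensor([20.])  <- 4 from the direct term + 16 via y1
print(a.grad)   # tensor([20.])
```

Here y0.grad collects both parts and matches a.grad, while y.grad (if retained) would still show 4 and 8 per element, because each element of y remains a distinct node in the graph.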