
In PyTorch, how to calculate the gradient for an element in a tensor when it is used to calculate another element of the same tensor?

In this PyTorch code:

import torch
a = torch.tensor([2.], requires_grad=True)
y = torch.zeros((10))
gt = torch.zeros((10))

y[0] = a
y[1] = y[0] * 2
y.retain_grad()

loss = torch.sum((y-gt) ** 2)
loss.backward()
print(y.grad)

I want y[0]’s gradient to consist of 2 parts:

  1. the gradient of the loss with respect to y[0] itself;
  2. y[0] is used to calculate y[1], so it should also receive the part of y[1]’s gradient that flows back through that multiplication.

But when I run this code, y[0]’s gradient only contains part 1.

So how can I make y[0]’s gradient include both parts?

Edit: the actual output is:

tensor([4., 8., 0., 0., 0., 0., 0., 0., 0., 0.])

but I expect:

tensor([20., 8., 0., 0., 0., 0., 0., 0., 0., 0.])
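
For reference, that 20 is just the chain rule applied by hand to the two parts above (a quick sanity check, using y[0] = 2 and y[1] = 4 from the code):

part1 = 2 * 2.0        # d loss / d y[0] = 2 * y[0] = 4
part2 = (2 * 4.0) * 2  # d loss / d y[1] * d y[1] / d y[0] = 8 * 2 = 16
print(part1 + part2)   # 20.0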


Answer

y[0] and y[1] are two different elements of y, so each gets its own, independent entry in y.grad. The only thing that “binds” them is their underlying relation to a. If you inspect the grad of a, you’ll see:

print(a.grad)
tensor([20.])

That is, the two parts of the gradient are combined in a.grad.
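
If you want both parts to show up on a per-element tensor rather than only on a, one possible workaround (a minimal sketch, not from the original post: it keeps y[0] as its own intermediate tensor, here called y0, and retains the grad on that) is:

import torch

a = torch.tensor([2.], requires_grad=True)
gt = torch.zeros(10)

y0 = a * 1        # y0 stands in for y[0] as a single node in the graph
y0.retain_grad()  # keep the gradient accumulated on this intermediate
y1 = y0 * 2

y = torch.zeros(10)
y[0] = y0
y[1] = y1

loss = torch.sum((y - gt) ** 2)
loss.backward()

print(y0.grad)  # tensor([20.])  -> 4 (direct) + 16 (through y[1])
print(a.grad)   # tensor([20.])

Here both contributions accumulate on y0 because it is a single node in the autograd graph, rather than two separate entries of y.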
