In this PyTorch code:
```python
import torch

a = torch.tensor([2.], requires_grad=True)
y = torch.zeros((10))
gt = torch.zeros((10))

y[0] = a
y[1] = y[0] * 2
y.retain_grad()

loss = torch.sum((y - gt) ** 2)
loss.backward()
print(y.grad)
```
I want y[0]’s gradient to consist of two parts:

- the gradient flowing from the loss to y[0] itself;
- since y[0] is used to compute y[1], the part of y[1]’s gradient that flows back to y[0].

But when I run this code, y[0]’s gradient only contains part 1.
How can I make y[0]’s gradient include both parts?
Edit: the output is:

```
tensor([4., 8., 0., 0., 0., 0., 0., 0., 0., 0.])
```

but I expect:

```
tensor([20., 8., 0., 0., 0., 0., 0., 0., 0., 0.])
```
Answer
y[0] and y[1] are two different elements of y, so each gets its own entry in y.grad. The only thing that “binds” them is their shared dependence on a. If you inspect the grad of a, you’ll see:

```python
print(a.grad)
# tensor([20.])
```

That is, the two parts of the gradient are combined in a.grad: the direct term contributes 2 * y[0] = 4, and the path through y[1] contributes 2 * y[1] * 2 = 16, for a total of 20.
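
If you also want both contributions to show up on a single tensor’s .grad, one option is to keep the intermediate value as its own tensor and assemble y from it without in-place writes, so every use of that tensor accumulates into the same .grad. A minimal sketch under that assumption (the names y0/y1 and the torch.cat assembly are illustrative, not from the original code):

```python
import torch

a = torch.tensor([2.], requires_grad=True)

# Keep the intermediate as its own tensor so gradients from every use of it
# (the direct loss term and the path through y1) accumulate in one place.
y0 = a * 1            # illustrative stand-in for y[0]; any op that puts it in the graph works
y0.retain_grad()      # y0 is a non-leaf tensor, so ask autograd to keep its grad
y1 = y0 * 2           # stand-in for y[1]

y = torch.cat([y0, y1, torch.zeros(8)])  # build the 10-element vector without in-place writes
gt = torch.zeros(10)

loss = torch.sum((y - gt) ** 2)
loss.backward()

print(y0.grad)  # tensor([20.])  <- 4 from the direct term + 16 via y1
print(a.grad)   # tensor([20.])
```

Here y0.grad collects both parts and matches a.grad, while y.grad (if retained) would still show 4 and 8 per element, because each element of y remains a distinct node in the graph.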