
Tag: gradient-descent

In PyTorch, how do I update a neural network via the average gradient from a list of losses?

I have a toy reinforcement learning project based on the REINFORCE algorithm (here’s PyTorch’s implementation) that I would like to add batch updates to. In RL, the “target” can only be created after a “prediction” has been made, so standard batching techniques do not apply. As such, I accrue losses for each episode and append them to a list l_losses
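One way to take a single update from the average of such a list is to stack the per-episode losses, take their mean, and call backward() once. The sketch below is only an illustration under assumptions: the tiny `policy` network, the random states, and the placeholder return of 1.0 are hypothetical stand-ins, not the code from the post. It assumes each entry of l_losses is a scalar tensor still attached to the computation graph.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the policy network (not from the original post).
policy = nn.Linear(4, 2)
optimizer = torch.optim.SGD(policy.parameters(), lr=0.1)

l_losses = []
for _ in range(3):  # three pretend "episodes"
    state = torch.randn(4)  # made-up observation
    log_probs = torch.log_softmax(policy(state), dim=0)
    episode_return = 1.0  # placeholder return, assumed for illustration
    # REINFORCE-style scalar loss for this episode, still attached to the graph
    l_losses.append(-log_probs[0] * episode_return)

optimizer.zero_grad()
batch_loss = torch.stack(l_losses).mean()  # average of the per-episode losses
batch_loss.backward()  # gradient of the mean equals the mean of the gradients
optimizer.step()
```

Because differentiation is linear, backpropagating through the mean gives the same parameter gradients as averaging the gradients of the individual losses.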

SGDRegressor() constantly not increasing validation performance

The performance of my SGDRegressor on the validation set (test) neither improves nor degrades after around 20,000 training records. Even if I switch penalty or early_stopping (True/False), or set alpha and eta0 to extremely high or low values, the “stuck” validation score does not change. I used StandardScaler and shuffled the data for

Why do we need to call zero_grad() in PyTorch?

Why does zero_grad() need to be called during training? Answer: In PyTorch, for every mini-batch during the training phase, we typically want to explicitly set the gradients to zero before starting backpropagation (i.e., computing the gradients used to update the weights and biases), because PyTorch accumulates the gradients on subsequent backward passes. This accumulating behaviour is convenient while training RNNs or when we
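The accumulation described in this excerpt can be seen in a small sketch. The tensor `w`, the toy loss, and the SGD optimizer are assumptions made for illustration and do not come from the original answer.

```python
import torch

# A single parameter tensor with a simple loss: sum(w ** 2), whose gradient is 2 * w.
w = torch.tensor([1.0, 2.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

(w ** 2).sum().backward()
first = w.grad.clone()  # gradient after one backward pass

# No zero_grad() here: the new gradient is added to the one already stored.
(w ** 2).sum().backward()
accumulated = w.grad.clone()  # twice the single-pass gradient

opt.zero_grad()  # clear stored gradients before the next mini-batch
(w ** 2).sum().backward()
fresh = w.grad.clone()  # back to the single-pass gradient
```

Without the zero_grad() call, the second backward pass doubles the stored gradient, so an optimizer step would use a mix of the current and previous mini-batch gradients.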
