
What is the relation between a learning rate scheduler and an optimizer?

If I have a model:

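A minimal sketch of what this could look like, assuming a single linear layer (the exact architecture doesn't matter for the question):

```python
import torch
import torch.nn as nn

# any nn.Module would do; a single linear layer keeps the example small
model = nn.Linear(10, 2)
```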

And then I define my inputs, optimizer (with lr=0.1), and scheduler (with base_lr=1e-3), and run training:

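Continuing from the model above, and assuming CyclicLR as the scheduler (since base_lr is one of its constructor arguments) together with some dummy data:

```python
inputs = torch.randn(8, 10)
targets = torch.randn(8, 2)
criterion = nn.MSELoss()

optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-3, max_lr=0.1,
    cycle_momentum=False,  # Adam has no momentum parameter, so momentum cycling must be disabled
)

for epoch in range(10):
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    scheduler.step()
```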

The optimizer seems to take on the learning rate of the scheduler.

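For instance, printing the learning rate stored in the optimizer right after the scheduler is created already shows the scheduler's base_lr rather than the 0.1 that was passed to Adam:

```python
# checked right after creating the scheduler, before any training step
print(optimizer.param_groups[0]['lr'])  # 0.001 (the scheduler's base_lr), not 0.1
```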

Does the learning rate scheduler overwrite the optimizer? How does it connect to it? I'm trying to understand the relation between them (i.e. how they interact, etc.).


Answer

TL;DR: The LR scheduler holds a reference to the optimizer and explicitly alters the learning rates of its parameter groups.


As mentioned in the official PyTorch documentation, the learning rate scheduler receives the optimizer as an argument in its constructor, and thus has access to its parameter groups.

The common use is to update the LR after every epoch:

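A minimal, self-contained version of that pattern, using StepLR purely as an example (any scheduler is driven the same way):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)

data, target = torch.randn(8, 10), torch.randn(8, 2)
criterion = nn.MSELoss()

for epoch in range(20):
    optimizer.zero_grad()
    loss = criterion(model(data), target)
    loss.backward()
    optimizer.step()   # the parameter update uses the lr currently stored in the optimizer
    scheduler.step()   # the scheduler then writes the next lr back into optimizer.param_groups
```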

All optimizers inherit from a common parent class torch.optim.Optimizer and are updated using the step method implemented for each of them.

Similarly, all LR schedulers (besides ReduceLROnPlateau) inherit from a common parent class named _LRScheduler. Looking at its source code reveals that in the step method the class indeed changes the LR of the optimizer's parameter groups:

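The exact code differs between PyTorch versions, but the relevant part of the step method boils down to computing the new learning rates and assigning them to the optimizer's param_groups, roughly:

```python
# simplified excerpt of torch.optim.lr_scheduler._LRScheduler.step
# (details vary across PyTorch versions)
def step(self, epoch=None):
    if epoch is None:
        epoch = self.last_epoch + 1
    self.last_epoch = epoch
    # the scheduler reaches into the optimizer it was constructed with
    # and overwrites the lr of every parameter group
    for param_group, lr in zip(self.optimizer.param_groups, self.get_lr()):
        param_group['lr'] = lr
```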