This repository has been archived by the owner on Jul 1, 2024. It is now read-only.

Auto scale learning rate based on batch size #287

Open
vreis opened this issue Dec 5, 2019 · 1 comment
Labels
enhancement New feature or request

Comments


vreis commented Dec 5, 2019

🚀 Feature

Auto scale learning rate based on batch size

Motivation

Changing the number of workers in distributed training requires adjusting hyperparameters. Goyal et al. (https://arxiv.org/abs/1706.02677) proposed a linear scaling rule to adjust the learning rate based on the batch size: when the minibatch size is multiplied by k, multiply the learning rate by k.

Pitch

ClassificationTask should have a flag (default True) that rescales the learning rate based on the batch size. The task is a natural place to put this, since we don't want every parameter scheduler to reimplement the same logic. We could put this in the optimizer instead, but I have a sense it would require more boilerplate.
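
A minimal sketch of what that flag could look like (the names auto_scale_lr, base_batch_size, and scale_lr below are hypothetical, not part of the current ClassificationTask API):

    class ClassificationTask:
        def __init__(self, auto_scale_lr: bool = True, base_batch_size: int = 256):
            # Hypothetical flag: when True, rescale the optimizer's
            # learning rate based on the global batch size.
            self.auto_scale_lr = auto_scale_lr
            self.base_batch_size = base_batch_size

        def scale_lr(self, base_lr: float, batch_size: int) -> float:
            # Linear scaling rule (https://arxiv.org/abs/1706.02677):
            # multiply the learning rate by batch_size / base_batch_size.
            if self.auto_scale_lr:
                return base_lr * batch_size / self.base_batch_size
            return base_lr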

Alternatives

Hydra (http://hydra.cc) would enable a different solution to this problem: the config file could have a "rescale" parameter for the learning rate, and we could use the interpolation feature to rescale by "1/{batch_size}", where batch_size is defined elsewhere in the config.

vreis added the enhancement label Dec 5, 2019

omry commented Jan 9, 2020

Interpolation does not support arithmetic operations (there is an enhancement request in OmegaConf that I will consider in the future).

For now, you could use interpolation to get the batch size into the model, and do the auto scaling in code:

model:
   params:
      ...
      batch_size: ${batch_size}

and do the division in the code.
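
A minimal sketch of that approach, assuming OmegaConf (the concrete values, base_lr, and the reference batch size of 256 are illustrative, not from the issue):

    from omegaconf import OmegaConf

    # Illustrative stand-in for the config above; in practice this
    # would be loaded from the YAML file.
    cfg = OmegaConf.create({
        "batch_size": 512,
        "model": {"params": {"batch_size": "${batch_size}", "base_lr": 0.1}},
    })

    # ${batch_size} resolves to the top-level value on access.
    batch_size = cfg.model.params.batch_size

    # The arithmetic happens in code, not in the config: here, the linear
    # scaling rule relative to an illustrative reference batch size of 256.
    lr = cfg.model.params.base_lr * batch_size / 256
    print(lr)  # 0.2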
