
Questions about the difference between the paper and the code implementation. #45

Open
epoch599 opened this issue Apr 19, 2022 · 1 comment

Comments


epoch599 commented Apr 19, 2022

Hi Antreas,
Thanks for the great work on MAML++! It has been very helpful.
I have the following questions:

  1. The LSLR mentioned in the paper does not seem to be reflected in the code: the learning rate of each layer looks fixed, which would be no different from directly setting a single inner-loop learning rate.
  2. The paper mentions CA (cosine annealing of the outer-loop learning rate), but args.meta_learning_rate and args.min_learning_rate are set to the same value. In that case the outer-loop learning rate stays constant and cosine annealing has no effect (see the sketch at the end of this comment).

Thank you in advance for your time!
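
For context, here is a minimal sketch of how cosine annealing of the outer-loop learning rate is typically wired up in PyTorch. Only the argument names args.meta_learning_rate and args.min_learning_rate come from the question above; the model, optimizer, and args.total_epochs are hypothetical stand-ins, not the repository's actual code.

```python
from types import SimpleNamespace

import torch
import torch.nn as nn
from torch.optim.lr_scheduler import CosineAnnealingLR

# Stand-in for the repo's parsed arguments; only meta_learning_rate and
# min_learning_rate are names taken from the question, total_epochs is a
# hypothetical name for the annealing horizon.
args = SimpleNamespace(meta_learning_rate=1e-3, min_learning_rate=1e-5,
                       total_epochs=100)

model = nn.Linear(10, 5)  # toy stand-in for the meta-learner's parameters
meta_optimizer = torch.optim.Adam(model.parameters(), lr=args.meta_learning_rate)

# Anneal the outer-loop LR from meta_learning_rate down to min_learning_rate.
# If the two values are equal, the schedule collapses to a constant and the
# annealing has no effect, which is the behaviour described in question 2.
scheduler = CosineAnnealingLR(meta_optimizer, T_max=args.total_epochs,
                              eta_min=args.min_learning_rate)

for epoch in range(args.total_epochs):
    # ... run the outer-loop (meta) update for this epoch here ...
    scheduler.step()
```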

@AntreasAntoniou (Owner) commented

  1. The inner-loop learning rates are learnable parameters, not fixed (a minimal sketch of this pattern is included after this list).
  2. Yes, cosine annealing originally offered further improvements back in 2019, but as the commonly used layers improved over time it became less useful. It is still worth considering for new projects. For replicating the full MAML++ result from the paper you don't really need it, although enabling it could potentially improve things further. The ablation table in the paper shows the individual contribution it can make in a given setup.
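
For reference, a minimal sketch of what learnable per-layer, per-step inner-loop learning rates (LSLR) can look like in PyTorch. The class and variable names here are illustrative assumptions, not the repository's actual implementation.

```python
import torch
import torch.nn as nn


class LSLRInnerLoop(nn.Module):
    """Per-layer, per-step inner-loop learning rates stored as learnable parameters."""

    def __init__(self, param_names, num_inner_steps, init_lr=0.1):
        super().__init__()
        # One vector of learning rates per named parameter (layer), one entry
        # per inner-loop step. ParameterDict keys may not contain dots.
        self.lrs = nn.ParameterDict({
            name.replace('.', '-'): nn.Parameter(torch.full((num_inner_steps,), init_lr))
            for name in param_names
        })

    def update(self, named_params, grads, step):
        # Differentiable inner-loop update: theta' = theta - alpha[layer, step] * grad.
        # Because each alpha is an nn.Parameter, it receives gradients from the
        # outer-loop loss and is updated by the meta-optimizer, i.e. it is
        # learned rather than fixed.
        return {
            name: param - self.lrs[name.replace('.', '-')][step] * grads[name]
            for name, param in named_params.items()
        }


# Usage sketch for one inner-loop adaptation step:
model = nn.Linear(4, 2)
names_and_params = dict(model.named_parameters())
lslr = LSLRInnerLoop(list(names_and_params.keys()), num_inner_steps=5)
loss = model(torch.randn(3, 4)).sum()
grads = dict(zip(names_and_params.keys(),
                 torch.autograd.grad(loss, list(names_and_params.values()),
                                     create_graph=True)))
fast_weights = lslr.update(names_and_params, grads, step=0)
```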
