
Questions about the difference between the paper and the code implementation. #45

Open
epoch599 opened this issue Apr 19, 2022 · 1 comment

Comments


epoch599 commented Apr 19, 2022

Hi Antreas,
Thanks for the great work on MAML++! It has been very helpful.
I have the following questions:

  1. The LSLR mentioned in the paper does not seem to be reflected in the code: the learning rate of each layer looks fixed, which would be no different from directly setting a single inner-loop learning rate.
  2. The paper mentions CA (cosine annealing of the outer-loop learning rate), but args.meta_learning_rate and args.min_learning_rate are set to the same value. In that case the outer-loop learning rate stays constant and cosine annealing has no effect (see the sketch at the end of this comment).

Thank you in advance for your time!
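
For context, here is a minimal sketch of how cosine annealing of the outer-loop learning rate is typically wired up in PyTorch. Only the argument names args.meta_learning_rate and args.min_learning_rate come from the question above; the model, optimizer, and args.total_epochs are hypothetical stand-ins, not the repository's actual code.

```python
from types import SimpleNamespace

import torch
import torch.nn as nn
from torch.optim.lr_scheduler import CosineAnnealingLR

# Stand-in for the repo's parsed arguments; only meta_learning_rate and
# min_learning_rate are names taken from the question, total_epochs is a
# hypothetical name for the annealing horizon.
args = SimpleNamespace(meta_learning_rate=1e-3, min_learning_rate=1e-5,
                       total_epochs=100)

model = nn.Linear(10, 5)  # toy stand-in for the meta-learner's parameters
meta_optimizer = torch.optim.Adam(model.parameters(), lr=args.meta_learning_rate)

# Anneal the outer-loop LR from meta_learning_rate down to min_learning_rate.
# If the two values are equal, the schedule collapses to a constant and the
# annealing has no effect, which is the behaviour described in question 2.
scheduler = CosineAnnealingLR(meta_optimizer, T_max=args.total_epochs,
                              eta_min=args.min_learning_rate)

for epoch in range(args.total_epochs):
    # ... run the outer-loop (meta) update for this epoch here ...
    scheduler.step()
```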

@AntreasAntoniou (Owner) commented

  1. The inner-loop learning rates are learnable parameters, not fixed (a minimal sketch of this pattern is included after this list).
  2. Yes, cosine annealing originally offered further improvements back in 2019, but as the commonly used layers improved over time it became less useful. It is still worth considering for new projects. For replicating the full MAML++ result from the paper you don't really need it, although enabling it could potentially improve things further. The ablation table in the paper shows the individual contribution it can make in a given setup.
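
For reference, a minimal sketch of what learnable per-layer, per-step inner-loop learning rates (LSLR) can look like in PyTorch. The class and variable names here are illustrative assumptions, not the repository's actual implementation.

```python
import torch
import torch.nn as nn


class LSLRInnerLoop(nn.Module):
    """Per-layer, per-step inner-loop learning rates stored as learnable parameters."""

    def __init__(self, param_names, num_inner_steps, init_lr=0.1):
        super().__init__()
        # One vector of learning rates per named parameter (layer), one entry
        # per inner-loop step. ParameterDict keys may not contain dots.
        self.lrs = nn.ParameterDict({
            name.replace('.', '-'): nn.Parameter(torch.full((num_inner_steps,), init_lr))
            for name in param_names
        })

    def update(self, named_params, grads, step):
        # Differentiable inner-loop update: theta' = theta - alpha[layer, step] * grad.
        # Because each alpha is an nn.Parameter, it receives gradients from the
        # outer-loop loss and is updated by the meta-optimizer, i.e. it is
        # learned rather than fixed.
        return {
            name: param - self.lrs[name.replace('.', '-')][step] * grads[name]
            for name, param in named_params.items()
        }


# Usage sketch for one inner-loop adaptation step:
model = nn.Linear(4, 2)
names_and_params = dict(model.named_parameters())
lslr = LSLRInnerLoop(list(names_and_params.keys()), num_inner_steps=5)
loss = model(torch.randn(3, 4)).sum()
grads = dict(zip(names_and_params.keys(),
                 torch.autograd.grad(loss, list(names_and_params.values()),
                                     create_graph=True)))
fast_weights = lslr.update(names_and_params, grads, step=0)
```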
