Debug Flash GPT2 #4

Open: wants to merge 19 commits into base: main

@abhi-mosaic commented Oct 19, 2022

This PR cleans up the HF Flash model and also makes it easy to train a vanilla HF GPT2LMHeadModel.

All FSDP-related code has been moved out of the HF model class and is now fully contained in the function prepare_hf_gpt2_model_for_fsdp(model).
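For readers unfamiliar with the pattern, here is a rough sketch of what such a standalone helper could look like. It is not the code from this PR. It assumes the trainer reads the attributes `_fsdp_wrap`, `fsdp_wrap_fn`, and `activation_checkpointing_fn` from the model, which should be checked against the Composer version in use. The names `prepare_model_for_fsdp_sketch`, `Block`, and `toy` are hypothetical, and plain stand-in objects replace the real GPT2LMHeadModel so that the snippet runs without any dependencies.

```python
from types import SimpleNamespace


class Block:
    """Hypothetical stand-in for a transformer block class such as GPT2Block."""


def prepare_model_for_fsdp_sketch(model, block_cls):
    """Attach FSDP wrapping hints to `model` in place, outside the model class.

    Assumption: the trainer looks for `_fsdp_wrap`, `fsdp_wrap_fn`, and
    `activation_checkpointing_fn`; these names are not confirmed by the PR text.
    """
    # GPT-2's LM head shares weights with the token embedding (wte), so both
    # should stay in the same FSDP unit: opt them out of separate wrapping.
    model.transformer._fsdp_wrap = False
    model.transformer.wte._fsdp_wrap = False
    model.lm_head._fsdp_wrap = False

    # Wrap and activation-checkpoint every transformer block.
    model.fsdp_wrap_fn = lambda module: isinstance(module, block_cls)
    model.activation_checkpointing_fn = lambda module: isinstance(module, block_cls)
    return model


# Toy object that mimics the attribute layout of a GPT-2 style LM head model.
toy = SimpleNamespace(
    transformer=SimpleNamespace(wte=SimpleNamespace(), h=[Block(), Block()]),
    lm_head=SimpleNamespace(),
)
prepare_model_for_fsdp_sketch(toy, Block)
```

Keeping the hints in a free function means the model class itself stays a vanilla HF model, and the same helper can be skipped entirely when FSDP is not used.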

The Composer version has also been upgraded slightly. I will pin it to the v0.11 tag as soon as I can (ideally by the end of the week).
