
Extend to working with batches of variable length prompts #11

Open
rchan26 opened this issue Aug 16, 2024 · 0 comments

rchan26 (Collaborator) commented Aug 16, 2024

Currently, the implementation only works for batches of prompts of the same length (the input is a single tensor and generation starts after it). It would be nice to accept a list of prompts of variable lengths.

This requires a masking strategy and a way for the model to know which positions are generated. If the prompts in a batch have different lengths, generation has to start at the position just after the shortest prompt, but at that position the longer prompts already contain known tokens. We therefore need to keep the known prompt tokens in place and only save newly generated tokens into the remaining positions.
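A minimal sketch of one possible approach (not the current implementation; `generate_variable_length`, `model`, and `pad_id` are hypothetical names), assuming a PyTorch model that maps a `(batch, seq_len)` token tensor to `(batch, seq_len, vocab)` logits, greedy decoding, and no KV caching for clarity:

```python
import torch


def generate_variable_length(model, prompts, max_gen_len, pad_id=0):
    """Batched generation for prompts of different lengths (sketch)."""
    bsz = len(prompts)
    prompt_lens = [len(p) for p in prompts]
    min_len, max_len = min(prompt_lens), max(prompt_lens)
    total_len = max_len + max_gen_len

    # Right-pad every prompt into a single (bsz, total_len) tensor.
    tokens = torch.full((bsz, total_len), pad_id, dtype=torch.long)
    for i, p in enumerate(prompts):
        tokens[i, : len(p)] = torch.tensor(p, dtype=torch.long)

    # True where a position holds an original prompt token.
    input_text_mask = torch.zeros((bsz, total_len), dtype=torch.bool)
    for i, plen in enumerate(prompt_lens):
        input_text_mask[i, :plen] = True

    # Start generating at the end of the shortest prompt; longer prompts
    # still have known tokens at these positions, which must be kept.
    for cur_pos in range(min_len, total_len):
        logits = model(tokens[:, :cur_pos])           # (bsz, cur_pos, vocab)
        next_token = logits[:, -1, :].argmax(dim=-1)  # greedy choice
        # Keep the known prompt token if there is one, otherwise save
        # the newly generated token.
        tokens[:, cur_pos] = torch.where(
            input_text_mask[:, cur_pos], tokens[:, cur_pos], next_token
        )

    # Trim each row back to its own prompt plus generated continuation.
    return [
        tokens[i, : prompt_lens[i] + max_gen_len].tolist() for i in range(bsz)
    ]
```

The key piece is the `input_text_mask` lookup at each step, which keeps known tokens and only writes generated ones, matching the requirement described above.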
