
Extend to working with batches of variable length prompts #11

Open
rchan26 opened this issue Aug 16, 2024 · 0 comments

rchan26 (Collaborator) commented Aug 16, 2024

Currently, the implementation only works for batches of prompts of the same length (the input is a single tensor and generation starts after it). It would be nice to accept a list of prompts of variable lengths.

This requires a masking strategy and a way for the model to know which positions are generated. If the prompts in a batch have different lengths, generation has to start at the position just after the shortest prompt, but at that position the longer prompts already contain known tokens. We therefore need to keep the known prompt tokens in place and only save newly generated tokens into the remaining positions.
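A minimal sketch of one possible approach (not the current implementation; `generate_variable_length`, `model`, and `pad_id` are hypothetical names), assuming a PyTorch model that maps a `(batch, seq_len)` token tensor to `(batch, seq_len, vocab)` logits, greedy decoding, and no KV caching for clarity:

```python
import torch


def generate_variable_length(model, prompts, max_gen_len, pad_id=0):
    """Batched generation for prompts of different lengths (sketch)."""
    bsz = len(prompts)
    prompt_lens = [len(p) for p in prompts]
    min_len, max_len = min(prompt_lens), max(prompt_lens)
    total_len = max_len + max_gen_len

    # Right-pad every prompt into a single (bsz, total_len) tensor.
    tokens = torch.full((bsz, total_len), pad_id, dtype=torch.long)
    for i, p in enumerate(prompts):
        tokens[i, : len(p)] = torch.tensor(p, dtype=torch.long)

    # True where a position holds an original prompt token.
    input_text_mask = torch.zeros((bsz, total_len), dtype=torch.bool)
    for i, plen in enumerate(prompt_lens):
        input_text_mask[i, :plen] = True

    # Start generating at the end of the shortest prompt; longer prompts
    # still have known tokens at these positions, which must be kept.
    for cur_pos in range(min_len, total_len):
        logits = model(tokens[:, :cur_pos])           # (bsz, cur_pos, vocab)
        next_token = logits[:, -1, :].argmax(dim=-1)  # greedy choice
        # Keep the known prompt token if there is one, otherwise save
        # the newly generated token.
        tokens[:, cur_pos] = torch.where(
            input_text_mask[:, cur_pos], tokens[:, cur_pos], next_token
        )

    # Trim each row back to its own prompt plus generated continuation.
    return [
        tokens[i, : prompt_lens[i] + max_gen_len].tolist() for i in range(bsz)
    ]
```

The key piece is the `input_text_mask` lookup at each step, which keeps known tokens and only writes generated ones, matching the requirement described above.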
