[Feature]: Guided Decoding Schema Cache Store #8902

Open

berniwal opened this issue Sep 27, 2024 · 1 comment
@berniwal

berniwal commented Sep 27, 2024

🚀 The feature, motivation and pitch

Problem

I am currently working with structured outputs and have experimented a little with vLLM + Outlines. Since our JSON schemas can get quite complex, generating the FSM can take around two minutes per schema. It would be great to have a feature where you can provide a schema store that saves the generated FSMs to a local file over time and reloads them when you restart your deployment. Ideally this would be exposed as a flag in the vllm serve arguments:

https://docs.vllm.ai/en/latest/models/engine_args.html
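
For illustration, such a store could be little more than a directory of serialized FSMs keyed by a hash of the JSON schema. The class below is a purely hypothetical sketch (the name `GuidedDecodingSchemaStore`, the file layout, and the hashing scheme are not existing vLLM or Outlines code), and it assumes the compiled FSM object is picklable:

```python
# Hypothetical sketch of a persistent schema store; not part of vLLM or Outlines.
import hashlib
import pickle
from pathlib import Path


class GuidedDecodingSchemaStore:
    """Persist compiled FSMs keyed by a hash of the JSON schema string."""

    def __init__(self, directory: str) -> None:
        self._dir = Path(directory)
        self._dir.mkdir(parents=True, exist_ok=True)

    def _path(self, schema: str) -> Path:
        key = hashlib.sha256(schema.encode("utf-8")).hexdigest()
        return self._dir / f"{key}.pkl"

    def load(self, schema: str):
        """Return the FSM compiled in an earlier run, or None on a cache miss."""
        path = self._path(schema)
        if path.exists():
            with path.open("rb") as f:
                return pickle.load(f)
        return None

    def save(self, schema: str, fsm) -> None:
        """Serialize a freshly compiled FSM so later restarts can reuse it."""
        with self._path(schema).open("wb") as f:
            pickle.dump(fsm, f)
```

A vllm serve flag could then simply take the directory path backing such a store.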

Current Implementation

I assume this is currently not supported and that recomputation of the schema is only avoided via the @cache() decorator here:
[Screenshot: the @cache()-decorated code referenced above]
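
If that decorator is a plain in-process cache, the compiled FSM is lost whenever the server process restarts. A minimal sketch of the difference, with `compile_fsm` standing in for the real (slow) construction step rather than any actual vLLM/Outlines function, and a made-up cache path:

```python
# Contrast an in-memory cache with a disk-backed one for FSM compilation.
from functools import cache

import diskcache  # as far as I can tell, the same library Outlines uses for its own cache


def compile_fsm(schema: str):
    """Placeholder for the expensive FSM construction from a JSON schema."""
    ...


@cache  # in-memory only: recomputed from scratch after every restart
def compile_fsm_in_memory(schema: str):
    return compile_fsm(schema)


_disk = diskcache.Cache("/mnt/persistent/fsm-cache")  # hypothetical persistent path


@_disk.memoize()  # persisted: survives restarts as long as the directory does
def compile_fsm_on_disk(schema: str):
    return compile_fsm(schema)
```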

Alternatives

An alternative solution would probably be to write custom Python code that handles this for my use case and to use the vLLM Python API for generation instead of the vllm serve command (see the sketch below). However, I am not sure how this could be handled with the API deployment.
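
As a rough sketch of that Python-level path, assuming the Outlines high-level API available at the time (outlines.models.vllm and outlines.generate.json; exact names may differ between Outlines versions, and the model is just an example):

```python
# Sketch: generate schema-constrained output via the Python API instead of vllm serve.
import outlines

schema = """{"type": "object", "properties": {"name": {"type": "string"}}}"""

# Loads the model through vLLM's offline engine.
model = outlines.models.vllm("mistralai/Mistral-7B-Instruct-v0.3")

# The FSM for the schema is compiled (and cached by Outlines) at this point.
generator = outlines.generate.json(model, schema)

result = generator("Extract the name: Alice went to the market.")
print(result)
```

The drawback is that this bypasses the OpenAI-compatible server, so any persistence of compiled schemas has to be wired up manually around whatever ends up serving the API.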

Additional context

PS: I am happy to contribute this feature if it is useful to other people and also makes sense to those who know the code base better.

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@simon-mo
Collaborator

Yes, contributions are welcome. However, I believe Outlines already has a schema cache nowadays, so it might be a better idea to first investigate why that didn't work, or how to get that schema cache working with a configurable path.
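
If that cache is the diskcache-based one, its directory should be configurable via the OUTLINES_CACHE_DIR environment variable (worth double-checking against the Outlines version vLLM pins); if so, something along these lines before Outlines is imported may already cover the request:

```python
# Assumption: Outlines honors OUTLINES_CACHE_DIR for its on-disk FSM cache.
# Pointing it at a persistent volume would let the cache survive restarts.
import os

os.environ.setdefault("OUTLINES_CACHE_DIR", "/mnt/persistent/outlines-cache")
```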
