🚀 The feature, motivation and pitch
Problem
I am currently working with structured outputs and have been experimenting a little with vLLM + Outlines. Since our JSON Schemas can get quite complex, generating the FSM can take around two minutes per schema. It would be great to have a feature where you can provide a schema store that saves the compiled FSMs to a local file over time and reloads them when you restart your deployment. Ideally this would be implemented as a flag in the vllm serve arguments:
https://docs.vllm.ai/en/latest/models/engine_args.html
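To make the pitch concrete, here is a rough sketch of what such a schema store could look like. Everything in it is hypothetical (vLLM has no SchemaStore class or schema-store flag today): compiled FSMs are keyed by a hash of their JSON Schema and pickled to a local directory that survives restarts.

```python
# Hypothetical sketch of the proposed schema store; none of these names
# exist in vLLM today. Compiled FSMs are keyed by a hash of their JSON
# Schema and persisted to a local directory across restarts.
import hashlib
import json
import pickle
from pathlib import Path


class SchemaStore:
    def __init__(self, store_dir: str):
        self.store_dir = Path(store_dir)
        self.store_dir.mkdir(parents=True, exist_ok=True)

    def _key(self, schema: dict) -> str:
        # Canonicalize so semantically identical schemas hash alike.
        canonical = json.dumps(schema, sort_keys=True).encode()
        return hashlib.sha256(canonical).hexdigest()

    def load(self, schema: dict):
        path = self.store_dir / f"{self._key(schema)}.pkl"
        if path.exists():
            with path.open("rb") as f:
                return pickle.load(f)  # previously compiled FSM
        return None  # not compiled yet

    def save(self, schema: dict, fsm) -> None:
        path = self.store_dir / f"{self._key(schema)}.pkl"
        with path.open("wb") as f:
            pickle.dump(fsm, f)
```

On startup, vllm serve could then take something like --guided-decoding-schema-store /mnt/fsm-cache (again, a hypothetical flag) and pre-populate the compilation cache from that directory. One caveat: pickled FSMs are only safe to reload under the same Outlines/vLLM versions that produced them, so the store would need some versioning.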
Current Implementation
I assume that this is currently not supported and that the logic to avoid recomputing the FSM is handled with the @cache() decorator.
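If that cache turns out to be an in-process, functools-style cache, a disk-backed memoizer would be the natural drop-in. A minimal sketch using the third-party diskcache package (the cache path and function below are illustrative, not the actual vLLM/Outlines code path):

```python
# Illustrative only: a persistent replacement for an in-memory @cache,
# using the third-party `diskcache` package. Entries are written to disk,
# so compiled FSMs survive process restarts.
import diskcache

fsm_cache = diskcache.Cache("/var/cache/vllm-fsm")  # any persistent path

@fsm_cache.memoize()
def compile_fsm(schema_json: str):
    # Stand-in for the expensive (~2 min) FSM compilation. The first call
    # computes and stores the result; later calls, including calls after
    # a restart, are served from the on-disk entry instead of recompiling.
    ...
```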
Alternatives
An alternative would probably be to write custom Python code to handle this for my use case and use vLLM's Python API for generation instead of the vllm serve command. However, I am not sure how you could handle this with the API deployment.
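As a rough sketch of that alternative: newer vLLM releases expose guided decoding in the offline API via GuidedDecodingParams (check that your installed version has it; older versions wired Outlines in differently). The model name and schema below are placeholders.

```python
# Sketch of guided JSON generation through vLLM's offline Python API,
# assuming a vLLM version that provides GuidedDecodingParams.
from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams

schema = {  # placeholder schema; ours are far more complex
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
}

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # placeholder model
params = SamplingParams(
    max_tokens=128,
    guided_decoding=GuidedDecodingParams(json=schema),
)
outputs = llm.generate(["Return a JSON object with a name."], params)
print(outputs[0].outputs[0].text)
```

The downside, as noted above, is that you then own the serving layer yourself (for example, wrapping this in your own HTTP app) instead of getting the OpenAI-compatible server from vllm serve for free.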
Additional context
PS: Happy to contribute this feature if it is useful to other people and also makes sense to those who understand the code base better.
Yes, contributions are welcome. However, I believe Outlines already has a schema cache nowadays; it might be a better idea to first investigate why that didn't work here, or how to get that cache working with a configurable path.
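For what it's worth: if memory serves, Outlines' cache is diskcache-based, defaults to ~/.cache/outlines, and honours the OUTLINES_CACHE_DIR environment variable, so pointing it at a persistent volume may already get most of the way there. Worth verifying against the installed Outlines version; a sketch:

```python
# Assumption to verify: Outlines reads OUTLINES_CACHE_DIR when it first
# initialises its disk cache, so it must be set before outlines (or vllm,
# which imports it) is first imported.
import os

os.environ["OUTLINES_CACHE_DIR"] = "/mnt/persistent/outlines-cache"

# For a `vllm serve` deployment, set it in the process environment instead:
#   OUTLINES_CACHE_DIR=/mnt/persistent/outlines-cache vllm serve <model> ...
```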