[FEATURE] Support ray serve engine #35

Open
stikkireddy opened this issue Sep 11, 2024 · 1 comment
Labels
enhancement New feature or request

stikkireddy commented Sep 11, 2024

Ray Serve is a phenomenal serving engine that abstracts model serving and provides throughput optimizations such as request batching, async execution, and pipelining. It supports PyTorch and other popular frameworks. It could be used to serve the following kinds of models:

  1. Custom embedding models with post-processors
  2. Standard embedding models
  3. Encoder-decoder models like Whisper
  4. Diffusion models
  5. Multi-model serving
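
To make the batching point concrete, here is a minimal sketch of dynamic request batching, the kind of throughput optimization mentioned above. It uses only the Python standard library and is not Ray Serve's implementation (Ray Serve exposes batching through its `@serve.batch` decorator). The names `MicroBatcher`, `toy_embed`, and `demo`, and the parameters `max_batch_size` and `wait_timeout_s`, are hypothetical and chosen for illustration only; `toy_embed` stands in for a real embedding model.

```python
import asyncio


def toy_embed(texts):
    """Hypothetical stand-in for a model: one 1-d 'embedding' per input text."""
    return [[float(len(t))] for t in texts]


class MicroBatcher:
    """Collects single requests and runs them through batch_fn in groups."""

    def __init__(self, batch_fn, max_batch_size=4, wait_timeout_s=0.05):
        self.batch_fn = batch_fn
        self.max_batch_size = max_batch_size
        self.wait_timeout_s = wait_timeout_s
        self.queue = asyncio.Queue()
        self.batch_sizes = []  # recorded only so the demo can inspect batching

    async def submit(self, item):
        """Enqueue one request and wait for its individual result."""
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((item, fut))
        return await fut

    async def run(self):
        """Worker loop: build a batch until it is full or the timeout expires."""
        loop = asyncio.get_running_loop()
        while True:
            batch = [await self.queue.get()]
            deadline = loop.time() + self.wait_timeout_s
            while len(batch) < self.max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(
                        await asyncio.wait_for(self.queue.get(), remaining)
                    )
                except asyncio.TimeoutError:
                    break
            # One model call for the whole batch, then fan results back out.
            outputs = self.batch_fn([item for item, _ in batch])
            self.batch_sizes.append(len(batch))
            for (_, fut), out in zip(batch, outputs):
                fut.set_result(out)


async def demo(texts):
    """Send all texts concurrently and return (results, observed batch sizes)."""
    batcher = MicroBatcher(toy_embed, max_batch_size=4, wait_timeout_s=0.05)
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(*(batcher.submit(t) for t in texts))
    worker.cancel()
    try:
        await worker
    except asyncio.CancelledError:
        pass
    return results, batcher.batch_sizes


if __name__ == "__main__":
    results, sizes = asyncio.run(demo(["a", "bb", "ccc", "dddd", "eeeee"]))
    print(results, sizes)
```

Callers still see a one-request, one-response interface, while the model is invoked once per batch. A serving engine that handles this for us means each model type in the list above would not need its own queueing logic.
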
@stikkireddy stikkireddy added the enhancement New feature or request label Sep 11, 2024
@stikkireddy

Some common embedding models:

  • Multimodal text-image embedding: clip-ViT-B-32
  • Hebrew embeddings: dicta-il/dictabert-joint
  • French embeddings: almanach/camembert-base

@stikkireddy stikkireddy changed the title [FEATURE] Support bentoml engine [FEATURE] Support ray serve engine Sep 17, 2024