Evaluate vLLM / SGLang benchmarks for baseline when making EzDeploy config changes #5

Open
stikkireddy opened this issue Sep 5, 2024 · 0 comments


stikkireddy commented Sep 5, 2024

  • Run one of the benchmarks on either 1xA100 or 2xA100 and confirm that the results improve before doing either of the following:

    • Upgrading the version of vLLM or SGLang
    • Changing ez-config defaults for quantization, speculative decoding, prefix caching, or block size
  • Tests should be simple notebooks that can be scheduled as a job, running the base engines on a single-node VM.

  • The goal is not to benchmark end-to-end latency but to identify performance issues when making config changes.
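
The check described above could be sketched in a notebook cell roughly as follows. This is only an illustration, not part of EzDeploy: `BenchResult`, `find_regressions`, the two metric names, and the 5% tolerance are all hypothetical, and the numbers in the usage example are placeholders rather than measurements. It assumes the notebook has already run a vLLM or SGLang benchmark once with the current config and once with the proposed change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchResult:
    """Summary of one benchmark run (hypothetical metric names)."""
    label: str
    throughput_tok_s: float  # output tokens per second, higher is better
    ttft_ms: float           # time to first token in ms, lower is better


def find_regressions(baseline: BenchResult,
                     candidate: BenchResult,
                     tolerance: float = 0.05) -> list[str]:
    """Return the names of metrics where the candidate is worse than the
    baseline by more than `tolerance` (a relative fraction, assumed 5%)."""
    regressions = []
    # Throughput regresses when it drops below the tolerated floor.
    if candidate.throughput_tok_s < baseline.throughput_tok_s * (1 - tolerance):
        regressions.append("throughput_tok_s")
    # Time to first token regresses when it rises above the tolerated ceiling.
    if candidate.ttft_ms > baseline.ttft_ms * (1 + tolerance):
        regressions.append("ttft_ms")
    return regressions


if __name__ == "__main__":
    # Placeholder values for illustration only, not real measurements.
    baseline = BenchResult("current config", throughput_tok_s=1000.0, ttft_ms=200.0)
    candidate = BenchResult("proposed config", throughput_tok_s=900.0, ttft_ms=250.0)
    failed = find_regressions(baseline, candidate)
    if failed:
        print(f"Regression in: {', '.join(failed)}")
    else:
        print("No regression beyond tolerance")
```

A scheduled job could fail the notebook run whenever the returned list is non-empty, which flags a config change or version upgrade for review without attempting a full end-to-end latency study.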
