Issues: vllm-project/vllm
[Bug]: vllm serve --config.yaml - Order of arguments matters? (bug) #8947, opened Sep 29, 2024 by FloWsnr
[Question]: Apply LoRA adapter on quantized model (misc) #8945, opened Sep 29, 2024 by Tejaswgupta
[Feature]: Qwen2.5 bitsandbytes support (feature request) #8941, opened Sep 29, 2024 by hanan9m
[New Model]: Molmo support (new model) #8940, opened Sep 29, 2024 by win4r
[Usage]: caching with different batches (usage) #8939, opened Sep 29, 2024 by KevinZeng08
[Bug]: Error when using tensor_parallel in v0.6.1.post1 or 0.6.2 (bug) #8937, opened Sep 29, 2024 by ruleGreen
[Bug]: vLLM 0.6.2 UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown (bug) #8933, opened Sep 29, 2024 by Clint-chan
[Feature]: Get logits instead of logprobs for distillation (feature request) #8926, opened Sep 28, 2024 by nivibilla
[Bug]: Model multimodal config initialisation unhandled and irrelevant error when no architectures found (bug) #8923, opened Sep 28, 2024 by AminAlam
[Performance]: TTFT regression from v0.5.4 to 0.6.2 (performance) #8918, opened Sep 27, 2024 by rickyyx
[RFC]: QuantizationConfig and QuantizeMethodBase Refactor for Simplifying Kernel Integrations (RFC) #8913, opened Sep 27, 2024 by LucasWilkinson
[Usage]: LLM with tensor_parallel_size larger than the number of GPUs in one node (usage) #8908, opened Sep 27, 2024 by gpucce
[Usage]: guided_regex in offline model (usage) #8907, opened Sep 27, 2024 by RonanKMcGovern
[Bug]: Tokenization Mismatch Between HuggingFace and vLLM (bug) #8904, opened Sep 27, 2024 by rafapi
[Feature]: Guided Decoding Schema Cache Store (feature request) #8902, opened Sep 27, 2024 by berniwal
[Performance]: Talk about the model parallelism (performance) #8898, opened Sep 27, 2024 by baifanxxx
[Bug]: RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method (usage) #8893, opened Sep 27, 2024 by Hothan01
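The error in #8893 above reflects a general CPython/CUDA constraint rather than anything vLLM-specific: a forked child process inherits the parent's already-initialized CUDA context and cannot re-initialize it, so worker processes must be created with the "spawn" start method. A minimal sketch using only the standard library (no vLLM API assumed; `cuda_safe_context` is a hypothetical helper name):

```python
import multiprocessing as mp

def cuda_safe_context():
    """Return a multiprocessing context whose workers are safe to combine
    with CUDA: "spawn" starts each child in a fresh interpreter instead of
    forking, so no initialized CUDA state is inherited from the parent."""
    return mp.get_context("spawn")

ctx = cuda_safe_context()
print(ctx.get_start_method())  # spawn
```

Pools, queues, and `Process` objects created from `ctx` then all use "spawn"; the trade-off is slower worker startup, since each child re-imports its modules.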
[Bug]: Variance Between Multiple Prefix Cache Example runs (bug) #8890, opened Sep 27, 2024 by Imss27
[Installation]: FAILED: CMakeFiles/_C.dir/csrc/quantization/machete/generated/machete_mm_bf16u4_impl_part0.cu.o (installation) #8889, opened Sep 27, 2024 by wangshuai09
[Bug]: assert len(self._async_stopped) == 0 (bug) #8881, opened Sep 27, 2024 by sfc-gh-zhwang
[Bug]: Starting with the quantized parameter --quantization=awq causes a restart (bug) #8877, opened Sep 27, 2024 by SongXiaoMao
[Bug]: Server fails with --cpu-offload-gb (aqlm, bug) #8873, opened Sep 26, 2024 by JMPSequeira
[Feature]: Add model context information to chat template (feature request) #8869, opened Sep 26, 2024 by maxdebayser
[Performance]: Slowdown compared to Gradio (performance) #8866, opened Sep 26, 2024 by theoren