vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Last commit: Apr 3, 2026
Stars: 75,186