ai-serving


vllm-project/vllm v0.21.0rc1

GitHub Releases Watchlist · yesterday

vLLM v0.21.0rc1 is a pre-release of the high-performance LLM inference and serving library, featuring optimizations for throughput, quantization, and hardware support.
