ai-serving


vllm-project/vllm v0.21.0rc1

GitHub Releases Watchlist · yesterday

vLLM v0.21.0rc1 is a pre-release of the high-performance LLM inference and serving library, featuring optimizations for throughput, quantization, and hardware support.
