vllm-project/vllm v0.19.2rc0: [Bugfix] Fix k_proj's bias for GLM-ASR (#40160)
Summary
The vLLM v0.19.2rc0 release candidate includes a bugfix (#40160) for the handling of k_proj's bias in GLM-ASR models.
Cached at: 04/20/26, 08:36 AM
Easy, fast, and cheap LLM serving for everyone
Similar Articles
vllm-project/vllm v0.19.1rc0: [Misc] Clean up Gemma4 implementation (#38872)
The vLLM v0.19.1rc0 release candidate includes a cleanup of the Gemma4 implementation as part of routine maintenance of the open-source LLM inference and serving library.
vllm-project/vllm v0.21.0rc1
vLLM v0.21.0rc1 is a pre-release update for the high-performance LLM inference and serving library, featuring optimizations for throughput, quantization, and hardware support.
vllm-project/vllm v0.20.1
vLLM v0.20.1 is a minor version update for the popular open-source LLM inference and serving library, maintaining its focus on high-throughput and efficient memory management.
vllm-project/vllm v0.20.0rc1
vLLM v0.20.0rc1 is a release candidate with major enhancements to throughput, quantization, speculative decoding, and multi-hardware support for scalable LLM serving.
vllm-project/vllm v0.19.1
vLLM v0.19.1 is a release of the fast and easy-to-use open-source library for LLM inference and serving, offering state-of-the-art throughput and support for 200+ model architectures and diverse hardware, including NVIDIA/AMD GPUs and CPUs.