Baseten announces what it describes as the world's fastest API for the open GLM-5.2 model, reaching over 280 tokens per second through NVFP4 quantization, disaggregated inference, and other optimizations.
Kimi K2.6 is released as an open-weight model with strong agentic capabilities and is available through FireworksAI’s fast inference APIs.