KwaiKeye releases Keye VL 2.0-30B-A3B, a multimodal model with 30B total / 3B active parameters, a 256K context via DeepSeek Sparse Attention, and an Apache 2.0 license, claiming it matches Qwen3 VL and Gemini 3 in accuracy.
Quantized 27B Qwen3.6 model achieves 200 tok/s peak (136 tok/s average) with a 256K context and 10 agents on a single 49W GB10 GPU using Dflash+DDTree optimizations.