256k-context

@AdinaYakup: Keye VL 2.0-30B-A3B New multimodal model from @KwaiKeye 30B/3B active - Apache 2.0 256K context via DeepSeek Sparse Att…

X AI KOLs Following ↗ · 2026-06-01 Cached

KwaiKeye releases Keye VL 2.0-30B-A3B, a multimodal model with 30B total / 3B active parameters, a 256K context window via DeepSeek Sparse Attention, and an Apache 2.0 license. The release claims accuracy on par with Qwen3 VL and Gemini 3.

@iotcoi: Qwen3.6-27B-FP8 + Dflash + DDTree, 256k context, 10 agents ~200 tokens/sec max decode 136t/s average on a single tiny G…

X AI KOLs Timeline ↗ · 2026-04-22 Cached

An FP8-quantized Qwen3.6-27B model reaches roughly 200 tokens/s peak decode (136 tokens/s average) with a 256k context and 10 agents on a single 49W GB10 GPU, using the Dflash + DDTree optimizations.
