fp8-quantization

#fp8-quantization

@iotcoi: Qwen3.6-27B-FP8 + Dflash + DDTree, 256k context, 10 agents ~200 tokens/sec max decode 136t/s average on a single tiny G…

X AI KOLs Timeline · 2026-04-22 Cached

The FP8-quantized Qwen3.6-27B model reaches roughly 200 tokens/s peak decode (136 tokens/s average) with a 256k context and 10 agents on a single 49 W GB10 GPU, using the Dflash and DDTree optimizations.

#fp8-quantization

Qwen/Qwen3.6-27B-FP8

Hugging Face Models Trending · 2026-04-21 Cached

Alibaba releases Qwen3.6-27B-FP8, an FP8-quantized 27B-parameter model with strong agentic coding and reasoning benchmark results, now available on Hugging Face.
