int4

#int4

40+tok/s - optimized recipe for Qwen 3.5 122B Int4 on a single DGX Spark with vLLM

Reddit r/LocalLLaMA ↗ · 2026-05-20

A user shares an optimized recipe for running Qwen 3.5 122B Int4 on a single DGX Spark with vLLM, reaching over 40 tokens per second, and invites others to try it and optimize it further.

#int4

Qwen3.6-27B KLDs - INTs and NVFPs

Reddit r/LocalLLaMA ↗ · 2026-04-22

A Reddit post compares quantized Qwen3.6-27B variants (INT4, NVFP4, BF16-INT4), showing the trade-offs between memory size and accuracy for different use cases.
