int4

#int4

40+tok/s - optimized recipe for Qwen 3.5 122B Int4 on a single DGX Spark with vLLM

Reddit r/LocalLLaMA · 2026-05-20

A user shares an optimized recipe for running Qwen 3.5 122B Int4 on a single DGX Spark with vLLM, reaching over 40 tokens per second, and invites others to try it and optimize it further.

#int4

Qwen3.6-27B KLDs - INTs and NVFPs

Reddit r/LocalLLaMA · 2026-04-22

A Reddit post compares quantized Qwen3.6-27B variants (INT4, NVFP4, BF16-INT4), showing the trade-offs between memory size and accuracy for different use cases.
