Cheapest hardware for Qwen 3.6: both 27B and 35B-A3B

Reddit r/LocalLLaMA 06/15/26, 09:36 PM Tools

hardware qwen rtx-3090 v100 inference cost-optimization ai-hardware

Summary

Discusses the cheapest hardware for running the Qwen 3.6 27B and 35B-A3B models, compares the RTX 3090 with the Tesla V100, and gives a component-by-component cost breakdown for a single-RTX-3090 system at around $2000.

- "Qwen 3.6/3.5 27b > Qwen 3.6/3.5 35b > Gemma4 31b > Qwen 3.5 9b > Gemma4 12b > Gemma4 26b", people say
- "Qwen 3.6 for coding & Agentic, Gemma4 for human sounding text", people say

So I have been eyeing the RTX 3090 24 GB (or sometimes its cheaper Chinese counterpart, the RTX 3080 20 GB) and the controversial Tesla V100 32 GB.

Target: 40 tok/s for both of these Qwen 3.6 models.

It seems the RTX 3090 24 GB might have a brighter future, because (1) support for the V100 32 GB (both PCIe and SXM2) will soon be discontinued, and (2) China will soon release a Mythos/Fable equivalent, sometime between late 2026 and mid 2027.

Alibaba quotes me $2000 for a single RTX 3090 system that can be upgraded to dual RTX 3090 later.

Is there a cheaper way somewhere?

| Component | Model | Price |
|--------------|--------------------------------|-----------|
| CPU | Ryzen 5 5600X | $132.25 |
| GPU | MSI RTX 3090 VENTUS 3X 24G | $1,088.15 |
| Motherboard | ASUS TUF X570-PLUS | $108.81 |
| RAM | Kingston FURY Beast 32GB DDR4 | $251.11 |
| SSD | Kingston NV3 1TB NVMe | $131.41 |
| PSU | Great Wall 1650W 80+ Gold | $130.41 |
| Cooler | Valkyrie AQ125 ARGB | $14.90 |
| Case | Phanteks PK620 Full Tower | $120.54 |
| Fans | ARGB 120mm ×12 | $18.06 |
| **TOTAL** | | **$1,995.65** |
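For anyone comparing the 20 GB, 24 GB, and 32 GB cards mentioned above, a rough back-of-the-envelope check of whether the weights fit can be sketched as below. This is only an estimate under stated assumptions: the parameter counts are taken at face value from the model names (27B and 35B), while `BITS_PER_WEIGHT` (a typical 4-bit-class quantization average) and `HEADROOM_GB` (KV cache, activations, runtime overhead) are illustrative guesses, not measured values. It says nothing about whether a card reaches 40 tok/s.

```python
# Rough VRAM fit check: weights-only footprint plus a fixed headroom.
# All tunables below are assumptions for illustration, not measurements.

BITS_PER_WEIGHT = 4.5   # assumed average for a 4-bit-class quantization
HEADROOM_GB = 4.0       # assumed allowance for KV cache and runtime overhead

# Parameter counts in billions, read off the model names.
MODELS = {"Qwen 3.6 27B": 27.0, "Qwen 3.6 35B-A3B": 35.0}

# Nominal VRAM of the cards discussed in the post.
CARDS = {"RTX 3080 20 GB": 20, "RTX 3090 24 GB": 24, "Tesla V100 32 GB": 32}


def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, vram_gb: float,
         bits_per_weight: float = BITS_PER_WEIGHT,
         headroom_gb: float = HEADROOM_GB) -> bool:
    """True if weights plus the assumed headroom fit in the given VRAM."""
    needed = weight_footprint_gb(params_billion, bits_per_weight) + headroom_gb
    return needed <= vram_gb


if __name__ == "__main__":
    for model, params in MODELS.items():
        size = weight_footprint_gb(params, BITS_PER_WEIGHT)
        print(f"{model}: ~{size:.1f} GB of weights at {BITS_PER_WEIGHT} bits/weight")
        for card, vram in CARDS.items():
            verdict = "fits" if fits(params, vram) else "does not fit"
            print(f"  {card}: {verdict} with {HEADROOM_GB} GB headroom")
```

Note that the 35B-A3B model is a mixture-of-experts design: only about 3B parameters are active per token, which helps speed, but all 35B still have to live somewhere (VRAM, or system RAM with offloading), so the footprint estimate uses the full count. Long contexts can need much more headroom than the 4 GB assumed here.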


Similar Articles

@DeepTechTR: Qwen 3.6 27B is incredibly fast with 16 GB VRAM! The impact of Pure Quant The era of the 27B model that runs seamlessly…

Running Qwen3.6 35b a3b on 8gb vram and 32gb ram ~190k context

Qwen 3.6 is actually useful for vibe-coding, and way cheaper than Claude

$1800 (in GPU cost running with P2P running Qwen/Qwen3.6-27b-FP8 with 262K context and BF16 KV cache at 55 tok/s

RTX Pro 4500 Blackwell - Qwen 3.6 27B?
