262k-context

#262k-context

$1800 (in GPU cost running with P2P running Qwen/Qwen3.6-27b-FP8 with 262K context and BF16 KV cache at 55 tok/s

Reddit r/LocalLLaMA ↗ · 2d ago

A user shares a configuration of 4x RTX 5060 Ti 16GB with P2P to run Qwen3.6-27B-FP8 at 55 tok/s with 262K context, highlighting the low cost of about $1800 for single-user inference.

0 favorites 0 likes

262k-context

$1800 (in GPU cost running with P2P running Qwen/Qwen3.6-27b-FP8 with 262K context and BF16 KV cache at 55 tok/s

Submit Feedback