gpu-vram-mod

#gpu-vram-mod

2 old RTX 2080 Ti with 22GB vram each Qwen3.6 27B at 38 token/s with f16 kv cache

Reddit r/LocalLLaMA ↗ · 12h ago

A user shares their setup using two modded RTX 2080 Ti GPUs with 22GB VRAM each to run Qwen 3.6 27B at 38 tokens/s with llama.cpp, including tips on power limiting, tensor split mode, and KV cache settings.

0 favorites 0 likes

gpu-vram-mod

2 old RTX 2080 Ti with 22GB vram each Qwen3.6 27B at 38 token/s with f16 kv cache

Submit Feedback