@ItsmeAjayKV: Achievement Unlocked: Running Qwen3.6-27b dense Thanks to the RTX 3090, now I can do this. Running @Alibaba_Qwen Qwen 3…


Summary

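A user benchmarks Qwen3.6-27B (Q5_K_XL quantization from Unsloth) on an RTX 3090 using llama.cpp without MTP, reaching 1,247 tok/s prompt processing on a 512-token prompt and 35 tok/s generation. At ~65K context the figures are 897 tok/s prompt processing and 34 tok/s generation.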


Cached at: 06/17/26, 06:01 PM

Achievement Unlocked: Running Qwen3.6-27b dense

Thanks to the RTX 3090, now I can do this. Running @Alibaba_Qwen Qwen 3.6 27B (Q5_K_XL from @UnslothAI)

Quick llama.cpp benchmark results (without MTP):

  • 1,247 tok/s prompt processing (512 token prompt)
  • 35 tok/s generation

At ~65K context:

  • 897 tok/s prompt processing
  • 34 tok/s generation
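To put the two sets of numbers side by side, here is a minimal Python sketch that works out the relative slowdown at ~65K context and the time a 512-token prompt takes at the quoted rate. The throughput figures come from the post; the helper names `slowdown_pct` and `seconds_for` are made up for illustration, and the timing assumes throughput stays constant over the whole prompt, which is a simplification.

```python
# Throughput figures quoted in the post, in tokens per second.
SHORT_CTX = {"prompt": 1247.0, "gen": 35.0}  # 512-token prompt
LONG_CTX = {"prompt": 897.0, "gen": 34.0}    # ~65K context


def slowdown_pct(base: float, value: float) -> float:
    """Percentage drop going from `base` to `value` (hypothetical helper)."""
    return (base - value) / base * 100.0


def seconds_for(tokens: int, rate: float) -> float:
    """Seconds to handle `tokens` at a constant `rate` (simplifying assumption)."""
    return tokens / rate


prompt_drop = slowdown_pct(SHORT_CTX["prompt"], LONG_CTX["prompt"])
gen_drop = slowdown_pct(SHORT_CTX["gen"], LONG_CTX["gen"])
short_prompt_s = seconds_for(512, SHORT_CTX["prompt"])

print(f"Prompt processing drop at ~65K context: {prompt_drop:.1f}%")
print(f"Generation drop at ~65K context: {gen_drop:.1f}%")
print(f"Time to process a 512-token prompt: {short_prompt_s:.2f} s")
```

On these figures, prompt processing drops by about 28.1% at long context while generation drops by only about 2.9%, so generation speed barely changes as the context grows.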

Results are already looking good; Qwen 3.6 35B will be flying on this setup, brb.

Similar Articles

Wow! Qwen 3.6:35b-a3b on a 3090... pretty amazing.

Reddit r/artificial

A user shares impressive results running a quantized Qwen 3.6:35b-a3b model on a used RTX 3090, achieving 160 tokens per second output after fitting the model into VRAM, and demonstrates vision capabilities with a 75-second video processing time.