RTX 5080 and RTX 3090 Setup: 80 Tok/s on Qwen 3.6 27B Q8

Hacker News Top Tools

Summary

A setup combining an RTX 5080 and an RTX 3090 reaches 80 tokens per second on the Qwen 3.6 27B model at Q8 quantization.

Similar Articles

A100 slow Qwen3.6-27B-FP8

Reddit r/LocalLLaMA

The Qwen3.6-27B-FP8 model runs slowly on an A100 GPU.