RTX 5080 and RTX 3090 Setup: 80 Tok/s on Qwen 3.6 27B Q8

Hacker News Top 06/13/26, 09:55 AM Tools

rtx-5080 rtx-3090 qwen tokens-per-second performance-benchmark gpu-setup ai-inference

Summary

A setup pairing an RTX 5080 with an RTX 3090 reaches 80 tokens per second running the Qwen 3.6 27B model at Q8 quantization.

Similar Articles

Good results fine tuning a local LLM like Qwen 3:0.6B to categorize questions

Hacker News Top

A developer fine-tunes a small Qwen 3 0.6B model using the Unsloth framework to categorize household questions, achieving good results with only 850 training examples.

@losterror501: with 2dgx sparks getting 25tok/sec with 1 session and it peaks to 152tok/sec with 8 sessions. Actually insane...

X AI KOLs Timeline

Announces Qwable-v1, an open-weights model distilled from Claude Fable-5, with benchmarks on two DGX Spark units reaching 25 tok/sec with a single session and peaking at 152 tok/sec with 8 sessions.

A100 slow Qwen3.6-27B-FP8

Reddit r/LocalLLaMA

Reports that the Qwen3.6-27B-FP8 model runs slowly on an A100 GPU.

Qwen 27B for planning, Qwen 35B-A3B for execution?

Reddit r/LocalLLaMA

Discusses splitting work between two specialized models: Qwen 27B for planning and Qwen 35B-A3B for execution.

Best local model for vision - 2nd benchmark update - 21 Jun 2026

Reddit r/LocalLLaMA

This post presents the second update of a benchmark for local vision language models, comparing 23 models across 30 images with revised settings, and provides performance recommendations for different VRAM tiers. Key findings include that thinking mode hurts vision performance and that MoE models underperform dense models for perception tasks.
